UvA-DARE is a service provided by the library of the University of Amsterdam (https://dare.uva.nl) UvA-DARE (Digital Academic Repository) Algebraic complexity, asymptotic spectra and entanglement polytopes Zuiddam, J. Publication date 2018 Document Version Final published version License Other Link to publication Citation for published version (APA): Zuiddam, J. (2018). Algebraic complexity, asymptotic spectra and entanglement polytopes. Institute for Logic, Language and Computation. General rights It is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), other than for strictly personal, individual use, unless the work is under an open content license (like Creative Commons). Disclaimer/Complaints regulations If you believe that digital publication of certain material infringes any of your rights or (privacy) interests, please let the Library know, stating your reasons. In case of a legitimate complaint, the Library will make the material inaccessible and/or remove it from the website. Please Ask the Library: https://uba.uva.nl/en/contact, or a letter to: Library of the University of Amsterdam, Secretariat, Singel 425, 1012 WP Amsterdam, The Netherlands. You will be contacted as soon as possible. Download date:29 Aug 2021


Algebraic complexity, asymptotic spectra and entanglement polytopes

Jeroen Zuiddam

ILLC Dissertation Series DS-2018-13

For further information about ILLC-publications, please contact

Institute for Logic, Language and Computation
Universiteit van Amsterdam
Science Park 107
1098 XG Amsterdam
phone: +31-20-525 6051
e-mail: illc@uva.nl
homepage: http://www.illc.uva.nl/

The investigations were supported by the Netherlands Organization for Scientific Research NWO (617.023.116), the European Commission and the QuSoft Research Center for Quantum Software.

Copyright © 2018 by Jeroen Zuiddam

ISBN 978-94-028-1175-9

Academic dissertation

to obtain the degree of doctor at the Universiteit van Amsterdam, on the authority of the Rector Magnificus prof. dr. ir. K.I.J. Maex, to be defended in public before a committee appointed by the Doctorate Board (College voor Promoties), in the Agnietenkapel on Tuesday 23 October 2018, at 12:00

by

Jeroen Zuiddam

born in Leiderdorp

Doctoral committee

Supervisors (promotores):
prof. dr. H.M. Buhrman, Universiteit van Amsterdam
prof. dr. M. Christandl, Københavns Universitet

Other members:
prof. dr. M. Laurent, Tilburg University
prof. dr. E.M. Opdam, Universiteit van Amsterdam
prof. dr. R.M. de Wolf, Universiteit van Amsterdam
dr. J. Briët, CWI Amsterdam
dr. M. Walter, Universiteit van Amsterdam

Faculty of Science (Faculteit der Natuurwetenschappen, Wiskunde en Informatica)

Contents

Acknowledgements

1 Introduction
  1.1 Matrix multiplication
  1.2 The asymptotic spectrum of tensors
  1.3 Higher-order CW method
  1.4 Abstract asymptotic spectra
  1.5 The asymptotic spectrum of graphs
  1.6 Tensor degeneration
  1.7 Combinatorial degeneration
  1.8 Algebraic branching program degeneration
  1.9 Organisation

2 The theory of asymptotic spectra
  2.1 Introduction
  2.2 Semirings and preorders
  2.3 Strassen preorders
  2.4 Asymptotic preorders $\precsim$
  2.5 Maximal Strassen preorders
  2.6 The asymptotic spectrum $X(S, \leq)$
  2.7 The representation theorem
  2.8 Abstract rank and subrank $R$, $Q$
  2.9 Topological aspects
  2.10 Uniqueness
  2.11 Subsemirings
  2.12 Subsemirings generated by one element
  2.13 Universal spectral points
  2.14 Conclusion

3 The asymptotic spectrum of graphs; Shannon capacity
  3.1 Introduction
  3.2 The asymptotic spectrum of graphs
    3.2.1 The semiring of graph isomorphism classes $\mathcal{G}$
    3.2.2 Strassen preorder via graph homomorphisms
    3.2.3 The asymptotic spectrum of graphs $X(\mathcal{G})$
    3.2.4 Shannon capacity $\Theta$
  3.3 Universal spectral points
    3.3.1 Lovász theta number $\vartheta$
    3.3.2 Fractional graph parameters
  3.4 Conclusion

4 The asymptotic spectrum of tensors; matrix multiplication
  4.1 Introduction
  4.2 The asymptotic spectrum of tensors
    4.2.1 The semiring of tensor equivalence classes $\mathcal{T}$
    4.2.2 Strassen preorder via restriction
    4.2.3 The asymptotic spectrum of tensors $X(\mathcal{T})$
    4.2.4 Asymptotic rank and asymptotic subrank
  4.3 Gauge points $\zeta^{(i)}$
  4.4 Support functionals $\zeta^\theta$
  4.5 Upper and lower support functionals $\zeta^\theta$, $\zeta_\theta$
  4.6 Asymptotic slice rank
  4.7 Conclusion

5 Tight tensors and combinatorial subrank; cap sets
  5.1 Introduction
  5.2 Higher-order Coppersmith–Winograd method
    5.2.1 Construction
    5.2.2 Computational remarks
    5.2.3 Examples: type sets
  5.3 Combinatorial degeneration method
  5.4 Cap sets
    5.4.1 Reduced polynomial multiplication
    5.4.2 Cap sets
  5.5 Graph tensors
  5.6 Conclusion

6 Universal points in the asymptotic spectrum of tensors; entanglement polytopes, moment polytopes
  6.1 Introduction
  6.2 Schur–Weyl duality
  6.3 Kronecker and Littlewood–Richardson coefficients $g_{\lambda\mu\nu}$, $c^{\lambda}_{\mu\nu}$
  6.4 Entropy inequalities
  6.5 Hilbert spaces and density operators
  6.6 Moment polytopes $P(t)$
    6.6.1 General setting
    6.6.2 Tensor spaces
  6.7 Quantum functionals $F^\theta(t)$
  6.8 Outer approximation
  6.9 Inner approximation for free tensors
  6.10 Quantum functionals versus support functionals
  6.11 Asymptotic slice rank
  6.12 Conclusion

7 Algebraic branching programs; approximation and nondeterminism
  7.1 Introduction
  7.2 Definitions and basic results
    7.2.1 Computational models
    7.2.2 Complexity classes VP, VP$_e$, VP$_k$
    7.2.3 The theorem of Ben-Or and Cleve
    7.2.4 Approximation closure $\overline{C}$
    7.2.5 Nondeterminism closure $N(C)$
  7.3 Approximation closure of VP$_2$
  7.4 Nondeterminism closure of VP$_1$
  7.5 Conclusion

Bibliography

Glossary

Samenvatting (summary in Dutch)

Summary

Acknowledgements

First of all, I thank all my coauthors for very fruitful collaboration: Harry Buhrman, Matthias Christandl, Peter Vrana, Jop Briët, Chris Perry, Asger Jensen, Markus Bläser, Christian Ikenmeyer and Karl Bringmann.

Chris Zaal, Leen Torenvliet and Robert Belleman I thank for all their efforts to set up for me the "double bachelor programme" in Mathematics and Computer science at the University of Amsterdam (UvA) in 2009. This programme, as well as the "webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch national research institute for mathematics and computer science (CWI), made me decide to come to Amsterdam. My enjoyable master thesis project in mathematics with Eric Opdam made me follow the academic path, for which I thank Eric.

Of course, most importantly, I thank my PhD supervisor Harry Buhrman, for introducing me to research as a bachelor student, for absorbing me into the Algorithms and Complexity group at CWI, for having enough faith in me to hire me as his PhD student in 2014, and for his general guidance throughout. I feel very lucky for the opportunities and scientific freedom that this has brought me.

Matthias Christandl has been my closest collaborator and mentor since we met in Berkeley in 2014. In practice this meant countless nights of fun Skype sessions between Amsterdam and Copenhagen, countless enjoyable visits to the University of Copenhagen, and countless kitchen table sessions at the Hallinsgade. Thanks, Matthias, for the energy, inspiration and optimism. And thanks, Matthias and Henriette, for the hospitality.

Jop Briët I thank for his general guidance and for lots of inspiration. The polynomial method reading group, which he mainly organised, inspired part of my paper with Matthias Christandl and Peter Vrana on universal points in the asymptotic spectrum of tensors. (This reading group also resulted in Dion Gijswijt's paper on cap sets.) My paper with Jop on round elimination later inspired me to write the paper on the asymptotic spectrum of graphs.

Christian Ikenmeyer I thank for numerous inspiring discussions on algebraic complexity theory and tensors, which greatly influenced my papers on tensor rank and our joint paper with Karl Bringmann on algebraic branching programs.

Peter Vrana I thank for our many enjoyable research collaborations, the results of which form a central part of this dissertation, for his clever insights, and for finding several mathematical mistakes while reading the draft of this dissertation.

Ronald de Wolf I thank for his general advice throughout my PhD and for many suggestions regarding the current version of this dissertation, which will be incorporated in the next version (but not in the printed version, because of the regulations of the University of Amsterdam).

Jop Briët, Monique Laurent, Lex Schrijver, Peter Vrana, Matthias Christandl, Maris Ozols, Michael Walter and Bart Sevenster I thank for helpful discussions regarding the results in Chapter 2 and Chapter 3 of this dissertation.

Srinivasan Arunachalam I thank for sharing the ups and downs during our four years as PhD students at CWI. Florian Speelman, Farrokh Labib, Sven Polak, Bart Litjens and Bart Sevenster I thank for numerous valuable research discussions.

Bikkie Aldeias and Rob van Rooijen I thank for their excellent library services. Martijn Zuiddam and Maris Ozols I thank for proofreading the draft of this dissertation.

Finally, I thank my parents and my brothers and my friends for their support.

Amsterdam, August 31, 2018
Jeroen Zuiddam

Publications

[BCPZ16] Harry Buhrman, Matthias Christandl, Christopher Perry and Jeroen Zuiddam. Clean quantum and classical communication protocols. Physical Review Letters, 117:230503, 2016.
https://link.aps.org/doi/10.1103/PhysRevLett.117.230503
http://arxiv.org/abs/1605.07948

[BCZ17a] Markus Bläser, Matthias Christandl and Jeroen Zuiddam. The border support rank of two-by-two matrix multiplication is seven. Manuscript, 2017.
https://arxiv.org/abs/1705.09652

[BCZ17b] Harry Buhrman, Matthias Christandl and Jeroen Zuiddam. Nondeterministic Quantum Communication Complexity: the Cyclic Equality Game and Iterated Matrix Multiplication. In Christos H. Papadimitriou, editor, 8th Innovations in Theoretical Computer Science Conference (ITCS), 2017.
http://drops.dagstuhl.de/opus/volltexte/2017/8181

[BIZ17] Karl Bringmann, Christian Ikenmeyer and Jeroen Zuiddam. On algebraic branching programs of small width. In Ryan O'Donnell, editor, 32nd Computational Complexity Conference (CCC), 2017.
https://doi.org/10.4230/LIPIcs.CCC.2017.20
https://arxiv.org/abs/1702.05328
Journal of the ACM, Vol. 65, No. 5, Article 32, 2018.
https://doi.org/10.1145/3209663

[BZ17] Jop Briët and Jeroen Zuiddam. On the orthogonal rank of Cayley graphs and impossibility of quantum round elimination. Quantum Information and Computation, 2017.
http://www.rintonpress.com/xxqic17/qic-17-12/0106-0116.pdf
https://arxiv.org/abs/1608.06113

[CJZ18] Matthias Christandl, Asger Kjærulff Jensen and Jeroen Zuiddam. Tensor rank is not multiplicative under the tensor product. Linear Algebra and its Applications, 543:125–139, 2018.
https://doi.org/10.1016/j.laa.2017.12.020
https://arxiv.org/abs/1705.09379

[CVZ16] Matthias Christandl, Peter Vrana and Jeroen Zuiddam. Asymptotic tensor rank of graph tensors: beyond matrix multiplication. Manuscript, 2016.
https://arxiv.org/abs/1609.07476

[CVZ18] Matthias Christandl, Peter Vrana and Jeroen Zuiddam. Universal Points in the Asymptotic Spectrum of Tensors: Extended Abstract. In Proceedings of 50th Annual ACM SIGACT Symposium on the Theory of Computing (STOC), 2018.
https://doi.org/10.1145/3188745.3188766
https://arxiv.org/abs/1709.07851

[CZ18] Matthias Christandl and Jeroen Zuiddam. Tensor surgery and tensor rank. (Journal of) computational complexity, 2018.
https://doi.org/10.1007/s00037-018-0164-8
https://arxiv.org/abs/1606.04085

[Zui17] Jeroen Zuiddam. A note on the gap between rank and border rank. Linear Algebra and its Applications, 525:33–44, 2017.
https://doi.org/10.1016/j.laa.2017.03.015
http://arxiv.org/abs/1504.05597

[Zui18] Jeroen Zuiddam. The asymptotic spectrum of graphs and the Shannon capacity. Manuscript, 2018.
http://arxiv.org/abs/1807.00169

This dissertation is based on the above papers, with primary focus on the four highlighted papers.

Explanation of the relative importance of the co-authors: for each paper, the importance of the co-authors is approximately equally divided.

Chapter 1

Introduction

Volker Strassen published in 1969 his famous algorithm for multiplying any two $n \times n$ matrices using only $O(n^{2.81})$ rather than $O(n^3)$ arithmetical operations [Str69]. His discovery marked the beginning of a still ongoing line of research in the field of algebraic complexity theory, a line of research that by now touches several fields of mathematics, including algebraic geometry, representation theory, (quantum) information theory and combinatorics. This dissertation is inspired by and contributes to this line of research.

No further progress followed for almost 10 years after Strassen's discovery, despite the fact that "many scientists understood that discovery as a signal to attack the problem and to push the exponent further down" [Pan84]. Then in 1978, Pan improved the exponent from 2.81 to 2.79 [Pan78, Pan80]. One year later, Bini, Capovani, Lotti and Romani improved the exponent to 2.78 by constructing fast "approximative" algorithms for matrix multiplication and making these algorithms exact via the method of interpolation [BCRL79, Bin80]. Cast in the language of tensors, the result of Bini et al. corresponds to what we now call a "border rank" upper bound. The idea of studying approximative complexity or border complexity for algebraic problems has nowadays become an important theme in algebraic complexity theory.

Schönhage then obtained the exponent 2.55 by constructing a fast algorithm for computing many "disjoint" small matrix multiplications and transforming this into an algorithm for one large matrix multiplication [Sch81]. The upper bound was improved shortly after by works of Pan [Pan81], Romani [Rom82], and Coppersmith and Winograd [CW82], resulting in the exponent 2.50. Then in 1987, Strassen published the laser method, with which he obtained the exponent 2.48 [Str87]. The laser method was used in the same year by Coppersmith and Winograd to obtain the exponent 2.38 [CW87]. To do this they invented a method for constructing certain large combinatorial structures. This method, or actually the extended version that Strassen published in [Str91], we now call the Coppersmith–Winograd method. All further improvements on upper bounding the exponent essentially follow the framework of Coppersmith and Winograd, and the improvements do not affect the first two digits after the decimal point [CW90, Sto10, Wil12, LG14].

Define $\omega$ to be the optimal exponent in the complexity of matrix multiplication. We call $\omega$ the exponent of matrix multiplication. To summarise the above historical account on upper bounds: $\omega < 2.38$. On the other hand, the only lower bound we currently have is the trivial lower bound $2 \leq \omega$.

The history of upper bounds on the matrix multiplication exponent $\omega$, which began with Strassen's algorithm and ended with the Strassen laser method and Coppersmith–Winograd method, is well-known and well-documented, see e.g. [BCS97, Section 15.13]. However, there is remarkable work of Strassen on a theory of lower bounds for $\omega$ and similar types of exponents, and this work has received almost no attention. This theory of lower bounds is the theory of asymptotic spectra of tensors and is the topic of a series of papers by Strassen [Str86, Str87, Str88, Str91, Str05].

In the foregoing, the word tensor has popped up twice, namely when we mentioned border rank and just now when we mentioned asymptotic spectra of tensors, but we have not discussed at all why tensors should be relevant for understanding the complexity of matrix multiplication. First we give a mini course on tensors. A $k$-tensor $t = (t_{i_1 \cdots i_k})_{i_1, \ldots, i_k}$ is a $k$-dimensional array of numbers from some field, say the complex numbers $\mathbb{C}$. Thus a 2-tensor is simply a matrix. A $k$-tensor is called simple if there exist $k$ vectors $v_1, \ldots, v_k$ such that the entries of $t$ are given by the products $t_{i_1 \cdots i_k} = (v_1)_{i_1} \cdots (v_k)_{i_k}$ for all indices $i_j$. The tensor rank of $t$ is the smallest number $n$ such that $t$ can be written as a sum of $n$ simple tensors. Thus the tensor rank of a 2-tensor is simply its matrix rank. Returning to the problem of finding the complexity of matrix multiplication, there is a special 3-tensor, called the matrix multiplication tensor, that encodes the computational problem of multiplying two $2 \times 2$ matrices. This 3-tensor is commonly denoted by $\langle 2, 2, 2 \rangle$. It turns out that the matrix multiplication exponent $\omega$ is exactly the asymptotic rate of growth of the tensor rank of the "Kronecker powers" of the tensor $\langle 2, 2, 2 \rangle$. This important observation follows from the fundamental fact that the computational problem of multiplying matrices is "self-reducible". Namely, we can multiply two matrices by viewing them as block matrices and then perform matrix multiplication at the level of the blocks.
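As a small illustration of the definitions of simple tensors and tensor rank, the following Python sketch builds simple 3-tensors from vectors and writes a diagonal tensor as a sum of two of them, which shows that its tensor rank is at most 2. The function names `simple_tensor` and `add_tensors` and the dictionary representation are hypothetical choices made for this example only.

```python
from itertools import product

def simple_tensor(vectors):
    """Return v_1 (x) ... (x) v_k as a dict {index tuple: entry} (0-based indices)."""
    tensor = {}
    for index in product(*(range(len(v)) for v in vectors)):
        entry = 1
        for vector, i in zip(vectors, index):
            entry *= vector[i]  # t_{i_1...i_k} = (v_1)_{i_1} * ... * (v_k)_{i_k}
        tensor[index] = entry
    return tensor

def add_tensors(s, t):
    """Entrywise sum of two tensors of the same format."""
    return {index: s[index] + t[index] for index in s}

# Standard basis vectors of a 2-dimensional space.
e1, e2 = [1, 0], [0, 1]

# e1 (x) e1 (x) e1 + e2 (x) e2 (x) e2: a sum of two simple tensors.
diagonal = add_tensors(simple_tensor([e1, e1, e1]), simple_tensor([e2, e2, e2]))
nonzero = sorted(index for index, entry in diagonal.items() if entry != 0)
```

Here `nonzero` lists the two diagonal positions; all other entries of the $2 \times 2 \times 2$ array are zero.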

We wrap up this introductory story. To understand the computational complexity of matrix multiplication, one should understand the asymptotic rate of growth of the tensor rank of a certain family of tensors, a family that is obtained by taking powers of a fixed tensor. The theory of asymptotic spectra is the theory of bounds on such asymptotic parameters of tensors.

The main story line of this dissertation concerns the theory of asymptotic spectra. In Section 1.1 of this introduction we discuss in more detail the computational problem of multiplying matrices. In Section 1.2 we discuss the asymptotic spectrum of tensors and discuss a new result: an explicit description of infinitely many elements in the asymptotic spectrum of tensors. In Section 1.3 we consider a new higher-order Coppersmith–Winograd method.

The theory of asymptotic spectra of tensors is a special case of an abstract theory of asymptotic spectra of preordered semirings, which we discuss in Section 1.4. In Section 1.5 we apply this abstract theory to a new setting, namely to graphs. By doing this we obtain a new dual characterisation of the Shannon capacity of graphs.

The second story line of this dissertation is about degeneration, an algebraic kind of approximation related to the concept of border rank of Bini et al. We discuss degeneration in the context of tensors in Section 1.6. There is a combinatorial version of tensor degeneration which we call combinatorial degeneration. We discuss a new result regarding combinatorial degeneration in Section 1.7. Finally, Section 1.8 is about a new result concerning degeneration for algebraic branching programs, an algebraic model of computation.

We finish in Section 1.9 with a discussion of the organisation of this dissertation into chapters.

1.1 Matrix multiplication

In this section we discuss in more detail the computational problem of multiplying two matrices.

Algebraic complexity theory studies algebraic algorithms for algebraic problems. Roughly speaking, algebraic algorithms are algorithms that use only the basic arithmetical operations $+$ and $\times$ over some field, say $\mathbb{R}$ or $\mathbb{C}$. A fundamental example of an algebraic problem is matrix multiplication.

If we multiply two $n \times n$ matrices by computing the inner products between any row of the first matrix and any column of the second matrix, one by one, we need roughly $2 \cdot n^3$ arithmetical operations ($+$ and $\times$). For example, we can multiply two $2 \times 2$ matrices with 12 arithmetical operations, namely 8 multiplications and 4 additions:

$$\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} = \begin{pmatrix} a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} \\ a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22} \end{pmatrix}.$$

Since matrix multiplication is a basic operation in linear algebra, it is worthwhile to see if we can do better than $2 \cdot n^3$. In 1969, Strassen [Str69] published a better algorithm. The base routine of Strassen's algorithm is an algorithm for multiplying two $2 \times 2$ matrices with 7 multiplications, 18 additions and certain sign changes:

$$\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} = \begin{pmatrix} x_1 + x_4 - x_5 + x_7 & x_3 + x_5 \\ x_2 + x_4 & x_1 + x_3 - x_2 + x_6 \end{pmatrix}$$

with

$$\begin{aligned}
x_1 &= (a_{11} + a_{22})(b_{11} + b_{22}) \\
x_2 &= (a_{21} + a_{22})\, b_{11} \\
x_3 &= a_{11}(b_{12} - b_{22}) \\
x_4 &= a_{22}(-b_{11} + b_{21}) \\
x_5 &= (a_{11} + a_{12})\, b_{22} \\
x_6 &= (-a_{11} + a_{21})(b_{11} + b_{12}) \\
x_7 &= (a_{12} - a_{22})(b_{21} + b_{22}).
\end{aligned}$$

The general routine of Strassen's algorithm multiplies two $n \times n$ matrices by recursively dividing the matrices into four blocks and applying the base routine to multiply the blocks (this is the self-reducibility of matrix multiplication that we mentioned earlier). The base routine does not assume commutativity of the variables for correctness, so indeed we can take the variables to be matrices. After expanding the recurrence, we see that Strassen's algorithm uses $4.7 \cdot n^{\log_2 7} \approx 4.7 \cdot n^{2.81}$ arithmetical operations. Over the years, Strassen's algorithm was improved by many researchers. The best algorithm known today uses $C \cdot n^{2.38}$ arithmetical operations, where $C$ is some constant [CW90, Sto10, Wil12, LG14]. The exponent of matrix multiplication $\omega$ is the infimum over all real numbers $\beta$ such that for some constant $C_\beta$ we can multiply, for any $n \in \mathbb{N}$, any two $n \times n$ matrices with at most $C_\beta \cdot n^\beta$ arithmetical operations. From the above it follows that $\omega \leq 2.38$. From a simple flattening argument it follows that $2 \leq \omega$. We are left with the following well-known open problem: what is the value of the matrix multiplication exponent $\omega$?
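The recursive routine described above can be sketched in Python as follows. This is a minimal illustration, assuming square matrices whose size is a power of two and recursing all the way down to $1 \times 1$ blocks; the helper names (`add`, `sub`, `split`, `strassen`, `naive`) and the multiplication counter are choices made for this example. The sketch only counts scalar multiplications, so it illustrates the exponent $\log_2 7$ and not the constant in front.

```python
def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def sub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def split(M):
    """Split a matrix of even size into its four blocks M11, M12, M21, M22."""
    h = len(M) // 2
    return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
            [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])

def strassen(A, B, counter):
    """Multiply A and B (size a power of two); counter[0] counts scalar multiplications."""
    if len(A) == 1:
        counter[0] += 1
        return [[A[0][0] * B[0][0]]]
    a11, a12, a21, a22 = split(A)
    b11, b12, b21, b22 = split(B)
    # The seven block products x1..x7 of the base routine (left factors stay on the left).
    x1 = strassen(add(a11, a22), add(b11, b22), counter)
    x2 = strassen(add(a21, a22), b11, counter)
    x3 = strassen(a11, sub(b12, b22), counter)
    x4 = strassen(a22, sub(b21, b11), counter)
    x5 = strassen(add(a11, a12), b22, counter)
    x6 = strassen(sub(a21, a11), add(b11, b12), counter)
    x7 = strassen(sub(a12, a22), add(b21, b22), counter)
    c11 = add(sub(add(x1, x4), x5), x7)
    c12 = add(x3, x5)
    c21 = add(x2, x4)
    c22 = add(sub(add(x1, x3), x2), x6)
    # Glue the four result blocks back together.
    return [r1 + r2 for r1, r2 in zip(c11, c12)] + [r1 + r2 for r1, r2 in zip(c21, c22)]

def naive(A, B):
    """Row-times-column matrix multiplication, used here only as a reference."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
B = [[2, 0, 1, 3], [1, 1, 0, 2], [0, 4, 1, 1], [3, 2, 2, 0]]
multiplications = [0]
C = strassen(A, B, multiplications)
```

For these $4 \times 4$ inputs the routine performs $7^2 = 49$ scalar multiplications, compared with $4^3 = 64$ for the row-times-column method; in general, for $n = 2^k$ the count is $7^k = n^{\log_2 7}$.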

The constant $C$ for the currently best algorithm is impractically large (for a discussion of this issue see e.g. [Pan18]). For a practical fast algorithm one should either improve $C$ or find a balance between $C$ and the exponent of $n$. We will ignore the size of $C$ in this dissertation and focus on the exponent $\omega$. For an overview of the field of algebraic complexity theory the reader should consult [BCS97] and [Sap16].

1.2 The asymptotic spectrum of tensors

We now discuss the theory of asymptotic spectra for tensors.

Let $s$ and $t$ be $k$-tensors over a field $\mathbb{F}$, $s \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$, $t \in \mathbb{F}^{m_1} \otimes \cdots \otimes \mathbb{F}^{m_k}$. We say $s$ restricts to $t$ and write $s \geq t$ if there are linear maps $A_i : \mathbb{F}^{n_i} \to \mathbb{F}^{m_i}$ such that $(A_1 \otimes \cdots \otimes A_k)(s) = t$. Let $[n] := \{1, \ldots, n\}$ for $n \in \mathbb{N}$. We define the product $s \otimes t \in \mathbb{F}^{n_1 m_1} \otimes \cdots \otimes \mathbb{F}^{n_k m_k}$ by $(s \otimes t)_{(i_1, j_1), \ldots, (i_k, j_k)} = s_{i_1 \cdots i_k} t_{j_1 \cdots j_k}$ for $i \in [n_1] \times \cdots \times [n_k]$ and $j \in [m_1] \times \cdots \times [m_k]$. This product generalizes the well-known Kronecker product of matrices. We refer to this product as the tensor (Kronecker) product. We define the direct sum $s \oplus t \in \mathbb{F}^{n_1 + m_1} \otimes \cdots \otimes \mathbb{F}^{n_k + m_k}$ by $(s \oplus t)_{\ell_1 \cdots \ell_k} = s_{\ell_1 \cdots \ell_k}$ if $\ell \in [n_1] \times \cdots \times [n_k]$, $(s \oplus t)_{n_1 + \ell_1, \ldots, n_k + \ell_k} = t_{\ell_1 \cdots \ell_k}$ if $\ell \in [m_1] \times \cdots \times [m_k]$, and $(s \oplus t)_{\ell_1 \cdots \ell_k} = 0$ for the remaining indices.
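The two operations just defined can be made concrete with a short Python sketch. It stores a tensor sparsely as a dictionary from 0-based index tuples to nonzero entries together with its format; this representation, the function names, and the convention of flattening the index pair $(i_r, j_r)$ to $i_r m_r + j_r$ are assumptions made for the example.

```python
def kron(s, s_shape, t, t_shape):
    """Tensor (Kronecker) product: entry ((i_1,j_1),...,(i_k,j_k)) equals s_i * t_j.

    The pair (i_r, j_r) in leg r is flattened to the single index i_r * m_r + j_r.
    """
    shape = tuple(n * m for n, m in zip(s_shape, t_shape))
    out = {}
    for i, s_val in s.items():
        for j, t_val in t.items():
            index = tuple(a * m + b for a, b, m in zip(i, j, t_shape))
            out[index] = s_val * t_val
    return out, shape

def direct_sum(s, s_shape, t, t_shape):
    """Direct sum: s sits in the first block, t is shifted by (n_1, ..., n_k)."""
    shape = tuple(n + m for n, m in zip(s_shape, t_shape))
    out = dict(s)
    for j, t_val in t.items():
        out[tuple(n + b for n, b in zip(s_shape, j))] = t_val
    return out, shape

def unit_tensor(n, k=3):
    """Diagonal k-tensor with ones at the positions (i, ..., i)."""
    return {(i,) * k: 1 for i in range(n)}, (n,) * k

product_tensor = kron(*unit_tensor(2), *unit_tensor(3))
sum_tensor = direct_sum(*unit_tensor(2), *unit_tensor(3))
```

With these conventions the product of the diagonal tensors of sizes 2 and 3 is the diagonal tensor of size 6, and their direct sum is the diagonal tensor of size 5.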

The asymptotic restriction problem asks to compute the infimum of all real numbers $\beta \geq 0$ such that for all $n \in \mathbb{N}$

$$s^{\otimes \beta n + o(n)} \geq t^{\otimes n}.$$

We may think of the asymptotic restriction problem as having two directions, namely to find

1. obstructions, "certificates" that prohibit $s^{\otimes \beta n + o(n)} \geq t^{\otimes n}$, or

2. constructions, linear maps that carry out $s^{\otimes \beta n + o(n)} \geq t^{\otimes n}$.

Ideally, we would like to find matching obstructions and constructions so that we indeed learn the value of $\beta$.

What do obstructions look like? We set $\beta$ equal to one; it turns out that it is sufficient to understand this case. We say $s$ restricts asymptotically to $t$ and write $s \gtrsim t$ if

$$s^{\otimes n + o(n)} \geq t^{\otimes n}.$$

What do obstructions look like for asymptotic restriction $\gtrsim$? More precisely, what do obstructions look like for $\gtrsim$ restricted to a subset $S \subseteq \{k\text{-tensors over } \mathbb{F}\}$? Let us assume $S$ is closed under direct sum and tensor product and contains the diagonal tensors $\langle n \rangle := \sum_{i=1}^{n} e_i \otimes \cdots \otimes e_i$ for $n \in \mathbb{N}$, where $e_1, \ldots, e_n$ is the standard basis of $\mathbb{F}^n$. Let $X(S)$ be the set of all maps $\phi : S \to \mathbb{R}_{\geq 0}$ that are

(a) monotone under restriction $\geq$,

(b) multiplicative under the tensor Kronecker product $\otimes$,

(c) additive under the direct sum $\oplus$,

(d) normalised to $\phi(\langle n \rangle) = n$ at the diagonal tensor $\langle n \rangle$.

The elements $\phi \in X(S)$ are called spectral points of $S$. The set $X(S)$ is called the asymptotic spectrum of $S$.

Spectral points $\phi \in X(S)$ are obstructions. Let $s, t \in S$. If $s \gtrsim t$, then by definition we have a restriction $s^{\otimes n + o(n)} \geq t^{\otimes n}$. Then (a) and (b) imply the inequality $\phi(s)^{n + o(n)} = \phi(s^{\otimes n + o(n)}) \geq \phi(t^{\otimes n}) = \phi(t)^n$. Taking $n$-th roots and letting $n$ go to infinity, this implies $\phi(s) \geq \phi(t)$. We negate that statement: if $\phi(s) < \phi(t)$, then not $s \gtrsim t$. In that case, $\phi$ is an obstruction to $s \gtrsim t$.

The remarkable fact is that $X(S)$ is a complete set of obstructions for $\gtrsim$. Namely, for $s, t \in S$ the asymptotic restriction $s \gtrsim t$ holds if and only if we have $\phi(s) \geq \phi(t)$ for all spectral points $\phi \in X(S)$. This was proven by Volker Strassen in [Str86, Str88]. His proof uses a theorem of Becker and Schwarz [BS83], which is commonly referred to as the Kadison–Dubois theorem (for historical reasons) or the real representation theorem. (We will say more about this completeness result in Section 1.4.)

Let us introduce tensor rank and subrank, and their asymptotic versions. The tensor rank of $t$ is the size of the smallest diagonal tensor that restricts to $t$, $R(t) := \min\{r \in \mathbb{N} : t \leq \langle r \rangle\}$, and the subrank of $t$ is the size of the largest diagonal tensor to which $t$ restricts, $Q(t) := \max\{r \in \mathbb{N} : \langle r \rangle \leq t\}$. Asymptotic rank is defined as $\tilde{R}(t) := \lim_{n \to \infty} R(t^{\otimes n})^{1/n}$ and asymptotic subrank is defined as $\tilde{Q}(t) := \lim_{n \to \infty} Q(t^{\otimes n})^{1/n}$. From Fekete's lemma it follows that $\tilde{Q}(t) = \sup_n Q(t^{\otimes n})^{1/n}$ and $\tilde{R}(t) = \inf_n R(t^{\otimes n})^{1/n}$. One easily verifies that every spectral point $\phi \in X(S)$ is an upper bound on asymptotic subrank and a lower bound on asymptotic rank for any tensor $t \in S$,

$$\tilde{Q}(t) \leq \phi(t) \leq \tilde{R}(t).$$
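The verification of this sandwich can be spelled out as follows, using only properties (a), (b) and (d) of a spectral point $\phi$; here $r_n$ and $q_n$ are shorthand introduced for this sketch. Write $q_n := Q(t^{\otimes n})$ and $r_n := R(t^{\otimes n})$, so that $\langle q_n \rangle \leq t^{\otimes n} \leq \langle r_n \rangle$. Then

$$q_n = \phi(\langle q_n \rangle) \leq \phi(t^{\otimes n}) = \phi(t)^n = \phi(t^{\otimes n}) \leq \phi(\langle r_n \rangle) = r_n,$$

and hence

$$Q(t^{\otimes n})^{1/n} \leq \phi(t) \leq R(t^{\otimes n})^{1/n} \quad \text{for every } n \in \mathbb{N}.$$

Letting $n$ go to infinity gives the lower bound by the asymptotic subrank and the upper bound by the asymptotic rank.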

Strassen used the completeness of $X(S)$ for $\lesssim$ to prove $\tilde{Q}(t) = \min_{\phi \in X(S)} \phi(t)$ and $\tilde{R}(t) = \max_{\phi \in X(S)} \phi(t)$. One should think of these expressions as being dual to the defining expressions for $\tilde{Q}$ and $\tilde{R}$.

We mentioned that Strassen was motivated to study the asymptotic spectrum of tensors by the study of the complexity of matrix multiplication. The precise connection with matrix multiplication is as follows. The matrix multiplication exponent $\omega$ is characterised by the asymptotic rank $\tilde{R}(\langle 2, 2, 2 \rangle)$ of the matrix multiplication tensor

$$\langle 2, 2, 2 \rangle := \sum_{i, j, k \in [2]} e_{ij} \otimes e_{jk} \otimes e_{ki} \in \mathbb{F}^4 \otimes \mathbb{F}^4 \otimes \mathbb{F}^4$$

via $\tilde{R}(\langle 2, 2, 2 \rangle) = 2^\omega$. We know the trivial lower bound $2 \leq \omega$, see Section 4.3. We know the (nontrivial) upper bound $\omega \leq 2.3728639$, which is by Coppersmith and Winograd [CW90] and improvements by Stothers [Sto10], Williams [Wil12] and Le Gall [LG14]. It may seem that for the study of matrix multiplication only the asymptotic rank $\tilde{R}$ is of interest and that the asymptotic subrank $\tilde{Q}$ is just a toy parameter. Asymptotic subrank, however, plays an important role in the currently best matrix multiplication algorithms. We will discuss this idea in the context of the asymptotic subrank of so-called complete graph tensors in Section 5.5.
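To see in what sense this tensor encodes matrix multiplication, the following Python sketch builds the support of the matrix multiplication tensor and contracts its first two tensor legs with two matrices. The names `matmul_tensor` and `contract`, and the choice to index the basis vector $e_{ij}$ by the pair $(i, j)$, are assumptions of this illustration.

```python
def matmul_tensor(n):
    """Matrix multiplication tensor: sum over i, j, k of e_ij (x) e_jk (x) e_ki.

    Each basis vector e_ij is indexed by the pair (i, j); all nonzero entries are 1.
    """
    return {((i, j), (j, k), (k, i)): 1
            for i in range(n) for j in range(n) for k in range(n)}

def contract(tensor, A, B, n):
    """Plug A into the first leg and B into the second; return the coefficients of e_ki."""
    out = [[0] * n for _ in range(n)]
    for ((i, j), (j2, k), (k2, i2)), value in tensor.items():
        out[k2][i2] += value * A[i][j] * B[j2][k]
    return out

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
result = contract(matmul_tensor(2), A, B, 2)
```

The coefficient of $e_{ki}$ is $\sum_j A_{ij} B_{jk}$, so `result` is the transpose of the product $AB$. The tensor has $2^3 = 8$ nonzero entries, matching the 8 multiplications of the row-times-column method.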

The important message is: understanding the asymptotic spectrum of tensors $X(S)$ means understanding asymptotic restriction $\lesssim$, the asymptotic subrank $\tilde{Q}$ and the asymptotic rank $\tilde{R}$ of tensors. Of course we should now find an explicit description of $X(S)$.

Our main result regarding the asymptotic spectrum of tensors is the explicit description of an infinite family of elements in the asymptotic spectrum of all complex tensors $X(\{\text{complex } k\text{-tensors}\})$, which we call the quantum functionals (Chapter 6). Finding such an infinite family has been an open problem since the work of Strassen. Moment polytopes (studied under the name entanglement polytopes in quantum information theory [WDGC13]) play a key role here. To each tensor $t$ is associated a convex polytope $P(t)$ collecting representation-theoretic information about $t$, called the moment polytope of $t$. (See e.g. [Nes84, Bri87, WDGC13, SOK14].) The moment polytope has two important equivalent descriptions.

Quantum marginal spectra description. We begin with the description of $P(t)$ in terms of quantum marginal spectra.

Let $V$ be a (finite-dimensional) Hilbert space. In quantum information theory, a positive semidefinite hermitian operator $\rho : V \to V$ with trace one is called a density operator. The sequence of eigenvalues of a density operator $\rho$ is a probability vector. We let $\operatorname{spec}(\rho) = (p_1, \ldots, p_n)$ be the sequence of eigenvalues of $\rho$, ordered non-increasingly, $p_1 \geq \cdots \geq p_n$. Let $V_1$ and $V_2$ be Hilbert spaces. Given a density operator $\rho$ on $V_1 \otimes V_2$, the reduced density operator $\rho_1 = \operatorname{tr}_2 \rho$ is uniquely defined by the property that $\operatorname{tr}(\rho_1 X_1) = \operatorname{tr}(\rho (X_1 \otimes \mathrm{Id}_{V_2}))$ for all operators $X_1$ on $V_1$. The operator $\rho_1$ is again a density operator. The operation $\operatorname{tr}_2$ is called the partial trace over $V_2$. In an explicit form, $\rho_1$ is given by $\langle e_i, \rho_1(e_j) \rangle = \sum_{\ell} \langle e_i \otimes f_\ell, \rho(e_j \otimes f_\ell) \rangle$, where the $e_i$ form a basis of $V_1$ and the $f_i$ form an orthonormal basis of $V_2$ (the statement is independent of basis choice).

Let $V_i$ be a Hilbert space and consider the tensor product $V_1 \otimes V_2 \otimes V_3$. Associate with $t \in V_1 \otimes V_2 \otimes V_3$ the dual element $t^* := \langle t, \cdot \rangle \in (V_1 \otimes V_2 \otimes V_3)^*$. Then $\rho^t := t t^* / \langle t, t \rangle = t \langle t, \cdot \rangle / \langle t, t \rangle$ is a density operator on $V_1 \otimes V_2 \otimes V_3$. Viewing $\rho^t$ as a density operator on the regrouped space $V_1 \otimes (V_2 \otimes V_3)$, we may take the partial trace of $\rho^t$ over $V_2 \otimes V_3$ as described above. We denote the resulting density operator by $\rho^t_1 := \operatorname{tr}_{23} \rho^t$. We similarly define $\rho^t_2$ and $\rho^t_3$.
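The marginals $\rho^t_1, \rho^t_2, \rho^t_3$ can be computed directly from the entries of $t$. The sketch below is a minimal illustration that assumes $t$ is given in the standard orthonormal basis as a dictionary of nonzero amplitudes; the function name `marginal` and the two example tensors are choices made for this example. Eigenvalues are not computed, since both examples have diagonal marginals whose spectra can be read off.

```python
def marginal(t, shape, leg):
    """Reduced density matrix of t t^* / <t, t> on the tensor leg `leg` (0, 1 or 2).

    t is a dict {(i1, i2, i3): amplitude}; the other two legs are traced out
    with respect to the standard orthonormal basis.
    """
    d = shape[leg]
    norm = sum(abs(v) ** 2 for v in t.values())  # <t, t>
    rho = [[0j] * d for _ in range(d)]
    for index_a, a in t.items():
        for index_b, b in t.items():
            rest_a = index_a[:leg] + index_a[leg + 1:]
            rest_b = index_b[:leg] + index_b[leg + 1:]
            if rest_a == rest_b:  # partial trace: the traced-out indices must agree
                rho[index_a[leg]][index_b[leg]] += a * b.conjugate() / norm
    return rho

# Diagonal tensor e1 e1 e1 + e2 e2 e2 and the tensor e1 e1 e2 + e1 e2 e1 + e2 e1 e1.
diagonal = {(0, 0, 0): 1, (1, 1, 1): 1}
w = {(0, 0, 1): 1, (0, 1, 0): 1, (1, 0, 0): 1}

rho_diagonal = marginal(diagonal, (2, 2, 2), 0)
rho_w = marginal(w, (2, 2, 2), 0)
```

For the diagonal tensor every marginal is the maximally mixed state with spectrum $(1/2, 1/2)$; for the second tensor every marginal is diagonal with spectrum $(2/3, 1/3)$. Both triples of spectra are therefore points of the respective moment polytopes defined next.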

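Let $V = V_1 \otimes V_2 \otimes V_3$. Let $G = \mathrm{GL}(V_1) \times \mathrm{GL}(V_2) \times \mathrm{GL}(V_3)$ act naturally on $V$. Let $t \in V \setminus 0$. The moment polytope of $t$ is

$$P(t) := P(\overline{G \cdot t}) := \{(\operatorname{spec}(\rho^u_1), \operatorname{spec}(\rho^u_2), \operatorname{spec}(\rho^u_3)) : u \in \overline{G \cdot t} \setminus 0\}.$$

Here $\overline{G \cdot t}$ denotes the Zariski closure, or equivalently the Euclidean closure, in $V$ of the orbit $G \cdot t = \{g \cdot t : g \in G\}$.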

Representation-theoretic description. On the other hand, there is a description of $P(t)$ in terms of non-vanishing of representation-theoretic multiplicities. We do not state this description here, but stress that it is crucial for our proofs.

Quantum functionals. For any probability vector $\theta \in \mathbb{R}^k$ (i.e. $\sum_{i=1}^{k} \theta(i) = 1$ and $\theta(i) \geq 0$ for all $i \in [k]$) we define the quantum functional $F^\theta$ as an optimisation over the moment polytope:

$$F^\theta(t) := \max\Bigl\{ 2^{\sum_{i=1}^{k} \theta(i) H(x^{(i)})} : (x^{(1)}, \ldots, x^{(k)}) \in P(t) \Bigr\}.$$

Here $H(y)$ denotes Shannon entropy of the probability vector $y$. We prove that $F^\theta$ satisfies properties (a), (b), (c) and (d) for all complex $k$-tensors.

Theorem (Theorem 611) F θ isin X(complex k-tensors)


To put our result into context: Strassen in [Str91] constructed elements in the asymptotic spectrum of $S = \{\text{oblique } k\text{-tensors over } \mathbb{F}\}$ with the preorder $\leq|_S$. The set $S$ is a strict and non-generic subset of all $k$-tensors over $\mathbb{F}$. These elements we call the (Strassen) support functionals. On oblique tensors over $\mathbb{C}$, the quantum functionals and the support functionals coincide. An advantage of the support functionals over the quantum functionals is that they are defined over any field. In fact, the support functionals are “powerful enough” to reprove the result of Ellenberg–Gijswijt on cap sets [EG17]. We discuss the support functionals in Section 4.4.

1.3 Higher-order CW method

Recall that in the asymptotic restriction problem we have an obstruction direction and a construction direction. The quantum functionals and the support functionals provide obstructions. Now we look at the construction direction. Constructions are asymptotic transformations $s^{\otimes \beta n + o(n)} \geq t^{\otimes n}$. We restrict attention to the case that $t$ is a diagonal tensor $\langle r \rangle$. Constructions in this case essentially correspond to lower bounds on the asymptotic subrank $\tilde{Q}(s)$. The goal is now to construct good lower bounds on $\tilde{Q}(s)$.

Strassen solved the problem of computing the asymptotic subrank for so-called tight 3-tensors with the Coppersmith–Winograd (CW) method and the support functionals [CW90, Str91]. The CW method is combinatorial. Let us introduce the combinatorial viewpoint. Let $I_1, \ldots, I_k$ be finite sets. We call a set $D \subseteq I_1 \times \cdots \times I_k$ a diagonal if any two distinct elements $a, b \in D$ differ in all $k$ coordinates. Let $\Phi \subseteq I_1 \times \cdots \times I_k$. We call a diagonal $D \subseteq \Phi$ free if $D = \Phi \cap (D_1 \times \cdots \times D_k)$. Here $D_i = \{ a_i : a \in D \}$ is the projection of $D$ onto the $i$th coordinate. The subrank $Q(\Phi)$ of $\Phi$ is the size of the largest free diagonal $D \subseteq \Phi$. For two sets $\Phi \subseteq I_1 \times \cdots \times I_k$ and $\Psi \subseteq J_1 \times \cdots \times J_k$ we define the product $\Phi \times \Psi \subseteq (I_1 \times J_1) \times \cdots \times (I_k \times J_k)$ by $\Phi \times \Psi = \{ ((a_1, b_1), \ldots, (a_k, b_k)) : a \in \Phi, b \in \Psi \}$. The asymptotic subrank is defined as $\tilde{Q}(\Phi) = \lim_{n \to \infty} Q(\Phi^{\times n})^{1/n}$. One may think of $\Phi$ as a $k$-partite hypergraph and of a free diagonal in $\Phi$ as an induced $k$-partite matching.
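For very small sets the definitions of diagonal, free diagonal, subrank and product can be checked by exhaustive search. The following Python sketch is an illustration only: the function names are hypothetical, the search is exponential in the size of $\Phi$, and the example set `W` (three triples with a single 1 each) is chosen here for illustration and does not come from the text.

```python
from itertools import combinations, product

def is_diagonal(D):
    """Any two distinct elements differ in every coordinate."""
    return all(all(x != y for x, y in zip(a, b))
               for a, b in combinations(D, 2))

def is_free(D, Phi):
    """D equals Phi intersected with the box D_1 x ... x D_k."""
    k = len(next(iter(Phi)))
    projections = [{a[i] for a in D} for i in range(k)]
    box = set(product(*projections))
    return set(D) == (set(Phi) & box)

def subrank(Phi):
    """Size of the largest free diagonal in Phi (exhaustive search)."""
    Phi = list(Phi)
    for size in range(len(Phi), 0, -1):
        for D in combinations(Phi, size):
            if is_diagonal(D) and is_free(D, Phi):
                return size
    return 0

def product_set(Phi, Psi):
    """Product Phi x Psi, pairing up the coordinates of a and b."""
    return {tuple(zip(a, b)) for a in Phi for b in Psi}

# Illustrative example: three triples, any two of which share a coordinate.
W = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}
q1 = subrank(W)                   # largest free diagonal in W
q2 = subrank(product_set(W, W))   # largest free diagonal in W x W
```

Here `q1` is 1 and `q2` is 2, so already the second power of this small example has subrank strictly larger than the first power squared would suggest, which is the kind of growth the asymptotic subrank measures.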

How does this combinatorial version of subrank relate to the tensor version of subrank that we defined earlier? Let $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$. Expand $t$ in the standard basis, $t = \sum_{i \in [n_1] \times \cdots \times [n_k]} t_i\, e_{i_1} \otimes \cdots \otimes e_{i_k}$. Let $\mathrm{supp}(t)$ be the support of $t$ in the standard basis, $\mathrm{supp}(t) = \{ i \in [n_1] \times \cdots \times [n_k] : t_i \neq 0 \}$. Then $Q(\mathrm{supp}(t)) \leq Q(t)$.

We want to construct large free diagonals. Let $\Phi \subseteq I_1 \times \cdots \times I_k$. We call $\Phi$ tight if there are injective maps $\alpha_i : I_i \to \mathbb{Z}$ such that: if $a \in \Phi$, then $\sum_{i=1}^k \alpha_i(a_i) = 0$. For a set $X$ let $\mathcal{P}(X)$ be the set of probability distributions on $X$. For $\theta \in \mathcal{P}([k])$ let $H_\theta(\Phi) = \max_{P \in \mathcal{P}(\Phi)} \sum_{i=1}^k \theta(i) H(P_i)$, where $H(P_i)$ denotes the Shannon entropy of the $i$th marginal distribution of $P$. In [Str91] Strassen used the CW method and the support functionals to characterise the asymptotic subrank $\tilde{Q}(\Phi)$ for tight $\Phi \subseteq I_1 \times I_2 \times I_3$. He proved the following. Let $\Phi \subseteq I_1 \times I_2 \times I_3$ be tight. Then

$$\tilde{Q}(\Phi) = \min_{\theta \in \mathcal{P}([3])} 2^{H_\theta(\Phi)} = \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} 2^{H(P_i)}. \tag{1.1}$$

We study the higher-order regime $\Phi \subseteq I_1 \times \cdots \times I_k$, $k \geq 4$.

Theorem (Theorem 5.7). Let $\Phi \subseteq I_1 \times \cdots \times I_k$ be tight. Then $\tilde{Q}(\Phi)$ is lower bounded by an expression that generalizes the right-hand side of (1.1).

Stating the lower bound requires a few definitions, so we do not state it here. It is not known whether our new lower bound matches the upper bound given by quantum or support functionals.

Using Theorem 5.7 we managed to exactly determine the asymptotic subranks of several new examples. These results we in turn used to obtain upper bounds on the asymptotic rank of so-called complete graph tensors via a higher-order Strassen laser method.

1.4 Abstract asymptotic spectra

Strassen mainly studied tensors, but he developed an abstract theory of asymptotic spectra in a general setting. In the next section we apply this abstract theory to graphs. We now introduce the abstract theory. One has a semiring $S$ (think of a semiring as a ring without additive inverses) that contains $\mathbb{N}$ and a preorder $\leq$ on $S$ that (1) behaves well with respect to the semiring operations, (2) induces the natural order on $\mathbb{N}$, and (3) for any $a, b \in S$, $b \neq 0$, there is an $r \in \mathbb{N} \subseteq S$ with $a \leq r \cdot b$. We call such a preorder a Strassen preorder. The main theorem is that the asymptotic version $\lesssim$ of the Strassen preorder is characterised by the monotone semiring homomorphisms $S \to \mathbb{R}_{\geq 0}$. For $a, b \in S$, let $a \lesssim b$ if there is a sequence $(x_n) \in \mathbb{N}^{\mathbb{N}}$ with $x_n^{1/n} \to 1$ when $n \to \infty$ and $a^n \leq b^n x_n$ for all $n \in \mathbb{N}$. Let

$$X = X(S, \leq) = \{ \phi \in \mathrm{Hom}(S, \mathbb{R}_{\geq 0}) : \forall a, b \in S,\ a \leq b \Rightarrow \phi(a) \leq \phi(b) \}.$$

The set $X$ is called the asymptotic spectrum of $(S, \leq)$.

Theorem (Strassen). $a \lesssim b$ iff $\forall \phi \in X$, $\phi(a) \leq \phi(b)$.

Strassen applies this theorem to study rank and subrank of tensors. We define an abstract notion of rank $R(a) = \min\{ n \in \mathbb{N} : a \leq n \}$ and an abstract notion of subrank $Q(a) = \max\{ m \in \mathbb{N} : m \leq a \}$. We then naturally have an asymptotic rank $\tilde{R}(a) = \lim_{n \to \infty} R(a^n)^{1/n}$ and (under certain mild conditions) an asymptotic subrank $\tilde{Q}(a) = \lim_{n \to \infty} Q(a^n)^{1/n}$. In fact $\tilde{R}(a) = \inf_n R(a^n)^{1/n}$ and $\tilde{Q}(a) = \sup_n Q(a^n)^{1/n}$ by Fekete's lemma. The theorem implies the following dual characterisations.


Corollary (Section 2.8). If $a \in S$ with $a^k \geq 2$ for some $k \in \mathbb{N}$, then

$$\tilde{Q}(a) = \min_{\phi \in X} \phi(a).$$

If $a \in S$ with $\phi(a) \geq 1$ for some $\phi \in X$, then

$$\tilde{R}(a) = \max_{\phi \in X} \phi(a).$$

In Chapter 2 we will discuss the abstract theory of asymptotic spectra. We will discuss a proof of the above theorem that is obtained by integrating the proofs of Strassen in [Str88] and the proof of the Kadison–Dubois theorem of Becker and Schwarz in [BS83]. We will also discuss some basic properties of general asymptotic spectra.

1.5 The asymptotic spectrum of graphs

In the previous section we have seen the abstract theory of asymptotic spectra. We now discuss a problem in graph theory where we can apply this abstract theory. Consider a communication channel with input alphabet $\{a, b, c, d, e\}$ and output alphabet $\{1, 2, 3, 4, 5\}$. When the sender gives an input to the channel, the receiver gets an output according to the following diagram, where an outgoing arrow is picked randomly (say uniformly randomly).

[Figure: channel diagram with arrows from the inputs a, b, c, d, e to the outputs 1, 2, 3, 4, 5.]

Output 2 has an incoming arrow from $a$ and an incoming arrow from $b$. We say $a$ and $b$ are confusable, because the receiver cannot know whether $a$ or $b$ was given as an input to the channel. In this channel the pairs of inputs $\{a, b\}, \{b, c\}, \{c, d\}, \{d, e\}, \{e, a\}$ are confusable. If we restrict the input set to a subset of pairwise non-confusable letters, say $\{a, c\}$, then we can use the channel to communicate two messages with zero error. It is clear that for this channel any non-confusable set of inputs has size at most two. Can we make better use of the channel if we use the channel twice? Yes: now the input set is the set of two-letter words $\{aa, ab, ac, ad, ae, ba, bb, \ldots\}$ and we have a set of pairwise non-confusable words $\{aa, bc, ce, db, ed\}$, which has size 5. Thus “per channel use” we can send at least $\sqrt{5}$ letters. What happens if we use the channel $n$ times?
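The claim that the five two-letter words are pairwise non-confusable can be verified mechanically. The sketch below assumes, as stated above, that two letters are confusable exactly when they are equal or neighbours in the cyclic order a, b, c, d, e, and that two words are confusable when they are confusable in every position. The function names are hypothetical.

```python
from itertools import combinations

LETTERS = "abcde"

def confusable_letters(x, y):
    """Letters are confusable when equal or neighbours on the 5-cycle."""
    d = (LETTERS.index(x) - LETTERS.index(y)) % 5
    return d in (0, 1, 4)

def confusable_words(u, v):
    """Two words can be confused when every position is confusable."""
    return all(confusable_letters(x, y) for x, y in zip(u, v))

def pairwise_non_confusable(words):
    """True when no two distinct words in the list can be confused."""
    return not any(confusable_words(u, v) for u, v in combinations(words, 2))

code = ["aa", "bc", "ce", "db", "ed"]
code_is_valid = pairwise_non_confusable(code)

# No three single letters are pairwise non-confusable.
some_triple_works = any(pairwise_non_confusable(t)
                        for t in combinations(LETTERS, 3))
```

With these definitions `code_is_valid` is true and `some_triple_works` is false, matching the two statements in the text: one channel use allows at most two messages, while two uses allow five.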


The situation is concisely described by drawing the confusability graph of the channel, which has the input letters as vertices and the confusable pairs of input letters as edges. For the above channel the confusability graph is the 5-cycle $C_5$.

[Figure: the 5-cycle $C_5$ on the vertices a, b, c, d, e.]

A subset of inputs that are pairwise non-confusable corresponds to a subset of the vertices in the confusability graph that contains no edges, an independent set. The independence number of any graph $G$ is the size of the largest independent set in $G$ and is denoted by $\alpha(G)$. If $G$ is the confusability graph of some channel, then the confusability graph for using the channel $n$ times is denoted by $G^{\boxtimes n}$ (the graph product $\boxtimes$ is called the strong graph product). The question of how many letters we can send asymptotically translates to computing the limit

$$\Theta(G) = \lim_{n \to \infty} \alpha(G^{\boxtimes n})^{1/n},$$

which exists because $\alpha$ is supermultiplicative under $\boxtimes$. The parameter $\Theta(G)$ was introduced by Shannon [Sha56] and is called the Shannon capacity of the graph $G$. Computing the Shannon capacity is a nontrivial problem already for small graphs. Lovász in 1979 [Lov79] computed the value $\Theta(C_5) = \sqrt{5}$ by introducing and evaluating a new graph parameter $\vartheta$, which is now known as the Lovász theta number. Already for the 7-cycle $C_7$ the Shannon capacity is not known.
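For small graphs the independence number of a strong product can be computed exactly, which makes the supermultiplicativity of $\alpha$ concrete. The following Python sketch is an illustration under stated assumptions: graphs are stored as dictionaries of adjacency sets, the helper names are hypothetical, and the branching search is only practical for a few dozen vertices.

```python
from itertools import product

def cycle(n):
    """Adjacency sets of the n-cycle on the vertices 0..n-1."""
    return {v: {(v - 1) % n, (v + 1) % n} for v in range(n)}

def strong_product(G, H):
    """Strong product: distinct pairs are adjacent when each coordinate
    is equal or adjacent."""
    def close(A, x, y):
        return x == y or y in A[x]
    vertices = list(product(G, H))
    return {(g, h): {(g2, h2) for (g2, h2) in vertices
                     if (g2, h2) != (g, h)
                     and close(G, g, g2) and close(H, h, h2)}
            for (g, h) in vertices}

def independence_number(G, candidates=None):
    """Exact alpha(G) by branching on whether a vertex is used."""
    if candidates is None:
        candidates = frozenset(G)
    if not candidates:
        return 0
    v = next(iter(candidates))
    without_v = independence_number(G, candidates - {v})
    with_v = 1 + independence_number(G, candidates - {v} - G[v])
    return max(without_v, with_v)

C5 = cycle(5)
alpha_1 = independence_number(C5)
alpha_2 = independence_number(strong_product(C5, C5))
```

Here `alpha_1` is 2 and `alpha_2` is 5, so $\alpha(C_5^{\boxtimes 2})^{1/2} = \sqrt{5} > \alpha(C_5)$, consistent with the value of $\Theta(C_5)$ quoted above.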

Duality theorem. We propose a new application of the abstract theory of asymptotic spectra to graph theory. The main theorem that results from this is a dual characterisation of the Shannon capacity of graphs. For graphs $G$ and $H$ we say $G \leq H$ if there is a graph homomorphism $\overline{G} \to \overline{H}$, i.e. from the complement of $G$ to the complement of $H$. We show graphs are a semiring under the strong graph product $\boxtimes$ and the disjoint union $\sqcup$, and $\leq$ is a Strassen preorder on this semiring. The rank in this setting is the clique cover number $\overline{\chi}(\cdot) = \chi(\overline{\,\cdot\,})$, i.e. the chromatic number of the complement. The subrank in this setting is the independence number $\alpha(\cdot)$. Let $X(\mathcal{G})$ be the set of semiring homomorphisms from graphs to $\mathbb{R}_{\geq 0}$ that are monotone under $\leq$. From the abstract theory of asymptotic spectra we derive the following duality theorem.

Theorem (Theorem 3.1). $\Theta(G) = \min_{\phi \in X(\mathcal{G})} \phi(G)$.

In Chapter 3 we will prove Theorem 3.1 and we will discuss the known elements in $X(\mathcal{G})$, which are the Lovász theta number and a family of parameters obtained by “fractionalising”.


1.6 Tensor degeneration

We move to the second story line that we mentioned earlier: degeneration. Degeneration is a prominent theme in algebraic complexity theory. Roughly speaking, degeneration is an algebraic notion of approximation defined via orbit closures.

For tensors, for example, degeneration is defined as follows. Let $V_1, V_2, V_3$ be finite-dimensional complex vector spaces and let $V = V_1 \otimes V_2 \otimes V_3$ be the tensor product space. Let $G = \mathrm{GL}(V_1) \times \mathrm{GL}(V_2) \times \mathrm{GL}(V_3)$ act naturally on $V$. Let $s, t \in V$. Let $G \cdot t = \{ g \cdot t : g \in G \}$ be the orbit of $t$ under $G$. We say $t$ degenerates to $s$, and write $t \unrhd s$, if $s$ is an element in the orbit closure $\overline{G \cdot t}$. Here the closure is taken with respect to the Zariski topology or, equivalently, with respect to the Euclidean topology. One should think of this degeneration $\unlhd$ as a topologically closed version of the restriction preorder $\leq$ for tensors that we defined earlier. Degeneration is a “larger” preorder than restriction in the sense that $s \leq t$ implies $s \unlhd t$.

In several algebraic models of computation, approximative computations correspond to certain degenerations. In some models, such an approximative computation can be turned into an exact computation at a small cost, for example using the method of interpolation. The currently fastest matrix multiplication algorithms, for example, are constructed in this way.

On the other hand, it turns out that if a lower bound technique for an algebraic measure of complexity is “continuous”, then the lower bounds obtained with this technique are already lower bounds on the approximative version of the complexity measure. This observation turns approximative complexity and degeneration into an interesting topic itself. A research program in this direction is the geometric complexity theory program of Mulmuley and Sohoni towards separating the algebraic complexity class VP (and related classes) from VNP [MS01] (see also [Ike13]).

In this section we briefly discuss three results related to degeneration of tensors that are not discussed further in this dissertation. Then we will discuss results on combinatorial degeneration in Section 1.7 and algebraic branching program degeneration in Section 1.8.

Ratio of tensor rank and border rank. The approximative or degeneration version of tensor rank is called border rank and is denoted by $\underline{R}$. It has been known since the work of Bini and Strassen that tensor rank $R$ and border rank $\underline{R}$ are different. How much can they be different? In [Zui17] we showed the following lower bound. Let $k \geq 3$. There is a sequence of $k$-tensors $t_n$ in $(\mathbb{C}^{2^n})^{\otimes k}$ such that $R(t_n) / \underline{R}(t_n) \geq k - o(1)$ when $n \to \infty$. This answers a question of Landsberg and Michałek [LM16b] and disproves a conjecture of Rhodes [AJRS13]. Further progress will most likely require the construction of explicit tensors with high tensor rank, which has implications in formula complexity [Raz13].

Border support rank. Support rank is a variation on tensor rank, which has its own approximative version called border support rank. A border support rank upper bound for the matrix multiplication tensor yields an upper bound on the asymptotic complexity. This was shown by Cohn and Umans in the context of the group theoretic approach towards fast matrix multiplication [CU13]. They asked: what is the border support rank of the smallest matrix multiplication tensor $\langle 2, 2, 2 \rangle$? In [BCZ17a] we showed that it equals seven. Our proof uses the highest-weight vector technique (see also [HIL13]). Our original motivation to study support rank is a connection that we found between support rank and nondeterministic multiparty quantum communication complexity [BCZ17b].

Tensor rank under outer tensor product. We applied degeneration as a tool to study an outer tensor product $\mathbin{\overline{\otimes}}$ on tensors. For $s \in \mathbb{C}^{n_1} \otimes \cdots \otimes \mathbb{C}^{n_k}$ and $t \in \mathbb{C}^{m_1} \otimes \cdots \otimes \mathbb{C}^{m_\ell}$ let $s \mathbin{\overline{\otimes}} t$ be the natural $(k + \ell)$-tensor in $\mathbb{C}^{n_1} \otimes \cdots \otimes \mathbb{C}^{n_k} \otimes \mathbb{C}^{m_1} \otimes \cdots \otimes \mathbb{C}^{m_\ell}$. The products $\otimes$ and $\overline{\otimes}$ differ by a regrouping of the tensor indices. It is well known that tensor rank is not multiplicative under $\otimes$. In [CJZ18] we showed that tensor rank is already not multiplicative under $\overline{\otimes}$, a stronger result. Nonmultiplicativity occurs when taking a power of a tensor whose border rank is strictly smaller than its tensor rank. This answers a question of Draisma [Dra15] and Saptharishi et al. [CKSV16].

1.7 Combinatorial degeneration

In the previous section we introduced the general idea of degeneration and discussed degeneration of tensors. Combinatorial degeneration is the combinatorial analogue of tensor degeneration. Consider sets $\Phi \subseteq \Psi \subseteq I_1 \times \cdots \times I_k$ of $k$-tuples. We say $\Phi$ is a combinatorial degeneration of $\Psi$, and write $\Psi \unrhd \Phi$, if there are maps $u_i : I_i \to \mathbb{Z}$ such that for all $\alpha \in I_1 \times \cdots \times I_k$: if $\alpha \in \Psi \setminus \Phi$, then $\sum_{i=1}^k u_i(\alpha_i) > 0$, and if $\alpha \in \Phi$, then $\sum_{i=1}^k u_i(\alpha_i) = 0$. We prove that combinatorial asymptotic subrank is nonincreasing under combinatorial degeneration.

Theorem (Theorem 5.21). If $\Psi \unrhd \Phi$, then $\tilde{Q}(\Psi) \geq \tilde{Q}(\Phi)$.

The analogous statement for subrank of tensors is trivially true. The crucial point is that Theorem 5.21 is about combinatorial subrank. As an example, Theorem 5.21 combined with the CW method yields an elegant optimal construction of tri-colored sum-free sets, which are combinatorial objects related to cap sets.
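The defining condition of combinatorial degeneration is easy to check once candidate maps $u_i$ are given. The Python sketch below does this for a small example chosen purely for illustration (it does not appear in the text): $k = 2$, $\Psi$ is the set of pairs $(a, b)$ with $a \leq b$, $\Phi$ is the diagonal, and $u_1(a) = -a$, $u_2(b) = b$. The function name is hypothetical, and the sketch only verifies given maps; it does not search for them.

```python
def is_combinatorial_degeneration(Psi, Phi, u):
    """Check the defining condition for given maps u = (u_1, ..., u_k).

    Each u_i is a dict from I_i to the integers. Only elements of Psi
    are checked, since the condition constrains nothing outside Psi.
    """
    Psi, Phi = set(Psi), set(Phi)
    if not Phi <= Psi:
        return False

    def weight(alpha):
        return sum(u_i[x] for u_i, x in zip(u, alpha))

    return (all(weight(alpha) == 0 for alpha in Phi)
            and all(weight(alpha) > 0 for alpha in Psi - Phi))

n = 4
Psi = {(a, b) for a in range(n) for b in range(n) if a <= b}
Phi = {(a, a) for a in range(n)}

# u_1(a) = -a and u_2(b) = b give weight b - a: zero on Phi, positive off it.
u = [{a: -a for a in range(n)}, {b: b for b in range(n)}]
good = is_combinatorial_degeneration(Psi, Phi, u)

# The zero maps fail, because elements of Psi outside Phi get weight 0.
zero = [{a: 0 for a in range(n)}, {b: 0 for b in range(n)}]
bad = is_combinatorial_degeneration(Psi, Phi, zero)
```

In this example `good` is true and `bad` is false, so the diagonal is a combinatorial degeneration of the upper-triangular set, witnessed by the first pair of maps.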

1.8 Algebraic branching program degeneration

We now consider degeneration in the context of algebraic branching programs. A central theme in algebraic complexity theory is the study of the power of different algebraic models of computation and the study of the corresponding complexity classes. We have already (implicitly) used an algebraic model of computation when we discussed matrix multiplication: circuits.


• A circuit is a directed acyclic graph $G$ with one or more source vertices and one sink vertex. Each source vertex is labelled by a variable $x_i$ or a constant $\alpha \in \mathbb{F}$. The other vertices are labelled by either $+$ or $\times$ and have in-degree 2 (that is, fan-in 2). Each vertex of $G$ naturally computes a polynomial. The value of $G$ is the element computed at the sink vertex. The size of $G$ is the number of vertices. (One may also allow multiple sink vertices in order to compute multiple polynomials, e.g. to compute matrix multiplication.) Here is an example of a circuit computing $xy + 2x + y - 1$.

[Figure: a circuit with source vertices $-1$, $2$, $x$, $y$, two $\times$ vertices, two $+$ vertices, and a $+$ vertex as the sink vertex.]

Consider the following two models.

• A formula is a circuit whose graph is a tree.

• An algebraic branching program (abp) is a directed acyclic graph $G$ with one source vertex $s$, one sink vertex $t$ and affine linear forms over the base field $\mathbb{F}$ as edge labels. Moreover, each vertex is labeled with an integer (its layer) and the arrows in the abp point from vertices in layer $i$ to vertices in layer $i + 1$. The cardinality of the largest layer we call the width of the abp. The number of vertices we call the size of the abp. The value of an abp is the sum of the values of all $s$–$t$-paths, where the value of an $s$–$t$-path is the product of its edge labels. We say that an abp computes its value. Here is an example of a width-3 abp computing $xy + 2x + y - 1$.

[Figure: a width-3 abp from $s$ to $t$ with edge labels built from $x$, $y$, $2$ and $-1$.]
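The definition of the value of an abp (sum over all $s$–$t$-paths of the product of the edge labels) can be evaluated layer by layer. The Python sketch below is an illustration only. It does not reproduce the abp of the figure; instead it uses a hypothetical width-2 abp built from the identity $(x + 1)(y + 2) - 3 = xy + 2x + y - 1$, with hypothetical vertex names `p` and `q` and affine forms stored as a constant plus a dictionary of coefficients.

```python
from fractions import Fraction

def abp_value(layers, edges, point):
    """Sum over all s-t paths of the product of the edge labels.

    layers: list of lists of vertex names; the first layer is [s] and the
    last is [t]. edges: dict mapping (u, v) to an affine linear form,
    stored as (constant, {variable: coefficient}).
    """
    def evaluate(form):
        constant, coeffs = form
        return constant + sum(c * point[var] for var, c in coeffs.items())

    # value[v] = sum of the path values from s to v, built layer by layer.
    value = {layers[0][0]: Fraction(1)}
    for previous, current in zip(layers, layers[1:]):
        for v in current:
            value[v] = sum((value[u] * evaluate(edges[(u, v)])
                            for u in previous if (u, v) in edges),
                           Fraction(0))
    return value[layers[-1][0]]

# Hypothetical width-2 abp for (x + 1)(y + 2) - 3 = xy + 2x + y - 1.
layers = [["s"], ["p", "q"], ["t"]]
edges = {
    ("s", "p"): (1, {"x": 1}),   # x + 1
    ("p", "t"): (2, {"y": 1}),   # y + 2
    ("s", "q"): (-3, {}),        # constant -3
    ("q", "t"): (1, {}),         # constant 1
}

value_at_2_5 = abp_value(layers, edges, {"x": 2, "y": 5})
```

Evaluating at $x = 2$, $y = 5$ gives $10 + 4 + 5 - 1 = 18$. The layer-by-layer accumulation is the same computation as multiplying the matrices of edge labels between consecutive layers.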


The above models of computation give rise to complexity classes. A complexity class consists of families of multivariate polynomials $(f_n)_n = (f(x_1, \ldots, x_{q_n})_n)_{n \in \mathbb{N}}$ over some fixed field $\mathbb{F}$. We say a family of polynomials $(f_n)_n$ is a p-family if the degree of $f_n$ and the number of variables of $f_n$ grow polynomially in $n$. Let VP be the class of p-families with polynomially bounded circuit size. Let $\mathrm{VP}_e$ be the class of p-families with polynomially bounded formula size. For $k \in \mathbb{N}$ let $\mathrm{VP}_k$ be the class of families of polynomials computable by width-$k$ abps of polynomially bounded size. Let $\mathrm{VP}_s$ be the class of p-families computable by skew circuits of polynomial size. Skew circuits are a type of circuits between formulas and general circuits. The class $\mathrm{VP}_s$ coincides with the class of families of polynomials computable by abps of polynomially bounded size (see e.g. [Sap16]). Ben-Or and Cleve proved that $\mathrm{VP}_3 = \mathrm{VP}_4 = \cdots = \mathrm{VP}_e$ [BOC92]. Allender and Wang proved $\mathrm{VP}_2 \subsetneq \mathrm{VP}_3$ [AW16]. Thus $\mathrm{VP}_2 \subsetneq \mathrm{VP}_3 = \mathrm{VP}_4 = \cdots = \mathrm{VP}_e \subseteq \mathrm{VP}_s$. The following separation problem is one of the many open problems regarding algebraic complexity classes: is the inclusion $\mathrm{VP}_e \subseteq \mathrm{VP}_s$ strict? Motivated by this separation problem we study the approximation closure of $\mathrm{VP}_e$. We mentioned that Ben-Or and Cleve proved that formula size is polynomially equivalent to width-3 abp size [BOC92]. Regarding width 2, there are explicit polynomials that cannot be computed by any width-2 abp of any size [AW16]. The abp model has a natural notion of approximation. When we allow approximation in our abps, the situation changes completely.

Theorem (Theorem 7.8). Any polynomial can be approximated by a width-2 abp of size polynomial in the formula size.

In terms of complexity classes this means $\overline{\mathrm{VP}_2} = \overline{\mathrm{VP}_e}$, where $\overline{\,\cdot\,}$ denotes the “approximation closure” of the complexity class. The theorem suggests an approach regarding the separation of $\mathrm{VP}_e$ and $\mathrm{VP}_s$. Namely, superpolynomial lower bounds on formula size may be obtained from superpolynomial lower bounds on approximate width-2 abp size. We moreover study the nondeterminism closure of complexity classes and prove a new characterisation of the complexity class VNP.

1.9 Organisation

This dissertation is divided into chapters as follows. We will begin with the abstract theory of asymptotic spectra in Chapter 2. Then we introduce the asymptotic spectra of graphs and a new characterisation of the Shannon capacity in Chapter 3. In Chapter 4 we introduce the asymptotic spectrum of tensors, discuss the support functionals of Strassen for oblique tensors and a characterisation of asymptotic slice rank of oblique tensors as the minimum over the support functionals. In Chapter 5 we discuss tight tensors, the higher-order Coppersmith–Winograd method, the combinatorial degeneration method and applications to the cap set problem, type sets and graph tensors. In Chapter 6 we introduce an infinite family of elements in the asymptotic spectrum of complex $k$-tensors and characterise the asymptotic slice rank as the minimum over the quantum functionals. Finally, in Chapter 7 we study algebraic branching programs, and approximation closure and nondeterminism closure of algebraic complexity classes.

Chapter 2

The theory of asymptotic spectra

2.1 Introduction

This is an expository chapter about the abstract theory of asymptotic spectra of Volker Strassen [Str88]. The theory studies semirings $S$ that are endowed with a preorder $\leq$. The main result, Theorem 2.12, is that under certain conditions the asymptotic version $\lesssim$ of this preorder is characterised by the semiring homomorphisms $S \to \mathbb{R}_{\geq 0}$ that are monotone under $\leq$. These monotone homomorphisms make up the “asymptotic spectrum” of $(S, \leq)$. For the elements of $S$ we have natural notions of rank and subrank, generalising rank and subrank of tensors. The asymptotic spectrum gives a dual characterisation of the asymptotic versions of rank and subrank. This dual description may be thought of as a “lower bound” method in the sense of computational complexity theory. In Chapter 3 and Chapter 4 we will study two specific pairs $(S, \leq)$.

2.2 Semirings and preorders

A (commutative) semiring is a set $S$ with a binary addition operation $+$, a binary multiplication operation $\cdot$, and elements $0, 1 \in S$, such that for all $a, b, c \in S$:

(1) $+$ is associative: $(a + b) + c = a + (b + c)$

(2) $+$ is commutative: $a + b = b + a$

(3) $0 + a = a$

(4) $\cdot$ is associative: $(a \cdot b) \cdot c = a \cdot (b \cdot c)$

(5) $\cdot$ is commutative: $a \cdot b = b \cdot a$

(6) $1 \cdot a = a$

(7) $\cdot$ distributes over $+$: $a \cdot (b + c) = (a \cdot b) + (a \cdot c)$

(8) $0 \cdot a = 0$

As usual we abbreviate $a \cdot b$ as $ab$. A preorder is a relation $\preccurlyeq$ on a set $X$ such that for all $a, b, c \in X$:

(1) $\preccurlyeq$ is reflexive: $a \preccurlyeq a$

(2) $\preccurlyeq$ is transitive: $a \preccurlyeq b$ and $b \preccurlyeq c$ implies $a \preccurlyeq c$

As usual $a \preccurlyeq b$ is the same as $b \succcurlyeq a$. Let $\mathbb{N} = \{0, 1, 2, \ldots\}$ be the set of natural numbers and let $\mathbb{N}_{>0} = \{1, 2, \ldots\}$ be the set of strictly-positive natural numbers. We write $\leq$ for the natural order $0 \leq 1 \leq 2 \leq 3 \leq \cdots$ on $\mathbb{N}$.

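2.3 Strassen preorders

Let $S$ be a semiring with $\mathbb{N} \subseteq S$. A preorder $\preccurlyeq$ on $S$ is a Strassen preorder if

(1) $\forall n, m \in \mathbb{N}$: $n \leq m$ iff $n \preccurlyeq m$

(2) $\forall a, b, c, d \in S$: if $a \preccurlyeq b$ and $c \preccurlyeq d$, then $a + c \preccurlyeq b + d$ and $ac \preccurlyeq bd$

(3) $\forall a, b \in S$, $b \neq 0$, $\exists r \in \mathbb{N}$: $a \preccurlyeq rb$

Note that condition (2) is equivalent to the condition: $\forall a, b, s \in S$, if $a \preccurlyeq b$, then $a + s \preccurlyeq b + s$ and $as \preccurlyeq bs$.

Let $\preccurlyeq$ be a Strassen preorder on $S$. Then $0 \preccurlyeq 1$ by condition (1). For $a \in S$ we have $a \preccurlyeq a$ by reflexivity and thus $0 \preccurlyeq a$ by condition (2).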

Examples

We give two examples of a semiring with a Strassen preorder. Proofs and formal definitions are given later.

Graphs. Let $S$ be the set of all (isomorphism classes of) finite simple graphs. Let $G, H \in S$. Let $G \sqcup H$ be the disjoint union of $G$ and $H$. Let $G \boxtimes H$ be the strong graph product of $G$ and $H$ (see Chapter 3). With addition $\sqcup$ and multiplication $\boxtimes$ the set $S$ becomes a semiring. The 0 in $S$ is the graph with no vertices and the 1 in $S$ is the graph with a single vertex. Let $\overline{G}$ be the complement of $G$. Define a preorder $\leq$ on $S$ by $G \leq H$ if there is a graph homomorphism $\overline{G} \to \overline{H}$. Then $\leq$ is a Strassen preorder. We will investigate this semiring further in Chapter 3.


Tensors. Let $\mathbb{F}$ be a field. Let $k \in \mathbb{N}$. Let $S$ be the set of all $k$-tensors over $\mathbb{F}$ with arbitrary format, that is, $S = \bigcup \{ \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k} : n_1, \ldots, n_k \in \mathbb{N} \}$. For $s \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ and $t \in \mathbb{F}^{m_1} \otimes \cdots \otimes \mathbb{F}^{m_k}$ let $s \leq t$ if there are linear maps $A_i : \mathbb{F}^{m_i} \to \mathbb{F}^{n_i}$ with $(A_1 \otimes \cdots \otimes A_k) t = s$. We identify any $s, t \in S$ for which $s \leq t$ and $t \leq s$. Let $\oplus$ be the direct sum of $k$-tensors and let $\otimes$ be the tensor product of $k$-tensors (see Chapter 4). With addition $\oplus$ and multiplication $\otimes$ the set $S$ becomes a semiring. The 0 in $S$ is the zero tensor and the 1 in $S$ is the standard basis element $e_1 \otimes \cdots \otimes e_1 \in \mathbb{F}^1 \otimes \cdots \otimes \mathbb{F}^1$. The preorder $\leq$ is a Strassen preorder. We will investigate this semiring further in Chapter 4, Chapter 5 and Chapter 6.

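2.4 Asymptotic preorders $\precsim$

Definition 2.1. Let $\preccurlyeq$ be a relation on $S$. Define the relation $\precsim$ on $S$ by

$$a_2 \precsim a_1 \ \text{ if } \ \exists (x_N) \in \mathbb{N}^{\mathbb{N}},\ \inf_N x_N^{1/N} = 1,\ \forall N \in \mathbb{N}\ \ a_2^N \preccurlyeq a_1^N x_N. \tag{2.1}$$

If $\preccurlyeq$ is a Strassen preorder, then we may in (2.1) replace the infimum $\inf_N x_N^{1/N}$ by the limit $\lim_{N \to \infty} x_N^{1/N}$, since we may assume $x_{N+M} \leq x_N x_M$ (if $a_2^N \preccurlyeq a_1^N x_N$ and $a_2^M \preccurlyeq a_1^M x_M$, then $a_2^{N+M} \preccurlyeq a_1^{N+M} x_N x_M$) and then apply Fekete's lemma (Lemma 2.2).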

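Lemma 2.2 (Fekete's lemma, see [PS98, No. 98]). Let $x_1, x_2, x_3, \ldots \in \mathbb{R}_{\geq 0}$ satisfy $x_{n+m} \leq x_n + x_m$. Then $\lim_{n \to \infty} x_n / n = \inf_n x_n / n$.

Proof. Let $y = \inf_n x_n / n$. Let $\varepsilon > 0$. Let $m \in \mathbb{N}_{>0}$ with $x_m / m < y + \varepsilon$. Any $n \in \mathbb{N}$ can be written in the form $n = qm + r$, where $r$ is an integer $0 \leq r \leq m - 1$. Set $x_0 = 0$. Then $x_n = x_{qm + r} \leq x_m + x_m + \cdots + x_m + x_r = q x_m + x_r$. Therefore

$$\frac{x_n}{n} = \frac{x_{qm+r}}{qm + r} \leq \frac{q x_m + x_r}{qm + r} = \frac{x_m}{m} \cdot \frac{qm}{qm + r} + \frac{x_r}{n}.$$

Thus

$$y \leq \frac{x_n}{n} < (y + \varepsilon) \frac{qm}{n} + \frac{x_r}{n}.$$

The claim follows because $x_r / n \to 0$ and $qm / n \to 1$ when $n \to \infty$.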
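Fekete's lemma can be illustrated numerically on a concrete subadditive sequence. The Python sketch below uses $x_n = n + \sqrt{n}$, an example chosen here for illustration (it is subadditive because $\sqrt{n + m} \leq \sqrt{n} + \sqrt{m}$); the small tolerance in the subadditivity check only guards against floating-point rounding.

```python
import math

def x(n):
    """A subadditive sequence: x(n + m) <= x(n) + x(m)."""
    return n + math.sqrt(n)

# Check subadditivity on a finite range (with a rounding tolerance).
subadditive = all(x(n + m) <= x(n) + x(m) + 1e-9
                  for n in range(1, 60) for m in range(1, 60))

# The ratios x(n)/n = 1 + 1/sqrt(n) decrease towards their infimum 1.
ratios = [x(n) / n for n in range(1, 10001)]
last_ratio = ratios[-1]
smallest_ratio = min(ratios)
```

On this range the smallest ratio is the last one, $x_{10000} / 10000 = 1.01$, and the ratios keep approaching the infimum 1, in line with the lemma. In (2.1) the lemma is applied in multiplicative form, that is, to the sequence $\log x_N$.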

For $a_1, a_2 \in S$, if $a_1 \preccurlyeq a_2$, then clearly $a_1 \precsim a_2$.

Lemma 2.3. Let $\preccurlyeq$ be a Strassen preorder on $S$. Then $\precsim$ is a Strassen preorder on $S$, the “asymptotic preorder” corresponding to $\preccurlyeq$.

Proof. Let $a, b, c, d \in S$. We verify that $\precsim$ is a preorder. First, reflexivity. We have $a \preccurlyeq a$, so $a^N \preccurlyeq a^N \cdot 1$, so $a \precsim a$.

Second, transitivity. Let $a \precsim b$ and $b \precsim c$. This means $a^N \preccurlyeq b^N x_N$ and $b^N \preccurlyeq c^N y_N$ with $x_N^{1/N} \to 1$ and $y_N^{1/N} \to 1$. Then $a^N \preccurlyeq b^N x_N \preccurlyeq c^N x_N y_N$. Since $(x_N y_N)^{1/N} \to 1$, we conclude $a \precsim c$.

We verify condition (1). Let $n, m \in \mathbb{N}$. If $n \leq m$, then $n \preccurlyeq m$, so $n \precsim m$. If $n \precsim m$, then $n^N \preccurlyeq m^N x_N$, so $n^N \leq m^N x_N$, which implies $n \leq m$.

We verify condition (2). Let $a \precsim b$ and $c \precsim d$. This means $a^N \preccurlyeq b^N x_N$ and $c^N \preccurlyeq d^N y_N$. Thus $a^N c^N \preccurlyeq b^N d^N x_N y_N$, and so $ac \precsim bd$. Assume $x_N$ and $y_N$ are nondecreasing (otherwise set $x_N = \max_{n \leq N} x_n$). Then

$$(a + c)^N = \sum_{m=0}^{N} \binom{N}{m} a^m c^{N-m} \preccurlyeq \sum_{m=0}^{N} \binom{N}{m} b^m d^{N-m} x_m y_{N-m} \preccurlyeq \sum_{m=0}^{N} \binom{N}{m} b^m d^{N-m} x_N y_N = (b + d)^N x_N y_N.$$

Thus $a + c \precsim b + d$.

We verify (3). Let $a, b \in S$, $b \neq 0$. Then there is an $r \in \mathbb{N}$ with $a \preccurlyeq rb$ and thus $a \precsim rb$.

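Lemma 2.4. Let $\preccurlyeq$ be a Strassen preorder on $S$. Let $a_1, a_2, b \in S$.

(i) If $a_2 + b \preccurlyeq a_1 + b$, then $a_2 \precsim a_1$.

(ii) If $a_2 b \preccurlyeq a_1 b$ with $b \neq 0$, then $a_2 \precsim a_1$.

(iii) If $a_2 \mathrel{\underset{\sim}{\precsim}} a_1$, then $a_2 \precsim a_1$.

(iv) If $\exists s \in S$ $\forall n \in \mathbb{N}$ $n a_2 \preccurlyeq n a_1 + s$, then $a_2 \precsim a_1$.

Proof. (ii) Let $a_2 b \preccurlyeq a_1 b$. By an inductive argument similar to the argument we used to prove (2.4),

$$\forall N \in \mathbb{N} \quad a_2^N b \preccurlyeq a_1^N b. \tag{2.2}$$

Let $m, r \in \mathbb{N}$ with $1 \preccurlyeq mb \preccurlyeq r$. (We use $b \neq 0$.) From (2.2) follows

$$\forall N \in \mathbb{N} \quad a_2^N \preccurlyeq a_2^N m b \preccurlyeq a_1^N m b \preccurlyeq a_1^N r.$$

Thus we conclude $a_2 \precsim a_1$.

(iii) Let $a_2 \mathrel{\underset{\sim}{\precsim}} a_1$. This means $a_2^N \precsim a_1^N x_N$ with $x_N^{1/N} \to 1$. This in turn means that $(a_2^N)^M \preccurlyeq (a_1^N x_N)^M y_{N,M}$ with $\forall N \in \mathbb{N}$ $y_{N,M}^{1/M} \to 1$ when $M \to \infty$, that is,

$$a_2^{NM} \preccurlyeq a_1^{NM} x_N^M y_{N,M}.$$

Choose a sequence $N \mapsto M_N$ such that $(y_{N,M_N})^{1/M_N} \leq 2$, e.g. given $N$ let $M_N$ be the smallest $M$ for which $(y_{N,M})^{1/M} \leq 2$. Then $a_2^{N M_N} \preccurlyeq a_1^{N M_N} x_N^{M_N} y_{N,M_N}$ and

$$(x_N^{M_N} y_{N,M_N})^{1/(N M_N)} = x_N^{1/N} (y_{N,M_N})^{1/(N M_N)} \leq x_N^{1/N} 2^{1/N} \to 1.$$

We conclude $a_2 \precsim a_1$.

(iv) Let $s \in S$ with $\forall n \in \mathbb{N}$ $n a_2 \preccurlyeq n a_1 + s$. We may assume $a_1 \neq 0$. Let $k \in \mathbb{N}$ with $s \preccurlyeq k a_1$. Then

$$\forall n \in \mathbb{N} \quad k n a_2 \preccurlyeq k n a_1 + k a_1 = k a_1 (n + 1). \tag{2.3}$$

Apply (ii) to (2.3) to get

$$\forall n \in \mathbb{N} \quad a_2 n \precsim a_1 (n + 1).$$

By an inductive argument,

$$\forall N \in \mathbb{N} \quad a_2^N \precsim a_2^{N-1} a_1 \cdot 2 \precsim a_2^{N-2} a_1^2 \cdot 3 \precsim \cdots \precsim a_1^N (N + 1).$$

Since $(N + 1)^{1/N} \to 1$, $a_2 \mathrel{\underset{\sim}{\precsim}} a_1$. From (iii) follows $a_2 \precsim a_1$.

(i) Let $a_2 + b \preccurlyeq a_1 + b$. We first prove

$$\forall q \in \mathbb{N} \quad q a_2 + b \preccurlyeq q a_1 + b. \tag{2.4}$$

By assumption, the statement is true for $q = 1$. Suppose the statement is true for $q - 1$. Then

$$q a_2 + b = (q - 1) a_2 + (a_2 + b) \preccurlyeq (q - 1) a_2 + (a_1 + b) = ((q - 1) a_2 + b) + a_1 \preccurlyeq ((q - 1) a_1 + b) + a_1 = q a_1 + b,$$

which proves the statement by induction. Then $\forall n \in \mathbb{N}$ $n a_2 \preccurlyeq n a_1 + b$. From (iv) follows $a_2 \precsim a_1$.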

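2.5 Maximal Strassen preorders

Let $\mathcal{P}$ be the set of Strassen preorders on $S$. For $\preccurlyeq_1, \preccurlyeq_2 \in \mathcal{P}$ we write $\preccurlyeq_2 \subseteq \preccurlyeq_1$ if for all $a, b \in S$: $a \preccurlyeq_2 b$ implies $a \preccurlyeq_1 b$. (The notation $\preccurlyeq_2 \subseteq \preccurlyeq_1$ is natural if we think of the relations $\preccurlyeq_i$ as sets of pairs $(a, b)$ with $a \preccurlyeq_i b$.)

Lemma 2.5. Let $\preccurlyeq \in \mathcal{P}$ with $\preccurlyeq = \precsim$ and $a_2 \not\preccurlyeq a_1$. Then there is an element $\preccurlyeq_{a_1 a_2} \in \mathcal{P}$ with $\preccurlyeq \subseteq \preccurlyeq_{a_1 a_2}$ and $a_1 \preccurlyeq_{a_1 a_2} a_2$.

Proof. For $x_1, x_2 \in S$, let

$$x_1 \preccurlyeq_{a_1 a_2} x_2 \ \text{ if } \ \exists s \in S \ \ x_1 + s a_2 \preccurlyeq x_2 + s a_1.$$

The relation $\preccurlyeq_{a_1 a_2}$ is reflexive, since $x + 0 \cdot a_2 \preccurlyeq x + 0 \cdot a_1$. The relation $\preccurlyeq_{a_1 a_2}$ is transitive: if $x_1 \preccurlyeq_{a_1 a_2} x_2$ and $x_2 \preccurlyeq_{a_1 a_2} x_3$, then $x_1 + s a_2 \preccurlyeq x_2 + s a_1$ and $x_2 + t a_2 \preccurlyeq x_3 + t a_1$ for some $s, t \in S$, and so $x_1 + (t + s) a_2 \preccurlyeq x_2 + t a_2 + s a_1 \preccurlyeq x_3 + t a_1 + s a_1 = x_3 + (t + s) a_1$. Thus $x_1 \preccurlyeq_{a_1 a_2} x_3$. We conclude that $\preccurlyeq_{a_1 a_2}$ is a preorder on $S$.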


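We prove that $\preccurlyeq_{a_1 a_2}$ is a Strassen preorder. If $x_1 \preccurlyeq_{a_1 a_2} x_2$ and $y_1 \preccurlyeq_{a_1 a_2} y_2$, then clearly $x_1 + y_1 \preccurlyeq_{a_1 a_2} x_2 + y_2$. If $x_1 \preccurlyeq_{a_1 a_2} x_2$ and $y \in S$, then $x_1 y \preccurlyeq_{a_1 a_2} x_2 y$. From this follows: if $x_1 \preccurlyeq_{a_1 a_2} x_2$ and $y_1 \preccurlyeq_{a_1 a_2} y_2$, then $x_1 y_1 \preccurlyeq_{a_1 a_2} x_2 y_1 \preccurlyeq_{a_1 a_2} x_2 y_2$.

Let $n, m \in \mathbb{N}$. If $n \leq m$, then $n \preccurlyeq m$, so $n \preccurlyeq_{a_1 a_2} m$. If $n \not\leq m$, then $n \geq m + 1$. Suppose $n \preccurlyeq_{a_1 a_2} m$. Let $s \in S$ with $n + s a_2 \preccurlyeq m + s a_1$. Adding $m + 1 \preccurlyeq n$ gives

$$m + 1 + n + s a_2 \preccurlyeq n + m + s a_1.$$

Since $\preccurlyeq = \precsim$, we may apply Lemma 2.4 (i) to obtain

$$1 + s a_2 \preccurlyeq s a_1. \tag{2.5}$$

From (2.5) follows $s \neq 0$. From (2.5) also follows

$$s a_2 \preccurlyeq s a_1. \tag{2.6}$$

Since $\preccurlyeq = \precsim$, we may apply Lemma 2.4 (ii) to (2.6) to obtain the contradiction

$$a_2 \preccurlyeq a_1.$$

Therefore $n \not\preccurlyeq_{a_1 a_2} m$. We conclude that $\preccurlyeq_{a_1 a_2}$ is a Strassen preorder, that is, $\preccurlyeq_{a_1 a_2} \in \mathcal{P}$.

Finally, we have $a_1 \preccurlyeq_{a_1 a_2} a_2$, since $a_1 + 1 \cdot a_2 \preccurlyeq a_2 + 1 \cdot a_1$. Also, if $x_1 \preccurlyeq x_2$, then $x_1 + 0 \cdot a_2 \preccurlyeq x_2 + 0 \cdot a_1$, that is, $\preccurlyeq \subseteq \preccurlyeq_{a_1 a_2}$.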

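Let $\preccurlyeq$ be a Strassen preorder. Let $\mathcal{P}_{\preccurlyeq}$ be the set of Strassen preorders containing $\preccurlyeq$, ordered by inclusion $\subseteq$. Let $C \subseteq \mathcal{P}_{\preccurlyeq}$ be any chain. Then the union of all preorders in $C$ is an element of $\mathcal{P}_{\preccurlyeq}$ and contains all elements of $C$. Therefore, by Zorn's lemma, $\mathcal{P}_{\preccurlyeq}$ contains a maximal element (maximal with respect to inclusion $\subseteq$).

Lemma 2.6. Let $\preccurlyeq$ be maximal in $\mathcal{P}$. Then $\preccurlyeq = \precsim$.

Proof. Trivially $\preccurlyeq \subseteq \precsim$. From Lemma 2.3 we know $\precsim \in \mathcal{P}$. From maximality of $\preccurlyeq$ follows $\preccurlyeq = \precsim$.

A relation $\preccurlyeq$ on $S$ is total if for all $a, b \in S$: $a \preccurlyeq b$ or $b \preccurlyeq a$.

Lemma 2.7. Let $\preccurlyeq$ be maximal in $\mathcal{P}$. Then $\preccurlyeq$ is total.

Proof. Suppose $\preccurlyeq$ is not total, say $a_1 \not\preccurlyeq a_2$ and $a_2 \not\preccurlyeq a_1$. By Lemma 2.6 we have $\preccurlyeq = \precsim$, so Lemma 2.5 applies: there is an element $\preccurlyeq_{a_1 a_2} \in \mathcal{P}$ with $\preccurlyeq \subseteq \preccurlyeq_{a_1 a_2}$ and $a_1 \preccurlyeq_{a_1 a_2} a_2$. Then $\preccurlyeq$ is strictly contained in $\preccurlyeq_{a_1 a_2}$, which contradicts the maximality of $\preccurlyeq$. We conclude $\preccurlyeq$ is total.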


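2.6 The asymptotic spectrum $X(S, \leq)$

Definition 2.8. Let $S$ be a semiring with $\mathbb{N} \subseteq S$ and let $\leq$ be a Strassen preorder on $S$. Let

$$X(S, \leq) = \{ \phi \in \mathrm{Hom}(S, \mathbb{R}_{\geq 0}) : a \leq b \Rightarrow \phi(a) \leq \phi(b) \}.$$

We call $X(S, \leq)$ the asymptotic spectrum of $(S, \leq)$. We call the elements of $X(S, \leq)$ spectral points.

Lemma 2.9. Let $\preccurlyeq \in \mathcal{P}$ be total. There is exactly one semiring homomorphism $\phi : S \to \mathbb{R}_{\geq 0}$ with

$$a \preccurlyeq b \Rightarrow \phi(a) \leq \phi(b).$$

Moreover, if $\preccurlyeq$ is maximal in $\mathcal{P}$, then

$$a \preccurlyeq b \Leftrightarrow \phi(a) \leq \phi(b).$$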

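Proof. Let ${\preccurlyeq} \in \mathcal{P}$ be total. For $a \in S$ define

$$\varphi(a) = \inf\{\tfrac{r}{s} : r, s \in \mathbb{N},\ sa \preccurlyeq r\}, \qquad \psi(a) = \sup\{\tfrac{u}{v} : u, v \in \mathbb{N},\ u \preccurlyeq va\}.$$

We prove $\psi(a) \leq \varphi(a)$. Let $r, s, u, v \in \mathbb{N}$. Suppose $u \preccurlyeq va$ and $sa \preccurlyeq r$. Then follows $su \preccurlyeq vsa \preccurlyeq vr$. Thus $\tfrac{u}{v} \leq \tfrac{r}{s}$. We prove $\psi(a) \geq \varphi(a)$. Suppose $\psi(a) < \varphi(a)$. Let $r, s \in \mathbb{N}$ with $\psi(a) < \tfrac{r}{s} < \varphi(a)$. Then $sa \not\preccurlyeq r$. From totality follows $sa \succcurlyeq r$. Thus $\psi(a) \geq \tfrac{r}{s}$, which is a contradiction. We conclude $\psi(a) = \varphi(a)$.

Let $a, b \in S$. We prove $\varphi(a + b) \leq \varphi(a) + \varphi(b)$. Let $s_a, s_b, r_a, r_b \in \mathbb{N}$. Suppose $s_a a \preccurlyeq r_a$ and $s_b b \preccurlyeq r_b$. Then $s_a s_b a \preccurlyeq s_b r_a$ and $s_a s_b b \preccurlyeq s_a r_b$. By addition, $s_a s_b (a + b) \preccurlyeq s_b r_a + s_a r_b$. Thus $\varphi(a + b) \leq \tfrac{r_a}{s_a} + \tfrac{r_b}{s_b}$. We prove $\psi(a + b) \geq \psi(a) + \psi(b)$. Suppose $u_a \preccurlyeq v_a a$ and $u_b \preccurlyeq v_b b$. Then $v_b u_a \preccurlyeq v_a v_b a$ and $v_a u_b \preccurlyeq v_a v_b b$. By addition, $v_b u_a + v_a u_b \preccurlyeq v_a v_b (a + b)$. Thus $\psi(a + b) \geq \tfrac{u_a}{v_a} + \tfrac{u_b}{v_b}$. We thus have additivity.

We prove $\varphi(ab) \leq \varphi(a)\varphi(b)$. Suppose $s_a a \preccurlyeq r_a$ and $s_b b \preccurlyeq r_b$. Then $s_a s_b\, ab \preccurlyeq r_a r_b$. Thus $\varphi(ab) \leq \tfrac{r_a}{s_a} \tfrac{r_b}{s_b}$. We prove $\psi(ab) \geq \psi(a)\psi(b)$. Suppose $u_a \preccurlyeq v_a a$ and $u_b \preccurlyeq v_b b$. Then $u_a u_b \preccurlyeq v_a v_b\, ab$. Thus $\tfrac{u_a}{v_a} \tfrac{u_b}{v_b} \leq \psi(ab)$. We thus have multiplicativity.

We prove monotonicity: $a \preccurlyeq b \Rightarrow \varphi(a) \leq \varphi(b)$. Suppose $s_b b \preccurlyeq r_b$. From $a \preccurlyeq b$ follows $s_b a \preccurlyeq s_b b \preccurlyeq r_b$. Thus $\varphi(a) \leq \tfrac{r_b}{s_b}$.

We prove $\varphi(1) = 1$. Trivially $1 \preccurlyeq 1$. Therefore $\varphi(1) \leq \tfrac{1}{1} = 1$ and $\psi(1) \geq \tfrac{1}{1} = 1$.

We prove $\varphi(0) = 0$. Trivially $s_a 0 \preccurlyeq 0$, so $\varphi(0) \leq \tfrac{0}{s_a} = 0$. Trivially $0 \preccurlyeq v_a 0$, so $\varphi(0) \geq \tfrac{0}{v_a} = 0$.

We prove the uniqueness of $\varphi$. Let $\varphi_1, \varphi_2$ be semiring homomorphisms $S \to \mathbb{R}_{\geq 0}$ with $a \preccurlyeq b \Rightarrow \varphi_i(a) \leq \varphi_i(b)$. Suppose $\varphi_1(a) < \varphi_2(a)$. Let $u, v \in \mathbb{N}$ with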

26 Chapter 2 The theory of asymptotic spectra

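$\varphi_1(a) < \tfrac{u}{v} < \varphi_2(a)$. Then $va \not\preccurlyeq u$, so by totality $va \succcurlyeq u$. Thus $\varphi_1(a) \geq \tfrac{u}{v}$, which is a contradiction. This proves uniqueness.

Finally, suppose $\preccurlyeq$ is maximal in $\mathcal{P}$. Lemma 2.6 gives ${\preccurlyeq} = {\precsim}$. Let $a \not\preccurlyeq b$. From Lemma 2.4 (iv) follows $\exists n : na \not\preccurlyeq nb + 1$. By totality $na \succcurlyeq nb + 1$. Apply $\varphi$ to get $\varphi(a) \geq \varphi(b) + \tfrac{1}{n}$. In particular $\varphi(a) > \varphi(b)$.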

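Lemma 2.10. The map

$$\mathbf{X}(S, \leqslant) \to \{\text{maximal elements in } \mathcal{P}_{\leqslant}\} : \varphi \mapsto {\preccurlyeq_\varphi}$$

with $a \preccurlyeq_\varphi b$ iff $\varphi(a) \leq \varphi(b)$ is a bijection.

Proof. Let $\varphi \in \mathbf{X}(S, \leqslant)$. One verifies that $\preccurlyeq_\varphi$ is a Strassen preorder and ${\leqslant} \subseteq {\lesssim} \subseteq {\preccurlyeq_\varphi}$. Let $\preccurlyeq$ be maximal in $\mathcal{P}_{\preccurlyeq_\varphi}$. Lemma 2.7 says that $\preccurlyeq$ is total. By Lemma 2.9 there is a $\psi \in \mathbf{X}(S, \leqslant)$ with ${\preccurlyeq} \subseteq {\preccurlyeq_\psi}$. Clearly ${\preccurlyeq_\varphi} \subseteq {\preccurlyeq_\psi}$. The uniqueness statement of Lemma 2.9 implies $\varphi = \psi$. This means ${\preccurlyeq_\varphi} = {\preccurlyeq}$, that is, $\preccurlyeq_\varphi$ is maximal. We conclude that the map is well defined.

Let $\preccurlyeq$ be maximal in $\mathcal{P}_{\leqslant}$. Then $\preccurlyeq$ is total. By Lemma 2.9 there is a $\varphi \in \mathbf{X}(S, \leqslant)$ with ${\preccurlyeq} \subseteq {\preccurlyeq_\varphi}$. We conclude the map is surjective.

Let $\varphi, \psi \in \mathbf{X}(S, \leqslant)$ with ${\preccurlyeq_\varphi} = {\preccurlyeq_\psi}$. From Lemma 2.9 follows $\varphi = \psi$. We conclude the map is injective.

Lemma 2.11. Let $a, b \in S$. Then $a \lesssim b$ iff $a \preccurlyeq b$ for all maximal ${\preccurlyeq} \in \mathcal{P}_{\leqslant}$.

Proof. Let ${\preccurlyeq} \in \mathcal{P}_{\leqslant}$ be maximal. Then ${\lesssim} \subseteq {\precsim} = {\preccurlyeq}$ by Lemma 2.6, so $a \lesssim b$ implies $a \preccurlyeq b$.

Suppose $a \not\lesssim b$. Let $n \in \mathbb{N}_{\geq 1}$ with $na \not\lesssim nb + 1$ (Lemma 2.4 (iv)). By Lemma 2.5 there is an element ${\preccurlyeq_{nb+1,\,na}} \in \mathcal{P}$ with ${\lesssim} \subseteq {\preccurlyeq_{nb+1,\,na}}$, and we may assume $\preccurlyeq_{nb+1,\,na}$ is maximal. Then $nb + 1 \preccurlyeq_{nb+1,\,na} na$ and so $a \not\preccurlyeq_{nb+1,\,na} b$.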

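2.7 The representation theorem

The following theorem is the main theorem.

Theorem 2.12 ([Str88, Th. 2.4]). Let $S$ be a commutative semiring with $\mathbb{N} \subseteq S$ and let $\leqslant$ be a Strassen preorder on $S$. Let $\mathbf{X} = \mathbf{X}(S, \leqslant)$ be the set of $\leqslant$-monotone semiring homomorphisms from $S$ to $\mathbb{R}_{\geq 0}$,

$$\mathbf{X} = \mathbf{X}(S, \leqslant) = \{\varphi \in \mathrm{Hom}(S, \mathbb{R}_{\geq 0}) : \forall a, b \in S\ \ a \leqslant b \Rightarrow \varphi(a) \leq \varphi(b)\}.$$

For $a, b \in S$ let $a \lesssim b$ if there is a sequence $(x_N) \in \mathbb{N}^{\mathbb{N}}$ with $x_N^{1/N} \to 1$ when $N \to \infty$ such that $\forall N \in \mathbb{N}\ \ a^N \leqslant b^N x_N$. Then

$$\forall a, b \in S \quad a \lesssim b \ \text{ iff } \ \forall \varphi \in \mathbf{X}\ \ \varphi(a) \leq \varphi(b).$$

Proof. Let $a, b \in S$. Suppose $a \lesssim b$. Then clearly for all $\varphi \in \mathbf{X}$ we have $\varphi(a) \leq \varphi(b)$. Suppose $a \not\lesssim b$. By Lemma 2.11 there is a maximal ${\preccurlyeq} \in \mathcal{P}_{\leqslant}$ with $a \not\preccurlyeq b$. By Lemma 2.10 there is a $\varphi \in \mathbf{X}$ with $\varphi(a) > \varphi(b)$.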
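The forward implication, which the proof calls clear, can be spelled out in one line. The following is a sketch that uses only the definition of the asymptotic preorder and the fact that a monotone semiring homomorphism $\varphi$ fixes the natural numbers (because $\varphi(1) = 1$ and $\varphi$ is additive), so that $\varphi(x_N) = x_N$.

```latex
\begin{align*}
a^N \leqslant b^N x_N
  &\;\Longrightarrow\; \varphi(a)^N \leq \varphi(b)^N \, x_N \\
  &\;\Longrightarrow\; \varphi(a) \leq \varphi(b)\, x_N^{1/N}
     \xrightarrow{\;N \to \infty\;} \varphi(b).
\end{align*}
```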


2.8 Abstract rank and subrank R, Q

We generalise the notions of rank and subrank for tensors to arbitrary semirings with a Strassen preorder. Let $a \in S$. Define the rank

$$R(a) = \min\{r \in \mathbb{N} : a \leqslant r\}$$

and the subrank

$$Q(a) = \max\{r \in \mathbb{N} : r \leqslant a\}.$$

Then $Q(a) \leq R(a)$. Define the asymptotic rank

$$\tilde{R}(a) = \lim_{N \to \infty} R(a^N)^{1/N}.$$

Define the asymptotic subrank

$$\tilde{Q}(a) = \lim_{N \to \infty} Q(a^N)^{1/N}.$$

By Fekete's lemma (Lemma 2.2), asymptotic rank is an infimum and asymptotic subrank is a supremum, as follows:

$$\tilde{R}(a) = \inf_N R(a^N)^{1/N},$$

$$\tilde{Q}(a) = \sup_N Q(a^N)^{1/N} \quad \text{when } a = 0 \text{ or } a \geqslant 1.$$
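As a toy illustration that is not taken from the text, one can take $S = \mathbb{Q}_{\geq 0}$ with the usual order, which we assume here satisfies the axioms of a Strassen preorder. In that setting $R(a) = \lceil a \rceil$ and $Q(a) = \lfloor a \rfloor$, and the sketch below evaluates $R(a^N)^{1/N}$ and $Q(a^N)^{1/N}$ for $a = 3/2$. The function names are hypothetical.

```python
from fractions import Fraction
from math import ceil, floor

def rank(a):
    """R(a) = min{r in N : a <= r} in the toy semiring (Q_{>=0}, <=)."""
    return ceil(a)

def subrank(a):
    """Q(a) = max{r in N : r <= a} in the same toy semiring."""
    return floor(a)

def normalised(f, a, N):
    """f(a^N)^(1/N); the power a^N is computed exactly as a Fraction."""
    return f(a ** N) ** (1.0 / N)

a = Fraction(3, 2)
upper = [normalised(rank, a, N) for N in range(1, 51)]     # R(a^N)^(1/N)
lower = [normalised(subrank, a, N) for N in range(1, 51)]  # Q(a^N)^(1/N)

print("R side, N = 1 and N = 50:", upper[0], upper[-1])
print("Q side, N = 1 and N = 50:", lower[0], lower[-1])
```

In this toy case every value on the rank side stays at or above $a$ and every value on the subrank side stays at or below $a$, and both approach $a$ as $N$ grows, in line with the infimum and supremum descriptions above.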

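Theorem 2.12 implies that the asymptotic rank and asymptotic subrank have the following dual characterisation in terms of the asymptotic spectrum. (This is a straightforward generalisation of [Str88, Th. 3.8].)

Corollary 2.13 (cf. [Str88, Th. 3.8]). For $a \in S$ with $\exists \varphi \in \mathbf{X}\ \varphi(a) \geq 1$,

$$\tilde{R}(a) = \max_{\varphi \in \mathbf{X}} \varphi(a).$$

Proof. Let $\varphi \in \mathbf{X}$. For $N \in \mathbb{N}$, $R(a^N) \geq \varphi(a)^N$. Therefore $\tilde{R}(a) \geq \varphi(a)$, and so $\tilde{R}(a) \geq \max_{\varphi \in \mathbf{X}} \varphi(a)$. It remains to prove $\tilde{R}(a) \leq \max_{\varphi \in \mathbf{X}} \varphi(a)$. We let $x = \max_{\varphi \in \mathbf{X}} \varphi(a)$. By assumption $x \geq 1$. By definition of $x$ we have

$$\forall \varphi \in \mathbf{X} \quad \varphi(a) \leq x.$$

Take the $m$th power on both sides:

$$\forall \varphi \in \mathbf{X},\ m \in \mathbb{N} \quad \varphi(a^m) \leq x^m.$$

Take the ceiling on the right-hand side:

$$\forall \varphi \in \mathbf{X},\ m \in \mathbb{N} \quad \varphi(a^m) \leq \lceil x^m \rceil.$$

Apply Theorem 2.12 to get asymptotic preorders:

$$\forall m \in \mathbb{N} \quad a^m \lesssim \lceil x^m \rceil.$$

Then, by definition of asymptotic preorder,

$$\forall m, N \in \mathbb{N} \quad a^{mN} \leqslant \lceil x^m \rceil^N\, 2^{\varepsilon_{m,N}} \quad \text{for some } \varepsilon_{m,N} \in o(N).$$

Then

$$\forall m, N \in \mathbb{N} \quad R(a^{mN})^{1/(mN)} \leq \lceil x^m \rceil^{1/m}\, 2^{\varepsilon_{m,N}/(mN)}.$$

From $x \geq 1$ follows $\lceil x^m \rceil^{1/m} \to x$ when $m \to \infty$. Choose $m = m(N)$ with $m(N) \to \infty$ as $N \to \infty$ and $\varepsilon_{m(N),N} \in o(N)$ to get $\tilde{R}(a) = \inf_N R(a^N)^{1/N} \leq x$.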

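Corollary 2.14 (cf. [Str88, Th. 3.8]). For $a \in S$ with $\exists k \in \mathbb{N}\ a^k \geqslant 2$,

$$\tilde{Q}(a) = \min_{\varphi \in \mathbf{X}} \varphi(a).$$

Proof. Let $\varphi \in \mathbf{X}$. For $N \in \mathbb{N}$, $Q(a^N) \leq \varphi(a)^N$. Therefore $\tilde{Q}(a) \leq \varphi(a)$, so $\tilde{Q}(a) \leq \min_{\varphi \in \mathbf{X}} \varphi(a)$. It remains to prove $\tilde{Q}(a) \geq \min_{\varphi \in \mathbf{X}} \varphi(a)$. Let $y = \min_{\varphi \in \mathbf{X}} \varphi(a)$. From the assumption $a^k \geqslant 2$ follows $y > 1$. By definition of $y$ we have

$$\forall \varphi \in \mathbf{X} \quad \varphi(a) \geq y.$$

Take the $m$th power on both sides:

$$\forall \varphi \in \mathbf{X},\ m \in \mathbb{N} \quad \varphi(a^m) \geq y^m.$$

Take the floor on the right-hand side:

$$\forall \varphi \in \mathbf{X},\ m \in \mathbb{N} \quad \varphi(a^m) \geq \lfloor y^m \rfloor.$$

Apply Theorem 2.12 to get asymptotic preorders:

$$\forall m \in \mathbb{N} \quad a^m \gtrsim \lfloor y^m \rfloor.$$

Then, by definition of asymptotic preorder,

$$\forall m, N \in \mathbb{N} \quad a^{mN}\, 2^{\varepsilon_{m,N}} \geqslant \lfloor y^m \rfloor^N \quad \text{for some } \varepsilon_{m,N} \in o(N).$$

Now we use $a^k \geqslant 2$ to get

$$\forall m, N \in \mathbb{N} \quad a^{mN + k\varepsilon_{m,N}} \geqslant \lfloor y^m \rfloor^N.$$

Then

$$\forall m, N \in \mathbb{N} \quad Q(a^{mN + k\varepsilon_{m,N}})^{1/(mN + k\varepsilon_{m,N})} \geq \lfloor y^m \rfloor^{N/(mN + k\varepsilon_{m,N})}.$$

Choose $m = m(N)$ with $m(N) \to \infty$ as $N \to \infty$ and $\varepsilon_{m(N),N} \in o(N)$ to obtain $\tilde{Q}(a) = \sup_N Q(a^N)^{1/N} \geq y$.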


2.9 Topological aspects

Theorem 2.12 does not tell the full story. Namely, there is also a topological component, which we will now discuss. Let $S$ be a semiring with $\mathbb{N} \subseteq S$. Let $\leqslant$ be a Strassen preorder on $S$. Let $\mathbf{X} = \mathbf{X}(S, \leqslant)$ be the asymptotic spectrum of $(S, \leqslant)$. For $a \in S$ let

$$\hat{a} : \mathbf{X} \to \mathbb{R}_{\geq 0} : \varphi \mapsto \varphi(a). \qquad (2.7)$$

The map $\hat{a}$ simply evaluates a given homomorphism $\varphi$ at $a$. One may think of $\hat{a}$ as the collection $(\varphi(a))_{\varphi \in \mathbf{X}}$ of all evaluations of the elements of $\mathbf{X}$ at $a$. Let $\mathbb{R}_{\geq 0}$ have the Euclidean topology. Endow $\mathbf{X}$ with the weak topology with respect to the family of functions $\hat{a}$, $a \in S$. That is, endow $\mathbf{X}$ with the coarsest topology such that each $\hat{a}$ becomes continuous.

Let $C(\mathbf{X}, \mathbb{R}_{\geq 0})$ be the semiring of continuous functions $\mathbf{X} \to \mathbb{R}_{\geq 0}$ with addition and multiplication defined pointwise on $\mathbf{X}$, that is, $(f + g)(x) = f(x) + g(x)$ and $(f \cdot g)(x) = f(x) g(x)$ for $f, g \in C(\mathbf{X}, \mathbb{R}_{\geq 0})$ and $x \in \mathbf{X}$. Define the semiring homomorphism

$$\Phi : S \to C(\mathbf{X}, \mathbb{R}_{\geq 0}) : a \mapsto \hat{a},$$

which maps $a$ to the evaluator $\hat{a}$ defined in (2.7).

Theorem 2.15 ([Str88, Th. 2.4]).

(i) $\mathbf{X}$ is a nonempty compact Hausdorff space.

(ii) $\forall a, b \in S$: $a \lesssim b$ iff $\Phi(a) \leq \Phi(b)$ pointwise on $\mathbf{X}$.

(iii) $\Phi(S)$ separates the points of $\mathbf{X}$.

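Proof. Statement (ii) follows from Theorem 2.12. Statement (iii) is clear. We prove statement (i). We have $2 \not\lesssim 1$, so from Theorem 2.12 follows that $\mathbf{X}$ cannot be empty.

For $a \in S$ let $n_a \in \mathbb{N}$ with $a \leqslant n_a$. Then for $\varphi \in \mathbf{X}$, $\varphi(a) \leq n_a$ and so $\varphi(a) \in [0, n_a]$. Embed $\mathbf{X} \subseteq \prod_{a \in S} [0, n_a]$ as a set via $\varphi \mapsto (\varphi(a))_{a \in S}$. The set $\prod_{a \in S} [0, n_a]$ with the product topology is compact by the theorem of Tychonoff. To see that $\mathbf{X}$ is closed in $\prod_{a \in S} [0, n_a]$, we write $\mathbf{X}$ as an intersection of sets,

$$\begin{aligned}
\mathbf{X} ={}& \Big\{\varphi \in \prod_{a \in S} [0, n_a] : \varphi(0) = 0\Big\} \cap \Big\{\varphi \in \prod_{a \in S} [0, n_a] : \varphi(1) = 1\Big\} \\
&\cap \bigcap_{b, c \in S} \Big\{\varphi \in \prod_{a \in S} [0, n_a] : \varphi(b + c) - \varphi(b) - \varphi(c) = 0\Big\} \\
&\cap \bigcap_{b, c \in S} \Big\{\varphi \in \prod_{a \in S} [0, n_a] : \varphi(bc) - \varphi(b)\varphi(c) = 0\Big\} \\
&\cap \bigcap_{\substack{b, c \in S \\ b \leqslant c}} \Big\{\varphi \in \prod_{a \in S} [0, n_a] : \varphi(b) \leq \varphi(c)\Big\},
\end{aligned}$$

and we observe that the intersected sets are closed:

$$\begin{aligned}
\mathbf{X} ={}& \hat{0}^{-1}(0) \cap \hat{1}^{-1}(1) \\
&\cap \bigcap_{b, c \in S} \big(\widehat{b + c} - \hat{b} - \hat{c}\big)^{-1}(0) \\
&\cap \bigcap_{b, c \in S} \big(\widehat{bc} - \hat{b}\hat{c}\big)^{-1}(0) \\
&\cap \bigcap_{\substack{b, c \in S \\ b \leqslant c}} \big(\hat{c} - \hat{b}\big)^{-1}([0, \infty)).
\end{aligned}$$

This implies $\mathbf{X}$ is also compact.

Let $\varphi, \psi \in \mathbf{X}$ be distinct. Let $a \in S$ with $\varphi(a) \neq \psi(a)$. Then $\hat{a}(\varphi) \neq \hat{a}(\psi)$. Let $U \ni \hat{a}(\varphi)$, $V \ni \hat{a}(\psi)$ be open and disjoint subsets of $\mathbb{R}_{\geq 0}$. Then $\hat{a}^{-1}(U)$ and $\hat{a}^{-1}(V)$ are open and disjoint subsets of $\mathbf{X}$. We conclude that $\mathbf{X}$ is Hausdorff.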

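2.10 Uniqueness

Let $S$ be a semiring with $\mathbb{N} \subseteq S$. Let $\leqslant$ be a Strassen preorder on $S$. Let $\mathbf{X} = \mathbf{X}(S, \leqslant)$ be the asymptotic spectrum of $(S, \leqslant)$. The object $\mathbf{X}$ is unique in the following sense.

Theorem 2.16 ([Str88, Cor. 2.7]). Let $Y$ be a compact Hausdorff space. Let $\Psi : S \to C(Y, \mathbb{R}_{\geq 0})$ be a homomorphism of semirings such that

$$\Psi(S) \text{ separates the points of } Y \qquad (2.8)$$

and

$$\forall a, b \in S \quad a \lesssim b \Leftrightarrow \Psi(a) \leq \Psi(b) \text{ pointwise on } Y. \qquad (2.9)$$

Then there is a unique homeomorphism (continuous bijection with continuous inverse) $h : Y \to \mathbf{X}$ such that the diagram

[Commutative diagram (2.10): $\Phi : S \to C(\mathbf{X}, \mathbb{R}_{\geq 0})$, $\Psi : S \to C(Y, \mathbb{R}_{\geq 0})$ and $h^* : C(\mathbf{X}, \mathbb{R}_{\geq 0}) \to C(Y, \mathbb{R}_{\geq 0})$, with $h^* \circ \Phi = \Psi$.]

commutes, where $h^* : \varphi \mapsto \varphi \circ h$. Namely, let $h : y \mapsto \big(a \mapsto \Psi(a)(y)\big)$.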


Proof. We prove uniqueness. Suppose there are two such homeomorphisms

$$h_1, h_2 : Y \to \mathbf{X}.$$

Suppose $x \neq h_2(h_1^{-1}(x))$ for some $x \in \mathbf{X}$. Since $\Phi(S)$ separates the points of $\mathbf{X}$, there is an $a \in S$ with $\Phi(a)(x) \neq \Phi(a)(h_2(h_1^{-1}(x)))$. Let $y = h_1^{-1}(x) \in Y$. Then $\Phi(a)(h_1(y)) \neq \Phi(a)(h_2(y))$. Since (2.10) commutes, $\Phi(a)(h_1(y)) = \Psi(a)(y)$ and $\Phi(a)(h_2(y)) = \Psi(a)(y)$, a contradiction.

We prove existence. Let $h : Y \to \mathbf{X} : y \mapsto (a \mapsto \Psi(a)(y))$. One verifies that $h$ is well-defined, continuous and injective, and that the diagram in (2.10) commutes. It remains to show that $h$ is surjective. We know that $\mathbb{Q} \cdot \Phi(S)$ is a $\mathbb{Q}$-subalgebra of $C(\mathbf{X}, \mathbb{R})$ which separates points and which contains the nonzero constant function $\Phi(1)$, so by the Stone–Weierstrass theorem $\mathbb{Q} \cdot \Phi(S)$ is dense in $C(\mathbf{X}, \mathbb{R})$ under the sup-norm. Suppose $h$ is not surjective. Then $h(Y) \subsetneq \mathbf{X}$ is a proper closed subset. Let $x_0 \in \mathbf{X} \setminus h(Y)$ be in the complement. Since $\mathbf{X}$ is a compact Hausdorff space, there is a continuous function $f : \mathbf{X} \to [-1, 1]$ with

$$f(h(Y)) = 1, \qquad f(x_0) = -1.$$

We know that $f$ can be approximated by elements from $\mathbb{Q} \cdot \Phi(S)$, i.e. let $\varepsilon > 0$, then there are $a_1, a_2 \in S$, $N \in \mathbb{N}$ such that

$$\tfrac{1}{N}\big(\Phi(a_1)(x) - \Phi(a_2)(x)\big) > 1 - \varepsilon \quad \text{for all } x \in h(Y),$$

$$\tfrac{1}{N}\big(\Phi(a_1)(x_0) - \Phi(a_2)(x_0)\big) < -1 + \varepsilon.$$

This means $\Psi(a_1) \geq \Psi(a_2)$ pointwise on $Y$, so $a_1 \gtrsim a_2$, but also $\Phi(a_1) \not\geq \Phi(a_2)$ pointwise on $\mathbf{X}$, so $a_1 \not\gtrsim a_2$. This is a contradiction.

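2.11 Subsemirings

Let $S$ be a subsemiring of a semiring $T$, and let $\leqslant$ be a Strassen preorder on $T$. Then the restriction ${\leqslant}|_S$ is a Strassen preorder on $S$. How are the asymptotic spectra $\mathbf{X}(S, {\leqslant}|_S)$ and $\mathbf{X}(T, \leqslant)$ related? Obviously, for $\varphi \in \mathbf{X}(T, \leqslant)$ we have $\varphi|_S \in \mathbf{X}(S, {\leqslant}|_S)$. In fact, the uniqueness theorem of Section 2.10 implies that all elements of $\mathbf{X}(S, {\leqslant}|_S)$ are restrictions of elements of $\mathbf{X}(T, \leqslant)$.

Corollary 2.17. Let $S$ be a subsemiring of a semiring $T$. Let $\leqslant$ be a Strassen preorder on $T$. Then

$$\mathbf{X}(S, {\leqslant}|_S) = \mathbf{X}(T, \leqslant)|_S.$$

Proof. Let

$$\mathbf{X} = \mathbf{X}(S, {\leqslant}|_S), \qquad \Phi : S \to C(\mathbf{X}, \mathbb{R}_{\geq 0}) : a \mapsto \hat{a},$$

and let

$$Y = \mathbf{X}(T, \leqslant)|_S = \{\varphi|_S : \varphi \in \mathbf{X}(T, \leqslant)\}, \qquad \Psi : S \to C(Y, \mathbb{R}_{\geq 0}) : a \mapsto \big(\varphi|_S \mapsto \varphi|_S(a)\big).$$

Then $Y$ is a compact Hausdorff space. Let $\varphi|_S, \psi|_S \in Y$ be distinct. Then there is an $a \in S$ with $\varphi|_S(a) \neq \psi|_S(a)$, so (2.8) holds. For $a, b \in S$, $a \lesssim b$ iff $\Phi(a) \leq \Phi(b)$ iff $\Psi(a) \leq \Psi(b)$, so (2.9) holds. Therefore

$$h : \mathbf{X}(T, \leqslant)|_S \to \mathbf{X}(S, {\leqslant}|_S) : \varphi|_S \mapsto \big(a \mapsto \Psi(a)(\varphi|_S)\big) = \varphi|_S$$

is a homeomorphism.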

2.12 Subsemirings generated by one element

Let $S$ be a semiring and let $\leqslant$ be a Strassen preorder on $S$. We specialise to the simplest type of subsemiring of $S$. Namely, let $a \in S$ and let

$$\mathbb{N}[a] = \Big\{\sum_{i=0}^{k} n_i a^i : k \in \mathbb{N},\ n_i \in \mathbb{N}\Big\} \subseteq S$$

be the subsemiring of $S$ generated by $a$. We call $\mathbf{X}(\mathbb{N}[a]) = \mathbf{X}(\mathbb{N}[a], {\leqslant}|_{\mathbb{N}[a]})$ the asymptotic spectrum of $a$.

Corollary 2.18 (cf. [Str88]). If $a^k \geqslant 2$ for some $k \in \mathbb{N}$, then

$$\tilde{Q} \in \mathbf{X}(\mathbb{N}[a]).$$

If $\varphi(a) \geq 1$ for some $\varphi \in \mathbf{X}$, then

$$\tilde{R} \in \mathbf{X}(\mathbb{N}[a]).$$

Proof. Let $\mathbf{X} = \mathbf{X}(\mathbb{N}[a])$. Let $n_1, \ldots, n_q \in \mathbb{N}$. By Corollary 2.14,

$$\tilde{Q}(a^{n_1} + \cdots + a^{n_q}) = \min_{\varphi \in \mathbf{X}} \varphi(a^{n_1} + \cdots + a^{n_q}).$$

Since $\varphi$ is a homomorphism, $\varphi(a^{n_1} + \cdots + a^{n_q}) = \varphi(a)^{n_1} + \cdots + \varphi(a)^{n_q}$. Now we observe that $x^{n_1} + \cdots + x^{n_q}$ is minimised by taking $x$ minimal in the domain. We conclude

$$\tilde{Q}(a^{n_1} + \cdots + a^{n_q}) = \sum_{i=1}^{q} \Big(\min_{\varphi \in \mathbf{X}} \varphi(a)\Big)^{n_i} = \tilde{Q}(a)^{n_1} + \cdots + \tilde{Q}(a)^{n_q}.$$

The claim for asymptotic rank $\tilde{R}$ similarly follows from Corollary 2.13.


Remark 2.19. In general, asymptotic subrank $\tilde{Q}$ and asymptotic rank $\tilde{R}$ are not elements of the asymptotic spectrum. We will see an example in Chapter 4, related to the matrix multiplication tensor.

Remark 2.20. Corollary 2.18 is closely related to Schönhage's τ-theorem for tensors, also called Schönhage's asymptotic sum inequality. The τ-theorem features in every recent fast matrix multiplication algorithm (i.e. every algorithm based on the laser method).

Remark 2.21. An element $\varphi \in \mathbf{X}(\mathbb{N}[a])$ is uniquely determined by the value of $\varphi(a) \in \mathbb{R}_{\geq 0}$. We may thus identify the asymptotic spectrum $\mathbf{X}(\mathbb{N}[a])$ with a compact (i.e. closed and bounded) subset of the nonnegative reals $\mathbb{R}_{\geq 0}$ via $\varphi \mapsto \varphi(a)$.

2.13 Universal spectral points

Having discussed the simplest type of subsemiring in the previous section, let us discuss the most difficult type of supersemiring. When applying the theory of asymptotic spectra to some setting, there is a natural largest semiring $S$ in which the objects of study live. For example, we may study the semiring $S$ of all (equivalence classes of) 3-tensors of arbitrary format over $\mathbb{F}$. Or we may study the semiring $S$ of all (isomorphism classes of) finite simple graphs. We refer to the elements of the asymptotic spectrum $\mathbf{X}(S)$ of the "ambient" semiring $S$ by the term universal spectral points (cf. [Str88, page 119]). The universal spectral points are the most useful monotone homomorphisms.

2.14 Conclusion

To a semiring $S$ with a Strassen preorder $\leqslant$ we associated an asymptotic preorder $\lesssim$. We proved that this asymptotic preorder is characterised by the $\leqslant$-monotone semiring homomorphisms $S \to \mathbb{R}_{\geq 0}$, which make up the asymptotic spectrum $\mathbf{X}(S, \leqslant)$ of $(S, \leqslant)$. For $(S, \leqslant)$ we naturally have a rank function $R : S \to \mathbb{N}$ and a subrank function $Q : S \to \mathbb{N}$. Their asymptotic versions $\tilde{R}(a) = \inf_n R(a^n)^{1/n}$ and $\tilde{Q}(a) = \sup_n Q(a^n)^{1/n}$ coincide with $\max_{\varphi \in \mathbf{X}(S, \leqslant)} \varphi(a)$ and $\min_{\varphi \in \mathbf{X}(S, \leqslant)} \varphi(a)$ respectively, assuming $\exists \varphi \in \mathbf{X}\ \varphi(a) \geq 1$ and $\exists k \in \mathbb{N}\ a^k \geqslant 2$ respectively. Unfortunately, we have proved the existence of the asymptotic spectrum by nonconstructive means. Explicitly constructing spectral points for a given pair $(S, \leqslant)$ will be a challenging task.

Some remarks about our proof in this chapter. The proof in [Str88] uses the Kadison–Dubois theorem from the paper of Becker and Schwartz [BS83] as a black box. Our presentation basically integrates the proof of Strassen with the proof of Becker and Schwartz. The notions of rank and subrank were in [Str88] only discussed for tensors. We considered the straightforward generalisation to arbitrary semirings with a Strassen preorder. An evident feature of our presentation is that we do not pass from the semiring to its Grothendieck ring, but instead stay in the semiring. In this way we stay close to the "real world" objects. I thank Jop Briët and Lex Schrijver for this idea. There is a large body of literature on the Kadison–Dubois theorem, for which we refer to the modern books by Prestel and Delzell [PD01, Theorem 5.2.6] and Marshall [Mar08, Theorem 5.4.4].

Chapter 3

The asymptotic spectrum of graphs; Shannon capacity

This chapter is based on the manuscript [Zui18].

3.1 Introduction

This chapter is about the Shannon capacity of graphs, which was introduced by Claude Shannon in the context of coding theory [Sha56]. More precisely, we will apply the theory of asymptotic spectra of Chapter 2 to gain a better understanding of Shannon capacity (and other asymptotic properties of graphs).

We first recall the definition of the Shannon capacity of a graph. Let $G$ be a (finite simple) graph with vertex set $V(G)$ and edge set $E(G)$. An independent set or stable set in $G$ is a subset of $V(G)$ that contains no edges. The independence number or stability number $\alpha(G)$ is the cardinality of the largest independent set in $G$. For graphs $G$ and $H$, the and-product $G \boxtimes H$, also called strong graph product, is defined by

$$V(G \boxtimes H) = V(G) \times V(H),$$

$$E(G \boxtimes H) = \big\{\{(g, h), (g', h')\} : \big(\{g, g'\} \in E(G) \text{ or } g = g'\big) \text{ and } \big(\{h, h'\} \in E(H) \text{ or } h = h'\big) \text{ and } (g, h) \neq (g', h')\big\}.$$

The Shannon capacity $\Theta(G)$ is defined as the limit

$$\Theta(G) = \lim_{N \to \infty} \alpha(G^{\boxtimes N})^{1/N}. \qquad (3.1)$$

This limit exists and equals the supremum $\sup_N \alpha(G^{\boxtimes N})^{1/N}$ by Fekete's lemma (Lemma 2.2).
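The definitions above can be tried out on the 5-cycle. The following sketch builds the strong product directly from its definition and computes independence numbers with a simple branching search; the representation of a graph as a pair (vertex list, set of frozenset edges) and all function names are choices made for this illustration only.

```python
from itertools import product

def cycle(n):
    """Cycle graph C_n as (vertices, edges); edges are 2-element frozensets."""
    return list(range(n)), {frozenset((i, (i + 1) % n)) for i in range(n)}

def strong_product(G, H):
    """Strong graph product, following the definition of E(G ⊠ H)."""
    (VG, EG), (VH, EH) = G, H
    V = list(product(VG, VH))
    E = set()
    for (g, h) in V:
        for (g2, h2) in V:
            if (g, h) == (g2, h2):
                continue
            if (g == g2 or frozenset((g, g2)) in EG) and \
               (h == h2 or frozenset((h, h2)) in EH):
                E.add(frozenset(((g, h), (g2, h2))))
    return V, E

def independence_number(G):
    """Exact alpha(G) by branching on a vertex of maximum remaining degree."""
    V, E = G
    adj = {v: set() for v in V}
    for e in E:
        u, w = tuple(e)
        adj[u].add(w)
        adj[w].add(u)

    def solve(cands):
        if not cands:
            return 0
        v = max(cands, key=lambda u: len(adj[u] & cands))
        if not (adj[v] & cands):
            # Maximum degree is 0, so all remaining vertices are independent.
            return len(cands)
        # Either v is in the independent set (drop v and its neighbours) or not.
        return max(1 + solve(cands - adj[v] - {v}), solve(cands - {v}))

    return solve(frozenset(V))

C5 = cycle(5)
alpha_1 = independence_number(C5)
alpha_2 = independence_number(strong_product(C5, C5))
print(alpha_1, alpha_2, alpha_2 ** 0.5)
```

Since $\Theta(G)$ is the supremum of $\alpha(G^{\boxtimes N})^{1/N}$, the value computed for $N = 2$ gives a lower bound on $\Theta(C_5)$.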

Computing the Shannon capacity is nontrivial already for small graphs. Lovász in [Lov79] computed the value $\Theta(C_5) = \sqrt{5}$, where $C_k$ denotes the $k$-cycle graph, by introducing and evaluating a new graph parameter $\vartheta$ which is now known as the Lovász theta number. For example, the value of $\Theta(C_7)$ is currently not known. The Shannon capacity $\Theta$ is not known to be hard to compute in the sense of computational complexity. On the other hand, deciding whether $\alpha(G) \leq k$, given a graph $G$ and $k \in \mathbb{N}$, is NP-complete [Kar72].

New result: dual description of Shannon capacity

The new result of this chapter is a dual characterisation of the Shannon capacity of graphs. This characterisation is obtained by applying Strassen's theory of asymptotic spectra of Chapter 2. Thus this chapter also serves as an illustration of the theory of asymptotic spectra.

To state the theorem we need the standard notions graph homomorphism, graph complement and graph disjoint union. Let $G$ and $H$ be graphs. A graph homomorphism $f : G \to H$ is a map $f : V(G) \to V(H)$ such that for all $u, v \in V(G)$, if $\{u, v\} \in E(G)$, then $\{f(u), f(v)\} \in E(H)$. In other words, a graph homomorphism maps edges to edges. The complement $\overline{G}$ of $G$ is defined by

$$V(\overline{G}) = V(G),$$

$$E(\overline{G}) = \big\{\{u, v\} : \{u, v\} \notin E(G),\ u \neq v\big\}.$$

We define a relation $\leqslant$ on graphs: let $G \leqslant H$ if there is a graph homomorphism $\overline{G} \to \overline{H}$ from the complement of $G$ to the complement of $H$. The disjoint union $G \sqcup H$ is defined by

$$V(G \sqcup H) = V(G) \sqcup V(H),$$

$$E(G \sqcup H) = E(G) \sqcup E(H).$$

For $n \in \mathbb{N}$ the complete graph $K_n$ is the graph with $V(K_n) = [n] = \{1, 2, \ldots, n\}$ and $E(K_n) = \{\{i, j\} : i, j \in [n],\ i \neq j\}$. Thus $K_0 = \overline{K_0}$ is the empty graph and $K_1 = \overline{K_1}$ is the graph consisting of a single vertex and no edges.

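Theorem 3.1. Let $S \subseteq \{\text{graphs}\}$ be a collection of graphs which is closed under the disjoint union $\sqcup$ and the strong graph product $\boxtimes$, and which contains the graph with a single vertex, $K_1$. Define the asymptotic spectrum $\mathbf{X}(S)$ as the set of all maps $\varphi : S \to \mathbb{R}_{\geq 0}$ such that, for all $G, H \in S$,

(1) if $G \leqslant H$, then $\varphi(G) \leq \varphi(H)$

(2) $\varphi(G \sqcup H) = \varphi(G) + \varphi(H)$

(3) $\varphi(G \boxtimes H) = \varphi(G)\varphi(H)$

(4) $\varphi(K_1) = 1$.

Let $G \lesssim H$ if there is a sequence $(x_N) \in \mathbb{N}^{\mathbb{N}}$ with $x_N^{1/N} \to 1$ when $N \to \infty$ such that for every $N \in \mathbb{N}$

$$G^{\boxtimes N} \leqslant (H^{\boxtimes N})^{\sqcup x_N} = \underbrace{H^{\boxtimes N} \sqcup \cdots \sqcup H^{\boxtimes N}}_{x_N}.$$

Then

(i) $G \lesssim H$ iff $\forall \varphi \in \mathbf{X}(S)\ \ \varphi(G) \leq \varphi(H)$

(ii) $\Theta(G) = \min_{\varphi \in \mathbf{X}(S)} \varphi(G)$.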

Statement (ii) of Theorem 3.1 is nontrivial in the sense that $\Theta$ is not an element of $\mathbf{X}(\{\text{graphs}\})$. Namely, $\Theta$ is not additive under $\sqcup$ by a result of Alon [Alo98], and $\Theta$ is not multiplicative under $\boxtimes$ by a result of Haemers [Hae79]. It turns out that the graph parameter $G \mapsto \max_{\varphi \in \mathbf{X}(\{\text{graphs}\})} \varphi(G)$ is itself an element of $\mathbf{X}(\{\text{graphs}\})$ and is equal to the fractional clique cover number $\overline{\chi}_f$ (see Section 3.3.2 and e.g. [Sch03, Eq. (67.112)]). Fritz in [Fri17] proves (independently of Strassen's line of work) a statement that is weaker than Theorem 3.1. Namely, he proves the statement of Theorem 3.1 without the additivity condition (2).

In Section 3.2 we will prove Theorem 3.1 by applying the theory of asymptotic spectra of Chapter 2 to the appropriate semiring and preorder. In Section 3.3 we will discuss the elements in the asymptotic spectrum of graphs $\mathbf{X}(\{\text{graphs}\})$ that are currently known to me: the Lovász theta number, the fractional clique cover number, the fractional orthogonal rank of the complement, and the fractional Haemers bounds. We moreover prove a sufficient condition for the "fractionalisation" of a graph parameter to be in the asymptotic spectrum of graphs.

3.2 The asymptotic spectrum of graphs

In this section we prove Theorem 3.1 by applying the theory of asymptotic spectra to the appropriate semiring.

3.2.1 The semiring of graph isomorphism classes $\mathcal{G}$

A graph homomorphism $f : G \to H$ is a graph isomorphism if $f$ is bijective as a map $V(G) \to V(H)$ and bijective as a map $E(G) \to E(H)$. We write $G \cong H$ if there is a graph isomorphism $f : G \to H$. The relation $\cong$ is an equivalence relation on $\{\text{graphs}\}$, which we call isomorphism. For example, the graphs $G$ and $H$ given by

$$V(G) = \{a, b, c, d\}, \qquad E(G) = \big\{\{a, b\}, \{b, c\}, \{c, d\}, \{a, d\}\big\},$$

$$V(H) = \{1, 2, 3, 4\}, \qquad E(H) = \big\{\{1, 3\}, \{2, 3\}, \{2, 4\}, \{1, 4\}\big\}$$

are isomorphic. Let $\mathcal{G} = \{\text{graphs}\}/{\cong}$ be the set of equivalence classes in $\{\text{graphs}\}$ under $\cong$, i.e. the isomorphism classes. The relation $\leqslant$ is a preorder on $\mathcal{G}$. Recall that $K_n$ is the complete graph on $n$ vertices, and thus $\overline{K_n}$ is the graph with $n$ vertices and no edges.
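For the two example graphs above, an isomorphism can be found by brute force over all bijections of the vertex sets. The sketch below is only an illustration; the function name `find_isomorphism` is hypothetical, and edges are stored as frozensets.

```python
from itertools import permutations

# The two example graphs from the text.
VG = ["a", "b", "c", "d"]
EG = {frozenset(e) for e in [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")]}
VH = [1, 2, 3, 4]
EH = {frozenset(e) for e in [(1, 3), (2, 3), (2, 4), (1, 4)]}

def find_isomorphism(VG, EG, VH, EH):
    """Return a vertex bijection mapping E(G) bijectively onto E(H), or None."""
    if len(VG) != len(VH) or len(EG) != len(EH):
        return None
    for image in permutations(VH):
        f = dict(zip(VG, image))
        # f is injective on vertices, so it is injective on edges; with
        # |E(G)| = |E(H)| it suffices that every edge is mapped to an edge.
        if all(frozenset((f[u], f[v])) in EH for u, v in map(tuple, EG)):
            return f
    return None

f = find_isomorphism(VG, EG, VH, EH)
print(f)
```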

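Lemma 3.2. Let $A, B, C \in \{\text{graphs}\}$.

(i) $\sqcup$ and $\boxtimes$ are commutative and associative operations on $\mathcal{G}$.

(ii) $\boxtimes$ distributes over $\sqcup$ on $\mathcal{G}$, i.e. $A \boxtimes (B \sqcup C) = (A \boxtimes B) \sqcup (A \boxtimes C)$.

(iii) $K_1 \boxtimes A = A$

(iv) $K_0 \boxtimes A = K_0$

(v) $K_0 \sqcup A = A$

(vi) $\overline{K_n} \sqcup \overline{K_m} = \overline{K_{n+m}}$.

Proof. We leave the proof to the reader.

In other words, Lemma 3.2 says that $(\mathcal{G}, \sqcup, \boxtimes, K_0, K_1)$ is a (commutative) semiring in which the elements $\overline{K_0}, \overline{K_1}, \overline{K_2}, \ldots$ behave like the natural numbers $\mathbb{N}$. We will denote this semiring simply by $\mathcal{G}$.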

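3.2.2 Strassen preorder via graph homomorphisms

Let $\mathcal{G}$ be the semiring of graphs. Recall that $G \leqslant H$ if there is a graph homomorphism $f : \overline{G} \to \overline{H}$.

Lemma 3.3. The preorder $\leqslant$ is a Strassen preorder on $\mathcal{G}$. That is, for graphs $A, B, C, D \in \mathcal{G}$ we have the following.

(i) For $n, m \in \mathbb{N}$, $\overline{K_n} \leqslant \overline{K_m}$ iff $n \leq m$.

(ii) If $A \leqslant B$ and $C \leqslant D$, then $A \sqcup C \leqslant B \sqcup D$ and $A \boxtimes C \leqslant B \boxtimes D$.

(iii) For $A, B \in \mathcal{G}$, if $B \neq K_0$, then there is an $r \in \mathbb{N}$ with $A \leqslant \overline{K_r} \boxtimes B$.

Proof. Statement (i) is easy to verify. We prove (ii). Let $f : \overline{A} \to \overline{B}$ and $g : \overline{C} \to \overline{D}$ be graph homomorphisms. Let the map $f \sqcup g : V(A) \sqcup V(C) \to V(B) \sqcup V(D)$ be defined by

$$(f \sqcup g)(a) = f(a) \quad \text{for } a \in V(A),$$

$$(f \sqcup g)(c) = g(c) \quad \text{for } c \in V(C).$$

One verifies directly that $f \sqcup g$ is a graph homomorphism $\overline{A \sqcup C} \to \overline{B \sqcup D}$. Let the map $f \times g : V(A) \times V(C) \to V(B) \times V(D)$ be defined by

$$(f \times g)(a, c) = (f(a), g(c)).$$

One verifies directly that $f \times g$ is a graph homomorphism $\overline{A \boxtimes C} \to \overline{B \boxtimes D}$. This proves (ii). We prove (iii). Let $r = |V(A)|$. Then $A \leqslant \overline{K_r}$. By assumption $B \neq K_0$, so $\overline{K_1} \leqslant B$. Therefore $A \leqslant \overline{K_r} \cong \overline{K_r} \boxtimes \overline{K_1} \leqslant \overline{K_r} \boxtimes B$. This proves (iii).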

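3.2.3 The asymptotic spectrum of graphs $\mathbf{X}(\mathcal{G})$

We thus have a semiring $\mathcal{G}$ with a Strassen preorder $\leqslant$. We are therefore in the position to apply the theory of asymptotic spectra (Chapter 2). Let us translate the abstract terminology to this setting.

Let $G \lesssim H$ if there is a sequence $(x_N) \in \mathbb{N}^{\mathbb{N}}$ with $(x_N)^{1/N} \to 1$ such that for every $N \in \mathbb{N}$ we have $G^{\boxtimes N} \leqslant H^{\boxtimes N} \boxtimes \overline{K_{x_N}}$, i.e. $G^{\boxtimes N} \leqslant (H^{\boxtimes N})^{\sqcup x_N}$.

Let $S \subseteq \mathcal{G}$ be a subsemiring. For example, one may take $S = \mathcal{G}$, or one may choose any set $X \subseteq \mathcal{G}$ and let $S = \mathbb{N}[X]$ be the subsemiring of $\mathcal{G}$ generated by $X$ under $\sqcup$ and $\boxtimes$.

The asymptotic spectrum of $S$ is the set $\mathbf{X}(S)$ of $\leqslant$-monotone semiring homomorphisms $S \to \mathbb{R}_{\geq 0}$, i.e. all maps $\varphi : S \to \mathbb{R}_{\geq 0}$ such that, for all $G, H \in S$,

(1) if $G \leqslant H$, then $\varphi(G) \leq \varphi(H)$

(2) $\varphi(G \sqcup H) = \varphi(G) + \varphi(H)$

(3) $\varphi(G \boxtimes H) = \varphi(G)\varphi(H)$

(4) $\varphi(K_1) = 1$.

We call $\mathbf{X}(\mathcal{G})$ the asymptotic spectrum of graphs.

Theorem 3.4. Let $G, H \in S$. Then $G \lesssim H$ iff $\forall \varphi \in \mathbf{X}(S)\ \ \varphi(G) \leq \varphi(H)$.

Proof. By Lemma 3.2 we have a semiring $S$, and by Lemma 3.3 we have a Strassen preorder $\leqslant$, so we may apply Theorem 2.12.

We refer to Chapter 2 for a discussion of the topological properties of $\mathbf{X}(S)$.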

3.2.4 Shannon capacity $\Theta$

Let us discuss the (asymptotic) rank and (asymptotic) subrank for $(\mathcal{G}, \leqslant)$. Recall that an independent set in $G$ is a subset of $V(G)$ that contains no edges, and the independence number $\alpha(G)$ is the cardinality of the largest independent set in $G$. A colouring of $G$ is an assignment of colours to the elements of $V(G)$ such that connected vertices get distinct colours. The chromatic number $\chi(G)$ is the smallest number of colours in any colouring of $G$. The clique cover number $\overline{\chi}(G)$ is defined as the chromatic number of the complement, $\overline{\chi}(G) = \chi(\overline{G})$.

For the semiring $\mathcal{G}$ with preorder $\leqslant$, the abstract definition of subrank of Section 2.8 becomes $Q(G) = \max\{m \in \mathbb{N} : \overline{K_m} \leqslant G\}$ and the abstract definition of rank becomes $R(G) = \min\{n \in \mathbb{N} : G \leqslant \overline{K_n}\}$.

Lemma 3.5.

(i) $\alpha(G) = Q(G)$

(ii) $\overline{\chi}(G) = R(G)$.

Proof. We leave the proof to the reader.
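The statement of Lemma 3.5 can be checked by brute force on a small graph. The sketch below decides $G \leqslant H$ by searching for a homomorphism between the complements, writes `edgeless(n)` for the graph with $n$ vertices and no edges, and evaluates $Q$ and $R$ for the 5-cycle. The graph representation and all function names are assumptions of this illustration, and the exhaustive search is only feasible for very small graphs.

```python
from itertools import combinations, product

def cycle(n):
    return list(range(n)), {frozenset((i, (i + 1) % n)) for i in range(n)}

def edgeless(n):
    """The graph with n vertices and no edges (complement of K_n)."""
    return list(range(n)), set()

def complement(G):
    V, E = G
    return V, {frozenset(p) for p in combinations(V, 2)} - E

def has_homomorphism(G, H):
    """Brute force: is there a map V(G) -> V(H) sending edges to edges?"""
    (VG, EG), (VH, EH) = G, H
    for image in product(VH, repeat=len(VG)):
        f = dict(zip(VG, image))
        if all(frozenset((f[u], f[v])) in EH for u, v in map(tuple, EG)):
            return True
    return False

def leq(G, H):
    """G <= H iff there is a homomorphism from complement(G) to complement(H)."""
    return has_homomorphism(complement(G), complement(H))

def subrank(G):
    """Q(G) = max{m : edgeless(m) <= G}."""
    return max(m for m in range(len(G[0]) + 1) if leq(edgeless(m), G))

def rank(G):
    """R(G) = min{n : G <= edgeless(n)}."""
    return min(n for n in range(len(G[0]) + 1) if leq(G, edgeless(n)))

C5 = cycle(5)
print(subrank(C5), rank(C5))
```

For the 5-cycle the independence number is 2 and the clique cover number is 3, which is what the two functions should report if the lemma is applied correctly.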

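We see directly that the asymptotic subrank is the Shannon capacity,

$$\tilde{Q}(G) = \lim_{N \to \infty} Q(G^{\boxtimes N})^{1/N} = \lim_{N \to \infty} \alpha(G^{\boxtimes N})^{1/N} = \Theta(G),$$

and that the asymptotic rank is the asymptotic clique cover number,

$$\tilde{R}(G) = \lim_{N \to \infty} R(G^{\boxtimes N})^{1/N} = \lim_{N \to \infty} \overline{\chi}(G^{\boxtimes N})^{1/N} = \tilde{\overline{\chi}}(G).$$

Let $S \subseteq \mathcal{G}$ be a subsemiring. Let $G \in S$.

Corollary 3.6. $\Theta(G) = \min_{\varphi \in \mathbf{X}(S)} \varphi(G)$.

Proof. Let $G$ be a graph. Either $G = K_0$, or $\overline{K_1} \leqslant G \leqslant \overline{K_1}$, or $\overline{G}$ contains at least one edge. In the first two cases the claim is clearly true. In the third case $G \geqslant \overline{K_2}$ and we may thus apply Corollary 2.14.

Corollary 3.7. $\tilde{\overline{\chi}}(G) = \max_{\varphi \in \mathbf{X}(S)} \varphi(G)$.

Proof. This is Corollary 2.13.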

Remark 3.8. As mentioned earlier, it turns out that $\tilde{\overline{\chi}}$ is in fact itself an element of $\mathbf{X}(\mathcal{G})$. See Section 3.3.2. (This is a striking difference with the situation for tensors, which we will discuss in Chapter 4; there both asymptotic rank and asymptotic subrank are not in the asymptotic spectrum, see Remark 4.4.)

Shannon capacity is not in the asymptotic spectrum

Lemma 3.9. $G \boxtimes \overline{G} \geqslant \overline{K_{|V(G)|}}$.

Proof. Let $D = \{(u, u) : u \in V(G)\}$. Let $(u, u), (v, v) \in D$ be distinct. Then either $\{u, v\} \in E(G)$ or $\{u, v\} \in E(\overline{G})$ (exclusive or), and so $\{(u, u), (v, v)\} \notin E(G \boxtimes \overline{G})$. Therefore the subgraph in $G \boxtimes \overline{G}$ induced by $D$ is isomorphic to $\overline{K_{|V(G)|}}$.
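The argument in the proof can be checked mechanically on small graphs. The sketch below tests, for a given graph, that no two distinct diagonal vertices $(u, u)$, $(v, v)$ are adjacent in the strong product of the graph with its complement. The function names and the graph representation (vertex list plus a set of frozenset edges) are assumptions of this illustration.

```python
from itertools import combinations

def complement_edges(V, E):
    """Edge set of the complement graph on vertex list V."""
    return {frozenset(p) for p in combinations(V, 2)} - E

def strong_adjacent(p, q, E1, E2):
    """Adjacency of two distinct vertices p, q in the strong product."""
    (g, h), (g2, h2) = p, q
    return (g == g2 or frozenset((g, g2)) in E1) and \
           (h == h2 or frozenset((h, h2)) in E2)

def diagonal_is_independent(V, E):
    """Is D = {(u, u)} an independent set in G ⊠ complement(G)?"""
    Ec = complement_edges(V, E)
    D = [(u, u) for u in V]
    return not any(strong_adjacent(p, q, E, Ec) for p, q in combinations(D, 2))

# A few small test graphs: the 5-cycle, a path on 4 vertices, K_4, and
# the graph with 4 vertices and no edges.
V5 = list(range(5))
C5 = {frozenset((i, (i + 1) % 5)) for i in range(5)}
V4 = list(range(4))
P4 = {frozenset((i, i + 1)) for i in range(3)}
K4 = {frozenset(p) for p in combinations(V4, 2)}

results = [diagonal_is_independent(V5, C5),
           diagonal_is_independent(V4, P4),
           diagonal_is_independent(V4, K4),
           diagonal_is_independent(V4, set())]
print(results)
```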

Example 3.10. Let $G$ be the Schläfli graph. This is a graph with 27 vertices. Thus $\Theta(G \boxtimes \overline{G}) \geq |V(G)| = 27$. On the other hand, Haemers in [Hae79] showed that $\Theta(G)\Theta(\overline{G}) \leq 21$. This implies the map $\Theta$ is not in $\mathbf{X}(\mathcal{G})$, since it is not multiplicative under $\boxtimes$.


3.3 Universal spectral points

The abstract theory of asymptotic spectra of Chapter 2 does not explicitly describe the elements of $\mathbf{X}(\mathcal{G})$, i.e. the universal spectral points (cf. Section 2.13). However, several graph parameters from the literature can be shown to be universal spectral points. In fact, recently in [BC18] the first infinite family of universal spectral points was found: the fractional Haemers bounds. We give a brief (and probably incomplete) overview of currently known elements in $\mathbf{X}(\mathcal{G})$.

3.3.1 Lovász theta number $\vartheta$

For any real symmetric matrix $A$, let $\Lambda(A)$ be the largest eigenvalue. The Lovász theta number $\vartheta(G)$ is defined as

$$\vartheta(G) = \min\{\Lambda(A) : A \in \mathbb{R}^{V(G) \times V(G)} \text{ symmetric},\ \{u, v\} \notin E(G) \Rightarrow A_{uv} = 1\}.$$

The parameter $\vartheta(G)$ was introduced by Lovász in [Lov79]. We refer to [Knu94] and [Sch03] for a survey. It follows from well-known properties that $\vartheta \in \mathbf{X}(\mathcal{G})$.

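3.3.2 Fractional graph parameters

Besides the Lovász theta number, there are several elements in $\mathbf{X}(\mathcal{G})$ that are naturally obtained as fractional versions of $\boxtimes$-submultiplicative, $\sqcup$-subadditive, $\leqslant$-monotone maps $\mathcal{G} \to \mathbb{R}_{\geq 0}$. For any map $\varphi : \mathcal{G} \to \mathbb{R}_{\geq 0}$ we define a fractional version $\varphi_f$ by

$$\varphi_f(G) = \inf_d \frac{\varphi(G \ltimes \overline{K_d})}{d},$$

where $\ltimes$ denotes the lexicographic product defined below. We will discuss several fractional parameters from the literature and prove a general theorem about fractional parameters.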
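To make the definition of a fractional version concrete, the sketch below takes $\varphi$ to be the clique cover number and evaluates $\varphi(G \ltimes \overline{K_d})/d$ for the 5-cycle and $d = 1, 2$, reading the definition as $\varphi_f(G) = \inf_d \varphi(G \ltimes \overline{K_d})/d$ with $\ltimes$ the lexicographic product. The graph representation, the function names and the backtracking colouring routine are assumptions of this illustration, and the exact search is only practical for very small graphs.

```python
from itertools import combinations, product

def cycle(n):
    return list(range(n)), {frozenset((i, (i + 1) % n)) for i in range(n)}

def edgeless(d):
    """The graph with d vertices and no edges (complement of K_d)."""
    return list(range(d)), set()

def complement(G):
    V, E = G
    return V, {frozenset(p) for p in combinations(V, 2)} - E

def lex_product(G, H):
    """Lexicographic product: {g,g'} in E(G), or g = g' and {h,h'} in E(H)."""
    (VG, EG), (VH, EH) = G, H
    V = list(product(VG, VH))
    E = {frozenset((p, q)) for p, q in combinations(V, 2)
         if frozenset((p[0], q[0])) in EG
         or (p[0] == q[0] and frozenset((p[1], q[1])) in EH)}
    return V, E

def chromatic_number(G):
    """Exact chromatic number by backtracking (small graphs only)."""
    V, E = G
    adj = {v: set() for v in V}
    for e in E:
        u, w = tuple(e)
        adj[u].add(w)
        adj[w].add(u)

    def colourable(k):
        colour = {}

        def place(i):
            if i == len(V):
                return True
            v = V[i]
            used = {colour[u] for u in adj[v] if u in colour}
            # Symmetry breaking: open at most one new colour at a time.
            limit = min(k, max(colour.values()) + 2 if colour else 1)
            for c in range(limit):
                if c not in used:
                    colour[v] = c
                    if place(i + 1):
                        return True
                    del colour[v]
            return False

        return place(0)

    k = 0
    while not colourable(k):
        k += 1
    return k

def clique_cover_number(G):
    return chromatic_number(complement(G))

C5 = cycle(5)
ratios = {d: clique_cover_number(lex_product(C5, edgeless(d))) / d
          for d in (1, 2)}
print(ratios)
```

For the 5-cycle the ratio drops when passing from $d = 1$ to $d = 2$, which illustrates why the infimum over $d$ can be strictly smaller than the integer parameter itself.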

Fractional clique cover number

We consider the fractional version of the clique cover number $\overline{\chi}(G) = \chi(\overline{G})$. It is well-known that $\overline{\chi}_f \in \mathbf{X}(\mathcal{G})$, see e.g. [Sch03]. The fractional clique cover number $\overline{\chi}_f$ in fact equals the asymptotic clique cover number $\tilde{\overline{\chi}}(G) = \lim_{N \to \infty} \overline{\chi}(G^{\boxtimes N})^{1/N}$ which we introduced in the previous section, see [MP71] and also [Sch03, Th. 67.17].

Fractional Haemers bound

Let $\mathrm{rank}(A)$ denote the matrix rank of any matrix $A$. For any set $C$ of matrices define $\mathrm{rank}(C) = \min\{\mathrm{rank}(A) : A \in C\}$. For a field $\mathbb{F}$ and a graph $G$ define the set of matrices

$$M^{\mathbb{F}}(G) = \{A \in \mathbb{F}^{V(G) \times V(G)} : \forall u, v\ \ A_{vv} \neq 0,\ \{u, v\} \notin E(G) \Rightarrow A_{uv} = 0\}.$$

Let $R^{\mathbb{F}}(G) = \mathrm{rank}(M^{\mathbb{F}}(G))$. The parameter $R^{\mathbb{F}}(G)$ was introduced by Haemers in [Hae79] and is known as the Haemers bound. The fractional Haemers bound $R^{\mathbb{F}}_f$ was studied by Anna Blasiak in [Bla13] and was recently shown to be $\boxtimes$-multiplicative by Bukh and Cox in [BC18]. From this it is not hard to prove that $R^{\mathbb{F}}_f \in \mathbf{X}(\mathcal{G})$. Bukh and Cox in [BC18] furthermore prove a separation result: for any field $\mathbb{F}$ of nonzero characteristic and any $\varepsilon > 0$, there is a graph $G$ such that for any field $\mathbb{F}'$ with $\mathrm{char}(\mathbb{F}) \neq \mathrm{char}(\mathbb{F}')$ the inequality $R^{\mathbb{F}}_f(G) < \varepsilon R^{\mathbb{F}'}_f(G)$ holds. This separation result implies that there are infinitely many elements in $\mathbf{X}(\mathcal{G})$.

Fractional orthogonal rank

In [CMR+14] the orthogonal rank $\xi(G)$ and its fractional version, the projective rank $\xi_f(G)$, are studied. It easily follows from results in [CMR+14] that $G \mapsto \overline{\xi}_f(G)$ is in $\mathbf{X}(\mathcal{G})$.

General fractional parameters

We will prove something general about fractional parameters. Define the lexicographic product $G \ltimes H$ by

$$V(G \ltimes H) = V(G) \times V(H),$$

$$E(G \ltimes H) = \big\{\{(g, h), (g', h')\} : \{g, g'\} \in E(G) \text{ or } (g = g' \text{ and } \{h, h'\} \in E(H))\big\}.$$

The lexicographic product satisfies $\overline{G \ltimes H} = \overline{G} \ltimes \overline{H}$. Also define the or-product $G * H$ by

$$V(G * H) = V(G) \times V(H),$$

$$E(G * H) = \big\{\{(g, h), (g', h')\} : \{g, g'\} \in E(G) \text{ or } \{h, h'\} \in E(H)\big\}.$$

The or-product and the strong graph product are related by $\overline{G * H} = \overline{G} \boxtimes \overline{H}$. The strong graph product gives a subgraph of the lexicographic product, which gives a subgraph of the or-product,

$$G \boxtimes H \subseteq G \ltimes H \subseteq G * H.$$

Therefore $G * H \leqslant G \ltimes H \leqslant G \boxtimes H$. Finally, $G \ltimes \overline{K_d} = G * \overline{K_d}$ and of course $G \boxtimes \overline{K_d} = G^{\sqcup d}$.
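The chain of inclusions between the three products can be verified directly from the definitions on small examples. The sketch below computes the three edge sets on the common vertex set $V(G) \times V(H)$ and compares them. The function `product_edges` and its `kind` argument are hypothetical names introduced for this illustration.

```python
from itertools import combinations, product

def product_edges(kind, G, H):
    """Edge set of the strong, lexicographic ("lex") or or-product of G and H."""
    (VG, EG), (VH, EH) = G, H
    E = set()
    for p, q in combinations(list(product(VG, VH)), 2):
        (g, h), (g2, h2) = p, q
        eg = frozenset((g, g2)) in EG   # {g, g'} is an edge of G
        eh = frozenset((h, h2)) in EH   # {h, h'} is an edge of H
        if kind == "strong":
            adjacent = (eg or g == g2) and (eh or h == h2)
        elif kind == "lex":
            adjacent = eg or (g == g2 and eh)
        elif kind == "or":
            adjacent = eg or eh
        else:
            raise ValueError(kind)
        if adjacent:
            E.add(frozenset((p, q)))
    return E

# Test graphs: the 5-cycle and a path on 3 vertices.
C5 = (list(range(5)), {frozenset((i, (i + 1) % 5)) for i in range(5)})
P3 = (list(range(3)), {frozenset((0, 1)), frozenset((1, 2))})

strong = product_edges("strong", C5, P3)
lex = product_edges("lex", C5, P3)
orp = product_edges("or", C5, P3)
print(len(strong), len(lex), len(orp))
```

The edge counts printed at the end should be nondecreasing from left to right, since each edge set is contained in the next.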

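We will prove: if $\varphi : \mathcal{G} \to \mathbb{R}_{\geq 0}$ is $\boxtimes$-submultiplicative, $\sqcup$-subadditive and $\leqslant$-monotone, then $\varphi_f$ is again $\boxtimes$-submultiplicative, $\sqcup$-subadditive and $\leqslant$-monotone. Moreover, if $\varphi : \mathcal{G} \to \mathbb{N}$ is $\leqslant$-monotone and satisfies

$$\forall G, H \in \mathcal{G} \quad \varphi(G \ltimes H) \geq \varphi(G \ltimes \overline{K_{\varphi(H)}}),$$

then $\varphi_f$ is $\ltimes$-supermultiplicative and, more importantly, $\varphi_f$ is $\boxtimes$-supermultiplicative.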


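Lemma 3.11.

(i) If $\varphi$ is $\sqcup$-superadditive, then $\varphi_f$ is $\sqcup$-superadditive.

(ii) If $\varphi$ is $\leqslant$-monotone, then $\varphi_f$ is $\leqslant$-monotone.

(iii) If $\varphi$ is $\sqcup$-subadditive and $\leqslant$-monotone, then $\varphi_f$ is $\sqcup$-subadditive.

(iv) If $\forall n \in \mathbb{N}\ \varphi(\overline{K_n}) = n$, then $\forall n \in \mathbb{N}\ \varphi_f(\overline{K_n}) = n$.

(v) If $\varphi$ is $\boxtimes$-submultiplicative and $\leqslant$-monotone, then $\varphi_f$ is $\boxtimes$-submultiplicative.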

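Proof. Let $G, H \in \mathcal{G}$. Let $d \in \mathbb{N}$.

(i) The lexicographic product distributes over the disjoint union,

$$(G \sqcup H) \ltimes \overline{K_d} = (G \ltimes \overline{K_d}) \sqcup (H \ltimes \overline{K_d}).$$

By superadditivity,

$$\varphi\big((G \ltimes \overline{K_d}) \sqcup (H \ltimes \overline{K_d})\big) \geq \varphi(G \ltimes \overline{K_d}) + \varphi(H \ltimes \overline{K_d}).$$

Therefore

$$\begin{aligned}
\varphi_f(G \sqcup H) &= \inf_d \frac{\varphi((G \sqcup H) \ltimes \overline{K_d})}{d} \\
&= \inf_d \frac{\varphi((G \ltimes \overline{K_d}) \sqcup (H \ltimes \overline{K_d}))}{d} \\
&\geq \inf_d \Big(\frac{\varphi(G \ltimes \overline{K_d})}{d} + \frac{\varphi(H \ltimes \overline{K_d})}{d}\Big) \\
&\geq \inf_{d_1} \frac{\varphi(G \ltimes \overline{K_{d_1}})}{d_1} + \inf_{d_2} \frac{\varphi(H \ltimes \overline{K_{d_2}})}{d_2} \\
&= \varphi_f(G) + \varphi_f(H).
\end{aligned}$$

(ii) Let $G \leqslant H$. Then $G \ltimes \overline{K_d} \leqslant H \ltimes \overline{K_d}$. Thus $\varphi(G \ltimes \overline{K_d}) \leq \varphi(H \ltimes \overline{K_d})$. Therefore $\varphi_f(G) \leq \varphi_f(H)$.

(iii) We have $G \ltimes \overline{K_d} \leqslant G \boxtimes \overline{K_d} = G^{\sqcup d}$. Thus, by monotonicity and subadditivity,

$$\varphi(G \ltimes \overline{K_d}) \leq d\,\varphi(G),$$

and for $d, e \in \mathbb{N}$,

$$\varphi(G \ltimes \overline{K_{de}}) = \varphi((G \ltimes \overline{K_d}) \ltimes \overline{K_e}) \leq e\,\varphi(G \ltimes \overline{K_d}).$$

We use this inequality to get, for $d_1, d_2 \in \mathbb{N}$,

$$\frac{\varphi(G \ltimes \overline{K_{d_1}})}{d_1} + \frac{\varphi(H \ltimes \overline{K_{d_2}})}{d_2} \geq \frac{\varphi(G \ltimes \overline{K_{d_1 d_2}}) + \varphi(H \ltimes \overline{K_{d_1 d_2}})}{d_1 d_2}.$$

From subadditivity follows

$$\begin{aligned}
\frac{\varphi(G \ltimes \overline{K_{d_1 d_2}}) + \varphi(H \ltimes \overline{K_{d_1 d_2}})}{d_1 d_2}
&\geq \frac{\varphi((G \ltimes \overline{K_{d_1 d_2}}) \sqcup (H \ltimes \overline{K_{d_1 d_2}}))}{d_1 d_2} \\
&= \frac{\varphi((G \sqcup H) \ltimes \overline{K_{d_1 d_2}})}{d_1 d_2} \\
&\geq \varphi_f(G \sqcup H).
\end{aligned}$$

We conclude $\varphi_f(G) + \varphi_f(H) \geq \varphi_f(G \sqcup H)$.

(iv) Let $n \in \mathbb{N}$. Then $\varphi_f(\overline{K_n}) = \inf_d \varphi(\overline{K_n} \ltimes \overline{K_d})/d = \inf_d \varphi(\overline{K_{nd}})/d = n$.

(v) Let $d_1, d_2 \in \mathbb{N}$. We claim

$$(G \boxtimes H) \ltimes \overline{K_{d_1 d_2}} \leqslant (G \ltimes \overline{K_{d_1}}) \boxtimes (H \ltimes \overline{K_{d_2}}).$$

This is the same as saying there is a graph homomorphism

$$\overline{(G \boxtimes H) \ltimes \overline{K_{d_1 d_2}}} \to \overline{(G \ltimes \overline{K_{d_1}}) \boxtimes (H \ltimes \overline{K_{d_2}})},$$

which is the same as saying there is a graph homomorphism

$$(\overline{G} * \overline{H}) \ltimes K_{d_1 d_2} \to (\overline{G} \ltimes K_{d_1}) * (\overline{H} \ltimes K_{d_2}),$$

where $*$ denotes the or-product of graphs. One verifies that $(g, h, (i, j)) \mapsto ((g, i), (h, j))$ is such a graph homomorphism, proving the claim. The claim together with monotonicity and submultiplicativity gives

$$\varphi((G \boxtimes H) \ltimes \overline{K_{d_1 d_2}}) \leq \varphi((G \ltimes \overline{K_{d_1}}) \boxtimes (H \ltimes \overline{K_{d_2}})) \leq \varphi(G \ltimes \overline{K_{d_1}})\,\varphi(H \ltimes \overline{K_{d_2}}).$$

Therefore

$$\begin{aligned}
\varphi_f(G \boxtimes H) &= \inf_d \frac{\varphi((G \boxtimes H) \ltimes \overline{K_d})}{d} \\
&= \inf_{d_1, d_2} \frac{\varphi((G \boxtimes H) \ltimes \overline{K_{d_1 d_2}})}{d_1 d_2} \\
&\leq \inf_{d_1, d_2} \frac{\varphi(G \ltimes \overline{K_{d_1}})}{d_1} \frac{\varphi(H \ltimes \overline{K_{d_2}})}{d_2} \\
&= \varphi_f(G)\,\varphi_f(H).
\end{aligned}$$

This concludes the proof of the lemma.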

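Lemma 3.12. Let $\varphi : \mathcal{G} \to \mathbb{N}$ satisfy

$$\forall G, H \in \mathcal{G} \quad \varphi(G \ltimes H) \geq \varphi(G \ltimes \overline{K_{\varphi(H)}}). \qquad (3.2)$$

Then

$$\inf_H \frac{\varphi(G \ltimes H)}{\varphi(H)} = \inf_d \frac{\varphi(G \ltimes \overline{K_d})}{d}.$$

Proof. From (3.2) follows

$$\frac{\varphi(G \ltimes H)}{\varphi(H)} \geq \frac{\varphi(G \ltimes \overline{K_{\varphi(H)}})}{\varphi(H)},$$

and so

$$\frac{\varphi(G \ltimes H)}{\varphi(H)} \geq \inf_d \frac{\varphi(G \ltimes \overline{K_d})}{d}.$$

We take the infimum over $H$ to get

$$\inf_H \frac{\varphi(G \ltimes H)}{\varphi(H)} \geq \inf_d \frac{\varphi(G \ltimes \overline{K_d})}{d}.$$

The inequality in the other direction,

$$\inf_H \frac{\varphi(G \ltimes H)}{\varphi(H)} \leq \inf_d \frac{\varphi(G \ltimes \overline{K_d})}{d},$$

is trivially true.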

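Lemma 3.13. Let $\phi : \mathcal{G} \to \mathbb{N}$ be $\leq$-monotone and satisfy
\[ \forall G, H \in \mathcal{G} \quad \phi(G \ltimes H) \geq \phi(G \ltimes \overline{K_{\phi(H)}}). \]
Then $\phi^f$ is $\ltimes$- and $\boxtimes$-supermultiplicative.

Proof. Let $A, B \in \mathcal{G}$. We have $A \boxtimes B \geq A \ltimes B$, so
\[ \phi^f(A \boxtimes B) \geq \phi^f(A \ltimes B). \]
It remains to show $\phi^f(A \ltimes B) \geq \phi^f(A)\,\phi^f(B)$. We have
\[ \frac{\phi(A \ltimes B \ltimes H)}{\phi(H)} = \frac{\phi(A \ltimes (B \ltimes H))}{\phi(B \ltimes H)} \cdot \frac{\phi(B \ltimes H)}{\phi(H)}, \]
which implies
\[ \frac{\phi(A \ltimes B \ltimes H)}{\phi(H)} \geq \inf_{H'} \frac{\phi(A \ltimes H')}{\phi(H')} \cdot \inf_{H''} \frac{\phi(B \ltimes H'')}{\phi(H'')} = \phi^f(A)\,\phi^f(B). \]
Take the infimum over $H$ to obtain $\phi^f(A \ltimes B) \geq \phi^f(A)\,\phi^f(B)$.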

Theorem 3.14. Let $\phi : \mathcal{G} \to \mathbb{N}$ be $\sqcup$-additive, $\boxtimes$-submultiplicative, $\leq$-monotone and $\overline{K_n}$-normalised, and satisfy
\[ \forall G, H \in \mathcal{G} \quad \phi(G \ltimes H) \geq \phi(G \ltimes \overline{K_{\phi(H)}}). \]
Then $\phi^f$ is in $\mathbf{X}(\mathcal{G})$.

Proof. This follows from Lemma 3.11, Lemma 3.12 and Lemma 3.13.

3.4 Conclusion

In this chapter we introduced a new connection between Strassen's theory of asymptotic spectra and the Shannon capacity of graphs. In particular, we characterised the Shannon capacity (which is defined as a supremum) as a minimisation over elements in the asymptotic spectrum of graphs. Known elements in the asymptotic spectrum of graphs include the fractional clique cover number, the Lovász theta number, the projective rank and the fractional Haemers bound. We are left with a clear goal for future work: find all elements in the asymptotic spectrum of graphs.

Chapter 4

The asymptotic spectrum of tensors; exponent of matrix multiplication

This chapter is based on joint work with Matthias Christandl and Péter Vrana [CVZ18].

4.1 Introduction

This chapter is about tensors $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ and their asymptotic properties. The theory of asymptotic spectra of Chapter 2 was developed by Strassen exactly for the purpose of understanding the asymptotic properties of tensors. This chapter is expository and provides the necessary background for understanding Chapter 5 and Chapter 6.

Let us first define the asymptotic properties of interest and discuss some of their applications. We need the concepts restriction, tensor product and diagonal tensor. Let $s \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ and $t \in \mathbb{F}^{m_1} \otimes \cdots \otimes \mathbb{F}^{m_k}$ be tensors. We say $s$ restricts to $t$, and write $s \geq t$, if there are linear maps $A_i : \mathbb{F}^{n_i} \to \mathbb{F}^{m_i}$ such that $t = (A_1 \otimes \cdots \otimes A_k) \cdot s$. The tensor product of $s$ and $t$ is the element $s \otimes t \in \mathbb{F}^{n_1 m_1} \otimes \cdots \otimes \mathbb{F}^{n_k m_k}$ with coordinates $(s \otimes t)_{ij} = s_i t_j$. We naturally define the direct sum $s \oplus t \in \mathbb{F}^{n_1 + m_1} \otimes \cdots \otimes \mathbb{F}^{n_k + m_k}$. We define the diagonal tensors $\langle n \rangle = \sum_{i=1}^{n} e_i \otimes \cdots \otimes e_i$ for $n \in \mathbb{N}$, where $e_1, \ldots, e_n$ is the standard basis of $\mathbb{F}^n$.

The tensor rank $R(t)$ is the smallest number $n \in \mathbb{N}$ such that $t$ can be written as a sum of $n$ simple tensors, a simple tensor being a tensor of the form $v_1 \otimes \cdots \otimes v_k$. Equivalently, $R(t) = \min\{n \in \mathbb{N} : t \leq \langle n \rangle\}$. The asymptotic rank is the regularisation $\tilde{R}(t) = \lim_{n \to \infty} R(t^{\otimes n})^{1/n}$. While tensor rank is known to be hard to compute [Has90, Shi16], we do not know whether asymptotic rank is hard to compute.
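The definitions of diagonal tensor, tensor product and direct sum can be made concrete with a small sketch. It assumes a sparse encoding of a 3-tensor as a dictionary from 0-based index triples to coefficients; the function names and the index encoding `i * m + j` for a pair `(i, j)` are illustrative choices, not notation from the text.

```python
from itertools import product

def diagonal_tensor(n, k=3):
    """Sparse encoding of <n> = sum_i e_i (x) ... (x) e_i (k legs, 0-based indices)."""
    return {tuple([i] * k): 1 for i in range(n)}

def tensor_product(s, t, t_dims):
    """s (x) t, where the index pair (i, j) on each leg is encoded as i * m + j."""
    out = {}
    for (s_idx, s_val), (t_idx, t_val) in product(s.items(), t.items()):
        idx = tuple(a * m + b for a, b, m in zip(s_idx, t_idx, t_dims))
        out[idx] = s_val * t_val
    return out

def direct_sum(s, t, s_dims):
    """s (+) t, placing t in the block after s on every leg."""
    out = dict(s)
    for t_idx, t_val in t.items():
        out[tuple(b + n for b, n in zip(t_idx, s_dims))] = t_val
    return out

# With this encoding, <2> (x) <3> is literally <6> and <2> (+) <3> is literally <5>.
prod_2_3 = tensor_product(diagonal_tensor(2), diagonal_tensor(3), (3, 3, 3))
sum_2_3 = direct_sum(diagonal_tensor(2), diagonal_tensor(3), (2, 2, 2))
```

The two final lines illustrate that $n \mapsto \langle n \rangle$ is compatible with products and sums, which is what makes the diagonal tensors play the role of the natural numbers in the semiring of tensors.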

The exponent of matrix multiplication

The motivating example for studying asymptotic rank is the problem of finding the exponent of matrix multiplication $\omega$. Recall from the introduction that $\omega$ is the infimum over $a \in \mathbb{R}$ such that two $n \times n$ matrices can be multiplied using $O(n^a)$ arithmetic operations (in the algebraic circuit model). It turns out (see [BCS97]) that $\omega$ is characterised by the asymptotic rank $\tilde{R}(\langle 2, 2, 2 \rangle)$ of the matrix multiplication tensor
\[ \langle 2, 2, 2 \rangle = \sum_{i, j, k \in [2]} e_{ij} \otimes e_{jk} \otimes e_{ki} \in \mathbb{F}^4 \otimes \mathbb{F}^4 \otimes \mathbb{F}^4. \]
Namely, $\tilde{R}(\langle 2, 2, 2 \rangle) = 2^{\omega}$. We know the trivial lower bound $2 \leq \omega$, see Section 4.3. We know the (non-trivial) upper bound $\omega \leq 2.3728639$, which is by Coppersmith and Winograd [CW90] and improvements by Stothers, Williams and Le Gall [Sto10, Wil12, LG14].

Asymptotic subrank and asymptotic restriction

Besides (asymptotic) rank, we naturally define subrank $Q(t) = \max\{m \in \mathbb{N} : \langle m \rangle \leq t\}$ and the asymptotic subrank $\tilde{Q}(t) = \lim_{n \to \infty} Q(t^{\otimes n})^{1/n}$. Moreover, we say $s$ restricts asymptotically to $t$, written $s \gtrsim t$, if there is a sequence of natural numbers $a(n) \in o(n)$ such that for all $n \in \mathbb{N}$
\[ s^{\otimes n} \otimes \langle 2 \rangle^{\otimes a(n)} \geq t^{\otimes n}. \]
One can prove (see [Str91]) that
\[ s^{\otimes n} \otimes \langle 2 \rangle^{\otimes o(n)} \geq t^{\otimes n} \quad\text{iff}\quad s^{\otimes (n + o(n))} \geq t^{\otimes n}. \]
Our goal is to understand asymptotic restriction, asymptotic rank and asymptotic subrank.

More connections: quantum information, combinatorics, algebraic property testing

Besides matrix multiplication, other applications of asymptotic restriction of tensors, asymptotic rank of tensors and asymptotic subrank of tensors include deciding the feasibility of an asymptotic transformation between pure quantum states via stochastic local operations and classical communication (SLOCC) in quantum information theory [BPR+00, DVC00, VDDMV02, HHHH09], bounding the size of combinatorial structures like cap sets and tri-colored sum-free sets in additive combinatorics [Ede04, Tao08, ASU13, CLP17, EG17, Tao16, BCC+17, KSS16, TS16] (see Chapter 5), and bounding the query complexity of certain properties in algebraic property testing [KS08, BCSX10, Sha09, BX15, HX17, FK14].

This chapter is organised as follows. In Section 4.2 we briefly discuss the semiring of tensors, the asymptotic spectrum of tensors, and asymptotic rank and subrank. In Section 4.3 we discuss the gauge points, a simple construction of finitely many elements in the asymptotic spectrum of tensors. In Section 4.4 we discuss the Strassen support functionals, a family of elements in the asymptotic spectrum of "oblique" tensors. This family is parametrised by probability distributions on $[k]$. In Section 4.5 we discuss an extension of the support functionals, called the Strassen upper support functionals, which have the potential to be universal. Finally, in Section 4.6 we prove a new result: we show how asymptotic slice rank is related to the support functionals.

4.2 The asymptotic spectrum of tensors

Let us properly set up the semiring of tensors and the asymptotic spectrum. For the proofs we refer to [Str87, Str88, Str91].

4.2.1 The semiring of tensor equivalence classes $\mathcal{T}$

We begin by putting an equivalence relation on tensors. For example, we want to identify isomorphic tensors, and also, for any tensor $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$, we want to identify $t$ with $t \oplus 0$, where $0 \in \mathbb{F}^{m_1} \otimes \cdots \otimes \mathbb{F}^{m_k}$ is a zero tensor of any format.

We say $s$ is isomorphic to $t$, and write $s \cong t$, if there are bijective linear maps $A_i : \mathbb{F}^{m_i} \to \mathbb{F}^{n_i}$ such that $t = (A_1 \otimes \cdots \otimes A_k) \cdot s$.

We say $s$ and $t$ are equivalent, and write $s \sim t$, if there are zero tensors $s_0 = 0 \in \mathbb{F}^{a_1} \otimes \cdots \otimes \mathbb{F}^{a_k}$ and $t_0 = 0 \in \mathbb{F}^{b_1} \otimes \cdots \otimes \mathbb{F}^{b_k}$ such that $s \oplus s_0 \cong t \oplus t_0$. The equivalence relation $\sim$ is in fact the equivalence relation generated by the restriction preorder $\leq$.

Let $\mathcal{T}$ be the set of $\sim$-equivalence classes of $k$-tensors over $\mathbb{F}$, for some fixed $k$ and field $\mathbb{F}$. The direct sum and the tensor product naturally carry over to $\mathcal{T}$, and $\mathcal{T}$ becomes a semiring with additive unit $\langle 0 \rangle$ and multiplicative unit $\langle 1 \rangle$ (more precisely, the equivalence classes of those tensors, but we will not make this distinction).

4.2.2 Strassen preorder via restriction

Restriction $\leq$ induces a partial order on $\mathcal{T}$ which behaves well with respect to the semiring operations, and naturally $n \leq m$ if and only if $\langle n \rangle \leq \langle m \rangle$. Therefore restriction $\leq$ is a Strassen preorder on $\mathcal{T}$.

4.2.3 The asymptotic spectrum of tensors $\mathbf{X}(\mathcal{T})$

Let $S \subseteq \mathcal{T}$ be a subsemiring. Let
\[ \mathbf{X}(S) = \mathbf{X}(S, \leq) = \{\phi \in \mathrm{Hom}(S, \mathbb{R}_{\geq 0}) : \forall a, b \in S \;\; a \leq b \Rightarrow \phi(a) \leq \phi(b)\}. \]
We call $\mathbf{X}(S)$ the asymptotic spectrum of $S$, and we call $\mathbf{X}(\mathcal{T})$ the asymptotic spectrum of $k$-tensors over $\mathbb{F}$.

Theorem 4.1 ([Str88]). Let $s, t \in S$. Then $s \lesssim t$ iff $\forall \phi \in \mathbf{X}(S) \;\; \phi(s) \leq \phi(t)$.

Proof. This follows from Theorem 2.12.

We refer to Chapter 2 for a discussion of the topological properties of $\mathbf{X}(S)$.

Remark 4.2. We mention that $\mathbf{X}(S)$ may equivalently be defined with degeneration $\trianglerighteq$ instead of restriction $\geq$. Over $\mathbb{C}$, we say $f$ degenerates to $g$, written $f \trianglerighteq g$, if $f \cong f'$ and $g \cong g'$ and $g'$ is in the Euclidean closure (or equivalently Zariski closure) of the orbit $\mathrm{GL}_{n_1} \times \cdots \times \mathrm{GL}_{n_k} \cdot f'$. It is a nontrivial fact from algebraic geometry (see [Kra84, Lemma III.2.3.1] or [BCS97]) that there is a degeneration $f \trianglerighteq g$ if and only if there are matrices $A_i$ with entries polynomial in $\varepsilon$ such that $(A_1, \ldots, A_k) \cdot f = \varepsilon^d g + \varepsilon^{d+1} g_1 + \cdots + \varepsilon^{d+e} g_e$ for some elements $g_1, \ldots, g_e$. The latter definition of degeneration is valid when $\mathbb{C}$ is replaced by an arbitrary field $\mathbb{F}$, and that is how degeneration is defined for an arbitrary field. Degeneration is weaker than restriction: $f \geq g$ implies $f \trianglerighteq g$. Asymptotically, however, the notions coincide: $f \gtrsim g$ if and only if $f^{\otimes n} \otimes \langle 2 \rangle^{\otimes o(n)} \trianglerighteq g^{\otimes n}$. We mention that, analogous to restriction, degeneration gives rise to border rank and border subrank, $\underline{R}(f) = \min\{r \in \mathbb{N} : f \trianglelefteq \langle r \rangle\}$ and $\underline{Q}(f) = \max\{s \in \mathbb{N} : \langle s \rangle \trianglelefteq f\}$, respectively.

4.2.4 Asymptotic rank and asymptotic subrank

The abstract theory of asymptotic spectra characterises asymptotic subrank and asymptotic rank as follows.

Corollary 4.3. Let $S \subseteq \mathcal{T}$ be a subsemiring. Let $a \in S$. Then
\[ \tilde{Q}(a) = \min_{\phi \in \mathbf{X}(S)} \phi(a), \tag{4.1} \]
\[ \tilde{R}(a) = \max_{\phi \in \mathbf{X}(S)} \phi(a). \tag{4.2} \]

Proof. Statement (4.2) follows from Corollary 2.13, since either $a = 0$ or $a \geq 1$. For statement (4.1), if $a^{\otimes k} \geq 2$ for some $k \in \mathbb{N}$, then we apply Corollary 2.14. Otherwise, one can show that $\tilde{Q}(a)$ equals 0 or 1, using the gauge points of the next section (see [Str88, Lemma 3.7]).

Remark 4.4. One verifies that $\tilde{R}$ and $\tilde{Q}$ are $\leq$-monotones and have value $n$ on $\langle n \rangle$. They are not universal spectral points, however. Namely, the asymptotic rank of each of the three tensors
\begin{align*}
\langle 2, 1, 1 \rangle &= e_1 \otimes e_1 \otimes 1 + e_2 \otimes e_2 \otimes 1 \in \mathbb{F}^2 \otimes \mathbb{F}^2 \otimes \mathbb{F}^1, \\
\langle 1, 1, 2 \rangle &= e_1 \otimes 1 \otimes e_1 + e_2 \otimes 1 \otimes e_2 \in \mathbb{F}^2 \otimes \mathbb{F}^1 \otimes \mathbb{F}^2, \\
\langle 1, 2, 1 \rangle &= 1 \otimes e_1 \otimes e_1 + 1 \otimes e_2 \otimes e_2 \in \mathbb{F}^1 \otimes \mathbb{F}^2 \otimes \mathbb{F}^2
\end{align*}
equals 2, whereas their tensor product equals the matrix multiplication tensor $\langle 2, 2, 2 \rangle$, whose tensor rank equals 7 and whose asymptotic rank is thus at most 7, i.e. strictly smaller than $2^3$. Therefore asymptotic rank is not multiplicative. On the other hand, the asymptotic subrank of each of the above three tensors equals 1, whereas the asymptotic subrank of $\langle 2, 2, 2 \rangle$ equals 4, see Chapter 5. Therefore asymptotic subrank is not multiplicative.
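The upper bound $R(\langle 2, 2, 2 \rangle) \leq 7$ used in this remark is witnessed by Strassen's classical identity for multiplying $2 \times 2$ matrices with seven scalar multiplications. The sketch below checks that identity on integer matrices; the function names are illustrative, and the identity itself is the standard one from the literature rather than something derived in this chapter. Combined with $\tilde{R}(\langle 2, 2, 2 \rangle) = 2^{\omega}$ it gives $\omega \leq \log_2 7$.

```python
import math

def strassen_2x2(a, b):
    """Multiply two 2x2 matrices using 7 scalar multiplications (Strassen's identity)."""
    (a11, a12), (a21, a22) = a
    (b11, b12), (b21, b22) = b
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]

def naive_2x2(a, b):
    """Reference product using 8 scalar multiplications."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Rank at most 7 gives asymptotic rank at most 7, hence omega <= log2(7).
omega_upper_bound = math.log2(7)
```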

Goal 4.5. Our goal is now to explicitly describe elements in $\mathbf{X}(\mathcal{T})$ (universal spectral points), or, more modestly, to describe elements in $\mathbf{X}(S)$ for interesting subsemirings $S \subseteq \mathcal{T}$.

Strassen constructed a finite family of elements in $\mathbf{X}(\mathcal{T})$, the gauge points, and an infinite family of elements in $\mathbf{X}(\{\text{oblique tensors}\})$, the support functionals. The support functionals are powerful enough to determine the asymptotic subrank of any "tight tensor". Tight tensors are discussed in Chapter 5. In Chapter 6 we construct an infinite family in $\mathbf{X}(\{k\text{-tensors over } \mathbb{C}\})$, the quantum functionals. In the rest of this chapter we discuss the gauge points and the support functionals. We will focus on the case $k = 3$ for clarity of exposition.

4.3 Gauge points $\zeta_{(i)}$

Strassen in [Str88] introduced a finite family of elements in $\mathbf{X}(\mathcal{T})$ called the gauge points. We focus on 3-tensors, but the construction generalises immediately to $k$-tensors. Let $V_i = \mathbb{F}^{n_i}$. Let $t \in V_1 \otimes V_2 \otimes V_3$. Let $i \in [3]$. Let $\mathrm{flatten}_i(t)$ be the image of $t$ under the grouping $V_1 \otimes V_2 \otimes V_3 \to V_i \otimes (\bigotimes_{j \neq i} V_j)$. We think of $\mathrm{flatten}_i(t)$ as a matrix. Let $\zeta_{(i)} : \mathcal{T} \to \mathbb{N} : t \mapsto \mathrm{rank}(\mathrm{flatten}_i(t))$, with rank denoting matrix rank. We call $\zeta_{(1)}, \zeta_{(2)}, \zeta_{(3)}$ the gauge points. From the properties of matrix rank follows directly that $\zeta_{(i)}$ is multiplicative under $\otimes$, additive under $\oplus$, monotone under restriction $\leq$ (and under degeneration $\trianglelefteq$), and normalised to 1 on $\langle 1 \rangle = e_1 \otimes e_1 \otimes e_1$.
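As a minimal sketch of the construction above, the following computes the three flattening ranks of a 3-tensor given as a sparse dictionary over the rationals. The encoding, the helper names and the choice of exact Gaussian elimination with `fractions.Fraction` are assumptions made for illustration; the field is taken to be $\mathbb{Q}$.

```python
from fractions import Fraction
from itertools import product

def matrix_rank(rows):
    """Rank of a rational matrix via Gaussian elimination (exact arithmetic)."""
    m = [[Fraction(v) for v in row] for row in rows]
    rank = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank

def gauge_points(tensor, dims):
    """Return [rank(flatten_1(t)), rank(flatten_2(t)), rank(flatten_3(t))].

    `tensor` maps 0-based index triples to coefficients; `dims` is (n1, n2, n3).
    """
    ranks = []
    for i in range(3):
        o1, o2 = [j for j in range(3) if j != i]
        rows = []
        for a in range(dims[i]):
            row = []
            for b, c in product(range(dims[o1]), range(dims[o2])):
                idx = [0, 0, 0]
                idx[i], idx[o1], idx[o2] = a, b, c
                row.append(tensor.get(tuple(idx), 0))
            rows.append(row)
        ranks.append(matrix_rank(rows))
    return ranks

def matmul_tensor(n):
    """<n,n,n>, with the basis vector e_{ij} encoded as index i * n + j."""
    return {(i * n + j, j * n + k, k * n + i): 1
            for i, j, k in product(range(n), repeat=3)}

mm2_gauge = gauge_points(matmul_tensor(2), (4, 4, 4))
```

For $\langle 2, 2, 2 \rangle$ all three flattenings have full rank 4, which is the value $\max_i \zeta_{(i)}(\langle 2, 2, 2 \rangle) = 4$ used later in this section.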

Theorem 4.6. $\zeta_{(1)}, \zeta_{(2)}, \zeta_{(3)} \in \mathbf{X}(\mathcal{T})$.

Recall $\tilde{Q}(t) \leq \phi(t) \leq \tilde{R}(t)$ for $\phi \in \mathbf{X}(\mathcal{T})$. In particular, $\max_i \zeta_{(i)}(t) \leq \tilde{R}(t)$. We do not know whether $\max_{i \in [3]} \zeta_{(i)}$ equals $\tilde{R}$. To be precise, we do not know any $t$ for which $\max_i \zeta_{(i)}(t) < \tilde{R}(t)$, and we do not know a proof that $\max_i \zeta_{(i)}(t) = \tilde{R}(t)$ for all $t$. There are various families of tensors $t$ for which $\max_i \zeta_{(i)}(t) = \tilde{R}(t)$ is proven. We will see such a family in Section 5.4.2. For the matrix multiplication tensor $\langle 2, 2, 2 \rangle$ we have $4 = \max_i \zeta_{(i)}(\langle 2, 2, 2 \rangle) \leq 2^{\omega}$, so $\max_i \zeta_{(i)}(t) = \tilde{R}(t)$ would imply that the matrix multiplication exponent $\omega$ equals 2.

On the other hand, $\tilde{Q}(t) \leq \min_i \zeta_{(i)}(t)$. There exist $t$ for which $\tilde{Q}(t)$ is strictly smaller than $\min_{i \in [3]} \zeta_{(i)}(t)$. To show this strict inequality we need another technique of Strassen: the support functionals. The support functionals are the topic of the next section.

4.4 Support functionals $\zeta^\theta$

Strassen in [Str91] constructed an infinite family of elements in the asymptotic spectrum of oblique $k$-tensors called the support functionals. In this section we explain the construction of the support functionals. The support functionals provide the benchmark for our new quantum functionals (Chapter 6) and are relevant in the context of combinatorial problems like the cap set problem (Section 5.4.2). For clarity of exposition we focus on 3-tensors. The ideas extend directly to $k$-tensors.

Oblique tensors are tensors for which, in some basis, the support has the following special structure. Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$. Let $e_1, \ldots, e_{n_i}$ be the standard basis of $\mathbb{F}^{n_i}$. Write $t = \sum_{i,j,k} t_{ijk}\, e_i \otimes e_j \otimes e_k$. Let $[n_i] = \{1, 2, \ldots, n_i\}$. Let $\mathrm{supp}(t) = \{(i, j, k) : t_{ijk} \neq 0\} \subseteq [n_1] \times [n_2] \times [n_3]$ be the support of $t$ with respect to the standard basis. Let $[n_i]$ have the natural ordering $1 < 2 < \cdots < n_i$, and let $[n_1] \times [n_2] \times [n_3]$ have the product order, denoted by $\leq$. That is, $x \leq y$ if $x_i \leq y_i$ for all $i \in [3]$. We call $\mathrm{supp}(t)$ oblique if $\mathrm{supp}(t)$ is an antichain with respect to $\leq$, i.e. if any two elements in $\mathrm{supp}(t)$ are incomparable with respect to $\leq$. We call a tensor $t$ oblique if $\mathrm{supp}(g \cdot t)$ is oblique for some group element $g \in G(t) = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$. The family of oblique tensors is a semiring under $\oplus$ and $\otimes$.

Not all tensors are oblique. Obliqueness is not a generic property (see Proposition 6.21). However, many tensors that are of interest in algebraic complexity theory are oblique, notably the matrix multiplication tensors
\[ \langle a, b, c \rangle = \sum_{i \in [a]} \sum_{j \in [b]} \sum_{k \in [c]} e_{ij} \otimes e_{jk} \otimes e_{ki} \in \mathbb{F}^{ab} \otimes \mathbb{F}^{bc} \otimes \mathbb{F}^{ca}. \]

For any finite set $X$, let $\mathcal{P}(X)$ be the set of all probability distributions on $X$. For any probability distribution $P \in \mathcal{P}(X)$, the Shannon entropy of $P$ is defined as $H(P) = -\sum_{x \in X} P(x) \log_2 P(x)$, with $0 \log_2 0$ understood as 0. Given finite sets $X_1, \ldots, X_k$ and a probability distribution $P \in \mathcal{P}(X_1 \times \cdots \times X_k)$ on the product set $X_1 \times \cdots \times X_k$, we denote the marginal distribution of $P$ on $X_i$ by $P_i$, that is, $P_i(a) = \sum_{x : x_i = a} P(x)$ for any $a \in X_i$.

Definition 4.7. Let $\theta \in \Theta = \mathcal{P}([3])$. For $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3} \setminus \{0\}$ with $\mathrm{supp}(t)$ oblique, define
\[ \zeta^\theta(t) = \max\bigl\{ 2^{\sum_{i=1}^{3} \theta(i) H(P_i)} : P \in \mathcal{P}(\mathrm{supp}(t)) \bigr\}. \]
We call the $\zeta^\theta$ for $\theta \in \Theta$ the support functionals.

Theorem 4.8. $\zeta^\theta \in \mathbf{X}(\{\text{oblique}\})$ for $\theta \in \Theta$.
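To make Definition 4.7 concrete, here is a rough numerical sketch that evaluates $\sum_i \theta(i) H(P_i)$ and approximates the maximum over $P$ by a grid search on the probability simplex. The grid search is an illustrative stand-in (it yields a lower bound on the maximum in general) and is not Strassen's method; the function names and the example support, three pairwise incomparable triples, are chosen for illustration.

```python
import math
from itertools import product

def entropy(dist):
    """Shannon entropy in bits, with 0 * log2(0) understood as 0."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

def h_theta(weights, support, theta):
    """sum_i theta(i) * H(P_i), where P puts weights[n] on support[n]."""
    total = 0.0
    for i in range(3):
        marginal = {}
        for w, x in zip(weights, support):
            marginal[x[i]] = marginal.get(x[i], 0.0) + w
        total += theta[i] * entropy(marginal.values())
    return total

def support_functional_grid(support, theta, steps=60):
    """Grid approximation of max_P 2^{sum_i theta(i) H(P_i)} over P on `support`."""
    best = 0.0
    n = len(support)
    for counts in product(range(steps + 1), repeat=n - 1):
        last = steps - sum(counts)
        if last < 0:
            continue
        weights = [c / steps for c in counts] + [last / steps]
        best = max(best, h_theta(weights, support, theta))
    return 2 ** best

# An antichain in [2] x [2] x [2], used here as an example oblique support.
EXAMPLE_SUPPORT = [(1, 1, 2), (1, 2, 1), (2, 1, 1)]
value_uniform = support_functional_grid(EXAMPLE_SUPPORT, (1 / 3, 1 / 3, 1 / 3))
value_first_leg = support_functional_grid(EXAMPLE_SUPPORT, (1.0, 0.0, 0.0))
```

For the uniform $\theta$, symmetry and concavity put the maximiser at the uniform distribution, which lies on the grid, so `value_uniform` agrees with $2^{h(1/3)}$ up to rounding. For $\theta = (1, 0, 0)$ the value is 2, the rank of the first flattening.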

We work towards the proof of Theorem 4.8. For $p \in [0, 1]$, let $h(p)$ be the binary entropy function $h(p) = -p \log_2 p - (1 - p) \log_2 (1 - p)$, i.e. $h(p)$ is the Shannon entropy of the probability vector $(p, 1 - p)$. The following properties of the Shannon entropy are well-known.

Lemma 4.9.

(i) $H(P \otimes Q) = H(P) + H(Q)$ for $P \in \mathcal{P}(X_1)$, $Q \in \mathcal{P}(X_2)$.

(ii) $H(P) \leq H(P_1) + H(P_2)$ for $P \in \mathcal{P}(X_1 \times X_2)$.

(iii) $H(pP \oplus (1 - p)Q) = pH(P) + (1 - p)H(Q) + h(p)$ for $P, Q \in \mathcal{P}(X)$, $p \in [0, 1]$.

(iv) $2^a + 2^b = \max_{0 \leq p \leq 1} 2^{pa + (1 - p)b + h(p)}$ for $a, b \in \mathbb{R}$.
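Property (iv) can be checked numerically. A short calculation (not spelled out in the text) shows the maximum is attained at $p^* = 2^a / (2^a + 2^b)$, where the exponent equals $\log_2(2^a + 2^b)$. The sketch below compares both sides at $p^*$ and against a coarse grid; the sample values of `a` and `b` are arbitrary.

```python
import math

def binary_entropy(p):
    """h(p) = -p log2 p - (1-p) log2 (1-p), with h(0) = h(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def exponent(p, a, b):
    """The exponent p*a + (1-p)*b + h(p) appearing in Lemma 4.9(iv)."""
    return p * a + (1 - p) * b + binary_entropy(p)

def maximiser(a, b):
    """Candidate maximiser p* = 2^a / (2^a + 2^b)."""
    return 2 ** a / (2 ** a + 2 ** b)

a, b = 1.5, 0.25
lhs = 2 ** a + 2 ** b
rhs_at_p_star = 2 ** exponent(maximiser(a, b), a, b)
grid_max = max(2 ** exponent(k / 1000, a, b) for k in range(1001))
```

Here `rhs_at_p_star` matches `lhs` up to rounding, and no grid point exceeds it.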

For $X \subseteq [n_1] \times [n_2] \times [n_3]$, let $X_{\leq} = \{y \in [n_1] \times [n_2] \times [n_3] : \exists x \in X \;\; y \leq x\}$ be the downward closure of $X$. Let $\max(X) = \{y \in X : \forall x \in X \;\; y \leq x \Rightarrow y = x\}$ be the maximal points of $X$ with respect to $\leq$. Let $S_n$ be the symmetric group of permutations of $[n]$. Then the product group $S_{n_1} \times S_{n_2} \times S_{n_3}$ acts naturally on $[n_1] \times [n_2] \times [n_3]$.

Lemma 4.10. Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$. For every $g \in G(t)$ there is a triple of permutations $w \in W(t) = S_{n_1} \times S_{n_2} \times S_{n_3}$ with $w \cdot \max(\mathrm{supp}(g \cdot t)) \subseteq \mathrm{supp}(t)_{\leq}$.

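Proof. We prepare for the construction of $w$. Let $n \in \mathbb{N}$. Let $e_1, \ldots, e_n$ be the standard basis of $\mathbb{F}^n$. Let $g \in \mathrm{GL}_n$. Let $f_1, \ldots, f_n$ with $f_j = g \cdot e_j$ be the transformed basis of $\mathbb{F}^n$. Let $(E_i)_{i \in [n]}$ and $(F_j)_{j \in [n]}$ be the complete flags of $\mathbb{F}^n$ with
\begin{align*}
E_i &= \mathrm{Span}\{e_i, e_{i+1}, \ldots, e_n\}, \\
F_j &= \mathrm{Span}\{f_j, f_{j+1}, \ldots, f_n\}.
\end{align*}
Define the map
\[ \pi : [n] \to [n] : j \mapsto \max\{i \in [n] : E_i \cap (f_j + F_{j+1}) \neq \emptyset\}. \tag{4.3} \]
We prove $\pi$ is injective. Let $j, k \in [n]$ with $j \leq k$ and suppose $i = \pi(j) = \pi(k)$. Let $\mathbb{F}^{\times} = \mathbb{F} \setminus \{0\}$. From (4.3) follows
\begin{align*}
(\mathbb{F}^{\times} e_i + E_{i+1}) \cap (f_j + F_{j+1}) &\neq \emptyset, \tag{4.4} \\
E_{i+1} \cap (f_j + F_{j+1}) &= \emptyset, \tag{4.5} \\
(\mathbb{F}^{\times} e_i + E_{i+1}) \cap (f_k + F_{k+1}) &\neq \emptyset. \tag{4.6}
\end{align*}
Suppose $j < k$. Then from (4.4) and (4.6) we obtain a contradiction to (4.5). We conclude that $j = k$. Thus $\pi$ is injective.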

For each $\mathbb{F}^{n_i}$ define, as above, the standard complete flag $(E^i_j)_{j \in [n_i]}$ of $\mathbb{F}^{n_i}$, the complete flag $(F^i_j)_{j \in [n_i]}$ corresponding to the basis given by $g_i$, and the permutation $\pi_i : [n_i] \to [n_i]$. Let $w = (\pi_1, \pi_2, \pi_3) \in W(t)$.

We will prove $w \cdot \max(\mathrm{supp}(g \cdot t)) \subseteq \mathrm{supp}(t)_{\leq}$. Let $y \in \max(\mathrm{supp}(g \cdot t))$. Let $x = w \cdot y$. By construction of $\pi_i$, the intersection $E^i_{x_i} \cap (f^i_{y_i} + F^i_{y_i + 1})$ is not empty. Choose
\[ \tilde{f}^i_{y_i} \in E^i_{x_i} \cap (f^i_{y_i} + F^i_{y_i + 1}). \]
Let $t^*$ be the multilinear map $\mathbb{F}^{n_1} \times \mathbb{F}^{n_2} \times \mathbb{F}^{n_3} \to \mathbb{F}$ with $t^*(e_i, e_j, e_k) = t_{ijk}$ for all $i \in [n_1]$, $j \in [n_2]$, $k \in [n_3]$. Then
\[ t^*(\tilde{f}^1_{y_1}, \tilde{f}^2_{y_2}, \tilde{f}^3_{y_3}) = t^*(f^1_{y_1}, f^2_{y_2}, f^3_{y_3}) + \sum_{\substack{z \in [n_1] \times [n_2] \times [n_3] \\ z > y}} c_z\, t^*(f^1_{z_1}, f^2_{z_2}, f^3_{z_3}) \tag{4.7} \]
for some $c_z \in \mathbb{F}$. Since $y$ is maximal in $\mathrm{supp}(g \cdot t)$, the sum over $z > y$ in (4.7) equals zero. We conclude $t^*(\tilde{f}^1_{y_1}, \tilde{f}^2_{y_2}, \tilde{f}^3_{y_3}) = t^*(f^1_{y_1}, f^2_{y_2}, f^3_{y_3}) \neq 0$. Thus $t^*(E^1_{x_1} \times E^2_{x_2} \times E^3_{x_3})$ is not zero, and thus $x \in \mathrm{supp}(t)_{\leq}$.

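Proof of Theorem 4.8. We prove $\zeta^\theta$ on oblique tensors is $\otimes$-multiplicative, $\oplus$-additive, $\leq$-monotone and normalised to 1 on $\langle 1 \rangle = e_1 \otimes e_1 \otimes e_1$. The normalisation $\zeta^\theta(\langle 1 \rangle) = 1$ is clear.

We prove $\zeta^\theta$ is $\otimes$-supermultiplicative. Let $s \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$ and let $t \in \mathbb{F}^{m_1} \otimes \mathbb{F}^{m_2} \otimes \mathbb{F}^{m_3}$. Let $P \in \mathcal{P}(\mathrm{supp}(s))$ and $Q \in \mathcal{P}(\mathrm{supp}(t))$. Then the product $P \otimes Q \in \mathcal{P}(\mathrm{supp}(s \otimes t))$ has marginals $P_i \otimes Q_i$. Since $H(P_i \otimes Q_i) = H(P_i) + H(Q_i)$ (Lemma 4.9(i)), we conclude $\zeta^\theta(s)\,\zeta^\theta(t) \leq \zeta^\theta(s \otimes t)$.

We prove $\zeta^\theta$ is $\otimes$-submultiplicative. For a probability distribution $P$ on a set of triples and $\theta \in \Theta$, we use the notation $H_\theta(P) = \sum_{i=1}^{3} \theta(i) H(P_i)$. We naturally identify $\mathrm{supp}(s \otimes t)$ with a subset of $[n_1] \times [n_2] \times [n_3] \times [m_1] \times [m_2] \times [m_3]$. Let $P \in \mathcal{P}(\mathrm{supp}(s \otimes t))$. Let $P_{[3]}$ be the marginal distribution of $P$ on $[n_1] \times [n_2] \times [n_3]$ and let $P_{3+[3]}$ be the marginal distribution of $P$ on $[m_1] \times [m_2] \times [m_3]$. Then $H_\theta(P) \leq H_\theta(P_{[3]}) + H_\theta(P_{3+[3]})$ by Lemma 4.9(ii). We conclude $\zeta^\theta(s \otimes t) \leq \zeta^\theta(s)\,\zeta^\theta(t)$.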

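We prove $\zeta^\theta$ is $\oplus$-additive. By definition,
\begin{align*}
\zeta^\theta(s \oplus t) &= \max\bigl\{ 2^{H_\theta(P)} : P \in \mathcal{P}(\mathrm{supp}(s \oplus t)) \bigr\} \\
&= \max\bigl\{ \max_{0 \leq p \leq 1} 2^{H_\theta(pP \oplus (1 - p)Q)} : P \in \mathcal{P}(\mathrm{supp}(s)),\ Q \in \mathcal{P}(\mathrm{supp}(t)) \bigr\}.
\end{align*}
From Lemma 4.9(iii) and (iv) follows
\begin{align*}
&\max\bigl\{ \max_{0 \leq p \leq 1} 2^{H_\theta(pP \oplus (1 - p)Q)} : P \in \mathcal{P}(\mathrm{supp}(s)),\ Q \in \mathcal{P}(\mathrm{supp}(t)) \bigr\} \\
&\quad= \max\bigl\{ \max_{0 \leq p \leq 1} 2^{pH_\theta(P) + (1 - p)H_\theta(Q) + h(p)} : P \in \mathcal{P}(\mathrm{supp}(s)),\ Q \in \mathcal{P}(\mathrm{supp}(t)) \bigr\} \\
&\quad= \max\bigl\{ 2^{H_\theta(P)} + 2^{H_\theta(Q)} : P \in \mathcal{P}(\mathrm{supp}(s)),\ Q \in \mathcal{P}(\mathrm{supp}(t)) \bigr\} \\
&\quad= \zeta^\theta(s) + \zeta^\theta(t).
\end{align*}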

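We conclude $\zeta^\theta(s \oplus t) = \zeta^\theta(s) + \zeta^\theta(t)$.

We prove $\zeta^\theta$ is $\leq$-monotone. Let $s \leq t$ with $\mathrm{supp}(s)$ and $\mathrm{supp}(t)$ oblique. Then there are linear maps $A_i$ with $s = (A_1 \otimes A_2 \otimes A_3) \cdot t$. If $A_1, A_2, A_3$ are of the form $\mathrm{diag}(1, \ldots, 1, 0, \ldots, 0)$, then $\zeta^\theta(s) \leq \zeta^\theta(t)$. Suppose $g = (A_1, A_2, A_3) \in G(t)$. Let $P \in \mathcal{P}(\mathrm{supp}(t))$ maximise $H_\theta$ on $\mathcal{P}(\mathrm{supp}(t))$. Let $\sigma \in W$ such that $\sigma \cdot P$ has non-increasing marginals. Then $H_\theta(\sigma \cdot P) = H_\theta(P)$ and $\sigma \cdot P$ maximises $H_\theta$ on $\mathcal{P}(\mathrm{supp}(\sigma \cdot t))$. Then $\sigma \cdot P$ maximises $H_\theta$ on $\mathcal{P}(\mathrm{supp}(\sigma \cdot t)_{\leq})$ by Lemma 4.12 below. Let $Q \in \mathcal{P}(\mathrm{supp}(g \cdot t))$ maximise $H_\theta$ on $\mathcal{P}(\mathrm{supp}(g \cdot t))$. By Lemma 4.10 there is a $w \in W$ with $w \cdot \mathrm{supp}(g \cdot t) \subseteq \mathrm{supp}(\sigma \cdot t)_{\leq}$. Then $H_\theta(w \cdot Q) = H_\theta(Q) \leq H_\theta(\sigma \cdot P) = H_\theta(P)$. Thus $\max_{P \in \mathcal{P}(\mathrm{supp}(g \cdot t))} H_\theta(P) \leq \max_{P \in \mathcal{P}(\mathrm{supp}(t))} H_\theta(P)$. We conclude $\zeta^\theta(g \cdot t) \leq \zeta^\theta(t)$.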

The following two lemmas finish the above proof of Theorem 4.8. Recall that in the proof we defined $H_\theta(P) = \sum_{i=1}^{3} \theta(i) H(P_i)$ for $\theta \in \Theta$.

Lemma 4.11 ([Str91, Prop. 2.1]). Let $\Phi \subseteq [n_1] \times [n_2] \times [n_3]$. Let $P \in \mathcal{P}(\Phi)$. Let $\mathrm{supp}(P)$ be the support $\{x \in \Phi : P(x) \neq 0\}$. For $x \in \Phi$ define $h_P(x) = -\sum_{i=1}^{3} \theta(i) \log_2 P_i(x_i)$. Then $P$ maximises $H_\theta$ on $\mathcal{P}(\Phi)$ if and only if
\[ \forall x \in \mathrm{supp}(P) \quad h_P(x) = \max_{y \in \Phi} h_P(y). \tag{4.8} \]

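Proof. We write $H_\theta(P)$ in terms of $h_P$:
\[ H_\theta(P) = \sum_{i=1}^{3} \theta(i) H(P_i) = \sum_{x \in \mathrm{supp}(P)} P(x)\, h_P(x). \tag{4.9} \]
For $Q \in \mathcal{P}(\Phi)$,
\begin{align*}
\lim_{\varepsilon \to 0^+} \frac{d}{d\varepsilon} H_\theta\bigl((1 - \varepsilon)P + \varepsilon Q\bigr)
&= \lim_{\varepsilon \to 0^+} \frac{d}{d\varepsilon} \sum_x \bigl((1 - \varepsilon)P(x) + \varepsilon Q(x)\bigr)\, h_{(1 - \varepsilon)P + \varepsilon Q}(x) \\
&= \sum_x P(x) \Bigl( \sum_{i=1}^{3} \theta(i) \frac{P_i(x_i) - Q_i(x_i)}{P_i(x_i) \ln(2)} \Bigr) + \sum_x \bigl(-P(x) + Q(x)\bigr)\, h_P(x) \\
&= \sum_x Q(x)\, h_P(x) - \sum_x P(x)\, h_P(x).
\end{align*}
Therefore, since $H_\theta$ is continuous and concave, $P$ maximises $H_\theta$ if and only if
\[ \forall Q \in \mathcal{P}(\Phi) \quad \sum_x Q(x)\, h_P(x) - \sum_x P(x)\, h_P(x) \leq 0. \tag{4.10} \]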

We will prove (4.10) is equivalent to (4.8). Suppose $\sum_x Q(x)\, h_P(x) \leq \sum_x P(x)\, h_P(x)$ for every $Q \in \mathcal{P}(\Phi)$. In particular, $h_P(y) \leq \sum_x P(x)\, h_P(x)$ for every $y \in \Phi$, so $\max_{y \in \Phi} h_P(y) \leq \sum_x P(x)\, h_P(x)$. Then $\max_{y \in \Phi} h_P(y) = \sum_x P(x)\, h_P(x)$. We conclude $\max_{y \in \Phi} h_P(y) = h_P(x)$ for every $x \in \mathrm{supp}(P)$.

Suppose $\max_{y \in \Phi} h_P(y) = h_P(x)$ for every $x \in \mathrm{supp}(P)$. Then $h_P(y) \leq h_P(x)$ for every $Q \in \mathcal{P}(\Phi)$, $y \in \mathrm{supp}(Q)$, $x \in \mathrm{supp}(P)$. We conclude $\sum_x Q(x)\, h_P(x) \leq \sum_x P(x)\, h_P(x)$.
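The optimality criterion (4.8) is easy to test for a given candidate $P$. The following sketch does so under the simplifying assumption that $\theta(i) > 0$ for all $i$, so that $h_P(y) = +\infty$ whenever some marginal of $P$ vanishes at $y$. The function names and the example set $\Phi$ are illustrative and not taken from [Str91].

```python
import math

def marginals(P):
    """Marginal distributions P_1, P_2, P_3 of P (a dict: triple -> probability)."""
    out = [{}, {}, {}]
    for x, p in P.items():
        for i in range(3):
            out[i][x[i]] = out[i].get(x[i], 0.0) + p
    return out

def satisfies_criterion(P, phi, theta, tol=1e-9):
    """Check (4.8): h_P(x) = max_{y in phi} h_P(y) for all x with P(x) > 0.

    Assumes theta(i) > 0 for all i and that P is supported inside phi.
    """
    m = marginals(P)

    def h(y):
        # h_P(y) = -sum_i theta(i) log2 P_i(y_i); infinite if a marginal vanishes.
        if any(m[i].get(y[i], 0.0) == 0.0 for i in range(3)):
            return math.inf
        return -sum(theta[i] * math.log2(m[i][y[i]]) for i in range(3))

    best = max(h(y) for y in phi)
    return all(abs(h(x) - best) <= tol for x, p in P.items() if p > 0)

PHI = [(1, 1, 2), (1, 2, 1), (2, 1, 1)]
THETA = (1 / 3, 1 / 3, 1 / 3)
uniform_P = {x: 1 / 3 for x in PHI}
skewed_P = {(1, 1, 2): 0.5, (1, 2, 1): 0.5, (2, 1, 1): 0.0}
```

For this $\Phi$ the uniform distribution satisfies (4.8), while `skewed_P` does not: it ignores the point $(2, 1, 1)$, at which $h_P$ is infinite.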

Lemma 4.12 ([Str91, Cor. 2.2]). Let $\Phi \subseteq [n_1] \times [n_2] \times [n_3]$. Let $P$ maximise $H_\theta$ on $\mathcal{P}(\Phi)$. Suppose $P_i$ is nonincreasing on $[n_i]$ for each $i \in [3]$. Then $P$ maximises $H_\theta$ on $\mathcal{P}(\Phi_{\leq})$, where $\Phi_{\leq}$ is the downward closure of $\Phi$ with respect to $\leq$.

Proof. We know $P$ satisfies (4.8). We will prove $P$ satisfies (4.8) with $\Phi$ replaced by $\Phi_{\leq}$. Then we are done by Lemma 4.11. Let $x \in \Phi_{\leq}$. Then $x \leq y$ for some $y \in \Phi$. Then $(P_1(x_1), P_2(x_2), P_3(x_3)) \geq (P_1(y_1), P_2(y_2), P_3(y_3))$, since each $P_i$ is nonincreasing. Then $h_P(x) \leq h_P(y)$. We conclude $\max_{\Phi_{\leq}} h_P \leq \max_{\Phi} h_P$. On the other hand, $\Phi \subseteq \Phi_{\leq}$. Therefore $\max_{\Phi} h_P \leq \max_{\Phi_{\leq}} h_P$.

Using the support functionals, Strassen managed to fully compute the asymptotic spectrum of several semirings generated by oblique tensors. We will see an example in Section 5.4.2.

4.5 Upper and lower support functionals $\zeta^\theta$, $\zeta_\theta$

In Section 4.4 we defined the support functionals $\zeta^\theta : \{\text{oblique}\} \to \mathbb{R}_{\geq 0}$ and proved that $\zeta^\theta \in \mathbf{X}(\{\text{oblique}\})$. From the general theory of asymptotic spectra (Chapter 2) we know $\zeta^\theta$ is the restriction of some map $\phi : \{\text{tensors}\} \to \mathbb{R}_{\geq 0}$ in $\mathbf{X}(\mathcal{T})$. However, the proof of that fact was non-constructive. In other words, we know that $\zeta^\theta$ can be extended to an element of $\mathbf{X}(\mathcal{T})$. In this short section we discuss a candidate extension proposed by Strassen, called the upper support functional. We also discuss a companion called the lower support functional.

For arbitrary $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$, the upper support functional and the lower support functional are defined as
\begin{align*}
\zeta^\theta(t) &= \min_{g \in G(t)} \max\bigl\{ 2^{H_\theta(P)} : P \in \mathcal{P}(\mathrm{supp}(g \cdot t)) \bigr\}, \\
\zeta_\theta(t) &= \max_{g \in G(t)} \max\bigl\{ 2^{H_\theta(P)} : P \in \mathcal{P}(\max(\mathrm{supp}(g \cdot t))) \bigr\},
\end{align*}
with $G(t) = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$ and $H_\theta(P) = \sum_{i=1}^{3} \theta(i) H(P_i)$. We summarise the known properties of the upper and lower support functional.

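Theorem 4.13 ([Str91]). Let $s \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$ and $t \in \mathbb{F}^{m_1} \otimes \mathbb{F}^{m_2} \otimes \mathbb{F}^{m_3}$. Let $\theta \in \Theta = \mathcal{P}([3])$.

(i) $\zeta^\theta(\langle n \rangle) = n$ for $n \in \mathbb{N}$.

(ii) $\zeta^\theta(s \oplus t) = \zeta^\theta(s) + \zeta^\theta(t)$.

(iii) $\zeta^\theta(s \otimes t) \leq \zeta^\theta(s)\,\zeta^\theta(t)$.

(iv) If $s \geq t$, then $\zeta^\theta(s) \geq \zeta^\theta(t)$.

Theorem 4.14 ([Str91]). Let $s \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$ and $t \in \mathbb{F}^{m_1} \otimes \mathbb{F}^{m_2} \otimes \mathbb{F}^{m_3}$. Let $\theta \in \Theta$.

(i) $\zeta_\theta(\langle n \rangle) = n$ for $n \in \mathbb{N}$.

(ii) $\zeta_\theta(s \oplus t) \geq \zeta_\theta(s) + \zeta_\theta(t)$.

(iii) $\zeta_\theta(s \otimes t) \geq \zeta_\theta(s)\,\zeta_\theta(t)$.

(iv) If $s \geq t$, then $\zeta_\theta(s) \geq \zeta_\theta(t)$.

Theorem 4.15 ([Str91]). $\zeta^\theta(s \otimes t) \geq \zeta^\theta(s)\,\zeta_\theta(t)$ and $\zeta^\theta(t) \geq \zeta_\theta(t)$ for $\theta \in \Theta$.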

Regarding statement (ii) in Theorem 4.14, Bürgisser [Bur90] shows that the lower support functional $\zeta_\theta$ is not in general additive under the direct sum when $\theta_i > 0$ for all $i$. See also [Str91, Comment (iii)]. In particular, this implies that the upper support functional $\zeta^\theta(t)$ and the lower support functional $\zeta_\theta(t)$ are not equal in general, the upper support functional being additive. In fact, to show that the lower support functional is not additive, Bürgisser first shows that when $\mathbb{F}$ is algebraically closed, the generic value of $\zeta_\theta$ on $\mathbb{F}^n \otimes \mathbb{F}^n \otimes \mathbb{F}^n$ equals $(1 - \min_i \theta_i) \log_2 n + o(n)$. On the other hand, Tobler [Tob91] shows that the generic value of $\zeta^\theta$ on $\mathbb{F}^n \otimes \mathbb{F}^n \otimes \mathbb{F}^n$ equals $\log_2 n$. So even generically, $\zeta^\theta$ and $\zeta_\theta$ are different on $\mathbb{F}^n \otimes \mathbb{F}^n \otimes \mathbb{F}^n$.

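For $\theta \in \Theta$, we say $t$ is $\theta$-robust if $\zeta^\theta(t) = \zeta_\theta(t)$. We say $t$ is robust if $t$ is $\theta$-robust for all $\theta \in \Theta$. Let us try to understand what robust tensors look like. Since $\zeta^\theta(t) \geq \zeta_\theta(t)$ always holds (Theorem 4.15), a tensor $t$ is $\theta$-robust if and only if
\[ \zeta^\theta(t) \leq \zeta_\theta(t). \tag{4.11} \]
The set of $\theta$-robust tensors is closed under $\oplus$ and $\otimes$, since
\[ \zeta^\theta(s \oplus t) = \zeta^\theta(s) + \zeta^\theta(t) = \zeta_\theta(s) + \zeta_\theta(t) \leq \zeta_\theta(s \oplus t) \]
and
\[ \zeta^\theta(s \otimes t) \leq \zeta^\theta(s)\,\zeta^\theta(t) = \zeta_\theta(s)\,\zeta_\theta(t) \leq \zeta_\theta(s \otimes t). \]
For $X \subseteq [n_1] \times [n_2] \times [n_3]$ we use the notation $H_\theta(X) = \max_{P \in \mathcal{P}(X)} H_\theta(P)$. Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3} \setminus \{0\}$. Equation (4.11) means that there are $g, h \in G(t)$ and $P \in \mathcal{P}(\max \mathrm{supp}(h \cdot t))$ such that $H_\theta(\mathrm{supp}(g \cdot t)) \leq H_\theta(P)$. In this case we have $\zeta^\theta(t) = \zeta_\theta(t) = 2^{H_\theta(P)}$. In particular, $t$ is $\theta$-robust if there is a $g \in G(t)$ such that the maximisation $H_\theta(\mathrm{supp}(g \cdot t))$ is attained by a $P \in \mathcal{P}(\max(\mathrm{supp}(g \cdot t)))$. This criterion is automatically satisfied for all $\theta$ when $\mathrm{supp}(g \cdot t) = \max(\mathrm{supp}(g \cdot t))$ for some $g \in G(t)$. Suppose $t$ is oblique. Then $\mathrm{supp}(g \cdot t)$ is an antichain for some $g \in G(t)$, and thus $\mathrm{supp}(g \cdot t) = \max \mathrm{supp}(g \cdot t)$. Then $t$ is robust and $\zeta^\theta(t) = \zeta_\theta(t) = 2^{H_\theta(\mathrm{supp}(g \cdot t))}$.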

4.6 Asymptotic slice rank

Slice rank is a variation on tensor rank that was introduced by Terence Tao in [Tao16] to study cap sets. We will look at cap sets in Section 5.4. Here we study the relationship between asymptotic slice rank and the support functionals.

Consider the following characterisation of tensor rank. Let a simple tensor be any tensor of the form $v_1 \otimes v_2 \otimes v_3 \in V_1 \otimes V_2 \otimes V_3$ with $v_i \in V_i$ for $i \in [k]$. Then the rank $R(t)$ of $t \in V_1 \otimes V_2 \otimes V_3$ is the smallest number $r$ such that $t$ can be written as a sum of $r$ simple tensors.

Slice rank is defined similarly, but with simple tensors replaced by slices. For $S \subseteq [k]$, let $V_S = \bigotimes_{i \in S} V_i$. For $j \in [k]$, let $\bar{j} = [k] \setminus \{j\}$. A tensor in $V_1 \otimes V_2 \otimes V_3$ is called a slice if it is of the form $v \otimes w$ with $v \in V_j$ and $w \in V_{\bar{j}}$ for some $j \in [k]$ (under the natural reordering of the tensor legs). Let $t \in V_1 \otimes V_2 \otimes V_3$. The slice rank of $t$, denoted by $\mathrm{SR}(t)$, is the smallest number $r$ such that $t$ can be written as a sum of $r$ slices. For example, the tensor
\[ W = e_1 \otimes e_1 \otimes e_2 + e_1 \otimes e_2 \otimes e_1 + e_2 \otimes e_1 \otimes e_1 \in \mathbb{F}^2 \otimes \mathbb{F}^2 \otimes \mathbb{F}^2 \tag{4.12} \]
has slice rank 2, since we can write $W = e_1 \otimes (e_1 \otimes e_2 + e_2 \otimes e_1) + e_2 \otimes e_1 \otimes e_1$. In fact, the slice rank of any element in $V_1 \otimes V_2 \otimes V_3$ is at most $\min_i \dim V_i$. The tensor rank of $W$, on the other hand, is known to be 3.

Slice rank is clearly monotone under restriction. The slice rank of the diagonal tensor $\langle r \rangle$ equals $r$ [Tao16]. It follows that subrank is at most slice rank,
\[ Q(t) \leq \mathrm{SR}(t). \]
The motivation for the introduction of slice rank in [Tao16] was finding upper bounds on subrank $Q(t)$ and asymptotic subrank $\tilde{Q}(t)$.

The main result of this section is the following theorem. Recall that a tensor $t$ is oblique if the support $\mathrm{supp}(g \cdot t)$ is an antichain for some $g \in G(t)$.

Theorem 4.16. Let $t$ be oblique. Then
\[ \lim_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n} = \min_{\theta \in \mathcal{P}([3])} \zeta^\theta(t). \]

Our proof of Theorem 4.16 is based on a proof of Tao and Sawin in [TS16] and discussions of the author with Dion Gijswijt. The explicit connection between asymptotic slice rank and the support functionals is new.

We use Theorem 4.16, before giving its proof, to see that $\mathrm{SR}$ is not submultiplicative and not supermultiplicative under the tensor product $\otimes$. In particular, we cannot use Fekete's lemma (Lemma 2.2) to prove that the limit $\lim_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n}$ exists. Thus the existence of the limit is a non-trivial consequence of Theorem 4.16.

Let $W$ be as in (4.12). Then $\mathrm{SR}(W) = 2$. We have $\zeta^{(1/3, 1/3, 1/3)}(W) = 2^{h(1/3)} < 2$. From Theorem 4.16 follows $\mathrm{SR}(W^{\otimes n}) \leq 2^{n h(1/3) + o(n)}$. We conclude $\mathrm{SR}(W^{\otimes n}) < 2^n$ for $n$ large enough. We conclude $\mathrm{SR}$ is not supermultiplicative. Now it is also clear that slice rank is not the same as (border) subrank, since (border) subrank is supermultiplicative.

Next, the tensors $\sum_{i=1}^{n} e_i \otimes e_i \otimes 1$, $\sum_{i=1}^{n} e_i \otimes 1 \otimes e_i$, $\sum_{i=1}^{n} 1 \otimes e_i \otimes e_i$ have slice rank one, while their tensor product equals the matrix multiplication tensor $\langle n, n, n \rangle$, which has slice rank $n^2$ by Theorem 4.16 and Theorem 5.3 in the next chapter, applied to the tight tensor $\langle n, n, n \rangle$. We conclude $\mathrm{SR}$ is not submultiplicative.

Slice rank and hitting set number

We study the hitting set number of the support of a tensor. Let $\Phi \subseteq [n_1] \times [n_2] \times [n_3]$. A hitting set for $\Phi$ is a 3-tuple of sets $A_1 \subseteq [n_1]$, $A_2 \subseteq [n_2]$, $A_3 \subseteq [n_3]$ such that for every $a \in \Phi$ there is an $i \in [3]$ with $a_i \in A_i$. We may think of $\Phi$ as a 3-partite 3-uniform hypergraph. Then the definition of hitting set says every edge $a \in \Phi$ is hit by an element of some $A_i$. A hitting set is also called a vertex cover (every edge being covered by some vertex) or a transversal. The size of the hitting set $(A_1, A_2, A_3)$ is $|A_1| + |A_2| + |A_3|$. The hitting set number $\tau(\Phi)$ is the size of the smallest hitting set for $\Phi$. Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$.
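For small supports the hitting set number can be found by exhaustive search. The sketch below is a brute-force illustration of the definition only (exponential time, with illustrative names), applied to the support of the tensor $W$ from (4.12).

```python
from itertools import combinations

def hitting_set_number(phi, dims):
    """Smallest |A1| + |A2| + |A3| such that every a in phi has a_i in A_i for some i.

    A vertex is a pair (leg, value); a hitting set is a set of such vertices.
    """
    vertices = [(i, v) for i in range(3) for v in range(1, dims[i] + 1)]
    for size in range(len(vertices) + 1):
        for chosen in combinations(vertices, size):
            chosen_set = set(chosen)
            if all(any((i, a[i]) in chosen_set for i in range(3)) for a in phi):
                return size
    return len(vertices)

# Support of W = e1(x)e1(x)e2 + e1(x)e2(x)e1 + e2(x)e1(x)e1 in the standard basis.
W_SUPPORT = {(1, 1, 2), (1, 2, 1), (2, 1, 1)}
tau_w = hitting_set_number(W_SUPPORT, (2, 2, 2))
```

Here `tau_w` is 2, attained for instance by $A_1 = \{1\}$, $A_2 = \{1\}$, $A_3 = \emptyset$, which matches the slice rank of $W$ computed earlier in this section.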

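Lemma 4.17. Let $g \in G(t) = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$. Then $\mathrm{SR}(t) \leq \tau(\mathrm{supp}(g \cdot t))$.

Proof. This is clear.

Lemma 4.18. Let $g \in G(t)$. Then $\mathrm{SR}(t) \geq \tau(\max(\mathrm{supp}(g \cdot t)))$.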

Proof. It is sufficient to consider $g = e$. Let
\[ t = \sum_{i=1}^{r_1} v^1_i \otimes u^1_i + \sum_{i=1}^{r_2} v^2_i \otimes u^2_i + \sum_{i=1}^{r_3} v^3_i \otimes u^3_i \]
be a slice decomposition. We may assume $v^j_1, \ldots, v^j_{r_j}$ are linearly independent. Let $V_j = \mathrm{Span}\{v^j_1, \ldots, v^j_{r_j}\} \subseteq \mathbb{F}^{n_j}$. Let $W_j \subseteq (\mathbb{F}^{n_j})^*$ be the elements in the dual space that vanish on $V_j$. Let $B_j \subseteq W_j$ be a basis with the following property: with respect to the standard basis, the matrix with the elements of $B_j$ as columns is in reduced row echelon form, i.e. each column is of the form $(\ast \cdots \ast\ 1\ 0 \cdots 0)^T$ and the pivot elements (the 1's) are all in different rows. Let $S_j \subseteq [n_j]$ be the indices of the pivot elements. Let $\overline{S_j} = [n_j] \setminus S_j$ be the complement. Then $|\overline{S_j}| = r_j$. We claim $(\overline{S_1}, \overline{S_2}, \overline{S_3})$ is a hitting set for $\max(\mathrm{supp}(t))$. Then $r_1 + r_2 + r_3 = |\overline{S_1}| + |\overline{S_2}| + |\overline{S_3}| \geq \tau(\max(\mathrm{supp}(t)))$.

Let $x \in \max(\mathrm{supp}(t))$. Suppose $x \in S_1 \times S_2 \times S_3$. For every $j \in [3]$, let $\phi_j \in B_j$ have its pivot element at index $x_j$. Let $\phi = \phi_1 \otimes \phi_2 \otimes \phi_3$. Then $\phi \in W_1 \otimes W_2 \otimes W_3$, so $\phi(t) = 0$. Since $x$ is maximal and each $B_j$ is in reduced row echelon form,
\begin{align*}
\phi(t) &= \sum_{y \leq x} t_y\, \phi(e_{y_1} \otimes e_{y_2} \otimes e_{y_3}) \\
&= \sum_{y < x} t_y\, \phi(e_{y_1} \otimes e_{y_2} \otimes e_{y_3}) + t_x\, e_{x_1} \otimes e_{x_2} \otimes e_{x_3} \\
&= \sum_{y < x} s_y\, e_{y_1} \otimes e_{y_2} \otimes e_{y_3} + t_x\, e_{x_1} \otimes e_{x_2} \otimes e_{x_3}
\end{align*}
for some $s_y \in \mathbb{F}$. From $\phi(t) = 0$ follows $t_x = 0$. This contradicts $x \in \mathrm{supp}(t)$, so $x \notin S_1 \times S_2 \times S_3$, i.e. there is a $j \in [3]$ with $x_j \in \overline{S_j}$.

Asymptotic hitting set number

We now study the asymptotic hitting set number $\tilde{\tau}(\Phi) = \lim_{n\to\infty} \tau(\Phi^{\times n})^{1/n}$. We will use some basic facts of types and type classes. Let $X$ be a finite set. Let $N \in \mathbb{N}$. An $N$-type on $X$ is a probability distribution $P$ on $X$ with $N \cdot P(x) \in \mathbb{N}$ for all $x \in X$. Let $P$ be an $N$-type on $X$. The type class $T^N_P \subseteq X^N$ is the set of sequences $s = (s_1, \ldots, s_N)$ with $x$ occurring $N \cdot P(x)$ times in $s$ for every $x \in X$, i.e. $|\{i \in [N] : s_i = x\}| = N \cdot P(x)$.

Lemma 4.19. The number of $N$-types on $X$ equals $\binom{N+|X|-1}{|X|-1}$. Let $P$ be an $N$-type. The size of the type class $T^N_P$ equals the multinomial coefficient $\binom{N}{NP}$.

Proof. We leave the proof to the reader.

Lemma 4.20. Let $P$ be an $N$-type on $X$. Then
\[
\frac{1}{(N+1)^{|X|}}\, 2^{N H(P)} \le \binom{N}{NP} \le 2^{N H(P)}.
\]

Proof. See e.g. [CT12, Theorem 11.1.3].
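The two bounds on the size of a type class can be checked numerically for small types. The sketch below is only a sanity check, not a proof; the function names are hypothetical, and an $N$-type is represented by its tuple of counts $N \cdot P(x)$.

```python
from math import factorial, log2

def entropy(p):
    """Shannon entropy in bits of a probability vector."""
    return -sum(q * log2(q) for q in p if q > 0)

def type_class_size(counts):
    """Multinomial coefficient N! / prod(c!) for counts summing to N."""
    size = factorial(sum(counts))
    for c in counts:
        size //= factorial(c)
    return size

def check_bounds(counts):
    """True if 2^{NH(P)} / (N+1)^{|X|} <= |T^N_P| <= 2^{NH(P)} for P = counts / N."""
    n = sum(counts)
    h = entropy([c / n for c in counts])
    size = type_class_size(counts)
    lower = 2 ** (n * h) / (n + 1) ** len(counts)
    upper = 2 ** (n * h)
    # Small relative tolerance for floating point error in the upper bound.
    return lower <= size <= upper * (1 + 1e-9)

print(type_class_size((5, 5)), check_bounds((5, 5)), check_bounds((3, 2, 1)))
```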

Lemma 4.21. $\log_2 \tilde{\tau}(\Phi) \le \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} H(P_i)$.

Proof. Let $P$ maximise $\max_{P \in \mathcal{P}(\Phi)} \min_i H(P_i)$. Let $n \in \mathbb{N}$. We construct a hitting set $(A_1, A_2, A_3)$ for $\Phi^{\times n}$ as follows. Let $x \in \Phi^{\times n}$. Viewing $x$ as an $n$-tuple of elements in $\Phi$, let $Q \in \mathcal{P}_n(\Phi)$ be the type of $x$ (i.e. the empirical distribution). Let $j \in [3]$ with $H(Q_j) = \min_{i \in [3]} H(Q_i)$. By our choice of $P$ we have
\[
H(Q_j) = \min_{i \in [3]} H(Q_i) \le \min_{i \in [3]} H(P_i).
\]
Viewing $x$ as a 3-tuple $(x_1, x_2, x_3)$, add $x_j$ to $A_j$. We repeat this for all $x \in \Phi^{\times n}$. The final $(A_1, A_2, A_3)$ is a hitting set for $\Phi^{\times n}$ by construction. For each $j \in [3]$,
\[
|A_j| \le \sum_{Q_j} |T^n_{Q_j}| \le \sum_{Q_j} 2^{n H(Q_j)},
\]
where the sum is over $Q_j \in \mathcal{P}_n(\Phi_j)$ with $H(Q_j) \le \min_{i \in [3]} H(P_i)$. Then
\[
|A_j| \le |\mathcal{P}_n(\Phi_j)|\, 2^{n \min_i H(P_i)} = \mathrm{poly}(n)\, 2^{n \min_i H(P_i)}.
\]
We conclude $|A_1| + |A_2| + |A_3| \le \mathrm{poly}(n)\, 2^{n \min_i H(P_i)}$.

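Lemma 4.22. $\log_2 \tilde{\tau}(\Phi) \ge \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} H(P_i)$.

Proof. Let $P$ maximise $\max_{P \in \mathcal{P}(\Phi)} \min_i H(P_i)$. Let $n \in \mathbb{N}$. Let $(A_1, A_2, A_3)$ be a hitting set for $\Phi^{\times n}$. Let $Q \in \mathcal{P}_n(\Phi)$ be an $n$-type with $\min_i H(Q_i) = \min_i H(P_i) - o(n)$. Let $\Psi = T^n_Q \subseteq \Phi^{\times n}$ be the set of strings with type $Q$. Then $(A_1, A_2, A_3)$ is a hitting set for $\Psi$. Let $\pi_i : \Psi \to \Phi_i^{\times n} : (x_1, x_2, x_3) \mapsto x_i$. Then
\[
\Psi = \pi_1^{-1}(A_1) \cup \pi_2^{-1}(A_2) \cup \pi_3^{-1}(A_3).
\]
Let $j \in [3]$ with $|\pi_j^{-1}(A_j)| \ge \tfrac{1}{3}|\Psi|$. The fiber $\pi_j^{-1}(a)$ has constant size over $a \in \Psi_j$. Let $c_j = |\pi_j^{-1}(a)|$ be this size. Then
\[
|\Psi| = \sum_{a \in \Psi_j} |\pi_j^{-1}(a)| = \sum_{a \in \Psi_j} c_j = |\Psi_j|\, c_j.
\]
And
\[
|\pi_j^{-1}(A_j)| = \sum_{a \in A_j \cap \Psi_j} |\pi_j^{-1}(a)| = |A_j \cap \Psi_j|\, c_j \le |A_j|\, c_j.
\]
Therefore
\[
|A_j| \ge \frac{|\pi_j^{-1}(A_j)|}{c_j} \ge \frac{\tfrac{1}{3}|\Psi|}{c_j} = \tfrac{1}{3}|\Psi_j|.
\]
We have $|\Psi_j| \ge 2^{n H(Q_j) - o(n)} \ge 2^{n \min_i H(Q_i) - o(n)} \ge 2^{n \min_i H(P_i) - o(n)}$. We conclude $|A_1| + |A_2| + |A_3| \ge |A_j| \ge \tfrac{1}{3}|\Psi_j| \ge \tfrac{1}{3}\, 2^{n \min_i H(P_i) - o(n)}$.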

Lemma 4.23. $\log_2 \tilde{\tau}(\Phi) = \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} H(P_i)$.

Proof. This follows directly from the above lemmas.


Asymptotic slice rank

We now combine the above lemmas about slice rank and the asymptotic hitting set number to prove Theorem 4.16. First we have the following basic lemma.

Lemma 4.24. $\min_{\theta \in \Theta} \max_{P \in \mathcal{P}(\Phi)} H_\theta(P) = \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} H(P_i)$.

Proof. Since $H_\theta(P)$ is convex in $\theta$ and concave in $P$, von Neumann's minimax theorem gives $\min_\theta \max_{P \in \mathcal{P}(\Phi)} H_\theta(P) = \max_{P \in \mathcal{P}(\Phi)} \min_\theta H_\theta(P)$. Finally, we use that $\min_\theta H_\theta(P) = \min_i H(P_i)$.

Define $f^{\sim}(t) = \limsup_{n\to\infty} f(t^{\otimes n})^{1/n}$ and $f_{\sim}(t) = \liminf_{n\to\infty} f(t^{\otimes n})^{1/n}$.

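Lemma 4.25. Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$. Then
\[
\max_{g \in G(t)} \; \max_{P \in \mathcal{P}(\max \mathrm{supp}(g \cdot t))} \; \min_i 2^{H(P_i)} \;\le\; \mathrm{SR}_{\sim}(t) \;\le\; \mathrm{SR}^{\sim}(t) \;\le\; \min_\theta \zeta^\theta(t).
\]

Proof. By definition $\mathrm{SR}_{\sim}(t) \le \mathrm{SR}^{\sim}(t)$. From Lemma 4.17 follows
\[
\mathrm{SR}^{\sim}(t) \le \tilde{\tau}(\mathrm{supp}(g \cdot t))
\]
for any $g \in G(t)$. Lemma 4.23 gives $\tilde{\tau}(\mathrm{supp}(g \cdot t)) = \max_{P \in \mathcal{P}(\mathrm{supp}(g \cdot t))} \min_i 2^{H(P_i)}$. Thus, with the help of Lemma 4.24,
\[
\mathrm{SR}^{\sim}(t) \le \min_{g \in G(t)} \; \max_{P \in \mathcal{P}(\mathrm{supp}(g \cdot t))} \; \min_i 2^{H(P_i)} = \min_\theta \zeta^\theta(t).
\]
From Lemma 4.18 follows
\[
\tilde{\tau}(\max(\mathrm{supp}(g \cdot t))) \le \mathrm{SR}_{\sim}(t)
\]
for any $g \in G(t)$. Lemma 4.23 gives
\[
\max_{g \in G(t)} \; \max_{P \in \mathcal{P}(\max(\mathrm{supp}(g \cdot t)))} \; \min_i 2^{H(P_i)} \le \mathrm{SR}_{\sim}(t).
\]
This proves the lemma.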

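Proof of Theorem 4.16. We may assume $\Phi = \mathrm{supp}(t)$ is oblique. Then, with the help of Lemma 4.24 and Lemma 4.25,
\begin{align*}
\min_{\theta \in \Theta} \zeta^\theta(t) &= \min_{\theta \in \Theta} \zeta_\theta(t) \\
&= \min_{\theta \in \Theta} \; \max_{P \in \mathcal{P}(\max(\Phi))} 2^{H_\theta(P)} \\
&= \max_{P \in \mathcal{P}(\max(\Phi))} \; \min_{i \in [3]} 2^{H(P_i)} \\
&\le \max_{g \in G(t)} \; \max_{P \in \mathcal{P}(\max(\mathrm{supp}(g \cdot t)))} \; \min_{i \in [3]} 2^{H(P_i)} \\
&\le \mathrm{SR}_{\sim}(t) \\
&\le \mathrm{SR}^{\sim}(t) \\
&\le \min_{\theta \in \Theta} \zeta^\theta(t).
\end{align*}
This proves the claim.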


4.7 Conclusion

The study of asymptotic rank of tensors is motivated by the open problem of finding the exponent of matrix multiplication. Asymptotic subrank has applications in, for example, combinatorics and algebraic property testing. Via the theory of asymptotic spectra, Strassen characterised asymptotic rank and asymptotic subrank in terms of the asymptotic spectrum of tensors. Strassen introduced the gauge points in $X(\mathcal{T})$ and the support functionals in $X(\text{oblique})$. More precisely, there are the lower support functionals and the upper support functionals. The lower support functionals are not additive and can thus not be universal spectral points. The upper support functionals may be universal spectral points, but this can however not be shown with the help of the lower support functionals. Finally, we showed that for oblique tensors the asymptotic slice rank exists and equals the minimum value over the support functionals. In the next chapter we will see a subfamily of the oblique 3-tensors for which the support functionals are powerful enough to compute the asymptotic subrank.

Chapter 5

Tight tensors and combinatorial subrank; cap sets

This chapter is based on joint work with Matthias Christandl and Péter Vrana [CVZ16, CVZ18].

5.1 Introduction

In the previous chapter we discussed the gauge points and the support functionals $\zeta^\theta$. The gauge points are in the asymptotic spectrum of all tensors, while the support functionals are in the asymptotic spectrum of oblique tensors.

How "powerful" are the support functionals? We know $\tilde{Q}(t) \le \zeta^\theta(t) \le \tilde{R}(t)$ for oblique $t$. Thus $\max_\theta \zeta^\theta(t) \le \tilde{R}(t)$. In fact, $\max_\theta \zeta^\theta(t)$ is at most the maximum over the gauge points $\max_S \zeta_{(S)}(t)$, and in turn $\max_S \zeta_{(S)}(t)$ is at most $\tilde{R}(t)$. As remarked earlier, it is not known whether $\max_S \zeta_{(S)}(t)$ equals $\tilde{R}(t)$ in general.

On the other hand, we have $\tilde{Q}(t) \le \min_\theta \zeta^\theta(t)$. Do we attain equality here, that is, do we have $\tilde{Q}(t) = \min_\theta \zeta^\theta(t)$ in general? The answer is "yes" for the subsemiring of tight 3-tensors. In this chapter we study tight $k$-tensors.

Tight tensors

Let $I_1, \ldots, I_k$ be finite sets. Let $\Phi \subseteq I_1 \times \cdots \times I_k$. We say $\Phi$ is tight if there are injective maps $u_i : I_i \to \mathbb{Z}$ for $i \in [k]$ such that
\[
\forall \alpha \in \Phi \quad u_1(\alpha_1) + \cdots + u_k(\alpha_k) = 0.
\]
We say $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ is tight if there is a $g \in G(t) = \mathrm{GL}_{n_1} \times \cdots \times \mathrm{GL}_{n_k}$ such that the support $\mathrm{supp}(g \cdot t)$ is tight.

Recall that a tensor is oblique if the support is an antichain in some basis. Clearly, tight tensors are oblique. To summarise the families of tensors that we have defined up to now, we have
\[
\{\text{tight}\} \subseteq \{\text{oblique}\} \subseteq \{\text{robust}\} \subseteq \{\theta\text{-robust}\}.
\]
Recall that the families of oblique, robust and $\theta$-robust tensors each form a semiring under $\otimes$ and $\oplus$. Tight tensors have the same property [Str91, Section 5]. Another property is that any subset of a tight set is tight.

Example 5.1. Let $k \ge 3$ be fixed. For any integer $n \ge 1$ and $c \in [n]$ the set
\[
\Phi_n(c) = \{\alpha \in \{0, \ldots, n-1\}^k : \alpha_1 + \cdots + \alpha_k = c\}
\]
is tight. For any integer $n \ge 2$ and any $c \in [n]$ the set
\[
\Psi_n(c) = \{\alpha \in \{0, \ldots, n-1\}^k : \alpha_1 + \cdots + \alpha_k = c \bmod n\}
\]
is not tight (cf. Exercise 15.20 in [BCS97]).
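Tightness of $\Phi_n(c)$ can be witnessed by explicit maps, for instance $u_i(x) = x$ for $i < k$ and $u_k(x) = x - c$. The sketch below checks such a witness on a small instance; the helper names `is_tight_witness` and `phi_n_c`, and this particular choice of maps, are assumptions made for illustration.

```python
from itertools import product

def is_tight_witness(phi, maps):
    """Check that maps (one dict I_i -> Z per coordinate) are injective
    and that u_1(a_1) + ... + u_k(a_k) = 0 for every a in phi."""
    injective = all(len(set(u.values())) == len(u) for u in maps)
    zero_sum = all(sum(u[x] for u, x in zip(maps, alpha)) == 0 for alpha in phi)
    return injective and zero_sum

def phi_n_c(n, k, c):
    """The set Phi_n(c) of Example 5.1."""
    return [a for a in product(range(n), repeat=k) if sum(a) == c]

n, k, c = 4, 3, 2
identity = {x: x for x in range(n)}
shifted = {x: x - c for x in range(n)}   # absorbs the constant c in the last coordinate
maps = [identity] * (k - 1) + [shifted]
print(is_tight_witness(phi_n_c(n, k, c), maps))
```

A failed check for one particular choice of maps does not show that a set is not tight; it only rules out that witness.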

Example 5.2. When $\mathbb{F}$ contains a primitive $n$th root of unity $\zeta$, the tensor
\[
t_n = \sum_{\alpha \in \Psi_n(n-1)} e_{\alpha_1} \otimes \cdots \otimes e_{\alpha_k} \in (\mathbb{F}^n)^{\otimes k},
\]
which has support $\Psi_n(n-1)$, is tight. Namely, the elements $v_j = \sum_{i=1}^n \zeta^{ij} e_i$ for $j \in [n]$ form a basis of $\mathbb{F}^n$. Let $g \in G(t_n)$ be the corresponding basis transformation. Then we have $t_n = \sum_{j=1}^n v_j \otimes \cdots \otimes v_j$ and we see that the support $\mathrm{supp}(g \cdot t_n) = \{\alpha \in [n]^k : \alpha_1 = \cdots = \alpha_k\}$ is tight. (See also [BCS97, Exercise 15.25].) When the characteristic of $\mathbb{F}$ equals $n$, the tensor $t_n$ is also tight, as we will see in Section 5.4.2.

Combinatorial subrank and the Coppersmith–Winograd method

We care about tight tensors because of a remarkable theorem for tight 3-tensors of Strassen (Theorem 5.3 below). To understand the theorem we need the concept of combinatorial asymptotic subrank (cf. [Str91, Section 5]). We say $D \subseteq I_1 \times \cdots \times I_k$ is a diagonal when any two distinct $\alpha, \beta \in D$ are distinct in all $k$ coordinates. In other words, for elements in $D$ the value at one coordinate uniquely determines the value at the other $k - 1$ coordinates. Let $\Phi \subseteq I_1 \times \cdots \times I_k$. We say a diagonal $D \subseteq I_1 \times \cdots \times I_k$ is free for $\Phi$, or simply $D \subseteq \Phi$ is a free diagonal, if $D = \Phi \cap (D_1 \times \cdots \times D_k)$, where $D_i = \{x_i : (x_1, \ldots, x_k) \in D\}$. Define the (combinatorial) subrank $Q(\Phi)$ as the size of the largest free diagonal $D \subseteq \Phi$. For $\Phi \subseteq I_1 \times \cdots \times I_k$ and $\Psi \subseteq J_1 \times \cdots \times J_k$ we naturally define the product $\Phi \times \Psi \subseteq (I_1 \times J_1) \times \cdots \times (I_k \times J_k)$ by
\[
\Phi \times \Psi = \{((\alpha_1, \beta_1), \ldots, (\alpha_k, \beta_k)) : \alpha \in \Phi,\ \beta \in \Psi\}.
\]
Define the (combinatorial) asymptotic subrank $\tilde{Q}(\Phi) = \lim_{n\to\infty} Q(\Phi^{\times n})^{1/n}$. Let $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ and let $\Phi$ be the support of $t$ in the standard basis. Then $Q(\Phi) \le Q(t)$ and $\tilde{Q}(\Phi) \le \tilde{Q}(t)$. The number $Q(\Phi)$ may be interpreted as the largest number $n$ such that $\langle n \rangle$ can be obtained from $t$ using a restriction that consists of matrices that have at most one nonzero entry in each row and in each column. (This is called M-restriction in [Str87, Section 6], which stands for monomial restriction.) We may also interpret $\Phi$ as a $k$-partite hypergraph. Then $Q(\Phi)$ is the size of the largest induced $k$-partite matching in $\Phi$.

Let $\Phi \subseteq [n_1] \times \cdots \times [n_k]$ and let $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ be any tensor with support equal to $\Phi$. Then the (asymptotic) subranks of $\Phi$ and $t$ are related as follows:
\[
Q(\Phi) \le Q(t) \quad\text{and}\quad \tilde{Q}(\Phi) \le \tilde{Q}(t).
\]
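For very small supports the combinatorial subrank $Q(\Phi)$ can be computed directly from the definition of a free diagonal. The following brute-force sketch uses hypothetical function names and is exponential in $|\Phi|$, so it is only meant to make the definition concrete.

```python
from itertools import combinations

def is_free_diagonal(d, phi):
    """Check that d (a subset of phi) is a diagonal and that
    phi restricted to D_1 x ... x D_k equals d."""
    d = set(d)
    k = len(next(iter(phi)))
    # Diagonal: any two distinct elements differ in every coordinate.
    for a, b in combinations(d, 2):
        if any(a[i] == b[i] for i in range(k)):
            return False
    # Free: no other element of phi lies in the product of the projections.
    proj = [{a[i] for a in d} for i in range(k)]
    restricted = {a for a in phi if all(a[i] in proj[i] for i in range(k))}
    return restricted == d

def subrank(phi):
    """Size of the largest free diagonal contained in phi."""
    phi = set(phi)
    for size in range(len(phi), 0, -1):
        for d in combinations(phi, size):
            if is_free_diagonal(d, phi):
                return size
    return 0

print(subrank({(0, 0, 0), (1, 1, 1)}))             # a diagonal set is its own free diagonal
print(subrank({(0, 0, 0), (1, 1, 1), (0, 1, 1)}))  # the extra element destroys freeness
```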

Strassen proved the following theorem using the method of Coppersmith and Winograd [CW90]. Recall that for $\Phi \subseteq I_1 \times I_2 \times I_3$ we let $\mathcal{P}(\Phi)$ be the set of probability distributions on $\Phi$. For $P \in \mathcal{P}(\Phi)$ let $P_1, P_2, P_3$ be the marginal distributions of $P$ on the 3 components of $I_1 \times I_2 \times I_3$.

Theorem 5.3 ([Str91, Lemma 5.1]). Let $\Phi \subseteq I_1 \times I_2 \times I_3$ be tight. Then
\[
\tilde{Q}(\Phi) = \max_{P \in \mathcal{P}(\Phi)} \; \min_{i \in [3]} 2^{H(P_i)}. \tag{5.1}
\]

The consequence of Theorem 5.3 is that the support functionals are sufficiently powerful to compute the asymptotic subrank of tight 3-tensors.
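Evaluating $\min_i H(P_i)$ at any single distribution $P$ gives a lower bound on the exponent of the maximum in (5.1). The sketch below does this for the uniform distribution on the tight set $\{(1,0,0), (0,1,0), (0,0,1)\}$; the function name and the example set are assumptions made for illustration, and no optimisation over $P$ is attempted.

```python
from collections import Counter
from math import log2

def min_marginal_entropy(p):
    """p maps elements of Phi (tuples) to probabilities; returns min_i H(P_i) in bits."""
    k = len(next(iter(p)))
    entropies = []
    for i in range(k):
        marginal = Counter()
        for a, w in p.items():
            marginal[a[i]] += w          # accumulate the i-th marginal distribution
        entropies.append(-sum(w * log2(w) for w in marginal.values() if w > 0))
    return min(entropies)

support = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
uniform = {a: 1 / 3 for a in support}
# Each marginal is (2/3, 1/3), so the value is the binary entropy h(1/3).
print(min_marginal_entropy(uniform))
```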

Corollary 5.4 ([Str91, Proposition 5.4]). Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$ be tight. Then
\[
\tilde{Q}(t) = \min_{\theta \in \mathcal{P}([3])} \zeta^\theta(t).
\]
Moreover, if $\Phi = \mathrm{supp}(g \cdot t)$ is tight for some $g \in G(t)$, then $\tilde{Q}(t) = \tilde{Q}(\Phi)$.

Remark 5.5. Strassen conjectured in [Str94, Conjecture 5.3] that for the family of tight 3-tensors the support functionals give all spectral points in the asymptotic spectrum $X(\text{tight 3-tensors})$. In [Str91] numerous examples are given of subfamilies of tight 3-tensors for which this is the case.

Remark 5.6. Equation (5.1) becomes false when we let $\Phi \subseteq I_1 \times \cdots \times I_k$ with $k \ge 4$ and we let the right-hand side of the equation be $\max_{P \in \mathcal{P}(\Phi)} \min_i 2^{H(P_i)}$; see [CVZ16 Example 1138].

New results in this chapter

This chapter is an investigation of tight tensors, combinatorial asymptotic subrank, and applications. More precisely, this chapter contains the following new results.

Higher-order Coppersmith–Winograd method. In Section 5.2 we extend Theorem 5.3 to obtain a lower bound for $\tilde{Q}(\Phi)$ for tight sets $\Phi \subseteq I_1 \times \cdots \times I_k$ with $k \ge 4$. Our lower bound is not known to be optimal in general. We compute examples for which the lower bound is optimal.

Combinatorial degeneration method. In Section 5.3 we further extend the range of application of the Coppersmith–Winograd method via a partial order $\trianglelefteq$ on supports of tensors called combinatorial degeneration. We prove that if $\Phi \trianglelefteq \Psi$, then $\tilde{Q}(\Phi) \le \tilde{Q}(\Psi)$. Suppose $\Psi$ is not tight but $\Phi$ is tight; then we may apply the (higher-order) Coppersmith–Winograd method to obtain a lower bound on $\tilde{Q}(\Phi)$ and thus on $\tilde{Q}(\Psi)$.

Cap sets. In Section 5.4 we relate the theory of asymptotic spectra, the Coppersmith–Winograd method and the combinatorial degeneration method to the problem of upper bounding the maximum size of cap sets in $\mathbb{F}_p^n$.

Graph tensors. Graph tensors are generalisations of the matrix multiplication tensor $\langle 2, 2, 2 \rangle$ parametrised by graphs. In Section 5.5 we discuss how one can apply the higher-order Coppersmith–Winograd method to obtain upper bounds on the asymptotic rank of complete graph tensors. We also briefly discuss the surgery method, which gives good upper bounds on the asymptotic rank of graph tensors for sparse graphs like cycle graphs.

5.2 Higher-order CW method

In this section we extend Theorem 5.3 to tight $\Phi \subseteq I_1 \times \cdots \times I_k$ with $k \ge 4$. We introduce some notation. Let $\mathcal{P}(\Phi)$ be the set of probability distributions on $\Phi$. For $P \in \mathcal{P}(\Phi)$ let $P_1, \ldots, P_k$ be the marginal distributions of $P$ on the $k$ components of $I_1 \times \cdots \times I_k$. Let $\mathcal{R}(\Phi)$ be the set of all subsets $R \subseteq \Phi^2$ such that $R \not\subseteq \{(x, x) : x \in \Phi\}$ and $R \subseteq \{(x, y) \in \Phi^2 : x_i = y_i\}$ for some $i \in [k]$. For $P \in \mathcal{P}(\Phi)$ and $R \in \mathcal{R}(\Phi)$ let $\mathcal{Q}(R, (P_1, \ldots, P_k))$ be the set of probability distributions $Q$ on $R$ whose marginal distributions on the $2k$ components of $R$ satisfy $Q_i = Q_{k+i} = P_i$ for $i \in [k]$.

Let $I_1, \ldots, I_k$ be finite subsets of $\mathbb{Z}$. The result of this section is a lower bound on the asymptotic subrank of any $\Phi \subseteq I_1 \times \cdots \times I_k$ satisfying $\forall a \in \Phi$, $\sum_{i=1}^k a_i = 0$. For $R \in \mathcal{R}(\Phi)$ let $r(R)$ be the rank over $\mathbb{Q}$ of the matrix with rows $\{x - y : (x, y) \in R\}$.

Theorem 5.7. Let $\Phi \subseteq \mathbb{Z}^k$ be a finite set with $\forall a \in \Phi$, $\sum_{i=1}^k a_i = 0$. Then
\[
\log_2 \tilde{Q}(\Phi) \ge \max_{P} \; \min_{R, Q} \; H(P) - (k - 2)\, \frac{H(Q) - H(P)}{r(R)}
\]
with $P \in \mathcal{P}(\Phi)$, $R \in \mathcal{R}(\Phi)$ and $Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))$.


5.2.1 Construction

We prepare for the proof of Theorem 5.7 by discussing some basic facts.

Average-free sets

Lemma 5.8. Let $k \in \mathbb{N}$. Let $M \in \mathbb{N}$. We say a subset $B \subseteq \mathbb{Z}/M\mathbb{Z}$ is $(k-1)$-average-free if
\[
\forall x_1, \ldots, x_k \in B \quad x_1 + \cdots + x_{k-1} = (k-1)\, x_k \;\Rightarrow\; x_1 = \cdots = x_k.
\]
There is a $(k-1)$-average-free set $B \subseteq \mathbb{Z}/M\mathbb{Z}$ of size $|B| = M^{1 - o(1)}$.

Proof. There is a set $A \subseteq \{1, \ldots, \lfloor \tfrac{M-1}{k-1} \rfloor\}$ of size $|A| = M^{1 - o(1)}$ with
\[
\forall x_1, \ldots, x_k \in A \quad x_1 + \cdots + x_{k-1} = (k-1)\, x_k \;\Rightarrow\; x_1 = \cdots = x_k, \tag{5.2}
\]
see [VC15, Lemma 10]. Let $B = \{a \bmod M : a \in A\} \subseteq \mathbb{Z}/M\mathbb{Z}$. Then $|B| = |A|$. Let $x_1, \ldots, x_k \in B$ with $x_1 + \cdots + x_{k-1} = (k-1)\, x_k$. View $x_1, \ldots, x_k$ as elements in $\{1, \ldots, \lfloor \tfrac{M-1}{k-1} \rfloor\}$. Then $x_1 + \cdots + x_{k-1} = (k-1)\, x_k$ still holds. From (5.2) follows $x_1 = \cdots = x_k$ in $\mathbb{Z}$ and hence also in $\mathbb{Z}/M\mathbb{Z}$.
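The $(k-1)$-average-free property is easy to test exhaustively for small sets. The checker below is a minimal sketch with a hypothetical name; it does not construct large average-free sets as in [VC15], it only verifies the defining implication in $\mathbb{Z}/M\mathbb{Z}$.

```python
from itertools import product

def is_average_free(b, m, k):
    """Check that x_1 + ... + x_{k-1} = (k-1) x_k in Z/mZ forces x_1 = ... = x_k
    for all x_1, ..., x_k in b."""
    for xs in product(b, repeat=k):
        lhs = sum(xs[:-1]) % m
        rhs = ((k - 1) * xs[-1]) % m
        if lhs == rhs and len(set(xs)) > 1:
            return False
    return True

# For k = 3 the condition says that b contains no nontrivial 3-term progression mod m.
print(is_average_free({1, 2, 4, 5}, 11, 3))
print(is_average_free({1, 2, 3}, 11, 3))
```

Here $\{1, 2, 4, 5\}$ lies inside $\{1, \ldots, \lfloor (M-1)/(k-1) \rfloor\} = \{1, \ldots, 5\}$ for $M = 11$ and $k = 3$, matching the range used in the proof.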

Linear combinations of uniform variables

Lemma 5.9. Let $M$ be a prime. Let $u_1, \ldots, u_n$ be independently uniformly distributed over $\mathbb{Z}/M\mathbb{Z}$. Let $v_1, \ldots, v_m$ be $(\mathbb{Z}/M\mathbb{Z})$-linear combinations of $u_1, \ldots, u_n$. Then the vector $v = (v_1, \ldots, v_m)$ is uniformly distributed over the range of $v$ in $(\mathbb{Z}/M\mathbb{Z})^m$.

Proof. Let $v_i = \sum_j c_{ij} u_j$ with $c_{ij} \in \mathbb{Z}/M\mathbb{Z}$. Then $v = Cu$ with $u = (u_1, \ldots, u_n)$ and $C$ the matrix with entries $C_{ij} = c_{ij}$. Let $y$ be in the image of $C$. Then the cardinality of the preimage $C^{-1}(y)$ equals the cardinality of the kernel of $C$. Indeed, if $Cx = y$, then $C^{-1}(y) = x + \ker(C)$. Since $u$ is uniform, we conclude that $v$ is uniform on the image of $C$.
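The counting argument in this proof can be observed directly by enumerating all $u \in (\mathbb{Z}/M\mathbb{Z})^n$ for a small prime $M$. The sketch below uses a hypothetical function name and an arbitrarily chosen rank-one matrix; every point of the image should be hit equally often, namely $|\ker C|$ times.

```python
from collections import Counter
from itertools import product

def image_distribution(c, m):
    """Count how often each v = C u (mod m) occurs as u ranges over (Z/mZ)^n."""
    n = len(c[0])
    counts = Counter()
    for u in product(range(m), repeat=n):
        v = tuple(sum(cij * uj for cij, uj in zip(row, u)) % m for row in c)
        counts[v] += 1
    return counts

# The second row is twice the first and the third row is zero: rank 1 over Z/5Z.
counts = image_distribution([[1, 2], [2, 4], [0, 0]], 5)
print(len(counts), set(counts.values()))
```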

Free diagonals

Lemma 5.10. Let $G$ be a graph with $n$ vertices and $m$ edges. Then $G$ has at least $n - m$ connected components.

Proof. A graph without edges has $n$ connected components. For every edge that we add to the graph, we lose at most one connected component.

Lemma 5.11. Let $I_1, \ldots, I_k$ be finite sets. Let $\Psi \subseteq I_1 \times \cdots \times I_k$. Let
\[
C = \{\{a, b\} \subseteq \Psi : a \ne b,\ \exists i \in [k] : a_i = b_i\}.
\]
Then $Q(\Psi) \ge |\Psi| - |C|$. Obviously, the statement remains true if we replace $C$ by the larger set $\{(a, b) \in \Psi^2 : a \ne b,\ \exists i \in [k] : a_i = b_i\}$.


Proof. Let $G = (\Psi, C)$ be the graph with vertex set $\Psi$ and edge set $C$. Let $\Gamma \subseteq \Psi$ contain exactly one vertex per connected component of $G$. The vertices in $\Gamma$ are pairwise not adjacent. So $\Gamma$ is a diagonal. Of course $\Gamma \subseteq \Psi \cap (\Gamma_1 \times \cdots \times \Gamma_k)$. Let $a \in \Psi \cap (\Gamma_1 \times \cdots \times \Gamma_k)$. Let $x_1, \ldots, x_k \in \Gamma$ with
\[
(x_1)_1 = a_1,\quad (x_2)_2 = a_2,\quad \ldots,\quad (x_k)_k = a_k.
\]
Then $x_1, \ldots, x_k$ are all adjacent to $a$ in $G$, i.e. they are all in the same connected component. Then $x_1 = \cdots = x_k$, since $\Gamma$ contains precisely one vertex per connected component. So $a = x_1 = \cdots = x_k$. So $a \in \Gamma$. We conclude that $\Gamma \supseteq \Psi \cap (\Gamma_1 \times \cdots \times \Gamma_k)$. Finally, $|\Gamma| \ge |\Psi| - |C|$ by Lemma 5.10.
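The construction in this proof is effective: build the graph on $\Psi$, take one representative per connected component, and the result is a free diagonal of size at least $|\Psi| - |C|$. A minimal sketch with a union-find structure follows; the function name is hypothetical and the choice of representative is arbitrary.

```python
def free_diagonal_from_components(psi):
    """Return (gamma, number_of_edges) where gamma has one element of psi per
    connected component of the graph joining elements that share a coordinate."""
    psi = sorted(set(psi))
    k = len(psi[0])
    parent = list(range(len(psi)))

    def find(i):
        # Find the component representative, with path compression.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = 0
    for i in range(len(psi)):
        for j in range(i + 1, len(psi)):
            if any(psi[i][c] == psi[j][c] for c in range(k)):
                edges += 1
                parent[find(i)] = find(j)
    gamma = {psi[r] for r in {find(i) for i in range(len(psi))}}
    return gamma, edges

psi = [(0, 0, 0), (1, 1, 1), (0, 1, 2), (2, 2, 2)]
gamma, edges = free_diagonal_from_components(psi)
print(len(gamma), len(psi) - edges)
```

In this example $(0,1,2)$ shares a coordinate with each of the other three elements, so the graph is connected and both printed numbers equal 1.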

We now give the proof of Theorem 5.7. We repeat some notation from above. Let $k \ge 3$. Let $\Phi \subseteq \mathbb{Z}^k$ be a finite set. Let $\mathcal{P}(\Phi)$ be the set of probability distributions on $\Phi$. For $P \in \mathcal{P}(\Phi)$ let $P_1, \ldots, P_k$ be the marginal distributions of $P$ on the $k$ components of $\mathbb{Z}^k$. Let $\mathcal{R}(\Phi)$ be the set of all subsets $R \subseteq \Phi^2$ such that $R \not\subseteq \{(x, x) : x \in \Phi\}$ and $R \subseteq \{(x, y) \in \Phi^2 : x_i = y_i\}$ for some $i \in [k]$. For $P \in \mathcal{P}(\Phi)$ and $R \in \mathcal{R}(\Phi)$ let $\mathcal{Q}(R, (P_1, \ldots, P_k))$ be the set of probability distributions $Q$ on $R$ whose marginal distributions on the $2k$ components of $R$ satisfy $Q_i = Q_{k+i} = P_i$ for $i \in [k]$. For $R \in \mathcal{R}(\Phi)$ let $r(R)$ be the rank over $\mathbb{Q}$ of the matrix with rows
\[
\{x - y : (x, y) \in R\}.
\]
For any prime $M$ let $r_M(R)$ be the rank over $\mathbb{Z}/M\mathbb{Z}$ of the same matrix.

Theorem (Theorem 5.7). Let $\Phi \subseteq \mathbb{Z}^k$ be a finite set with $\forall a \in \Phi$, $\sum_{i=1}^k a_i = 0$. Then
\[
\log_2 \tilde{Q}(\Phi) \ge \max_{P} \; \min_{R, Q} \; H(P) - (k - 2)\, \frac{H(Q) - H(P)}{r(R)}
\]
with $P \in \mathcal{P}(\Phi)$, $R \in \mathcal{R}(\Phi)$ and $Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))$.

Proof. Let $P$ be a rational probability distribution on $\Phi$, i.e. $\forall a \in \Phi$, $P(a) \in \mathbb{Q}$.

Choice of parameters

This proof involves a variable $N$ that we will let go to infinity, and a prime number $M$ that depends on $N$. For the sake of rigor, we first set the dependence of $M$ on $N$ and make sure that $N$ is large enough for $M$ to have good properties.

Let $n \in \mathbb{N}$ such that $P$ is an $n$-type, i.e. $\forall a \in \Phi$, $nP(a) \in \mathbb{N}$. Let $N = tn$ be a multiple of $n$. Let
\[
f(N) = \log_2 \left( 2^{|\Phi|^2} \max_{R \in \mathcal{R}(\Phi)} \binom{N + |R| - 1}{|R| - 1} \right) \in o(N). \tag{5.3}
\]
Let
\[
g(N) = |\Phi| \log_2(N + 1) \in o(N).
\]
By Lemma 4.20,
\[
2^{N H(P) - g(N)} \le \binom{N}{NP}. \tag{5.4}
\]
Let
\[
\mu(N) = \max_{R, Q} \frac{H(Q) - H(P) + (1 + g(N) + f(N))\, \tfrac{1}{N}}{r(R)} \tag{5.5}
\]
with $R \in \mathcal{R}(\Phi)$ and $Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))$. Let $M$ be a prime with
\[
\lceil 2^{\mu(N) N} \rceil \le M \le 2 \lceil 2^{\mu(N) N} \rceil. \tag{5.6}
\]
Such a prime exists by Bertrand's postulate, see e.g. [AZ14]. We can make $M$ arbitrarily large by choosing $N$ large enough. Choose $N = tn$ large enough such that
\begin{align}
& M > k - 1, \tag{5.7} \\
& \forall R \in \mathcal{R}(\Phi) \quad r_M(R) = r(R). \tag{5.8}
\end{align}
We will later let $t$, and thus $N$, go to infinity.

Restrict to marginal type classes

The set $\Phi^{\otimes N}$ is a finite subset of $(\mathbb{Z}^N)^k$. Let $a \in \Phi^{\otimes N}$. Then we have that $a_i = ((a_i)_1, \ldots, (a_i)_N) \in \mathbb{Z}^N$ for $i \in [k]$. We restrict to those $a$ for which $a_i$ is in the type class $T^N_{P_i}$ for all $i \in [k]$. Thus let
\[
\Psi = \Phi^{\otimes N} \cap (T^N_{P_1} \times \cdots \times T^N_{P_k}).
\]
We prove a lower bound on the size of $\Psi$. Let $(s_1, \ldots, s_N) \in T^N_P$. Then $s_j \in \Phi$ for $j \in [N]$ and $((s_1)_i, \ldots, (s_N)_i) \in T^N_{P_i}$ for $i \in [k]$. So
\[
\bigl( ((s_1)_1, \ldots, (s_N)_1), \ldots, ((s_1)_k, \ldots, (s_N)_k) \bigr) \in \Phi^{\otimes N} \cap (T^N_{P_1} \times \cdots \times T^N_{P_k}) = \Psi.
\]
Thus $|\Psi| \ge |T^N_P|$. By Lemma 4.19, $|T^N_P| = \binom{N}{NP}$. By Lemma 4.20, $\binom{N}{NP} \ge 2^{N H(P) - g(N)}$. Therefore
\[
|\Psi| \ge 2^{N H(P) - g(N)}. \tag{5.9}
\]


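Hashing

Let $u_1, \ldots, u_{k-1}, v_1, \ldots, v_N \in \mathbb{Z}/M\mathbb{Z}$. For $i \in [k]$ let
\[
h_i : \mathbb{Z}^N \to \mathbb{Z}/M\mathbb{Z} : \quad
x \mapsto
\begin{cases}
u_i + \sum_{j=1}^N x_j v_j & \text{for } 1 \le i \le k - 1, \\[4pt]
\frac{1}{k-1} \Bigl( u_1 + \cdots + u_{k-1} - \sum_{j=1}^N x_j v_j \Bigr) & \text{for } i = k.
\end{cases}
\]
Note that $k - 1$ is invertible in $\mathbb{Z}/M\mathbb{Z}$ by (5.7). Let $a \in \Psi$. Then $((a_1)_j, \ldots, (a_k)_j) \in \Phi$ for $j \in [N]$. So $\sum_{i=1}^k (a_i)_j = 0$ for every $j \in [N]$. Thus
\[
\sum_{i=1}^k \sum_{j=1}^N (a_i)_j v_j = \sum_{j=1}^N v_j \sum_{i=1}^k (a_i)_j = 0.
\]
Therefore
\[
h_1(a_1) + \cdots + h_{k-1}(a_{k-1}) = (k - 1)\, h_k(a_k).
\]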
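The identity $h_1(a_1) + \cdots + h_{k-1}(a_{k-1}) = (k-1)\, h_k(a_k)$ can be checked numerically on a random instance. The sketch below assumes Python 3.8 or later (for the modular inverse via `pow`); the function name, the small zero-sum set `phi` and the parameters are hypothetical choices for illustration.

```python
import random

def hashes(a, u, v, m):
    """Hash values h_1(a_1), ..., h_k(a_k) in Z/mZ.

    a : list of k integer vectors of length N
    u : list of k-1 elements of Z/mZ,  v : list of N elements of Z/mZ
    m : a prime larger than k-1, so that k-1 is invertible mod m
    """
    k = len(a)
    inv = pow(k - 1, -1, m)  # inverse of k-1 modulo m

    def dot(x):
        return sum(xj * vj for xj, vj in zip(x, v))

    h = [(u[i] + dot(a[i])) % m for i in range(k - 1)]
    h.append(inv * (sum(u) - dot(a[k - 1])) % m)
    return h

random.seed(0)
m, k, length = 7, 3, 5
phi = [(1, 0, -1), (0, 1, -1), (-1, -1, 2)]        # every element sums to zero
s = [random.choice(phi) for _ in range(length)]    # an element of the N-th power
a = [tuple(x[i] for x in s) for i in range(k)]     # a_i collects the i-th coordinates
u = [random.randrange(m) for _ in range(k - 1)]
v = [random.randrange(m) for _ in range(length)]
h = hashes(a, u, v, m)
print(sum(h[:-1]) % m == (k - 1) * h[-1] % m)
```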

Restrict to average-free set

Let $B \subseteq \mathbb{Z}/M\mathbb{Z}$ be a $(k-1)$-average-free set of size
\[
|B| \ge M^{1 - \kappa(M)} \quad\text{with } \kappa(M) \in o(1), \tag{5.10}
\]
meaning
\[
\forall x_1, \ldots, x_k \in B \quad x_1 + \cdots + x_{k-1} = (k - 1)\, x_k \;\Rightarrow\; x_1 = \cdots = x_k \tag{5.11}
\]
(Lemma 5.8). Let $\Psi' \subseteq \Psi$ be the subset
\[
\Psi' = \{a \in \Psi : \forall i \in [k] \;\; h_i(a_i) \in B\}.
\]
Let $a \in \Psi'$. Then $a \in \Psi$, so
\[
h_1(a_1) + \cdots + h_{k-1}(a_{k-1}) = (k - 1)\, h_k(a_k).
\]
Since $h_i(a_i) \in B$ for every $i \in [k]$, (5.11) implies
\[
h_1(a_1) = \cdots = h_k(a_k).
\]

Probabilistic method

Clearly $Q(\Phi^{\otimes N}) \ge Q(\Psi) \ge Q(\Psi')$. Let
\[
C' = \{(a, b) \in \Psi'^2 : a \ne b,\ \exists i \in [k] : a_i = b_i\}.
\]
Let $X = |\Psi'|$ and $Y = |C'|$. By Lemma 5.11,
\[
Q(\Psi') \ge X - Y.
\]
Let $u_1, \ldots, u_{k-1}, v_1, \ldots, v_N$ be independent uniformly random variables over the field $\mathbb{Z}/M\mathbb{Z}$. Then $X$ and $Y$ are random variables. Then
\[
Q(\Psi') \ge \mathbb{E}[X - Y] = \mathbb{E}[X] - \mathbb{E}[Y],
\]
where the expectation is over $u_1, \ldots, u_{k-1}, v_1, \ldots, v_N$. We will prove
\begin{align}
\mathbb{E}[X] &= |B|\, |\Psi|\, M^{-(k-1)}, \tag{5.12} \\
\mathbb{E}[Y] &\le |B| \max_{R, Q} 2^{N H(Q) + f(N)} M^{-(k-1) - r(R)} \tag{5.13}
\end{align}
with $f(N)$ as defined in (5.3) and $R \in \mathcal{R}(\Phi)$, $Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))$. Before proving (5.12) and (5.13) we derive the final bound.

Derivation of final bound

From (5.12) and (5.13) follows
\[
\mathbb{E}[X] - \mathbb{E}[Y] \ge |B|\, |\Psi|\, M^{-(k-1)} - |B| \max_{R, Q} 2^{N H(Q) + f(N)} M^{-(k-1) - r(R)}.
\]
We factor out $|B|$, $|\Psi|$ and $M^{-(k-1)}$:
\[
\mathbb{E}[X] - \mathbb{E}[Y] \ge |B|\, |\Psi|\, M^{-(k-1)} \Bigl( 1 - \frac{1}{|\Psi|} \max_{R, Q} 2^{N H(Q) + f(N)} M^{-r(R)} \Bigr).
\]
From our choice of $\mu(N)$ from (5.5),
\[
\mu(N) = \max_{R, Q} \frac{H(Q) - H(P) + (1 + g(N) + f(N))\, \tfrac{1}{N}}{r(R)},
\]
follows
\[
\max_{R, Q} 2^{N (H(Q) - H(P) - r(R) \mu(N)) + g(N) + f(N)} \le \tfrac{1}{2}. \tag{5.14}
\]
Apply $|B| \ge M^{1 - \kappa(M)}$ from (5.10) and $|\Psi| \ge 2^{N H(P) - g(N)}$ from (5.9) to get
\begin{align*}
\mathbb{E}[X] - \mathbb{E}[Y] &\ge M^{1 - \kappa(M)}\, 2^{N H(P) - g(N)}\, M^{-(k-1)} \cdot \Bigl( 1 - 2^{-N H(P) + g(N)} \max_{R, Q} 2^{N H(Q) + f(N)} M^{-r(R)} \Bigr) \\
&\ge M^{-(k - 2 + \kappa(M))}\, 2^{N H(P) - g(N)} \cdot \Bigl( 1 - \max_{R, Q} 2^{N H(Q) - N H(P) + g(N) + f(N)} M^{-r(R)} \Bigr).
\end{align*}
(Here we used (5.14) to see that the second factor is nonnegative.) Apply the upper bound $2^{\mu(N) N} \le M \le 2^{\mu(N) N + 2}$ from (5.6) to get
\begin{align*}
\mathbb{E}[X] - \mathbb{E}[Y] &\ge (2^{\mu(N) N + 2})^{-(k - 2 + \kappa(M))}\, 2^{N H(P) - g(N)} \cdot \Bigl( 1 - \max_{R, Q} 2^{N H(Q) - N H(P) + g(N) + f(N)} (2^{\mu(N) N})^{-r(R)} \Bigr) \\
&= 2^{N (H(P) - (k - 2 + \kappa(M)) \mu(N)) - 2(k - 2 + \kappa(M)) - g(N)} \cdot \Bigl( 1 - \max_{R, Q} 2^{N (H(Q) - H(P) - r(R) \mu(N)) + g(N) + f(N)} \Bigr).
\end{align*}
Using (5.14) we get
\begin{align*}
\mathbb{E}[X] - \mathbb{E}[Y] &\ge 2^{N (H(P) - (k - 2 + \kappa(M)) \mu(N)) - 2(k - 2 + \kappa(M)) - g(N)} \bigl( 1 - \tfrac{1}{2} \bigr) \\
&= 2^{N (H(P) - (k - 2 + \kappa(M)) \mu(N)) - 2(k - 2 + \kappa(M)) - g(N) - 1}.
\end{align*}
Then
\begin{align*}
\frac{1}{N} \log_2 Q(\Phi^{\otimes N}) &\ge \frac{1}{N} \log_2 (\mathbb{E}[X] - \mathbb{E}[Y]) \\
&\ge H(P) - (k - 2 + \kappa(M)) \max_{R, Q} \frac{H(Q) - H(P) + (1 + g(N) + f(N))\, \tfrac{1}{N}}{r(R)} - \frac{2(k - 2 + \kappa(M)) + g(N) + 1}{N}.
\end{align*}
We let $t$, and thus $N$, go to infinity and obtain
\[
\log_2 \tilde{Q}(\Phi) \ge H(P) - (k - 2) \max_{R, Q} \frac{H(Q) - H(P)}{r(R)}.
\]
This lower bound holds for any rational probability distribution $P$ on $\Phi$ and, by continuity, for any real probability distribution $P$ on $\Phi$.

It remains to prove (5.12) and (5.13). We do this in the lemmas below.

Lemma 5.12. $\mathbb{E}[X] = |B|\, |\Psi|\, M^{-(k-1)}$.

Proof. Let $a \in \Psi$. Then $h_1(a_1) + \cdots + h_{k-1}(a_{k-1}) = (k - 1)\, h_k(a_k)$. The following four statements are equivalent:

$a \in \Psi'$

$\forall i \in [k] \;\; h_i(a_i) \in B$

$\exists b \in B \;\; h_1(a_1) = \cdots = h_k(a_k) = b$

$\exists b \in B \;\; h_1(a_1) = \cdots = h_{k-1}(a_{k-1}) = b$

Therefore
\[
\mathbb{P}[a \in \Psi'] = \sum_{b \in B} \mathbb{P}[h_1(a_1) = \cdots = h_{k-1}(a_{k-1}) = b].
\]
For $b \in B$,
\[
\mathbb{P}[h_1(a_1) = \cdots = h_{k-1}(a_{k-1}) = b] = (M^{-1})^{k-1}.
\]
We conclude
\begin{align*}
\mathbb{E}[X] &= \sum_{a \in \Psi} \mathbb{P}[a \in \Psi'] \\
&= \sum_{a \in \Psi} \sum_{b \in B} \mathbb{P}[h_1(a_1) = \cdots = h_{k-1}(a_{k-1}) = b] \\
&= \sum_{a \in \Psi} \sum_{b \in B} (M^{-1})^{k-1} \\
&= |\Psi|\, |B|\, M^{-(k-1)}.
\end{align*}
This proves the lemma.

Lemma 5.13. $\mathbb{E}[Y] \le |B| \max_{R, Q} 2^{N H(Q) + f(N)} M^{-(k-1) - r(R)}$.

Proof. Let
\[
C = \{(a, a') \in \Psi^2 : a \ne a',\ \exists i \in [k] : a_i = a'_i\}.
\]
Let $(a, a') \in C$. The following statements are equivalent:
\begin{align}
& (a, a') \in C' \tag{5.15} \\
& a, a' \in \Psi' \tag{5.16} \\
& \forall i \in [k] \;\; h_i(a_i), h_i(a'_i) \in B \tag{5.17} \\
& \exists b \in B \;\; h_1(a_1) = \cdots = h_k(a_k) = h_1(a'_1) = \cdots = h_k(a'_k) = b \tag{5.18}
\end{align}
Therefore
\begin{align*}
\mathbb{E}[Y] &= \sum_{(a, a') \in C} \mathbb{P}[(a, a') \in C'] \\
&= \sum_{(a, a') \in C} \sum_{b \in B} \mathbb{P}[h_1(a_1) = \cdots = h_k(a_k) = h_1(a'_1) = \cdots = h_k(a'_k) = b].
\end{align*}


Let $(a, a') \in C$. Then $h_i(a_i)$ and $h_i(a'_i)$ are $\mathbb{Z}/M\mathbb{Z}$-linear combinations of $u_1, \ldots, u_{k-1}, v_1, \ldots, v_N$. By Lemma 5.9, the random variable
\[
\bigl( h_1(a_1), \ldots, h_k(a_k), h_1(a'_1), \ldots, h_k(a'_k) \bigr)
\]
is uniformly distributed over the image subspace $V \subseteq (\mathbb{Z}/M\mathbb{Z})^{2k}$. Let $b \in B$. Then $(b, \ldots, b) \in V$, since $u_1 = \cdots = u_{k-1} = b$, $v_1, \ldots, v_N = 0$ is a valid assignment. Therefore
\[
\mathbb{P}[h_1(a_1) = \cdots = h_k(a_k) = h_1(a'_1) = \cdots = h_k(a'_k) = b] = |V|^{-1}.
\]
And $|V|$ equals $M$ to the power the rank of the matrix
\[
\begin{pmatrix}
1 & 0 & \cdots & 0 & \frac{1}{k-1} & 1 & 0 & \cdots & 0 & \frac{1}{k-1} \\
0 & 1 & & 0 & \frac{1}{k-1} & 0 & 1 & & 0 & \frac{1}{k-1} \\
\vdots & & \ddots & & \vdots & \vdots & & \ddots & & \vdots \\
0 & 0 & \cdots & 1 & \frac{1}{k-1} & 0 & 0 & \cdots & 1 & \frac{1}{k-1} \\
a_1 & a_2 & \cdots & a_{k-1} & -\frac{a_k}{k-1} & a'_1 & a'_2 & \cdots & a'_{k-1} & -\frac{a'_k}{k-1}
\end{pmatrix} \tag{5.19}
\]
over $\mathbb{Z}/M\mathbb{Z}$, with $a_1, \ldots, a_k, a'_1, \ldots, a'_k$ thought of as column vectors in $(\mathbb{Z}/M\mathbb{Z})^N$. With column operations we transform (5.19) into
\[
\begin{pmatrix}
0 & 0 & \cdots & 0 & 0 & 1 & 0 & \cdots & 0 & 0 \\
0 & 0 & \cdots & 0 & 0 & 0 & 1 & & 0 & 0 \\
\vdots & & & & \vdots & \vdots & & \ddots & & \vdots \\
0 & 0 & \cdots & 0 & 0 & 0 & 0 & & 1 & 0 \\
a_1 - a'_1 & a_2 - a'_2 & \cdots & a_{k-1} - a'_{k-1} & a_k - a'_k & a'_1 & a'_2 & \cdots & a'_{k-1} & 0
\end{pmatrix}. \tag{5.20}
\]
Matrix (5.20) has rank equal to $k - 1$ plus $r_M(a, a') = \mathrm{rk}(A(a, a'))$, where
\[
A(a, a') = \begin{pmatrix} a_1 - a'_1 & a_2 - a'_2 & \cdots & a_k - a'_k \end{pmatrix}.
\]
We obtain
\[
\mathbb{E}[Y] \le \sum_{(a, a') \in C} \sum_{b \in B} M^{-(k - 1 + r_M(a, a'))}.
\]
Since the summands are independent of $b$, we get
\[
\mathbb{E}[Y] \le |B| \sum_{(a, a') \in C} M^{-(k - 1 + r_M(a, a'))}.
\]

Let $(a, a') \in C$. Consider the rows of $A(a, a')$. The $N$ rows are of the form $x_i - y_i$ with $(x_i, y_i) \in \Phi^2$. Let $s = ((x_1, y_1), \ldots, (x_N, y_N))$. Let $R = \{(x_1, y_1), \ldots, (x_N, y_N)\}$. We have $r_M(a, a') = r_M(R)$ and $r_M(R) = r(R)$ by (5.8). Let $Q$ be the $N$-type with $\mathrm{supp}(Q) = R$ and $s \in T^N_Q$. From $a \ne a'$ follows $R \not\subseteq \{(x, x) : x \in \Phi\}$. From $\exists i \in [k] : a_i = a'_i$ follows $\exists i \in [k] : R \subseteq \{(x, y) : x_i = y_i\}$. From $a, a' \in T^N_{P_1} \times \cdots \times T^N_{P_k}$ follows $Q_i = Q_{k+i} = P_i$ for all $i \in [k]$. We thus have
\[
\mathbb{E}[Y] \le |B| \sum_{R \in \mathcal{R}(\Phi)} \;\; \sum_{\substack{Q \in \mathcal{Q}(R, (P_1, \ldots, P_k)) \\ \mathrm{supp}(Q) = R \\ Q \text{ is } N\text{-type}}} \;\; \sum_{s \in T^N_Q} M^{-(k - 1 + r(R))}.
\]
The number of $N$-types $Q$ with $\mathrm{supp}(Q) = R$ is at most the number of $N$-types on $R$, which is at most $\binom{N + |R| - 1}{|R| - 1}$ (Lemma 4.19). For any $Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))$, $|T^N_Q| \le 2^{N H(Q)}$ (Lemma 4.20). Therefore
\[
\mathbb{E}[Y] \le |B| \sum_{R \in \mathcal{R}(\Phi)} \binom{N + |R| - 1}{|R| - 1} \max_{Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))} 2^{N H(Q)} M^{-(k - 1 + r(R))}.
\]
Also $|\mathcal{R}(\Phi)| \le 2^{|\Phi|^2}$. Therefore
\[
\mathbb{E}[Y] \le |B|\, 2^{|\Phi|^2} \max_{R \in \mathcal{R}(\Phi)} \binom{N + |R| - 1}{|R| - 1} \max_{Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))} 2^{N H(Q)} M^{-(k - 1 + r(R))}.
\]
We conclude that
\[
\mathbb{E}[Y] \le |B| \max_{R, Q} 2^{N H(Q) + f(N)} M^{-(k-1) - r(R)}.
\]
This proves the lemma.

5.2.2 Computational remarks

The following two lemmas are helpful when applying Theorem 5.7. We leave the proof to the reader.

Lemma 5.14. Let $P \in \mathcal{P}(\Phi)$. Let $R, R' \in \mathcal{R}(\Phi)$ with $R \subseteq R'$ and $r(R) = r(R')$. Then
\[
\max_{Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))} \frac{H(Q) - H(P)}{r(R)} \le \max_{Q \in \mathcal{Q}(R', (P_1, \ldots, P_k))} \frac{H(Q) - H(P)}{r(R')}.
\]

Lemma 5.15. Let $R \in \mathcal{R}(\Phi)$. There is an equivalence relation $R' \in \mathcal{R}(\Phi)$ with $R \subseteq R'$ and $r(R) = r(R')$.


5.2.3 Examples: type sets

We discuss some examples. The first example we will use to get good upper bounds on the asymptotic rank of complete graph tensors in Section 5.5. We focus on one family of examples that is parametrised by partitions. Let $\lambda \vdash k$ be an integer partition of $k$ with $d$ parts. Let
\[
\Phi_\lambda = \{a \in \{0, 1, \ldots, d - 1\}^k : \mathrm{type}(a) = \lambda\}.
\]
The set $\Phi_\lambda$ is tight.

Theorem 5.16. $\log_2 \tilde{Q}(\Phi_{(2,2)}) = 1$.

Proof. Let $\Phi = \Phi_{(2,2)}$. Clearly $\tilde{Q}(\Phi) \le 2$. After relabelling, $\forall a \in \Phi$, $\sum_{i=1}^k a_i = 0$. We may thus apply Theorem 5.7. Let $P$ be the uniform probability distribution on $\Phi$. Then $H(P) = \log_2 6$.

Let $R \in \mathcal{R}(\Phi)$. We may assume that
\[
R \subseteq \{(1,1,0,0), (1,0,1,0), (1,0,0,1)\}^2 \cup \{(0,0,1,1), (0,1,0,1), (0,1,1,0)\}^2.
\]
We may assume $R$ is an equivalence relation (Lemma 5.15). Let $(x, y) \in R$. Let $R' = R \cup \{((1,1,1,1) - x, (1,1,1,1) - y)\}$. Then $R \subseteq R'$ and $R' \in \mathcal{R}(\Phi)$ and $r(R) = r(R')$. We may thus assume that if $(x, y) \in R$, then also $((1,1,1,1) - x, (1,1,1,1) - y) \in R$ (Lemma 5.14).

Let $S = \{(1,1,0,0), (1,0,1,0), (1,0,0,1)\}$. By the above observation it suffices to consider equivalence relations on $S$. There are three types of such equivalence relations.

Type (3): all three elements of $S$ are equivalent. Then $|R| = 18$ and $r(R) = 2$.

Type (2, 1): two elements of $S$ are equivalent, and inequivalent to the third element (which is equivalent to itself). Then $|R| = 10$ and $r(R) = 1$.

Type (1, 1, 1): all elements of $S$ are inequivalent. Then $R \subseteq \{(x, x) : x \in \Phi\}$, which is a contradiction.

For type (3) and (2, 1) the uniform probability distribution $Q$ on $R$ has marginals $Q_i = Q_{4+i} = P_i$ for $i \in [4]$. The uniform $Q$ is optimal. Then $H(Q) = \log_2 |R|$. Let $R_{(3)}$ and $R_{(2,1)}$ be equivalence relations of type (3) and (2, 1). Then
\begin{align*}
\log_2 \tilde{Q}(\Phi) &\ge \min \Bigl\{ H(P) - \frac{2}{r(R_{(3)})} \bigl( \log_2 |R_{(3)}| - H(P) \bigr),\; H(P) - \frac{2}{r(R_{(2,1)})} \bigl( \log_2 |R_{(2,1)}| - H(P) \bigr) \Bigr\} \\
&= \min \Bigl\{ \log_2 6 - \tfrac{2}{2} (\log_2 18 - \log_2 6),\; \log_2 6 - \tfrac{2}{1} (\log_2 10 - \log_2 6) \Bigr\} \\
&= \min \bigl\{ 1,\; \log_2 \tfrac{54}{25} \bigr\} = 1.
\end{align*}
This proves the theorem.
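The numbers used in this proof ($|R|$, $r(R)$ and the two candidate values of the bound) can be recomputed mechanically. The sketch below is only a check of the arithmetic under the uniform choices of $P$ and $Q$ made in the proof; the function names are hypothetical, and the rank over $\mathbb{Q}$ is computed with exact fractions.

```python
from fractions import Fraction
from itertools import product
from math import log2

def rank(rows):
    """Rank over Q by Gaussian elimination with exact fractions."""
    rows = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][col] != 0:
                f = rows[i][col] / rows[rk][col]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[rk])]
        rk += 1
    return rk

def relation(classes):
    """All pairs (x, y) within a class, together with the complemented pairs."""
    def comp(x):
        return tuple(1 - c for c in x)
    pairs = {(x, y) for cl in classes for x, y in product(cl, repeat=2)}
    return pairs | {(comp(x), comp(y)) for x, y in pairs}

def bound(r, h_p=log2(6), k=4):
    """H(P) - (k-2) (log2|R| - H(P)) / r(R), with Q uniform on R."""
    diffs = [[a - b for a, b in zip(x, y)] for x, y in r]
    return h_p - (k - 2) * (log2(len(r)) - h_p) / rank(diffs)

s = [(1, 1, 0, 0), (1, 0, 1, 0), (1, 0, 0, 1)]
r3 = relation([s])               # type (3)
r21 = relation([s[:2], s[2:]])   # type (2, 1)
print(len(r3), len(r21), bound(r3), bound(r21))
```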


Theorem 5.17. $\log_2 \tilde{Q}(\Phi_{(0^{k-1}1)}) = h(1/k)$.

Proof. We refer to [CVZ16].

With Srinivasan Arunachalam and Péter Vrana we have the following unpublished result.

Theorem 5.18. $\log_2 \tilde{Q}(\Phi_{(0^{k/2}1^{k/2})}) = 1$.

5.3 Combinatorial degeneration method

In this section we extend the (higher-order) Coppersmith–Winograd method via a preorder called combinatorial degeneration. Suppose $\Psi \subseteq I_1 \times \cdots \times I_k$ is not tight, but has a tight subset $\Phi \subseteq \Psi$. In the rest of this section we focus on obtaining a lower bound on $\tilde{Q}(\Psi)$ via $\Phi$. This has an application in the context of tri-colored sum-free sets (Section 5.4.2), for example.

Definition 5.19 ([BCS97]). Let $\Phi \subseteq \Psi \subseteq I_1 \times \cdots \times I_k$. We say that $\Phi$ is a combinatorial degeneration of $\Psi$, and write $\Psi \trianglerighteq \Phi$, if there are maps $u_i : I_i \to \mathbb{Z}$ ($i \in [k]$) such that for all $\alpha \in I_1 \times \cdots \times I_k$: if $\alpha \in \Psi \setminus \Phi$, then $\sum_{i=1}^k u_i(\alpha_i) > 0$, and if $\alpha \in \Phi$, then $\sum_{i=1}^k u_i(\alpha_i) = 0$. Note that the maps $u_i$ need not be injective.

Combinatorial degeneration gets its name from the following standard proposition, see e.g. [BCS97, Proposition 15.30].

Proposition 5.20. Let $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$. Let $\Psi = \mathrm{supp}(t)$. Let $\Phi \subseteq \Psi$ such that $\Psi \trianglerighteq \Phi$. Then $t \trianglerighteq t|_\Phi$.

Proposition 5.20 brings us only slightly closer to our goal. Namely, given $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ with $\Psi = \mathrm{supp}(t)$ and given $\Phi \subseteq \Psi$ such that $\Psi \trianglerighteq \Phi$, it follows directly from Proposition 5.20 that $t \trianglerighteq t|_\Phi$ and thus $\tilde{Q}(t) \ge \tilde{Q}(t|_\Phi)$. This, however, does not give us a lower bound on the combinatorial asymptotic subrank $\tilde{Q}(\Psi)$. The following theorem does. Our theorem extends a result in [KSS16].

Theorem 5.21. Let $\Phi \subseteq \Psi \subseteq I_1 \times \cdots \times I_k$. If $\Psi \trianglerighteq \Phi$, then
\[
\tilde{Q}(\Psi) \ge \tilde{Q}(\Phi).
\]

Lemma 5.22. Let $\Phi \subseteq \Psi \subseteq I_1 \times \cdots \times I_k$. If $\Psi \trianglerighteq \Phi$, then $\tilde{Q}(\Psi) \ge Q(\Phi)$.

Proof. Pick maps $u_i : I_i \to \mathbb{Z}$ such that

$$\sum_{i=1}^k u_i(\alpha_i) = 0 \quad \text{for } \alpha \in \Phi, \qquad \sum_{i=1}^k u_i(\alpha_i) > 0 \quad \text{for } \alpha \in \Psi \setminus \Phi.$$

Let $D$ be a free diagonal in $\Phi$ with $|D| = Q(\Phi)$ and let

$$w_i = \sum_{x \in D_i} u_i(x).$$

Let $n \in \mathbb{N}$ and define

$$W_i = \Bigl\{ (x_1, \ldots, x_{n|D|}) \in I_i^{\times n|D|} : \sum_{j=1}^{n|D|} u_i(x_j) = n w_i \Bigr\}.$$

Then

$$\Psi^{\times n|D|} \cap (W_1 \times \cdots \times W_k) = \Phi^{\times n|D|} \cap (W_1 \times \cdots \times W_k).$$

The inclusion $\supseteq$ is clear. To show $\subseteq$, let $(x_1, \ldots, x_k) \in \Psi^{\times n|D|} \cap (W_1 \times \cdots \times W_k)$. Write $x_i = (x_{i,1}, x_{i,2}, \ldots, x_{i,n|D|})$ and consider the $n|D| \times k$ matrix of evaluations

$$\begin{pmatrix} u_1(x_{1,1}) & u_2(x_{2,1}) & \cdots & u_k(x_{k,1}) \\ u_1(x_{1,2}) & u_2(x_{2,2}) & \cdots & u_k(x_{k,2}) \\ \vdots & \vdots & & \vdots \\ u_1(x_{1,n|D|}) & u_2(x_{2,n|D|}) & \cdots & u_k(x_{k,n|D|}) \end{pmatrix}.$$

The sum of the $i$th column is $n w_i$ by definition of $W_i$, and $\sum_{i=1}^k n w_i = 0$ since $D \subseteq \Phi$. The row sums are nonnegative by definition of the maps $u_1, \ldots, u_k$. We conclude that the row sums are zero. Therefore $(x_1, \ldots, x_k)$ is an element of $\Phi^{\times n|D|}$.

Since $D$ is a free diagonal in $\Phi$, $D^{\times n|D|}$ is a free diagonal in $\Phi^{\times n|D|}$, and also $D^{\times n|D|} \cap (W_1 \times \cdots \times W_k)$ is a free diagonal in $\Phi^{\times n|D|} \cap (W_1 \times \cdots \times W_k)$, which in turn is equal to $\Psi^{\times n|D|} \cap (W_1 \times \cdots \times W_k)$. Therefore $D^{\times n|D|} \cap (W_1 \times \cdots \times W_k)$ is also a free diagonal in $\Psi^{\times n|D|}$, i.e.

$$Q(\Psi^{\times n|D|}) \geq |D^{\times n|D|} \cap (W_1 \times \cdots \times W_k)|.$$

In the set $D^{\times n|D|}$ consider the strings with uniform type, i.e. where all $|D|$ elements of $D$ occur exactly $n$ times. These are clearly in $W_1 \times \cdots \times W_k$, and their number is $\binom{n|D|}{n, \ldots, n}$. Therefore

$$Q(\Psi^{\times n|D|}) \geq \binom{n|D|}{n, \ldots, n} = |D|^{n|D| - o(n)},$$

which implies $\tilde{Q}(\Psi) = \lim_{n\to\infty} Q(\Psi^{\times n|D|})^{1/(n|D|)} \geq |D|$.
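The last step of the proof rests on the growth rate of the multinomial coefficient $\binom{n|D|}{n,\ldots,n}$. As a numerical illustration (a sketch only, with hypothetical function names and `d` standing in for $|D|$), one can watch the $(n|D|)$-th root of this coefficient approach $|D|$ from below:

```python
from math import exp, lgamma

def log_multinomial_uniform(n, d):
    """Natural log of the multinomial coefficient (n*d)! / (n!)^d."""
    return lgamma(n * d + 1) - d * lgamma(n + 1)

def growth_rate(n, d):
    """The (n*d)-th root of the multinomial coefficient (n*d)! / (n!)^d."""
    return exp(log_multinomial_uniform(n, d) / (n * d))

# For d = 3 the rate increases towards 3 as n grows.
for n in (10, 100, 10000):
    print(n, growth_rate(n, 3))
```

Working with `lgamma` avoids forming the very large factorials explicitly; the rate stays strictly below `d` because the multinomial coefficient is at most `d**(n*d)`.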

Proof of Theorem 5.21. We have $\tilde{Q}(\Psi) = \lim_{n\to\infty} \tilde{Q}(\Psi^{\times n})^{1/n}$. It follows from Lemma 5.22 that

$$\lim_{n\to\infty} \tilde{Q}(\Psi^{\times n})^{1/n} \geq \lim_{n\to\infty} Q(\Phi^{\times n})^{1/n}.$$

The right-hand side is $\tilde{Q}(\Phi)$.


5.4 Cap sets

A subset $A \subseteq (\mathbb{Z}/3\mathbb{Z})^n$ is called a cap set if any line in $A$ is a point, a line being a triple of points of the form $(u, u+v, u+2v)$. Until recently it was not known whether the maximal size of a cap set in $(\mathbb{Z}/3\mathbb{Z})^n$ grows like $3^{n-o(n)}$ or like $c^{n-o(n)}$ for some $c < 3$. Gijswijt and Ellenberg in [EG17], inspired by the work of Croot, Lev and Pach in [CLP17], settled this question, showing that $c \leq 3(207+33\sqrt{33})^{1/3}/8 \approx 2.755$. Tao realised in [Tao16] that the cap set question may naturally be phrased as the problem of computing the size of the largest main diagonal in powers of the "cap set tensor" $\sum_\alpha e_{\alpha_1} \otimes e_{\alpha_2} \otimes e_{\alpha_3}$, where the sum is over $\alpha_1, \alpha_2, \alpha_3 \in \mathbb{F}_3$ with $\alpha_1 + \alpha_2 + \alpha_3 = 0$. Here main diagonal refers to a subset $A$ of the basis elements such that restricting the cap set tensor to $A \times A \times A$ gives the tensor $\sum_{v \in A} v \otimes v \otimes v$. We show that the cap set tensor is in the $\mathrm{GL}_3(\mathbb{F}_3)^{\times 3}$ orbit of the "reduced polynomial multiplication tensor", which was studied in [Str91], and we show how recent results follow from this connection using Theorem 5.21.

5.4.1 Reduced polynomial multiplication

Let $t_n$ be the tensor $\sum_\alpha e_{\alpha_1} \otimes e_{\alpha_2} \otimes e_{\alpha_3}$, where the sum is over $(\alpha_1, \alpha_2, \alpha_3)$ in $\{0, 1, \ldots, n-1\}^3$ such that $\alpha_1 + \alpha_2 = \alpha_3$. We call $t_n$ the reduced polynomial multiplication tensor, since $t_n$ is essentially the structure tensor of the algebra $\mathbb{F}[x]/(x^n)$ of univariate polynomials modulo the ideal generated by $x^n$. The support of $t_n$ equals

$$\{(\alpha_1, \alpha_2, \alpha_3) \in \{0, \ldots, n-1\}^3 : \alpha_1 + \alpha_2 = \alpha_3\},$$

which via $\alpha_3 \mapsto n - 1 - \alpha_3$ we may identify with the set

$$\Phi_n = \{(\alpha_1, \alpha_2, \alpha_3) \in \{0, \ldots, n-1\}^3 : \alpha_1 + \alpha_2 + \alpha_3 = n-1\}. \tag{5.21}$$

The support $\Phi_n$ is tight (cf. Example 5.1). Strassen proves in [Str91, Theorem 6.7], using Corollary 5.4, that $\tilde{Q}(t_n) = \tilde{Q}(\Phi_n) = z(n)$, where $z(n)$ is defined as

$$z(n) = \frac{\gamma^n - 1}{\gamma - 1}\, \gamma^{-2(n-1)/3} \tag{5.22}$$

with $\gamma$ equal to the unique positive real solution of the equation $\frac{1}{\gamma - 1} - \frac{n}{\gamma^n - 1} = \frac{n-1}{3}$.

The following table contains values of $z(n)$ for small $n$. See also [Str91, Table 1].

  n    z(n) rounded    z(n) exact
  2    1.88988         $3/2^{2/3} = 2^{h(1/3)}$
  3    2.75510         $3(207 + 33\sqrt{33})^{1/3}/8$
  4    3.61072
  5    4.46158
  6    5.30973
  7    6.15620
  8    7.00155
  9    7.84612
 10    8.69012
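The values of $z(n)$ can be reproduced numerically from (5.22). The sketch below is an illustration rather than Strassen's method: it solves the defining equation for $\gamma$ by bisection, assuming (as holds for $n \geq 2$) that the left-hand side is decreasing in $\gamma$ on $(1, \infty)$ and that the solution lies in the hypothetical bracket $(1, 100)$. The function names are illustrative.

```python
def lhs(gamma, n):
    """Left-hand side 1/(gamma - 1) - n/(gamma^n - 1) of the equation for gamma."""
    return 1.0 / (gamma - 1.0) - n / (gamma**n - 1.0)

def solve_gamma(n, lo=1.0 + 1e-6, hi=100.0, iterations=200):
    """Bisection for lhs(gamma, n) = (n - 1)/3, using that lhs decreases in gamma."""
    target = (n - 1) / 3.0
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if lhs(mid, n) > target:
            lo = mid   # value still too large: the solution lies to the right
        else:
            hi = mid
    return (lo + hi) / 2.0

def z(n):
    """The quantity z(n) of (5.22)."""
    g = solve_gamma(n)
    return (g**n - 1.0) / (g - 1.0) * g ** (-2.0 * (n - 1) / 3.0)

for n in range(2, 11):
    print(n, round(z(n), 5))
```

For $n = 2$ the equation reduces to $1/(\gamma+1) = 1/3$, so $\gamma = 2$ and $z(2) = 3/2^{2/3}$, which gives a quick sanity check on the solver.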

In fact, [Str91, Theorem 6.7] says that the asymptotic spectrum of $t_n$ is completely determined by the support functionals, and that the possible values that the spectral points can take on $t_n$ form the closed interval $[z(n), n]$ (cf. Remark 2.21):

$$X(\mathbb{N}[t_n]) = \{\zeta^\theta|_{\mathbb{N}[t_n]} : \theta \in \mathcal{P}([3])\}, \qquad \{\phi(t_n) : \phi \in X(\mathbb{N}[t_n])\} = [z(n), n].$$

5.4.2 Cap sets

We turn to cap sets.

Definition 5.23. A three-term progression-free set is a set $A \subseteq (\mathbb{Z}/m\mathbb{Z})^n$ satisfying the following: for all $(x_1, x_2, x_3) \in A^{\times 3}$, there are $u, v \in (\mathbb{Z}/m\mathbb{Z})^n$ such that $(x_1, x_2, x_3) = (u, u+v, u+2v)$ if and only if $x_1 = x_2 = x_3$. Let $r_3((\mathbb{Z}/m\mathbb{Z})^n)$ be the size of the largest three-term progression-free set in $(\mathbb{Z}/m\mathbb{Z})^n$ and define the regularisation $\tilde{r}_3(\mathbb{Z}/m\mathbb{Z}) = \lim_{n\to\infty} r_3((\mathbb{Z}/m\mathbb{Z})^n)^{1/n}$.

A three-term progression-free set in $(\mathbb{Z}/3\mathbb{Z})^n$ is called a cap or cap set. We next discuss an asymmetric variation on three-term progression-free sets called tri-colored sum-free sets, which are potentially larger. They are interesting since all known upper bound techniques for the size of three-term progression-free sets turn out to be upper bounds on the size of tri-colored sum-free sets.

Definition 5.24. Let $G$ be an abelian group. Let $\Gamma \subseteq G \times G \times G$. For $i \in [3]$ we define the marginal sets $\Gamma_i = \{x \in G : \exists \alpha \in \Gamma : \alpha_i = x\}$. We say $\Gamma$ is tri-colored sum-free if the following holds: the set $\Gamma$ is a diagonal, and for any $\alpha \in \Gamma_1 \times \Gamma_2 \times \Gamma_3$, $\alpha_1 + \alpha_2 + \alpha_3 = 0$ if and only if $\alpha \in \Gamma$. (Recall that $\Gamma \subseteq I_1 \times I_2 \times I_3$ is a diagonal when any two distinct $\alpha, \beta \in \Gamma$ are distinct in all coordinates.) Let $s_3(G)$ be the size of the largest tri-colored sum-free set in $G \times G \times G$ and define the regularisation

$$\tilde{s}_3(G) = \lim_{n\to\infty} s_3(G^{\times n})^{1/n}.$$

Equivalently, $\Gamma \subseteq G \times G \times G$ is a tri-colored sum-free set if and only if $\Gamma$ is a free diagonal in $\{\alpha \in G \times G \times G : \alpha_1 + \alpha_2 + \alpha_3 = 0\}$.


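If the set $A \subseteq G = (\mathbb{Z}/m\mathbb{Z})^n$ is three-term progression-free, then the set $\Gamma = \{(a, a, -2a) : a \in A\} \subseteq G \times G \times G$ is tri-colored sum-free. Therefore we have $\tilde{r}_3(\mathbb{Z}/m\mathbb{Z}) \leq \tilde{s}_3(\mathbb{Z}/m\mathbb{Z})$.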
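The passage from a progression-free set $A$ to the set $\Gamma = \{(a, a, -2a)\}$ can be tested by brute force on tiny instances. The following sketch implements Definitions 5.23 and 5.24 literally; the function names and the sample set in $(\mathbb{Z}/3\mathbb{Z})^2$ are illustrative assumptions, and the exhaustive loops are only feasible for very small sets.

```python
from itertools import product

def is_progression_free(A, m):
    """Definition 5.23: a triple from A is a progression iff it is constant."""
    for x1, x2, x3 in product(A, repeat=3):
        # (x1, x2, x3) = (u, u+v, u+2v) for some u, v  iff  x1 - 2*x2 + x3 = 0.
        is_progression = all((a - 2 * b + c) % m == 0 for a, b, c in zip(x1, x2, x3))
        if is_progression != (x1 == x2 == x3):
            return False
    return True

def is_tricolored_sum_free(gamma, m):
    """Definition 5.24 for gamma, a set of triples of vectors over Z/mZ."""
    gamma = set(gamma)
    # Diagonal: distinct elements differ in every coordinate.
    for a, b in product(gamma, repeat=2):
        if a != b and any(a[i] == b[i] for i in range(3)):
            return False
    marginals = [{a[i] for a in gamma} for i in range(3)]
    for alpha in product(*marginals):
        sums_to_zero = all(sum(coords) % m == 0 for coords in zip(*alpha))
        if sums_to_zero != (alpha in gamma):
            return False
    return True

def gamma_from(A, m):
    """The set {(a, a, -2a) : a in A}."""
    return {(a, a, tuple((-2 * c) % m for c in a)) for a in A}

# Illustrative cap set of size 4 in (Z/3Z)^2.
m = 3
A = {(0, 0), (0, 1), (1, 0), (1, 1)}
print(is_progression_free(A, m), is_tricolored_sum_free(gamma_from(A, m), m))
```

As a negative control, the whole group $\mathbb{Z}/3\mathbb{Z}$ (vectors of length one) contains the line $(0, 1, 2)$, and both checks fail on it.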

We summarise the recent history of results on cap sets. For clarity we focus on $m = 3$; we refer the reader to the references for the general results. Edel in [Ede04] proved the lower bound $2.21739 \leq \tilde{r}_3(\mathbb{Z}/3\mathbb{Z})$. In [EG17] Ellenberg and Gijswijt proved the upper bound

$$\tilde{r}_3(\mathbb{Z}/3\mathbb{Z}) \leq 3(207 + 33\sqrt{33})^{1/3}/8 \approx 2.755.$$

Blasiak et al. [BCC+17] proved that in fact

$$\tilde{s}_3(\mathbb{Z}/3\mathbb{Z}) \leq 3(207 + 33\sqrt{33})^{1/3}/8.$$

This upper bound was shown to be an equality in [KSS16, Nor16, Peb16].

Theorem 5.25. $\tilde{s}_3(\mathbb{Z}/3\mathbb{Z}) = 3(207 + 33\sqrt{33})^{1/3}/8$.

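We reprove Theorem 5.25 by proving that $\tilde{s}_3(\mathbb{Z}/m\mathbb{Z})$ equals the asymptotic subrank $z(m)$ of $t_m$ discussed in Section 5.4.1, when $m$ is a prime power. The significance of our proof lies in the explicit connection to the framework of asymptotic spectra, and not in the obtained value, which also for prime powers $m$ was already computed in [BCC+17, KSS16, Nor16, Peb16].

Proof. We will prove $\tilde{s}_3(\mathbb{Z}/m\mathbb{Z}) = z(m)$ when $m$ is a prime power. By definition, $\tilde{s}_3(\mathbb{Z}/m\mathbb{Z})$ equals the asymptotic subrank of the set

$$\{\alpha \in \{0, \ldots, m-1\}^3 : \alpha_1 + \alpha_2 + \alpha_3 = 0 \bmod m\},$$

which via $\alpha_3 \mapsto \alpha_3 - (m-1)$ we may identify with the set

$$\Psi_m = \{\alpha \in \{0, \ldots, m-1\}^3 : \alpha_1 + \alpha_2 + \alpha_3 = m-1 \bmod m\},$$

and so $\tilde{s}_3(\mathbb{Z}/m\mathbb{Z}) = \tilde{Q}(\Psi_m)$. Let

$$\Phi_m = \{\alpha \in \{0, \ldots, m-1\}^3 : \alpha_1 + \alpha_2 + \alpha_3 = m-1\}.$$

We know $\tilde{Q}(\Phi_m) = z(m)$ (Section 5.4.1). We will show that $\tilde{Q}(\Phi_m) = \tilde{Q}(\Psi_m)$ when $m$ is a prime power. This proves the theorem.

We prove $\tilde{Q}(\Phi_m) \leq \tilde{Q}(\Psi_m)$. There is a combinatorial degeneration $\Psi_m \trianglerighteq \Phi_m$. Indeed, let $u_i : \{0, \ldots, m-1\} \to \{0, \ldots, m-1\}$ be the identity map. If $\alpha \in \Phi_m$, then $\sum_{i=1}^3 u_i(\alpha_i) = m-1$, and if $\alpha \in \Psi_m \setminus \Phi_m$, then $\sum_{i=1}^3 u_i(\alpha_i)$ equals $m-1$ plus a positive multiple of $m$. (Subtracting the constant $m-1$ from $u_3$ gives maps whose sum is $0$ on $\Phi_m$ and positive on $\Psi_m \setminus \Phi_m$, as Definition 5.19 requires.) This means Theorem 5.21 applies, and we thus obtain $\tilde{Q}(\Phi_m) \leq \tilde{Q}(\Psi_m)$. This proves the claim.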

We show $\tilde{Q}(\Psi_m) \leq \tilde{Q}(\Phi_m)$ when $m$ is a power of the prime $p$. Let $\mathbb{F} = \mathbb{F}_p$. Let $f_m \in \mathbb{F}^m \otimes \mathbb{F}^m \otimes \mathbb{F}^m$ have support $\Psi_m$ with all nonzero coefficients equal to 1. Obviously $\tilde{Q}(\Psi_m) \leq \tilde{Q}(f_m)$. To compute $\tilde{Q}(f_m)$ we show that there is a basis in which the support of $f_m$ equals the tight set $\Phi_m$. Then $\tilde{Q}(f_m) = \tilde{Q}(\Phi_m)$ (Corollary 5.4). This implies the claim. We prepare to give the basis (which is the same basis as used in [BCC+17]). First observe that the rule $x \mapsto \binom{x}{a}$ gives a well-defined map $\mathbb{Z}/m\mathbb{Z} \to \mathbb{Z}/p\mathbb{Z}$, since for $a \in \{0, 1, \ldots, m-1\}$, if $x = y \bmod m$ then $\binom{x}{a} = \binom{y}{a} \bmod p$ by Lucas' theorem. Let $(e_x)_x$ be the standard basis of $\mathbb{F}^m$. The elements $\bigl(\sum_{x \in \mathbb{Z}/m\mathbb{Z}} \binom{x}{a} e_x\bigr)_{a \in \mathbb{Z}/m\mathbb{Z}}$ form a basis of $\mathbb{F}^m$, since the matrix $\bigl(\binom{x}{a}\bigr)_{a,x}$ is upper triangular with ones on the diagonal. We will now rewrite $f_m$ in the basis $\bigl((\sum_x \binom{x}{a} e_x)_a, (\sum_y \binom{y}{b} e_y)_b, (\sum_z \binom{z}{c} e_z)_c\bigr)$. Observe that $\binom{x}{m-1}$ equals 1 if and only if $x$ equals $m-1$, and hence

$$f_m = \sum_{\substack{x,y,z \in \mathbb{Z}/m\mathbb{Z}:\\ x+y+z = m-1}} e_x \otimes e_y \otimes e_z = \sum_{x,y,z \in \mathbb{Z}/m\mathbb{Z}} \binom{x+y+z}{m-1}\, e_x \otimes e_y \otimes e_z.$$

The identity $\binom{x+y+z}{w} = \sum \binom{x}{a}\binom{y}{b}\binom{z}{c}$, with sum over $a, b, c \in \{0, 1, \ldots, m-1\}$ such that $a+b+c = w$, is true, and thus

$$\sum_{x,y,z \in \mathbb{Z}/m\mathbb{Z}} \binom{x+y+z}{m-1}\, e_x \otimes e_y \otimes e_z = \sum_{x,y,z \in \mathbb{Z}/m\mathbb{Z}} \; \sum_{\substack{a,b,c \in \{0,1,\ldots,m-1\}:\\ a+b+c = m-1}} \binom{x}{a}\binom{y}{b}\binom{z}{c}\, e_x \otimes e_y \otimes e_z. \tag{5.23}$$

We may simply rewrite (5.23) as

$$\sum_{\substack{a,b,c \in \{0,1,\ldots,m-1\}:\\ a+b+c = m-1}} \; \sum_{x \in \mathbb{Z}/m\mathbb{Z}} \binom{x}{a} e_x \otimes \sum_{y \in \mathbb{Z}/m\mathbb{Z}} \binom{y}{b} e_y \otimes \sum_{z \in \mathbb{Z}/m\mathbb{Z}} \binom{z}{c} e_z.$$

Therefore, with respect to the basis $\bigl((\sum_x \binom{x}{a} e_x)_a, (\sum_y \binom{y}{b} e_y)_b, (\sum_z \binom{z}{c} e_z)_c\bigr)$, the support of $f_m$ equals the tight set $\Phi_m$. (And even stronger, $f_m$ is isomorphic to the tensor $\mathbb{F}[x]/(x^m)$ of Section 5.4.1.)
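The key identity of this proof, namely that the indicator of $x+y+z = m-1 \bmod m$ agrees modulo $p$ with $\sum_{a+b+c=m-1} \binom{x}{a}\binom{y}{b}\binom{z}{c}$, can be confirmed by exhaustive search for small prime powers. The sketch below is a sanity check under that reading of the proof, not part of the original argument; the function names are hypothetical, and representatives of $\mathbb{Z}/m\mathbb{Z}$ are taken in $\{0, \ldots, m-1\}$.

```python
from itertools import product
from math import comb

def coefficient_standard_basis(x, y, z, m):
    """Coefficient of e_x (x) e_y (x) e_z in f_m: 1 iff x + y + z = m - 1 mod m."""
    return 1 if (x + y + z) % m == m - 1 else 0

def coefficient_binomial_expansion(x, y, z, m, p):
    """Sum of C(x,a) C(y,b) C(z,c) over a + b + c = m - 1, reduced mod p."""
    total = sum(comb(x, a) * comb(y, b) * comb(z, m - 1 - a - b)
                for a in range(m) for b in range(m - a))
    return total % p

def expansions_agree(m, p):
    """True if both coefficient formulas agree on all of (Z/mZ)^3."""
    return all(coefficient_standard_basis(x, y, z, m)
               == coefficient_binomial_expansion(x, y, z, m, p)
               for x, y, z in product(range(m), repeat=3))

for m, p in ((3, 3), (4, 2), (9, 3)):
    print(m, p, expansions_agree(m, p))
```

The prime power hypothesis matters: for $m = 6$ and $p = 2$ the two formulas disagree, for instance at $x+y+z = 7$, where $\binom{7}{5} = 21$ is odd although $7 \neq 5 \bmod 6$.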

Remark 5.26. Why did we reprove the cap set result Theorem 5.25? Our motivation, being interested in the asymptotic spectrum of tensors, was to see if the techniques in the cap set papers are stronger than the Strassen support functionals, i.e. whether they give any new spectral points. Above we have seen that the cap set result itself can be proven with the support functionals. In fact, we show in Section 4.6 that for oblique tensors the asymptotic slice-rank, which was introduced in [Tao16] to give a concise proof of [EG17], equals the minimum value over the support functionals. In Section 6.11 we show that for all complex tensors, asymptotic slice-rank equals the minimum value of the quantum functionals.


5.5 Graph tensors

In this section we briefly discuss the application that motivated us to prove Theorem 5.7 in [CVZ16], namely upper bounding the asymptotic rank of so-called graph tensors. Graph tensors are defined as follows.

Let $G = (V, E)$ be a graph (or hypergraph) with vertex set $V$ and edge set $E$. Let $n \in \mathbb{N}$. Let $(b_i)_{i \in [n]}$ be the standard basis of $\mathbb{F}^n$. We define the graph tensor $T_n(G)$ as

$$T_n(G) = \sum_{i \in [n]^E} \bigotimes_{v \in V} \Bigl( \bigotimes_{e \in E :\, v \in e} b_{i_e} \Bigr)$$

seen as a $|V|$-tensor. Given a vertex $v \in V$, let $d(v)$ denote the degree of $v$, that is, $d(v)$ equals the number of edges $e \in E$ that contain $v$. Then $T_n(G)$ is naturally in $\bigotimes_{v \in V} \mathbb{F}^{n^{d(v)}}$. We write $T(G)$ for $T_2(G)$. For example, for the complete graph on four vertices $K_4$, the graph tensor is the tensor product of the graph tensors of its six edges [pictograms of $K_4$ and its edges omitted]:

$$T(K_4) = \sum_{i \in \{0,1\}^6} (b_{i_1} \otimes b_{i_2} \otimes b_{i_5}) \otimes (b_{i_2} \otimes b_{i_3} \otimes b_{i_6}) \otimes (b_{i_3} \otimes b_{i_4} \otimes b_{i_5}) \otimes (b_{i_1} \otimes b_{i_4} \otimes b_{i_6}),$$

living in $(\mathbb{C}^8)^{\otimes 4}$. Let $K_k$ be the complete graph on $k$ vertices. The $2 \times 2$ matrix multiplication tensor $\langle 2, 2, 2 \rangle$ equals the tensor $T(K_3)$. Define the exponent $\omega(T(G)) = \log_2 \tilde{R}(T(G))$. We study the exponent per edge $\tau(T(G)) = \omega(T(G))/|E(G)|$.

Our result is an upper bound on $\tau(T(K_4))$ in terms of the combinatorial asymptotic subrank $\tilde{Q}(\Phi_{(2,2)})$, which we studied in Theorem 5.16.

Theorem 5.27. For any $q \geq 1$,

$$\tau(T(K_4)) \leq \log_q \Bigl( \frac{q+2}{\tilde{Q}(\Phi_{(2,2)})} \Bigr).$$

Proof. We apply a generalisation of the laser method. See [CVZ16].

Corollary 5.28. Let $k \geq 4$. Then $\tau(T(K_k)) \leq 0.772943$.

Proof. In the bound of Theorem 5.27 we plug in the value $\tilde{Q}(\Phi_{(2,2)}) = 2$ from Theorem 5.16. Then we optimise over $q$ to obtain the value $0.772943$. By a "covering argument" we can show that $\tau(T(K_k))$ is non-increasing when $k$ increases.

For $k \geq 4$, Corollary 5.28 improves the upper bound $\tau(T(K_k)) \leq 0.790955$ that can be derived from the well-known upper bound of Le Gall [LG14] on the exponent of matrix multiplication $\omega = \omega(T(K_3))$.
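The optimisation over $q$ in the proof of Corollary 5.28 is easy to reproduce. The sketch below assumes that $q$ ranges over the integers $q \geq 2$ (the text does not say so explicitly, but this choice reproduces the stated value) and uses $\tilde{Q}(\Phi_{(2,2)}) = 2$; the search range and the names are illustrative. The last lines recompute the comparison value from Le Gall's bound $\omega < 2.3728639$, divided by the three edges of $K_3$.

```python
from math import log

def exponent_per_edge_bound(q, subrank=2.0):
    """The bound log_q((q + 2) / subrank) of Theorem 5.27."""
    return log((q + 2) / subrank) / log(q)

# Assumption: q is an integer; the range below is an arbitrary search window.
best_q = min(range(2, 1000), key=exponent_per_edge_bound)
best_value = exponent_per_edge_bound(best_q)
print(best_q, round(best_value, 6))

# Comparison value from Le Gall's bound on omega, spread over 3 edges of K_3.
le_gall_per_edge = 2.3728639 / 3
print(round(le_gall_per_edge, 6))
```

Under these assumptions the minimum is attained at $q = 7$, where the bound reads $\log_7(9/2)$.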

A standard "flattening argument" (i.e. using the gauge points from the asymptotic spectrum) yields the lower bound $\tau(T(K_k)) \geq \frac{1}{2}\, k/(k-1)$ if $k$ is even and $\tau(T(K_k)) \geq \frac{1}{2}\, (k+1)/k$ if $k$ is odd. As a consequence, if the exponent of matrix multiplication $\omega$ equals 2, then $\tau(T(K_4)) = \tau(T(K_3)) = \frac{2}{3}$. We raise the following question: is there a $k \geq 5$ such that $\tau(T(K_k)) < \frac{2}{3}$?

Tensor surgery; cycle graphs

For graph tensors given by sparse graphs, good upper bounds on the asymptotic rank can be obtained with an entirely different method called tensor surgery, which we introduced in [CZ18]. As an illustration, let me mention the results we obtained for cycle graphs with tensor surgery. Recall $\omega = \log_2 \tilde{R}(\langle 2, 2, 2 \rangle) = \log_2 \tilde{R}(T(C_3))$. Let $\omega_k = \log_2 \tilde{R}(T(C_k))$. First observe that $\omega_k = k$ for even $k$. For odd $k$, trivially $k - 1 \leq \omega_k \leq k$. We prove the following.

Theorem 5.29. For $k, \ell$ odd, $\omega_{k+\ell-1} \leq \omega_k + \omega_\ell$.

Corollary 5.30. Let $k \geq 5$ be odd. Then $\omega_k \leq \omega_{k-2} + \omega_3$, and thus $\omega_k \leq \frac{k-1}{2}\,\omega$.

Corollary 5.31. If $\omega = 2$, then $\omega_k = k - 1$ for all odd $k$.

See [CZ18] for the proofs.

5.6 Conclusion

Tight tensors are a subfamily of the oblique tensors. For tight 3-tensors, the minimum over the support functionals equals the asymptotic subrank. This is proven via the Coppersmith–Winograd method. The construction is in fact of a very combinatorial nature. In this chapter we studied the combinatorial notion of subrank. We proved that combinatorial subrank is monotone under combinatorial degeneration. We studied the cap set problem via the support functionals. We extended the Coppersmith–Winograd method to higher-order tensors and applied this method to study graph tensors.

Chapter 6

Universal points in the asymptotic spectrum of tensors; entanglement polytopes, moment polytopes

This chapter is based on joint work with Matthias Christandl and Péter Vrana [CVZ18].

6.1 Introduction

In Chapter 4, following Strassen, we introduced the asymptotic spectrum of tensors $X(\mathcal{T}) = X(\mathcal{T}, \leqslant)$ for $\mathcal{T}$ the semiring of $k$-tensors over $\mathbb{F}$, for some fixed integer $k$ and field $\mathbb{F}$, with addition given by direct sum $\oplus$, multiplication given by tensor product $\otimes$, and preorder $\leqslant$ given by restriction (or degeneration). The asymptotic spectrum characterises the asymptotic rank $\tilde{R}$ and the asymptotic subrank $\tilde{Q}$. We have seen that the asymptotic rank plays an important role in algebraic complexity theory: the asymptotic rank of the matrix multiplication tensor $\langle 2, 2, 2 \rangle = \sum_{i,j,k \in [2]} e_{ij} \otimes e_{jk} \otimes e_{ki} \in \mathbb{F}^4 \otimes \mathbb{F}^4 \otimes \mathbb{F}^4$ characterises the exponent $\omega$ of the arithmetic complexity of multiplying two $n \times n$ matrices over $\mathbb{F}$, that is, $\tilde{R}(\langle 2, 2, 2 \rangle) = 2^\omega$. We have also seen, in Chapter 5, how one may use the asymptotic subrank to upper bound the size of combinatorial objects like, for example, cap sets in $\mathbb{F}_3^n$.

New results in this chapter

So far, the only elements we have seen in $X(\mathcal{T})$ (i.e. universal spectral points, cf. Section 2.13) are the gauge points (Section 4.3). Besides that, we have seen in Section 4.4 that the Strassen support functionals $\zeta^\theta$ are in $X(\{\text{oblique}\})$. In this chapter we introduce for the first time an explicit infinite family of universal spectral points (over the complex numbers), the quantum functionals. Our new insight is to use the moment polytope. Given a tensor $t \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$, the moment polytope $P(t)$ is a convex polytope that carries representation-theoretic information about $t$. The quantum functionals are defined as maximisations over moment polytopes.

Let me immediately put a disclaimer. The quantum functionals do not give a new lower bound on the asymptotic rank of matrix multiplication $\langle 2, 2, 2 \rangle$; namely, the quantum functionals give the same lower bound as the gauge points. Also, the quantum functionals being defined for tensors over complex numbers only, we do not expect to get new upper bounds on the size of combinatorial objects that are "like cap sets".

So what have we gained? Arguably, we have found the "right" viewpoint on how to construct universal spectral points for tensors. (In fact, after writing our paper [CVZ18] we realised that Strassen had begun a study of moment polytopes in the appendix of the German survey [Str05]. Strassen did not construct new universal spectral points, however, not in that publication at least.) If there are more universal spectral points, then our viewpoint may lead the way to finding them. Moreover, whereas no efficient algorithm is known for evaluating the support functionals, the moment polytope viewpoint may open the way to having efficient algorithms for evaluating the quantum functionals.

In Sections 6.2–6.7 we work towards the construction of the quantum functionals, and we give a proof that they are universal spectral points. In Sections 6.8–6.10 we compare the quantum functionals and the support functionals, and in Section 6.11 we relate asymptotic slice rank to the quantum functionals.

In this chapter we will focus on 3-tensors, but the theory naturally generalises to $k$-tensors.

6.2 Schur–Weyl duality

For background on representation theory we refer to [Kra84], [Ful97] and [GW09]. Let $S_n$ be the symmetric group on $n$ symbols. Let $S_n$ act on the tensor space $(\mathbb{C}^d)^{\otimes n}$ by permuting the tensor legs,

$$\pi \cdot v_1 \otimes \cdots \otimes v_n = v_{\pi^{-1}(1)} \otimes \cdots \otimes v_{\pi^{-1}(n)}, \qquad \pi \in S_n.$$

Let $\mathrm{GL}_d$ be the general linear group of $\mathbb{C}^d$. Let $\mathrm{GL}_d$ act on $(\mathbb{C}^d)^{\otimes n}$ via the diagonal embedding $\mathrm{GL}_d \to \mathrm{GL}_d^{\times n} : g \mapsto (g, \ldots, g)$,

$$g \cdot v_1 \otimes \cdots \otimes v_n = (g v_1) \otimes \cdots \otimes (g v_n), \qquad g \in \mathrm{GL}_d.$$

The actions of $S_n$ and $\mathrm{GL}_d$ commute, so we have a well-defined action of the product group $S_n \times \mathrm{GL}_d$ on $(\mathbb{C}^d)^{\otimes n}$. Schur–Weyl duality describes the decomposition of the space $(\mathbb{C}^d)^{\otimes n}$ into a direct sum of irreducible $S_n \times \mathrm{GL}_d$ representations. This decomposition is

$$(\mathbb{C}^d)^{\otimes n} \cong \bigoplus_{\lambda \vdash_d n} [\lambda] \otimes \mathbb{S}_\lambda(\mathbb{C}^d) \tag{6.1}$$

with $[\lambda]$ an irreducible $S_n$-representation of type $\lambda$ and $\mathbb{S}_\lambda(\mathbb{C}^d)$ an irreducible $\mathrm{GL}_d$-representation of type $\lambda$ when $\ell(\lambda) \leq d$ and $0$ when $\ell(\lambda) > d$. We use the notation $\lambda \vdash_d n$ for the partitions of $n$ with at most $d$ parts. Let

$$P_\lambda : (\mathbb{C}^d)^{\otimes n} \to (\mathbb{C}^d)^{\otimes n}$$

be the equivariant projector onto the isotypical component of type $\lambda$, i.e. onto the subspace of $(\mathbb{C}^d)^{\otimes n}$ isomorphic to $[\lambda] \otimes \mathbb{S}_\lambda(\mathbb{C}^d)$. The projector $P_\lambda$ is given by the action of the group algebra element

$$P_\lambda = \Bigl( \frac{\dim[\lambda]}{n!} \Bigr)^2 \sum_{T \in \mathrm{Tab}(\lambda)} c_T \in \mathbb{C}[S_n],$$

where $\mathrm{Tab}(\lambda)$ is the set of Young tableaux of shape $\lambda$ filled with $[n]$, and with $c_T$ the Young symmetrizer

$$c_T = \sum_{\sigma \in C(T)} \mathrm{sgn}(\sigma)\, \sigma \sum_{\pi \in R(T)} \pi,$$

where $C(T), R(T) \subseteq S_n$ are the subgroups of permutations inside columns and permutations inside rows, respectively. The element $P_\lambda$ is a minimal central idempotent in $\mathbb{C}[S_n]$, and $\sum_{\lambda \vdash n} P_\lambda = e$.

Back to the decomposition of $(\mathbb{C}^d)^{\otimes n}$. We need a handle on the size of the components in the direct sum decomposition (6.1). For our application it is good to think of $d$ as a constant and $n$ as a large number. The number of summands in the direct sum decomposition (6.1) is upper bounded by a polynomial in $n$,

$$|\{\lambda \vdash_d n\}| \leq (n+1)^d,$$

i.e. there are only few summands compared to the total dimension $d^n$. There are the following well-known bounds on the dimensions of the irreducible representations $[\lambda]$ and $\mathbb{S}_\lambda(\mathbb{C}^d)$ that make up the summands:

$$\frac{n!}{\prod_{\ell=1}^d (\lambda_\ell + d - \ell)!} \leq \dim[\lambda] \leq \frac{n!}{\prod_{\ell=1}^d \lambda_\ell!} \tag{6.2}$$

$$\dim \mathbb{S}_\lambda(\mathbb{C}^d) \leq (n+1)^{d(d-1)/2}. \tag{6.3}$$
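For small $n$ and $d$ the decomposition (6.1) and the bounds (6.2) and (6.3) can be verified by direct computation. The sketch below is an illustration under standard assumptions not spelled out in the text: $\dim[\lambda]$ is computed with the hook length formula and $\dim \mathbb{S}_\lambda(\mathbb{C}^d)$ with the Weyl dimension formula. All function names are hypothetical.

```python
from fractions import Fraction
from math import factorial, prod

def partitions(n, max_parts, largest=None):
    """Yield the partitions of n with at most max_parts parts as tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    if max_parts == 0:
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, max_parts - 1, first):
            yield (first,) + rest

def dim_symmetric(lam):
    """dim [lambda] via the hook length formula."""
    conj = [sum(1 for part in lam if part > j) for j in range(lam[0])] if lam else []
    hooks = prod(lam[i] - j + conj[j] - i - 1
                 for i in range(len(lam)) for j in range(lam[i]))
    return factorial(sum(lam)) // hooks

def dim_gl(lam, d):
    """dim S_lambda(C^d) via the Weyl dimension formula."""
    padded = list(lam) + [0] * (d - len(lam))
    result = Fraction(1)
    for i in range(d):
        for j in range(i + 1, d):
            result *= Fraction(padded[i] - padded[j] + j - i, j - i)
    return int(result)

def schur_weyl_checks(n, d):
    """Check the dimension count of (6.1), the summand count, (6.2) and (6.3)."""
    lams = list(partitions(n, d))
    ok = sum(dim_symmetric(l) * dim_gl(l, d) for l in lams) == d**n
    ok = ok and len(lams) <= (n + 1) ** d
    for lam in lams:
        padded = list(lam) + [0] * (d - len(lam))
        upper = factorial(n) // prod(factorial(x) for x in padded)
        lower = Fraction(factorial(n),
                         prod(factorial(padded[l] + d - 1 - l) for l in range(d)))
        ok = ok and lower <= dim_symmetric(lam) <= upper
        ok = ok and dim_gl(lam, d) <= (n + 1) ** (d * (d - 1) // 2)
    return ok

print(schur_weyl_checks(5, 3))
```

For $n = 5$, $d = 3$ the five partitions contribute $1 \cdot 21 + 4 \cdot 24 + 5 \cdot 15 + 6 \cdot 6 + 5 \cdot 3 = 243 = 3^5$.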

Let $p \in \mathbb{R}^n$ be a probability vector, i.e. $\sum_{i=1}^n p_i = 1$ and $p_i \geq 0$ for $i \in [n]$. Let $H(p)$ be the Shannon entropy of the probability vector $p$,

$$H(p) = \sum_{i=1}^n p_i \log_2 \frac{1}{p_i}.$$

For $\alpha \in [0, 1]$ let $h(\alpha) = H((\alpha, 1-\alpha))$ be the binary entropy. For a partition $\lambda = (\lambda_1, \ldots, \lambda_\ell) \vdash n$ let $\bar{\lambda} = \lambda/n = (\lambda_1/n, \ldots, \lambda_\ell/n)$ be the probability vector obtained by normalising $\lambda$.

Let $\lambda \vdash n$. For $N \in \mathbb{N}$ let $N\lambda = (N\lambda_1, N\lambda_2, \ldots, N\lambda_\ell)$ be the stretched partition. We see that, asymptotically in the stretching factor $N$, the dimension of $[N\lambda]$ behaves like a multinomial coefficient and

$$2^{NnH(\bar{\lambda}) - o(N)} \leq \dim[N\lambda] \leq 2^{NnH(\bar{\lambda})}. \tag{6.4}$$
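The estimate (6.4) can be observed numerically: the ratio $\log_2 \dim[N\lambda] / (Nn)$ stays below $H(\bar{\lambda})$ and approaches it as $N$ grows. The sketch below assumes the hook length formula for $\dim[\lambda]$ (not stated in the text) and uses the hypothetical example $\lambda = (2, 1)$; the function names are illustrative.

```python
from math import factorial, log2, prod

def dim_symmetric(lam):
    """dim [lambda] via the hook length formula (lam is a non-empty partition)."""
    conj = [sum(1 for part in lam if part > j) for j in range(lam[0])]
    hooks = prod(lam[i] - j + conj[j] - i - 1
                 for i in range(len(lam)) for j in range(lam[i]))
    return factorial(sum(lam)) // hooks

def entropy(p):
    """Shannon entropy in bits, with the convention 0 * log(1/0) = 0."""
    return sum(-x * log2(x) for x in p if x > 0)

def exponent_ratio(lam, N):
    """log2(dim [N * lam]) divided by N * n, to be compared with H(lam / n)."""
    n = sum(lam)
    stretched = tuple(N * part for part in lam)
    return log2(dim_symmetric(stretched)) / (N * n)

lam = (2, 1)
limit = entropy([part / sum(lam) for part in lam])
for N in (5, 50):
    print(N, exponent_ratio(lam, N), limit)
```

The convergence is slow because the $o(N)$ term in (6.4) is of logarithmic order, so the gap in the ratio shrinks roughly like $\log N / N$.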

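6.3 Kronecker and Littlewood–Richardson coefficients $g_{\lambda\mu\nu}$, $c^\lambda_{\mu\nu}$

Let $\mu, \nu \vdash n$. Let $S_n \to S_n \times S_n : \pi \mapsto (\pi, \pi)$ be the diagonal embedding. Consider the decomposition of the tensor product $[\mu] \otimes [\nu]$ restricted along the diagonal embedding,

$$[\mu] \otimes [\nu] \downarrow^{S_n \times S_n}_{S_n} \cong \bigoplus_{\lambda \vdash n} \mathrm{Hom}_{S_n}([\lambda], [\mu] \otimes [\nu]) \otimes [\lambda].$$

Define the Kronecker coefficient

$$g_{\lambda\mu\nu} = \dim \mathrm{Hom}_{S_n}([\lambda], [\mu] \otimes [\nu]),$$

i.e. $g_{\lambda\mu\nu}$ is the multiplicity of $[\lambda]$ in $[\mu] \otimes [\nu]$.

Let $\lambda \vdash_{a+b}$. Let $\mathrm{GL}_a \times \mathrm{GL}_b \to \mathrm{GL}_{a+b} : (A, B) \mapsto A \oplus B$ be the block-diagonal embedding. Consider the decomposition of the representation $\mathbb{S}_\lambda(\mathbb{C}^{a+b})$ restricted along the block-diagonal embedding,

$$\mathbb{S}_\lambda(\mathbb{C}^{a+b}) \downarrow^{\mathrm{GL}_{a+b}}_{\mathrm{GL}_a \times \mathrm{GL}_b} \cong \bigoplus_{\mu \vdash_a,\, \nu \vdash_b} H^\lambda_{\mu\nu} \otimes \mathbb{S}_\mu(\mathbb{C}^a) \otimes \mathbb{S}_\nu(\mathbb{C}^b)$$

with

$$H^\lambda_{\mu\nu} = \mathrm{Hom}_{\mathrm{GL}_a \times \mathrm{GL}_b}(\mathbb{S}_\mu(\mathbb{C}^a) \otimes \mathbb{S}_\nu(\mathbb{C}^b), \mathbb{S}_\lambda(\mathbb{C}^{a+b})).$$

Define the Littlewood–Richardson coefficient $c^\lambda_{\mu\nu} = \dim H^\lambda_{\mu\nu}$.

For partitions $\lambda, \lambda'$ define $\lambda + \lambda'$ elementwise. The Kronecker and the Littlewood–Richardson coefficients have the following semigroup property (see e.g. [CHM07]).

Lemma 6.1. Let $\lambda, \mu, \nu, \alpha, \beta, \gamma$ be partitions.

(i) If $g_{\lambda\mu\nu} > 0$ and $g_{\alpha\beta\gamma} > 0$, then $g_{\lambda+\alpha,\, \mu+\beta,\, \nu+\gamma} > 0$.

(ii) If $c^\lambda_{\mu\nu} > 0$ and $c^\alpha_{\beta\gamma} > 0$, then $c^{\lambda+\alpha}_{\mu+\beta,\, \nu+\gamma} > 0$.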


6.4 Entropy inequalities

The semigroup properties imply the following lemma. Of this lemma, the first statement can be found in a paper by Christandl and Mitchison [CM06], while we do not know of any source that explicitly states the second statement. For the convenience of the reader we give the proofs of both statements.

Lemma 6.2. Let $\lambda, \mu, \nu$ be partitions.

(i) If $g_{\lambda\mu\nu} > 0$, then $H(\bar{\lambda}) \leq H(\bar{\mu}) + H(\bar{\nu})$.

(ii) If $c^\lambda_{\mu\nu} > 0$, then $H(\bar{\lambda}) \leq \frac{|\mu|}{|\mu|+|\nu|} H(\bar{\mu}) + \frac{|\nu|}{|\mu|+|\nu|} H(\bar{\nu}) + h\bigl(\frac{|\mu|}{|\mu|+|\nu|}\bigr)$.

Proof. (i) Let $g_{\lambda\mu\nu} > 0$. Suppose $\lambda, \mu, \nu \vdash n$. Let $N \in \mathbb{N}$. Then Lemma 6.1 implies $g_{N\lambda, N\mu, N\nu} > 0$. This means $\mathrm{Hom}_{S_{nN}}([N\lambda], [N\mu] \otimes [N\nu]) \neq 0$, which implies $\dim[N\lambda] \leq \dim[N\mu] \dim[N\nu]$. From (6.4) we have the dimension bounds

$$2^{NnH(\bar{\lambda}) - o(N)} \leq \dim[N\lambda], \qquad \dim[N\mu] \leq 2^{NnH(\bar{\mu})}, \qquad \dim[N\nu] \leq 2^{NnH(\bar{\nu})}.$$

Thus $NnH(\bar{\lambda}) - o(N) \leq NnH(\bar{\mu}) + NnH(\bar{\nu})$. Divide by $Nn$ and let $N$ go to infinity to get $H(\bar{\lambda}) \leq H(\bar{\mu}) + H(\bar{\nu})$.

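(ii) We restrict the decomposition

$$(\mathbb{C}^{a+b})^{\otimes n} \cong \bigoplus_{\lambda \vdash_{a+b} n} [\lambda] \otimes \mathbb{S}_\lambda(\mathbb{C}^{a+b})$$

along the block-diagonal embedding to get

$$(\mathbb{C}^{a+b})^{\otimes n} \downarrow^{\mathrm{GL}_{a+b}}_{\mathrm{GL}_a \times \mathrm{GL}_b} \cong \bigoplus_{\lambda \vdash_{a+b} n} [\lambda] \otimes \mathbb{S}_\lambda(\mathbb{C}^{a+b}) \downarrow^{\mathrm{GL}_{a+b}}_{\mathrm{GL}_a \times \mathrm{GL}_b}$$

$$\cong \bigoplus_{\lambda \vdash_{a+b} n} [\lambda] \otimes \bigoplus_{\mu \vdash_a,\, \nu \vdash_b} \mathbb{C}^{c^\lambda_{\mu\nu}} \otimes \mathbb{S}_\mu(\mathbb{C}^a) \otimes \mathbb{S}_\nu(\mathbb{C}^b)$$

$$\cong \bigoplus_{\mu \vdash_a,\, \nu \vdash_b} \Bigl( \bigoplus_{\lambda \vdash_{a+b} n} [\lambda] \otimes \mathbb{C}^{c^\lambda_{\mu\nu}} \Bigr) \otimes \mathbb{S}_\mu(\mathbb{C}^a) \otimes \mathbb{S}_\nu(\mathbb{C}^b).$$

On the other hand,

$$(\mathbb{C}^{a+b})^{\otimes n} \downarrow \;\cong\; (\mathbb{C}^a \oplus \mathbb{C}^b)^{\otimes n} \downarrow \;\cong\; (\mathbb{C}^a)^{\otimes n} \oplus \bigl((\mathbb{C}^a)^{\otimes n-1} \otimes \mathbb{C}^b\bigr) \oplus \cdots \oplus (\mathbb{C}^b)^{\otimes n} \downarrow$$

$$\cong \bigoplus_{k=0}^n \mathbb{C}^{\binom{n}{k}} \otimes \bigoplus_{\mu \vdash_a k} \bigl([\mu] \otimes \mathbb{S}_\mu(\mathbb{C}^a)\bigr) \otimes \bigoplus_{\nu \vdash_b n-k} \bigl([\nu] \otimes \mathbb{S}_\nu(\mathbb{C}^b)\bigr)$$

$$\cong \bigoplus_{k=0}^n \; \bigoplus_{\mu \vdash_a k,\, \nu \vdash_b n-k} \bigl(\mathbb{C}^{\binom{n}{k}} \otimes [\mu] \otimes [\nu]\bigr) \otimes \mathbb{S}_\mu(\mathbb{C}^a) \otimes \mathbb{S}_\nu(\mathbb{C}^b).$$

Suppose $c^\lambda_{\mu\nu} > 0$. Comparing the above expressions gives the inequality $\dim[\lambda] \leq \binom{n}{|\mu|} \dim[\mu] \dim[\nu]$. By the semigroup property, Lemma 6.1, we have $c^{N\lambda}_{N\mu, N\nu} > 0$ for all $N \in \mathbb{N}$. Thus $\dim[N\lambda] \leq \binom{Nn}{N|\mu|} \dim[N\mu] \dim[N\nu]$ for all $N \in \mathbb{N}$. Then from (6.4) follows

$$2^{NnH(\bar{\lambda}) - o(N)} \leq 2^{Nn\, h(|\mu|/n)}\, 2^{N|\mu| H(\bar{\mu})}\, 2^{N|\nu| H(\bar{\nu})}.$$

We conclude $H(\bar{\lambda}) \leq h\bigl(\frac{|\mu|}{n}\bigr) + \frac{|\mu|}{n} H(\bar{\mu}) + \frac{|\nu|}{n} H(\bar{\nu})$.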

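Let $x = (x^{(1)}, x^{(2)}, x^{(3)})$ be a triple of probability vectors $x^{(i)} \in \mathbb{R}^{n_i}$. Let $\theta \in \Theta$ be a weighting. Let $H_\theta(x)$ be the $\theta$-weighted average of the Shannon entropies of the probability vectors $x^{(1)}$, $x^{(2)}$ and $x^{(3)}$,

$$H_\theta(x) = \theta(1) H(x^{(1)}) + \theta(2) H(x^{(2)}) + \theta(3) H(x^{(3)}).$$

(Note that this notation is slightly different from the notation used in Chapter 4.) We will use the notation $\lambda \vdash_3 n$ to say that $\lambda$ is a triple of partitions of $n$, i.e. $\lambda$ equals $(\lambda^{(1)}, \lambda^{(2)}, \lambda^{(3)})$ where each $\lambda^{(i)}$ is a partition of $n$. We write $\bar{\lambda}$ for the normalised triple $(\bar{\lambda}^{(1)}, \bar{\lambda}^{(2)}, \bar{\lambda}^{(3)})$.

Lemma 6.3. Let $\lambda, \mu, \nu$ be three triples of partitions.

(i) If $g_{\lambda^{(i)} \mu^{(i)} \nu^{(i)}} > 0$ for all $i$, then $2^{H_\theta(\bar{\lambda})} \leq 2^{H_\theta(\bar{\mu})}\, 2^{H_\theta(\bar{\nu})}$.

(ii) If $\mu \vdash_3 m$, $\nu \vdash_3 n - m$ and $c^{\lambda^{(i)}}_{\mu^{(i)} \nu^{(i)}} > 0$ for all $i$, then $2^{H_\theta(\bar{\lambda})} \leq 2^{H_\theta(\bar{\mu})} + 2^{H_\theta(\bar{\nu})}$.

Proof. (i) Suppose $g_{\lambda^{(i)} \mu^{(i)} \nu^{(i)}} > 0$ for all $i$. Then $H(\bar{\lambda}^{(i)}) \leq H(\bar{\mu}^{(i)}) + H(\bar{\nu}^{(i)})$ for all $i$ by Lemma 6.2. Thus $\sum_i \theta(i) H(\bar{\lambda}^{(i)}) \leq \sum_i \theta(i) H(\bar{\mu}^{(i)}) + \sum_i \theta(i) H(\bar{\nu}^{(i)})$. Then $H_\theta(\bar{\lambda}) \leq H_\theta(\bar{\mu}) + H_\theta(\bar{\nu})$. We conclude $2^{H_\theta(\bar{\lambda})} \leq 2^{H_\theta(\bar{\mu})}\, 2^{H_\theta(\bar{\nu})}$.

(ii) Suppose $c^{\lambda^{(i)}}_{\mu^{(i)} \nu^{(i)}} > 0$ for all $i$. Then $H(\bar{\lambda}^{(i)}) \leq \frac{m}{n} H(\bar{\mu}^{(i)}) + \frac{n-m}{n} H(\bar{\nu}^{(i)}) + h\bigl(\frac{m}{n}\bigr)$ by Lemma 6.2. We take the $\theta$-weighted average to get $H_\theta(\bar{\lambda}) \leq \frac{m}{n} H_\theta(\bar{\mu}) + \frac{n-m}{n} H_\theta(\bar{\nu}) + h\bigl(\frac{m}{n}\bigr)$. We conclude $2^{H_\theta(\bar{\lambda})} \leq 2^{H_\theta(\bar{\mu})} + 2^{H_\theta(\bar{\nu})}$ by Lemma 4.9(iv).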

6.5 Hilbert spaces and density operators

Endow the vector space $\mathbb{C}^n$ with a hermitian inner product (one may take the standard hermitian inner product $\langle u, v \rangle = \sum_{i=1}^n \overline{u_i}\, v_i$ for $u, v \in \mathbb{C}^n$, where $\overline{\,\cdot\,}$ denotes taking the complex conjugate), so that it is a Hilbert space.

Let $(V_1, \langle \cdot, \cdot \rangle)$ and $(V_2, \langle \cdot, \cdot \rangle)$ be Hilbert spaces. On $V_1 \oplus V_2$ we define the inner product by $\langle u_1 \oplus u_2, v_1 \oplus v_2 \rangle = \langle u_1, v_1 \rangle + \langle u_2, v_2 \rangle$. On $V_1 \otimes V_2$ we define the inner product by $\langle u_1 \otimes u_2, v_1 \otimes v_2 \rangle = \langle u_1, v_1 \rangle \langle u_2, v_2 \rangle$ and extending linearly.

Let $V$ be a Hilbert space. A positive semidefinite hermitian operator $\rho : V \to V$ with trace one is called a density operator. The sequence of eigenvalues of a density operator $\rho$ is a probability vector. Let $\mathrm{spec}(\rho) = (p_1, \ldots, p_n)$ be the sequence of eigenvalues of $\rho$, ordered non-increasingly, $p_1 \geq \cdots \geq p_n$.

Let $V_1$ and $V_2$ be Hilbert spaces. Given a density operator $\rho$ on $V_1 \otimes V_2$, the reduced density operator $\rho_1 = \mathrm{tr}_2\, \rho$ is uniquely defined by the property that $\mathrm{tr}(\rho_1 X_1) = \mathrm{tr}(\rho (X_1 \otimes \mathrm{Id}_{V_2}))$ for all operators $X_1$ on $V_1$. The operator $\rho_1$ is again a density operator. The operation $\mathrm{tr}_2$ is called the partial trace over $V_2$. Explicitly, $\rho_1$ is given by $\langle e_i, \rho_1(e_j) \rangle = \sum_\ell \langle e_i \otimes f_\ell, \rho(e_j \otimes f_\ell) \rangle$, where the $e_i$ are some basis of $V_1$ and the $f_i$ are some basis of $V_2$ (the statement is independent of basis choice).

Let $V_i$ be a Hilbert space and consider the tensor product $V_1 \otimes V_2 \otimes V_3$. Associate with $t \in V_1 \otimes V_2 \otimes V_3$ the dual element $t^* = \langle t, \cdot \rangle \in (V_1 \otimes V_2 \otimes V_3)^*$. Then

$$\rho^t = t t^* / \langle t, t \rangle = t \langle t, \cdot \rangle / \langle t, t \rangle$$

is a density operator on $V_1 \otimes V_2 \otimes V_3$. Viewing $\rho^t$ as a density operator on the regrouped space $V_1 \otimes (V_2 \otimes V_3)$, we may take the partial trace of $\rho^t$ over $V_2 \otimes V_3$ as described above. We denote the resulting density operator by $\rho^t_1 = \mathrm{tr}_{23}\, \rho^t$. We similarly define $\rho^t_2 = \mathrm{tr}_{13}\, \rho^t$ and $\rho^t_3 = \mathrm{tr}_{12}\, \rho^t$.
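For a concrete tensor, the reduced density operators $\rho^t_i$ can be computed directly from the coefficients of $t$: the entry $(i, i')$ of $\rho^t_1$ is $\sum_{j,k} t_{ijk}\, \overline{t_{i'jk}} / \langle t, t \rangle$. The following sketch uses only the standard library and represents a tensor as a dictionary from index triples to coefficients; this representation, the function names, and the restriction of the spectrum routine to $2 \times 2$ matrices are assumptions made for brevity.

```python
from itertools import product
from math import sqrt

def reduced_density_matrix(t, dims, leg):
    """Partial trace of t t^* / <t, t> over the two legs other than `leg`.

    t maps index triples (0-based, standard orthonormal basis) to coefficients.
    """
    n = dims[leg]
    norm = sum(abs(c) ** 2 for c in t.values())
    rho = [[0j] * n for _ in range(n)]
    for (idx_a, a), (idx_b, b) in product(t.items(), repeat=2):
        # Only pairs agreeing on the traced-out legs contribute.
        if idx_a[:leg] + idx_a[leg + 1:] == idx_b[:leg] + idx_b[leg + 1:]:
            rho[idx_a[leg]][idx_b[leg]] += a * b.conjugate() / norm
    return rho

def spectrum_2x2(rho):
    """Eigenvalues of a 2x2 hermitian matrix, ordered non-increasingly."""
    tr = (rho[0][0] + rho[1][1]).real
    det = (rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]).real
    disc = sqrt(max(tr * tr - 4 * det, 0.0))
    return ((tr + disc) / 2, (tr - disc) / 2)

# Illustrative example: the W tensor e1(x)e0(x)e0 + e0(x)e1(x)e0 + e0(x)e0(x)e1.
w_state = {(1, 0, 0): 1, (0, 1, 0): 1, (0, 0, 1): 1}
marginal_spectra = [spectrum_2x2(reduced_density_matrix(w_state, (2, 2, 2), leg))
                    for leg in range(3)]
print(marginal_spectra)
```

For the W tensor each reduced density operator is diagonal with entries $2/3$ and $1/3$, which makes the example easy to check by hand.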

6.6 Moment polytopes $P(t)$

We give a brief introduction to moment polytopes. We refer to [Nes84, Bri87, Fra02, Wal14] for more information. We begin with the general setting and then specialise to orbit closures in tensor spaces.

6.6.1 General setting

Let $G$ be a connected reductive algebraic group. (We refer to Kraft [Kra84] and Humphreys [Hum75] for an introduction to algebraic groups.) Fix a maximal torus $T \subseteq G$ and a Borel subgroup $T \subseteq B \subseteq G$. We have the character group $X(T)$, the Weyl group $W$, the root system $\Phi \subseteq X(T)$ and the system of positive roots $\Phi_+ \subseteq \Phi$. For $\lambda, \mu \in X(T)$ we set $\lambda \preccurlyeq \mu$ if $\mu - \lambda$ is a sum of positive roots. Let $V$ be a rational $G$-representation. The restriction of the action of $G$ to $T$ gives a decomposition

$$V = \bigoplus_{\lambda \in X(T)} V_\lambda, \qquad V_\lambda = \{v \in V : \forall t \in T,\; t \cdot v = \lambda(t) v\}.$$

This decomposition is called the weight decomposition of $V$. The $\lambda \in X(T)$ with $V_\lambda \neq 0$ are called the weights of $V$ with respect to $T$. The $V_\lambda$ are the weight spaces of $V$. For $v \in V$, let $v_\lambda$ be the component of $v$ in $V_\lambda$. Let $\mathrm{supp}(v) = \{\lambda : v_\lambda \neq 0\}$.

Let $E$ be the real vector space $E = X(T) \otimes \mathbb{R}$. The Weyl group $W$ acts on $X(T)$ and thus on $E$. We enlarge $\preccurlyeq$ to a partial order on $E$ as follows. For $x, y \in E$, let $x \preccurlyeq y$ if $y - x$ is a nonnegative linear combination of positive roots. Let $D \subseteq E$ be the positive Weyl chamber. For every $x \in E$, the orbit $W \cdot x$ intersects the positive Weyl chamber $D$ in exactly one point, which we denote by $\mathrm{dom}(x)$.

Let $V$ be a finite-dimensional rational $G$-module. Let $\chi \in X(T) \cap D$ be a dominant character. We denote the $\chi$-isotypical component of $V$ with $V_{(\chi)}$. Let $Z \subseteq V$ be a Zariski closed set. We denote the coordinate ring of $Z$ with $\mathbb{C}[Z]$. We denote the degree $d$ part of $\mathbb{C}[Z]$ with $\mathbb{C}[Z]_d$. If $Z$ is $G$-stable, then $\mathbb{C}[Z]_d$ is a $G$-module.

Definition 6.4. Let $V$ be a rational $G$-module and $Z \subseteq V$ a nontrivial irreducible closed $G$-stable cone. The moment polytope of $Z$, denoted by

$$P(Z),$$

is defined as the Euclidean closure in $E$ of the set

$$R(Z) = \{\chi/d : (\mathbb{C}[Z]_d)_{(\chi^*)} \neq 0\}$$

of normalised characters $\chi/d$ for which the $\chi^*$-isotypical component $(\mathbb{C}[Z]_d)_{(\chi^*)}$ is not zero.

Theorem 6.5 (Mumford–Ness [Nes84], Brion [Bri87], Franz [Fra02]). The moment polytope is indeed a convex polytope, and it is equal to the image of the so-called moment map intersected with the positive Weyl chamber,

$$P(Z) = \mu(Z \setminus 0) \cap D.$$

Let $Z = \overline{G \cdot v}$ be the orbit closure (in the Zariski topology) of a vector $v \in V \setminus 0$, and suppose $\overline{G \cdot v}$ is a cone.

Lemma 6.6 (See e.g. [Str05]). Suppose $\overline{G \cdot v}$ is a cone. Then

$$R(\overline{G \cdot v}) = \{\chi/d : (\mathbb{C}[\overline{G \cdot v}]_d)_{(\chi^*)} \neq 0\} = \{\chi/d : (\mathrm{lin}(G \cdot v^{\otimes d}))_{(\chi)} \neq 0\}.$$

6.6.2 Tensor spaces

We specialise to 3-tensors. Let $V = V_1 \otimes V_2 \otimes V_3$ with $V_i = \mathbb{C}^{n_i}$. Let

$$G = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}, \qquad T = T_1 \times T_2 \times T_3,$$

with $T_i$ the diagonal matrices in $\mathrm{GL}_{n_i}$. The weight decomposition of $V$ is the decomposition with respect to the standard basis elements $e_{x_1} \otimes e_{x_2} \otimes e_{x_3}$, where $x \in [n_1] \times [n_2] \times [n_3]$. The support $\mathrm{supp}(v)$ is the support of $v$ with respect to the standard basis.

In the current setting there is a beautiful rephrasing of Theorem 6.5 in terms of ordered spectra of reduced density matrices. Recall from Section 6.5 that for $v \in V \setminus 0$ we have a density matrix $\rho^v$ and reduced density matrices $\rho^v_i$, of which we may take the non-increasingly ordered spectra $\mathrm{spec}(\rho^v_i)$.

Theorem 6.7 (Walter–Doran–Gross–Christandl [WDGC13]). Let $Z \subseteq V$ be a nontrivial irreducible closed $G$-stable cone. Then

$$P(Z) = \{(\mathrm{spec}\, \rho^z_1, \mathrm{spec}\, \rho^z_2, \mathrm{spec}\, \rho^z_3) : z \in Z \setminus 0\}.$$

Let $v \in V \setminus 0$. We consider the moment polytope of the orbit closure $Z = \overline{G \cdot v}$. In this setting Lemma 6.6 specialises to the following.

Lemma 6.8 (See e.g. [Str05]).

$$R(\overline{G \cdot v}) = \{\chi/d : (\mathbb{C}[\overline{G \cdot v}]_d)_{(\chi^*)} \neq 0\} = \{\chi/d : (\mathrm{lin}(G \cdot v^{\otimes d}))_{(\chi)} \neq 0\} = \{\chi/d : P_\chi v^{\otimes d} \neq 0\},$$

where $P_\chi = P_{\chi^{(1)}} \otimes P_{\chi^{(2)}} \otimes P_{\chi^{(3)}}$ with $P_{\chi^{(i)}} : V_i^{\otimes d} \to V_i^{\otimes d}$ the projector onto the isotypical component of type $\chi^{(i)}$ discussed in Section 6.2.

On the other hand, Theorem 6.7 immediately gives a description of the moment polytope $P(\overline{G \cdot v})$ in terms of ordered spectra of reduced density matrices.

Theorem 6.9. Let $v \in V \setminus 0$. Then

$$P(\overline{G \cdot v}) = \{(\mathrm{spec}\, \rho^u_1, \mathrm{spec}\, \rho^u_2, \mathrm{spec}\, \rho^u_3) : u \in \overline{G \cdot v} \setminus 0\}.$$

Summarising, we have two descriptions of the moment polytope: a representation-theoretic or invariant-theoretic description (Lemma 6.8) and a quantum marginal spectra description (Theorem 6.9). These two descriptions are the key to proving the properties of the quantum functionals that we need.

6.7 Quantum functionals $F^\theta(t)$

We will now define the quantum functionals and prove that they are universal spectral points.

96 Chapter 6 Universal points in the asymptotic spectrum of tensors

Let $p = (p_1, \ldots, p_n) \in \mathbb{R}^n$ be a probability vector, i.e. $\sum_{i=1}^n p_i = 1$ and $p_i \geq 0$ for all $i \in [n]$. Recall that $H(p)$ denotes the Shannon entropy of the probability vector $p$, $H(p) = \sum_{i=1}^n p_i \log_2 (1/p_i)$. Let $x = (x^{(1)}, x^{(2)}, x^{(3)})$ be a triple of probability vectors $x^{(i)} \in \mathbb{R}^{n_i}$. Let $\theta \in \Theta$ be a weighting. Recall that $H_\theta(x)$ denotes the $\theta$-weighted average of the Shannon entropies of the three probability vectors $x^{(1)}, x^{(2)}, x^{(3)}$:

$$H_\theta(x) = \theta(1) H(x^{(1)}) + \theta(2) H(x^{(2)}) + \theta(3) H(x^{(3)}).$$
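As a small illustration of the two entropy definitions just recalled, the following Python sketch computes $H(p)$ and $H_\theta(x)$. The function names and the example values for $\theta$ and $x$ are ours, chosen only for illustration.

```python
import math

def shannon_entropy(p):
    """Shannon entropy H(p) in bits; entries p_i = 0 contribute 0."""
    return sum(-pi * math.log2(pi) for pi in p if pi > 0)

def weighted_entropy(theta, x):
    """H_theta(x) = sum_i theta(i) * H(x^(i)) for a triple x of probability vectors."""
    return sum(w * shannon_entropy(xi) for w, xi in zip(theta, x))

# Example values (assumed, not taken from the text).
theta = (0.5, 0.25, 0.25)
x = ([0.5, 0.5], [1.0], [0.25] * 4)
h_theta = weighted_entropy(theta, x)
```

Here the three marginal entropies are 1, 0 and 2 bits, so the weighted average is $0.5 \cdot 1 + 0.25 \cdot 0 + 0.25 \cdot 2 = 1$.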

Let $V = \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$. Let $G = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$. Let $v \in V \setminus 0$. We use the notation $P(v) = P(\overline{G \cdot v})$ for the moment polytope of the orbit closure of $v$.

Definition 6.10. For $\theta \in \Theta$ and $v \in V \setminus 0$ let

$$F^\theta(v) = \max\{2^{H_\theta(x)} : x \in P(v)\}.$$

Let $F^\theta(0) = 0$. We call the functions $F^\theta$ the quantum functionals. The name quantum functional comes from the fact that the moment polytope $P(t)$ consists of triples of quantum marginal entropies.

Theorem 6.11. Let $\mathcal{T}$ be the semiring of 3-tensors over $\mathbb{C}$. Let $\leq$ be the restriction preorder. For $\theta \in \Theta$,

$$F^\theta \in X(\mathcal{T}, \leq).$$

In other words, $F^\theta$ is a semiring homomorphism $\mathcal{T} \to \mathbb{R}_{\geq 0}$ which is monotone under restriction $\leq$. In fact, $F^\theta$ is monotone under degeneration $\trianglelefteq$.

Remark 6.12. The results in this chapter generalise to $k$-tensors over $\mathbb{C}$. In our paper [CVZ18] we discuss this general situation in detail and make a distinction between upper quantum functionals and lower quantum functionals.

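Let $p \in \mathbb{R}^n$ and $q \in \mathbb{R}^m$ be probability vectors. The tensor product $p \otimes q \in \mathbb{R}^{nm}$ defined by

$$p \otimes q = (p_i q_j : i \in [n],\, j \in [m])$$

is a probability vector. The direct sum $p \oplus q \in \mathbb{R}^{n+m}$ is defined by

$$p \oplus q = (p_1, \ldots, p_n, q_1, \ldots, q_m);$$

for $\alpha \in [0, 1]$ the convex combination $\alpha p \oplus (1 - \alpha) q$ is a probability vector.

Let $x = (x^{(1)}, x^{(2)}, x^{(3)})$ and $y = (y^{(1)}, y^{(2)}, y^{(3)})$ be triples of probability vectors. We define the tensor product $x \otimes y$ elementwise,

$$x \otimes y = (x^{(1)} \otimes y^{(1)},\, x^{(2)} \otimes y^{(2)},\, x^{(3)} \otimes y^{(3)}).$$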


We define the direct sum $x \oplus y$ elementwise,

$$x \oplus y = (x^{(1)} \oplus y^{(1)},\, x^{(2)} \oplus y^{(2)},\, x^{(3)} \oplus y^{(3)}).$$

For $x \otimes y$ and $x \oplus y$ to be in the moment polytope we will need to reorder the components non-increasingly. For a triple of probability vectors $x = (x^{(1)}, x^{(2)}, x^{(3)})$ let $\operatorname{dom}(x)$ be the triple of probability vectors obtained from $x$ by reordering the components $x^{(i)}$ such that they become non-increasing. Let $\operatorname{dom}(S) = \{\operatorname{dom}(x) : x \in S\}$.
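The operations on probability vectors defined above are easy to experiment with. The sketch below (helper names are ours) implements the tensor product, the weighted direct sum $\alpha p \oplus (1-\alpha) q$ and the reordering $\operatorname{dom}$ for single probability vectors, and evaluates the two entropy identities that the proof of Corollary 6.14 uses: additivity under $\otimes$ and the binary-entropy correction under the weighted direct sum.

```python
import math

def entropy(p):
    """Shannon entropy in bits."""
    return sum(-pi * math.log2(pi) for pi in p if pi > 0)

def tensor(p, q):
    """p (x) q = (p_i * q_j)."""
    return [pi * qj for pi in p for qj in q]

def weighted_direct_sum(alpha, p, q):
    """alpha*p (+) (1-alpha)*q, a probability vector of length len(p)+len(q)."""
    return [alpha * pi for pi in p] + [(1 - alpha) * qj for qj in q]

def dom(p):
    """Reorder the entries non-increasingly."""
    return sorted(p, reverse=True)

# Example data (assumed values).
p = [0.5, 0.3, 0.2]
q = [0.9, 0.1]
alpha = 0.4

pq = dom(tensor(p, q))
mix = dom(weighted_direct_sum(alpha, p, q))
h_alpha = entropy([alpha, 1 - alpha])   # binary entropy h(alpha)
```

Since entropy does not depend on the order of the entries, applying `dom` changes neither side of the identities.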

For $v \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$ we will use the notation $G(v) = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$ to denote the group that naturally corresponds to the space that $v$ lives in. We will use the notation $P(v) = P(\overline{G(v) \cdot v})$ for the moment polytope of the orbit closure of $v$.

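Theorem 6.13. Let $s \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$ and $t \in \mathbb{C}^{m_1} \otimes \mathbb{C}^{m_2} \otimes \mathbb{C}^{m_3}$.

(i) $\operatorname{dom}(P(s) \otimes P(t)) \subseteq P(s \otimes t)$

(ii) $\forall \alpha \in [0, 1]$: $\operatorname{dom}(\alpha P(s) \oplus (1 - \alpha) P(t)) \subseteq P(s \oplus t)$

(iii) If $s, t \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3} \setminus 0$ and $s \in \overline{G(t) \cdot t}$, then $P(s) \subseteq P(t)$.

(iv) $P(s \oplus 0) = P(s) \oplus 0$

(v) $P(\langle 1 \rangle) = \{((1), (1), (1))\}$ with $\langle 1 \rangle = e_1 \otimes e_1 \otimes e_1 \in \mathbb{C}^1 \otimes \mathbb{C}^1 \otimes \mathbb{C}^1$.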

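Proof. To prove statements (i) and (ii), let $x \in P(s)$ and $y \in P(t)$. Then there are elements $a \in \overline{G(s) \cdot s}$ and $b \in \overline{G(t) \cdot t}$ with ordered marginal spectra $x$ and $y$:

$$x = (\operatorname{spec} \rho^a_1, \operatorname{spec} \rho^a_2, \operatorname{spec} \rho^a_3),$$

$$y = (\operatorname{spec} \rho^b_1, \operatorname{spec} \rho^b_2, \operatorname{spec} \rho^b_3).$$

We prove statement (i). We have $a \otimes b \in \overline{G(s \otimes t) \cdot s \otimes t}$. Thus

$$\operatorname{dom}(x \otimes y) = (\operatorname{spec} \rho^{a \otimes b}_1, \operatorname{spec} \rho^{a \otimes b}_2, \operatorname{spec} \rho^{a \otimes b}_3) \in P(s \otimes t).$$

We conclude $\operatorname{dom}(P(s) \otimes P(t)) \subseteq P(s \otimes t)$. We prove statement (ii). Let $\alpha \in [0, 1]$. Define the tensor $u(\alpha) \in \mathbb{C}^{n_1 + m_1} \otimes \mathbb{C}^{n_2 + m_2} \otimes \mathbb{C}^{n_3 + m_3}$ by

$$u(\alpha) = \frac{\sqrt{\alpha}}{\sqrt{\langle s, s \rangle}}\, a \oplus \frac{\sqrt{1 - \alpha}}{\sqrt{\langle t, t \rangle}}\, b.$$

Then $u(\alpha) \in \overline{G(s \oplus t) \cdot s \oplus t}$. We have $\rho^{u(\alpha)}_i = \alpha \rho^a_i \oplus (1 - \alpha) \rho^b_i$. From the observation

$$\operatorname{spec}(\alpha \rho^a_i \oplus (1 - \alpha) \rho^b_i) = \operatorname{dom}(\alpha x \oplus (1 - \alpha) y)$$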


follows $\operatorname{dom}(\alpha x \oplus (1 - \alpha) y) \in P(\overline{G(s \oplus t) \cdot s \oplus t})$. We conclude

$$\operatorname{dom}(\alpha P(s) \oplus (1 - \alpha) P(t)) \subseteq P(s \oplus t).$$

We have thus proven statements (i) and (ii).

We prove statement (iii). Let $G = G(t) = G(s)$. Let $s \in \overline{G \cdot t}$. Then $\overline{G \cdot s} \subseteq \overline{G \cdot t}$, so we have a $G$-equivariant restriction map $\mathbb{C}[\overline{G \cdot s}] \twoheadleftarrow \mathbb{C}[\overline{G \cdot t}]$ on the coordinate rings. Let $\chi/d \in R(\overline{G \cdot s})$ with $(\mathbb{C}[\overline{G \cdot s}]_d)_{(\chi^*)} \neq 0$. Then also $(\mathbb{C}[\overline{G \cdot t}]_d)_{(\chi^*)} \neq 0$ by Schur's lemma. Thus $\chi/d \in R(\overline{G \cdot t}) \subseteq P(\overline{G \cdot t})$. We conclude $P(s) \subseteq P(t)$.

We prove statement (iv). Let $\chi/d \in R(\overline{G(s \oplus 0) \cdot (s \oplus 0)})$ with $P_\chi (s \oplus 0)^{\otimes d} \neq 0$. Recall from Section 6.2 that $P_\chi$ is given by the action of an element in the group algebra $\mathbb{C}[S_d]$, which we also denoted by $P_\chi$. From this viewpoint we see that also $P_\chi s^{\otimes d} \neq 0$. So $\chi/d \in R(\overline{G(s) \cdot s})$.

Statement (v) is a direct observation.

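Corollary 6.14.

(i) $F^\theta(s) F^\theta(t) \leq F^\theta(s \otimes t)$

(ii) $F^\theta(s) + F^\theta(t) \leq F^\theta(s \oplus t)$

(iii) If $s \trianglelefteq t$ then $F^\theta(s) \leq F^\theta(t)$

(iv) $F^\theta(\langle 1 \rangle) = 1$

Proof. (i) Let $x \in P(s)$ and $y \in P(t)$. Then $\operatorname{dom}(x \otimes y) \in P(s \otimes t)$ by Theorem 6.13. It is a basic fact that $H_\theta(x) + H_\theta(y) = H_\theta(x \otimes y)$ (Lemma 4.9), and reordering does not change entropy, so $2^{H_\theta(x)} 2^{H_\theta(y)} = 2^{H_\theta(\operatorname{dom}(x \otimes y))}$. We conclude $F^\theta(s) F^\theta(t) \leq F^\theta(s \otimes t)$.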

(ii) Let $x \in P(s)$ and $y \in P(t)$. Then by Theorem 6.13, for all $\alpha \in [0, 1]$,

$$\operatorname{dom}(\alpha x \oplus (1 - \alpha) y) \in P(s \oplus t).$$

It is a basic fact that $\alpha H_\theta(x) + (1 - \alpha) H_\theta(y) + h(\alpha) = H_\theta(\alpha x \oplus (1 - \alpha) y)$ (Lemma 4.9). Thus for any $\alpha \in [0, 1]$ we have $2^{\alpha H_\theta(x) + (1 - \alpha) H_\theta(y) + h(\alpha)} \leq F^\theta(s \oplus t)$. Using Lemma 4.9(iv) we conclude $F^\theta(s) + F^\theta(t) \leq F^\theta(s \oplus t)$.
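The last step of (ii) rests on optimising over $\alpha$. We do not reproduce Lemma 4.9(iv) here; the sketch below only checks numerically the standard fact that we assume it expresses, namely $\max_{\alpha \in [0,1]} 2^{\alpha a + (1-\alpha) b + h(\alpha)} = 2^a + 2^b$, with the maximum attained at $\alpha^* = 2^a / (2^a + 2^b)$. The values of `a` and `b` are arbitrary stand-ins for $H_\theta(x)$ and $H_\theta(y)$.

```python
import math

def h(alpha):
    """Binary entropy in bits."""
    return sum(-a * math.log2(a) for a in (alpha, 1 - alpha) if a > 0)

def exponent(alpha, a, b):
    """alpha*a + (1-alpha)*b + h(alpha)."""
    return alpha * a + (1 - alpha) * b + h(alpha)

a, b = 1.3, 0.4                        # stand-ins for H_theta(x) and H_theta(y)
alpha_star = 2**a / (2**a + 2**b)      # candidate maximiser
best_closed_form = 2 ** exponent(alpha_star, a, b)
best_on_grid = max(2 ** exponent(k / 1000, a, b) for k in range(1001))
```

Substituting $\alpha^*$ into the exponent gives $\log_2(2^a + 2^b)$ exactly, which is why the two sides of (ii) match up after taking the maximum over $x$, $y$ and $\alpha$.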

(iii) This follows from statements (iii) and (iv) of Theorem 6.13, since by definition degeneration $s \trianglelefteq t$ means $s \oplus 0 \in \overline{G(t \oplus 0) \cdot (t \oplus 0)}$.

(iv) This follows from statement (v) of Theorem 6.13.


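Theorem 6.15.

(i) $R(s \otimes t) \subseteq \{\lambda/N : \exists\, \mu/N \in R(s),\ \nu/N \in R(t) : g_{\lambda^{(i)} \mu^{(i)} \nu^{(i)}} > 0 \text{ for all } i\}$

(ii) $R(s \oplus t) \subseteq \{\lambda/N : \exists\, \mu/m \in R(s),\ \nu/(N - m) \in R(t) : c^{\lambda^{(i)}}_{\mu^{(i)} \nu^{(i)}} > 0 \text{ for all } i\}$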

Proof. (i) Let $s \in V_1 \otimes V_2 \otimes V_3$ and let $t \in W_1 \otimes W_2 \otimes W_3$. Let $\lambda/N \in R(s \otimes t)$ with $P_\lambda (s \otimes t)^{\otimes N} \neq 0$. Let $\pi$ be the natural reordering map

$$\pi : ((V_1 \otimes W_1) \otimes (V_2 \otimes W_2) \otimes (V_3 \otimes W_3))^{\otimes N} \to (V_1 \otimes V_2 \otimes V_3)^{\otimes N} \otimes (W_1 \otimes W_2 \otimes W_3)^{\otimes N}.$$

Then

$$(s \otimes t)^{\otimes N} = \sum_{\mu, \nu} \pi^{-1} (P_\mu \otimes P_\nu) \pi (s \otimes t)^{\otimes N}.$$

Let $\mu, \nu \vdash_3 N$ with $P_\lambda \pi^{-1} (P_\mu \otimes P_\nu) \pi (s \otimes t)^{\otimes N} \neq 0$. Then $P_\mu s^{\otimes N} \neq 0$ and $P_\nu t^{\otimes N} \neq 0$, i.e. $\mu/N \in R(s)$ and $\nu/N \in R(t)$. Moreover, $P_\lambda \pi^{-1} (P_\mu \otimes P_\nu) \pi \neq 0$, which means the Kronecker coefficients $g_{\lambda^{(i)} \mu^{(i)} \nu^{(i)}}$ are nonzero.

(ii) Let $\lambda/N \in R(s \oplus t)$ with $P_\lambda (s \oplus t)^{\otimes N} \neq 0$. Let us expand $(s \oplus t)^{\otimes N}$ as

$$(s \oplus t)^{\otimes N} = s^{\otimes N} \oplus (s^{\otimes N - 1} \otimes t) \oplus \cdots \oplus t^{\otimes N}.$$

Then $P_\lambda$ does not vanish on some summand, which we may assume to be of the form $s^{\otimes m} \otimes t^{\otimes N - m}$. Let $\pi$ be the natural projection

$$\pi : ((V_1 \oplus W_1) \otimes (V_2 \oplus W_2) \otimes (V_3 \oplus W_3))^{\otimes N} \to (V_1 \otimes V_2 \otimes V_3)^{\otimes m} \otimes (W_1 \otimes W_2 \otimes W_3)^{\otimes N - m}.$$

Let $\mu, \nu$ with $P_\lambda \pi^{-1} (P_\mu \otimes P_\nu) \pi (s \oplus t)^{\otimes N} \neq 0$. Then $P_\mu s^{\otimes m} \neq 0$ and $P_\nu t^{\otimes N - m} \neq 0$. Moreover, $P_\lambda \pi^{-1} (P_\mu \otimes P_\nu) \pi \neq 0$. Therefore the Littlewood–Richardson coefficients $c^{\lambda^{(i)}}_{\mu^{(i)} \nu^{(i)}}$ are nonzero.

Corollary 6.16.

(i) $F^\theta(s \otimes t) \leq F^\theta(s) F^\theta(t)$

(ii) $F^\theta(s \oplus t) \leq F^\theta(s) + F^\theta(t)$

Proof. (i) Let $\lambda/N \in R(s \otimes t)$. By Theorem 6.15 there is a $\mu/N \in R(s)$ and a $\nu/N \in R(t)$ such that the Kronecker coefficient $g_{\lambda^{(i)} \mu^{(i)} \nu^{(i)}}$ is nonzero for every $i$. Then $2^{H_\theta(\mu/N)} \leq F^\theta(s)$ and $2^{H_\theta(\nu/N)} \leq F^\theta(t)$ by definition of $F^\theta$. The Kronecker coefficients being nonzero implies

$$2^{H_\theta(\lambda/N)} \leq 2^{H_\theta(\mu/N)}\, 2^{H_\theta(\nu/N)}$$


by Lemma 6.3. We conclude $F^\theta(s \otimes t) \leq F^\theta(s) F^\theta(t)$.

(ii) Let $\lambda/N \in R(s \oplus t)$. Then by Theorem 6.15 there are $\mu/m \in R(s)$ and $\nu/(N - m) \in R(t)$ such that the Littlewood–Richardson coefficient $c^{\lambda^{(i)}}_{\mu^{(i)} \nu^{(i)}}$ is nonzero for every $i$. This means

$$2^{H_\theta(\lambda/N)} \leq 2^{H_\theta(\mu/m)} + 2^{H_\theta(\nu/(N - m))}$$

by Lemma 6.3. We conclude $F^\theta(s \oplus t) \leq F^\theta(s) + F^\theta(t)$.

Proof of Theorem 6.11. Corollary 6.14 and Corollary 6.16 together prove Theorem 6.11.

6.8 Outer approximation

In this section we discuss an outer approximation of $P(t)$. We will use this outer approximation to show that the quantum functionals are at most the support functionals.

Let $\preccurlyeq$ be the dominance order, i.e. the majorization order, on triples of probability vectors. For any set $S \subseteq \mathbb{R}^{n_1} \times \mathbb{R}^{n_2} \times \mathbb{R}^{n_3}$ of triples of probability vectors let $S^{\preccurlyeq}$ denote the upward closure with respect to $\preccurlyeq$,

$$S^{\preccurlyeq} = \{y \in \mathbb{R}^{n_1} \times \mathbb{R}^{n_2} \times \mathbb{R}^{n_3} : \exists x \in S,\ x \preccurlyeq y\}.$$

Let $\operatorname{conv}(S)$ denote the convex hull of $S$ in $\mathbb{R}^{n_1} \times \mathbb{R}^{n_2} \times \mathbb{R}^{n_3}$. Recall that for $x \in S$ we defined $\operatorname{dom}(x)$ as the triple of probability vectors obtained from $x = (x^{(1)}, x^{(2)}, x^{(3)})$ by reordering the components $x^{(i)}$ such that they become non-increasing, and $\operatorname{dom}(S) = \{\operatorname{dom}(x) : x \in S\}$.

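Theorem 6.17 (Strassen [Str05]). Let $v \in V \setminus 0$. Then

$$P(v) \subseteq (\operatorname{dom} \operatorname{conv} \operatorname{supp} v)^{\preccurlyeq}. \qquad (6.5)$$

Proof. We give the proof for the convenience of the reader. Let $\chi/d \in R(\overline{G \cdot v})$. Then $(\operatorname{lin}(G \cdot v^{\otimes d}))_{(\chi)} \neq 0$. Let $M_\chi \subseteq \operatorname{lin}(G \cdot v^{\otimes d})$ be a simple $G$-submodule with highest weight $\chi$. Let $N \subseteq V^{\otimes d}$ be the $G$-module complement, $N \oplus M_\chi = V^{\otimes d}$. Then $v^{\otimes d}$ is not in $N$. Let $v = \bigoplus_{\gamma \in \operatorname{supp} v} v_\gamma$ be the weight decomposition. Then $v^{\otimes d}$ is a sum of tensor products of the $v_\gamma$. At least one summand is not in $N$, say of weight $\eta = \sum_\gamma d_\gamma \gamma$ with $\sum_\gamma d_\gamma = d$. The projection $V^{\otimes d} \to M_\chi$ along $N$ maps this summand onto a nonzero weight vector of weight $\eta$. So $\eta$ is a weight of $M_\chi$. Then also $\operatorname{dom}(\eta)$ is a weight of $M_\chi$. Since $\chi$ is the highest weight of $M_\chi$, $\operatorname{dom}(\eta) \preccurlyeq \chi$. Then $\operatorname{dom}(\eta/d) \preccurlyeq \chi/d$. We have $\eta/d = \sum_\gamma (d_\gamma/d)\, \gamma \in \operatorname{conv} \operatorname{supp} v$. We conclude $R(\overline{G \cdot v}) \subseteq (\operatorname{dom} \operatorname{conv} \operatorname{supp} v)^{\preccurlyeq}$ and thus $P(\overline{G \cdot v}) \subseteq (\operatorname{dom} \operatorname{conv} \operatorname{supp} v)^{\preccurlyeq}$.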


6.9 Inner approximation for free tensors

In this section we discuss an inner approximation for the moment polytope of a free tensor. We will use this inner approximation in the next section to prove that the quantum functionals coincide with the support functionals when restricted to free tensors. We will prove that not all tensors are free.

We say a set $\Phi \subseteq [n_1] \times [n_2] \times [n_3]$ is free if every two different elements of $\Phi$ differ in at least two coordinates, in other words, if the elements of $\Phi$ have Hamming distance at least two. We say $v \in V = \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$ is free if for some $g \in G(v) = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$ the support $\operatorname{supp}(g \cdot v) \subseteq [n_1] \times [n_2] \times [n_3]$ is free. (Free is called schlicht in [Str05].)

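Theorem 6.18 (Strassen [Str05]). Let $v \in V \setminus 0$ with $\operatorname{supp}(v)$ free. Then

$$\operatorname{dom} \operatorname{conv} \operatorname{supp} v \subseteq P(v).$$

Proof. We refer to [Str05].

Corollary 6.19. Let $v \in V \setminus 0$ with $\operatorname{supp}(v)$ free. Then

$$P(v)^{\preccurlyeq} = (\operatorname{dom} \operatorname{conv} \operatorname{supp} v)^{\preccurlyeq}.$$

Proof. By Theorem 6.18, $\operatorname{dom} \operatorname{conv} \operatorname{supp} v \subseteq P(v)$. We take the upward closure on both sides to get $(\operatorname{dom} \operatorname{conv} \operatorname{supp} v)^{\preccurlyeq} \subseteq P(v)^{\preccurlyeq}$. On the other hand, from Theorem 6.17 follows $P(v)^{\preccurlyeq} \subseteq (\operatorname{dom} \operatorname{conv} \operatorname{supp} v)^{\preccurlyeq}$.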

Remark 6.20. Recall that $v \in V$ is oblique if the support $\operatorname{supp}(g \cdot v)$ is an antichain for some $g \in G(v)$ (Section 4.4). Such antichains are free, so oblique tensors are free. Thus $\{\text{tight}\} \subseteq \{\text{oblique}\} \subseteq \{\text{free}\}$. Like the tight tensors and oblique tensors, free tensors form a semigroup under $\otimes$ and $\oplus$.

Proposition 6.21. For $n \geq 5$ there exists a tensor that is not free in $\mathbb{C}^n \otimes \mathbb{C}^n \otimes \mathbb{C}^n$.

Proof. We upper bound the maximal size of a free support. Let $\Phi \subseteq [n] \times [n] \times [n]$ be free. Any two distinct elements in $\Phi$ are still distinct if we forget the third coordinate of each. Therefore $|\Phi| = |\{(\alpha_1, \alpha_2) : \alpha \in \Phi\}| \leq n^2$. (This is a special case of the Singleton bound [Sin64] from coding theory. This upper bound is tight, since $\Phi = \{(a, b, c) : a, b, c \in [n],\ c = a + b \bmod n\}$ is free and has size $n^2$.) Second, we apply the following observation of Bürgisser [Bur90, page 3]. Let

$$Z_n = \{t \in \mathbb{C}^n \otimes \mathbb{C}^n \otimes \mathbb{C}^n : \exists g \in G(t) : |\operatorname{supp}(g \cdot t)| < n^3 - 3n^2\}.$$

Let $Y_n = \mathbb{C}^n \otimes \mathbb{C}^n \otimes \mathbb{C}^n \setminus Z_n$. Then the set $Y_n$ is Zariski open and nonempty. Now let $n \geq 5$ and let $t \in Y_n$. Then $\forall g \in G(t)$: $|\operatorname{supp}(g \cdot t)| \geq n^3 - 3n^2 > n^2$. We conclude $t$ is not free.
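The two combinatorial ingredients of this proof can be checked directly for small $n$. The following sketch (function names are ours; indices are 0-based rather than the 1-based $[n]$ of the text) tests freeness by brute force, confirms that the cyclic set $\{(a, b, a + b \bmod n)\}$ is free of size $n^2$, and evaluates the inequality $n^3 - 3n^2 > n^2$.

```python
from itertools import combinations

def is_free(support):
    """True if any two distinct triples differ in at least two coordinates."""
    return all(sum(u != v for u, v in zip(s, t)) >= 2
               for s, t in combinations(support, 2))

def cyclic_support(n):
    """The set {(a, b, c) : c = a + b mod n}, with 0-based indices."""
    return [(a, b, (a + b) % n) for a in range(n) for b in range(n)]

n = 5
phi = cyclic_support(n)
generic_lower_bound = n**3 - 3 * n**2   # support-size bound for tensors in Y_n
```

For $n = 4$ the bound $n^3 - 3n^2 = 16$ only equals $n^2$, which is consistent with the hypothesis $n \geq 5$ in the proposition.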


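6.10 Quantum functionals versus support functionals

We discussed the support functionals $\zeta^\theta \in X(\{\text{oblique 3-tensors over } \mathbb{F}\})$ in Chapter 4. We recall the definition over $\mathbb{C}$. Let $V = \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$. For $\theta \in \Theta = \mathcal{P}([3])$ and $t \in V \setminus 0$ with $\operatorname{supp}(t)$ oblique,

$$\zeta^\theta(t) = \max\{2^{H_\theta(P)} : P \in \mathcal{P}(\operatorname{supp}(t))\}.$$

We also discussed an extension of $\zeta^\theta$ to all 3-tensors over $\mathbb{C}$, the upper support functional

$$\zeta^\theta(t) = \min_{g \in G(t)} \max\{2^{H_\theta(P)} : P \in \mathcal{P}(\operatorname{supp}(g \cdot t))\}.$$

We know $\zeta^\theta(s \otimes t) \leq \zeta^\theta(s) \zeta^\theta(t)$, $\zeta^\theta(s \oplus t) = \zeta^\theta(s) + \zeta^\theta(t)$, $\zeta^\theta(\langle 1 \rangle) = 1$ and $s \leq t \Rightarrow \zeta^\theta(s) \leq \zeta^\theta(t)$ for any $s, t \in V$.

The set $\operatorname{conv} \operatorname{supp}(g \cdot t)$ is the set of marginals of probability distributions on $\operatorname{supp}(g \cdot t)$. Thus $\operatorname{dom} \operatorname{conv} \operatorname{supp}(g \cdot t)$ is the set of ordered marginals of probability distributions on $\operatorname{supp}(g \cdot t)$. Therefore

$$\zeta^\theta(t) = \min_{g \in G(t)} \max_{x \in S(g \cdot t)} 2^{H_\theta(x)}$$

with $S(w) = \operatorname{dom} \operatorname{conv} \operatorname{supp} w$. Let $X \subseteq \mathbb{R}^{n_1} \times \mathbb{R}^{n_2} \times \mathbb{R}^{n_3}$ be a set of triples of probability vectors. From Schur-concavity of the Shannon entropy function follows $\max_{x \in X} 2^{H_\theta(x)} = \max_{x \in X^{\preccurlyeq}} 2^{H_\theta(x)}$. Also $H_\theta(x) = H_\theta(\operatorname{dom} x)$.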
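The reason the maximum does not change when passing to the upward closure is that moving up in the dominance order can only lower the entropy. The sketch below illustrates this on one pair of probability vectors of equal length; the helper names are ours, and we assume the convention that $x \preccurlyeq y$ means every partial sum of the sorted $x$ is at most the corresponding partial sum of the sorted $y$.

```python
import math

def entropy(p):
    """Shannon entropy in bits."""
    return sum(-pi * math.log2(pi) for pi in p if pi > 0)

def dominated_by(x, y, tol=1e-12):
    """True if x is below y in the dominance order (assumed convention:
    partial sums of sorted x are at most those of sorted y)."""
    xs, ys = sorted(x, reverse=True), sorted(y, reverse=True)
    px = py = 0.0
    for a, b in zip(xs, ys):
        px, py = px + a, py + b
        if px > py + tol:
            return False
    return True

# Example pair (assumed values): y is more concentrated than x.
x = [0.4, 0.3, 0.3]
y = [0.6, 0.3, 0.1]
```

Here `x` lies below `y`, and correspondingly `entropy(y)` does not exceed `entropy(x)`.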

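Theorem 6.22. $\zeta^\theta(t) \geq F^\theta(t)$.

Proof. Let $g \in G(t)$ such that

$$\max_{x \in S} 2^{H_\theta(x)} = \zeta^\theta(t)$$

with $S = \operatorname{dom} \operatorname{conv} \operatorname{supp}(g \cdot t)$. We have

$$\max_{x \in S} 2^{H_\theta(x)} = \max_{x \in S^{\preccurlyeq}} 2^{H_\theta(x)}.$$

By Theorem 6.17, $P(t) \subseteq S^{\preccurlyeq}$. We conclude $F^\theta(t) \leq \zeta^\theta(t)$.

Theorem 6.23. Let $t \in V$ be free. Then $\zeta^\theta(t) = F^\theta(t)$.

Proof. We know from Theorem 6.22 that $\zeta^\theta(t) \geq F^\theta(t)$. We prove $\zeta^\theta(t) \leq F^\theta(t)$. Let $g \in G(t)$ such that $\operatorname{supp}(g \cdot t)$ is free. Let $S = \operatorname{dom} \operatorname{conv} \operatorname{supp}(g \cdot t)$. Then $\zeta^\theta(t) \leq \max_{x \in S} 2^{H_\theta(x)} = \max_{x \in S^{\preccurlyeq}} 2^{H_\theta(x)}$. By Theorem 6.18 we have $S^{\preccurlyeq} = P(t)^{\preccurlyeq}$. We conclude $\zeta^\theta(t) \leq F^\theta(t)$.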


We can show that the regularised upper support functional equals the quantum functional. As a consequence, the quantum functional is at least the lower support functional $\zeta_\theta$, which was discussed in Chapter 4.

Theorem 6.24. $\lim_{n \to \infty} \zeta^\theta(t^{\otimes n})^{1/n} = F^\theta(t)$.

Proof. We refer the reader to [CVZ18].

Corollary 6.25. $F^\theta(v) \geq \zeta_\theta(v)$.

Proof. By Theorem 6.24, $F^\theta(v) = \lim_{n \to \infty} \zeta^\theta(v^{\otimes n})^{1/n}$. We know $\zeta^\theta(v) \geq \zeta_\theta(v)$ by Theorem 4.15, and thus $\lim_{n \to \infty} \zeta^\theta(v^{\otimes n})^{1/n} \geq \lim_{n \to \infty} \zeta_\theta(v^{\otimes n})^{1/n}$. The lower support functional $\zeta_\theta$ is supermultiplicative under $\otimes$ (Theorem 4.14), so

$$\lim_{n \to \infty} \zeta_\theta(v^{\otimes n})^{1/n} \geq \zeta_\theta(v).$$

Combining these three statements proves the corollary.

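6.11 Asymptotic slice rank

We proved in Section 4.6 that for oblique $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$ the asymptotic slice rank $\lim_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n}$ exists and equals $\min_{\theta \in \Theta} \zeta^\theta(t)$ with $\Theta = \mathcal{P}([3])$. In this section we prove the analogous statement for the quantum functionals.

Theorem 6.26. Let $t \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$. Then

$$\lim_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n} = \min_{\theta \in \Theta} F^\theta(t).$$

We work towards the proof of Theorem 6.26. Let $t \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3} \setminus 0$. Let $E^\theta(t) = \log_2 F^\theta(t)$.

Lemma 6.27. For any $\varepsilon > 0$ there is an $n_0 \in \mathbb{N}$ such that for all $n \geq n_0$ there is a $\lambda/n \in R(t)$ with $\min_{i \in [3]} H(\lambda^{(i)}/n) \geq \min_{\theta \in \Theta} E^\theta(t) - \varepsilon$.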

Proof. By definition,

$$\min_{\theta \in \Theta} E^\theta(t) = \min_{\theta \in \Theta} \max_{x \in P(t)} \sum_{j \in [3]} \theta(j) H(x^{(j)}).$$

By Von Neumann's minimax theorem, the right-hand side equals

$$\max_{x \in P(t)} \min_{\theta \in \Theta} \sum_{j \in [3]} \theta(j) H(x^{(j)}),$$

which equals

$$\max_{x \in P(t)} \min_{j \in [3]} H(x^{(j)}).$$


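Let $\varepsilon > 0$. Let $\mu/m \in R(t)$ with $\min_{j \in [3]} H(\mu^{(j)}/m) \geq \min_{\theta \in \Theta} E^\theta(t) - \varepsilon/2$. We will use two facts. We have $(P_{(1)} \otimes P_{(1)} \otimes P_{(1)}) t = t \neq 0$. The triples of partitions $\lambda$ with $P_\lambda t^{\otimes n} \neq 0$ for some $n$ form a semigroup. Let $n \in \mathbb{N}$. We can write $n = qm + r$ with $q, r \in \mathbb{N}$, $0 \leq r < m$. Let $\lambda^{(j)} = q\mu^{(j)} + (r)$. Then by the semigroup property $P_\lambda t^{\otimes n} \neq 0$, i.e. $\lambda/n \in R(t)$. We have $\tfrac{1}{n}(q\mu^{(j)} + (r)) = \tfrac{qm}{n} \cdot \tfrac{\mu^{(j)}}{m} + \tfrac{r}{n} \cdot (1)$, where $(1)$ denotes the probability vector with a single nonzero entry. By concavity of Shannon entropy,

$$H\bigl(\tfrac{1}{n}(q\mu^{(j)} + (r))\bigr) = H\bigl(\tfrac{qm}{n} \cdot \tfrac{\mu^{(j)}}{m} + \tfrac{r}{n} \cdot (1)\bigr) \geq \tfrac{qm}{n} H(\mu^{(j)}/m) \geq \bigl(1 - \tfrac{m}{n}\bigr) H(\mu^{(j)}/m).$$

When $n$ is large enough, $(1 - \tfrac{m}{n}) H(\mu^{(j)}/m)$ is at least $H(\mu^{(j)}/m) - \varepsilon/2$. Let $n_0 \in \mathbb{N}$ be such that this is the case for all $j \in [3]$.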

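Lemma 6.28. Let $\lambda/n \in R(t)$. Then $\mathrm{SR}(t^{\otimes n}) \geq \min_{i \in [3]} \dim [\lambda^{(i)}]$.

Proof. We have the restriction $t^{\otimes n} \geq P_\lambda t^{\otimes n} \neq 0$. Choose rank-one projections $A_j$ in the vector spaces $S_{\lambda^{(j)}}(\mathbb{C}^{n_j})$ with

$$s = (\mathrm{id}_{[\lambda^{(1)}]} \otimes A_1) \otimes (\mathrm{id}_{[\lambda^{(2)}]} \otimes A_2) \otimes (\mathrm{id}_{[\lambda^{(3)}]} \otimes A_3)\, P_\lambda t^{\otimes n} \neq 0.$$

The tensor $s$ is invariant under $S_n$ acting diagonally on $(\mathbb{C}^{n_1})^{\otimes n} \otimes (\mathbb{C}^{n_2})^{\otimes n} \otimes (\mathbb{C}^{n_3})^{\otimes n}$. Thus the marginal spectra $\operatorname{spec} \rho^s_i$ are uniform. This implies $s$ is semistable. From [BCC+17, Theorem 4.6] follows that $\mathrm{SR}(s)$ equals $\min_{i \in [3]} \dim [\lambda^{(i)}]$.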

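Lemma 6.29. $\liminf_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n} \geq \min_{\theta \in \Theta} F^\theta(t)$.

Proof. Let $\varepsilon > 0$. For $n$ large enough, choose $\lambda/n \in R(t)$ as in Lemma 6.27. By Lemma 6.28, $\mathrm{SR}(t^{\otimes n}) \geq \min_{i \in [3]} \dim [\lambda^{(i)}]$. The right-hand side we lower bound by

$$\min_{i \in [3]} \dim [\lambda^{(i)}] \geq \min_{i \in [3]} 2^{n H(\lambda^{(i)}/n)}\, 2^{-o(n)} \geq 2^{n (\min_{\theta \in \Theta} E^\theta(t) - \varepsilon)}\, 2^{-o(n)}.$$

Then $\liminf_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n} \geq 2^{\min_{\theta \in \Theta} E^\theta(t) - \varepsilon}$.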

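Lemma 6.30. $\limsup_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n} \leq F^\theta(t)$.

Proof. Let $n \in \mathbb{N}$. Define $s_1, s_2, s_3 \in (\mathbb{C}^{n_1})^{\otimes n} \otimes (\mathbb{C}^{n_2})^{\otimes n} \otimes (\mathbb{C}^{n_3})^{\otimes n}$ by

$$s_1 = \Bigl(\sum_{\substack{\lambda^{(1)} \vdash n \\ H(\lambda^{(1)}/n) \leq E^\theta(t)}} P_{\lambda^{(1)}} \otimes \mathrm{Id} \otimes \mathrm{Id}\Bigr)\, t^{\otimes n},$$

$$s_2 = \Bigl(\sum_{\substack{\lambda^{(2)} \vdash n \\ H(\lambda^{(2)}/n) \leq E^\theta(t)}} \mathrm{Id} \otimes P_{\lambda^{(2)}} \otimes \mathrm{Id}\Bigr)\, (t^{\otimes n} - s_1),$$

$$s_3 = \Bigl(\sum_{\substack{\lambda^{(3)} \vdash n \\ H(\lambda^{(3)}/n) \leq E^\theta(t)}} \mathrm{Id} \otimes \mathrm{Id} \otimes P_{\lambda^{(3)}}\Bigr)\, (t^{\otimes n} - s_1 - s_2).$$

Then $t^{\otimes n} = s_1 + s_2 + s_3$. The slice rank of an element in the image of $P_{\lambda^{(1)}} \otimes \mathrm{Id} \otimes \mathrm{Id}$ is at most $\dim\bigl([\lambda^{(1)}] \otimes S_{\lambda^{(1)}}(\mathbb{C}^{n_1})\bigr)$, which is at most $2^{n H(\lambda^{(1)}/n) + o(n)}$ (Section 6.2). Similarly for $\mathrm{Id} \otimes P_{\lambda^{(2)}} \otimes \mathrm{Id}$ and $\mathrm{Id} \otimes \mathrm{Id} \otimes P_{\lambda^{(3)}}$. The tensor $s_1$ is in the image of the sum $\sum_{\lambda^{(1)}} P_{\lambda^{(1)}} \otimes \mathrm{Id} \otimes \mathrm{Id}$ over $\lambda^{(1)} \vdash n$ with at most $n_1$ parts. There are at most $(n + 1)^{n_1}$ such partitions. Thus $\mathrm{SR}(s_1) \leq (n + 1)^{n_1}\, 2^{n E^\theta(t) + o(n)}$. Similarly for $s_2$ and $s_3$. Therefore

$$\limsup_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n} \leq \limsup_{n \to \infty} \bigl(3 (n + 1)^{\max_{i \in [3]} n_i}\, 2^{n E^\theta(t) + o(n)}\bigr)^{1/n}. \qquad (6.6)$$

The right-hand side of (6.6) equals $F^\theta(t)$.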
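To spell out why the right-hand side of (6.6) equals $F^\theta(t)$, one can take the $n$-th root factor by factor; this is only an unpacking of the limit, using $E^\theta(t) = \log_2 F^\theta(t)$:

```latex
\[
\bigl(3\,(n+1)^{\max_i n_i}\, 2^{nE^\theta(t)+o(n)}\bigr)^{1/n}
  = 3^{1/n}\,(n+1)^{(\max_i n_i)/n}\, 2^{E^\theta(t)+o(1)}
  \;\longrightarrow\; 2^{E^\theta(t)} = F^\theta(t)
  \qquad (n \to \infty).
\]
```

The first two factors tend to 1 because $\max_i n_i$ is a constant independent of $n$.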

Proof of Theorem 6.26. Lemma 6.29 and Lemma 6.30 together prove Theorem 6.26.

6.12 Conclusion

In this chapter we constructed the first infinite family of spectral points for 3-tensors over $\mathbb{C}$, the quantum functionals. For 30 years the only explicit spectral points known were the gauge points. The constructions in this chapter naturally generalise to higher-order tensors, for which we refer to our paper [CVZ18]. We do not know whether the quantum functionals are all spectral points for 3-tensors over $\mathbb{C}$. Finally, we showed that for complex tensors the asymptotic slice rank exists and equals the minimum value over the quantum functionals.

Chapter 7

Algebraic branching programs: approximation and nondeterminism

This chapter is based on joint work with Karl Bringmann and Christian Ikenmeyer [BIZ17].

7.1 Introduction

The study of asymptotic tensor rank in previous chapters was originally motivated by the study of the complexity of matrix multiplication in the algebraic circuit model, an algebraic model of computation. In this chapter we will study several other algebraic models of computation and algebraic complexity classes.

Formulas, the class $\mathrm{VP}_e$ and the determinant

An (arithmetic) formula is a rooted binary tree whose leaves are each labeled with a variable or a field constant, and whose root and intermediate vertices are labeled with either $+$ (addition) or $\times$ (multiplication). In the natural way, via recursion over the tree structure, a formula computes a multivariate polynomial $f$. The formula size of a multivariate polynomial $f$ is the smallest number of vertices required for any formula to compute $f$. Here is an example of a formula of size 7 computing the polynomial $(3 + x)(3 + y)$.

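(Figure: the formula tree with root $\times$, whose two children are $+$ vertices with leaves $3, x$ and $3, y$ respectively.)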
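The example can be reproduced with a minimal tree representation. The sketch below is ours (class and function names are not from the text): it builds the formula for $(3 + x)(3 + y)$, counts its vertices, and evaluates it by recursion over the tree.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Leaf:
    label: Union[str, int]        # a variable name or a field constant

@dataclass
class Gate:
    op: str                       # "+" or "*"
    left: "Node"
    right: "Node"

Node = Union[Leaf, Gate]

def size(f):
    """Number of vertices of the formula tree."""
    return 1 if isinstance(f, Leaf) else 1 + size(f.left) + size(f.right)

def evaluate(f, env):
    """Evaluate the formula at the assignment env (variable name -> value)."""
    if isinstance(f, Leaf):
        return env[f.label] if isinstance(f.label, str) else f.label
    l, r = evaluate(f.left, env), evaluate(f.right, env)
    return l + r if f.op == "+" else l * r

# The formula (3 + x)(3 + y): four leaves, two + vertices, one * vertex.
example = Gate("*", Gate("+", Leaf(3), Leaf("x")), Gate("+", Leaf(3), Leaf("y")))
```

Calling `size(example)` counts the 7 vertices of the tree, matching the size stated in the text.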

A sequence of multivariate polynomials $(f_n)_{n \in \mathbb{N}}$ is called a family. Valiant in his seminal paper [Val79] introduced the complexity class $\mathrm{VP}_e$ that is defined as the set of all families whose formula size is polynomially bounded. (We say a sequence $(a_n)_n \in \mathbb{N}^{\mathbb{N}}$ of natural numbers is polynomially bounded if there exists a univariate polynomial $q$ such that $a_n \leq q(n)$ for all $n$.) For example, the family $((x_1)^n + (x_2)^n + \cdots + (x_n)^n)_n$ is in $\mathrm{VP}_e$, because the formula size of this family grows quadratically.
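One way to see the quadratic growth is to count the vertices of the obvious formula, which multiplies $n$ copies of each $x_i$ and then adds the $n$ powers. The count below is a sketch of that upper bound only (the function name is ours); it does not claim that this formula is the smallest one.

```python
def power_sum_formula_size(n):
    """Vertices in the naive formula for x_1^n + ... + x_n^n
    (an upper bound on the formula size)."""
    leaves_per_power = n          # x_i written n times
    mults_per_power = n - 1       # multiplications to combine the n copies
    additions = n - 1             # additions to combine the n powers
    return n * (leaves_per_power + mults_per_power) + additions

sizes = [power_sum_formula_size(n) for n in range(1, 6)]
```

The count simplifies to $2n^2 - 1$, which is polynomially bounded in $n$.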

The smallest known formulas for the determinant family $\det_n$ have size $n^{O(\log n)}$. This follows from Berkowitz' algorithm [Ber84], which gives an algebraic circuit of depth $O(\log^2 n)$, and thus by expanding we get an algebraic formula of depth $O(\log^2 n)$, whose size is then trivially bounded by $2^{O(\log^2 n)} = n^{O(\log n)}$. It is a major open question in algebraic complexity theory whether formulas of polynomially bounded size exist for $\det_n$. This question can be phrased in terms of complexity classes as asking whether or not the inclusion $\mathrm{VP}_e \subseteq \mathrm{VP}_s$ is strict. (We will define $\mathrm{VP}_s$ shortly.)

Motivated by this question, we study the closure class $\overline{\mathrm{VP}_e}$ of families of polynomials that can be approximated arbitrarily closely by families in $\mathrm{VP}_e$ (see Section 7.2.4 for the formal definition). Over the field $\mathbb{R}$ or $\mathbb{C}$ one can think of $\overline{\mathrm{VP}_e}$ as the set of families whose border formula size is polynomially bounded. The border formula size of a polynomial $f$ is the smallest number $c$ such that there exists a sequence $g_i$ of polynomials with formula size at most $c$ and $\lim_{i \to \infty} g_i = f$.

Continuous lower bounds

In algebraic complexity theory, problem instances correspond to vectors $v \in \mathbb{F}^n$. A complexity lower bound often takes the form of a function $f : \mathbb{F}^n \to \mathbb{F}$ that is zero on the vectors of “low complexity” and nonzero on $v$. We refer to Grochow [Gro13] for a discussion of settings where complexity lower bounds are obtained in this way (e.g. [NW97, Raz09, LO15, GKKS13, LMR13, BI13]). Over the complex numbers we can in fact assume that these functions $f$ are continuous [Gro13] (and even so-called highest-weight vector polynomials). If $C$ and $D$ are algebraic complexity classes with $C \subseteq D$ (for example $C = \mathrm{VP}_e$ and $D = \mathrm{VP}_s$), then a proof of separation $D \not\subseteq C$ in this continuous manner implies the stronger separation $D \not\subseteq \overline{C}$. In our case it is thus natural to aim for the separation $\mathrm{VP}_s \not\subseteq \overline{\mathrm{VP}_e}$ instead of the slightly weaker $\mathrm{VP}_s \not\subseteq \mathrm{VP}_e$, which provides further motivation for studying $\overline{\mathrm{VP}_e}$. This is exactly analogous to the geometric complexity theory approach of Mulmuley and Sohoni (see e.g. [MS01, MS08] and the exposition [BLMW11, Sec. 9]), which aims to prove the separation $\mathrm{VNP} \not\subseteq \overline{\mathrm{VP}_s}$ to attack Valiant's famous conjecture $\mathrm{VP}_s \neq \mathrm{VNP}$ [Val79]. (Here $\mathrm{VNP}$ is the class of p-definable families, see Section 7.2.4.)

New results in this chapter

We prove two new results in this chapter.


Algebraic branching programs of width 2. An algebraic branching program (abp) is a directed acyclic graph with a source vertex $s$ and a sink vertex $t$ that has affine linear forms over the base field $\mathbb{F}$ as edge labels. Moreover, we require that each vertex is labeled with an integer (its layer) and that edges in the abp only point from vertices in layer $i$ to vertices in layer $i + 1$. The width of an abp is the cardinality of its largest layer. The size of an abp is the number of its vertices. The value of an abp is the sum of the values of all $s$–$t$-paths, where the value of an $s$–$t$-path is the product of its edge labels. We say that an abp computes its value. The class $\mathrm{VP}_s$ coincides with the class of families of polynomials that can be computed by abps of polynomially bounded size, see e.g. [Sap16].

For $k \in \mathbb{N}$ we introduce the class $\mathrm{VP}_k$ as the class of families of polynomials computable by width-$k$ abps of polynomially bounded size. It is well-known (see Lemma 7.2) that $\mathrm{VP}_k \subseteq \mathrm{VP}_e$ for all $k \geq 1$. In 1992, Ben-Or and Cleve [BOC92] showed that $\mathrm{VP}_k = \mathrm{VP}_e$ for all $k \geq 3$. In 2011, Allender and Wang [AW16] showed that width-2 abps cannot compute every polynomial, so in particular we have a strict inclusion $\mathrm{VP}_2 \subsetneq \mathrm{VP}_3$.

We prove that the closure of $\mathrm{VP}_2$ and the closure of $\mathrm{VP}_e$ are equal,

$$\overline{\mathrm{VP}_2} = \overline{\mathrm{VP}_e}, \qquad (7.1)$$

when $\operatorname{char}(\mathbb{F}) \neq 2$. From (7.1) and the result of Allender and Wang follows directly that the inclusion $\mathrm{VP}_2 \subsetneq \overline{\mathrm{VP}_2}$ is strict. We have thus separated a complexity class from its approximation closure.

VNP via affine linear forms. Every algebraic complexity class has a nondeterministic closure (see Section 7.2.5 for the definition). The nondeterministic closure of $\mathrm{VP}$ is called $\mathrm{VNP}$ and the nondeterministic closure of $\mathrm{VP}_e$ is called $\mathrm{VNP}_e$. In 1980, Valiant [Val80] proved $\mathrm{VNP}_e = \mathrm{VNP}$. The nondeterministic closures of $\mathrm{VP}_1$ and $\mathrm{VP}_2$ we call $\mathrm{VNP}_1$ and $\mathrm{VNP}_2$. Using interpolation techniques we can deduce $\mathrm{VNP}_2 = \mathrm{VNP}$ from (7.1), provided the field is infinite. Using more sophisticated techniques we prove

$$\mathrm{VNP}_1 = \mathrm{VNP}. \qquad (7.2)$$

From (7.2) easily follows $\mathrm{VP}_1 \subsetneq \mathrm{VNP}_1$. Also, from [AW16] we get $\mathrm{VP}_2 \subsetneq \mathrm{VNP}_2$. We have thus separated complexity classes from their nondeterministic closures.

Further related work

An excellent exposition on the history of small-width computation can be found in [AW16], along with an explicit polynomial that cannot be computed by width-2 abps, namely $x_1 x_2 + x_3 x_4 + \cdots + x_{15} x_{16}$. Saha, Saptharishi and Saxena in [SSS09, Cor. 14] showed that $x_1 x_2 + x_3 x_4 + x_5 x_6$ cannot be computed by width-2 abps that correspond to the iterated matrix multiplication of upper triangular matrices.


Bürgisser in [Bur04] studied approximations in the model of general algebraic circuits, finding general upper bounds on the error degree. For most algebraic complexity classes $C$ the relation between $C$ and $\overline{C}$ has not been an active object of study. As pointed out recently by Forbes [For16], Nisan's result [Nis91] implies that $C = \overline{C}$ for $C$ being the class of size-$k$ algebraic branching programs on noncommuting variables. A structured study of $\overline{\mathrm{VP}}$ and $\overline{\mathrm{VP}_s}$ was started in [GMQ16]. Much work in lower bounds for algebraic approximation algorithms has been done in the area of bilinear complexity, dating back to [BCRL79, Str83, Lic84] and more recently e.g. [Lan06, LO15, HIL13, Zui17, LM16a].

This chapter is organised as follows. In Section 7.2 we discuss definitions and basic results. In Section 7.3 we prove that the approximation closure of $\mathrm{VP}_2$ equals the approximation closure of $\mathrm{VP}_e$, i.e. $\overline{\mathrm{VP}_2} = \overline{\mathrm{VP}_e}$. In Section 7.4 we prove that the nondeterminism closure of $\mathrm{VP}_1$ equals $\mathrm{VNP}$.

7.2 Definitions and basic results

We briefly recall the definition of circuits, formulas and branching programs, and we recall the definition of the corresponding complexity classes. Then we discuss some straightforward relationships among these classes and review the proof of a theorem by Ben-Or and Cleve, which inspired our work. Finally, we discuss the approximation closure and the nondeterminism closure for algebraic complexity classes.

7.2.1 Computational models

Let $x_1, x_2, \ldots$ be formal variables. By $\mathbb{F}[x]$ we mean the ring of polynomials over $\mathbb{F}$ with variables $x_1, x_2, \ldots, x_k$ with $k$ large enough.

A circuit is a directed acyclic graph $G$ with one or more source vertices and one sink vertex. Each source vertex is labelled by a variable $x_i$ or a constant $c \in \mathbb{F}$. The other vertices are labelled by either $+$ or $\times$ and have in-degree 2 (that is, fan-in 2). Each vertex computes an element in $\mathbb{F}[x]$ by recursion over the graph. The element computed by the sink is the element computed by the circuit. The size of a circuit is the number of vertices.

A formula is a circuit whose graph is a tree.

An algebraic branching program (abp) is a directed acyclic graph with a source vertex $s$ and a sink vertex $t$ that has affine linear forms $\alpha x_i + \beta$, $\alpha, \beta \in \mathbb{F}$, as edge labels. Moreover, we require that each vertex is labeled with an integer (its layer) and that edges in the abp only point from vertices in layer $i$ to vertices in layer $i + 1$. The width of an abp is the cardinality of its largest layer. The size of an abp is the number of its vertices. The value of an abp is the sum of the values of all $s$–$t$-paths, where the value of an $s$–$t$-path is the product of its edge labels.


We say that an abp computes its value.

For example, the following abp has depth 5, width 3 and computes the polynomial $x_1 x_2 + x_2 + 2 x_1 - 1$.

(Figure: the abp, with edge labels $x_1$, $2$ in one layer and $x_1$, $x_2$, $-1$ in the next.)

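An abp $G$ corresponds naturally to an iterated product of matrices: for any two consecutive layers $L_i, L_{i+1}$ in $G$, let $M_i$ be the matrix $(e_{v,w})_{v \in L_i, w \in L_{i+1}}$ with $e_{v,w}$ the label of the edge from $v$ to $w$ (or 0 if there is no edge from $v$ to $w$). Then the value of $G$ equals the product $M_k \cdots M_2 M_1$.

For example, the above abp corresponds to the following iterated matrix product:

$$\begin{pmatrix} 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} -1 & 0 & 0 \\ 0 & x_2 & 0 \\ 0 & 0 & x_1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ x_1 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}.$$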
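The iterated matrix product of the example can be checked by evaluating it at sample points. The sketch below (helper names are ours) multiplies from right to left, starting with the all-ones column vector, and compares the result with the polynomial $x_1 x_2 + x_2 + 2 x_1 - 1$ on a grid of integer inputs; agreement on a grid is a sanity check, not a symbolic proof.

```python
def mat_vec(m, v):
    """Matrix-vector product for small integer matrices given as lists of rows."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

def abp_value(x1, x2):
    """Value of the example abp as the product (1 1 1) * M2 * M1 * (1 1 1)^T."""
    m1 = [[1, 0, 0], [x1, 1, 0], [0, 0, 2]]
    m2 = [[-1, 0, 0], [0, x2, 0], [0, 0, x1]]
    v = [1, 1, 1]
    v = mat_vec(m1, v)
    v = mat_vec(m2, v)
    return sum(v)   # left-multiplication by the row vector (1 1 1)

def target(x1, x2):
    return x1 * x2 + x2 + 2 * x1 - 1
```

Working through the product by hand gives the column vectors $(1, x_1 + 1, 2)$ and then $(-1, x_2 (x_1 + 1), 2 x_1)$, whose entries sum to the stated polynomial.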

7.2.2 Complexity classes $\mathrm{VP}$, $\mathrm{VP}_e$, $\mathrm{VP}_k$

The circuit size of a polynomial $f$ is the size of the smallest circuit computing $f$. The formula size of a polynomial $f$ is the size of the smallest formula computing $f$.

A family is a sequence $(f_n)_{n \in \mathbb{N}}$ of multivariate polynomials over $\mathbb{F}$. A class is a set of families. The class $\mathrm{VP}$ consists of all families $(f_n)$ with circuit size, degree and number of variables in $\mathrm{poly}(n)$. The class $\mathrm{VP}_e$ consists of all families $(f_n)$ with formula size in $\mathrm{poly}(n)$. (The origin of the subscript $e$ in $\mathrm{VP}_e$ is the term “arithmetic expression”.) Clearly $\mathrm{VP}_e \subseteq \mathrm{VP}$.

We introduce classes defined by abps. Let $k \geq 1$. The class $\mathrm{VP}_k$ consists of all families computed by polynomial-size width-$k$ abps with edges labelled by affine linear forms $\sum_i \alpha_i x_i + \beta$ with coefficients $\alpha_i, \beta \in \mathbb{F}$.

We note that the above classes depend on the choice of the ground field $\mathbb{F}$.

In our paper [BIZ17] we make a distinction between three different types of edge labels for abps. The class $\mathrm{VP}_k$ in this chapter corresponds to the class $\mathrm{VP}^g_k$ in [BIZ17].


7.2.3 The theorem of Ben-Or and Cleve

This subsection is about the relations among $\mathrm{VP}_k$ and $\mathrm{VP}_e$.

Lemma 7.1. $\mathrm{VP}_k \subseteq \mathrm{VP}_\ell$ when $k \leq \ell$.

Proof. This is clearly true.

Lemma 7.2. $\mathrm{VP}_k \subseteq \mathrm{VP}_e$ for any $k$.

Proof. For the simple proof we refer to [BIZ17].

Ben-Or and Cleve [BOC92] showed that for $k \geq 3$ the classes $\mathrm{VP}_k$ and $\mathrm{VP}_e$ are in fact equal.

Theorem 7.3 (Ben-Or and Cleve [BOC92]). For $k \geq 3$, $\mathrm{VP}_k = \mathrm{VP}_e$.

We will review the construction of Ben-Or and Cleve here, because we will use it to prove Theorem 7.8 and Theorem 7.15. The following depth-reduction lemma for formulas by Brent is a crucial ingredient.

Lemma 7.4 (Brent [Bre74]). Let $f$ be an $n$-variate degree-$d$ polynomial computed by a formula of size $s$. Then $f$ can also be computed by a formula of size $\mathrm{poly}(s, n, d)$ and depth $O(\log s)$.

Proof. See the survey of Saptharishi [Sap16, Lemma 5.5] for a modern proof.

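Proof of Theorem 7.3. Lemma 7.2 says $\mathrm{VP}_k \subseteq \mathrm{VP}_e$. We will prove the inclusion $\mathrm{VP}_e \subseteq \mathrm{VP}_3$, from which follows $\mathrm{VP}_e \subseteq \mathrm{VP}_k$ by Lemma 7.1, and thus $\mathrm{VP}_k = \mathrm{VP}_e$. For a polynomial $h$ define the matrix

$$M(h) = \begin{pmatrix} 1 & 0 & 0 \\ h & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

(Figure: the corresponding partial abp, with one edge labelled $h$.)

We call the following matrices primitive:

• $M(h)$ with $h$ any variable or any constant in $\mathbb{F}$

• the $3 \times 3$ permutation matrices, denoted by $M_\pi$ with $\pi \in S_3$

• the diagonal matrices $M_{a,b,c} = \operatorname{diag}(a, b, c)$ with $a, b, c \in \mathbb{F}$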


The entries of the primitives are variables or constants in $\mathbb{F}$, making them suitable to use in the construction of a width-3 abp.

Let $(f_n) \in \mathrm{VP}_e$. Then $f_n$ can be computed by a formula of size $s(n) \in \mathrm{poly}(n)$. By Brent's depth-reduction theorem for formulas (Lemma 7.4), $f_n$ can be computed by a formula of size $\mathrm{poly}(n)$ and depth $d(n) \in O(\log s(n))$.

We will construct a sequence of primitives $A_1, \ldots, A_{m(n)}$ such that

$$A_1 \cdots A_{m(n)} = \begin{pmatrix} 1 & 0 & 0 \\ f_n & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

with $m(n) \in O(4^{d(n)}) = \mathrm{poly}(n)$. Then

$$f_n(x) = \begin{pmatrix} 1 & 1 & 1 \end{pmatrix} M_{-1,1,0}\, A_1 \cdots A_{m(n)} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix},$$

so $f_n(x)$ can be computed by a width-3 abp of length $\mathrm{poly}(n)$, proving the theorem.

To explain the construction, let $h$ be a polynomial and consider a formula computing $h$ of depth $d$. The goal is to construct (recursively on the formula structure) primitives $A_1, \ldots, A_m$ such that

$$A_1 \cdots A_m = \begin{pmatrix} 1 & 0 & 0 \\ h & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

with $m \in O(4^d)$.

Suppose $h$ is a variable or a constant. Then $M(h)$ is itself a primitive matrix.

Suppose $h = f + g$ is a sum of two polynomials $f, g$, and suppose $M(f)$ and $M(g)$ can be written as a product of primitives. Then $M(f + g)$ equals a product of primitives, because $M(f + g) = M(f) M(g)$. This can easily be verified directly, or by noting that in the corresponding partial abps the top-bottom paths ($u_i$-$v_j$ paths) have the same value:

[Diagram: two partial width-3 abps on top vertices $u_1, u_2, u_3$ and bottom vertices $v_1, v_2, v_3$. The abp with consecutive edges labelled $f$ and $g$ is equivalent ($\sim$) to the abp with a single edge labelled $f + g$.]

Suppose $h = fg$ is a product of two polynomials $f, g$, and suppose $M(f)$ and $M(g)$ can be written as a product of primitives. Then $M(fg)$ equals a product of primitives, because

$$M(f \cdot g) = M_{(23)} \bigl( M_{1,-1,1}\, M_{(123)}\, M(g)\, M_{(132)}\, M(f) \bigr)^{2} M_{(23)}$$

(here $(23) \in S_3$ denotes the transposition $1 \mapsto 1$, $2 \mapsto 3$, $3 \mapsto 2$, and $(123) \in S_3$ denotes the cyclic shift $1 \mapsto 2$, $2 \mapsto 3$, $3 \mapsto 1$), as can be verified either directly or by checking that in the corresponding partial abps the top-bottom paths ($u_i$-$v_j$ paths) have the same value:

[Diagram: two partial width-3 abps on top vertices $u_1, u_2, u_3$ and bottom vertices $v_1, v_2, v_3$. The abp with edges labelled $f$, $-1$, $g$, $f$, $g$, $-1$ is equivalent ($\sim$) to the abp with a single edge labelled $f \cdot g$.]

This completes the construction.
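The sum and product identities for $M(h)$ can be checked mechanically. The following Python sketch does so for integer values of $f$ and $g$. It assumes a convention that the text does not spell out, namely that the permutation matrix $M_\pi$ maps the basis vector $e_i$ to $e_{\pi(i)}$; the helper names (`matmul`, `perm`, `times`, `readout`) are illustrative and not taken from the source.

```python
# Sanity check of the width-3 identities with exact integer arithmetic.
# Assumption (not stated in the text): M_pi maps the basis vector e_i to
# e_pi(i), i.e. it has a 1 in row pi(i), column i.

def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def product(mats):
    """Multiply a list of matrices from left to right."""
    out = mats[0]
    for m in mats[1:]:
        out = matmul(out, m)
    return out

def M(h):
    """The matrix M(h) from the proof of Theorem 7.3."""
    return [[1, 0, 0], [h, 1, 0], [0, 0, 1]]

def diag(a, b, c):
    """The diagonal primitive M_{a,b,c}."""
    return [[a, 0, 0], [0, b, 0], [0, 0, c]]

def perm(pi):
    """Permutation matrix for pi, given as a dict on {1, 2, 3}."""
    m = [[0] * 3 for _ in range(3)]
    for i, image in pi.items():
        m[image - 1][i - 1] = 1
    return m

P23 = perm({1: 1, 2: 3, 3: 2})    # transposition (23)
P123 = perm({1: 2, 2: 3, 3: 1})   # cyclic shift (123)
P132 = perm({1: 3, 2: 1, 3: 2})   # inverse cyclic shift (132)

def times(f, g):
    """Right-hand side of the product formula for M(f * g)."""
    inner = product([diag(1, -1, 1), P123, M(g), P132, M(f)])
    return product([P23, inner, inner, P23])

def readout(a):
    """(1 1 1) * M_{-1,1,0} * a * (1 1 1)^T, i.e. the entry sum of M_{-1,1,0} * a."""
    b = matmul(diag(-1, 1, 0), a)
    return sum(b[i][j] for i in range(3) for j in range(3))

assert matmul(M(2), M(3)) == M(5)   # sum case
assert times(2, 3) == M(6)          # product case
assert readout(M(7)) == 7           # reading off the polynomial
```

With the transposed convention for $M_\pi$ the product identity fails in this check, so the convention matters when implementing the construction.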

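The length $m$ of the construction is $m(h) = 1$ for $h$ a variable or constant, and recursively $m(f + g) = m(f) + m(g)$ and $m(f \cdot g) = 2(m(f) + m(g)) + 8$, where the additive constant counts the eight permutation and diagonal primitives in the product formula. Hence $m \in O(4^d)$, where $d$ is the depth of the formula computing $h$.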
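To make the length bound concrete, here is a small Python sketch that counts primitives recursively over a formula. Two things are assumptions of the sketch rather than statements from the source: formulas are encoded as nested tuples, and the count for a product includes the eight permutation and diagonal primitives that the product formula introduces.

```python
# Length of the width-3 construction, computed recursively over a formula.
# Encoding (an assumption of this sketch): a leaf is a string, inner nodes
# are ("+", left, right) or ("*", left, right).

# M_(23) twice, plus M_{1,-1,1}, M_(123) and M_(132) twice each.
PRODUCT_OVERHEAD = 8

def depth(formula):
    """Depth of the formula tree; leaves have depth 0."""
    if isinstance(formula, str):
        return 0
    _, left, right = formula
    return 1 + max(depth(left), depth(right))

def length(formula):
    """Number of primitives produced for the given formula."""
    if isinstance(formula, str):
        return 1
    op, left, right = formula
    if op == "+":
        return length(left) + length(right)
    if op == "*":
        return 2 * (length(left) + length(right)) + PRODUCT_OVERHEAD
    raise ValueError(f"unknown operation {op!r}")

def complete_product_tree(d):
    """Worst case for a given depth: a complete binary tree of products."""
    if d == 0:
        return "x"
    sub = complete_product_tree(d - 1)
    return ("*", sub, sub)

for d in range(8):
    m = length(complete_product_tree(d))
    assert m == (11 * 4**d - 8) // 3   # closed form of m_d = 4 m_{d-1} + 8
    assert m <= 4 * 4**d               # consistent with m in O(4^d)
```

Since Brent's lemma gives depth $O(\log s)$, a bound of the form $4^{d}$ is polynomial in the formula size.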

The above result of Ben-Or and Cleve (Theorem 7.3) raises the intriguing question whether the inclusion $\mathrm{VP}_2 \subseteq \mathrm{VP}_e$ is strict. Allender and Wang [AW16] show that the inclusion is indeed strict; in fact, they show that some polynomials cannot be computed by any width-2 abp.

Theorem 7.5 (Allender and Wang [AW16]). The polynomial

$$x_1 x_2 + x_3 x_4 + \cdots + x_{15} x_{16}$$

cannot be computed by any width-2 abp. Therefore we have the separation of classes $\mathrm{VP}_2 \subsetneq \mathrm{VP}_3 = \mathrm{VP}_e$.


7.2.4 Approximation closure $\overline{\mathcal{C}}$

We define the norm of a complex multivariate polynomial as the sum of the absolute values of its coefficients. This defines a topology on the polynomial ring $\mathbb{C}[x_1, \ldots, x_m]$. Given a complexity measure $L$, say abp size or formula size, there is a natural notion of approximate complexity that is called border complexity. Namely, a polynomial $f \in \mathbb{C}[x]$ has border complexity $L^{\mathrm{top}}$ at most $c$ if there is a sequence of polynomials $g_1, g_2, \ldots$ in $\mathbb{C}[x]$ converging to $f$ such that each $g_i$ satisfies $L(g_i) \le c$. It turns out that for reasonable classes over the field of complex numbers $\mathbb{C}$, this topological notion of approximation is equivalent to what we call algebraic approximation (see e.g. [Bur04]). Namely, a polynomial $f \in \mathbb{C}[x]$ satisfies $L(f)^{\mathrm{alg}} \le c$ iff there are polynomials $f_1, \ldots, f_e \in \mathbb{C}[x]$ such that the polynomial

$$h = f + \varepsilon f_1 + \varepsilon^2 f_2 + \cdots + \varepsilon^e f_e \in \mathbb{C}[\varepsilon, x]$$

has complexity $L_{\mathbb{C}(\varepsilon)}(h) \le c$, where $\varepsilon$ is a formal variable and $L_{\mathbb{C}(\varepsilon)}(h)$ denotes the complexity of $h$ over the field extension $\mathbb{C}(\varepsilon)$. This algebraic notion of approximation makes sense over any base field, and we will use it in the statements and proofs of this chapter.
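As a toy illustration of the shape $h = f + \varepsilon f_1 + \cdots$ (an example chosen for this sketch, not taken from the source), the expression $\bigl((x + \varepsilon y)^2 - x^2\bigr) / (2\varepsilon)$ is defined over the extension field with $\varepsilon$ adjoined and expands to $xy + \tfrac{\varepsilon}{2} y^2$. So it approximates $f = xy$ with $f_1 = y^2/2$ and error degree 1. The Python sketch below checks this with exact rational arithmetic.

```python
from fractions import Fraction

def h(x, y, eps):
    """((x + eps*y)^2 - x^2) / (2*eps), which expands to x*y + (eps/2)*y^2."""
    return ((x + eps * y) ** 2 - x ** 2) / (2 * eps)

def error(x, y, eps):
    """Difference between h and the approximated polynomial f = x*y."""
    return h(x, y, eps) - x * y

x, y = Fraction(3), Fraction(5)
for k in range(1, 6):
    eps = Fraction(1, 10 ** k)
    # The error is exactly the degree-1 term eps * f_1 with f_1 = y^2 / 2,
    # so it vanishes as eps tends to 0, although eps = 0 cannot be plugged in.
    assert error(x, y, eps) == eps * y ** 2 / 2
```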

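Definition 7.6. Let $\mathcal{C}(\mathbb{F})$ be a class over the field $\mathbb{F}$. We define the approximation closure $\overline{\mathcal{C}}(\mathbb{F})$ as follows: a family $(f_n)$ over $\mathbb{F}$ is in $\overline{\mathcal{C}}(\mathbb{F})$ if there are polynomials $f_{n,i}(x) \in \mathbb{F}[x]$ and a function $e \colon \mathbb{N} \to \mathbb{N}$ such that the family $(g_n)$ defined by

$$g_n(x) = f_n(x) + \varepsilon f_{n,1}(x) + \varepsilon^2 f_{n,2}(x) + \cdots + \varepsilon^{e(n)} f_{n,e(n)}(x)$$

is in $\mathcal{C}(\mathbb{F}(\varepsilon))$. We define the poly-approximation closure $\overline{\mathcal{C}}^{\mathrm{poly}}(\mathbb{F})$ similarly, but with the additional requirement that $e(n) \in \mathrm{poly}(n)$. We call $e(n)$ the error degree.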

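7.2.5 Nondeterminism closure $\mathrm{N}(\mathcal{C})$

We introduce the nondeterminism closure for algebraic complexity classes.

Definition 7.7. Let $\mathcal{C}$ be a class. The class $\mathrm{N}(\mathcal{C})$ consists of families $(f_n)$ with the following property: there is a family $(g_n) \in \mathcal{C}$ and $p(n), q(n) \in \mathrm{poly}(n)$ such that

$$f_n(x) = \sum_{b \in \{0,1\}^{p(n)}} g_{q(n)}(b, x),$$

where $x$ and $b$ denote sequences of variables $x_1, x_2, \ldots$ and $b_1, b_2, \ldots, b_{p(n)}$. We say that $f(x)$ is a hypercube sum over $g$ and that $b_1, b_2, \ldots, b_{p(n)}$ are the hypercube variables. For any subscript $x$ we will use the notation $\mathrm{VNP}_x$ to denote $\mathrm{N}(\mathrm{VP}_x)$. We remark that the map $\mathcal{C} \mapsto \mathrm{N}(\mathcal{C})$ trivially satisfies all properties of being a Kuratowski closure operator, i.e. $\mathrm{N}(\emptyset) = \emptyset$, $\mathcal{C} \subseteq \mathrm{N}(\mathcal{C})$, $\mathrm{N}(\mathcal{C} \cup \mathcal{D}) = \mathrm{N}(\mathcal{C}) \cup \mathrm{N}(\mathcal{D})$ and $\mathrm{N}(\mathrm{N}(\mathcal{C})) = \mathrm{N}(\mathcal{C})$.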
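A hypercube sum is easy to state in code. The Python sketch below evaluates $\sum_{b \in \{0,1\}^{p}} g(b, x)$ by brute force for two toy choices of $g$. Both toy functions and all names are assumptions made for illustration; brute-force enumeration is exponential in $p$ and only meant to show what the definition computes.

```python
from itertools import product

def hypercube_sum(g, p, x):
    """Sum g(b, x) over all b in {0,1}^p."""
    return sum(g(b, x) for b in product((0, 1), repeat=p))

def select(b, x):
    """Toy g: equals x[0] when b[0] == 1 and x[1] when b[0] == 0."""
    return b[0] * x[0] + (1 - b[0]) * x[1]

def one_hot_pick(b, x):
    """Toy g: contributes x[i] exactly when b is the i-th unit vector."""
    total = 0
    for i, xi in enumerate(x):
        indicator = 1
        for j, bj in enumerate(b):
            indicator *= bj if j == i else (1 - bj)
        total += xi * indicator
    return total

# Summing over the single bit b[0] yields x[0] + x[1].
assert hypercube_sum(select, 1, (7, 11)) == 18
# Only the three unit vectors contribute, so the sum is x[0] + x[1] + x[2].
assert hypercube_sum(one_hot_pick, 3, (2, 3, 5)) == 10
```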


7.3 Approximation closure of $\mathrm{VP}_2$

We show that every polynomial can be approximated by a width-2 abp. Even better, we show that every polynomial can be approximated by a width-2 abp of size polynomial in the formula size and with error degree polynomial in the formula size. This is the main result of the current chapter.

Theorem 7.8. $\mathrm{VP}_e \subseteq \overline{\mathrm{VP}_2}^{\mathrm{poly}}$ when $\operatorname{char}(\mathbb{F}) \neq 2$.

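Proof. For a polynomial $h$ define the matrix $M(h) = \begin{pmatrix} 1 & 0 \\ h & 1 \end{pmatrix}$. We call the following matrices primitives:

• $M(h)$ with $h$ any variable or constant in $\mathbb{F}$,

• $\begin{pmatrix} \frac{1}{2\varepsilon} & 0 \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} -2\varepsilon & 0 \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} -1 & \varepsilon \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} -1 & -\varepsilon \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$.

The entries of the primitives are variables or constants in the base field $\mathbb{F}(\varepsilon)$, making them suitable to use in a width-2 abp over the base field $\mathbb{F}(\varepsilon)$.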

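Let $(f_n) \in \mathrm{VP}_e$, so $f_n(x)$ can be computed by a formula of size $s(n) \in \mathrm{poly}(n)$. By Brent's depth-reduction theorem for formulas (Lemma 7.4), $f_n$ can be computed by a formula of size $\mathrm{poly}(n)$ and depth $d(n) \in O(\log s(n))$.

We will construct a sequence of primitives $A_1, \ldots, A_{m(n)}$ such that

$$A_1 \cdots A_{m(n)} = \begin{pmatrix} 1 & 0 \\ f_n & 1 \end{pmatrix} + \varepsilon \begin{pmatrix} f_{n,1,11} & f_{n,1,12} \\ f_{n,1,21} & f_{n,1,22} \end{pmatrix} + \varepsilon^2 \begin{pmatrix} f_{n,2,11} & f_{n,2,12} \\ f_{n,2,21} & f_{n,2,22} \end{pmatrix} + \cdots + \varepsilon^{e} \begin{pmatrix} f_{n,e,11} & f_{n,e,12} \\ f_{n,e,21} & f_{n,e,22} \end{pmatrix}$$

for some $f_{n,i,jk} \in \mathbb{F}[x]$, with $m(n), e(n) \in O(8^{d(n)}) = \mathrm{poly}(n)$. Then

$$\begin{pmatrix} 1 & 1 \end{pmatrix} \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} A_1 \cdots A_{m(n)} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = f_n(x) + O(\varepsilon),$$

so $f_n(x)$ can be approximated by a width-2 abp of length $\mathrm{poly}(n)$ and with error degree $\mathrm{poly}(n)$, proving the theorem.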

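We begin with the construction. Let $h$ be a polynomial and consider a formula computing $h$ of depth $d$. The goal is to construct, recursively on the tree structure of the formula, a sequence of primitives $A_1, \ldots, A_m$ such that for some $h_{i,jk} \in \mathbb{F}[x]$

$$A_1 \cdots A_m = \begin{pmatrix} 1 & 0 \\ h & 1 \end{pmatrix} + \varepsilon \begin{pmatrix} 0 & 0 \\ h_{1,21} & 0 \end{pmatrix} + \varepsilon^2 \begin{pmatrix} h_{2,11} & h_{2,12} \\ h_{2,21} & h_{2,22} \end{pmatrix} + \cdots + \varepsilon^{e} \begin{pmatrix} h_{e,11} & h_{e,12} \\ h_{e,21} & h_{e,22} \end{pmatrix} \tag{7.3}$$

with $m, e \in O(8^d)$. Notice the particular first-degree error pattern in (7.3), which our recursion will rely on.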

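Suppose $h$ is a variable or a constant. Then $M(h)$ is itself a primitive satisfying (7.3).

Suppose $h = f + g$ is a sum of two polynomials $f, g$, and suppose that

$$F = \begin{pmatrix} 1 & 0 \\ f & 1 \end{pmatrix} + \varepsilon \begin{pmatrix} 0 & 0 \\ f' & 0 \end{pmatrix} + O(\varepsilon^2), \tag{7.4}$$

$$G = \begin{pmatrix} 1 & 0 \\ g & 1 \end{pmatrix} + \varepsilon \begin{pmatrix} 0 & 0 \\ g' & 0 \end{pmatrix} + O(\varepsilon^2) \tag{7.5}$$

are products of primitives for some $f', g' \in \mathbb{F}[x]$. Then

$$G \cdot F = \begin{pmatrix} 1 & 0 \\ f + g & 1 \end{pmatrix} + \varepsilon \begin{pmatrix} 0 & 0 \\ f' + g' & 0 \end{pmatrix} + O(\varepsilon^2)$$

is a product of primitives satisfying (7.3).

Suppose $h = fg$ is a product of two polynomials, and suppose that $F$ and $G$ are of the form (7.4) and (7.5) and are products of primitives. We will construct $M((f + g)^2)$, $M(-f^2)$, $M(-g^2)$ approximately, in such a way that, when we use the identity $(f + g)^2 - f^2 - g^2 = 2fg$, the error terms cancel properly. Define the expressions $\mathrm{sq}_+(A)$ and $\mathrm{sq}_-(A)$ by

$$\mathrm{sq}_\pm(A) = \begin{pmatrix} -\varepsilon & 0 \\ 0 & 1 \end{pmatrix} \cdot A \cdot \begin{pmatrix} -1 & \pm\varepsilon \\ 0 & 1 \end{pmatrix} \cdot A \cdot \begin{pmatrix} \frac{1}{\varepsilon} & 0 \\ 0 & 1 \end{pmatrix}.$$

Then

$$\mathrm{sq}_\pm(F) = \begin{pmatrix} 1 \mp \varepsilon f & 0 \\ \pm f^2 + O(\varepsilon) & 1 \pm \varepsilon f \end{pmatrix} + O(\varepsilon^2).$$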

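We have

$$\mathrm{sq}_-(G) \cdot \mathrm{sq}_-(F) \cdot \mathrm{sq}_+(G \cdot F) = \begin{pmatrix} 1 + \varepsilon g & 0 \\ -g^2 + O(\varepsilon) & 1 - \varepsilon g \end{pmatrix} \cdot \begin{pmatrix} 1 + \varepsilon f & 0 \\ -f^2 + O(\varepsilon) & 1 - \varepsilon f \end{pmatrix} \cdot \begin{pmatrix} 1 - \varepsilon (f + g) & 0 \\ (f + g)^2 + O(\varepsilon) & 1 + \varepsilon (f + g) \end{pmatrix} + O(\varepsilon^2),$$

which simplifies to

$$\mathrm{sq}_-(G) \cdot \mathrm{sq}_-(F) \cdot \mathrm{sq}_+(G \cdot F) = \begin{pmatrix} 1 & 0 \\ 2fg + O(\varepsilon) & 1 \end{pmatrix} + O(\varepsilon^2).$$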

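We conclude

$$\begin{aligned}
&\begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix} \cdot \mathrm{sq}_-(G) \cdot \mathrm{sq}_-(F) \cdot \mathrm{sq}_+(G \cdot F) \cdot \begin{pmatrix} \frac{1}{2} & 0 \\ 0 & 1 \end{pmatrix} \\
&\quad = \begin{pmatrix} -2\varepsilon & 0 \\ 0 & 1 \end{pmatrix} \cdot G \cdot \begin{pmatrix} -1 & -\varepsilon \\ 0 & 1 \end{pmatrix} \cdot G \cdot \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \cdot F \cdot \begin{pmatrix} -1 & -\varepsilon \\ 0 & 1 \end{pmatrix} \cdot F \\
&\qquad \cdot \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \cdot G \cdot F \cdot \begin{pmatrix} -1 & \varepsilon \\ 0 & 1 \end{pmatrix} \cdot G \cdot F \cdot \begin{pmatrix} \frac{1}{2\varepsilon} & 0 \\ 0 & 1 \end{pmatrix} \\
&\quad = \begin{pmatrix} 1 & 0 \\ fg + O(\varepsilon) & 1 \end{pmatrix} + O(\varepsilon^2).
\end{aligned}$$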

This completes the construction.

The length $m$ of the construction is $m(h) = 1$ for $h$ a variable or constant, and recursively $m(f + g) = m(f) + m(g)$ and $m(f \cdot g) = 4(m(f) + m(g)) + 7$. We conclude $m \in O(8^d)$. The error degree $e$ of the construction satisfies the same recursion, so $e \in O(8^d)$.
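The product step of the width-2 construction can be checked numerically. The Python sketch below builds the chain of primitives for $h = fg$ in the simplest case, where $F = M(f)$ and $G = M(g)$ are single primitives, and multiplies it out with exact rational arithmetic for a small concrete value of $\varepsilon$. The helper names and the concrete values ($f = 2$, $g = 3$, $\varepsilon = 10^{-6}$) are choices of this sketch; a numeric check of this kind supports the identity but is not a proof.

```python
from fractions import Fraction

def matmul(a, b):
    """Multiply two 2x2 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def M(h):
    """The 2x2 matrix M(h) from the proof of Theorem 7.8."""
    return [[1, 0], [h, 1]]

def product_chain(F, G, eps):
    """Flat list of primitives whose product approximates M(f*g).

    F and G are lists of primitives whose products approximate M(f), M(g).
    """
    d_first = [[-2 * eps, 0], [0, 1]]
    d_last = [[1 / (2 * eps), 0], [0, 1]]
    neg = [[-1, 0], [0, 1]]
    u_plus = [[-1, eps], [0, 1]]
    u_minus = [[-1, -eps], [0, 1]]
    GF = G + F  # list concatenation: the primitives of G followed by those of F
    return ([d_first] + G + [u_minus] + G + [neg] + F + [u_minus] + F
            + [neg] + GF + [u_plus] + GF + [d_last])

def multiply(chain):
    """Multiply a list of 2x2 matrices from left to right."""
    out = [[1, 0], [0, 1]]
    for m in chain:
        out = matmul(out, m)
    return out

def readout(a):
    """(1 1) * diag(-1, 1) * a * (1 1)^T."""
    return -(a[0][0] + a[0][1]) + (a[1][0] + a[1][1])

f, g = Fraction(2), Fraction(3)
eps = Fraction(1, 10 ** 6)
chain = product_chain([M(f)], [M(g)], eps)
result = multiply(chain)

assert len(chain) == 4 * (1 + 1) + 7          # matches m(f*g) = 4(m(f)+m(g)) + 7
assert abs(result[1][0] - f * g) < 100 * eps  # lower-left entry is f*g + O(eps)
assert abs(result[0][0] - 1) < 100 * eps**2   # remaining entries are exact up to O(eps^2)
assert abs(result[0][1]) < 100 * eps**2
assert abs(result[1][1] - 1) < 100 * eps**2
assert abs(readout(result) - f * g) < 100 * eps
```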

Remark 7.9. The construction in the above proof of Theorem 7.8 is different from the construction in our paper [BIZ17]. The recursion in the above proof is simpler, while the construction in [BIZ17] has a better error degree and has a special form which relates it to a family of polynomials called continuants.

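Corollary 7.10. $\overline{\mathrm{VP}_2} = \overline{\mathrm{VP}_e}$ and $\overline{\mathrm{VP}_2}^{\mathrm{poly}} = \overline{\mathrm{VP}_e}^{\mathrm{poly}}$ when $\operatorname{char}(\mathbb{F}) \neq 2$.

Proof. We have $\mathrm{VP}_2 \subseteq \mathrm{VP}_e$ by Lemma 7.2. Taking closures on both sides, we obtain $\overline{\mathrm{VP}_2} \subseteq \overline{\mathrm{VP}_e}$ and $\overline{\mathrm{VP}_2}^{\mathrm{poly}} \subseteq \overline{\mathrm{VP}_e}^{\mathrm{poly}}$. When $\operatorname{char}(\mathbb{F}) \neq 2$, we have $\mathrm{VP}_e \subseteq \overline{\mathrm{VP}_2}^{\mathrm{poly}}$ (Theorem 7.8). By taking closures, it follows that $\overline{\mathrm{VP}_e} \subseteq \overline{\mathrm{VP}_2}$ and $\overline{\mathrm{VP}_e}^{\mathrm{poly}} \subseteq \overline{\mathrm{VP}_2}^{\mathrm{poly}}$.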

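Corollary 7.11. $\overline{\mathrm{VP}_2}^{\mathrm{poly}} = \mathrm{VP}_e$ when $\operatorname{char}(\mathbb{F}) \neq 2$ and $\mathbb{F}$ is infinite.

Proof. By Corollary 7.10, $\overline{\mathrm{VP}_2}^{\mathrm{poly}} = \overline{\mathrm{VP}_e}^{\mathrm{poly}}$. We prove $\overline{\mathrm{VP}_e}^{\mathrm{poly}} = \mathrm{VP}_e$ in Lemma 7.12 below.

Lemma 7.12. $\overline{\mathrm{VP}_e}^{\mathrm{poly}} = \mathrm{VP}_e$ when $\operatorname{char}(\mathbb{F}) \neq 2$ and $\mathbb{F}$ is infinite.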

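Proof. The inclusion $\mathrm{VP}_e \subseteq \overline{\mathrm{VP}_e}^{\mathrm{poly}}$ is trivially true. We prove the other direction. Let $(f_n) \in \overline{\mathrm{VP}_e}^{\mathrm{poly}}$. Then there are polynomials $f_{n,i}(x) \in \mathbb{F}[x]$ and $e(n) \in \mathrm{poly}(n)$ such that

$$f_n(x) + \varepsilon f_{n,1}(x) + \varepsilon^2 f_{n,2}(x) + \cdots + \varepsilon^{e(n)} f_{n,e(n)}(x)$$

is computed by a poly-size formula $\Gamma$ over $\mathbb{F}(\varepsilon)$. Let $\alpha_0, \alpha_1, \ldots, \alpha_{e(n)}$ be distinct elements in $\mathbb{F}$ such that replacing $\varepsilon$ by $\alpha_j$ in $\Gamma$ is a valid substitution, i.e. not causing division by zero. These $\alpha_j$ exist since our field is infinite by assumption. View

$$g_n(\varepsilon) = f_n(x) + \varepsilon f_{n,1}(x) + \varepsilon^2 f_{n,2}(x) + \cdots + \varepsilon^{e(n)} f_{n,e(n)}(x)$$

as a polynomial in $\varepsilon$. The polynomial $g_n(\varepsilon)$ has degree at most $e(n)$, so we can write $g_n(\varepsilon)$ as follows (Lagrange interpolation on $e(n) + 1$ points):

$$g_n(\varepsilon) = \sum_{j=0}^{e(n)} g_n(\alpha_j) \prod_{\substack{0 \le m \le e(n) \\ m \neq j}} \frac{\varepsilon - \alpha_m}{\alpha_j - \alpha_m}. \tag{7.6}$$

Clearly $f_n(x) = g_n(0)$. However, replacing $\varepsilon$ by $0$ in $\Gamma$ is not a valid substitution in general. From (7.6) we see directly how to write $g_n(0)$ as a linear combination of the values $g_n(\alpha_j)$, namely

$$g_n(0) = \sum_{j=0}^{e(n)} g_n(\alpha_j) \prod_{\substack{0 \le m \le e(n) \\ m \neq j}} \frac{-\alpha_m}{\alpha_j - \alpha_m},$$

that is,

$$g_n(0) = \sum_{j=0}^{e(n)} \beta_j\, g_n(\alpha_j) \quad \text{with} \quad \beta_j = \prod_{\substack{0 \le m \le e(n) \\ m \neq j}} \frac{\alpha_m}{\alpha_m - \alpha_j}.$$

The value $g_n(\alpha_j)$ is computed by the formula $\Gamma$ with $\varepsilon$ replaced by $\alpha_j$, which we denote by $\Gamma|_{\varepsilon = \alpha_j}$. Thus $f_n(x)$ is computed by the poly-size formula $\sum_{j=0}^{e(n)} \beta_j\, \Gamma|_{\varepsilon = \alpha_j}$. We conclude $(f_n) \in \mathrm{VP}_e$.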
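The interpolation step can be illustrated with a short Python sketch using exact rational arithmetic. The function `g` below is a hypothetical stand-in for $g_n$, written with a division by $\varepsilon$ so that $\varepsilon = 0$ is not a valid substitution, mirroring the situation in the proof. The interpolation points are likewise chosen for illustration.

```python
from fractions import Fraction

def interpolation_weights(alphas):
    """beta_j = prod over m != j of alpha_m / (alpha_m - alpha_j)."""
    betas = []
    for j, aj in enumerate(alphas):
        beta = Fraction(1)
        for m, am in enumerate(alphas):
            if m != j:
                beta *= Fraction(am) / (Fraction(am) - Fraction(aj))
        betas.append(beta)
    return betas

def value_at_zero(g, alphas):
    """Recover g(0) from the values g(alpha_j), valid when deg g < len(alphas)."""
    betas = interpolation_weights(alphas)
    return sum(beta * g(alpha) for beta, alpha in zip(betas, alphas))

def g(eps):
    """Toy stand-in for g_n: equals 7 + 2*eps - 5*eps^2 + eps^3 for eps != 0.

    The division by eps makes eps = 0 an invalid substitution.
    """
    eps = Fraction(eps)
    return (7 * eps + 2 * eps**2 - 5 * eps**3 + eps**4) / eps

# Degree 3 in eps, so e + 1 = 4 distinct nonzero points suffice.
alphas = [1, 2, 3, 4]
assert value_at_zero(g, alphas) == 7
```

Because the weights $\beta_j$ are field constants, the resulting expression is a linear combination of $e(n) + 1$ copies of the formula, which keeps the size polynomial.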

Remark 7.13. The statement of Lemma 7.12 also holds with $\mathrm{VP}_e$ replaced with $\mathrm{VP}_s$ or with $\mathrm{VP}$, by a similar proof.

7.4 Nondeterminism closure of $\mathrm{VP}_1$

Recall the definition of $\mathrm{VNP}_x = \mathrm{N}(\mathrm{VP}_x)$ from Definition 7.7. Valiant proved the following characterisation of $\mathrm{VNP}$ in his seminal work [Val80]. See also [BCS97, Thm. 21.26], [Bur00, Thm. 2.13] and [MP08, Thm. 2].

Theorem 7.14 (Valiant [Val80]). $\mathrm{VNP}_e = \mathrm{VNP}$.

We strengthen Valiant's characterisation of $\mathrm{VNP}$ from $\mathrm{VNP}_e$ to $\mathrm{VNP}_1$.

Theorem 7.15. $\mathrm{VNP}_1 = \mathrm{VNP}$ when $\operatorname{char}(\mathbb{F}) \neq 2$.

The idea of the proof is "to simulate in $\mathrm{VNP}_1$" the primitives that we used in the proof of $\mathrm{VP}_e \subseteq \mathrm{VP}_3$ (Theorem 7.3).

Proof of Theorem 7.15. Clearly $\mathrm{VNP}_1 \subseteq \mathrm{VNP}$ by Lemma 7.2 and taking the nondeterminism closure $\mathrm{N}$. We will prove that $\mathrm{VNP} \subseteq \mathrm{VNP}_1$. Recall that in the proof of $\mathrm{VP}_e \subseteq \mathrm{VP}_3$ (Theorem 7.3) we defined for any polynomial $h$ the matrix

$$M(h) = \begin{pmatrix} 1 & 0 & 0 \\ h & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

and we called the following matrices primitives:

• $M(h)$ with $h$ any variable or any constant in $\mathbb{F}$,

• the $3 \times 3$ permutation matrices, denoted by $M_\pi$ for $\pi \in S_3$,

• the diagonal matrices $M_{a,b,c} = \mathrm{diag}(a, b, c)$ with $a, b, c \in \mathbb{F}$.

In the proof of $\mathrm{VP}_e \subseteq \mathrm{VP}_3$ we constructed, for any family $(f_n) \in \mathrm{VP}_e$, a sequence of primitive matrices $A_{n,1}, \ldots, A_{n,t(n)}$ with $t(n) \in \mathrm{poly}(n)$ such that

$$f_n(x) = \begin{pmatrix} 1 & 1 & 1 \end{pmatrix} M_{-1,1,0}\, A_{n,1} \cdots A_{n,t(n)} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}. \tag{7.7}$$

We will show $\mathrm{VP}_e \subseteq \mathrm{VNP}_1$ by constructing a hypercube sum over a width-1 abp that evaluates the right-hand side of (7.7). This implies $\mathrm{VNP}_e \subseteq \mathrm{VNP}_1$ by taking the $\mathrm{N}$-closure. Then, by Valiant's Theorem 7.14, $\mathrm{VNP} \subseteq \mathrm{VNP}_1$.

Let $f(x)$ be a polynomial and let $A_1, \ldots, A_k$ be primitive matrices such that $f(x)$ is computed as

$$f(x) = \begin{pmatrix} 1 & 1 & 1 \end{pmatrix} A_k \cdots A_1 \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}.$$

View this expression as a width-3 abp $G$ with vertex layers labeled as shown in the left-hand diagram in Fig. 7.1. Assume for simplicity that all edges between layers are present, possibly with label 0. The sum of the values of every $s$-$t$ path in $G$ equals $f(x)$:

$$f(x) = \sum_{j \in [3]^k} A_k[j_k, j_{k-1}] \cdots A_1[j_2, j_1]. \tag{7.8}$$

[Figure 7.1, left: a width-3 abp with source $s$, vertex layers labelled $0, 1, 2, \ldots, k-1, k$, sink $t$, and the matrices $A_1, A_2, \ldots, A_k$ between consecutive layers. Right: an $s$-$t$ path together with the corresponding bit assignment, one row of three bits per layer: (1 0 0), (0 1 0), (0 1 0), (0 0 1), (0 1 0).]

Figure 7.1: Illustration of the layer labelling and the path labelling used in the proof of Theorem 7.15.

We introduce some hypercube variables. To every vertex of $G$ except $s$ and $t$ we associate a bit; the bits in the $i$th layer we call $b_1[i], b_2[i], b_3[i]$. To an $s$-$t$ path in $G$ we associate an assignment of the $b_j[i]$ by setting the bits of vertices visited by the path to 1 and the others to 0. For example, in the right-hand diagram in Fig. 7.1 we show an $s$-$t$ path with the corresponding assignment of the bits $b_j[i]$. The assignments of the $b_j[i]$ corresponding to $s$-$t$ paths are precisely the assignments such that for every $i \in [k]$ exactly one of $b_1[i], b_2[i], b_3[i]$ equals 1. Let

$$V(b_1, b_2, b_3) = \prod_{i \in [k]} \bigl( b_1[i] + b_2[i] + b_3[i] \bigr) \prod_{\substack{s, t \in [3] \\ s \neq t}} \bigl( 1 - b_s[i]\, b_t[i] \bigr). \tag{7.9}$$

Then the assignments of the $b_j[i]$ corresponding to $s$-$t$ paths are precisely the assignments such that $V(b_1, b_2, b_3) = 1$. Otherwise $V(b_1, b_2, b_3) = 0$.
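That $V$ is the indicator of valid path assignments can be confirmed by brute force over all bit triples. The Python sketch below does this per layer; representing the bits as a list of triples, one per layer, is an assumption made for the example.

```python
from itertools import product

def layer_indicator(b1, b2, b3):
    """One factor of V: (b1 + b2 + b3) * prod over s != t of (1 - b_s * b_t)."""
    bits = (b1, b2, b3)
    value = b1 + b2 + b3
    for s in range(3):
        for t in range(3):
            if s != t:
                value *= 1 - bits[s] * bits[t]
    return value

def V(layers):
    """V for a list of layers, each given as a triple of bits."""
    value = 1
    for bits in layers:
        value *= layer_indicator(*bits)
    return value

# Per layer, the factor is 1 exactly on the three one-hot triples.
one_hot = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}
for bits in product((0, 1), repeat=3):
    assert layer_indicator(*bits) == (1 if bits in one_hot else 0)

# The path assignment shown in Fig. 7.1 is valid.
assert V([(1, 0, 0), (0, 1, 0), (0, 1, 0), (0, 0, 1), (0, 1, 0)]) == 1
```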

We will write $f(x)$ as a hypercube sum by replacing each $A_i[j_i, j_{i-1}]$ in (7.8) by a product of affine linear forms $S_i(A_i)$ with variables $b$ and $x$:

$$\sum_{b} V(b_1, b_2, b_3)\, S_k(A_k) \cdots S_1(A_1).$$

Define the expression $\mathrm{Eq}(\alpha, \beta) = (1 - \alpha - \beta)(1 - \alpha - \beta)$ for $\alpha, \beta \in \{0, 1\}$. The expression $\mathrm{Eq}(\alpha, \beta)$ evaluates to 1 if $\alpha$ equals $\beta$ and evaluates to 0 otherwise.

• For any variable or constant $x$ define

$$S_i(M(x)) = \bigl( 1 + (x - 1)(b_1[i] - b_1[i-1]) \bigr) \cdot \bigl( 1 - (1 - b_2[i])\, b_2[i-1] \bigr) \cdot \mathrm{Eq}\bigl( b_3[i-1], b_3[i] \bigr).$$

• For any permutation $\pi \in S_3$ define

$$S_i(M_\pi) = \mathrm{Eq}\bigl( b_1[i-1], b_{\pi(1)}[i] \bigr) \cdot \mathrm{Eq}\bigl( b_2[i-1], b_{\pi(2)}[i] \bigr) \cdot \mathrm{Eq}\bigl( b_3[i-1], b_{\pi(3)}[i] \bigr).$$

• For any constants $a, b, c \in \mathbb{F}$ define

$$S_i(M_{a,b,c}) = \bigl( a \cdot b_1[i-1] + b \cdot b_2[i-1] + c \cdot b_3[i-1] \bigr) \cdot \mathrm{Eq}\bigl( b_1[i-1], b_1[i] \bigr) \cdot \mathrm{Eq}\bigl( b_2[i-1], b_2[i] \bigr) \cdot \mathrm{Eq}\bigl( b_3[i-1], b_3[i] \bigr).$$

One verifies that

$$f(x) = \sum_{b} V(b_1, b_2, b_3)\, S_k(A_k) \cdots S_1(A_1).$$

Some of the factors in the expressions for the $S_i(A_i)$ are not affine linear. As a final step we apply the equality $1 + xy = \frac{1}{2} \sum_{c \in \{0,1\}} (x + 1 - 2c)(y + 1 - 2c)$ to write these factors as products of affine linear forms, introducing new hypercube variables.
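Both small identities used in this proof, the behaviour of $\mathrm{Eq}$ on bits and the linearisation of $1 + xy$, can be checked directly. The Python sketch below does so on a small grid of integer values; the function names are illustrative.

```python
from fractions import Fraction
from itertools import product

def Eq(alpha, beta):
    """Eq(alpha, beta) = (1 - alpha - beta)^2 for bits alpha, beta."""
    return (1 - alpha - beta) * (1 - alpha - beta)

def linearized(x, y):
    """(1/2) * sum over c in {0, 1} of (x + 1 - 2c) * (y + 1 - 2c)."""
    return Fraction(1, 2) * sum((x + 1 - 2 * c) * (y + 1 - 2 * c) for c in (0, 1))

# Eq is the equality indicator on bits.
for alpha, beta in product((0, 1), repeat=2):
    assert Eq(alpha, beta) == (1 if alpha == beta else 0)

# The hypercube sum over the new bit c reproduces 1 + x*y.
for x, y in product(range(-3, 4), repeat=2):
    assert linearized(x, y) == 1 + x * y
```

The factor $\frac{1}{2}$ in the linearisation is one place where the assumption $\operatorname{char}(\mathbb{F}) \neq 2$ enters.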

7.5 Conclusion

We finish with an overview of inclusions, equalities and separations among the classes $\mathrm{VP}_k$, $\mathrm{VP}_e$, $\mathrm{VP}$ and their approximation and nondeterminism closures (when $\operatorname{char}(\mathbb{F}) \neq 2$), see Fig. 7.2. The figure relies on the following two simple lemmas, of which proofs can be found in our paper [BIZ17].

Lemma 7.16 ([BIZ17, Prop. 5.10]). $\overline{\mathrm{VP}_1} = \mathrm{VP}_1$.

Lemma 7.17 ([BIZ17, Prop. 5.11]). $\mathrm{VP}_1 \subsetneq \mathrm{VNP}_1$ when $\operatorname{char}(\mathbb{F}) \neq 2$.

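[Figure 7.2: diagram with three rows of classes, $\mathrm{VP}_1, \mathrm{VP}_2, \mathrm{VP}_e, \mathrm{VP}$, their approximation closures $\overline{\mathrm{VP}_1}, \overline{\mathrm{VP}_2}, \overline{\mathrm{VP}_e}, \overline{\mathrm{VP}}$, and $\mathrm{VNP}_1, \mathrm{VNP}_2, \mathrm{VNP}_e, \mathrm{VNP}$, connected by the relations $=$, $\subseteq$ and $\subsetneq$, with references to [AW16], [Val79], [Val80], Lemma 7.16, Lemma 7.17, Corollary 7.10 and Theorem 7.15.]

Figure 7.2: Overview of relations among the algebraic complexity classes $\mathrm{VP}_k$, $\mathrm{VP}_e$, $\mathrm{VP}$ and their approximation and nondeterminism closures (when $\operatorname{char}(\mathbb{F})$ is not 2). The relations without reference are either by definition or follow logically from the other relations.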

Bibliography

[AJRS13] Elizabeth S Allman Peter D Jarvis John A Rhodes andJeremy G Sumner Tensor rank invariants inequalities andapplications SIAM J Matrix Anal Appl 34(3)1014ndash1045 2013doi101137120899066 p 14

[Alo98] Noga Alon The Shannon capacity of a union Combinatorica18(3)301ndash310 1998 doi101007PL00009824 p 37

[ASU13] Noga Alon Amir Shpilka and Christopher Umans On sunflowersand matrix multiplication Comput Complexity 22(2)219ndash243Jun 2013 doi101007s00037-013-0060-1 p 48

[AW16] Eric Allender and Fengming Wang On the power of algebraicbranching programs of width two Comput Complexity25(1)217ndash253 2016 doi101007s00037-015-0114-7 p 17109 114 123

[AZ14] Martin Aigner and Gunter M Ziegler Proofs from The BookSpringer-Verlag Berlin fifth edition 2014doi101007978-3-662-44205-0 p 71

[BC18] Boris Bukh and Christopher Cox On a fractional version ofHaemersrsquo bound arXiv 2018 arXiv180200476 p 41 42

[BCC+17] Jonah Blasiak Thomas Church Henry Cohn Joshua A GrochowEric Naslund William F Sawin and Chris Umans On cap setsand the group-theoretic approach to matrix multiplication DiscreteAnal 2017 arXiv160506702 doi1019086da1245 p 4883 84 104


[BCPZ16] Harry Buhrman Matthias Christandl Christopher Perry andJeroen Zuiddam Clean quantum and classical communicationprotocols Phys Rev Lett 117230503 Dec 2016doi101103PhysRevLett117230503 p 1

[BCRL79] Dario Bini Milvio Capovani Francesco Romani and Grazia LottiO(n27799) complexity for ntimes n approximate matrix multiplicationInf Process Lett 8(5)234ndash235 1979doi1010160020-0190(79)90113-3 p 3 110

[BCS97] Peter Burgisser Michael Clausen and M Amin ShokrollahiAlgebraic complexity theory volume 315 of Grundlehren MathWiss Springer-Verlag Berlin 1997doi101007978-3-662-03338-8 p 4 6 48 50 66 79 119

[BCSX10] Arnab Bhattacharyya Victor Chen Madhu Sudan and Ning XieTesting Linear-Invariant Non-linear Properties A Short Reportpages 260ndash268 Springer Berlin Heidelberg Berlin Heidelberg2010 doi101007978-3-642-16367-8_18 p 48

[BCZ17a] Markus Blaser Matthias Christandl and Jeroen Zuiddam Theborder support rank of two-by-two matrix multiplication is sevenarXiv 2017 arXiv170509652 p 1 15

[BCZ17b] Harry Buhrman Matthias Christandl and Jeroen ZuiddamNondeterministic Quantum Communication Complexity the CyclicEquality Game and Iterated Matrix Multiplication In Christos HPapadimitriou editor 8th Innovations in Theoretical ComputerScience Conference (ITCS 2017) pages 241ndash2418 2017arXiv160303757 doi104230LIPIcsITCS201724 p 115

[Ber84] Stuart J Berkowitz On computing the determinant in smallparallel time using a small number of processors Inform ProcessLett 18(3)147ndash150 1984 doi1010160020-0190(84)90018-8p 108

[BI13] Peter Burgisser and Christian Ikenmeyer Explicit lower bounds viageometric complexity theory Proceedings 45th Annual ACMSymposium on Theory of Computing 2013 pages 141ndash150 2013doi10114524886082488627 p 108

[Bin80] Dario Bini Relations between exact and approximate bilinearalgorithms Applications Calcolo 17(1)87ndash97 1980doi101007BF02575865 p 3


[BIZ17] Karl Bringmann Christian Ikenmeyer and Jeroen Zuiddam OnAlgebraic Branching Programs of Small Width In Ryan OrsquoDonnelleditor 32nd Computational Complexity Conference (CCC 2017)pages 201ndash2031 2017 doi104230LIPIcsCCC201720 p 1107 111 112 118 122

[Bla13] Anna Blasiak A graph-theoretic approach to network coding PhDthesis Cornell University 2013 URL httpsecommonscornelledubitstreamhandle181334147ab675pdf p 42

[BLMW11] Peter Burgisser Joseph M Landsberg Laurent Manivel and JerzyWeyman An overview of mathematical issues arising in thegeometric complexity theory approach to VP 6= VNP SIAM JComput 40(4)1179ndash1209 2011 doi101137090765328 p 108

[BOC92] Michael Ben-Or and Richard Cleve Computing algebraic formulasusing a constant number of registers SIAM J Comput21(1)54ndash58 1992 doi1011370221006 p 17 109 112

[BPR+00] Charles H Bennett Sandu Popescu Daniel Rohrlich John ASmolin and Ashish V Thapliyal Exact and asymptotic measuresof multipartite pure-state entanglement Phys Rev A63(1)012307 2000 doi101103PhysRevA63012307 p 48

[Bre74] Richard P Brent The parallel evaluation of general arithmeticexpressions J ACM 21(2)201ndash206 April 1974doi101145321812321815 p 112

[Bri87] Michel Brion Sur lrsquoimage de lrsquoapplication moment In Seminairedrsquoalgebre Paul Dubreil et Marie-Paule Malliavin (Paris 1986)volume 1296 of Lecture Notes in Math pages 177ndash192 SpringerBerlin 1987 doi101007BFb0078526 p 9 93 94

[BS83] Eberhard Becker and Niels and Schwartz Zum Darstellungssatzvon Kadison-Dubois Arch Math (Basel) 40(5)421ndash428 1983doi101007BF01192806 p 7 12 33

[Bur90] Peter Burgisser Degenerationsordnung und Tragerfunktionalbilinearer Abbildungen PhD thesis Universitat Konstanz 1990httpnbn-resolvingdeurnnbndebsz352-opus-20311p 57 101

[Bur00] Peter Burgisser Completeness and reduction in algebraiccomplexity theory volume 7 of Algorithms and Computation inMathematics Springer-Verlag Berlin 2000doi101007978-3-662-04179-6 p 119


[Bur04] Peter Burgisser The complexity of factors of multivariatepolynomials Found Comput Math 4(4)369ndash396 2004doi101007s10208-002-0059-5 p 110 115

[BX15] Arnab Bhattacharyya and Ning Xie Lower bounds for testingtriangle-freeness in boolean functions Comput Complexity24(1)65ndash101 2015 doi101007s00037-014-0092-1 p 48

[BZ17] Jop Briet and Jeroen Zuiddam On the orthogonal rank of Cayleygraphs and impossibility of quantum round elimination QuantumInf Comput 17(1amp2) 2017 URL httpwwwrintonpresscomxxqic17qic-17-120106-0116pdfarXiv160806113 p 2

[CHM07] Matthias Christandl Aram W Harrow and Graeme MitchisonNonzero Kronecker coefficients and what they tell us about spectraComm Math Phys 270(3)575ndash585 2007doi101007s00220-006-0157-3 p 90

[CJZ18] Matthias Christandl Asger Kjaeligrulff Jensen and Jeroen ZuiddamTensor rank is not multiplicative under the tensor product LinearAlgebra Appl 543125ndash139 2018doi101016jlaa201712020 p 2 15

[CKSV16] Suryajith Chillara Mrinal Kumar Ramprasad Saptharishi andV Vinay The chasm at depth four and tensor rank Old resultsnew insights arXiv 2016 arXiv160604200 p 15

[CLP17] Ernie Croot Vsevolod F Lev and Peter Pal Pach Progression-freesets in Zn

4 are exponentially small Ann of Math (2)185(1)331ndash337 2017 doi104007annals201718517 p 4881

[CM06] Matthias Christandl and Graeme Mitchison The spectra ofquantum states and the Kronecker coefficients of the symmetricgroup Comm Math Phys 261(3)789ndash797 2006doi101007s00220-005-1435-1 p 91

[CMR+14] Toby Cubitt Laura Mancinska David E Roberson SimoneSeverini Dan Stahlke and Andreas Winter Bounds onentanglement-assisted source-channel coding via the Lovasz thetanumber and its variants IEEE Trans Inform Theory60(11)7330ndash7344 2014 arXiv13107120doi101109TIT20142349502 p 42


[CT12] Thomas M Cover and Joy A Thomas Elements of informationtheory John Wiley amp Sons 2012 p 60

[CU13] Henry Cohn and Christopher Umans Fast matrix multiplicationusing coherent configurations In Proceedings of the Twenty-FourthAnnual ACM-SIAM Symposium on Discrete Algorithms pages1074ndash1086 SIAM 2013 p 15

[CVZ16] Matthias Christandl Peter Vrana and Jeroen ZuiddamAsymptotic tensor rank of graph tensors beyond matrixmultiplication arXiv 2016 arXiv160907476 p 2 65 67 7985

[CVZ18] Matthias Christandl Peter Vrana and Jeroen Zuiddam Universalpoints in the asymptotic spectrum of tensors In Proceedings of 50thAnnual ACM SIGACT Symposium on the Theory of Computing(STOCrsquo18) ACM New York 2018 arXiv170907851doi10114531887453188766 p 2 47 65 87 88 96 103 105

[CW82] Don Coppersmith and Shmuel Winograd On the asymptoticcomplexity of matrix multiplication SIAM J Comput11(3)472ndash492 1982 doi1011370211038 p 3

[CW87] Don Coppersmith and Shmuel Winograd Matrix multiplication viaarithmetic progressions In Proceedings of the nineteenth annualACM symposium on Theory of computing pages 1ndash6 ACM 1987p 3

[CW90] Don Coppersmith and Shmuel Winograd Matrix multiplication viaarithmetic progressions J Symbolic Comput 9(3)251ndash280 1990doi101016S0747-7171(08)80013-2 p 4 6 8 10 48 67

[CZ18] Matthias Christandl and Jeroen Zuiddam Tensor surgery andtensor rank Comput Complexity Mar 2018doi101007s00037-018-0164-8 p 2 86

[Dra15] Jan Draisma Multilinear Algebra and Applications (lecture notes)2015 URL httpsmathsitesunibechjdraismapublicationsmlapplpdfp 15

[DVC00] Wolfgang Dur Guivre Vidal and Juan Ignacio Cirac Three qubitscan be entangled in two inequivalent ways Phys Rev A (3)62(6)062314 12 2000 doi101103PhysRevA62062314 p 48


[Ede04] Yves Edel Extensions of generalized product caps Des CodesCryptogr 31(1)5ndash14 2004 doi101023A1027365901231p 48 83

[EG17] Jordan S Ellenberg and Dion Gijswijt On large subsets of Fnq with

no three-term arithmetic progression Ann of Math (2)185(1)339ndash343 2017 doi104007annals201718518 p 1048 81 83 84

[FK14] Hu Fu and Robert Kleinberg Improved lower bounds for testingtriangle-freeness in boolean functions via fast matrix multiplicationIn Approximation Randomization and CombinatorialOptimization Algorithms and Techniques (APPROXRANDOM2014) pages 669ndash676 2014doi104230LIPIcsAPPROX-RANDOM2014669 p 48

[For16] Michael Forbes Some concrete questions on the border complexityof polynomials Presentation given at the Workshop on AlgebraicComplexity Theory WACT 2016 in Tel Avivhttpswwwyoutubecomwatchv=1HMogQIHT6Q 2016 p 110

[Fra02] Matthias Franz Moment polytopes of projective G-varieties andtensor products of symmetric group representations J Lie Theory12(2)539ndash549 2002 URLhttpemisamsorgjournalsJLTvol12_no216htmlp 93 94

[Fri17] Tobias Fritz Resource convertibility and ordered commutativemonoids Math Structures Comput Sci 27(6)850ndash938 2017doi101017S0960129515000444 p 37

[Ful97] William Fulton Young tableaux volume 35 of LondonMathematical Society Student Texts Cambridge University PressCambridge 1997 With applications to representation theory andgeometry p 88

[GKKS13] Ankit Gupta Pritish Kamath Neeraj Kayal and RamprasadSaptharishi Approaching the chasm at depth four In 2013 IEEEConference on Computational ComplexitymdashCCC 2013 pages 65ndash73IEEE Computer Soc Los Alamitos CA 2013doi101109CCC201316 p 108

[GMQ16] Joshua A Grochow Ketan D Mulmuley and Youming QiaoBoundaries of VP and VNP In Ioannis Chatzigiannakis MichaelMitzenmacher Yuval Rabani and Davide Sangiorgi editors 43rd


International Colloquium on Automata Languages andProgramming (ICALP 2016) volume 55 pages 341ndash3414 2016arXiv160502815 doi104230LIPIcsICALP201634 p 110

[Gro13] Joshua A Grochow Unifying and generalizing known lower boundsvia geometric complexity theory arXiv 2013 arXiv13046333p 108

[GW09] Roe Goodman and Nolan R Wallach Symmetry representationsand invariants volume 255 of Graduate Texts in MathematicsSpringer Dordrecht 2009 doi101007978-0-387-79852-3p 88

[Hae79] Willem Haemers On some problems of Lovasz concerning theShannon capacity of a graph IEEE Trans Inform Theory25(2)231ndash232 1979 doi101109TIT19791056027 p 37 4042

[Has90] Johan Hastad Tensor rank is NP-complete J Algorithms11(4)644ndash654 1990 doi1010160196-6774(90)90014-6 p 47

[HHHH09] Ryszard Horodecki Pawe l Horodecki Micha l Horodecki and KarolHorodecki Quantum entanglement Rev Modern Phys81(2)865ndash942 2009 doi101103RevModPhys81865 p 48

[HIL13] Jonathan D Hauenstein Christian Ikenmeyer and Joseph MLandsberg Equations for lower bounds on border rank ExpMath 22(4)372ndash383 2013 doi101080105864582013825892p 15 110

[Hum75] James E Humphreys Linear algebraic groups Springer-VerlagNew York-Heidelberg 1975 Graduate Texts in Mathematics No21 p 93

[HX17] Ishay Haviv and Ning Xie Sunflowers and testing triangle-freenessof functions Comput Complexity 26(2)497ndash530 Jun 2017doi101007s00037-016-0138-7 p 48

[Ike13] Christian Ikenmeyer Geometric complexity theory tensor rankand LittlewoodndashRichardson coefficients PhD thesis UniversitatPaderborn 2013 p 14

[Kar72] Richard M Karp Reducibility among combinatorial problems InComplexity of computer computations (Proc Sympos IBM ThomasJ Watson Res Center Yorktown Heights NY 1972) pages85ndash103 Plenum New York 1972 p 36


[Knu94] Donald E Knuth The sandwich theorem Electron J Combin 11994 URL httpwwwcombinatoricsorgVolume_1Abstractsv1i1a1htmlp 41

[Kra84] Hanspeter Kraft Geometrische Methoden in der InvariantentheorieSpringer 1984 doi101007978-3-663-10143-7 p 50 88 93

[KS08] Tali Kaufman and Madhu Sudan Algebraic property testing Therole of invariance In Proceedings of the Fortieth Annual ACMSymposium on Theory of Computing STOC rsquo08 pages 403ndash412New York NY USA 2008 ACMdoi10114513743761374434 p 48

[KSS16] Robert Kleinberg William F Sawin and David E Speyer Thegrowth rate of tri-colored sum-free sets arXiv 2016arXiv160700047 p 48 79 83

[Lan06] Joseph M Landsberg The border rank of the multiplication of2times 2 matrices is seven J Amer Math Soc 19(2)447ndash459 2006doi101090S0894-0347-05-00506-0 p 110

[LG14] Francois Le Gall Powers of tensors and fast matrix multiplicationIn ISSAC 2014mdashProceedings of the 39th International Symposiumon Symbolic and Algebraic Computation pages 296ndash303 ACM NewYork 2014 doi10114526086282608664 p 4 6 8 48 85

[Lic84] Thomas Lickteig A note on border rank Inf Process Lett18(3)173ndash178 1984 doi1010160020-0190(84)90023-1p 110

[LM16a] Joseph M Landsberg and Mateusz Micha lek A 2n2 minus log(n)minus 1lower bound for the border rank of matrix multiplication arXiv2016 arXiv160807486 p 110

[LM16b] Joseph M Landsberg and Mateusz Micha lek Abelian tensorsJ Math Pures Appl 2016 doi101016jmatpur201611004p 14

[LMR13] Joseph M Landsberg Laurent Manivel and Nicolas RessayreHypersurfaces with degenerate duals and the geometric complexitytheory program Comment Math Helv 88(2)469ndash484 2013doi104171CMH292 p 108

[LO15] Joseph M Landsberg and Giorgio Ottaviani New lower bounds forthe border rank of matrix multiplication Theory Comput


11285ndash298 2015 arXiv11126007doi104086toc2015v011a011 p 108 110

[Lov79] Laszlo Lovasz On the Shannon capacity of a graph IEEE TransInform Theory 25(1)1ndash7 1979 doi101109TIT19791055985p 13 35 41

[Mar08] Murray Marshall Positive polynomials and sums of squaresvolume 146 of Mathematical Surveys and Monographs AmericanMathematical Society Providence RI 2008doi101090surv146 p 34

[MP71] Robert J McEliece and Edward C Posner Hide and seek datastorage and entropy The Annals of Mathematical Statistics42(5)1706ndash1716 1971 doi101214aoms1177693169 p 41

[MP08] Guillaume Malod and Natacha Portier Characterizing Valiantrsquosalgebraic complexity classes J Complexity 24(1)16ndash38 2008doi101016jjco200609006 p 119

[MS01] Ketan D Mulmuley and Milind Sohoni Geometric complexitytheory I An approach to the P vs NP and related problemsSIAM J Comput 31(2)496ndash526 2001doi101137S009753970038715X p 14 108

[MS08] Ketan D Mulmuley and Milind Sohoni Geometric complexitytheory II Towards explicit obstructions for embeddings amongclass varieties SIAM J Comput 38(3)1175ndash1206 2008doi101137080718115 p 108

[Nes84] Linda Ness A stratification of the null cone via the moment mapAmer J Math 106(6)1281ndash1329 1984 With an appendix byDavid Mumford doi1023072374395 p 9 93 94

[Nis91] Noam Nisan Lower bounds for non-commutative computation InProceedings of the twenty-third annual ACM symposium on Theoryof computing pages 410ndash418 ACM 1991doi101145103418103462 p 110

[Nor16] Sergey Norin A distribution on triples with maximum entropymarginal arXiv 2016 arXiv160800243 p 83

[NW97] Noam Nisan and Avi Wigderson Lower bounds on arithmeticcircuits via partial derivatives Comput Complexity 6(3)217ndash234199697 doi101007BF01294256 p 108


[Pan78] Victor Ya. Pan. Strassen's algorithm is not optimal: Trilinear technique of aggregating, uniting and canceling for constructing fast algorithms for matrix operations. In 19th Annual Symposium on Foundations of Computer Science (Ann Arbor, Mich., 1978), pages 166–176. IEEE, Long Beach, Calif., 1978. p. 3

[Pan80] Victor Ya. Pan. New fast algorithms for matrix operations. SIAM J. Comput., 9(2):321–342, 1980. doi:10.1137/0209027. p. 3

[Pan81] Victor Ya. Pan. New combinations of methods for the acceleration of matrix multiplication. Comput. Math. Appl., 7(1):73–125, 1981. doi:10.1016/0898-1221(81)90009-2. p. 3

[Pan84] Victor Ya. Pan. How to multiply matrices faster, volume 179 of Lecture Notes in Computer Science. Springer-Verlag, Berlin, 1984. doi:10.1007/3-540-13866-8. p. 3

[Pan18] Victor Ya. Pan. Fast feasible and unfeasible matrix multiplication. arXiv, 2018. arXiv:1804.04102. p. 6

[PD01] Alexander Prestel and Charles N. Delzell. Positive polynomials. Springer Monographs in Mathematics. Springer-Verlag, Berlin, 2001. From Hilbert's 17th problem to real algebra. doi:10.1007/978-3-662-04648-7. p. 34

[Peb16] Luke Pebody. Proof of a conjecture of Kleinberg-Sawin-Speyer. arXiv, 2016. arXiv:1608.05740. p. 83

[PS98] George Pólya and Gábor Szegő. Problems and theorems in analysis. I. Classics in Mathematics. Springer-Verlag, Berlin, 1998. Series, integral calculus, theory of functions. Translated from the German by Dorothee Aeppli. Reprint of the 1978 English translation. doi:10.1007/978-3-642-61905-2. p. 21

[Raz09] Ran Raz. Multi-linear formulas for permanent and determinant are of super-polynomial size. J. ACM, 56(2):Art. 8, 17, 2009. doi:10.1145/1502793.1502797. p. 108

[Raz13] Ran Raz. Tensor-rank and lower bounds for arithmetic formulas. J. ACM, 60(6):Art. 40, 15, 2013. doi:10.1145/2535928. p. 14

[Rom82] Francesco Romani. Some properties of disjoint sums of tensors related to matrix multiplication. SIAM J. Comput., 11(2):263–267, 1982. doi:10.1137/0211020. p. 3

[Sap16] Ramprasad Saptharishi. A survey of lower bounds in arithmetic circuit complexity 3.0.2, 2016. Online survey, URL: https://github.com/dasarpmar/lowerbounds-survey. p. 6, 17, 109, 112

[Sch81] Arnold Schönhage. Partial and total matrix multiplication. SIAM J. Comput., 10(3):434–455, 1981. p. 3

[Sch03] Alexander Schrijver. Combinatorial optimization: polyhedra and efficiency, volume 24. Springer Science & Business Media, 2003. p. 37, 41

[Sha56] Claude E. Shannon. The zero error capacity of a noisy channel. Institute of Radio Engineers, Transactions on Information Theory, IT-2(September):8–19, 1956. doi:10.1109/TIT.1956.1056798. p. 13, 35

[Sha09] Asaf Shapira. Green's conjecture and testing linear-invariant properties. In Proceedings of the Forty-first Annual ACM Symposium on Theory of Computing, STOC '09, pages 159–166, New York, NY, USA, 2009. ACM. doi:10.1145/1536414.1536438. p. 48

[Shi16] Yaroslav Shitov. How hard is the tensor rank? arXiv, 2016. arXiv:1611.01559. p. 47

[Sin64] Richard C. Singleton. Maximum distance q-nary codes. IEEE Trans. Information Theory, IT-10:116–118, 1964. doi:10.1109/TIT.1964.1053661. p. 101

[SOK14] Adam Sawicki, Michał Oszmaniec, and Marek Kuś. Convexity of momentum map, Morse index, and quantum entanglement. Rev. Math. Phys., 26(3):1450004, 39, 2014. doi:10.1142/S0129055X14500044. p. 9

[SSS09] Chandan Saha, Ramprasad Saptharishi, and Nitin Saxena. The power of depth 2 circuits over algebras. In IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science, volume 4, pages 371–382, 2009. arXiv:0904.2058, doi:10.4230/LIPIcs.FSTTCS.2009.2333. p. 109

[Sto10] Andrew James Stothers. On the complexity of matrix multiplication. PhD thesis, University of Edinburgh, 2010. http://hdl.handle.net/1842/4734. p. 4, 6, 8, 48

[Str69] Volker Strassen. Gaussian elimination is not optimal. Numer. Math., 13(4):354–356, 1969. doi:10.1007/BF02165411. p. 3, 5

[Str83] Volker Strassen. Rank and optimal computation of generic tensors. Linear Algebra Appl., 52/53:645–685, 1983. doi:10.1016/0024-3795(83)80041-X. p. 110

[Str86] Volker Strassen. The asymptotic spectrum of tensors and the exponent of matrix multiplication. In Proceedings of the 27th Annual Symposium on Foundations of Computer Science, SFCS '86, pages 49–54, Washington, DC, USA, 1986. IEEE Computer Society. doi:10.1109/SFCS.1986.52. p. 4, 7

[Str87] Volker Strassen. Relative bilinear complexity and matrix multiplication. J. Reine Angew. Math., 375/376:406–443, 1987. doi:10.1515/crll.1987.375-376.406. p. 3, 4, 49, 67

[Str88] Volker Strassen. The asymptotic spectrum of tensors. J. Reine Angew. Math., 384:102–152, 1988. doi:10.1515/crll.1988.384.102. p. 4, 7, 12, 19, 26, 27, 28, 29, 30, 32, 33, 49, 50, 51

[Str91] Volker Strassen. Degeneration and complexity of bilinear maps: some asymptotic spectra. J. Reine Angew. Math., 413:127–180, 1991. doi:10.1515/crll.1991.413.127. p. 3, 4, 10, 48, 49, 52, 55, 56, 57, 66, 67, 81, 82

[Str94] Volker Strassen. Algebra and complexity. In First European Congress of Mathematics, Vol. II (Paris, 1992), volume 120 of Progr. Math., pages 429–446. Birkhäuser, Basel, 1994. doi:10.1007/s10107-008-0221-1. p. 67

[Str05] Volker Strassen. Komplexität und Geometrie bilinearer Abbildungen. Jahresber. Deutsch. Math.-Verein., 107(1):3–31, 2005. p. 4, 88, 94, 95, 100, 101

[Tao08] Terence Tao. Structure and randomness: pages from year one of a mathematical blog. American Mathematical Soc., 2008. p. 48

[Tao16] Terence Tao. A symmetric formulation of the Croot–Lev–Pach–Ellenberg–Gijswijt capset bound. https://terrytao.wordpress.com, 2016. p. 48, 58, 81, 84

[Tob91] Verena Tobler. Spezialisierung und Degeneration von Tensoren. PhD thesis, Universität Konstanz, 1991. http://nbn-resolving.de/urn:nbn:de:bsz:352-opus-20324. p. 57

[TS16] Terence Tao and Will Sawin. Notes on the “slice rank” of tensors. https://terrytao.wordpress.com, 2016. p. 48, 58

[Val79] Leslie G. Valiant. Completeness classes in algebra. In Conference Record of the Eleventh Annual ACM Symposium on Theory of Computing (Atlanta, Ga., 1979), pages 249–261. ACM, New York, 1979. doi:10.1145/800135.804419. p. 107, 108, 123

[Val80] Leslie G. Valiant. Reducibility by algebraic projections. University of Edinburgh, Department of Computer Science, 1980. Internal Report. p. 109, 119, 123

[VC15] Péter Vrana and Matthias Christandl. Asymptotic entanglement transformation between W and GHZ states. J. Math. Phys., 56(2):022204, 12, 2015. arXiv:1310.3244, doi:10.1063/1.4908106. p. 69

[VDDMV02] F. Verstraete, J. Dehaene, B. De Moor, and H. Verschelde. Four qubits can be entangled in nine different ways. Phys. Rev. A (3), 65(5, part A):052112, 5, 2002. doi:10.1103/PhysRevA.65.052112. p. 48

[Wal14] Michael Walter. Multipartite quantum states and their marginals. PhD thesis, ETH Zurich, 2014. arXiv:1410.6820. p. 93

[WDGC13] Michael Walter, Brent Doran, David Gross, and Matthias Christandl. Entanglement polytopes: multiparticle entanglement from single-particle information. Science, 340(6137):1205–1208, 2013. arXiv:1208.0365, doi:10.1126/science.1232957. p. 8, 9, 95

[Wil12] Virginia Vassilevska Williams. Multiplying matrices faster than Coppersmith-Winograd. Extended abstract. In STOC'12, Proceedings of the 2012 ACM Symposium on Theory of Computing, pages 887–898. ACM, New York, 2012. doi:10.1145/2213977.2214056. p. 4, 6, 8, 48

[Zui17] Jeroen Zuiddam. A note on the gap between rank and border rank. Linear Algebra Appl., 525:33–44, 2017. doi:10.1016/j.laa.2017.03.015. p. 2, 14, 110

[Zui18] Jeroen Zuiddam. The asymptotic spectrum of graphs and the Shannon capacity. arXiv, 2018. arXiv:1807.00169. p. 35

Glossary

$\langle n \rangle$: $n \times \cdots \times n$ diagonal tensor, 47

$\langle a, b, c \rangle$: matrix multiplication tensor, 48

$G * H$: or-product, 42

$G \boxtimes H$: strong graph product, and-product, 35

$\alpha(G)$: stability number, 35

$\overline{\chi}(G)$: clique cover number, 40

$K_k$: complete graph on $k$ vertices, 36

$F^{\theta}(t)$: quantum functional, 96

$G(t)$: $\mathrm{GL}_{n_1} \times \cdots \times \mathrm{GL}_{n_k}$ for $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$, 52

$H(P)$: Shannon entropy of probability distribution $P$, 52

$h(p)$: binary entropy of probability $p \in [0, 1]$, 53

$\tau(\Phi)$: hitting set number, 59

$\underset{\sim}{\tau}(\Phi)$: asymptotic hitting set number, 60

$\omega$: matrix multiplication exponent, 47

$P$: moment polytope, 94

$\mathcal{P}(X)$: the set of probability distributions on $X$, 52

$\mathrm{R}$: rank, 27

$\underset{\sim}{\mathrm{R}}$: asymptotic rank, 27

$\underline{\mathrm{R}}(t)$: border rank, 50

$\mathrm{R}(G)$: rank of a graph, clique cover number, 40

$\mathrm{R}(t)$: tensor rank, 47

$\mathrm{SR}(t)$: slice rank, 58

$\mathrm{Q}$: subrank, 27

$\underset{\sim}{\mathrm{Q}}$: asymptotic subrank, 27

$\underline{\mathrm{Q}}(t)$: border subrank, 50

$\mathrm{Q}(\Phi)$: combinatorial subrank, 10

$\mathrm{Q}(G)$: subrank of a graph, stability number, 40

$\mathrm{supp}(t)$: support, 52

$\Theta(G)$: Shannon capacity, 35

$\vartheta(G)$: Lovász theta number, 41

$G \sqcup H$: disjoint union, 36

$W(t)$: $S_{n_1} \times \cdots \times S_{n_k}$ for $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$, 53

$X(S, \leqslant)$: asymptotic spectrum of semiring $S$ with Strassen preorder $\leqslant$, 25

$\zeta_{(S)}(t)$: gauge point, 51

$\zeta^{\theta}(t)$: support functional, 52

Samenvatting

Algebraic complexity, asymptotic spectra and entanglement polytopes

It is well known that the rank of a matrix is multiplicative under the Kronecker product, additive under the direct sum, normalised on identity matrices, and non-increasing under multiplication from the left and from the right by matrices. Matrix rank is in fact the only real parameter with these four properties. In 1986 Strassen initiated the study of the extension to tensors: find all maps from k-tensors to the real numbers that are multiplicative under the tensor Kronecker product, additive under the direct sum, normalised on “identity tensors”, and non-increasing under applying linear maps to the k tensor factors. Strassen called the set of these maps the “asymptotic spectrum of k-tensors”. He proved that if we understand the asymptotic spectrum, then we understand the asymptotic relations among tensors, including the asymptotic subrank and the asymptotic rank. In particular, if we know the asymptotic spectrum, then we know the arithmetic complexity of matrix multiplication, a central problem in algebraic complexity theory.

One of the main results in this dissertation is the first explicit construction of an infinite family of elements in the asymptotic spectrum of complex k-tensors, called the quantum functionals. Our construction is based on information theory and moment polytopes, also called entanglement polytopes. In addition, we study, among other things, the relation between the recently introduced slice rank and the quantum functionals, and we prove that the “asymptotic” slice rank equals the minimum over the quantum functionals. Besides studying the tensor parameters mentioned above, we give an extension of the Coppersmith–Winograd method (for obtaining lower bounds on the asymptotic combinatorial subrank) to higher-order tensors, that is, tensors of order at least 4. We apply this extension to obtain new upper bounds on the asymptotic tensor rank of complete graph tensors via the laser method. (Joint work with Christandl and Vrana; QIP 2018, STOC 2018.)

As a new application of the abstract theory of asymptotic spectra, we introduce the asymptotic spectrum of graphs in graph theory. Analogous to the situation for tensors, if we understand the asymptotic spectrum of graphs, then we understand the Shannon capacity, a graph parameter that characterises the zero-error communication complexity of communication channels. In other words, we prove a new duality theorem for the Shannon capacity. Examples of elements in the asymptotic spectrum of graphs are the Lovász theta number and the fractional Haemers bounds.

Finally, we study an algebraic model of computation called algebraic branching programs. An algebraic branching program (abp) is the trace of a product of matrices whose entries are polynomials of degree at most 1. The maximum size of the matrices is called the width of the abp. In 1992 Ben-Or and Cleve proved that every polynomial can be computed by a width-3 abp with a number of matrices that is polynomial in the formula size of the polynomial. In contrast, Allender and Wang proved in 2011 that some polynomials cannot be computed by width-2 abps. We prove that every polynomial can be approximated by a width-2 abp with a number of matrices that is polynomial in the formula size of the polynomial, where approximation is meant in the sense of degeneration. (Joint work with Ikenmeyer and Bringmann; CCC 2017, JACM 2018.)

Summary

Algebraic complexity, asymptotic spectra and entanglement polytopes

Matrix rank is well known to be multiplicative under the Kronecker product, additive under the direct sum, normalised on identity matrices, and non-increasing under multiplying from the left and from the right by any matrices. In fact, matrix rank is the only real matrix parameter with these four properties. In 1986 Strassen proposed to study the extension to tensors: find all maps from k-tensors to the reals that are multiplicative under the tensor Kronecker product, additive under the direct sum, normalised on “identity tensors”, and non-increasing under acting with linear maps on the k tensor factors. Strassen called the collection of these maps the “asymptotic spectrum of k-tensors”. He proved that understanding the asymptotic spectrum implies understanding the asymptotic relations among tensors, including the asymptotic subrank and the asymptotic rank. In particular, knowing the asymptotic spectrum means knowing the arithmetic complexity of matrix multiplication, a central problem in algebraic complexity theory.
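The four properties of matrix rank listed above can be checked on small examples. The following is a minimal sketch in plain Python, assuming exact rational arithmetic; the helper names (`rank`, `kron`, `direct_sum`, `matmul`, `identity`) and the example matrices are hypothetical choices for illustration and do not come from the dissertation.

```python
from fractions import Fraction

def rank(m):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in m]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        # Find a row at or below position r with a nonzero entry in column c.
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def kron(a, b):
    """Kronecker product of two matrices."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def direct_sum(a, b):
    """Block-diagonal matrix with blocks a and b."""
    wa, wb = len(a[0]), len(b[0])
    return [row + [0] * wb for row in a] + [[0] * wa + row for row in b]

def matmul(a, b):
    """Ordinary matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

# Example matrices chosen only for illustration.
a = [[1, 2], [2, 4]]               # rank 1
b = [[1, 0, 1], [0, 1, 1]]         # rank 2
left = [[1, 1], [0, 0], [2, 2]]    # 3 x 2 matrix applied from the left
right = [[1, 0, 3], [0, 1, 1], [1, 1, 4]]  # 3 x 3 matrix applied from the right

checks = {
    "multiplicative under Kronecker product": rank(kron(a, b)) == rank(a) * rank(b),
    "additive under direct sum": rank(direct_sum(a, b)) == rank(a) + rank(b),
    "normalised on identity matrices": all(rank(identity(n)) == n for n in range(1, 6)),
    "non-increasing under left/right multiplication":
        rank(matmul(matmul(left, b), right)) <= rank(b),
}
for name, ok in checks.items():
    print(name, ok)
```

Such a check only illustrates the properties on particular matrices; it is not a proof, and it says nothing about the uniqueness statement.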

One of the main results in this dissertation is the first explicit construction of an infinite family of elements in the asymptotic spectrum of complex k-tensors, called the quantum functionals. Our construction is based on information theory and moment polytopes, i.e. entanglement polytopes. Moreover, among other things, we study the relation of the recently introduced slice rank to the quantum functionals and find that “asymptotic” slice rank equals the minimum over the quantum functionals. Besides studying the above tensor parameters, we extend the Coppersmith–Winograd method (for obtaining asymptotic combinatorial subrank lower bounds) to higher-order tensors, i.e. order at least 4. We apply this generalisation to obtain new upper bounds on the asymptotic tensor rank of complete graph tensors via the laser method. (Joint work with Christandl and Vrana; QIP 2018, STOC 2018.)

In graph theory, as a new instantiation of the abstract theory of asymptotic spectra, we introduce the asymptotic spectrum of graphs. Analogous to the situation for tensors, understanding the asymptotic spectrum of graphs means understanding the Shannon capacity, a graph parameter capturing the zero-error communication complexity of communication channels. In different words, we prove a new duality theorem for Shannon capacity. Some known elements in the asymptotic spectrum of graphs are the Lovász theta number and the fractional Haemers bounds.
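To make the Shannon capacity concrete, recall that it is the limit of the n-th root of the stability number α of the n-fold strong graph product of a graph with itself. The sketch below, under the assumption that a small brute-force search is acceptable, computes α for the 5-cycle and for its strong product with itself, which gives the classical lower bound Θ(C5) ≥ √5. The function names (`cycle`, `strong_product`, `stability_number`) are hypothetical helpers, not taken from the dissertation.

```python
from itertools import product

def cycle(n):
    """Adjacency sets of the cycle graph C_n on vertices 0..n-1."""
    return {v: {(v - 1) % n, (v + 1) % n} for v in range(n)}

def strong_product(g, h):
    """Strong graph product: distinct pairs are adjacent when each
    coordinate is either equal or adjacent."""
    def close(adj, x, y):
        return x == y or y in adj[x]
    verts = list(product(g, h))
    return {
        (a, b): {(c, d) for (c, d) in verts
                 if (c, d) != (a, b) and close(g, a, c) and close(h, b, d)}
        for (a, b) in verts
    }

def stability_number(adj, verts=None):
    """Exact stability number by branching on a vertex of maximum
    remaining degree (exponential time, fine for tiny graphs)."""
    verts = set(adj) if verts is None else verts
    if not verts:
        return 0
    v = max(verts, key=lambda u: len(adj[u] & verts))
    if not adj[v] & verts:
        return len(verts)  # no edges left: all remaining vertices are independent
    without_v = stability_number(adj, verts - {v})
    with_v = 1 + stability_number(adj, verts - {v} - adj[v])
    return max(without_v, with_v)

c5 = cycle(5)
alpha_1 = stability_number(c5)                      # 2
alpha_2 = stability_number(strong_product(c5, c5))  # 5
print(alpha_1, alpha_2, alpha_2 ** 0.5)
```

Since α of the square is 5 while α(C5)² is only 4, the stability number is not multiplicative under the strong product, which is why an asymptotic quantity is needed.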

Finally, we study an algebraic model of computation called algebraic branching programs. An algebraic branching program (abp) is the trace of a product of matrices with affine linear forms as matrix entries. The maximum size of the matrices is called the width of the abp. In 1992 Ben-Or and Cleve proved that width-3 abps can compute any polynomial efficiently in the formula size. On the other hand, in 2011 Allender and Wang proved that some polynomials cannot be computed by any width-2 abp. We prove that any polynomial can be efficiently approximated by a width-2 abp, where approximation is defined in the sense of degeneration. (Joint work with Ikenmeyer and Bringmann; CCC 2017, JACM 2018.)
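The definition of an abp as the trace of a product of matrices with affine linear forms as entries can be illustrated with a toy example. The sketch below assumes a simple representation of an affine form as a constant plus a dictionary of coefficients; the helpers (`affine`, `evaluate`, `abp_value`) and the example polynomial x1·x2 + x3 are hypothetical illustrations, not constructions from the dissertation.

```python
def affine(const=0, **coeffs):
    """Affine linear form: const + sum of coeff * variable."""
    return (const, coeffs)

def evaluate(form, point):
    """Value of an affine form at a point given as {variable name: value}."""
    const, coeffs = form
    return const + sum(c * point[name] for name, c in coeffs.items())

def matmul(a, b):
    """Ordinary matrix product of two numeric matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def abp_value(matrices, point):
    """Trace of the product of the matrices, with entries evaluated at point."""
    numeric = [[[evaluate(e, point) for e in row] for row in m] for m in matrices]
    prod = numeric[0]
    for m in numeric[1:]:
        prod = matmul(prod, m)
    return sum(prod[i][i] for i in range(len(prod)))

def width(matrices):
    """Maximum matrix size appearing in the abp."""
    return max(len(m) for m in matrices)

# A toy width-2 abp: the product of the two matrices is
# [[x1*x2 + x3, 0], [x3, 0]], so its trace is x1*x2 + x3.
toy_abp = [
    [[affine(x1=1), affine(1)], [affine(0), affine(1)]],
    [[affine(x2=1), affine(0)], [affine(x3=1), affine(0)]],
]

point = {"x1": 3, "x2": 4, "x3": 5}
print(width(toy_abp), abp_value(toy_abp, point))  # 2 17
```

Evaluating at points only tests the abp against the target polynomial; a symbolic expansion would be needed to confirm equality as polynomials.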

Algebraic complexity, asymptotic spectra and entanglement polytopes

Jeroen Zuiddam

ILLC Dissertation Series DS-2018-13

For further information about ILLC-publications, please contact

Institute for Logic, Language and Computation
Universiteit van Amsterdam
Science Park 107
1098 XG Amsterdam
phone: +31-20-525 6051
e-mail: illc@uva.nl
homepage: http://www.illc.uva.nl/

The investigations were supported by the Netherlands Organization for Scientific Research NWO (617.023.116), the European Commission and the QuSoft Research Center for Quantum Software.

Copyright © 2018 by Jeroen Zuiddam

ISBN: 978-94-028-1175-9

Academic dissertation (Academisch Proefschrift)

to obtain the degree of doctor at the Universiteit van Amsterdam, on the authority of the Rector Magnificus, prof. dr. ir. K.I.J. Maex, before a committee appointed by the College voor Promoties, to be defended in public in the Agnietenkapel on Tuesday 23 October 2018, at 12:00

by

Jeroen Zuiddam

born in Leiderdorp

Doctoral committee (Promotiecommissie)

Promotores:
prof. dr. H.M. Buhrman, Universiteit van Amsterdam
prof. dr. M. Christandl, Københavns Universitet

Other members:
prof. dr. M. Laurent, Tilburg University
prof. dr. E.M. Opdam, Universiteit van Amsterdam
prof. dr. R.M. de Wolf, Universiteit van Amsterdam
dr. J. Briët, CWI Amsterdam
dr. M. Walter, Universiteit van Amsterdam

Faculty of Science (Faculteit der Natuurwetenschappen, Wiskunde en Informatica)

Contents

Acknowledgements  ix

1 Introduction  3
  1.1 Matrix multiplication  5
  1.2 The asymptotic spectrum of tensors  6
  1.3 Higher-order CW method  10
  1.4 Abstract asymptotic spectra  11
  1.5 The asymptotic spectrum of graphs  12
  1.6 Tensor degeneration  14
  1.7 Combinatorial degeneration  15
  1.8 Algebraic branching program degeneration  15
  1.9 Organisation  17

2 The theory of asymptotic spectra  19
  2.1 Introduction  19
  2.2 Semirings and preorders  19
  2.3 Strassen preorders  20
  2.4 Asymptotic preorders $\lesssim$  21
  2.5 Maximal Strassen preorders  23
  2.6 The asymptotic spectrum $X(S, \leqslant)$  25
  2.7 The representation theorem  26
  2.8 Abstract rank and subrank R, Q  27
  2.9 Topological aspects  29
  2.10 Uniqueness  30
  2.11 Subsemirings  31
  2.12 Subsemirings generated by one element  32
  2.13 Universal spectral points  33
  2.14 Conclusion  33

3 The asymptotic spectrum of graphs; Shannon capacity  35
  3.1 Introduction  35
  3.2 The asymptotic spectrum of graphs  37
    3.2.1 The semiring of graph isomorphism classes $\mathcal{G}$  37
    3.2.2 Strassen preorder via graph homomorphisms  38
    3.2.3 The asymptotic spectrum of graphs $X(\mathcal{G})$  39
    3.2.4 Shannon capacity $\Theta$  39
  3.3 Universal spectral points  41
    3.3.1 Lovász theta number $\vartheta$  41
    3.3.2 Fractional graph parameters  41
  3.4 Conclusion  46

4 The asymptotic spectrum of tensors; matrix multiplication  47
  4.1 Introduction  47
  4.2 The asymptotic spectrum of tensors  49
    4.2.1 The semiring of tensor equivalence classes $\mathcal{T}$  49
    4.2.2 Strassen preorder via restriction  49
    4.2.3 The asymptotic spectrum of tensors $X(\mathcal{T})$  49
    4.2.4 Asymptotic rank and asymptotic subrank  50
  4.3 Gauge points $\zeta_{(i)}$  51
  4.4 Support functionals $\zeta^{\theta}$  52
  4.5 Upper and lower support functionals $\zeta^{\theta}$, $\zeta_{\theta}$  56
  4.6 Asymptotic slice rank  58
  4.7 Conclusion  63

5 Tight tensors and combinatorial subrank; cap sets  65
  5.1 Introduction  65
  5.2 Higher-order Coppersmith–Winograd method  68
    5.2.1 Construction  69
    5.2.2 Computational remarks  77
    5.2.3 Examples: type sets  78
  5.3 Combinatorial degeneration method  79
  5.4 Cap sets  81
    5.4.1 Reduced polynomial multiplication  81
    5.4.2 Cap sets  82
  5.5 Graph tensors  85
  5.6 Conclusion  86

6 Universal points in the asymptotic spectrum of tensors; entanglement polytopes, moment polytopes  87
  6.1 Introduction  87
  6.2 Schur–Weyl duality  88
  6.3 Kronecker and Littlewood–Richardson coefficients $g_{\lambda\mu\nu}$, $c^{\lambda}_{\mu\nu}$  90
  6.4 Entropy inequalities  91
  6.5 Hilbert spaces and density operators  92
  6.6 Moment polytopes $P(t)$  93
    6.6.1 General setting  93
    6.6.2 Tensor spaces  94
  6.7 Quantum functionals $F^{\theta}(t)$  95
  6.8 Outer approximation  100
  6.9 Inner approximation for free tensors  101
  6.10 Quantum functionals versus support functionals  102
  6.11 Asymptotic slice rank  103
  6.12 Conclusion  105

7 Algebraic branching programs; approximation and nondeterminism  107
  7.1 Introduction  107
  7.2 Definitions and basic results  110
    7.2.1 Computational models  110
    7.2.2 Complexity classes $\mathrm{VP}$, $\mathrm{VP}_e$, $\mathrm{VP}_k$  111
    7.2.3 The theorem of Ben-Or and Cleve  112
    7.2.4 Approximation closure $\overline{C}$  115
    7.2.5 Nondeterminism closure $N(C)$  115
  7.3 Approximation closure of $\mathrm{VP}_2$  116
  7.4 Nondeterminism closure of $\mathrm{VP}_1$  119
  7.5 Conclusion  122

Bibliography  125

Glossary  139

Samenvatting  141

Summary  143

Acknowledgements

First of all, I thank all my coauthors for very fruitful collaboration: Harry Buhrman, Matthias Christandl, Peter Vrana, Jop Briët, Chris Perry, Asger Jensen, Markus Bläser, Christian Ikenmeyer and Karl Bringmann.

Chris Zaal, Leen Torenvliet and Robert Belleman I thank for all their efforts to set up for me the “double bachelor programme” in Mathematics and Computer Science at the University of Amsterdam (UvA) in 2009. This programme, as well as the “webklas” on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch national research institute for mathematics and computer science (CWI), made me decide to come to Amsterdam. My enjoyable master thesis project in mathematics with Eric Opdam made me follow the academic path, for which I thank Eric.

Of course, most importantly, I thank my PhD supervisor Harry Buhrman for introducing me to research as a bachelor student, for absorbing me into the Algorithms and Complexity group at CWI, for having enough faith in me to hire me as his PhD student in 2014, and for his general guidance throughout. I feel very lucky for the opportunities and scientific freedom that this has brought me.

Matthias Christandl has been my closest collaborator and mentor since we met in Berkeley in 2014. In practice this meant countless nights of fun Skype sessions between Amsterdam and Copenhagen, countless enjoyable visits to the University of Copenhagen, and countless kitchen table sessions at the Hallinsgade. Thanks, Matthias, for the energy, inspiration and optimism. And thanks, Matthias and Henriette, for the hospitality.

Jop Briët I thank for his general guidance and for lots of inspiration. The polynomial method reading group, which he mainly organised, inspired part of my paper with Matthias Christandl and Peter Vrana on universal points in the asymptotic spectrum of tensors. (This reading group also resulted in Dion Gijswijt’s paper on cap sets.) My paper with Jop on round elimination later inspired me to write the paper on the asymptotic spectrum of graphs.

Christian Ikenmeyer I thank for numerous inspiring discussions on algebraic complexity theory and tensors, which greatly influenced my papers on tensor rank and our joint paper with Karl Bringmann on algebraic branching programs.

Peter Vrana I thank for our many enjoyable research collaborations, the results of which form a central part of this dissertation, for his clever insights, and for finding several mathematical mistakes while reading the draft of this dissertation.

Ronald de Wolf I thank for his general advice throughout my PhD and for many suggestions regarding the current version of this dissertation, which will be incorporated in the next version (but not in the printed version, because of the regulations of the University of Amsterdam).

Jop Briët, Monique Laurent, Lex Schrijver, Peter Vrana, Matthias Christandl, Maris Ozols, Michael Walter and Bart Sevenster I thank for helpful discussions regarding the results in Chapter 2 and Chapter 3 of this dissertation.

Srinivasan Arunachalam I thank for sharing the ups and downs during our four years as PhD students at CWI. Florian Speelman, Farrokh Labib, Sven Polak, Bart Litjens and Bart Sevenster I thank for numerous valuable research discussions.

Bikkie Aldeias and Rob van Rooijen I thank for their excellent library services.

Martijn Zuiddam and Maris Ozols I thank for proofreading the draft of this dissertation.

Finally, I thank my parents and my brothers and my friends for their support.

Amsterdam, August 31, 2018
Jeroen Zuiddam

Publications

[BCPZ16] Harry Buhrman, Matthias Christandl, Christopher Perry, and Jeroen Zuiddam. Clean quantum and classical communication protocols. Physical Review Letters, 117:230503, 2016.
https://link.aps.org/doi/10.1103/PhysRevLett.117.230503
http://arxiv.org/abs/1605.07948

[BCZ17a] Markus Bläser, Matthias Christandl, and Jeroen Zuiddam. The border support rank of two-by-two matrix multiplication is seven. Manuscript, 2017.
https://arxiv.org/abs/1705.09652

[BCZ17b] Harry Buhrman, Matthias Christandl, and Jeroen Zuiddam. Nondeterministic Quantum Communication Complexity: the Cyclic Equality Game and Iterated Matrix Multiplication. In Christos H. Papadimitriou, editor, 8th Innovations in Theoretical Computer Science Conference (ITCS), 2017.
http://drops.dagstuhl.de/opus/volltexte/2017/8181

[BIZ17] Karl Bringmann, Christian Ikenmeyer, and Jeroen Zuiddam. On algebraic branching programs of small width. In Ryan O’Donnell, editor, 32nd Computational Complexity Conference (CCC), 2017.
https://doi.org/10.4230/LIPIcs.CCC.2017.20
https://arxiv.org/abs/1702.05328
Journal of the ACM, Vol. 65, No. 5, Article 32, 2018.
https://doi.org/10.1145/3209663

[BZ17] Jop Briët and Jeroen Zuiddam. On the orthogonal rank of Cayley graphs and impossibility of quantum round elimination. Quantum Information and Computation, 2017.
http://www.rintonpress.com/xxqic17/qic-17-12/0106-0116.pdf
https://arxiv.org/abs/1608.06113

[CJZ18] Matthias Christandl, Asger Kjærulff Jensen, and Jeroen Zuiddam. Tensor rank is not multiplicative under the tensor product. Linear Algebra and its Applications, 543:125–139, 2018.
https://doi.org/10.1016/j.laa.2017.12.020
https://arxiv.org/abs/1705.09379

[CVZ16] Matthias Christandl, Péter Vrana, and Jeroen Zuiddam. Asymptotic tensor rank of graph tensors: beyond matrix multiplication. Manuscript, 2016.
https://arxiv.org/abs/1609.07476

[CVZ18] Matthias Christandl, Péter Vrana, and Jeroen Zuiddam. Universal Points in the Asymptotic Spectrum of Tensors: Extended Abstract. In Proceedings of 50th Annual ACM SIGACT Symposium on the Theory of Computing (STOC), 2018.
https://doi.org/10.1145/3188745.3188766
https://arxiv.org/abs/1709.07851

[CZ18] Matthias Christandl and Jeroen Zuiddam. Tensor surgery and tensor rank. (Journal of) computational complexity, 2018.
https://doi.org/10.1007/s00037-018-0164-8
https://arxiv.org/abs/1606.04085

[Zui17] Jeroen Zuiddam. A note on the gap between rank and border rank. Linear Algebra and its Applications, 525:33–44, 2017.
https://doi.org/10.1016/j.laa.2017.03.015
http://arxiv.org/abs/1504.05597

[Zui18] Jeroen Zuiddam. The asymptotic spectrum of graphs and the Shannon capacity. Manuscript, 2018.
http://arxiv.org/abs/1807.00169

This dissertation is based on the above papers, with primary focus on the four highlighted papers.

Note on the relative contribution of the co-authors: for each article, the contribution of the co-authors is approximately equal.

Chapter 1

Introduction

In 1969, Volker Strassen published his famous algorithm for multiplying any two $n \times n$ matrices using only $O(n^{2.81})$ rather than $O(n^3)$ arithmetical operations [Str69]. His discovery marked the beginning of a still ongoing line of research in the field of algebraic complexity theory, a line of research that by now touches several fields of mathematics, including algebraic geometry, representation theory, (quantum) information theory and combinatorics. This dissertation is inspired by and contributes to this line of research.

No further progress followed for almost 10 years after Strassen’s discovery, despite the fact that “many scientists understood that discovery as a signal to attack the problem and to push the exponent further down” [Pan84]. Then, in 1978, Pan improved the exponent from 2.81 to 2.79 [Pan78, Pan80]. One year later, Bini, Capovani, Lotti and Romani improved the exponent to 2.78 by constructing fast “approximative” algorithms for matrix multiplication and making these algorithms exact via the method of interpolation [BCRL79, Bin80]. Cast in the language of tensors, the result of Bini et al. corresponds to what we now call a “border rank” upper bound. The idea of studying approximative complexity, or border complexity, for algebraic problems has nowadays become an important theme in algebraic complexity theory.

Schönhage then obtained the exponent 2.55 by constructing a fast algorithm for computing many “disjoint” small matrix multiplications and transforming this into an algorithm for one large matrix multiplication [Sch81]. The upper bound was improved shortly after by works of Pan [Pan81], Romani [Rom82], and Coppersmith and Winograd [CW82], resulting in the exponent 2.50. Then, in 1987, Strassen published the laser method, with which he obtained the exponent 2.48 [Str87]. The laser method was used in the same year by Coppersmith and Winograd to obtain the exponent 2.38 [CW87]. To do this, they invented a method for constructing certain large combinatorial structures. This method, or actually the extended version that Strassen published in [Str91], we now call the Coppersmith–Winograd method. All further improvements on upper bounding the exponent essentially follow the framework of Coppersmith and Winograd, and the improvements do not affect the first two digits after the decimal point [CW90, Sto10, Wil12, LG14].

Define $\omega$ to be the optimal exponent in the complexity of matrix multiplication. We call $\omega$ the exponent of matrix multiplication. To summarise the above historical account on upper bounds: $\omega < 2.38$. On the other hand, the only lower bound we currently have is the trivial lower bound $2 \leq \omega$.

The history of upper bounds on the matrix multiplication exponent $\omega$, which began with Strassen’s algorithm and ended with the Strassen laser method and Coppersmith–Winograd method, is well known and well documented, see e.g. [BCS97, Section 15.13]. However, there is remarkable work of Strassen on a theory of lower bounds for $\omega$ and similar types of exponents, and this work has received almost no attention. This theory of lower bounds is the theory of asymptotic spectra of tensors and is the topic of a series of papers by Strassen [Str86, Str87, Str88, Str91, Str05].

In the foregoing, the word tensor has popped up twice (namely, when we mentioned border rank, and just now when we mentioned asymptotic spectra of tensors), but we have not discussed at all why tensors should be relevant for understanding the complexity of matrix multiplication. First we give a mini course on tensors. A $k$-tensor $t = (t_{i_1, \ldots, i_k})_{i_1, \ldots, i_k}$ is a $k$-dimensional array of numbers from some field, say the complex numbers $\mathbb{C}$. Thus, a 2-tensor is simply a matrix. A $k$-tensor is called simple if there exist $k$ vectors $v_1, \ldots, v_k$ such that the entries of $t$ are given by the products $t_{i_1, \ldots, i_k} = (v_1)_{i_1} \cdots (v_k)_{i_k}$ for all indices $i_j$. The tensor rank of $t$ is the smallest number $n$ such that $t$ can be written as a sum of $n$ simple tensors. Thus, the tensor rank of a 2-tensor is simply its matrix rank. Returning to the problem of finding the complexity of matrix multiplication, there is a special 3-tensor, called the matrix multiplication tensor, that encodes the computational problem of multiplying two $2 \times 2$ matrices. This 3-tensor is commonly denoted by $\langle 2, 2, 2 \rangle$. It turns out that the matrix multiplication exponent $\omega$ is exactly the asymptotic rate of growth of the tensor rank of the “Kronecker powers” of the tensor $\langle 2, 2, 2 \rangle$. This important observation follows from the fundamental fact that the computational problem of multiplying matrices is “self-reducible”. Namely, we can multiply two matrices by viewing them as block matrices and then perform matrix multiplication at the level of the blocks.
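The definitions of simple tensor and tensor rank can be made concrete with a small sketch. The dictionary representation and the helper names `simple_tensor` and `add` are illustrative choices made here, not notation from the text.

```python
from itertools import product

def simple_tensor(*vectors):
    """Return the simple k-tensor with entries v1[i1] * ... * vk[ik], as a dict."""
    t = {}
    for idx in product(*(range(len(v)) for v in vectors)):
        entry = 1
        for v, i in zip(vectors, idx):
            entry *= v[i]
        t[idx] = entry
    return t

def add(s, t):
    """Entrywise sum of two tensors of the same shape."""
    return {idx: s[idx] + t[idx] for idx in s}

# A simple 2-tensor is a rank-one matrix: every 2x2 minor vanishes.
m = simple_tensor([1, 2, 3], [4, 5])
minors_vanish = all(
    m[(i, 0)] * m[(j, 1)] == m[(i, 1)] * m[(j, 0)]
    for i in range(3) for j in range(3)
)

# The 3-tensor e0⊗e0⊗e0 + e1⊗e1⊗e1 is a sum of two simple tensors,
# so its tensor rank is at most 2.
e0, e1 = [1, 0], [0, 1]
diag2 = add(simple_tensor(e0, e0, e0), simple_tensor(e1, e1, e1))
```

Exhibiting a decomposition only bounds the tensor rank from above; showing that no shorter decomposition exists is the hard direction.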

We wrap up this introductory story. To understand the computational complexity of matrix multiplication, one should understand the asymptotic rate of growth of the tensor rank of a certain family of tensors, a family that is obtained by taking powers of a fixed tensor. The theory of asymptotic spectra is the theory of bounds on such asymptotic parameters of tensors.

The main story line of this dissertation concerns the theory of asymptotic spectra. In Section 1.1 of this introduction we discuss in more detail the computational problem of multiplying matrices. In Section 1.2 we discuss the asymptotic spectrum of tensors and discuss a new result: an explicit description of infinitely many elements in the asymptotic spectrum of tensors. In Section 1.3 we consider a new higher-order Coppersmith–Winograd method.

The theory of asymptotic spectra of tensors is a special case of an abstract theory of asymptotic spectra of preordered semirings, which we discuss in Section 1.4. In Section 1.5 we apply this abstract theory to a new setting, namely to graphs. By doing this, we obtain a new dual characterisation of the Shannon capacity of graphs.

The second story line of this dissertation is about degeneration, an algebraic kind of approximation related to the concept of border rank of Bini et al. We discuss degeneration in the context of tensors in Section 1.6. There is a combinatorial version of tensor degeneration, which we call combinatorial degeneration. We discuss a new result regarding combinatorial degeneration in Section 1.7. Finally, Section 1.8 is about a new result concerning degeneration for algebraic branching programs, an algebraic model of computation.

We finish in Section 1.9 with a discussion of the organisation of this dissertation into chapters.

1.1 Matrix multiplication

In this section we discuss in more detail the computational problem of multiplying two matrices.

Algebraic complexity theory studies algebraic algorithms for algebraic problems. Roughly speaking, algebraic algorithms are algorithms that use only the basic arithmetical operations $+$ and $\times$ over some field, say $\mathbb{R}$ or $\mathbb{C}$. A fundamental example of an algebraic problem is matrix multiplication.

If we multiply two $n \times n$ matrices by computing the inner products between any row of the first matrix and any column of the second matrix, one by one, we need roughly $2 \cdot n^3$ arithmetical operations ($+$ and $\times$). For example, we can multiply two $2 \times 2$ matrices with 12 arithmetical operations, namely 8 multiplications and 4 additions:

$$\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} = \begin{pmatrix} a_{11} b_{11} + a_{12} b_{21} & a_{11} b_{12} + a_{12} b_{22} \\ a_{21} b_{11} + a_{22} b_{21} & a_{21} b_{12} + a_{22} b_{22} \end{pmatrix}$$
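The operation count of the row-by-column method can be checked with a short sketch. The function name `naive_matmul` and the test matrices are illustrative assumptions, not part of the original text.

```python
def naive_matmul(a, b):
    """Multiply two n x n matrices row by column, counting scalar operations."""
    n = len(a)
    mults = adds = 0
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = a[i][0] * b[0][j]
            mults += 1
            for k in range(1, n):
                total += a[i][k] * b[k][j]
                mults += 1
                adds += 1
            c[i][j] = total
    return c, mults, adds

c, mults, adds = naive_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

In general this uses $n^3$ multiplications and $n^2(n-1)$ additions, which is the "roughly $2 \cdot n^3$" of the text; for $n = 2$ it gives the 8 multiplications and 4 additions mentioned above.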

Since matrix multiplication is a basic operation in linear algebra, it is worthwhile to see if we can do better than $2 \cdot n^3$. In 1969, Strassen [Str69] published a better algorithm. The base routine of Strassen’s algorithm is an algorithm for multiplying two $2 \times 2$ matrices with 7 multiplications, 18 additions and certain sign changes:

$$\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} = \begin{pmatrix} x_1 + x_4 - x_5 + x_7 & x_3 + x_5 \\ x_2 + x_4 & x_1 + x_3 - x_2 + x_6 \end{pmatrix}$$

with

$$\begin{aligned}
x_1 &= (a_{11} + a_{22})(b_{11} + b_{22}) \\
x_2 &= (a_{21} + a_{22})\, b_{11} \\
x_3 &= a_{11} (b_{12} - b_{22}) \\
x_4 &= a_{22} (-b_{11} + b_{21}) \\
x_5 &= (a_{11} + a_{12})\, b_{22} \\
x_6 &= (-a_{11} + a_{21})(b_{11} + b_{12}) \\
x_7 &= (a_{12} - a_{22})(b_{21} + b_{22})
\end{aligned}$$

The general routine of Strassen’s algorithm multiplies two $n \times n$ matrices by recursively dividing the matrices into four blocks and applying the base routine to multiply the blocks (this is the self-reducibility of matrix multiplication that we mentioned earlier). The base routine does not assume commutativity of the variables for correctness, so indeed we can take the variables to be matrices. After expanding the recurrence, we see that Strassen’s algorithm uses $4.7 \cdot n^{\log_2 7} \approx 4.7 \cdot n^{2.81}$ arithmetical operations. Over the years, Strassen’s algorithm was improved by many researchers. The best algorithm known today uses $C \cdot n^{2.38}$ arithmetical operations, where $C$ is some constant [CW90, Sto10, Wil12, LG14]. The exponent of matrix multiplication $\omega$ is the infimum over all real numbers $\beta$ such that for some constant $C_\beta$ we can multiply, for any $n \in \mathbb{N}$, any two $n \times n$ matrices with at most $C_\beta \cdot n^\beta$ arithmetical operations. From the above it follows that $\omega \leq 2.38$. From a simple flattening argument it follows that $2 \leq \omega$. We are left with the following well-known open problem: what is the value of the matrix multiplication exponent $\omega$?
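The recursive use of the base routine can be sketched as follows. This is a minimal illustration that assumes $n$ is a power of two and recurses all the way down to $1 \times 1$ blocks; the function names and the operation counter are choices made here. It does not attempt to reproduce the constant 4.7, which depends on how and where the recursion is stopped.

```python
def mat_add(x, y, sign=1):
    """Entrywise x + sign * y for two square blocks of the same size."""
    return [[p + sign * q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]

def strassen(a, b, counter):
    """Multiply n x n matrices (n a power of two), recursing down to 1 x 1."""
    n = len(a)
    if n == 1:
        counter["mults"] += 1
        return [[a[0][0] * b[0][0]]]
    h = n // 2

    def blocks(m):
        return ([r[:h] for r in m[:h]], [r[h:] for r in m[:h]],
                [r[:h] for r in m[h:]], [r[h:] for r in m[h:]])

    def add(x, y):
        counter["adds"] += h * h
        return mat_add(x, y)

    def sub(x, y):
        counter["adds"] += h * h
        return mat_add(x, y, -1)

    def mul(x, y):
        return strassen(x, y, counter)

    a11, a12, a21, a22 = blocks(a)
    b11, b12, b21, b22 = blocks(b)
    # The seven block products of the base routine; block order is preserved,
    # so no commutativity is used.
    x1 = mul(add(a11, a22), add(b11, b22))
    x2 = mul(add(a21, a22), b11)
    x3 = mul(a11, sub(b12, b22))
    x4 = mul(a22, sub(b21, b11))
    x5 = mul(add(a11, a12), b22)
    x6 = mul(sub(a21, a11), add(b11, b12))
    x7 = mul(sub(a12, a22), add(b21, b22))
    c11 = add(sub(add(x1, x4), x5), x7)
    c12 = add(x3, x5)
    c21 = add(x2, x4)
    c22 = add(sub(add(x1, x3), x2), x6)
    top = [r1 + r2 for r1, r2 in zip(c11, c12)]
    bottom = [r1 + r2 for r1, r2 in zip(c21, c22)]
    return top + bottom

counter = {"mults": 0, "adds": 0}
a = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
b = [[1, 0, 2, 0], [0, 1, 0, 2], [3, 0, 1, 0], [0, 3, 0, 1]]
c = strassen(a, b, counter)
```

With full recursion on $n = 2^k$, the multiplication count satisfies $M(n) = 7 M(n/2)$ with $M(1) = 1$, so $M(n) = 7^k = n^{\log_2 7}$, which is where the exponent $\log_2 7 \approx 2.81$ comes from.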

The constant $C$ for the currently best algorithm is impractically large (for a discussion of this issue, see e.g. [Pan18]). For a practical fast algorithm, one should either improve $C$ or find a balance between $C$ and the exponent of $n$. We will ignore the size of $C$ in this dissertation and focus on the exponent $\omega$. For an overview of the field of algebraic complexity theory, the reader should consult [BCS97] and [Sap16].

1.2 The asymptotic spectrum of tensors

We now discuss the theory of asymptotic spectra for tensors.

Let $s$ and $t$ be $k$-tensors over a field $\mathbb{F}$, $s \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$, $t \in \mathbb{F}^{m_1} \otimes \cdots \otimes \mathbb{F}^{m_k}$. We say $s$ restricts to $t$, and write $s \geq t$, if there are linear maps $A_i : \mathbb{F}^{n_i} \to \mathbb{F}^{m_i}$ such that $(A_1 \otimes \cdots \otimes A_k)(s) = t$. Let $[n] = \{1, \ldots, n\}$ for $n \in \mathbb{N}$. We define the product $s \otimes t \in \mathbb{F}^{n_1 m_1} \otimes \cdots \otimes \mathbb{F}^{n_k m_k}$ by $(s \otimes t)_{(i_1, j_1), \ldots, (i_k, j_k)} = s_{i_1, \ldots, i_k} t_{j_1, \ldots, j_k}$ for $i \in [n_1] \times \cdots \times [n_k]$ and $j \in [m_1] \times \cdots \times [m_k]$. This product generalizes the well-known Kronecker product of matrices. We refer to this product as the tensor (Kronecker) product. We define the direct sum $s \oplus t \in \mathbb{F}^{n_1 + m_1} \otimes \cdots \otimes \mathbb{F}^{n_k + m_k}$ by $(s \oplus t)_{\ell_1, \ldots, \ell_k} = s_{\ell_1, \ldots, \ell_k}$ if $\ell \in [n_1] \times \cdots \times [n_k]$, $(s \oplus t)_{n_1 + \ell_1, \ldots, n_k + \ell_k} = t_{\ell_1, \ldots, \ell_k}$ if $\ell \in [m_1] \times \cdots \times [m_k]$, and $(s \oplus t)_{\ell_1, \ldots, \ell_k} = 0$ for the remaining indices.
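The two operations just defined can be illustrated on sparse tensors. In this sketch a tensor is a pair (shape, dictionary of nonzero entries) with 0-based indices, and the pair $(i, j)$ is flattened to the single index $i \cdot m + j$; these representation choices and the function names are assumptions made for illustration.

```python
def diagonal(n, k=3):
    """The diagonal k-tensor <n> as (shape, {index: value}), 0-based indices."""
    return (n,) * k, {(i,) * k: 1 for i in range(n)}

def kron(s, t):
    """Tensor (Kronecker) product; the pair (i, j) is flattened to i * m + j."""
    (ns, es), (ms, et) = s, t
    shape = tuple(n * m for n, m in zip(ns, ms))
    entries = {}
    for i, x in es.items():
        for j, y in et.items():
            idx = tuple(a * m + b for a, b, m in zip(i, j, ms))
            entries[idx] = x * y
    return shape, entries

def direct_sum(s, t):
    """Direct sum: s in the leading block, t in the shifted block, 0 elsewhere."""
    (ns, es), (ms, et) = s, t
    shape = tuple(n + m for n, m in zip(ns, ms))
    entries = dict(es)
    for j, y in et.items():
        entries[tuple(n + b for n, b in zip(ns, j))] = y
    return shape, entries

product_23 = kron(diagonal(2), diagonal(3))
sum_23 = direct_sum(diagonal(2), diagonal(3))
```

With this indexing, the product of the diagonal tensors of sizes 2 and 3 is the diagonal tensor of size 6, and their direct sum is the diagonal tensor of size 5, in line with multiplication and addition of natural numbers.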

The asymptotic restriction problem asks to compute the infimum of all real numbers $\beta \geq 0$ such that for all $n \in \mathbb{N}$

$$s^{\otimes \beta n + o(n)} \geq t^{\otimes n}.$$

We may think of the asymptotic restriction problem as having two directions, namely to find

1. obstructions, “certificates” that prohibit $s^{\otimes \beta n + o(n)} \geq t^{\otimes n}$, or

2. constructions, linear maps that carry out $s^{\otimes \beta n + o(n)} \geq t^{\otimes n}$.

Ideally, we would like to find matching obstructions and constructions, so that we indeed learn the value of $\beta$.

What do obstructions look like? We set $\beta$ equal to one; it turns out that it is sufficient to understand this case. We say $s$ restricts asymptotically to $t$, and write $s \gtrsim t$, if

$$s^{\otimes n + o(n)} \geq t^{\otimes n}.$$

What do obstructions look like for asymptotic restriction $\gtrsim$? More precisely, what do obstructions look like for $\gtrsim$ restricted to a subset $S \subseteq \{k\text{-tensors over } \mathbb{F}\}$? Let us assume $S$ is closed under direct sum and tensor product, and contains the diagonal tensors $\langle n \rangle = \sum_{i=1}^{n} e_i \otimes \cdots \otimes e_i$ for $n \in \mathbb{N}$, where $e_1, \ldots, e_n$ is the standard basis of $\mathbb{F}^n$. Let $X(S)$ be the set of all maps $\phi : S \to \mathbb{R}_{\geq 0}$ that are

(a) monotone under restriction $\geq$,

(b) multiplicative under the tensor (Kronecker) product $\otimes$,

(c) additive under the direct sum $\oplus$,

(d) normalised to $\phi(\langle n \rangle) = n$ at the diagonal tensor $\langle n \rangle$.

The elements $\phi \in X(S)$ are called spectral points of $S$. The set $X(S)$ is called the asymptotic spectrum of $S$.

Spectral points $\phi \in X(S)$ are obstructions. Let $s, t \in S$. If $s \gtrsim t$, then by definition we have a restriction $s^{\otimes n + o(n)} \geq t^{\otimes n}$. Then (a) and (b) imply the inequality $\phi(s)^{n + o(n)} = \phi(s^{\otimes n + o(n)}) \geq \phi(t^{\otimes n}) = \phi(t)^n$. Taking $n$th roots and letting $n \to \infty$, this implies $\phi(s) \geq \phi(t)$. We negate that statement: if $\phi(s) < \phi(t)$, then not $s \gtrsim t$. In that case, $\phi$ is an obstruction to $s \gtrsim t$.

The remarkable fact is that $X(S)$ is a complete set of obstructions for $\gtrsim$. Namely, for $s, t \in S$, the asymptotic restriction $s \gtrsim t$ holds if and only if we have $\phi(s) \geq \phi(t)$ for all spectral points $\phi \in X(S)$. This was proven by Volker Strassen in [Str86, Str88]. His proof uses a theorem of Becker and Schwarz [BS83], which is commonly referred to as the Kadison–Dubois theorem (for historical reasons) or the real representation theorem. (We will say more about this completeness result in Section 1.4.)

Let us introduce tensor rank and subrank, and their asymptotic versions. The tensor rank of $t$ is the size of the smallest diagonal tensor that restricts to $t$, $R(t) = \min\{r \in \mathbb{N} : t \leq \langle r \rangle\}$, and the subrank of $t$ is the size of the largest diagonal tensor to which $t$ restricts, $Q(t) = \max\{r \in \mathbb{N} : \langle r \rangle \leq t\}$. Asymptotic rank is defined as $\tilde{R}(t) = \lim_{n \to \infty} R(t^{\otimes n})^{1/n}$ and asymptotic subrank is defined as $\tilde{Q}(t) = \lim_{n \to \infty} Q(t^{\otimes n})^{1/n}$. From Fekete’s lemma it follows that $\tilde{Q}(t) = \sup_n Q(t^{\otimes n})^{1/n}$ and $\tilde{R}(t) = \inf_n R(t^{\otimes n})^{1/n}$. One easily verifies that every spectral point $\phi \in X(S)$ is an upper bound on asymptotic subrank and a lower bound on asymptotic rank for any tensor $t \in S$:

$$\tilde{Q}(t) \leq \phi(t) \leq \tilde{R}(t).$$

Strassen used the completeness of $X(S)$ for $\lesssim$ to prove $\tilde{Q}(t) = \min_{\phi \in X(S)} \phi(t)$ and $\tilde{R}(t) = \max_{\phi \in X(S)} \phi(t)$. One should think of these expressions as being dual to the defining expressions for $\tilde{Q}$ and $\tilde{R}$.

We mentioned that Strassen was motivated to study the asymptotic spectrum of tensors by the study of the complexity of matrix multiplication. The precise connection with matrix multiplication is as follows. The matrix multiplication exponent $\omega$ is characterised by the asymptotic rank $\tilde{R}(\langle 2, 2, 2 \rangle)$ of the matrix multiplication tensor

$$\langle 2, 2, 2 \rangle = \sum_{i, j, k \in [2]} e_{ij} \otimes e_{jk} \otimes e_{ki} \in \mathbb{F}^4 \otimes \mathbb{F}^4 \otimes \mathbb{F}^4$$

via $\tilde{R}(\langle 2, 2, 2 \rangle) = 2^\omega$. We know the trivial lower bound $2 \leq \omega$, see Section 4.3. We know the (nontrivial) upper bound $\omega \leq 2.3728639$, which is by Coppersmith and Winograd [CW90] and improvements by Stothers [Sto10], Williams [Wil12] and Le Gall [LG14]. It may seem that for the study of matrix multiplication only the asymptotic rank $\tilde{R}$ is of interest and that the asymptotic subrank $\tilde{Q}$ is just a toy parameter. Asymptotic subrank, however, plays an important role in the currently best matrix multiplication algorithms. We will discuss this idea in the context of the asymptotic subrank of so-called complete graph tensors in Section 5.5.
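As an illustration of how an algorithm translates into a tensor rank upper bound, the following sketch checks that the seven products of Strassen’s base routine from Section 1.1 express the matrix multiplication tensor as a sum of seven simple tensors. The coefficient arrays `U`, `V`, `W` and the index flattening `flat` are names introduced here; the coefficients themselves are read off from the formulas for $x_1, \ldots, x_7$.

```python
from itertools import product

# x_l = (sum_ij U[l][i][j] a_ij) * (sum_jk V[l][j][k] b_jk) and
# c_ik = sum_l W[l][i][k] x_l, with 0-based indices.
U = [[[1, 0], [0, 1]], [[0, 0], [1, 1]], [[1, 0], [0, 0]], [[0, 0], [0, 1]],
     [[1, 1], [0, 0]], [[-1, 0], [1, 0]], [[0, 1], [0, -1]]]
V = [[[1, 0], [0, 1]], [[1, 0], [0, 0]], [[0, 1], [0, -1]], [[-1, 0], [1, 0]],
     [[0, 0], [0, 1]], [[1, 1], [0, 0]], [[0, 0], [1, 1]]]
W = [[[1, 0], [0, 1]], [[0, 0], [1, -1]], [[0, 1], [0, 1]], [[1, 0], [1, 0]],
     [[-1, 1], [0, 0]], [[0, 0], [0, 1]], [[1, 0], [0, 0]]]

def flat(r, c):
    """Flatten a pair of indices in {0, 1} to an index in {0, 1, 2, 3}."""
    return 2 * r + c

# The tensor <2,2,2> = sum_{i,j,k} e_ij ⊗ e_jk ⊗ e_ki as a 4 x 4 x 4 array.
mamu = [[[0] * 4 for _ in range(4)] for _ in range(4)]
for i, j, k in product(range(2), repeat=3):
    mamu[flat(i, j)][flat(j, k)][flat(k, i)] = 1

# Sum of the seven simple tensors u_l ⊗ v_l ⊗ w_l; the third leg is indexed
# by (k, i), so W enters transposed.
strassen_sum = [[[0] * 4 for _ in range(4)] for _ in range(4)]
for l in range(7):
    for i, j, jj, k, kk, ii in product(range(2), repeat=6):
        strassen_sum[flat(i, j)][flat(jj, k)][flat(kk, ii)] += (
            U[l][i][j] * V[l][jj][k] * W[l][ii][kk]
        )

rank_at_most_seven = strassen_sum == mamu
```

The check confirms $R(\langle 2, 2, 2 \rangle) \leq 7$, and since asymptotic rank is at most rank, this gives $2^\omega \leq 7$, that is, $\omega \leq \log_2 7$.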

The important message is: understanding the asymptotic spectrum of tensors $X(S)$ means understanding asymptotic restriction $\lesssim$, the asymptotic subrank $\tilde{Q}$ and the asymptotic rank $\tilde{R}$ of tensors. Of course, we should now find an explicit description of $X(S)$.

Our main result regarding the asymptotic spectrum of tensors is the explicit description of an infinite family of elements in the asymptotic spectrum of all complex tensors $X(\{\text{complex } k\text{-tensors}\})$, which we call the quantum functionals (Chapter 6). Finding such an infinite family has been an open problem since the work of Strassen. Moment polytopes (studied under the name entanglement polytopes in quantum information theory [WDGC13]) play a key role here. To each tensor $t$ is associated a convex polytope $P(t)$ collecting representation-theoretic information about $t$, called the moment polytope of $t$. (See e.g. [Nes84, Bri87, WDGC13, SOK14].) The moment polytope has two important equivalent descriptions.

Quantum marginal spectra description. We begin with the description of $P(t)$ in terms of quantum marginal spectra.

Let $V$ be a (finite-dimensional) Hilbert space. In quantum information theory, a positive semidefinite hermitian operator $\rho : V \to V$ with trace one is called a density operator. The sequence of eigenvalues of a density operator $\rho$ is a probability vector. We let $\mathrm{spec}(\rho) = (p_1, \ldots, p_n)$ be the sequence of eigenvalues of $\rho$, ordered non-increasingly, $p_1 \geq \cdots \geq p_n$. Let $V_1$ and $V_2$ be Hilbert spaces. Given a density operator $\rho$ on $V_1 \otimes V_2$, the reduced density operator $\rho_1 = \mathrm{tr}_2\, \rho$ is uniquely defined by the property that $\mathrm{tr}(\rho_1 X_1) = \mathrm{tr}(\rho (X_1 \otimes \mathrm{Id}_{V_2}))$ for all operators $X_1$ on $V_1$. The operator $\rho_1$ is again a density operator. The operation $\mathrm{tr}_2$ is called the partial trace over $V_2$. In an explicit form, $\rho_1$ is given by $\langle e_i, \rho_1(e_j) \rangle = \sum_\ell \langle e_i \otimes f_\ell, \rho(e_j \otimes f_\ell) \rangle$, where the $e_i$ form a basis of $V_1$ and the $f_\ell$ form an orthonormal basis of $V_2$ (the statement is independent of basis choice).

Let each $V_i$ be a Hilbert space and consider the tensor product $V_1 \otimes V_2 \otimes V_3$. Associate with $t \in V_1 \otimes V_2 \otimes V_3$ the dual element $t^* = \langle t, \cdot \rangle \in (V_1 \otimes V_2 \otimes V_3)^*$. Then $\rho^t = t t^* / \langle t, t \rangle = t \langle t, \cdot \rangle / \langle t, t \rangle$ is a density operator on $V_1 \otimes V_2 \otimes V_3$. Viewing $\rho^t$ as a density operator on the regrouped space $V_1 \otimes (V_2 \otimes V_3)$, we may take the partial trace of $\rho^t$ over $V_2 \otimes V_3$ as described above. We denote the resulting density operator by $\rho^t_1 = \mathrm{tr}_{23}\, \rho^t$. We similarly define $\rho^t_2$ and $\rho^t_3$.
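The marginals $\rho^t_i$ can be computed directly from the entries of $t$, since $\rho^t_1$ has entries $\sum_{j,k} t_{ijk} \overline{t_{i'jk}} / \langle t, t \rangle$. The sketch below assumes real entries (so conjugation is omitted) and uses a sparse dictionary representation; the function name `marginal` and the example tensor are illustrative choices, not taken from the text.

```python
from fractions import Fraction
from itertools import product

def marginal(t, axis):
    """Reduced density matrix of rho^t = t t* / <t, t> on the given tensor leg.

    `t` maps index triples to real entries; for complex entries one factor
    would need complex conjugation, which this sketch omits.
    """
    dim = 1 + max(idx[axis] for idx in t)
    norm = sum(Fraction(v) * v for v in t.values())
    rho = [[Fraction(0)] * dim for _ in range(dim)]
    for (idx, x), (jdx, y) in product(t.items(), repeat=2):
        rest_i = idx[:axis] + idx[axis + 1:]
        rest_j = jdx[:axis] + jdx[axis + 1:]
        if rest_i == rest_j:  # partial trace: the other legs must agree
            rho[idx[axis]][jdx[axis]] += Fraction(x) * y / norm
    return rho

# Illustrative input: the tensor e1⊗e0⊗e0 + e0⊗e1⊗e0 + e0⊗e0⊗e1.
w = {(1, 0, 0): 1, (0, 1, 0): 1, (0, 0, 1): 1}
rho1 = marginal(w, 0)
```

For this example each marginal is diagonal with entries $2/3$ and $1/3$, so its spectrum can be read off without an eigenvalue computation; in general one would diagonalise the resulting matrices.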

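Let $V = V_1 \otimes V_2 \otimes V_3$. Let $G = \mathrm{GL}(V_1) \times \mathrm{GL}(V_2) \times \mathrm{GL}(V_3)$ act naturally on $V$. Let $t \in V \setminus 0$. The moment polytope of $t$ is

$$P(t) = P(\overline{G \cdot t}) = \{(\mathrm{spec}(\rho^u_1), \mathrm{spec}(\rho^u_2), \mathrm{spec}(\rho^u_3)) : u \in \overline{G \cdot t} \setminus 0\}.$$

Here $\overline{G \cdot t}$ denotes the Zariski closure, or equivalently the Euclidean closure, in $V$ of the orbit $G \cdot t = \{g \cdot t : g \in G\}$.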

Representation-theoretic description. On the other hand, there is a description of $P(t)$ in terms of non-vanishing of representation-theoretic multiplicities. We do not state this description here, but stress that it is crucial for our proofs.

Quantum functionals. For any probability vector $\theta \in \mathbb{R}^k$ (i.e. $\sum_{i=1}^{k} \theta(i) = 1$ and $\theta(i) \geq 0$ for all $i \in [k]$), we define the quantum functional $F^\theta$ as an optimisation over the moment polytope:

$$F^\theta(t) = \max\Bigl\{ 2^{\sum_{i=1}^{k} \theta(i) H(x^{(i)})} : (x^{(1)}, \ldots, x^{(k)}) \in P(t) \Bigr\}.$$

Here $H(y)$ denotes the Shannon entropy of the probability vector $y$. We prove that $F^\theta$ satisfies properties (a), (b), (c) and (d) for all complex $k$-tensors.

Theorem (Theorem 6.11). $F^\theta \in X(\{\text{complex } k\text{-tensors}\})$.
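The quantity being maximised in the definition of $F^\theta$ is easy to evaluate at a single point. The following sketch does only that; it does not compute the moment polytope, and the point used is an illustrative assumption. Because $F^\theta(t)$ is a maximum over $P(t)$, the value at any point that does lie in $P(t)$ is a lower bound on $F^\theta(t)$.

```python
from math import log2

def shannon_entropy(p):
    """Shannon entropy (in bits) of a probability vector."""
    return -sum(x * log2(x) for x in p if x > 0)

def objective(theta, point):
    """2 ** (sum_i theta[i] * H(point[i])) for a tuple of probability vectors."""
    return 2 ** sum(w * shannon_entropy(x) for w, x in zip(theta, point))

# Uniform theta and an assumed point with all three marginal spectra (2/3, 1/3).
theta = (1 / 3, 1 / 3, 1 / 3)
point = ((2 / 3, 1 / 3),) * 3
value = objective(theta, point)
```

At the point with all marginal spectra uniform on two outcomes the objective equals 2, the largest value it can take when every $x^{(i)}$ has two entries.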

To put our result into context: Strassen in [Str91] constructed elements in the asymptotic spectrum of $S = \{\text{oblique } k\text{-tensors over } \mathbb{F}\}$ with the preorder $\leq|_S$. The set $S$ is a strict and non-generic subset of all $k$-tensors over $\mathbb{F}$. These elements we call the (Strassen) support functionals. On oblique tensors over $\mathbb{C}$, the quantum functionals and the support functionals coincide. An advantage of the support functionals over the quantum functionals is that they are defined over any field. In fact, the support functionals are “powerful enough” to reprove the result of Ellenberg–Gijswijt on cap sets [EG17]. We discuss the support functionals in Section 4.4.

1.3 Higher-order CW method

Recall that in the asymptotic restriction problem we have an obstruction direction and a construction direction. The quantum functionals and the support functionals provide obstructions. Now we look at the construction direction. Constructions are asymptotic transformations $s^{\otimes \beta n + o(n)} \geq t^{\otimes n}$. We restrict attention to the case that $t$ is a diagonal tensor $\langle r \rangle$. Constructions in this case essentially correspond to lower bounds on the asymptotic subrank $\tilde{Q}(s)$. The goal is now to construct good lower bounds on $\tilde{Q}(s)$.

Strassen solved the problem of computing the asymptotic subrank for so-called tight 3-tensors with the Coppersmith–Winograd (CW) method and the support functionals [CW90, Str91]. The CW method is combinatorial. Let us introduce the combinatorial viewpoint. Let $I_1, \ldots, I_k$ be finite sets. We call a set $D \subseteq I_1 \times \cdots \times I_k$ a diagonal if any two distinct elements $a, b \in D$ differ in all $k$ coordinates. Let $\Phi \subseteq I_1 \times \cdots \times I_k$. We call a diagonal $D \subseteq \Phi$ free if $D = \Phi \cap (D_1 \times \cdots \times D_k)$. Here $D_i = \{a_i : a \in D\}$ is the projection of $D$ onto the $i$th coordinate. The subrank $Q(\Phi)$ of $\Phi$ is the size of the largest free diagonal $D \subseteq \Phi$. For two sets $\Phi \subseteq I_1 \times \cdots \times I_k$ and $\Psi \subseteq J_1 \times \cdots \times J_k$ we define the product $\Phi \times \Psi \subseteq (I_1 \times J_1) \times \cdots \times (I_k \times J_k)$ by $\Phi \times \Psi = \{((a_1, b_1), \ldots, (a_k, b_k)) : a \in \Phi, b \in \Psi\}$. The asymptotic subrank is defined as $\tilde{Q}(\Phi) = \lim_{n \to \infty} Q(\Phi^{\times n})^{1/n}$. One may think of $\Phi$ as a $k$-partite hypergraph and of a free diagonal in $\Phi$ as an induced $k$-partite matching.
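For very small sets, the combinatorial subrank can be found by exhaustive search straight from the definitions. This brute-force sketch is only meant to make the notions of diagonal, free diagonal and product concrete; the function names are introduced here and the search is exponential in the size of $\Phi$.

```python
from itertools import combinations, product

def is_free_diagonal(d, phi):
    """Check that d ⊆ phi is a diagonal and that d = phi ∩ (D1 × ... × Dk)."""
    d = set(d)
    if any(x == y for a, b in combinations(d, 2) for x, y in zip(a, b)):
        return False  # two distinct elements agree in some coordinate
    k = len(next(iter(phi)))
    proj = [{a[i] for a in d} for i in range(k)]
    return {a for a in phi if all(a[i] in proj[i] for i in range(k))} == d

def subrank(phi):
    """Size of the largest free diagonal in phi, by exhaustive search."""
    phi = set(phi)
    for size in range(len(phi), 0, -1):
        if any(is_free_diagonal(d, phi) for d in combinations(phi, size)):
            return size
    return 0

def set_product(phi, psi):
    """The product Φ × Ψ ⊆ (I1 × J1) × ... × (Ik × Jk)."""
    return {tuple(zip(a, b)) for a, b in product(phi, psi)}

diag = {(0, 0, 0), (1, 1, 1)}
not_free = {(0, 0), (1, 1), (0, 1)}
```

In `not_free` the set $\{(0,0), (1,1)\}$ is a diagonal, but it is not free because the box spanned by its projections also contains $(0,1)$; this is why freeness corresponds to an induced matching rather than just a matching.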

How does this combinatorial version of subrank relate to the tensor version of subrank that we defined earlier? Let $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$. Expand $t$ in the standard basis, $t = \sum_{i \in [n_1] \times \cdots \times [n_k]} t_i\, e_{i_1} \otimes \cdots \otimes e_{i_k}$. Let $\mathrm{supp}(t)$ be the support of $t$ in the standard basis, $\mathrm{supp}(t) = \{i \in [n_1] \times \cdots \times [n_k] : t_i \neq 0\}$. Then $Q(\mathrm{supp}(t)) \leq Q(t)$.

We want to construct large free diagonals. Let $\Phi \subseteq I_1 \times \cdots \times I_k$. We call $\Phi$ tight if there are injective maps $\alpha_i : I_i \to \mathbb{Z}$ such that if $a \in \Phi$, then $\sum_{i=1}^{k} \alpha_i(a_i) = 0$. For a set $X$, let $\mathcal{P}(X)$ be the set of probability distributions on $X$. For $\theta \in \mathcal{P}([k])$, let $H_\theta(\Phi) = \max_{P \in \mathcal{P}(\Phi)} \sum_{i=1}^{k} \theta(i) H(P_i)$, where $H(P_i)$ denotes the Shannon entropy of the $i$th marginal distribution of $P$. In [Str91], Strassen used the CW method and the support functionals to characterise the asymptotic subrank $\tilde{Q}(\Phi)$ for tight $\Phi \subseteq I_1 \times I_2 \times I_3$. He proved the following. Let $\Phi \subseteq I_1 \times I_2 \times I_3$ be tight. Then

$$\tilde{Q}(\Phi) = \min_{\theta \in \mathcal{P}([3])} 2^{H_\theta(\Phi)} = \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} 2^{H(P_i)}. \tag{1.1}$$
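The ingredients of (1.1) can be tried out on a small example. The sketch below checks a candidate tightness witness and evaluates $\min_i 2^{H(P_i)}$ for the uniform distribution only; the set `phi`, the maps `alphas` and the function names are assumptions chosen for illustration. Since the right-hand side of (1.1) is a maximum over all $P$, the uniform distribution yields a lower bound on it, not necessarily its value.

```python
from collections import Counter
from math import log2

def is_tight_witness(phi, alphas):
    """Check that the maps alphas[i] (dicts I_i -> Z) are injective and that
    sum_i alphas[i][a_i] == 0 for every a in phi."""
    injective = all(len(set(al.values())) == len(al) for al in alphas)
    return injective and all(
        sum(al[x] for al, x in zip(alphas, a)) == 0 for a in phi
    )

def entropy(counts):
    """Shannon entropy (in bits) of the empirical distribution of the counts."""
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())

def min_marginal_value(phi):
    """min_i 2 ** H(P_i) for the uniform distribution P on phi."""
    k = len(next(iter(phi)))
    return min(2 ** entropy(Counter(a[i] for a in phi)) for i in range(k))

phi = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}
alphas = [{0: 0, 1: 1}, {0: 0, 1: 1}, {0: -1, 1: 0}]
lower_bound = min_marginal_value(phi)
```

Here the three maps send each element of `phi` to a triple of integers summing to zero, so `phi` is tight, and the uniform distribution has all marginals equal to $(2/3, 1/3)$.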

We study the higher-order regime $\Phi \subseteq I_1 \times \cdots \times I_k$, $k \geq 4$.

Theorem (Theorem 5.7). Let $\Phi \subseteq I_1 \times \cdots \times I_k$ be tight. Then $\tilde{Q}(\Phi)$ is lower bounded by an expression that generalizes the right-hand side of (1.1).

Stating the lower bound requires a few definitions, so we do not state it here. It is not known whether our new lower bound matches the upper bound given by quantum or support functionals.

Using Theorem 5.7, we managed to exactly determine the asymptotic subranks of several new examples. These results, in turn, we used to obtain upper bounds on the asymptotic rank of so-called complete graph tensors, via a higher-order Strassen laser method.

1.4 Abstract asymptotic spectra

Strassen mainly studied tensors, but he developed an abstract theory of asymptotic spectra in a general setting. In the next section we apply this abstract theory to graphs. We now introduce the abstract theory. One has a semiring $S$ (think of a semiring as a ring without additive inverses) that contains $\mathbb{N}$, and a preorder $\leq$ on $S$ that (1) behaves well with respect to the semiring operations, (2) induces the natural order on $\mathbb{N}$, and (3) has the property that for any $a, b \in S$, $b \neq 0$, there is an $r \in \mathbb{N} \subseteq S$ with $a \leq r \cdot b$. We call such a preorder a Strassen preorder. The main theorem is that the asymptotic version $\lesssim$ of the Strassen preorder is characterised by the monotone semiring homomorphisms $S \to \mathbb{R}_{\geq 0}$. For $a, b \in S$, let $a \lesssim b$ if there is a sequence $(x_n) \in \mathbb{N}^{\mathbb{N}}$ with $x_n^{1/n} \to 1$ when $n \to \infty$ and $a^n \leq b^n x_n$ for all $n \in \mathbb{N}$. Let

$$X = X(S, \leq) = \{\phi \in \mathrm{Hom}(S, \mathbb{R}_{\geq 0}) : \forall a, b \in S,\ a \leq b \Rightarrow \phi(a) \leq \phi(b)\}.$$

The set $X$ is called the asymptotic spectrum of $(S, \leq)$.

Theorem (Strassen). $a \lesssim b$ iff $\forall \phi \in X$, $\phi(a) \leq \phi(b)$.

Strassen applies this theorem to study rank and subrank of tensors. We define an abstract notion of rank $R(a) = \min\{n \in \mathbb{N} : a \leq n\}$ and an abstract notion of subrank $Q(a) = \max\{m \in \mathbb{N} : m \leq a\}$. We then naturally have an asymptotic rank $\tilde{R}(a) = \lim_{n \to \infty} R(a^n)^{1/n}$ and (under certain mild conditions) an asymptotic subrank $\tilde{Q}(a) = \lim_{n \to \infty} Q(a^n)^{1/n}$. In fact, $\tilde{R}(a) = \inf_n R(a^n)^{1/n}$ and $\tilde{Q}(a) = \sup_n Q(a^n)^{1/n}$ by Fekete’s lemma. The theorem implies the following dual characterisations.

Corollary (Section 2.8). If $a \in S$ with $a^k \geq 2$ for some $k \in \mathbb{N}$, then

$$\tilde{Q}(a) = \min_{\phi \in X} \phi(a).$$

If $a \in S$ with $\phi(a) \geq 1$ for some $\phi \in X$, then

$$\tilde{R}(a) = \max_{\phi \in X} \phi(a).$$

In Chapter 2 we will discuss the abstract theory of asymptotic spectra. We will discuss a proof of the above theorem that is obtained by integrating the proofs of Strassen in [Str88] and the proof of the Kadison–Dubois theorem of Becker and Schwarz in [BS83]. We will also discuss some basic properties of general asymptotic spectra.

15 The asymptotic spectrum of graphs

In the previous section we have seen the abstract theory of asymptotic spectraWe now discuss a problem in graph theory where we can apply this abstracttheory Consider a communication channel with input alphabet a b c d e andoutput alphabet 1 2 3 4 5 When the sender gives an input to the channel thereceiver gets an output according to the following diagram where an outgoingarrow is picked randomly (say uniformly randomly)

[Figure: channel diagram with arrows from the inputs a, b, c, d, e to the outputs 1, 2, 3, 4, 5.]

Output 2 has an incoming arrow from $a$ and an incoming arrow from $b$. We say $a$ and $b$ are confusable, because the receiver cannot know whether $a$ or $b$ was given as an input to the channel. In this channel the pairs of inputs $\{a, b\}$, $\{b, c\}$, $\{c, d\}$, $\{d, e\}$, $\{e, a\}$ are confusable. If we restrict the input set to a subset of pairwise non-confusable letters, say $\{a, c\}$, then we can use the channel to communicate two messages with zero error. It is clear that for this channel any non-confusable set of inputs has size at most two. Can we make better use of the channel if we use the channel twice? Yes: now the input set is the set of two-letter words $\{aa, ab, ac, ad, ae, ba, bb, \ldots\}$ and we have a set of pairwise non-confusable words $\{aa, bc, ce, db, ed\}$, which has size 5. Thus "per channel use" we can send at least $\sqrt{5}$ letters. What happens if we use the channel $n$ times?
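The claim that the five two-letter words are pairwise non-confusable can be checked mechanically. The following is a minimal sketch, assuming that two distinct words are confusable exactly when, in every position, their letters are equal or confusable; the function names are illustrative and not taken from the text.

```python
from itertools import combinations

# Letters in cyclic order; consecutive letters (and e, a) are confusable.
LETTERS = "abcde"

def letters_clash(x, y):
    """True if two letters are equal or confusable (adjacent on the 5-cycle)."""
    d = (LETTERS.index(x) - LETTERS.index(y)) % 5
    return d in (0, 1, 4)

def words_confusable(u, v):
    """Two distinct words are confusable if every position holds
    equal or confusable letters (assumed rule for repeated channel use)."""
    return u != v and all(letters_clash(x, y) for x, y in zip(u, v))

code = ["aa", "bc", "ce", "db", "ed"]
pairwise_ok = not any(words_confusable(u, v) for u, v in combinations(code, 2))
print(pairwise_ok)
```

Each pair of code words differs in at least one position by two steps on the cycle, which is what the check confirms.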


The situation is concisely described by drawing the confusability graph of the channel, which has the input letters as vertices and the confusable pairs of input letters as edges. For the above channel the confusability graph is the 5-cycle $C_5$.

[Figure: the 5-cycle $C_5$ on the vertices a, b, c, d, e.]

A subset of inputs that are pairwise non-confusable corresponds to a subset of the vertices in the confusability graph that contains no edges: an independent set. The independence number of any graph $G$ is the size of the largest independent set in $G$ and is denoted by $\alpha(G)$. If $G$ is the confusability graph of some channel, then the confusability graph for using the channel $n$ times is denoted by $G^{\boxtimes n}$ (the graph product $\boxtimes$ is called the strong graph product). The question of how many letters we can send asymptotically translates to computing the limit

$$\Theta(G) = \lim_{n \to \infty} \alpha(G^{\boxtimes n})^{1/n},$$

which exists because $\alpha$ is supermultiplicative under $\boxtimes$. The parameter $\Theta(G)$ was introduced by Shannon [Sha56] and is called the Shannon capacity of the graph $G$. Computing the Shannon capacity is a nontrivial problem already for small graphs. Lovász in 1979 [Lov79] computed the value $\Theta(C_5) = \sqrt{5}$ by introducing and evaluating a new graph parameter $\vartheta$, which is now known as the Lovász theta number. Already for the 7-cycle $C_7$ the Shannon capacity is not known.
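For very small cases the independence numbers entering the limit can be computed by exhaustive search. The sketch below assumes the vertices of $C_5$ are numbered $0, \ldots, 4$ and uses a naive branching search (not an efficient algorithm, and not a method from the text) to obtain $\alpha(C_5)$ and $\alpha(C_5^{\boxtimes 2})$.

```python
from itertools import product

def close(i, j, n=5):
    """Equal or adjacent on the n-cycle."""
    return (i - j) % n in (0, 1, n - 1)

def adjacent(u, v, n=5):
    """Adjacency in a strong power of the n-cycle (vertices are tuples)."""
    return u != v and all(close(a, b, n) for a, b in zip(u, v))

def independence_number(vertices):
    """Exact independence number by include/exclude branching."""
    best = 0

    def extend(size, candidates):
        nonlocal best
        if size + len(candidates) <= best:
            return  # cannot beat the best set found so far
        if not candidates:
            best = size
            return
        v, rest = candidates[0], candidates[1:]
        extend(size + 1, [w for w in rest if not adjacent(v, w)])  # take v
        extend(size, rest)                                          # skip v

    extend(0, list(vertices))
    return best

alpha_1 = independence_number(product(range(5), repeat=1))
alpha_2 = independence_number(product(range(5), repeat=2))
print(alpha_1, alpha_2, alpha_2 ** 0.5)
```

The second value gives the lower bound $\alpha(C_5^{\boxtimes 2})^{1/2}$ on $\Theta(C_5)$; the search becomes infeasible quickly as the power grows.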

Duality theorem. We propose a new application of the abstract theory of asymptotic spectra to graph theory. The main theorem that results from this is a dual characterisation of the Shannon capacity of graphs. For graphs $G$ and $H$ we say $G \leqslant H$ if there is a graph homomorphism $\overline{G} \to \overline{H}$, i.e. from the complement of $G$ to the complement of $H$. We show graphs are a semiring under the strong graph product $\boxtimes$ and the disjoint union $\sqcup$, and $\leqslant$ is a Strassen preorder on this semiring. The rank in this setting is the clique cover number $\overline{\chi}(\cdot) = \chi(\overline{\,\cdot\,})$, i.e. the chromatic number of the complement. The subrank in this setting is the independence number $\alpha(\cdot)$. Let $X(\mathcal{G})$ be the set of semiring homomorphisms from graphs to $\mathbb{R}_{\geq 0}$ that are monotone under $\leqslant$. From the abstract theory of asymptotic spectra we derive the following duality theorem.

Theorem (Theorem 3.1). $\Theta(G) = \min_{\phi \in X(\mathcal{G})} \phi(G)$.

In Chapter 3 we will prove Theorem 3.1 and we will discuss the known elements in $X(\mathcal{G})$, which are the Lovász theta number and a family of parameters obtained by "fractionalising".


1.6 Tensor degeneration

We move to the second story line that we mentioned earlier: degeneration. Degeneration is a prominent theme in algebraic complexity theory. Roughly speaking, degeneration is an algebraic notion of approximation defined via orbit closures.

For tensors, for example, degeneration is defined as follows. Let $V_1, V_2, V_3$ be finite-dimensional complex vector spaces and let $V = V_1 \otimes V_2 \otimes V_3$ be the tensor product space. Let $G = \mathrm{GL}(V_1) \times \mathrm{GL}(V_2) \times \mathrm{GL}(V_3)$ act naturally on $V$. Let $s, t \in V$. Let $G \cdot t = \{g \cdot t : g \in G\}$ be the orbit of $t$ under $G$. We say $t$ degenerates to $s$, and write $t \trianglerighteq s$, if $s$ is an element in the orbit closure $\overline{G \cdot t}$. Here the closure is taken with respect to the Zariski topology or, equivalently, with respect to the Euclidean topology. One should think of this degeneration as a topologically closed version of the restriction preorder $\leq$ for tensors that we defined earlier. Degeneration is a "larger" preorder than restriction, in the sense that $s \leq t$ implies $s \trianglelefteq t$.

In several algebraic models of computation, approximative computations correspond to certain degenerations. In some models such an approximative computation can be turned into an exact computation at a small cost, for example using the method of interpolation. The currently fastest matrix multiplication algorithms are constructed in this way, for example.

On the other hand, it turns out that if a lower bound technique for an algebraic measure of complexity is "continuous", then the lower bounds obtained with this technique are already lower bounds on the approximative version of the complexity measure. This observation turns approximative complexity and degeneration into an interesting topic itself. A research program in this direction is the geometric complexity theory program of Mulmuley and Sohoni towards separating the algebraic complexity class VP (and related classes) from VNP [MS01] (see also [Ike13]).

In this section we briefly discuss three results related to degeneration of tensors that are not discussed further in this dissertation. Then we will discuss results on combinatorial degeneration in Section 1.7 and algebraic branching program degeneration in Section 1.8.

Ratio of tensor rank and border rank. The approximative or degeneration version of tensor rank is called border rank and is denoted by $\underline{R}$. It has been known since the work of Bini and Strassen that tensor rank $R$ and border rank $\underline{R}$ are different. How much can they be different? In [Zui17] we showed the following lower bound. Let $k \geq 3$. There is a sequence of $k$-tensors $t_n$ in $(\mathbb{C}^{2^n})^{\otimes k}$ such that $R(t_n)/\underline{R}(t_n) \geq k - o(1)$ when $n \to \infty$. This answers a question of Landsberg and Michałek [LM16b] and disproves a conjecture of Rhodes [AJRS13]. Further progress will most likely require the construction of explicit tensors with high tensor rank, which has implications in formula complexity [Raz13].

Border support rank. Support rank is a variation on tensor rank, which has its own approximative version called border support rank. A border support rank upper bound for the matrix multiplication tensor yields an upper bound on the asymptotic complexity. This was shown by Cohn and Umans in the context of the group theoretic approach towards fast matrix multiplication [CU13]. They asked: what is the border support rank of the smallest matrix multiplication tensor $\langle 2, 2, 2 \rangle$? In [BCZ17a] we showed that it equals seven. Our proof uses the highest-weight vector technique (see also [HIL13]). Our original motivation to study support rank is a connection that we found between support rank and nondeterministic multiparty quantum communication complexity [BCZ17b].

Tensor rank under outer tensor product. We applied degeneration as a tool to study an outer tensor product $\bar{\otimes}$ on tensors. For $s \in \mathbb{C}^{n_1} \otimes \cdots \otimes \mathbb{C}^{n_k}$ and $t \in \mathbb{C}^{m_1} \otimes \cdots \otimes \mathbb{C}^{m_\ell}$, let $s \mathbin{\bar{\otimes}} t$ be the natural $(k + \ell)$-tensor in $\mathbb{C}^{n_1} \otimes \cdots \otimes \mathbb{C}^{n_k} \otimes \mathbb{C}^{m_1} \otimes \cdots \otimes \mathbb{C}^{m_\ell}$. The outer product $\bar{\otimes}$ and the product $\otimes$ of $k$-tensors differ by a regrouping of the tensor indices. It is well known that tensor rank is not multiplicative under $\otimes$. In [CJZ18] we showed that tensor rank is already not multiplicative under $\bar{\otimes}$, a stronger result. Nonmultiplicativity occurs when taking a power of a tensor whose border rank is strictly smaller than its tensor rank. This answers a question of Draisma [Dra15] and Saptharishi et al. [CKSV16].

1.7 Combinatorial degeneration

In the previous section we introduced the general idea of degeneration and discussed degeneration of tensors. Combinatorial degeneration is the combinatorial analogue of tensor degeneration. Consider sets $\Phi \subseteq \Psi \subseteq I_1 \times \cdots \times I_k$ of $k$-tuples. We say $\Phi$ is a combinatorial degeneration of $\Psi$, and write $\Psi \trianglerighteq \Phi$, if there are maps $u_i : I_i \to \mathbb{Z}$ such that for all $\alpha \in I_1 \times \cdots \times I_k$: if $\alpha \in \Psi \setminus \Phi$, then $\sum_{i=1}^k u_i(\alpha_i) > 0$, and if $\alpha \in \Phi$, then $\sum_{i=1}^k u_i(\alpha_i) = 0$. We prove that combinatorial asymptotic subrank is nonincreasing under combinatorial degeneration.

Theorem (Theorem 5.21). If $\Psi \trianglerighteq \Phi$, then $\tilde{Q}(\Psi) \geq \tilde{Q}(\Phi)$.

The analogous statement for subrank of tensors is trivially true. The crucial point is that Theorem 5.21 is about combinatorial subrank. As an example, Theorem 5.21 combined with the CW method yields an elegant optimal construction of tri-colored sum-free sets, which are combinatorial objects related to cap sets.
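The two conditions in the definition of combinatorial degeneration are easy to verify for given maps $u_i$. The following sketch checks them for an illustrative example that is an assumption of this note, not taken from the text: $\Psi$ is the set of triples in $\{0, 1, 2\}^3$ with coordinate sum 2, $\Phi$ consists of the three permutations of $(0, 1, 1)$, and each $u_i$ sends 2 to 1 and the other indices to 0.

```python
from itertools import product

def is_combinatorial_degeneration(psi, phi, maps):
    """Check the defining conditions for the given maps u_1, ..., u_k,
    each represented as a dict from the index set I_i to the integers."""
    psi, phi = set(psi), set(phi)
    if not phi <= psi:
        return False

    def weight(alpha):
        return sum(u[x] for u, x in zip(maps, alpha))

    return (all(weight(alpha) == 0 for alpha in phi)
            and all(weight(alpha) > 0 for alpha in psi - phi))

# Illustrative example (hypothetical, chosen for this sketch).
psi = [t for t in product(range(3), repeat=3) if sum(t) == 2]
phi = [(0, 1, 1), (1, 0, 1), (1, 1, 0)]
u = {0: 0, 1: 0, 2: 1}

print(is_combinatorial_degeneration(psi, phi, [u, u, u]))
```

The function only verifies a proposed certificate $(u_1, \ldots, u_k)$; finding such maps for given $\Phi \subseteq \Psi$ is a separate search problem.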

1.8 Algebraic branching program degeneration

We now consider degeneration in the context of algebraic branching programs. A central theme in algebraic complexity theory is the study of the power of different algebraic models of computation and the study of the corresponding complexity classes. We have already (implicitly) used an algebraic model of computation when we discussed matrix multiplication: circuits.


• A circuit is a directed acyclic graph $G$ with one or more source vertices and one sink vertex. Each source vertex is labelled by a variable $x_i$ or a constant $\alpha \in \mathbb{F}$. The other vertices are labelled by either $+$ or $\times$ and have in-degree 2 (that is, fan-in 2). Each vertex of $G$ naturally computes a polynomial. The value of $G$ is the element computed at the sink vertex. The size of $G$ is the number of vertices. (One may also allow multiple sink vertices in order to compute multiple polynomials, e.g. to compute matrix multiplication.) Here is an example of a circuit computing $xy + 2x + y - 1$.

[Figure: circuit with source vertices $-1$, $2$, $x$, $y$, two $\times$ gates, two $+$ gates, and a final $+$ gate as the sink vertex.]
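To make the circuit model concrete, here is a minimal evaluator for a fan-in-2 circuit given as a topologically ordered gate list. The wiring below is a hypothetical one with four source vertices, two $\times$ gates and three $+$ gates that computes $xy + 2x + y - 1$; it is not necessarily the wiring of the circuit pictured in the text.

```python
import operator

# Each gate: (name, operation, left input, right input), in topological order.
# Hypothetical wiring chosen for this sketch.
CIRCUIT = [
    ("g1", "*", "x", "y"),     # xy
    ("g2", "*", "2", "x"),     # 2x
    ("g3", "+", "g1", "g2"),   # xy + 2x
    ("g4", "+", "y", "-1"),    # y - 1
    ("out", "+", "g3", "g4"),  # sink vertex
]

def evaluate(circuit, x, y):
    """Evaluate the circuit at a point; the last gate is the sink."""
    values = {"x": x, "y": y, "2": 2, "-1": -1}  # source vertices
    ops = {"+": operator.add, "*": operator.mul}
    for name, op, left, right in circuit:
        values[name] = ops[op](values[left], values[right])
    return values[circuit[-1][0]]

size = 4 + len(CIRCUIT)  # source vertices plus gates
print(evaluate(CIRCUIT, 3, 5), size)
```

Evaluating at points only tests the computed polynomial numerically; the gate list itself is the object whose size the model measures.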

Consider the following two models.

• A formula is a circuit whose graph is a tree.

• An algebraic branching program (abp) is a directed acyclic graph $G$ with one source vertex $s$, one sink vertex $t$, and affine linear forms over the base field $\mathbb{F}$ as edge labels. Moreover, each vertex is labelled with an integer (its layer) and the arrows in the abp point from vertices in layer $i$ to vertices in layer $i + 1$. The cardinality of the largest layer we call the width of the abp. The number of vertices we call the size of the abp. The value of an abp is the sum of the values of all $s$–$t$-paths, where the value of an $s$–$t$-path is the product of its edge labels. We say that an abp computes its value. Here is an example of a width-3 abp computing $xy + 2x + y - 1$.

[Figure: width-3 abp from $s$ to $t$; the edge labels visible in the extraction are $x$, $2$, $x$, $y$, $-1$.]
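The value of an abp can be computed layer by layer, since the sum over all $s$–$t$-paths of the product of edge labels is a product of layer-to-layer transfer steps. The sketch below uses a hypothetical width-3 abp for $xy + 2x + y - 1$ (the edge labels are an assumption of this note, not read off the figure), with each affine linear form stored as a coefficient triple.

```python
def affine(form, x, y):
    """Evaluate the affine linear form cx*x + cy*y + c0."""
    cx, cy, c0 = form
    return cx * x + cy * y + c0

# Hypothetical abp: three s-t paths with values x*y, 2*x and 1*(y - 1).
LAYERS = [["s"], ["v1", "v2", "v3"], ["t"]]
EDGES = {
    ("s", "v1"): (1, 0, 0),    # x
    ("v1", "t"): (0, 1, 0),    # y
    ("s", "v2"): (0, 0, 2),    # 2
    ("v2", "t"): (1, 0, 0),    # x
    ("s", "v3"): (0, 0, 1),    # 1
    ("v3", "t"): (0, 1, -1),   # y - 1
}

def abp_value(layers, edges, x, y):
    """Sum over all s-t paths of the product of the edge labels."""
    value = {layers[0][0]: 1}  # the empty path at the source has value 1
    for prev, cur in zip(layers, layers[1:]):
        for v in cur:
            value[v] = sum(value[u] * affine(edges[(u, v)], x, y)
                           for u in prev if (u, v) in edges)
    return value[layers[-1][0]]

width = max(len(layer) for layer in LAYERS)
size = sum(len(layer) for layer in LAYERS)
print(abp_value(LAYERS, EDGES, 3, 5), width, size)
```

The layer-by-layer sweep is the same as multiplying the transfer matrices between consecutive layers, which is why abps are often described as iterated matrix products.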


The above models of computation give rise to complexity classes. A complexity class consists of families of multivariate polynomials $(f_n)_n = (f(x_1, \ldots, x_{q_n})_n)_{n \in \mathbb{N}}$ over some fixed field $\mathbb{F}$. We say a family of polynomials $(f_n)_n$ is a p-family if the degree of $f_n$ and the number of variables of $f_n$ grow polynomially in $n$. Let $\mathrm{VP}$ be the class of p-families with polynomially bounded circuit size. Let $\mathrm{VP}_e$ be the class of p-families with polynomially bounded formula size. For $k \in \mathbb{N}$ let $\mathrm{VP}_k$ be the class of families of polynomials computable by width-$k$ abps of polynomially bounded size. Let $\mathrm{VP}_s$ be the class of p-families computable by skew circuits of polynomial size. Skew circuits are a type of circuits between formulas and general circuits. The class $\mathrm{VP}_s$ coincides with the class of families of polynomials computable by abps of polynomially bounded size (see e.g. [Sap16]). Ben-Or and Cleve proved that $\mathrm{VP}_3 = \mathrm{VP}_4 = \cdots = \mathrm{VP}_e$ [BOC92]. Allender and Wang proved $\mathrm{VP}_2 \subsetneq \mathrm{VP}_3$ [AW16]. Thus $\mathrm{VP}_2 \subsetneq \mathrm{VP}_3 = \mathrm{VP}_4 = \cdots = \mathrm{VP}_e \subseteq \mathrm{VP}_s$. The following separation problem is one of the many open problems regarding algebraic complexity classes: is the inclusion $\mathrm{VP}_e \subseteq \mathrm{VP}_s$ strict? Motivated by this separation problem we study the approximation closure of $\mathrm{VP}_e$. We mentioned that Ben-Or and Cleve proved that formula size is polynomially equivalent to width-3 abp size [BOC92]. Regarding width 2, there are explicit polynomials that cannot be computed by any width-2 abp of any size [AW16]. The abp model has a natural notion of approximation. When we allow approximation in our abps, the situation changes completely.

Theorem (Theorem 7.8). Any polynomial can be approximated by a width-2 abp of size polynomial in the formula size.

In terms of complexity classes this means $\overline{\mathrm{VP}_2} = \overline{\mathrm{VP}_e}$, where $\overline{\,\cdot\,}$ denotes the "approximation closure" of the complexity class. The theorem suggests an approach regarding the separation of $\mathrm{VP}_e$ and $\mathrm{VP}_s$. Namely, superpolynomial lower bounds on formula size may be obtained from superpolynomial lower bounds on approximate width-2 abp size. We moreover study the nondeterminism closure of complexity classes and prove a new characterisation of the complexity class VNP.

1.9 Organisation

This dissertation is divided into chapters as follows. We will begin with the abstract theory of asymptotic spectra in Chapter 2. Then we introduce the asymptotic spectra of graphs and a new characterisation of the Shannon capacity in Chapter 3. In Chapter 4 we introduce the asymptotic spectrum of tensors, discuss the support functionals of Strassen for oblique tensors, and a characterisation of asymptotic slice rank of oblique tensors as the minimum over the support functionals. In Chapter 5 we discuss tight tensors, the higher-order Coppersmith–Winograd method, the combinatorial degeneration method, and applications to the cap set problem, type sets and graph tensors. In Chapter 6 we introduce an infinite family of elements in the asymptotic spectrum of complex $k$-tensors and characterise the asymptotic slice rank as the minimum over the quantum functionals. Finally, in Chapter 7 we study algebraic branching programs, and approximation closure and nondeterminism closure of algebraic complexity classes.

Chapter 2

The theory of asymptotic spectra

2.1 Introduction

This is an expository chapter about the abstract theory of asymptotic spectra of Volker Strassen [Str88]. The theory studies semirings $S$ that are endowed with a preorder $\leqslant$. The main result, Theorem 2.12, is that under certain conditions the asymptotic version $\lesssim$ of this preorder is characterised by the semiring homomorphisms $S \to \mathbb{R}_{\geq 0}$ that are monotone under $\leqslant$. These monotone homomorphisms make up the "asymptotic spectrum" of $(S, \leqslant)$. For the elements of $S$ we have natural notions of rank and subrank, generalising rank and subrank of tensors. The asymptotic spectrum gives a dual characterisation of the asymptotic versions of rank and subrank. This dual description may be thought of as a "lower bound" method in the sense of computational complexity theory. In Chapter 3 and Chapter 4 we will study two specific pairs $(S, \leqslant)$.

2.2 Semirings and preorders

A (commutative) semiring is a set $S$ with a binary addition operation $+$, a binary multiplication operation $\cdot$, and elements $0, 1 \in S$, such that for all $a, b, c \in S$:

(1) $+$ is associative: $(a + b) + c = a + (b + c)$

(2) $+$ is commutative: $a + b = b + a$

(3) $0 + a = a$

(4) $\cdot$ is associative: $(a \cdot b) \cdot c = a \cdot (b \cdot c)$

(5) $\cdot$ is commutative: $a \cdot b = b \cdot a$

(6) $1 \cdot a = a$

(7) $\cdot$ distributes over $+$: $a \cdot (b + c) = (a \cdot b) + (a \cdot c)$

(8) $0 \cdot a = 0$.

As usual we abbreviate $a \cdot b$ as $ab$. A preorder is a relation $\preccurlyeq$ on a set $X$ such that for all $a, b, c \in X$:

(1) $\preccurlyeq$ is reflexive: $a \preccurlyeq a$

(2) $\preccurlyeq$ is transitive: $a \preccurlyeq b$ and $b \preccurlyeq c$ implies $a \preccurlyeq c$.

As usual, $a \preccurlyeq b$ is the same as $b \succcurlyeq a$. Let $\mathbb{N} = \{0, 1, 2, \ldots\}$ be the set of natural numbers and let $\mathbb{N}_{>0} = \{1, 2, \ldots\}$ be the set of strictly positive natural numbers. We write $\leq$ for the natural order $0 \leq 1 \leq 2 \leq 3 \leq \cdots$ on $\mathbb{N}$.

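2.3 Strassen preorders

Let $S$ be a semiring with $\mathbb{N} \subseteq S$. A preorder $\preccurlyeq$ on $S$ is a Strassen preorder if

(1) $\forall n, m \in \mathbb{N}$: $n \leq m$ iff $n \preccurlyeq m$

(2) $\forall a, b, c, d \in S$: if $a \preccurlyeq b$ and $c \preccurlyeq d$, then $a + c \preccurlyeq b + d$ and $ac \preccurlyeq bd$

(3) $\forall a, b \in S$, $b \neq 0$, $\exists r \in \mathbb{N}$: $a \preccurlyeq rb$.

Note that condition (2) is equivalent to the condition: $\forall a, b, s \in S$, if $a \preccurlyeq b$, then $a + s \preccurlyeq b + s$ and $as \preccurlyeq bs$.

Let $\preccurlyeq$ be a Strassen preorder on $S$. Then $0 \preccurlyeq 1$ by condition (1). For $a \in S$ we have $a \preccurlyeq a$ by reflexivity, and thus $0 \preccurlyeq a$ by condition (2).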

Examples

We give two examples of a semiring with a Strassen preorder. Proofs and formal definitions are given later.

Graphs. Let $S$ be the set of all (isomorphism classes of) finite simple graphs. Let $G, H \in S$. Let $G \sqcup H$ be the disjoint union of $G$ and $H$. Let $G \boxtimes H$ be the strong graph product of $G$ and $H$ (see Chapter 3). With addition $\sqcup$ and multiplication $\boxtimes$ the set $S$ becomes a semiring. The 0 in $S$ is the graph with no vertices and the 1 in $S$ is the graph with a single vertex. Let $\overline{G}$ be the complement of $G$. Define a preorder $\leqslant$ on $S$ by $G \leqslant H$ if there is a graph homomorphism $\overline{G} \to \overline{H}$. Then $\leqslant$ is a Strassen preorder. We will investigate this semiring further in Chapter 3.
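The semiring operations on graphs can be sketched in a few lines. The code below assumes the standard definition of the strong product (two distinct pairs are adjacent when each coordinate is equal or adjacent), represents a graph as a pair of a vertex list and a set of two-element edges, and compares only vertex and edge counts, which is a necessary but not sufficient check for the isomorphisms behind the semiring axioms. All names are illustrative.

```python
from itertools import combinations

def graph(vertices, edges):
    return (list(vertices), {frozenset(e) for e in edges})

def disjoint_union(G, H):
    (VG, EG), (VH, EH) = G, H
    V = [(0, v) for v in VG] + [(1, v) for v in VH]  # tag vertices by side
    E = ({frozenset(((0, u), (0, v))) for u, v in map(tuple, EG)}
         | {frozenset(((1, u), (1, v))) for u, v in map(tuple, EH)})
    return (V, E)

def strong_product(G, H):
    (VG, EG), (VH, EH) = G, H
    V = [(g, h) for g in VG for h in VH]

    def close(E, a, b):  # equal or adjacent
        return a == b or frozenset((a, b)) in E

    E = {frozenset((p, q)) for p, q in combinations(V, 2)
         if close(EG, p[0], q[0]) and close(EH, p[1], q[1])}
    return (V, E)

def counts(G):
    return (len(G[0]), len(G[1]))

ZERO = graph([], [])
ONE = graph([0], [])
K2 = graph([0, 1], [(0, 1)])
C5 = graph(range(5), [(i, (i + 1) % 5) for i in range(5)])

lhs = counts(strong_product(C5, disjoint_union(K2, ONE)))
rhs = counts(disjoint_union(strong_product(C5, K2), strong_product(C5, ONE)))
print(lhs, rhs, counts(strong_product(ONE, C5)), counts(strong_product(ZERO, C5)))
```

The printed counts agree with distributivity and with the single-vertex graph and the empty graph acting as 1 and 0.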


Tensors. Let $\mathbb{F}$ be a field. Let $k \in \mathbb{N}$. Let $S$ be the set of all $k$-tensors over $\mathbb{F}$ with arbitrary format, that is, $S = \bigcup \{\mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k} : n_1, \ldots, n_k \in \mathbb{N}\}$. For $s \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ and $t \in \mathbb{F}^{m_1} \otimes \cdots \otimes \mathbb{F}^{m_k}$, let $s \leqslant t$ if there are linear maps $A_i : \mathbb{F}^{m_i} \to \mathbb{F}^{n_i}$ with $(A_1 \otimes \cdots \otimes A_k) t = s$. We identify any $s, t \in S$ for which $s \leqslant t$ and $t \leqslant s$. Let $\oplus$ be the direct sum of $k$-tensors and let $\otimes$ be the tensor product of $k$-tensors (see Chapter 4). With addition $\oplus$ and multiplication $\otimes$ the set $S$ becomes a semiring. The 0 in $S$ is the zero tensor and the 1 in $S$ is the standard basis element $e_1 \otimes \cdots \otimes e_1 \in \mathbb{F}^1 \otimes \cdots \otimes \mathbb{F}^1$. The preorder $\leqslant$ is a Strassen preorder. We will investigate this semiring further in Chapter 4, Chapter 5 and Chapter 6.

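2.4 Asymptotic preorders $\precsim$

Definition 2.1. Let $\preccurlyeq$ be a relation on $S$. Define the relation $\precsim$ on $S$ by

$$a_2 \precsim a_1 \quad \text{if} \quad \exists (x_N) \in \mathbb{N}^{\mathbb{N}},\ \inf_N x_N^{1/N} = 1,\ \forall N \in \mathbb{N}:\ a_2^N \preccurlyeq a_1^N x_N. \tag{2.1}$$

If $\preccurlyeq$ is a Strassen preorder, then we may in (2.1) replace the infimum $\inf_N x_N^{1/N}$ by the limit $\lim_{N \to \infty} x_N^{1/N}$, since we may assume $x_{N+M} \leq x_N x_M$ (if $a_2^N \preccurlyeq a_1^N x_N$ and $a_2^M \preccurlyeq a_1^M x_M$, then $a_2^{N+M} \preccurlyeq a_1^{N+M} x_N x_M$) and then apply Fekete's lemma (Lemma 2.2).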

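Lemma 2.2 (Fekete's lemma, see [PS98, No. 98]). Let $x_1, x_2, x_3, \ldots \in \mathbb{R}_{\geq 0}$ satisfy $x_{n+m} \leq x_n + x_m$. Then $\lim_{n \to \infty} x_n/n = \inf_n x_n/n$.

Proof. Let $y = \inf_n x_n/n$. Let $\varepsilon > 0$. Let $m \in \mathbb{N}_{>0}$ with $x_m/m < y + \varepsilon$. Any $n \in \mathbb{N}$ can be written in the form $n = qm + r$, where $r$ is an integer with $0 \leq r \leq m - 1$. Set $x_0 = 0$. Then $x_n = x_{qm+r} \leq x_m + x_m + \cdots + x_m + x_r = q x_m + x_r$. Therefore

$$\frac{x_n}{n} = \frac{x_{qm+r}}{qm + r} \leq \frac{q x_m + x_r}{qm + r} = \frac{x_m}{m} \cdot \frac{qm}{qm + r} + \frac{x_r}{n}.$$

Thus

$$y \leq \frac{x_n}{n} < (y + \varepsilon) \frac{qm}{n} + \frac{x_r}{n}.$$

The claim follows because $x_r/n \to 0$ and $qm/n \to 1$ when $n \to \infty$.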
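A quick numerical illustration of Fekete's lemma: the sequence $x_n = n + \sqrt{n}$ (chosen for this sketch, not taken from the text) is subadditive because $\sqrt{n + m} \leq \sqrt{n} + \sqrt{m}$, and $x_n/n = 1 + 1/\sqrt{n}$ decreases to its infimum 1.

```python
import math

def x(n):
    """A subadditive sequence: x(n + m) <= x(n) + x(m)."""
    return n + math.sqrt(n)

# Spot-check subadditivity on a small range (with a float tolerance).
subadditive = all(x(n + m) <= x(n) + x(m) + 1e-12
                  for n in range(1, 51) for m in range(1, 51))

# The ratios x(n)/n approach the infimum 1 as n grows.
ratios = [x(n) / n for n in (1, 10, 100, 10_000, 1_000_000)]
print(subadditive, ratios)
```

In this example the infimum is not attained by any finite $n$, which shows that the lemma is a statement about the limit and not about a minimum.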

For $a_1, a_2 \in S$, if $a_1 \preccurlyeq a_2$, then clearly $a_1 \precsim a_2$.

Lemma 2.3. Let $\preccurlyeq$ be a Strassen preorder on $S$. Then $\precsim$ is a Strassen preorder on $S$, the "asymptotic preorder" corresponding to $\preccurlyeq$.

Proof. Let $a, b, c, d \in S$. We verify that $\precsim$ is a preorder.

First, reflexivity. We have $a \preccurlyeq a$, so $a^N \preccurlyeq a^N \cdot 1$, so $a \precsim a$.

Second, transitivity. Let $a \precsim b$ and $b \precsim c$. This means $a^N \preccurlyeq b^N x_N$ and $b^N \preccurlyeq c^N y_N$ with $x_N^{1/N} \to 1$ and $y_N^{1/N} \to 1$. Then $a^N \preccurlyeq b^N x_N \preccurlyeq c^N x_N y_N$. Since $(x_N y_N)^{1/N} \to 1$, we conclude $a \precsim c$.

We verify condition (1). Let $n, m \in \mathbb{N}$. If $n \leq m$, then $n \preccurlyeq m$, so $n \precsim m$. If $n \precsim m$, then $n^N \preccurlyeq m^N x_N$, so $n^N \leq m^N x_N$, which implies $n \leq m$.

We verify condition (2). Let $a \precsim b$ and $c \precsim d$. This means $a^N \preccurlyeq b^N x_N$ and $c^N \preccurlyeq d^N y_N$. Thus $a^N c^N \preccurlyeq b^N d^N x_N y_N$, and so $ac \precsim bd$. Assume $x_N$ and $y_N$ are nondecreasing (otherwise replace $x_N$ by $\max_{n \leq N} x_n$, and likewise for $y_N$). Then

$$(a + c)^N = \sum_{m=0}^{N} \binom{N}{m} a^m c^{N-m} \preccurlyeq \sum_{m=0}^{N} \binom{N}{m} b^m d^{N-m} x_m y_{N-m} \preccurlyeq \sum_{m=0}^{N} \binom{N}{m} b^m d^{N-m} x_N y_N = (b + d)^N x_N y_N.$$

Thus $a + c \precsim b + d$.

We verify condition (3). Let $a, b \in S$, $b \neq 0$. Then there is an $r \in \mathbb{N}$ with $a \preccurlyeq rb$, and thus $a \precsim rb$.

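Lemma 2.4. Let $\preccurlyeq$ be a Strassen preorder on $S$. Let $a_1, a_2, b \in S$.

(i) If $a_2 + b \preccurlyeq a_1 + b$, then $a_2 \precsim a_1$.

(ii) If $a_2 b \preccurlyeq a_1 b$ with $b \neq 0$, then $a_2 \precsim a_1$.

(iii) If $a_2 \mathrel{\underset{\sim}{\precsim}} a_1$, then $a_2 \precsim a_1$. (Here $\underset{\sim}{\precsim}$ denotes the asymptotic preorder corresponding to $\precsim$.)

(iv) If $\exists s \in S$ $\forall n \in \mathbb{N}$: $n a_2 \preccurlyeq n a_1 + s$, then $a_2 \precsim a_1$.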

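Proof. (ii) Let $a_2 b \preccurlyeq a_1 b$. By an inductive argument similar to the argument we use to prove (2.4),

$$\forall N \in \mathbb{N}: \quad a_2^N b \preccurlyeq a_1^N b. \tag{2.2}$$

Let $m, r \in \mathbb{N}$ with $1 \preccurlyeq mb \preccurlyeq r$. (We use $b \neq 0$.) From (2.2) follows

$$\forall N \in \mathbb{N}: \quad a_2^N \preccurlyeq a_2^N mb \preccurlyeq a_1^N mb \preccurlyeq a_1^N r.$$

Thus we conclude $a_2 \precsim a_1$.

(iii) Let $a_2 \mathrel{\underset{\sim}{\precsim}} a_1$. This means $a_2^N \precsim a_1^N x_N$ with $x_N^{1/N} \to 1$. This in turn means that $(a_2^N)^M \preccurlyeq (a_1^N x_N)^M y_{N,M}$ with, for all $N \in \mathbb{N}$, $y_{N,M}^{1/M} \to 1$, that is,

$$a_2^{NM} \preccurlyeq a_1^{NM} x_N^M y_{N,M}.$$

Choose a sequence $N \mapsto M_N$ such that $(y_{N,M_N})^{1/M_N} \leq 2$; e.g. given $N$, let $M_N$ be the smallest $M$ for which $(y_{N,M})^{1/M} \leq 2$. Then $a_2^{N M_N} \preccurlyeq a_1^{N M_N} x_N^{M_N} y_{N,M_N}$ and

$$(x_N^{M_N} y_{N,M_N})^{1/(N M_N)} = x_N^{1/N} (y_{N,M_N})^{1/(N M_N)} \leq x_N^{1/N} 2^{1/N} \to 1.$$

We conclude $a_2 \precsim a_1$.

(iv) Let $s \in S$ with $\forall n \in \mathbb{N}$: $n a_2 \preccurlyeq n a_1 + s$. We may assume $a_1 \neq 0$. Let $k \in \mathbb{N}$ with $s \preccurlyeq k a_1$. Then

$$\forall n \in \mathbb{N}: \quad kn a_2 \preccurlyeq kn a_1 + k a_1 = k a_1 (n + 1). \tag{2.3}$$

Apply (ii) to (2.3) to get

$$\forall n \in \mathbb{N}: \quad a_2 n \precsim a_1 (n + 1).$$

By an inductive argument,

$$\forall N \in \mathbb{N}: \quad a_2^N \precsim a_2^{N-1} a_1 \cdot 2 \precsim a_2^{N-2} a_1^2 \cdot 3 \precsim \cdots \precsim a_1^N (N + 1).$$

Since $(N + 1)^{1/N} \to 1$, we have $a_2 \mathrel{\underset{\sim}{\precsim}} a_1$. From (iii) follows $a_2 \precsim a_1$.

(i) Let $a_2 + b \preccurlyeq a_1 + b$. We first prove

$$\forall q \in \mathbb{N}: \quad q a_2 + b \preccurlyeq q a_1 + b. \tag{2.4}$$

By assumption, the statement is true for $q = 1$. Suppose the statement is true for $q - 1$. Then

$$q a_2 + b = (q - 1) a_2 + (a_2 + b) \preccurlyeq (q - 1) a_2 + (a_1 + b) = ((q - 1) a_2 + b) + a_1 \preccurlyeq ((q - 1) a_1 + b) + a_1 = q a_1 + b,$$

which proves the statement by induction. Then $\forall n \in \mathbb{N}$: $n a_2 \preccurlyeq n a_1 + b$. From (iv) follows $a_2 \precsim a_1$.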

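2.5 Maximal Strassen preorders

Let $\mathcal{P}$ be the set of Strassen preorders on $S$. For $\preccurlyeq_1, \preccurlyeq_2 \in \mathcal{P}$ we write $\preccurlyeq_2 \subseteq \preccurlyeq_1$ if for all $a, b \in S$: $a \preccurlyeq_2 b$ implies $a \preccurlyeq_1 b$. (The notation $\preccurlyeq_2 \subseteq \preccurlyeq_1$ is natural if we think of the relations $\preccurlyeq_i$ as sets of pairs $(a, b)$ with $a \preccurlyeq_i b$.)

Lemma 2.5. Let $\preccurlyeq \in \mathcal{P}$ with $\preccurlyeq = \precsim$ and $a_2 \not\preccurlyeq a_1$. Then there is an element $\preccurlyeq_{a_1, a_2} \in \mathcal{P}$ with $\preccurlyeq \subseteq \preccurlyeq_{a_1, a_2}$ and $a_1 \preccurlyeq_{a_1, a_2} a_2$.

Proof. For $x_1, x_2 \in S$, let

$$x_1 \preccurlyeq_{a_1, a_2} x_2 \quad \text{if} \quad \exists s \in S:\ x_1 + s a_2 \preccurlyeq x_2 + s a_1.$$

The relation $\preccurlyeq_{a_1, a_2}$ is reflexive, since $x + 0 \cdot a_2 \preccurlyeq x + 0 \cdot a_1$. The relation $\preccurlyeq_{a_1, a_2}$ is transitive: if $x_1 \preccurlyeq_{a_1, a_2} x_2$ and $x_2 \preccurlyeq_{a_1, a_2} x_3$, then $x_1 + s a_2 \preccurlyeq x_2 + s a_1$ and $x_2 + t a_2 \preccurlyeq x_3 + t a_1$ for some $s, t \in S$, and so $x_1 + (t + s) a_2 \preccurlyeq x_2 + t a_2 + s a_1 \preccurlyeq x_3 + t a_1 + s a_1 = x_3 + (t + s) a_1$. Thus $x_1 \preccurlyeq_{a_1, a_2} x_3$. We conclude that $\preccurlyeq_{a_1, a_2}$ is a preorder on $S$.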


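We prove that $\preccurlyeq_{a_1, a_2}$ is a Strassen preorder. If $x_1 \preccurlyeq_{a_1, a_2} x_2$ and $y_1 \preccurlyeq_{a_1, a_2} y_2$, then clearly $x_1 + y_1 \preccurlyeq_{a_1, a_2} x_2 + y_2$. If $x_1 \preccurlyeq_{a_1, a_2} x_2$ and $y \in S$, then $x_1 y \preccurlyeq_{a_1, a_2} x_2 y$. From this follows: if $x_1 \preccurlyeq_{a_1, a_2} x_2$ and $y_1 \preccurlyeq_{a_1, a_2} y_2$, then $x_1 y_1 \preccurlyeq_{a_1, a_2} x_2 y_1 \preccurlyeq_{a_1, a_2} x_2 y_2$.

Let $n, m \in \mathbb{N}$. If $n \leq m$, then $n \preccurlyeq m$, so $n \preccurlyeq_{a_1, a_2} m$. If $n \not\leq m$, then $n \geq m + 1$. Suppose $n \preccurlyeq_{a_1, a_2} m$. Let $s \in S$ with $n + s a_2 \preccurlyeq m + s a_1$. Adding $m + 1 \preccurlyeq n$ gives

$$m + 1 + n + s a_2 \preccurlyeq n + m + s a_1.$$

Since $\preccurlyeq = \precsim$, we may apply Lemma 2.4 (i) to obtain

$$1 + s a_2 \preccurlyeq s a_1. \tag{2.5}$$

From (2.5) follows $s \neq 0$. From (2.5) also follows

$$s a_2 \preccurlyeq s a_1. \tag{2.6}$$

Since $\preccurlyeq = \precsim$, we may apply Lemma 2.4 (ii) to (2.6) to obtain the contradiction

$$a_2 \preccurlyeq a_1.$$

Therefore $n \not\preccurlyeq_{a_1, a_2} m$. We conclude that $\preccurlyeq_{a_1, a_2}$ is a Strassen preorder, that is, $\preccurlyeq_{a_1, a_2} \in \mathcal{P}$.

Finally, we have $a_1 \preccurlyeq_{a_1, a_2} a_2$, since $a_1 + 1 \cdot a_2 \preccurlyeq a_2 + 1 \cdot a_1$. Also, if $x_1 \preccurlyeq x_2$, then $x_1 + 0 \cdot a_2 \preccurlyeq x_2 + 0 \cdot a_1$, that is, $\preccurlyeq \subseteq \preccurlyeq_{a_1, a_2}$.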

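Let $\preccurlyeq$ be a Strassen preorder. Let $\mathcal{P}_{\preccurlyeq}$ be the set of Strassen preorders containing $\preccurlyeq$, ordered by inclusion $\subseteq$. Let $C \subseteq \mathcal{P}_{\preccurlyeq}$ be any chain. Then the union of all preorders in $C$ is an element of $\mathcal{P}_{\preccurlyeq}$ and contains all elements of $C$. Therefore, by Zorn's lemma, $\mathcal{P}_{\preccurlyeq}$ contains a maximal element (maximal with respect to inclusion $\subseteq$).

Lemma 2.6. Let $\preccurlyeq$ be maximal in $\mathcal{P}$. Then $\preccurlyeq = \precsim$.

Proof. Trivially, $\preccurlyeq \subseteq \precsim$. From Lemma 2.3 we know $\precsim \in \mathcal{P}$. From maximality of $\preccurlyeq$ follows $\preccurlyeq = \precsim$.

A relation $\preccurlyeq$ on $S$ is total if for all $a, b \in S$: $a \preccurlyeq b$ or $b \preccurlyeq a$.

Lemma 2.7. Let $\preccurlyeq$ be maximal in $\mathcal{P}$. Then $\preccurlyeq$ is total.

Proof. Suppose $\preccurlyeq$ is not total, say $a_1 \not\preccurlyeq a_2$ and $a_2 \not\preccurlyeq a_1$. By Lemma 2.6 we have $\preccurlyeq = \precsim$, so by Lemma 2.5 there is an element $\preccurlyeq_{a_1, a_2} \in \mathcal{P}$ with $\preccurlyeq \subseteq \preccurlyeq_{a_1, a_2}$ and $a_1 \preccurlyeq_{a_1, a_2} a_2$. Then $\preccurlyeq$ is strictly contained in $\preccurlyeq_{a_1, a_2}$, which contradicts the maximality of $\preccurlyeq$. We conclude $\preccurlyeq$ is total.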


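2.6 The asymptotic spectrum $X(S, \leqslant)$

Definition 2.8. Let $S$ be a semiring with $\mathbb{N} \subseteq S$ and let $\leqslant$ be a Strassen preorder on $S$. Let

$$X(S, \leqslant) = \{\phi \in \mathrm{Hom}(S, \mathbb{R}_{\geq 0}) : a \leqslant b \Rightarrow \phi(a) \leq \phi(b)\}.$$

We call $X(S, \leqslant)$ the asymptotic spectrum of $(S, \leqslant)$. We call the elements of $X(S, \leqslant)$ spectral points.

Lemma 2.9. Let $\preccurlyeq \in \mathcal{P}$ be total. There is exactly one semiring homomorphism $\phi : S \to \mathbb{R}_{\geq 0}$ with

$$a \preccurlyeq b \Rightarrow \phi(a) \leq \phi(b).$$

Moreover, if $\preccurlyeq$ is maximal in $\mathcal{P}$, then

$$a \preccurlyeq b \Leftrightarrow \phi(a) \leq \phi(b).$$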

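Proof. Let $\preccurlyeq \in \mathcal{P}$ be total. For $a \in S$ define

$$\phi(a) = \inf\Bigl\{\frac{r}{s} : r, s \in \mathbb{N},\ sa \preccurlyeq r\Bigr\}, \qquad \psi(a) = \sup\Bigl\{\frac{u}{v} : u, v \in \mathbb{N},\ u \preccurlyeq va\Bigr\}.$$

We prove $\psi(a) \leq \phi(a)$. Let $r, s, u, v \in \mathbb{N}$. Suppose $u \preccurlyeq va$ and $sa \preccurlyeq r$. Then follows $su \preccurlyeq vsa \preccurlyeq vr$. Thus $u/v \leq r/s$. We prove $\psi(a) \geq \phi(a)$. Suppose $\psi(a) < \phi(a)$. Let $r, s \in \mathbb{N}$ with $\psi(a) < r/s < \phi(a)$. Then $sa \not\preccurlyeq r$. From totality follows $sa \succcurlyeq r$. Thus $\psi(a) \geq r/s$, which is a contradiction. We conclude $\psi(a) = \phi(a)$.

Let $a, b \in S$. We prove $\phi(a + b) \leq \phi(a) + \phi(b)$. Let $s_a, s_b, r_a, r_b \in \mathbb{N}$. Suppose $s_a a \preccurlyeq r_a$ and $s_b b \preccurlyeq r_b$. Then $s_a s_b a \preccurlyeq s_b r_a$ and $s_a s_b b \preccurlyeq s_a r_b$. By addition, $s_a s_b (a + b) \preccurlyeq s_b r_a + s_a r_b$. Thus $\phi(a + b) \leq \frac{r_a}{s_a} + \frac{r_b}{s_b}$. We prove $\psi(a + b) \geq \psi(a) + \psi(b)$. Suppose $u_a \preccurlyeq v_a a$ and $u_b \preccurlyeq v_b b$. Then $v_b u_a \preccurlyeq v_a v_b a$ and $v_a u_b \preccurlyeq v_a v_b b$. By addition, $v_b u_a + v_a u_b \preccurlyeq v_a v_b (a + b)$. Thus $\psi(a + b) \geq \frac{u_a}{v_a} + \frac{u_b}{v_b}$. We thus have additivity.

We prove $\phi(ab) \leq \phi(a)\phi(b)$. Suppose $s_a a \preccurlyeq r_a$ and $s_b b \preccurlyeq r_b$. Then $s_a s_b ab \preccurlyeq r_a r_b$. Thus $\phi(ab) \leq \frac{r_a}{s_a} \cdot \frac{r_b}{s_b}$. We prove $\psi(ab) \geq \psi(a)\psi(b)$. Suppose $u_a \preccurlyeq v_a a$ and $u_b \preccurlyeq v_b b$. Then $u_a u_b \preccurlyeq v_a v_b ab$. Thus $\frac{u_a}{v_a} \cdot \frac{u_b}{v_b} \leq \psi(ab)$. We thus have multiplicativity.

We prove monotonicity: $a \preccurlyeq b \Rightarrow \phi(a) \leq \phi(b)$. Suppose $s_b b \preccurlyeq r_b$. From $a \preccurlyeq b$ follows $s_b a \preccurlyeq s_b b \preccurlyeq r_b$. Thus $\phi(a) \leq \frac{r_b}{s_b}$.

We prove $\phi(1) = 1$. Trivially, $1 \preccurlyeq 1$. Therefore $\phi(1) \leq \frac{1}{1} = 1$ and $\psi(1) \geq \frac{1}{1} = 1$.

We prove $\phi(0) = 0$. Trivially, $s_a 0 \preccurlyeq 0$, so $\phi(0) \leq \frac{0}{s_a} = 0$. Trivially, $0 \preccurlyeq v_a 0$, so $\phi(0) \geq \frac{0}{v_a} = 0$.

We prove the uniqueness of $\phi$. Let $\phi_1, \phi_2$ be semiring homomorphisms $S \to \mathbb{R}_{\geq 0}$ with $a \preccurlyeq b \Rightarrow \phi_i(a) \leq \phi_i(b)$. Suppose $\phi_1(a) < \phi_2(a)$. Let $u, v \in \mathbb{N}$ with $\phi_1(a) < \frac{u}{v} < \phi_2(a)$. Then $va \not\preccurlyeq u$, so by totality $va \succcurlyeq u$. Thus $\phi_1(a) \geq \frac{u}{v}$, which is a contradiction. This proves uniqueness.

Finally, suppose $\preccurlyeq$ is maximal in $\mathcal{P}$. Lemma 2.6 gives $\preccurlyeq = \precsim$. Let $a \not\preccurlyeq b$. From Lemma 2.4 (iv) follows $\exists n$: $na \not\preccurlyeq nb + 1$. By totality, $na \succcurlyeq nb + 1$. Apply $\phi$ to get $\phi(a) \geq \phi(b) + \frac{1}{n}$. In particular, $\phi(a) > \phi(b)$.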
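The definitions of $\phi$ and $\psi$ as an infimum and a supremum of fractions can be tried out in a simple setting. The sketch below assumes, for illustration only, that $S = \mathbb{R}_{\geq 0}$ with its usual order as the total Strassen preorder and takes $a = \sqrt{2}$; then $sa \leq r$ is equivalent to $2s^2 \leq r^2$, which can be tested in exact integer arithmetic. Restricting the denominators to a bound gives approximations of $\phi(a)$ from above and $\psi(a)$ from below.

```python
from fractions import Fraction
from math import isqrt

def upper(bound):
    """min r/s over 1 <= s <= bound with s*sqrt(2) <= r, i.e. 2*s*s <= r*r.
    For each s the smallest such r is the ceiling of sqrt(2*s*s)."""
    return min(Fraction(isqrt(2 * s * s - 1) + 1, s)
               for s in range(1, bound + 1))

def lower(bound):
    """max u/v over 1 <= v <= bound with u <= v*sqrt(2), i.e. u*u <= 2*v*v.
    For each v the largest such u is the floor of sqrt(2*v*v)."""
    return max(Fraction(isqrt(2 * v * v), v) for v in range(1, bound + 1))

hi, lo = upper(1000), lower(1000)
print(float(lo), float(hi))
```

Both bounds squeeze $\sqrt{2}$, matching the fact that in this setting the unique monotone homomorphism of Lemma 2.9 is the identity.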

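Lemma 2.10. The map

$$X(S, \leqslant) \to \{\text{maximal elements in } \mathcal{P}_{\leqslant}\} : \phi \mapsto \preccurlyeq_\phi,$$

with $a \preccurlyeq_\phi b$ iff $\phi(a) \leq \phi(b)$, is a bijection.

Proof. Let $\phi \in X(S, \leqslant)$. One verifies that $\preccurlyeq_\phi$ is a Strassen preorder and $\leqslant \subseteq \lesssim \subseteq \preccurlyeq_\phi$. Let $\preccurlyeq$ be maximal in $\mathcal{P}_{\preccurlyeq_\phi}$. Lemma 2.7 says that $\preccurlyeq$ is total. By Lemma 2.9 there is a $\psi \in X(S, \leqslant)$ with $\preccurlyeq \subseteq \preccurlyeq_\psi$. Clearly $\preccurlyeq_\phi \subseteq \preccurlyeq_\psi$. The uniqueness statement of Lemma 2.9 implies $\phi = \psi$. This means $\preccurlyeq_\phi = \preccurlyeq$, that is, $\preccurlyeq_\phi$ is maximal. We conclude that the map is well defined.

Let $\preccurlyeq$ be maximal in $\mathcal{P}_{\leqslant}$. Then $\preccurlyeq$ is total. By Lemma 2.9 there is a $\phi \in X(S, \leqslant)$ with $\preccurlyeq \subseteq \preccurlyeq_\phi$. We conclude the map is surjective.

Let $\phi, \psi \in X(S, \leqslant)$ with $\preccurlyeq_\phi = \preccurlyeq_\psi$. From Lemma 2.9 follows $\phi = \psi$. We conclude the map is injective.

Lemma 2.11. Let $a, b \in S$. Then $a \lesssim b$ iff $a \preccurlyeq b$ for all maximal $\preccurlyeq \in \mathcal{P}_{\leqslant}$.

Proof. Let $\preccurlyeq \in \mathcal{P}_{\leqslant}$ be maximal. Then $\lesssim \subseteq \precsim = \preccurlyeq$ by Lemma 2.6, so $a \lesssim b$ implies $a \preccurlyeq b$.

Suppose $a \not\lesssim b$. Let $n \in \mathbb{N}_{\geq 1}$ with $na \not\lesssim nb + 1$ (Lemma 2.4 (iv)). By Lemma 2.5 there is an element $\preccurlyeq_{nb+1, na} \in \mathcal{P}$ with $\lesssim \subseteq \preccurlyeq_{nb+1, na}$, and we may assume $\preccurlyeq_{nb+1, na}$ is maximal. Then $nb + 1 \preccurlyeq_{nb+1, na} na$, and so $a \not\preccurlyeq_{nb+1, na} b$.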

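2.7 The representation theorem

The following theorem is the main theorem.

Theorem 2.12 ([Str88, Th. 2.4]). Let $S$ be a commutative semiring with $\mathbb{N} \subseteq S$ and let $\leqslant$ be a Strassen preorder on $S$. Let $X = X(S, \leqslant)$ be the set of $\leqslant$-monotone semiring homomorphisms from $S$ to $\mathbb{R}_{\geq 0}$,

$$X = X(S, \leqslant) = \{\phi \in \mathrm{Hom}(S, \mathbb{R}_{\geq 0}) : \forall a, b \in S,\ a \leqslant b \Rightarrow \phi(a) \leq \phi(b)\}.$$

For $a, b \in S$, let $a \lesssim b$ if there is a sequence $(x_N) \in \mathbb{N}^{\mathbb{N}}$ with $x_N^{1/N} \to 1$ when $N \to \infty$ such that $\forall N \in \mathbb{N}$: $a^N \leqslant b^N x_N$. Then

$$\forall a, b \in S: \quad a \lesssim b \ \text{ iff } \ \forall \phi \in X,\ \phi(a) \leq \phi(b).$$

Proof. Let $a, b \in S$. Suppose $a \lesssim b$. Then clearly for all $\phi \in X$ we have $\phi(a) \leq \phi(b)$. Suppose $a \not\lesssim b$. By Lemma 2.11 there is a maximal $\preccurlyeq \in \mathcal{P}_{\leqslant}$ with $a \not\preccurlyeq b$. By Lemma 2.10 there is a $\phi \in X$ with $\phi(a) > \phi(b)$.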


2.8 Abstract rank and subrank R, Q

We generalise the notions of rank and subrank for tensors to arbitrary semirings with a Strassen preorder. Let $a \in S$. Define the rank

$$R(a) = \min\{r \in \mathbb{N} : a \leqslant r\}$$

and the subrank

$$Q(a) = \max\{r \in \mathbb{N} : r \leqslant a\}.$$

Then $Q(a) \leq R(a)$. Define the asymptotic rank

$$\tilde{R}(a) = \lim_{N \to \infty} R(a^N)^{1/N}.$$

Define the asymptotic subrank

$$\tilde{Q}(a) = \lim_{N \to \infty} Q(a^N)^{1/N}.$$

By Fekete's lemma (Lemma 2.2), asymptotic rank is an infimum and asymptotic subrank is a supremum, as follows:

$$\tilde{R}(a) = \inf_N R(a^N)^{1/N},$$

$$\tilde{Q}(a) = \sup_N Q(a^N)^{1/N} \quad \text{when } a = 0 \text{ or } a \geqslant 1.$$

Theorem 2.12 implies that the asymptotic rank and asymptotic subrank have the following dual characterisation in terms of the asymptotic spectrum. (This is a straightforward generalisation of [Str88, Th. 3.8].)

Corollary 2.13 (cf. [Str88, Th. 3.8]). For $a \in S$ with $\exists \phi \in X$: $\phi(a) \geq 1$,

$$\tilde{R}(a) = \max_{\phi \in X} \phi(a).$$

Proof. Let $\phi \in X$. For $N \in \mathbb{N}$, $R(a^N) \geq \phi(a)^N$. Therefore $\tilde{R}(a) \geq \phi(a)$, and so $\tilde{R}(a) \geq \max_{\phi \in X} \phi(a)$. It remains to prove $\tilde{R}(a) \leq \max_{\phi \in X} \phi(a)$. We let $x = \max_{\phi \in X} \phi(a)$. By assumption, $x \geq 1$. By definition of $x$ we have

$$\forall \phi \in X: \quad \phi(a) \leq x.$$

Take the $m$th power on both sides:

$$\forall \phi \in X,\ m \in \mathbb{N}: \quad \phi(a^m) \leq x^m.$$

Take the ceiling on the right-hand side:

$$\forall \phi \in X,\ m \in \mathbb{N}: \quad \phi(a^m) \leq \lceil x^m \rceil.$$

Apply Theorem 2.12 to get asymptotic preorders:

$$\forall m \in \mathbb{N}: \quad a^m \lesssim \lceil x^m \rceil.$$

Then, by definition of asymptotic preorder,

$$\forall m, N \in \mathbb{N}: \quad a^{mN} \leqslant \lceil x^m \rceil^N 2^{\varepsilon_{m,N}} \quad \text{for some } \varepsilon_{m,N} \in o(N).$$

Then

$$\forall m, N \in \mathbb{N}: \quad R(a^{mN})^{1/mN} \leq \lceil x^m \rceil^{1/m} 2^{\varepsilon_{m,N}/mN}.$$

From $x \geq 1$ follows $\lceil x^m \rceil^{1/m} \to x$ when $m \to \infty$. Choose $m = m(N)$ with $m(N) \to \infty$ as $N \to \infty$ and $\varepsilon_{m(N),N} \in o(N)$ to get $\tilde{R}(a) = \inf_N R(a^N)^{1/N} \leq x$.
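The rounding step in this proof relies on $\lceil x^m \rceil^{1/m} \to x$ for $x \geq 1$. A short floating-point experiment (illustrative values, not from the text) shows the convergence, and also shows that it fails for $x < 1$, where the ceiling is stuck at 1.

```python
import math

def ceil_root(x, m):
    """The m-th root of the ceiling of x^m."""
    return math.ceil(x ** m) ** (1.0 / m)

def floor_root(y, m):
    """The m-th root of the floor of y^m."""
    return math.floor(y ** m) ** (1.0 / m)

for m in (1, 2, 5, 10, 40):
    print(m, ceil_root(1.5, m), floor_root(1.5, m), ceil_root(0.5, m))
```

The analogous statement with floors, $\lfloor y^m \rfloor^{1/m} \to y$, is what the corresponding subrank argument uses.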

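Corollary 2.14 (cf. [Str88, Th. 3.8]). For $a \in S$ with $\exists k \in \mathbb{N}$: $a^k \geqslant 2$,

$$\tilde{Q}(a) = \min_{\phi \in X} \phi(a).$$

Proof. Let $\phi \in X$. For $N \in \mathbb{N}$, $Q(a^N) \leq \phi(a)^N$. Therefore $\tilde{Q}(a) \leq \phi(a)$, so $\tilde{Q}(a) \leq \min_{\phi \in X} \phi(a)$. It remains to prove $\tilde{Q}(a) \geq \min_{\phi \in X} \phi(a)$. Let $y = \min_{\phi \in X} \phi(a)$. From the assumption $a^k \geqslant 2$ follows $y > 1$. By definition of $y$ we have

$$\forall \phi \in X: \quad \phi(a) \geq y.$$

Take the $m$th power on both sides:

$$\forall \phi \in X,\ m \in \mathbb{N}: \quad \phi(a^m) \geq y^m.$$

Take the floor on the right-hand side:

$$\forall \phi \in X,\ m \in \mathbb{N}: \quad \phi(a^m) \geq \lfloor y^m \rfloor.$$

Apply Theorem 2.12 to get asymptotic preorders:

$$\forall m \in \mathbb{N}: \quad a^m \gtrsim \lfloor y^m \rfloor.$$

Then, by definition of asymptotic preorder,

$$\forall m, N \in \mathbb{N}: \quad a^{mN} 2^{\varepsilon_{m,N}} \geqslant \lfloor y^m \rfloor^N \quad \text{for some } \varepsilon_{m,N} \in o(N).$$

Now we use $a^k \geqslant 2$ to get

$$\forall m, N \in \mathbb{N}: \quad a^{mN + k\varepsilon_{m,N}} \geqslant \lfloor y^m \rfloor^N.$$

Then

$$\forall m, N \in \mathbb{N}: \quad Q(a^{mN + k\varepsilon_{m,N}})^{1/(mN + k\varepsilon_{m,N})} \geq \lfloor y^m \rfloor^{N/(mN + k\varepsilon_{m,N})}.$$

Choose $m = m(N)$ with $m(N) \to \infty$ as $N \to \infty$ and $\varepsilon_{m(N),N} \in o(N)$ to obtain $\tilde{Q}(a) = \sup_N Q(a^N)^{1/N} \geq y$.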


29 Topological aspects

Theorem 2.12 does not tell the full story. Namely, there is also a topological component, which we will now discuss. Let $S$ be a semiring with $\mathbb{N} \subseteq S$. Let $\leqslant$ be a Strassen preorder on $S$. Let $X = X(S, \leqslant)$ be the asymptotic spectrum of $(S, \leqslant)$. For $a \in S$ let

\[ \hat{a} : X \to \mathbb{R}_{\ge 0} : \phi \mapsto \phi(a). \tag{2.7} \]

The map $\hat{a}$ simply evaluates a given homomorphism $\phi$ at $a$. One may think of $\hat{a}$ as the collection $(\phi(a))_{\phi \in X}$ of all evaluations of the elements of $X$ at $a$. Let $\mathbb{R}_{\ge 0}$ have the Euclidean topology. Endow $X$ with the weak topology with respect to the family of functions $\hat{a}$, $a \in S$. That is, endow $X$ with the coarsest topology such that each $\hat{a}$ becomes continuous.

Let $C(X, \mathbb{R}_{\ge 0})$ be the semiring of continuous functions $X \to \mathbb{R}_{\ge 0}$ with addition and multiplication defined pointwise on $X$, that is, $(f + g)(x) = f(x) + g(x)$ and $(f \cdot g)(x) = f(x) g(x)$ for $f, g \in C(X, \mathbb{R}_{\ge 0})$ and $x \in X$. Define the semiring homomorphism

\[ \Phi : S \to C(X, \mathbb{R}_{\ge 0}) : a \mapsto \hat{a}, \]

which maps $a$ to the evaluator $\hat{a}$ defined in (2.7).

Theorem 2.15 ([Str88, Th. 2.4]).

(i) $X$ is a nonempty compact Hausdorff space.

(ii) $\forall a, b \in S$: $a \lesssim b$ iff $\Phi(a) \le \Phi(b)$ pointwise on $X$.

(iii) $\Phi(S)$ separates the points of $X$.

Proof. Statement (ii) follows from Theorem 2.12. Statement (iii) is clear. We prove statement (i). We have $2 \not\lesssim 1$, so from Theorem 2.12 follows that $X$ cannot be empty.

For $a \in S$ let $n_a \in \mathbb{N}$ with $a \leqslant n_a$. Then for $\phi \in X$, $\phi(a) \le n_a$ and so $\phi(a) \in [0, n_a]$. Embed $X \subseteq \prod_{a \in S} [0, n_a]$ as a set via $\phi \mapsto (\phi(a))_{a \in S}$. The set $\prod_{a \in S} [0, n_a]$ with the product topology is compact by the theorem of Tychonoff. To see that $X$ is closed in $\prod_{a \in S} [0, n_a]$, we write $X$ as an intersection of sets, writing $P = \prod_{a \in S} [0, n_a]$ for brevity,

\begin{align*}
X ={} & \{\phi \in P : \phi(0) = 0\} \cap \{\phi \in P : \phi(1) = 1\} \\
& \cap \bigcap_{b, c \in S} \{\phi \in P : \phi(b + c) - \phi(b) - \phi(c) = 0\} \\
& \cap \bigcap_{b, c \in S} \{\phi \in P : \phi(bc) - \phi(b)\phi(c) = 0\} \\
& \cap \bigcap_{b, c \in S :\, b \leqslant c} \{\phi \in P : \phi(b) \le \phi(c)\}
\end{align*}

and we observe that the intersected sets are closed:

\begin{align*}
X ={} & \hat{0}^{-1}(0) \cap \hat{1}^{-1}(1) \\
& \cap \bigcap_{b, c \in S} \bigl(\widehat{b + c} - \hat{b} - \hat{c}\bigr)^{-1}(0) \\
& \cap \bigcap_{b, c \in S} \bigl(\widehat{bc} - \hat{b}\hat{c}\bigr)^{-1}(0) \\
& \cap \bigcap_{b, c \in S :\, b \leqslant c} \bigl(\hat{c} - \hat{b}\bigr)^{-1}([0, \infty)).
\end{align*}

This implies $X$ is also compact.

Let $\phi, \psi \in X$ be distinct. Let $a \in S$ with $\phi(a) \ne \psi(a)$. Then $\hat{a}(\phi) \ne \hat{a}(\psi)$. Let $U \ni \hat{a}(\phi)$, $V \ni \hat{a}(\psi)$ be open and disjoint subsets of $\mathbb{R}_{\ge 0}$. Then $\hat{a}^{-1}(U)$ and $\hat{a}^{-1}(V)$ are open and disjoint subsets of $X$. We conclude that $X$ is Hausdorff.

2.10 Uniqueness

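Let $S$ be a semiring with $\mathbb{N} \subseteq S$. Let $\leqslant$ be a Strassen preorder on $S$. Let $X = X(S, \leqslant)$ be the asymptotic spectrum of $(S, \leqslant)$. The object $X$ is unique in the following sense.

Theorem 2.16 ([Str88, Cor. 2.7]). Let $Y$ be a compact Hausdorff space. Let $\Psi : S \to C(Y, \mathbb{R}_{\ge 0})$ be a homomorphism of semirings such that

\[ \Psi(S) \text{ separates the points of } Y \tag{2.8} \]

and

\[ \forall a, b \in S \quad a \lesssim b \Leftrightarrow \Psi(a) \le \Psi(b) \text{ pointwise on } Y. \tag{2.9} \]

Then there is a unique homeomorphism (continuous bijection with continuous inverse) $h : Y \to X$ such that the diagram

\[
\begin{array}{ccc}
 & S & \\
 {\scriptstyle \Phi} \swarrow & & \searrow {\scriptstyle \Psi} \\
 C(X, \mathbb{R}_{\ge 0}) & \xrightarrow{\ h^*\ } & C(Y, \mathbb{R}_{\ge 0})
\end{array}
\tag{2.10}
\]

commutes, where $h^* : \phi \mapsto \phi \circ h$. Namely, let $h : y \mapsto \bigl(a \mapsto \Psi(a)(y)\bigr)$.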

Proof. We prove uniqueness. Suppose there are two such homeomorphisms

\[ h_1, h_2 : Y \to X. \]

Suppose $x \ne h_2(h_1^{-1}(x))$ for some $x \in X$. Since $\Phi(S)$ separates the points of $X$, there is an $a \in S$ with $\Phi(a)(x) \ne \Phi(a)(h_2(h_1^{-1}(x)))$. Let $y = h_1^{-1}(x) \in Y$. Then $\Phi(a)(h_1(y)) \ne \Phi(a)(h_2(y))$. Since (2.10) commutes, $\Phi(a)(h_1(y)) = \Psi(a)(y)$ and $\Phi(a)(h_2(y)) = \Psi(a)(y)$, a contradiction.

We prove existence. Let $h : Y \to X : y \mapsto (a \mapsto \Psi(a)(y))$. One verifies that $h$ is well-defined, continuous and injective, and that the diagram in (2.10) commutes. It remains to show that $h$ is surjective. We know that $\mathbb{Q} \cdot \Phi(S)$ is a $\mathbb{Q}$-subalgebra of $C(X, \mathbb{R})$ which separates points and which contains the nonzero constant function $\Phi(1)$, so by the Stone–Weierstrass theorem $\mathbb{Q} \cdot \Phi(S)$ is dense in $C(X, \mathbb{R})$ under the sup-norm. Suppose $h$ is not surjective. Then $h(Y) \subsetneq X$ is a proper closed subset. Let $x_0 \in X \setminus h(Y)$ be in the complement. Since $X$ is a compact Hausdorff space, there is a continuous function $f : X \to [-1, 1]$ with

\[ f(h(Y)) = 1, \qquad f(x_0) = -1. \]

We know that $f$ can be approximated by elements from $\mathbb{Q} \cdot \Phi(S)$, i.e. let $\varepsilon > 0$, then there are $a_1, a_2 \in S$, $N \in \mathbb{N}$ such that

\begin{align*}
\tfrac{1}{N}\bigl(\Phi(a_1)(x) - \Phi(a_2)(x)\bigr) &> 1 - \varepsilon \quad \text{for all } x \in h(Y), \\
\tfrac{1}{N}\bigl(\Phi(a_1)(x_0) - \Phi(a_2)(x_0)\bigr) &< -1 + \varepsilon.
\end{align*}

This means $\Psi(a_1) \ge \Psi(a_2)$ pointwise on $Y$, so $a_1 \gtrsim a_2$, but also $\Phi(a_1) \not\ge \Phi(a_2)$ pointwise on $X$, so $a_1 \not\gtrsim a_2$. This is a contradiction.

2.11 Subsemirings

Let $S$ be a subsemiring of a semiring $T$, and let $\leqslant$ be a Strassen preorder on $T$. Then the restriction $\leqslant|_S$ is a Strassen preorder on $S$. How are the asymptotic spectra $X(S, \leqslant|_S)$ and $X(T, \leqslant)$ related? Obviously, for $\phi \in X(T, \leqslant)$ we have $\phi|_S \in X(S, \leqslant|_S)$. In fact, the uniqueness theorem of Section 2.10 implies that all elements of $X(S, \leqslant|_S)$ are restrictions of elements of $X(T, \leqslant)$.

Corollary 2.17. Let $S$ be a subsemiring of a semiring $T$. Let $\leqslant$ be a Strassen preorder on $T$. Then

\[ X(S, \leqslant|_S) = X(T, \leqslant)|_S. \]

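Proof. Let

\begin{align*}
X &= X(S, \leqslant|_S), \\
\Phi &: S \to C(X, \mathbb{R}_{\ge 0}) : a \mapsto \hat{a},
\end{align*}

and let

\begin{align*}
Y &= X(T, \leqslant)|_S = \{\phi|_S : \phi \in X(T, \leqslant)\}, \\
\Psi &: S \to C(Y, \mathbb{R}_{\ge 0}) : a \mapsto \bigl(\phi|_S \mapsto \phi|_S(a)\bigr).
\end{align*}

Then $Y$ is a compact Hausdorff space. Let $\phi|_S, \psi|_S \in Y$ be distinct. Then there is an $a \in S$ with $\phi|_S(a) \ne \psi|_S(a)$, so (2.8) holds. For $a, b \in S$: $a \lesssim b$ iff $\Phi(a) \le \Phi(b)$ iff $\Psi(a) \le \Psi(b)$, so (2.9) holds. Therefore

\[ h : X(T, \leqslant)|_S \to X(S, \leqslant|_S) : \phi|_S \mapsto \bigl(a \mapsto \Psi(a)(\phi|_S)\bigr) = \phi|_S \]

is a homeomorphism.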

2.12 Subsemirings generated by one element

Let $S$ be a semiring and let $\leqslant$ be a Strassen preorder on $S$. We specialise to the simplest type of subsemiring of $S$. Namely, let $a \in S$ and let

\[ \mathbb{N}[a] = \Bigl\{ \sum_{i=0}^{k} n_i a^i : k \in \mathbb{N},\ n_i \in \mathbb{N} \Bigr\} \subseteq S \]

be the subsemiring of $S$ generated by $a$. We call $X(\mathbb{N}[a]) = X(\mathbb{N}[a], \leqslant|_{\mathbb{N}[a]})$ the asymptotic spectrum of $a$.

Corollary 2.18 (cf. [Str88]). If $a^k \geqslant 2$ for some $k \in \mathbb{N}$, then

\[ \tilde{Q} \in X(\mathbb{N}[a]). \]

If $\phi(a) \ge 1$ for some $\phi \in X$, then

\[ \tilde{R} \in X(\mathbb{N}[a]). \]

Proof. Let $X = X(\mathbb{N}[a])$. Let $n_1, \ldots, n_q \in \mathbb{N}$. By Corollary 2.14,

\[ \tilde{Q}(a^{n_1} + \cdots + a^{n_q}) = \min_{\phi \in X} \phi(a^{n_1} + \cdots + a^{n_q}). \]

Since $\phi$ is a homomorphism, $\phi(a^{n_1} + \cdots + a^{n_q}) = \phi(a)^{n_1} + \cdots + \phi(a)^{n_q}$. Now we observe that $x^{n_1} + \cdots + x^{n_q}$ is minimised by taking $x$ minimal in the domain. We conclude

\[ \tilde{Q}(a^{n_1} + \cdots + a^{n_q}) = \sum_{i=1}^{q} \Bigl(\min_{\phi \in X} \phi(a)\Bigr)^{n_i} = \tilde{Q}(a)^{n_1} + \cdots + \tilde{Q}(a)^{n_q}. \]

The claim for asymptotic rank $\tilde{R}$ similarly follows from Corollary 2.13.

Remark 2.19. In general, asymptotic subrank $\tilde{Q}$ and asymptotic rank $\tilde{R}$ are not elements of the asymptotic spectrum. We will see an example in Chapter 4, related to the matrix multiplication tensor.

Remark 2.20. Corollary 2.18 is closely related to Schönhage's $\tau$-theorem for tensors, also called Schönhage's asymptotic sum inequality. The $\tau$-theorem features in every recent fast matrix multiplication algorithm (i.e. every algorithm based on the laser method).

Remark 2.21. An element $\phi \in X(\mathbb{N}[a])$ is uniquely determined by the value of $\phi(a) \in \mathbb{R}_{\ge 0}$. We may thus identify the asymptotic spectrum $X(\mathbb{N}[a])$ with a compact (i.e. closed and bounded) subset of the nonnegative reals $\mathbb{R}_{\ge 0}$ via $\phi \mapsto \phi(a)$.

2.13 Universal spectral points

Having discussed the simplest type of subsemiring in the previous section, let us discuss the most difficult type of supersemiring. When applying the theory of asymptotic spectra to some setting, there is a natural largest semiring $S$ in which the objects of study live. For example, we may study the semiring $S$ of all (equivalence classes of) 3-tensors of arbitrary format over $\mathbb{F}$. Or we may study the semiring $S$ of all (isomorphism classes of) finite simple graphs. We refer to the elements of the asymptotic spectrum $X(S)$ of the "ambient" semiring $S$ by the term universal spectral points (cf. [Str88, page 119]). The universal spectral points are the most useful monotone homomorphisms.

2.14 Conclusion

To a semiring $S$ with a Strassen preorder $\leqslant$ we associated an asymptotic preorder $\lesssim$. We proved that this asymptotic preorder is characterised by the $\leqslant$-monotone semiring homomorphisms $S \to \mathbb{R}_{\ge 0}$, which make up the asymptotic spectrum $X(S, \leqslant)$ of $(S, \leqslant)$. For $(S, \leqslant)$ we naturally have a rank function $R : S \to \mathbb{N}$ and a subrank function $Q : S \to \mathbb{N}$. Their asymptotic versions $\tilde{R}(a) = \inf_n R(a^n)^{1/n}$ and $\tilde{Q}(a) = \sup_n Q(a^n)^{1/n}$ coincide with $\max_{\phi \in X(S, \leqslant)} \phi(a)$ and $\min_{\phi \in X(S, \leqslant)} \phi(a)$ respectively, assuming $\exists \phi \in X\ \phi(a) \ge 1$ and $\exists k \in \mathbb{N}\ a^k \geqslant 2$ respectively. Unfortunately, we have proved the existence of the asymptotic spectrum by nonconstructive means. Explicitly constructing spectral points for a given pair $(S, \leqslant)$ will be a challenging task.

Some remarks about our proof in this chapter. The proof in [Str88] uses the Kadison–Dubois theorem from the paper of Becker and Schwartz [BS83] as a black box. Our presentation basically integrates the proof of Strassen with the proof of Becker and Schwartz. The notions of rank and subrank were in [Str88] only discussed for tensors. We considered the straightforward generalisation to arbitrary semirings with a Strassen preorder. An evident feature of our presentation is that we do not pass from the semiring to its Grothendieck ring, but instead stay in the semiring. In this way we stay close to the "real world" objects. I thank Jop Briët and Lex Schrijver for this idea. There is a large body of literature on the Kadison–Dubois theorem, for which we refer to the modern books by Prestel and Delzell [PD01, Theorem 5.2.6] and Marshall [Mar08, Theorem 5.4.4].

Chapter 3

The asymptotic spectrum of graphs; Shannon capacity

This chapter is based on the manuscript [Zui18].

3.1 Introduction

This chapter is about the Shannon capacity of graphs, which was introduced by Claude Shannon in the context of coding theory [Sha56]. More precisely, we will apply the theory of asymptotic spectra of Chapter 2 to gain a better understanding of Shannon capacity (and other asymptotic properties of graphs).

We first recall the definition of the Shannon capacity of a graph. Let $G$ be a (finite simple) graph with vertex set $V(G)$ and edge set $E(G)$. An independent set or stable set in $G$ is a subset of $V(G)$ that contains no edges. The independence number or stability number $\alpha(G)$ is the cardinality of the largest independent set in $G$. For graphs $G$ and $H$, the and-product $G \boxtimes H$, also called strong graph product, is defined by

\begin{align*}
V(G \boxtimes H) &= V(G) \times V(H), \\
E(G \boxtimes H) &= \bigl\{ \{(g, h), (g', h')\} : \bigl(\{g, g'\} \in E(G) \text{ or } g = g'\bigr) \text{ and } \bigl(\{h, h'\} \in E(H) \text{ or } h = h'\bigr) \text{ and } (g, h) \ne (g', h') \bigr\}.
\end{align*}

The Shannon capacity $\Theta(G)$ is defined as the limit

\[ \Theta(G) = \lim_{N \to \infty} \alpha(G^{\boxtimes N})^{1/N}. \tag{3.1} \]

This limit exists and equals the supremum $\sup_N \alpha(G^{\boxtimes N})^{1/N}$ by Fekete's lemma (Lemma 2.2).
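As an illustration of definition (3.1), the following Python sketch computes the first two terms of the sequence $\alpha(G^{\boxtimes N})^{1/N}$ for the 5-cycle by exhaustive search. The graph representation (a vertex list plus a set of two-element frozensets) and all function names are our own choices for this sketch, not notation from the text, and the brute-force search is only feasible for very small graphs.

```python
from functools import lru_cache
from itertools import product


def cycle(n):
    """Cycle graph C_n as (vertices, edges); each edge is a 2-element frozenset."""
    vertices = list(range(n))
    edges = {frozenset((i, (i + 1) % n)) for i in range(n)}
    return vertices, edges


def strong_product(G, H):
    """Strong (and-) product: distinct pairs are adjacent iff in each
    coordinate the entries are equal or adjacent."""
    (VG, EG), (VH, EH) = G, H
    vertices = list(product(VG, VH))
    edges = set()
    for (g, h) in vertices:
        for (g2, h2) in vertices:
            if (g, h) == (g2, h2):
                continue
            ok_g = g == g2 or frozenset((g, g2)) in EG
            ok_h = h == h2 or frozenset((h, h2)) in EH
            if ok_g and ok_h:
                edges.add(frozenset(((g, h), (g2, h2))))
    return vertices, edges


def independence_number(G):
    """Exact independence number by branching on one vertex at a time."""
    vertices, edges = G
    index = {v: i for i, v in enumerate(vertices)}
    nbr = [0] * len(vertices)  # neighbourhoods as bitmasks
    for e in edges:
        u, v = tuple(e)
        nbr[index[u]] |= 1 << index[v]
        nbr[index[v]] |= 1 << index[u]

    @lru_cache(maxsize=None)
    def best(mask):
        if mask == 0:
            return 0
        v = (mask & -mask).bit_length() - 1  # lowest remaining vertex
        rest = mask & ~(1 << v)
        take = 1 + best(rest & ~nbr[v])
        if nbr[v] & rest == 0:
            return take  # v has no remaining neighbours, so taking it is optimal
        return max(take, best(rest))

    return best((1 << len(vertices)) - 1)


C5 = cycle(5)
alpha_1 = independence_number(C5)
alpha_2 = independence_number(strong_product(C5, C5))
print(alpha_1, alpha_2, alpha_2 ** 0.5)
```

The second term already exceeds the first, $\alpha(C_5 \boxtimes C_5)^{1/2} = \sqrt{5} > 2 = \alpha(C_5)$, which shows that the supremum in (3.1) is in general not attained at $N = 1$.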

Computing the Shannon capacity is nontrivial already for small graphs. Lovász in [Lov79] computed the value $\Theta(C_5) = \sqrt{5}$, where $C_k$ denotes the $k$-cycle graph, by introducing and evaluating a new graph parameter $\vartheta$ which is now known as the Lovász theta number. The value of $\Theta(C_7)$, for example, is currently not known. The Shannon capacity $\Theta$ is not known to be hard to compute in the sense of computational complexity. On the other hand, deciding whether $\alpha(G) \ge k$, given a graph $G$ and $k \in \mathbb{N}$, is NP-complete [Kar72].

New result: dual description of Shannon capacity

The new result of this chapter is a dual characterisation of the Shannon capacity of graphs. This characterisation is obtained by applying Strassen's theory of asymptotic spectra of Chapter 2. Thus this chapter also serves as an illustration of the theory of asymptotic spectra.

To state the theorem we need the standard notions graph homomorphism, graph complement and graph disjoint union. Let $G$ and $H$ be graphs. A graph homomorphism $f : G \to H$ is a map $f : V(G) \to V(H)$ such that for all $u, v \in V(G)$, if $\{u, v\} \in E(G)$, then $\{f(u), f(v)\} \in E(H)$. In other words, a graph homomorphism maps edges to edges. The complement $\overline{G}$ of $G$ is defined by

\begin{align*}
V(\overline{G}) &= V(G), \\
E(\overline{G}) &= \bigl\{ \{u, v\} : \{u, v\} \notin E(G),\ u \ne v \bigr\}.
\end{align*}

We define a relation $\leqslant$ on graphs: let $G \leqslant H$ if there is a graph homomorphism $\overline{G} \to \overline{H}$ from the complement of $G$ to the complement of $H$. The disjoint union $G \sqcup H$ is defined by

\begin{align*}
V(G \sqcup H) &= V(G) \sqcup V(H), \\
E(G \sqcup H) &= E(G) \sqcup E(H).
\end{align*}

For $n \in \mathbb{N}$, the complete graph $K_n$ is the graph with $V(K_n) = [n] = \{1, 2, \ldots, n\}$ and $E(K_n) = \{\{i, j\} : i, j \in [n],\ i \ne j\}$. Thus $K_0 = \overline{K_0}$ is the empty graph and $K_1 = \overline{K_1}$ is the graph consisting of a single vertex and no edges.

Theorem 3.1. Let $S \subseteq \{\text{graphs}\}$ be a collection of graphs which is closed under the disjoint union $\sqcup$ and the strong graph product $\boxtimes$, and which contains the graph with a single vertex, $K_1$. Define the asymptotic spectrum $X(S)$ as the set of all maps $\phi : S \to \mathbb{R}_{\ge 0}$ such that, for all $G, H \in S$,

(1) if $G \leqslant H$, then $\phi(G) \le \phi(H)$

(2) $\phi(G \sqcup H) = \phi(G) + \phi(H)$

(3) $\phi(G \boxtimes H) = \phi(G)\phi(H)$

(4) $\phi(K_1) = 1$.

Let $G \lesssim H$ if there is a sequence $(x_N) \in \mathbb{N}^{\mathbb{N}}$ with $x_N^{1/N} \to 1$ when $N \to \infty$ such that for every $N \in \mathbb{N}$

\[ G^{\boxtimes N} \leqslant (H^{\boxtimes N})^{\sqcup x_N} = \underbrace{H^{\boxtimes N} \sqcup \cdots \sqcup H^{\boxtimes N}}_{x_N}. \]

Then

(i) $G \lesssim H$ iff $\forall \phi \in X(S)\ \phi(G) \le \phi(H)$

(ii) $\Theta(G) = \min_{\phi \in X(S)} \phi(G)$.

Statement (ii) of Theorem 3.1 is nontrivial in the sense that $\Theta$ is not an element of $X(\{\text{graphs}\})$. Namely, $\Theta$ is not additive under $\sqcup$ by a result of Alon [Alo98], and $\Theta$ is not multiplicative under $\boxtimes$ by a result of Haemers [Hae79]. It turns out that the graph parameter $G \mapsto \max_{\phi \in X(\{\text{graphs}\})} \phi(G)$ is itself an element of $X(\{\text{graphs}\})$, and is equal to the fractional clique cover number $\overline{\chi}_f$ (see Section 3.3.2 and e.g. [Sch03, Eq. (67.112)]). Fritz in [Fri17] proves (independently of Strassen's line of work) a statement that is weaker than Theorem 3.1. Namely, he proves the statement of Theorem 3.1 without the additivity condition (2).

In Section 3.2 we will prove Theorem 3.1 by applying the theory of asymptotic spectra of Chapter 2 to the appropriate semiring and preorder. In Section 3.3 we will discuss the elements in the asymptotic spectrum of graphs $X(\{\text{graphs}\})$ that are currently known to me: the Lovász theta number, the fractional clique cover number, the fractional orthogonal rank of the complement, and the fractional Haemers bounds. We moreover prove a sufficient condition for the "fractionalisation" of a graph parameter to be in the asymptotic spectrum of graphs.

3.2 The asymptotic spectrum of graphs

In this section we prove Theorem 3.1 by applying the theory of asymptotic spectra to the appropriate semiring.

3.2.1 The semiring of graph isomorphism classes $\mathcal{G}$

A graph homomorphism $f : G \to H$ is a graph isomorphism if $f$ is bijective as a map $V(G) \to V(H)$ and bijective as a map $E(G) \to E(H)$. We write $G \cong H$ if there is a graph isomorphism $f : G \to H$. The relation $\cong$ is an equivalence relation on $\{\text{graphs}\}$, which we call isomorphism. For example, the graphs $G$ and $H$ given by

\begin{align*}
V(G) &= \{a, b, c, d\}, & E(G) &= \{\{a, b\}, \{b, c\}, \{c, d\}, \{a, d\}\}, \\
V(H) &= \{1, 2, 3, 4\}, & E(H) &= \{\{1, 3\}, \{2, 3\}, \{2, 4\}, \{1, 4\}\}
\end{align*}

are isomorphic. Let $\mathcal{G} = \{\text{graphs}\}/{\cong}$ be the set of equivalence classes in $\{\text{graphs}\}$ under $\cong$, i.e. the isomorphism classes. The relation $\leqslant$ is a preorder on $\mathcal{G}$. Recall that $K_n$ is the complete graph on $n$ vertices, and thus $\overline{K_n}$ is the graph with $n$ vertices and no edges.
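The isomorphism claimed for the example graphs $G$ and $H$ can be checked mechanically. The following is a minimal brute-force sketch, assuming graphs are given as a vertex list and a set of two-element frozensets; the function name `is_isomorphic` and the extra non-example `E_other` are our own illustrations. Trying all vertex bijections is only practical for tiny graphs.

```python
from itertools import permutations


def is_isomorphic(VG, EG, VH, EH):
    """Search for a vertex bijection that maps edges to edges.

    With equally many edges on both sides, a bijection on vertices that
    maps edges to edges is automatically a bijection on edges.
    """
    if len(VG) != len(VH) or len(EG) != len(EH):
        return False
    for image in permutations(VH):
        f = dict(zip(VG, image))
        if all(frozenset((f[u], f[v])) in EH for u, v in map(tuple, EG)):
            return True
    return False


# The two graphs from the example above (both are 4-cycles).
VG = ["a", "b", "c", "d"]
EG = {frozenset(e) for e in [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")]}
VH = [1, 2, 3, 4]
EH = {frozenset(e) for e in [(1, 3), (2, 3), (2, 4), (1, 4)]}

# A graph with the same numbers of vertices and edges that is not a 4-cycle:
# a triangle with a pendant vertex.
E_other = {frozenset(e) for e in [(1, 2), (2, 3), (1, 3), (1, 4)]}

example_is_isomorphic = is_isomorphic(VG, EG, VH, EH)
non_example_is_isomorphic = is_isomorphic(VG, EG, VH, E_other)
print(example_is_isomorphic, non_example_is_isomorphic)
```

One witnessing bijection for the example is $a \mapsto 1$, $b \mapsto 3$, $c \mapsto 2$, $d \mapsto 4$.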

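Lemma 3.2. Let $A, B, C \in \{\text{graphs}\}$.

(i) $\sqcup$ and $\boxtimes$ are commutative and associative operations on $\mathcal{G}$.

(ii) $\boxtimes$ distributes over $\sqcup$ on $\mathcal{G}$, i.e. $A \boxtimes (B \sqcup C) = (A \boxtimes B) \sqcup (A \boxtimes C)$.

(iii) $\overline{K_1} \boxtimes A = A$

(iv) $\overline{K_0} \boxtimes A = \overline{K_0}$

(v) $\overline{K_0} \sqcup A = A$

(vi) $\overline{K_n} \sqcup \overline{K_m} = \overline{K_{n+m}}$.

Proof. We leave the proof to the reader.

In other words, Lemma 3.2 says that $(\mathcal{G}, \sqcup, \boxtimes, \overline{K_0}, \overline{K_1})$ is a (commutative) semiring in which the elements $\overline{K_0}, \overline{K_1}, \overline{K_2}, \ldots$ behave like the natural numbers $\mathbb{N}$. We will denote this semiring simply by $\mathcal{G}$.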

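3.2.2 Strassen preorder via graph homomorphisms

Let $\mathcal{G}$ be the semiring of graphs. Recall that $G \leqslant H$ if there is a graph homomorphism $f : \overline{G} \to \overline{H}$.

Lemma 3.3. The preorder $\leqslant$ is a Strassen preorder on $\mathcal{G}$. That is, for graphs $A, B, C, D \in \mathcal{G}$ we have the following.

(i) For $n, m \in \mathbb{N}$, $\overline{K_n} \leqslant \overline{K_m}$ iff $n \le m$.

(ii) If $A \leqslant B$ and $C \leqslant D$, then $A \sqcup C \leqslant B \sqcup D$ and $A \boxtimes C \leqslant B \boxtimes D$.

(iii) For $A, B \in \mathcal{G}$, if $B \ne \overline{K_0}$, then there is an $r \in \mathbb{N}$ with $A \leqslant \overline{K_r} \boxtimes B$.

Proof. Statement (i) is easy to verify. We prove (ii). Let $f : \overline{A} \to \overline{B}$ and $g : \overline{C} \to \overline{D}$ be graph homomorphisms. Let the map $f \sqcup g : V(A) \sqcup V(C) \to V(B) \sqcup V(D)$ be defined by

\begin{align*}
(f \sqcup g)(a) &= f(a) \quad \text{for } a \in V(A), \\
(f \sqcup g)(c) &= g(c) \quad \text{for } c \in V(C).
\end{align*}

One verifies directly that $f \sqcup g$ is a graph homomorphism $\overline{A \sqcup C} \to \overline{B \sqcup D}$. Let the map $f \boxtimes g : V(A) \times V(C) \to V(B) \times V(D)$ be defined by

\[ (f \boxtimes g)(a, c) = (f(a), g(c)). \]

One verifies directly that $f \boxtimes g$ is a graph homomorphism $\overline{A \boxtimes C} \to \overline{B \boxtimes D}$. This proves (ii). We prove (iii). Let $r = |V(A)|$. Then $A \leqslant \overline{K_r}$. By assumption $B \ne \overline{K_0}$, so $\overline{K_1} \leqslant B$. Therefore $A \leqslant \overline{K_r} \cong \overline{K_r} \boxtimes \overline{K_1} \leqslant \overline{K_r} \boxtimes B$. This proves (iii).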

3.2.3 The asymptotic spectrum of graphs $X(\mathcal{G})$

We thus have a semiring $\mathcal{G}$ with a Strassen preorder $\leqslant$. We are therefore in the position to apply the theory of asymptotic spectra (Chapter 2). Let us translate the abstract terminology to this setting.

Let $G \lesssim H$ if there is a sequence $(x_N) \in \mathbb{N}^{\mathbb{N}}$ with $(x_N)^{1/N} \to 1$ such that for every $N \in \mathbb{N}$ we have $G^{\boxtimes N} \leqslant H^{\boxtimes N} \boxtimes \overline{K_{x_N}}$, i.e. $G^{\boxtimes N} \leqslant (H^{\boxtimes N})^{\sqcup x_N}$.

Let $S \subseteq \mathcal{G}$ be a subsemiring. For example, one may take $S = \mathcal{G}$, or one may choose any set $\mathcal{X} \subseteq \mathcal{G}$ and let $S = \mathbb{N}[\mathcal{X}]$ be the subsemiring of $\mathcal{G}$ generated by $\mathcal{X}$ under $\sqcup$ and $\boxtimes$.

The asymptotic spectrum of $S$ is the set $X(S)$ of $\leqslant$-monotone semiring homomorphisms $S \to \mathbb{R}_{\ge 0}$, i.e. all maps $\phi : S \to \mathbb{R}_{\ge 0}$ such that, for all $G, H \in S$,

(1) if $G \leqslant H$, then $\phi(G) \le \phi(H)$

(2) $\phi(G \sqcup H) = \phi(G) + \phi(H)$

(3) $\phi(G \boxtimes H) = \phi(G)\phi(H)$

(4) $\phi(K_1) = 1$.

We call $X(\mathcal{G})$ the asymptotic spectrum of graphs.

Theorem 3.4. Let $G, H \in S$. Then $G \lesssim H$ iff $\forall \phi \in X(S)\ \phi(G) \le \phi(H)$.

Proof. By Lemma 3.2 we have a semiring $S$, and by Lemma 3.3 we have a Strassen preorder $\leqslant$, so we may apply Theorem 2.12.

We refer to Chapter 2 for a discussion of the topological properties of $X(S)$.

3.2.4 Shannon capacity $\Theta$

Let us discuss the (asymptotic) rank and (asymptotic) subrank for $(\mathcal{G}, \leqslant)$. Recall that an independent set in $G$ is a subset of $V(G)$ that contains no edges, and the independence number $\alpha(G)$ is the cardinality of the largest independent set in $G$. A colouring of $G$ is an assignment of colours to the elements of $V(G)$ such that adjacent vertices get distinct colours. The chromatic number $\chi(G)$ is the smallest number of colours in any colouring of $G$. The clique cover number $\overline{\chi}(G)$ is defined as the chromatic number of the complement, $\overline{\chi}(G) = \chi(\overline{G})$.

For the semiring $\mathcal{G}$ with preorder $\leqslant$, the abstract definition of subrank of Section 2.8 becomes $Q(G) = \max\{m \in \mathbb{N} : \overline{K_m} \leqslant G\}$, and the abstract definition of rank becomes $R(G) = \min\{n \in \mathbb{N} : G \leqslant \overline{K_n}\}$.

Lemma 3.5.

(i) $\alpha(G) = Q(G)$

(ii) $\overline{\chi}(G) = R(G)$.

Proof. We leave the proof to the reader.
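Lemma 3.5 can be checked on a small instance. The sketch below computes the subrank and rank of the 5-cycle directly from the definition of $\leqslant$ (a graph homomorphism between complements) by exhaustive search. The helper names (`leq`, `subrank`, `rank`, `empty_graph`) are hypothetical and chosen for this illustration only; the search over all vertex maps is exponential and meant for tiny graphs.

```python
from itertools import combinations, product


def cycle(n):
    """Cycle graph C_n as (vertices, edges)."""
    return list(range(n)), {frozenset((i, (i + 1) % n)) for i in range(n)}


def complement(vertices, edges):
    """Graph complement on the same vertex set."""
    return vertices, {frozenset(p) for p in combinations(vertices, 2)} - edges


def empty_graph(n):
    """The complement of K_n: n vertices and no edges."""
    return list(range(n)), set()


def has_homomorphism(G, H):
    """True iff some vertex map G -> H sends every edge of G to an edge of H."""
    (VG, EG), (VH, EH) = G, H
    for image in product(VH, repeat=len(VG)):
        f = dict(zip(VG, image))
        if all(frozenset((f[u], f[v])) in EH for u, v in map(tuple, EG)):
            return True
    return False


def leq(G, H):
    """The preorder of the text: G <= H iff complement(G) -> complement(H)."""
    return has_homomorphism(complement(*G), complement(*H))


def subrank(G):
    """Q(G) = max m such that complement(K_m) <= G."""
    m = 0
    while leq(empty_graph(m + 1), G):
        m += 1
    return m


def rank(G):
    """R(G) = min n such that G <= complement(K_n)."""
    n = 0
    while not leq(G, empty_graph(n)):
        n += 1
    return n


C5 = cycle(5)
Q_C5 = subrank(C5)
R_C5 = rank(C5)
print(Q_C5, R_C5)
```

For $C_5$ this gives $Q(C_5) = 2 = \alpha(C_5)$ and $R(C_5) = 3 = \overline{\chi}(C_5)$, in agreement with the lemma.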

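We see directly from Lemma 3.5 that the asymptotic subrank is the Shannon capacity

\[ \tilde{Q}(G) = \lim_{N \to \infty} Q(G^{\boxtimes N})^{1/N} = \lim_{N \to \infty} \alpha(G^{\boxtimes N})^{1/N} = \Theta(G) \]

and that the asymptotic rank is the asymptotic clique cover number

\[ \tilde{R}(G) = \lim_{N \to \infty} R(G^{\boxtimes N})^{1/N} = \lim_{N \to \infty} \overline{\chi}(G^{\boxtimes N})^{1/N} = \tilde{\overline{\chi}}(G). \]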

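Let $S \subseteq \mathcal{G}$ be a subsemiring. Let $G \in S$.

Corollary 3.6. $\Theta(G) = \min_{\phi \in X(S)} \phi(G)$.

Proof. Let $G$ be a graph. Either $G = \overline{K_0}$, or $\overline{K_1} \leqslant G \leqslant \overline{K_1}$, or $\overline{G}$ contains at least one edge. In the first two cases the claim is clearly true. In the third case $G \geqslant \overline{K_2}$, and we may thus apply Corollary 2.14.

Corollary 3.7. $\tilde{\overline{\chi}}(G) = \max_{\phi \in X(S)} \phi(G)$.

Proof. This is Corollary 2.13.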
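For a concrete feel for the two corollaries, consider the 5-cycle. The values below are standard facts from the literature cited in this chapter ([Lov79] for $\Theta$ and $\vartheta$, and the identification of the asymptotic clique cover number with $\overline{\chi}_f$ discussed in Section 3.3.2); they are collected here as a worked example rather than derived.

```latex
\[
\underbrace{\alpha(C_5)}_{=\,2}
\;\le\;
\underbrace{\Theta(C_5)}_{=\,\sqrt{5}}
\;=\;
\min_{\phi \in X(\mathcal{G})} \phi(C_5)
\;\le\;
\max_{\phi \in X(\mathcal{G})} \phi(C_5)
\;=\;
\underbrace{\overline{\chi}_f(C_5)}_{=\,5/2}
\;\le\;
\underbrace{\overline{\chi}(C_5)}_{=\,3}.
\]
```

In particular, every spectral point $\phi \in X(\mathcal{G})$ satisfies $\sqrt{5} \le \phi(C_5) \le 5/2$, and the minimum is attained by the Lovász theta number, $\vartheta(C_5) = \sqrt{5}$.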

Remark 3.8. As mentioned earlier, it turns out that $\tilde{\overline{\chi}}$ is in fact itself an element of $X(\mathcal{G})$. See Section 3.3.2. (This is a striking difference with the situation for tensors, which we will discuss in Chapter 4; there both asymptotic rank and asymptotic subrank are not in the asymptotic spectrum, see Remark 4.4.)

Shannon capacity is not in the asymptotic spectrum

Lemma 3.9. $G \boxtimes \overline{G} \geqslant \overline{K_{|V(G)|}}$.

Proof. Let $D = \{(u, u) : u \in V(G)\}$. Let $(u, u), (v, v) \in D$ be distinct. Then either $\{u, v\} \in E(G)$ or $\{u, v\} \in E(\overline{G})$ (exclusive or), and so $\{(u, u), (v, v)\} \notin E(G \boxtimes \overline{G})$. Therefore the subgraph in $G \boxtimes \overline{G}$ induced by $D$ is isomorphic to $\overline{K_{|V(G)|}}$.
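The diagonal argument in the proof of Lemma 3.9 is easy to check on an instance. The following sketch does so for $G = C_5$; the adjacency test mirrors the definition of the strong product, and all names are our own illustrative choices.

```python
from itertools import combinations


def cycle_edges(n):
    """Edge set of the cycle C_n on vertices 0..n-1."""
    return {frozenset((i, (i + 1) % n)) for i in range(n)}


def complement_edges(vertices, edges):
    """Edge set of the complement graph."""
    return {frozenset(p) for p in combinations(vertices, 2)} - edges


def adjacent_in_strong_product(p, q, EG, EH):
    """Adjacency of distinct pairs p, q in the strong product of two graphs."""
    (g, h), (g2, h2) = p, q
    if p == q:
        return False
    ok_g = g == g2 or frozenset((g, g2)) in EG
    ok_h = h == h2 or frozenset((h, h2)) in EH
    return ok_g and ok_h


V = list(range(5))
E = cycle_edges(5)
E_bar = complement_edges(V, E)

# The diagonal D = {(u, u)} inside the product of G with its complement.
diagonal = [(u, u) for u in V]
diagonal_is_independent = not any(
    adjacent_in_strong_product(p, q, E, E_bar)
    for p, q in combinations(diagonal, 2)
)
print(len(diagonal), diagonal_is_independent)
```

The diagonal has $|V(G)| = 5$ elements and is independent, because two distinct vertices $u, v$ are adjacent in exactly one of $G$ and $\overline{G}$, so one of the two coordinate conditions always fails.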

Example 3.10. Let $G$ be the Schläfli graph. This is a graph with 27 vertices. Thus $\Theta(G \boxtimes \overline{G}) \ge |V(G)| = 27$. On the other hand, Haemers in [Hae79] showed that $\Theta(G)\Theta(\overline{G}) \le 21$. This implies the map $\Theta$ is not in $X(\mathcal{G})$, since it is not multiplicative under $\boxtimes$.

3.3 Universal spectral points

The abstract theory of asymptotic spectra of Chapter 2 does not explicitly describe the elements of $X(\mathcal{G})$, i.e. the universal spectral points (cf. Section 2.13). However, several graph parameters from the literature can be shown to be universal spectral points. In fact, recently in [BC18] the first infinite family of universal spectral points was found, the fractional Haemers bounds. We give a brief (and probably incomplete) overview of currently known elements in $X(\mathcal{G})$.

3.3.1 Lovász theta number $\vartheta$

For any real symmetric matrix $A$, let $\Lambda(A)$ be the largest eigenvalue. The Lovász theta number $\vartheta(G)$ is defined as

\[ \vartheta(G) = \min\bigl\{ \Lambda(A) : A \in \mathbb{R}^{V(G) \times V(G)} \text{ symmetric},\ \{u, v\} \notin E(G) \Rightarrow A_{uv} = 1 \bigr\}. \]

The parameter $\vartheta(G)$ was introduced by Lovász in [Lov79]. We refer to [Knu94] and [Sch03] for a survey. It follows from well-known properties that $\vartheta \in X(\mathcal{G})$.
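To make the definition of $\vartheta$ concrete, the following sketch exhibits one feasible matrix for $G = C_5$ and evaluates its largest eigenvalue. We assume a circulant matrix that equals $1$ on the diagonal and on non-edges and equals $1 + t$ on edges, with $t$ chosen by hand to balance two eigenvalues; this choice is ours and is not taken from the text. A feasible matrix only certifies the upper bound $\vartheta(C_5) \le \Lambda(A)$, not the minimum itself. The eigenvalues of a real symmetric circulant matrix with first row $c$ are $\lambda_j = \sum_k c_k \cos(2\pi jk/n)$, which avoids any linear algebra library.

```python
import math

n = 5

# Free parameter on the edge entries, chosen so that the eigenvalue on the
# all-ones vector, 5 + 2t, equals the eigenvalue 2t*cos(4*pi/5).
t = 5 / (2 * (math.cos(4 * math.pi / n) - 1))

# First row of the circulant matrix A for C_5 (vertex 0 against 0..4):
# diagonal and non-edges carry 1, the two edges {0,1} and {0,4} carry 1 + t.
first_row = [1.0, 1.0 + t, 1.0, 1.0, 1.0 + t]


def circulant_eigenvalues(c):
    """Eigenvalues of a real symmetric circulant matrix with first row c."""
    size = len(c)
    return [
        sum(c[k] * math.cos(2 * math.pi * j * k / size) for k in range(size))
        for j in range(size)
    ]


largest_eigenvalue = max(circulant_eigenvalues(first_row))
print(largest_eigenvalue, math.sqrt(5))
```

The largest eigenvalue of this matrix is $\sqrt{5}$, so $\vartheta(C_5) \le \sqrt{5}$, which matches the value $\Theta(C_5) = \sqrt{5}$ from [Lov79].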

3.3.2 Fractional graph parameters

Besides the Lovász theta number, there are several elements in $X(\mathcal{G})$ that are naturally obtained as fractional versions of $\boxtimes$-submultiplicative, $\sqcup$-subadditive, $\leqslant$-monotone maps $\mathcal{G} \to \mathbb{R}_{\ge 0}$. For any map $\phi : \mathcal{G} \to \mathbb{R}_{\ge 0}$ we define a fractional version $\phi_f$ by

\[ \phi_f(G) = \inf_d \frac{\phi(G \ltimes \overline{K_d})}{d}, \]

where $\ltimes$ denotes the lexicographic product of graphs, defined below. We will discuss several fractional parameters from the literature and prove a general theorem about fractional parameters.

Fractional clique cover number

We consider the fractional version of the clique cover number $\overline{\chi}(G) = \chi(\overline{G})$. It is well-known that $\overline{\chi}_f \in X(\mathcal{G})$, see e.g. [Sch03]. The fractional clique cover number $\overline{\chi}_f$ in fact equals the asymptotic clique cover number $\tilde{\overline{\chi}}(G) = \lim_{N \to \infty} \overline{\chi}(G^{\boxtimes N})^{1/N}$ which we introduced in the previous section, see [MP71] and also [Sch03, Th. 67.17].
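The infimum in the definition of $\phi_f$ can be explored numerically for small $d$. The sketch below evaluates $\overline{\chi}(C_5 \ltimes \overline{K_d})/d$ for $d = 1, 2$, computing the clique cover number as the chromatic number of the complement by backtracking. The helper names are hypothetical, and the search is exponential, so it is meant for tiny graphs only.

```python
from itertools import combinations


def cycle(n):
    """Cycle graph C_n as (vertices, edges)."""
    return list(range(n)), {frozenset((i, (i + 1) % n)) for i in range(n)}


def complement(vertices, edges):
    """Graph complement on the same vertex set."""
    return vertices, {frozenset(p) for p in combinations(vertices, 2)} - edges


def lex_with_empty(vertices, edges, d):
    """Lexicographic product of G with the edgeless graph on d vertices:
    every vertex is replaced by d pairwise non-adjacent copies."""
    new_vertices = [(g, i) for g in vertices for i in range(d)]
    new_edges = {
        frozenset((p, q))
        for p, q in combinations(new_vertices, 2)
        if frozenset((p[0], q[0])) in edges
    }
    return new_vertices, new_edges


def chromatic_number(vertices, edges):
    """Smallest k admitting a proper k-colouring (plain backtracking)."""
    order = list(vertices)
    nbrs = {v: set() for v in order}
    for e in edges:
        u, v = tuple(e)
        nbrs[u].add(v)
        nbrs[v].add(u)

    def colourable(k):
        colour = {}

        def extend(i):
            if i == len(order):
                return True
            v = order[i]
            used = {colour[u] for u in nbrs[v] if u in colour}
            # Symmetry breaking: reuse an existing colour or open one new colour.
            limit = min(k, max(colour.values(), default=-1) + 2)
            for c in range(limit):
                if c not in used:
                    colour[v] = c
                    if extend(i + 1):
                        return True
                    del colour[v]
            return False

        return extend(0)

    k = 0
    while not colourable(k):
        k += 1
    return k


def clique_cover_number(vertices, edges):
    """Clique cover number = chromatic number of the complement."""
    return chromatic_number(*complement(vertices, edges))


C5 = cycle(5)
cover = {d: clique_cover_number(*lex_with_empty(*C5, d)) for d in (1, 2)}
ratios = {d: cover[d] / d for d in cover}
print(cover, ratios)
```

Already at $d = 2$ the ratio drops from $3$ to $5/2$, which is the known value of $\overline{\chi}_f(C_5)$; the sketch does not prove that larger $d$ cannot do better.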

Fractional Haemers bound

Let $\operatorname{rank}(A)$ denote the matrix rank of any matrix $A$. For any set $C$ of matrices, define $\operatorname{rank}(C) = \min\{\operatorname{rank}(A) : A \in C\}$. For a field $\mathbb{F}$ and a graph $G$, define the set of matrices

\[ M^{\mathbb{F}}(G) = \bigl\{ A \in \mathbb{F}^{V(G) \times V(G)} : A_{vv} \ne 0 \text{ for all } v, \text{ and } A_{uv} = 0 \text{ for all distinct } u, v \text{ with } \{u, v\} \notin E(G) \bigr\}. \]

Let $R^{\mathbb{F}}(G) = \operatorname{rank}(M^{\mathbb{F}}(G))$. The parameter $R^{\mathbb{F}}(G)$ was introduced by Haemers in [Hae79] and is known as the Haemers bound. The fractional Haemers bound $R^{\mathbb{F}}_f$ was studied by Anna Blasiak in [Bla13] and was recently shown to be $\boxtimes$-multiplicative by Bukh and Cox in [BC18]. From this it is not hard to prove that $R^{\mathbb{F}}_f \in X(\mathcal{G})$. Bukh and Cox in [BC18] furthermore prove a separation result: for any field $\mathbb{F}$ of nonzero characteristic and any $\varepsilon > 0$, there is a graph $G$ such that for any field $\mathbb{F}'$ with $\operatorname{char}(\mathbb{F}) \ne \operatorname{char}(\mathbb{F}')$ the inequality $R^{\mathbb{F}}_f(G) < \varepsilon R^{\mathbb{F}'}_f(G)$ holds. This separation result implies that there are infinitely many elements in $X(\mathcal{G})$.

Fractional orthogonal rank

In [CMR+14] the orthogonal rank $\xi(G)$ and its fractional version, the projective rank $\xi_f(G)$, are studied. It easily follows from results in [CMR+14] that $G \mapsto \xi_f(\overline{G})$ is in $X(\mathcal{G})$.

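General fractional parameters

We will prove something general about fractional parameters. Define the lexicographic product $G \ltimes H$ by

\begin{align*}
V(G \ltimes H) &= V(G) \times V(H), \\
E(G \ltimes H) &= \bigl\{ \{(g, h), (g', h')\} : \{g, g'\} \in E(G) \text{ or } (g = g' \text{ and } \{h, h'\} \in E(H)) \bigr\}.
\end{align*}

The lexicographic product satisfies $\overline{G \ltimes H} = \overline{G} \ltimes \overline{H}$. Also define the or-product $G * H$ by

\begin{align*}
V(G * H) &= V(G) \times V(H), \\
E(G * H) &= \bigl\{ \{(g, h), (g', h')\} : \{g, g'\} \in E(G) \text{ or } \{h, h'\} \in E(H) \bigr\}.
\end{align*}

The or-product and the strong graph product are related by $\overline{G * H} = \overline{G} \boxtimes \overline{H}$. The strong graph product gives a subgraph of the lexicographic product, which gives a subgraph of the or-product,

\[ G \boxtimes H \subseteq G \ltimes H \subseteq G * H. \]

Therefore $G * H \leqslant G \ltimes H \leqslant G \boxtimes H$. Finally, $G \ltimes \overline{K_d} = G * \overline{K_d}$ and, of course, $G \boxtimes \overline{K_d} = G^{\sqcup d}$.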
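The identities relating the three products can be sanity-checked on small graphs. The sketch below builds the strong, lexicographic and or-product from a shared adjacency rule and tests the complement identities and the subgraph chain for $G = C_5$ and $H$ a path on three vertices. The function names and the choice of test graphs are our own; passing on one pair of graphs is of course no proof.

```python
from itertools import combinations, product


def complement(vertices, edges):
    """Graph complement on the same vertex set."""
    return vertices, {frozenset(p) for p in combinations(vertices, 2)} - edges


def make_product(rule):
    """Build a graph product on V(G) x V(H) from a rule that decides adjacency
    of two distinct pairs from (g == g', gg' in E(G), h == h', hh' in E(H))."""
    def graph_product(G, H):
        (VG, EG), (VH, EH) = G, H
        vertices = list(product(VG, VH))
        edges = set()
        for p, q in combinations(vertices, 2):
            (g, h), (g2, h2) = p, q
            eg = frozenset((g, g2)) in EG
            eh = frozenset((h, h2)) in EH
            if rule(g == g2, eg, h == h2, eh):
                edges.add(frozenset((p, q)))
        return vertices, edges
    return graph_product


strong = make_product(lambda same_g, eg, same_h, eh: (same_g or eg) and (same_h or eh))
lexicographic = make_product(lambda same_g, eg, same_h, eh: eg or (same_g and eh))
or_product = make_product(lambda same_g, eg, same_h, eh: eg or eh)

C5 = (list(range(5)), {frozenset((i, (i + 1) % 5)) for i in range(5)})
P3 = ([0, 1, 2], {frozenset((0, 1)), frozenset((1, 2))})
empty3 = ([0, 1, 2], set())  # the complement of K_3

G, H = C5, P3
G_bar, H_bar = complement(*G), complement(*H)

lex_complement_ok = complement(*lexicographic(G, H))[1] == lexicographic(G_bar, H_bar)[1]
or_complement_ok = complement(*or_product(G, H))[1] == strong(G_bar, H_bar)[1]
chain_ok = strong(G, H)[1] <= lexicographic(G, H)[1] <= or_product(G, H)[1]
empty_factor_ok = lexicographic(G, empty3)[1] == or_product(G, empty3)[1]
print(lex_complement_ok, or_complement_ok, chain_ok, empty_factor_ok)
```

Because $\leqslant$ is defined through homomorphisms between complements, the subgraph chain on the same vertex set reverses when read as a chain in the preorder, which is the statement $G * H \leqslant G \ltimes H \leqslant G \boxtimes H$.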

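We will prove: if $\phi : \mathcal{G} \to \mathbb{R}_{\ge 0}$ is $\boxtimes$-submultiplicative, $\sqcup$-subadditive and $\leqslant$-monotone, then $\phi_f$ is again $\boxtimes$-submultiplicative, $\sqcup$-subadditive and $\leqslant$-monotone. Moreover, if $\phi : \mathcal{G} \to \mathbb{N}$ is $\leqslant$-monotone and satisfies

\[ \forall G, H \in \mathcal{G} \quad \phi(G \ltimes H) \ge \phi(G \ltimes \overline{K_{\phi(H)}}), \]

then $\phi_f$ is $\ltimes$-supermultiplicative and, more importantly, $\phi_f$ is $\boxtimes$-supermultiplicative.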

Lemma 3.11.

(i) If $\phi$ is $\sqcup$-superadditive, then $\phi_f$ is $\sqcup$-superadditive.

(ii) If $\phi$ is $\leqslant$-monotone, then $\phi_f$ is $\leqslant$-monotone.

(iii) If $\phi$ is $\sqcup$-subadditive and $\leqslant$-monotone, then $\phi_f$ is $\sqcup$-subadditive.

(iv) If $\forall n \in \mathbb{N}\ \phi(\overline{K_n}) = n$, then $\forall n \in \mathbb{N}\ \phi_f(\overline{K_n}) = n$.

(v) If $\phi$ is $\boxtimes$-submultiplicative and $\leqslant$-monotone, then $\phi_f$ is $\boxtimes$-submultiplicative.

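Proof. Let $G, H \in \mathcal{G}$. Let $d \in \mathbb{N}$.

(i) The lexicographic product distributes over the disjoint union,

\[ (G \sqcup H) \ltimes \overline{K_d} = (G \ltimes \overline{K_d}) \sqcup (H \ltimes \overline{K_d}). \]

By superadditivity,

\[ \phi\bigl((G \ltimes \overline{K_d}) \sqcup (H \ltimes \overline{K_d})\bigr) \ge \phi(G \ltimes \overline{K_d}) + \phi(H \ltimes \overline{K_d}). \]

Therefore

\begin{align*}
\phi_f(G \sqcup H) &= \inf_d \frac{\phi((G \sqcup H) \ltimes \overline{K_d})}{d} \\
&= \inf_d \frac{\phi((G \ltimes \overline{K_d}) \sqcup (H \ltimes \overline{K_d}))}{d} \\
&\ge \inf_d \Bigl( \frac{\phi(G \ltimes \overline{K_d})}{d} + \frac{\phi(H \ltimes \overline{K_d})}{d} \Bigr) \\
&\ge \inf_{d_1} \frac{\phi(G \ltimes \overline{K_{d_1}})}{d_1} + \inf_{d_2} \frac{\phi(H \ltimes \overline{K_{d_2}})}{d_2} \\
&= \phi_f(G) + \phi_f(H).
\end{align*}

(ii) Let $G \leqslant H$. Then $G \ltimes \overline{K_d} \leqslant H \ltimes \overline{K_d}$. Thus $\phi(G \ltimes \overline{K_d}) \le \phi(H \ltimes \overline{K_d})$. Therefore $\phi_f(G) \le \phi_f(H)$.

(iii) We have $G \ltimes \overline{K_d} \leqslant G \boxtimes \overline{K_d} = G^{\sqcup d}$. Thus by monotonicity and subadditivity

\[ \phi(G \ltimes \overline{K_d}) \le d\,\phi(G), \]

and for $d, e \in \mathbb{N}$,

\[ \phi(G \ltimes \overline{K_{de}}) = \phi\bigl((G \ltimes \overline{K_d}) \ltimes \overline{K_e}\bigr) \le e\,\phi(G \ltimes \overline{K_d}). \]

We use this inequality to get, for $d_1, d_2 \in \mathbb{N}$,

\[ \frac{\phi(G \ltimes \overline{K_{d_1}})}{d_1} + \frac{\phi(H \ltimes \overline{K_{d_2}})}{d_2} \ge \frac{\phi(G \ltimes \overline{K_{d_1 d_2}}) + \phi(H \ltimes \overline{K_{d_1 d_2}})}{d_1 d_2}. \]

From subadditivity follows

\begin{align*}
\frac{\phi(G \ltimes \overline{K_{d_1 d_2}}) + \phi(H \ltimes \overline{K_{d_1 d_2}})}{d_1 d_2}
&\ge \frac{\phi\bigl((G \ltimes \overline{K_{d_1 d_2}}) \sqcup (H \ltimes \overline{K_{d_1 d_2}})\bigr)}{d_1 d_2} \\
&= \frac{\phi\bigl((G \sqcup H) \ltimes \overline{K_{d_1 d_2}}\bigr)}{d_1 d_2} \\
&\ge \phi_f(G \sqcup H).
\end{align*}

We conclude $\phi_f(G) + \phi_f(H) \ge \phi_f(G \sqcup H)$.

(iv) Let $n \in \mathbb{N}$. Then $\phi_f(\overline{K_n}) = \inf_d \phi(\overline{K_n} \ltimes \overline{K_d})/d = \inf_d \phi(\overline{K_{nd}})/d = n$.

(v) Let $d_1, d_2 \in \mathbb{N}$. We claim

\[ (G \boxtimes H) \ltimes \overline{K_{d_1 d_2}} \leqslant (G \ltimes \overline{K_{d_1}}) \boxtimes (H \ltimes \overline{K_{d_2}}). \]

This is the same as saying there is a graph homomorphism

\[ \overline{(G \boxtimes H) \ltimes \overline{K_{d_1 d_2}}} \to \overline{(G \ltimes \overline{K_{d_1}}) \boxtimes (H \ltimes \overline{K_{d_2}})}, \]

which is the same as saying there is a graph homomorphism

\[ (\overline{G} * \overline{H}) \ltimes K_{d_1 d_2} \to (\overline{G} \ltimes K_{d_1}) * (\overline{H} \ltimes K_{d_2}), \]

where $*$ denotes the or-product of graphs. One verifies that $(g, h, (i, j)) \mapsto ((g, i), (h, j))$ is such a graph homomorphism, proving the claim. The claim together with monotonicity and submultiplicativity gives

\[ \phi\bigl((G \boxtimes H) \ltimes \overline{K_{d_1 d_2}}\bigr) \le \phi\bigl((G \ltimes \overline{K_{d_1}}) \boxtimes (H \ltimes \overline{K_{d_2}})\bigr) \le \phi(G \ltimes \overline{K_{d_1}})\,\phi(H \ltimes \overline{K_{d_2}}). \]

Therefore

\begin{align*}
\phi_f(G \boxtimes H) &= \inf_d \frac{\phi((G \boxtimes H) \ltimes \overline{K_d})}{d} \\
&= \inf_{d_1, d_2} \frac{\phi((G \boxtimes H) \ltimes \overline{K_{d_1 d_2}})}{d_1 d_2} \\
&\le \inf_{d_1, d_2} \frac{\phi(G \ltimes \overline{K_{d_1}})}{d_1} \cdot \frac{\phi(H \ltimes \overline{K_{d_2}})}{d_2} \\
&= \phi_f(G)\,\phi_f(H).
\end{align*}

This concludes the proof of the lemma.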

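Lemma 3.12. Let $\phi : \mathcal{G} \to \mathbb{N}$ satisfy

\[ \forall G, H \in \mathcal{G} \quad \phi(G \ltimes H) \ge \phi(G \ltimes \overline{K_{\phi(H)}}). \tag{3.2} \]

Then

\[ \inf_H \frac{\phi(G \ltimes H)}{\phi(H)} = \inf_d \frac{\phi(G \ltimes \overline{K_d})}{d}. \]

Proof. From (3.2) follows

\[ \frac{\phi(G \ltimes H)}{\phi(H)} \ge \frac{\phi(G \ltimes \overline{K_{\phi(H)}})}{\phi(H)} \]

and so

\[ \frac{\phi(G \ltimes H)}{\phi(H)} \ge \inf_d \frac{\phi(G \ltimes \overline{K_d})}{d}. \]

We take the infimum over $H$ to get

\[ \inf_H \frac{\phi(G \ltimes H)}{\phi(H)} \ge \inf_d \frac{\phi(G \ltimes \overline{K_d})}{d}. \]

The inequality in the other direction,

\[ \inf_H \frac{\phi(G \ltimes H)}{\phi(H)} \le \inf_d \frac{\phi(G \ltimes \overline{K_d})}{d}, \]

is trivially true.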

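Lemma 3.13. Let $\phi : \mathcal{G} \to \mathbb{N}$ be $\leqslant$-monotone and satisfy

\[ \forall G, H \in \mathcal{G} \quad \phi(G \ltimes H) \ge \phi(G \ltimes \overline{K_{\phi(H)}}). \]

Then $\phi_f$ is $\ltimes$- and $\boxtimes$-supermultiplicative.

Proof. Let $A, B \in \mathcal{G}$. We have $A \boxtimes B \geqslant A \ltimes B$, so

\[ \phi_f(A \boxtimes B) \ge \phi_f(A \ltimes B). \]

It remains to show $\phi_f(A \ltimes B) \ge \phi_f(A)\,\phi_f(B)$. We have

\[ \frac{\phi(A \ltimes B \ltimes H)}{\phi(H)} = \frac{\phi(A \ltimes (B \ltimes H))}{\phi(B \ltimes H)} \cdot \frac{\phi(B \ltimes H)}{\phi(H)}, \]

which implies

\[ \frac{\phi(A \ltimes B \ltimes H)}{\phi(H)} \ge \inf_{H'} \frac{\phi(A \ltimes H')}{\phi(H')} \cdot \inf_{H''} \frac{\phi(B \ltimes H'')}{\phi(H'')} = \phi_f(A)\,\phi_f(B). \]

Take the infimum over $H$ to obtain $\phi_f(A \ltimes B) \ge \phi_f(A)\,\phi_f(B)$.

Theorem 3.14. Let $\phi : \mathcal{G} \to \mathbb{N}$ be $\sqcup$-additive, $\boxtimes$-submultiplicative, $\leqslant$-monotone and $\overline{K_n}$-normalised, and satisfy

\[ \forall G, H \in \mathcal{G} \quad \phi(G \ltimes H) \ge \phi(G \ltimes \overline{K_{\phi(H)}}). \]

Then $\phi_f$ is in $X(\mathcal{G})$.

Proof. This follows from Lemma 3.11, Lemma 3.12 and Lemma 3.13.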

3.4 Conclusion

In this chapter we introduced a new connection between Strassen's theory of asymptotic spectra and the Shannon capacity of graphs. In particular, we characterised the Shannon capacity (which is defined as a supremum) as a minimisation over elements in the asymptotic spectrum of graphs. Known elements in the asymptotic spectrum of graphs include the fractional clique cover number, the Lovász theta number, the projective rank and the fractional Haemers bound. We are left with a clear goal for future work: find all elements in the asymptotic spectrum of graphs.

Chapter 4

The asymptotic spectrum of tensors; exponent of matrix multiplication

This chapter is based on joint work with Matthias Christandl and Péter Vrana [CVZ18].

4.1 Introduction

This chapter is about tensors $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ and their asymptotic properties. The theory of asymptotic spectra of Chapter 2 was developed by Strassen exactly for the purpose of understanding the asymptotic properties of tensors. This chapter is expository and provides the necessary background for understanding Chapter 5 and Chapter 6.

Let us first define the asymptotic properties of interest and discuss some of their applications. We need the concepts restriction, tensor product and diagonal tensor. Let $s \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ and $t \in \mathbb{F}^{m_1} \otimes \cdots \otimes \mathbb{F}^{m_k}$ be tensors. We say $s$ restricts to $t$, and write $s \geqslant t$, if there are linear maps $A_i : \mathbb{F}^{n_i} \to \mathbb{F}^{m_i}$ such that $t = (A_1 \otimes \cdots \otimes A_k) \cdot s$. The tensor product of $s$ and $t$ is the element $s \otimes t \in \mathbb{F}^{n_1 m_1} \otimes \cdots \otimes \mathbb{F}^{n_k m_k}$ with coordinates $(s \otimes t)_{ij} = s_i t_j$. We naturally define the direct sum $s \oplus t \in \mathbb{F}^{n_1 + m_1} \otimes \cdots \otimes \mathbb{F}^{n_k + m_k}$. We define the diagonal tensors $\langle n \rangle = \sum_{i=1}^{n} e_i \otimes \cdots \otimes e_i$ for $n \in \mathbb{N}$, where $e_1, \ldots, e_n$ is the standard basis of $\mathbb{F}^n$. The tensor rank $R(t)$ is the smallest number $n \in \mathbb{N}$ such that $t$ can be written as a sum of $n$ simple tensors, a simple tensor being a tensor of the form $v_1 \otimes \cdots \otimes v_k$. Equivalently, $R(t) = \min\{n \in \mathbb{N} : t \leqslant \langle n \rangle\}$. The asymptotic rank is the regularisation $\tilde{R}(t) = \lim_{n \to \infty} R(t^{\otimes n})^{1/n}$. While tensor rank is known to be hard to compute [Hås90, Shi16], we do not know whether asymptotic rank is hard to compute.
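The operations just defined can be sketched in a few lines of Python. Here a tensor is stored sparsely as a dictionary from index tuples (0-based) to nonzero coefficients, and the pair index $(i, j)$ of a tensor product is flattened to $i \cdot m + j$; both conventions, and all function names, are assumptions of this sketch rather than notation from the text.

```python
from itertools import product


def diagonal_tensor(n, k=3):
    """The diagonal k-tensor <n>: coefficient 1 at (i, ..., i) for i < n."""
    return {(i,) * k: 1 for i in range(n)}


def tensor_product(s, t, dims_t):
    """Tensor (Kronecker) product of sparse tensors.

    dims_t lists the local dimensions of t; the index pair (i, j) in each
    tensor leg is flattened to i * dims_t[leg] + j.
    """
    out = {}
    for (I, x), (J, y) in product(s.items(), t.items()):
        idx = tuple(i * m + j for i, j, m in zip(I, J, dims_t))
        out[idx] = out.get(idx, 0) + x * y
    return out


def direct_sum(s, t, dims_s):
    """Direct sum: t is placed in the block after s in every tensor leg."""
    out = dict(s)
    for J, y in t.items():
        out[tuple(j + n for j, n in zip(J, dims_s))] = y
    return out


d2, d3 = diagonal_tensor(2), diagonal_tensor(3)
product_is_diagonal = tensor_product(d2, d3, (3, 3, 3)) == diagonal_tensor(6)
sum_is_diagonal = direct_sum(d2, d3, (2, 2, 2)) == diagonal_tensor(5)
print(product_is_diagonal, sum_is_diagonal)
```

With these conventions $\langle 2 \rangle \otimes \langle 3 \rangle = \langle 6 \rangle$ and $\langle 2 \rangle \oplus \langle 3 \rangle = \langle 5 \rangle$, which is the sense in which the diagonal tensors behave like the natural numbers.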

The exponent of matrix multiplication

The motivating example for studying asymptotic rank is the problem of finding the exponent of matrix multiplication $\omega$. Recall from the introduction that $\omega$ is the infimum over $a \in \mathbb{R}$ such that two $n \times n$ matrices can be multiplied using $O(n^a)$ arithmetic operations (in the algebraic circuit model). It turns out (see [BCS97]) that $\omega$ is characterised by the asymptotic rank $\tilde{R}(\langle 2, 2, 2 \rangle)$ of the matrix multiplication tensor

\[ \langle 2, 2, 2 \rangle = \sum_{i, j, k \in [2]} e_{ij} \otimes e_{jk} \otimes e_{ki} \in \mathbb{F}^4 \otimes \mathbb{F}^4 \otimes \mathbb{F}^4. \]

Namely, $\tilde{R}(\langle 2, 2, 2 \rangle) = 2^{\omega}$. We know the trivial lower bound $2 \le \omega$, see Section 4.3. We know the (non-trivial) upper bound $\omega \le 2.3728639$, which is by Coppersmith and Winograd [CW90] and improvements by Stothers, Williams and Le Gall [Sto10, Wil12, LG14].
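To see how a rank bound on $\langle 2, 2, 2 \rangle$ turns into an algorithm, recall Strassen's classical scheme, which multiplies two $2 \times 2$ matrices with 7 instead of 8 multiplications. This is the statement $R(\langle 2, 2, 2 \rangle) \le 7$ and, applied recursively to block matrices, gives $\omega \le \log_2 7 \approx 2.81$. The sketch below is the textbook form of that scheme and is not one of the algorithms behind the bound $2.3728639$ quoted above.

```python
def strassen_2x2(A, B):
    """Multiply two 2x2 matrices using Strassen's seven products."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B

    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)

    return [
        [m1 + m4 - m5 + m7, m3 + m5],
        [m2 + m4, m1 - m2 + m3 + m6],
    ]


def naive_2x2(A, B):
    """Reference product using the usual eight multiplications."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
strassen_result = strassen_2x2(A, B)
naive_result = naive_2x2(A, B)
print(strassen_result, naive_result)
```

Each of the seven products is a product of a linear form in the entries of $A$ with a linear form in the entries of $B$, which is exactly what a decomposition of the matrix multiplication tensor into seven simple tensors encodes.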

Asymptotic subrank and asymptotic restriction

Besides (asymptotic) rank, we naturally define subrank $Q(t) = \max\{m \in \mathbb{N} : \langle m \rangle \leqslant t\}$ and the asymptotic subrank $\tilde{Q}(t) = \lim_{n \to \infty} Q(t^{\otimes n})^{1/n}$. Moreover, we say $s$ restricts asymptotically to $t$, written $s \gtrsim t$, if there is a sequence of natural numbers $a(n) \in o(n)$ such that for all $n \in \mathbb{N}$

\[ s^{\otimes n} \otimes \langle 2 \rangle^{\otimes a(n)} \geqslant t^{\otimes n}. \]

One can prove (see [Str91]) that

\[ s^{\otimes n} \otimes \langle 2 \rangle^{\otimes o(n)} \geqslant t^{\otimes n} \quad \text{iff} \quad s^{\otimes n + o(n)} \geqslant t^{\otimes n}. \]

Our goal is to understand asymptotic restriction, asymptotic rank and asymptotic subrank.

More connections: quantum information, combinatorics, algebraic property testing

Besides matrix multiplication, other applications of asymptotic restriction of tensors, asymptotic rank of tensors and asymptotic subrank of tensors include deciding the feasibility of an asymptotic transformation between pure quantum states via stochastic local operations and classical communication (SLOCC) in quantum information theory [BPR+00, DVC00, VDDMV02, HHHH09], bounding the size of combinatorial structures like cap sets and tri-colored sum-free sets in additive combinatorics [Ede04, Tao08, ASU13, CLP17, EG17, Tao16, BCC+17, KSS16, TS16], see Chapter 5, and bounding the query complexity of certain properties in algebraic property testing [KS08, BCSX10, Sha09, BX15, HX17, FK14].

This chapter is organised as follows. In Section 4.2 we briefly discuss the semiring of tensors, the asymptotic spectrum of tensors, and asymptotic rank and subrank. In Section 4.3 we discuss the gauge points, a simple construction of finitely many elements in the asymptotic spectrum of tensors. In Section 4.4 we discuss the Strassen support functionals, a family of elements in the asymptotic spectrum of "oblique" tensors. This family is parametrised by probability distributions on $[k]$. In Section 4.5 we discuss an extension of the support functionals, called the Strassen upper support functionals, which have the potential to be universal. Finally, in Section 4.6, we prove a new result: we show how asymptotic slice rank is related to the support functionals.

4.2 The asymptotic spectrum of tensors

Let us properly set up the semiring of tensors and the asymptotic spectrum. For the proofs we refer to [Str87, Str88, Str91].

4.2.1 The semiring of tensor equivalence classes $\mathcal{T}$

We begin by putting an equivalence relation on tensors. For example, we want to identify isomorphic tensors, and also, for any tensor $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$, we want to identify $t$ with $t \oplus 0$, where $0 \in \mathbb{F}^{m_1} \otimes \cdots \otimes \mathbb{F}^{m_k}$ is a zero tensor of any format.

We say $s$ is isomorphic to $t$, and write $s \cong t$, if there are bijective linear maps $A_i : \mathbb{F}^{m_i} \to \mathbb{F}^{n_i}$ such that $t = (A_1, \ldots, A_k) \cdot s$.

We say $s$ and $t$ are equivalent, and write $s \sim t$, if there are zero tensors $s_0 = 0 \in \mathbb{F}^{a_1} \otimes \cdots \otimes \mathbb{F}^{a_k}$ and $t_0 = 0 \in \mathbb{F}^{b_1} \otimes \cdots \otimes \mathbb{F}^{b_k}$ such that $s \oplus s_0 \cong t \oplus t_0$. The equivalence relation $\sim$ is in fact the equivalence relation generated by the restriction preorder $\leqslant$.

Let $\mathcal{T}$ be the set of $\sim$-equivalence classes of $k$-tensors over $\mathbb{F}$, for some fixed $k$ and field $\mathbb{F}$. The direct sum and the tensor product naturally carry over to $\mathcal{T}$, and $\mathcal{T}$ becomes a semiring with additive unit $\langle 0\rangle$ and multiplicative unit $\langle 1\rangle$ (more precisely, the equivalence classes of those tensors, but we will not make this distinction).

4.2.2 Strassen preorder via restriction

Restriction $\leqslant$ induces a partial order on $\mathcal{T}$ which behaves well with respect to the semiring operations, and naturally $n \le m$ if and only if $\langle n\rangle \leqslant \langle m\rangle$. Therefore restriction $\leqslant$ is a Strassen preorder on $\mathcal{T}$.

4.2.3 The asymptotic spectrum of tensors $X(\mathcal{T})$

Let $S \subseteq \mathcal{T}$ be a subsemiring. Let

$$X(S) = X(S, \leqslant) = \{\varphi \in \mathrm{Hom}(S, \mathbb{R}_{\ge 0}) : \forall a, b \in S,\ a \leqslant b \Rightarrow \varphi(a) \le \varphi(b)\}.$$

We call $X(S)$ the asymptotic spectrum of $S$, and we call $X(\mathcal{T})$ the asymptotic spectrum of $k$-tensors over $\mathbb{F}$.

Theorem 4.1 ([Str88]). Let $s, t \in S$. Then $s \lesssim t$ iff $\forall \varphi \in X(S)$: $\varphi(s) \le \varphi(t)$.

Proof. This follows from Theorem 2.12.

We refer to Chapter 2 for a discussion of the topological properties of $X(S)$.

Remark 4.2. We mention that $X(S)$ may equivalently be defined with degeneration $\trianglerighteq$ instead of restriction $\geqslant$. Over $\mathbb{C}$, we say $f$ degenerates to $g$, written $f \trianglerighteq g$, if $f \cong f'$ and $g \cong g'$ and $g'$ is in the Euclidean closure (or equivalently Zariski closure) of the orbit $\mathrm{GL}_{n_1} \times \cdots \times \mathrm{GL}_{n_k} \cdot f'$. It is a nontrivial fact from algebraic geometry (see [Kra84, Lemma III.2.3.1] or [BCS97]) that there is a degeneration $f \trianglerighteq g$ if and only if there are matrices $A_i$ with entries polynomial in $\varepsilon$ such that $(A_1, \ldots, A_k) \cdot f = \varepsilon^d g + \varepsilon^{d+1} g_1 + \cdots + \varepsilon^{d+e} g_e$ for some elements $g_1, \ldots, g_e$. The latter definition of degeneration is valid when $\mathbb{C}$ is replaced by an arbitrary field $\mathbb{F}$, and that is how degeneration is defined for an arbitrary field. Degeneration is weaker than restriction: $f \geqslant g$ implies $f \trianglerighteq g$. Asymptotically, however, the notions coincide: $f \gtrsim g$ if and only if $f^{\otimes n} \otimes \langle 2\rangle^{\otimes o(n)} \trianglerighteq g^{\otimes n}$. We mention that, analogous to restriction, degeneration gives rise to border rank and border subrank, $\underline{R}(f) = \min\{r \in \mathbb{N} : f \trianglelefteq \langle r\rangle\}$ and $\underline{Q}(f) = \max\{s \in \mathbb{N} : \langle s\rangle \trianglelefteq f\}$, respectively.

4.2.4 Asymptotic rank and asymptotic subrank

The abstract theory of asymptotic spectra characterises asymptotic subrank and asymptotic rank as follows.

Corollary 4.3. Let $S \subseteq \mathcal{T}$ be a subsemiring. Let $a \in S$. Then

$$\tilde{Q}(a) = \min_{\varphi \in X(S)} \varphi(a), \tag{4.1}$$

$$\tilde{R}(a) = \max_{\varphi \in X(S)} \varphi(a). \tag{4.2}$$

Proof. Statement (4.2) follows from Corollary 2.13, since either $a = 0$ or $a \geqslant 1$. For statement (4.1), if $a^{\otimes k} \geqslant 2$ for some $k \in \mathbb{N}$, then we apply Corollary 2.14. Otherwise, one can show that $\tilde{Q}(a)$ equals 0 or 1, using the gauge points of the next section (see [Str88, Lemma 3.7]).

Remark 4.4. One verifies that $\tilde{R}$ and $\tilde{Q}$ are $\leqslant$-monotones and have value $n$ on $\langle n\rangle$. They are not universal spectral points, however. Namely, the asymptotic rank of each of the three tensors

$$\langle 2,1,1\rangle = e_1 \otimes e_1 \otimes 1 + e_2 \otimes e_2 \otimes 1 \in \mathbb{F}^2 \otimes \mathbb{F}^2 \otimes \mathbb{F}^1$$

$$\langle 1,1,2\rangle = e_1 \otimes 1 \otimes e_1 + e_2 \otimes 1 \otimes e_2 \in \mathbb{F}^2 \otimes \mathbb{F}^1 \otimes \mathbb{F}^2$$

$$\langle 1,2,1\rangle = 1 \otimes e_1 \otimes e_1 + 1 \otimes e_2 \otimes e_2 \in \mathbb{F}^1 \otimes \mathbb{F}^2 \otimes \mathbb{F}^2$$

equals 2, whereas their tensor product equals the matrix multiplication tensor $\langle 2,2,2\rangle$, whose tensor rank equals 7 and whose asymptotic rank is thus at most 7, i.e. strictly smaller than $2^3$. Therefore asymptotic rank is not multiplicative. On the other hand, the asymptotic subrank of each of the above three tensors equals 1, whereas the asymptotic subrank of $\langle 2,2,2\rangle$ equals 4, see Chapter 5. Therefore asymptotic subrank is not multiplicative.

Goal 4.5. Our goal is now to explicitly describe elements in $X(\mathcal{T})$ (universal spectral points), or more modestly, to describe elements in $X(S)$ for interesting subsemirings $S \subseteq \mathcal{T}$.

Strassen constructed a finite family of elements in $X(\mathcal{T})$, the gauge points, and an infinite family of elements in $X(\{\text{oblique tensors}\})$, the support functionals. The support functionals are powerful enough to determine the asymptotic subrank of any “tight tensor”. Tight tensors are discussed in Chapter 5. In Chapter 6 we construct an infinite family in $X(\{k\text{-tensors over } \mathbb{C}\})$, the quantum functionals. In the rest of this chapter we discuss the gauge points and the support functionals. We will focus on the case $k = 3$ for clarity of exposition.

4.3 Gauge points $\zeta_{(i)}$

Strassen in [Str88] introduced a finite family of elements in $X(\mathcal{T})$ called the gauge points. We focus on 3-tensors, but the construction generalises immediately to $k$-tensors. Let $V_i = \mathbb{F}^{n_i}$. Let $t \in V_1 \otimes V_2 \otimes V_3$. Let $i \in [3]$. Let $\mathrm{flatten}_i(t)$ be the image of $t$ under the grouping $V_1 \otimes V_2 \otimes V_3 \to V_i \otimes (\bigotimes_{j \neq i} V_j)$. We think of $\mathrm{flatten}_i(t)$ as a matrix. Let $\zeta_{(i)} : \mathcal{T} \to \mathbb{N} : t \mapsto \mathrm{rank}(\mathrm{flatten}_i(t))$, with rank denoting matrix rank. We call $\zeta_{(1)}, \zeta_{(2)}, \zeta_{(3)}$ the gauge points. From the properties of matrix rank it follows directly that $\zeta_{(i)}$ is multiplicative under $\otimes$, additive under $\oplus$, monotone under restriction $\leqslant$ (and under degeneration $\trianglelefteq$), and normalised to 1 on $\langle 1\rangle = e_1 \otimes e_1 \otimes e_1$.

Theorem 4.6. $\zeta_{(1)}, \zeta_{(2)}, \zeta_{(3)} \in X(\mathcal{T})$.
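The definition of the gauge points can be made concrete with a short computation. The following is a minimal sketch, assuming the base field is the rationals and a tensor is stored as a sparse dictionary from 0-based index triples to coefficients; the helper names `matrix_rank`, `gauge_point` and `matmul_tensor` are illustrative and not taken from the source. It flattens a 3-tensor along one leg and computes the matrix rank by Gaussian elimination.

```python
from fractions import Fraction
from itertools import product


def matrix_rank(rows):
    """Rank of a matrix over the rationals via Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    rank = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            if m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def gauge_point(tensor, dims, i):
    """Rank of the flattening along leg i (i in {0, 1, 2}) of a sparse 3-tensor."""
    others = [j for j in range(3) if j != i]
    cols = list(product(range(dims[others[0]]), range(dims[others[1]])))
    col_index = {c: n for n, c in enumerate(cols)}
    rows = [[0] * len(cols) for _ in range(dims[i])]
    for idx, value in tensor.items():
        rows[idx[i]][col_index[(idx[others[0]], idx[others[1]])]] = value
    return matrix_rank(rows)


def matmul_tensor(a, b, c):
    """Sparse matrix multiplication tensor <a,b,c> in F^{ab} (x) F^{bc} (x) F^{ca}."""
    t = {}
    for i, j, k in product(range(a), range(b), range(c)):
        t[(i * b + j, j * c + k, k * a + i)] = 1
    return t, (a * b, b * c, c * a)


mm, mm_dims = matmul_tensor(2, 2, 2)
mm_gauge = [gauge_point(mm, mm_dims, i) for i in range(3)]
print(mm_gauge)  # [4, 4, 4]
```

For the matrix multiplication tensor $\langle 2,2,2\rangle$ all three flattening ranks equal 4, which is the value of the gauge points used in the discussion of the exponent $\omega$.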

Recall $\tilde{Q}(t) \le \varphi(t) \le \tilde{R}(t)$ for $\varphi \in X(\mathcal{T})$. In particular, $\max_i \zeta_{(i)}(t) \le \tilde{R}(t)$.

We do not know whether $\max_{i \in [3]} \zeta_{(i)}$ equals $\tilde{R}$. To be precise, we do not know any $t$ for which $\max_i \zeta_{(i)}(t) < \tilde{R}(t)$, and we do not know a proof that $\max_i \zeta_{(i)}(t) = \tilde{R}(t)$ for all $t$. There are various families of tensors $t$ for which $\max_i \zeta_{(i)}(t) = \tilde{R}(t)$ is proven. We will see such a family in Section 5.4.2. For the matrix multiplication tensor $\langle 2,2,2\rangle$ we have $4 = \max_i \zeta_{(i)}(\langle 2,2,2\rangle) \le 2^\omega$, so $\max_i \zeta_{(i)}(t) = \tilde{R}(t)$ would imply that the matrix multiplication exponent $\omega$ equals 2.

On the other hand, $\tilde{Q}(t) \le \min_i \zeta_{(i)}(t)$. There exist $t$ for which $\tilde{Q}(t)$ is strictly smaller than $\min_{i \in [3]} \zeta_{(i)}(t)$. To show this strict inequality we need another technique of Strassen: the support functionals. The support functionals are the topic of the next section.

4.4 Support functionals $\zeta^\theta$

Strassen in [Str91] constructed an infinite family of elements in the asymptotic spectrum of oblique $k$-tensors called the support functionals. In this section we explain the construction of the support functionals. The support functionals provide the benchmark for our new quantum functionals (Chapter 6) and are relevant in the context of combinatorial problems like the cap set problem (Section 5.4.2). For clarity of exposition we focus on 3-tensors. The ideas extend directly to $k$-tensors.

Oblique tensors are tensors for which, in some basis, the support has the following special structure. Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$. Let $e_1, \ldots, e_{n_i}$ be the standard basis of $\mathbb{F}^{n_i}$. Write $t = \sum_{i,j,k} t_{ijk}\, e_i \otimes e_j \otimes e_k$. Let $[n_i] = \{1, 2, \ldots, n_i\}$. Let $\mathrm{supp}(t) = \{(i,j,k) : t_{ijk} \neq 0\} \subseteq [n_1] \times [n_2] \times [n_3]$ be the support of $t$ with respect to the standard basis. Let $[n_i]$ have the natural ordering $1 < 2 < \cdots < n_i$, and let $[n_1] \times [n_2] \times [n_3]$ have the product order, denoted by $\le$. That is, $x \le y$ if $x_i \le y_i$ holds for all $i \in [3]$. We call $\mathrm{supp}(t)$ oblique if $\mathrm{supp}(t)$ is an antichain with respect to $\le$, i.e. if any two elements in $\mathrm{supp}(t)$ are incomparable with respect to $\le$. We call a tensor $t$ oblique if $\mathrm{supp}(g \cdot t)$ is oblique for some group element $g \in G(t) = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$. The family of oblique tensors is a semiring under $\oplus$ and $\otimes$.
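The antichain condition on a support is easy to test directly. The sketch below is illustrative only (the function names `leq` and `is_antichain` are assumptions, not from the source). It also shows why obliqueness is defined up to a change of basis: the standard-basis support of the diagonal tensor $\langle 2\rangle$ is a chain, but after swapping the two basis vectors of the third leg it becomes an antichain.

```python
from itertools import combinations


def leq(x, y):
    """Product order: x <= y iff x_i <= y_i in every coordinate."""
    return all(a <= b for a, b in zip(x, y))


def is_antichain(support):
    """True if no two distinct elements of `support` are comparable."""
    return not any(leq(x, y) or leq(y, x) for x, y in combinations(set(support), 2))


supp_W = {(1, 1, 2), (1, 2, 1), (2, 1, 1)}
supp_diag = {(1, 1, 1), (2, 2, 2)}          # support of <2> in the standard basis
supp_diag_flipped = {(1, 1, 2), (2, 2, 1)}  # same tensor, basis of the third leg swapped

print(is_antichain(supp_W), is_antichain(supp_diag), is_antichain(supp_diag_flipped))
# True False True
```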

Not all tensors are oblique. Obliqueness is not a generic property (see Proposition 6.21). However, many tensors that are of interest in algebraic complexity theory are oblique, notably the matrix multiplication tensors

$$\langle a, b, c\rangle = \sum_{i \in [a]} \sum_{j \in [b]} \sum_{k \in [c]} e_{ij} \otimes e_{jk} \otimes e_{ki} \in \mathbb{F}^{ab} \otimes \mathbb{F}^{bc} \otimes \mathbb{F}^{ca}.$$

For any finite set $X$, let $\mathcal{P}(X)$ be the set of all probability distributions on $X$. For any probability distribution $P \in \mathcal{P}(X)$, the Shannon entropy of $P$ is defined as $H(P) = -\sum_{x \in X} P(x) \log_2 P(x)$, with $0 \log_2 0$ understood as 0. Given finite sets $X_1, \ldots, X_k$ and a probability distribution $P \in \mathcal{P}(X_1 \times \cdots \times X_k)$ on the product set $X_1 \times \cdots \times X_k$, we denote the marginal distribution of $P$ on $X_i$ by $P_i$, that is, $P_i(a) = \sum_{x : x_i = a} P(x)$ for any $a \in X_i$.

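Definition 4.7. Let $\theta \in \Theta = \mathcal{P}([3])$. For $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3} \setminus \{0\}$ with $\mathrm{supp}(t)$ oblique, define

$$\zeta^\theta(t) = \max\{2^{\sum_{i=1}^3 \theta(i) H(P_i)} : P \in \mathcal{P}(\mathrm{supp}(t))\}.$$

We call the $\zeta^\theta$ for $\theta \in \Theta$ the support functionals.

Theorem 4.8. $\zeta^\theta \in X(\{\text{oblique}\})$ for $\theta \in \Theta$.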
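To see the quantity inside the maximum of Definition 4.7 at work, the following sketch evaluates $2^{\sum_i \theta(i) H(P_i)}$ for one fixed distribution $P$ on a support. The helper names (`entropy`, `marginal`, `H_theta`) and the choice of the support $\{(1,1,2),(1,2,1),(2,1,1)\}$ with the uniform distribution are assumptions made for illustration. Any single $P$ only gives a lower bound on $\zeta^\theta$; the code does not perform the maximisation.

```python
from math import log2


def entropy(dist):
    """Shannon entropy (in bits) of a distribution given as value -> probability."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)


def marginal(P, i):
    """Marginal of P on coordinate i (0-based)."""
    out = {}
    for x, p in P.items():
        out[x[i]] = out.get(x[i], 0.0) + p
    return out


def H_theta(P, theta):
    """Weighted marginal entropy sum_i theta(i) H(P_i)."""
    return sum(theta[i] * entropy(marginal(P, i)) for i in range(3))


support = [(1, 1, 2), (1, 2, 1), (2, 1, 1)]
uniform = {x: 1 / 3 for x in support}
theta = (1 / 3, 1 / 3, 1 / 3)

lower_bound = 2 ** H_theta(uniform, theta)
print(lower_bound)  # approximately 1.89, i.e. 2 ** h(1/3)
```

Each marginal of the uniform distribution on this support is $(2/3, 1/3)$, so the weighted entropy equals the binary entropy $h(1/3)$ regardless of $\theta$.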

We work towards the proof of Theorem 4.8. For $p \in [0,1]$ let $h(p)$ be the binary entropy function $h(p) = -p \log_2 p - (1-p) \log_2 (1-p)$, i.e. $h(p)$ is the Shannon entropy of the probability vector $(p, 1-p)$. The following properties of the Shannon entropy are well-known.

Lemma 4.9.

(i) $H(P \otimes Q) = H(P) + H(Q)$ for $P \in \mathcal{P}(X_1)$, $Q \in \mathcal{P}(X_2)$.

(ii) $H(P) \le H(P_1) + H(P_2)$ for $P \in \mathcal{P}(X_1 \times X_2)$.

(iii) $H(pP \oplus (1-p)Q) = pH(P) + (1-p)H(Q) + h(p)$ for $P, Q \in \mathcal{P}(X)$, $p \in [0,1]$.

(iv) $2^a + 2^b = \max_{0 \le p \le 1} 2^{pa + (1-p)b + h(p)}$ for $a, b \in \mathbb{R}$.
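Property (iv) can be checked numerically. The sketch below is an illustration under the assumption, which follows from a short calculation, that the maximum is attained at $p^* = 2^a / (2^a + 2^b)$; at this point $h(p^*) = -p^* a - (1-p^*) b + \log_2(2^a + 2^b)$, so the exponent equals $\log_2(2^a + 2^b)$. The values of `a` and `b` are arbitrary test inputs.

```python
from math import log2


def h(p):
    """Binary entropy in bits, with h(0) = h(1) = 0."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)


def objective(a, b, p):
    """The quantity 2 ** (p*a + (1-p)*b + h(p)) maximised in Lemma 4.9(iv)."""
    return 2 ** (p * a + (1 - p) * b + h(p))


a, b = 1.5, 0.25
target = 2 ** a + 2 ** b
p_star = 2 ** a / target

value_at_p_star = objective(a, b, p_star)
grid_max = max(objective(a, b, k / 1000) for k in range(1001))

print(abs(value_at_p_star - target) < 1e-9)  # True
print(grid_max <= target + 1e-9)             # True
```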

For $X \subseteq [n_1] \times [n_2] \times [n_3]$ let $X_{\le} = \{y \in [n_1] \times [n_2] \times [n_3] : \exists x \in X,\ y \le x\}$ be the downward closure of $X$. Let $\max(X) = \{y \in X : \forall x \in X,\ y \le x \Rightarrow y = x\}$ be the maximal points of $X$ with respect to $\le$. Let $S_n$ be the symmetric group of permutations of $[n]$. Then the product group $S_{n_1} \times S_{n_2} \times S_{n_3}$ acts naturally on $[n_1] \times [n_2] \times [n_3]$.

Lemma 4.10. Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$. For every $g \in G(t)$ there is a triple of permutations $w \in W(t) = S_{n_1} \times S_{n_2} \times S_{n_3}$ with $w \cdot \max(\mathrm{supp}(g \cdot t)) \subseteq \mathrm{supp}(t)_{\le}$.

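Proof. We prepare for the construction of $w$. Let $n \in \mathbb{N}$. Let $e_1, \ldots, e_n$ be the standard basis of $\mathbb{F}^n$. Let $g \in \mathrm{GL}_n$. Let $f_1, \ldots, f_n$ with $f_j = g \cdot e_j$ be the transformed basis of $\mathbb{F}^n$. Let $(E_i)_{i \in [n]}$ and $(F_j)_{j \in [n]}$ be the complete flags of $\mathbb{F}^n$ with

$$E_i = \mathrm{Span}\{e_i, e_{i+1}, \ldots, e_n\}, \qquad F_j = \mathrm{Span}\{f_j, f_{j+1}, \ldots, f_n\}.$$

Define the map

$$\pi : [n] \to [n] : j \mapsto \max\{i \in [n] : E_i \cap (f_j + F_{j+1}) \neq \emptyset\}. \tag{4.3}$$

We prove $\pi$ is injective. Let $j, k \in [n]$ with $j \le k$ and suppose $i = \pi(j) = \pi(k)$. Let $\mathbb{F}^\times = \mathbb{F} \setminus \{0\}$. From (4.3) follows

$$(\mathbb{F}^\times e_i + E_{i+1}) \cap (f_j + F_{j+1}) \neq \emptyset \tag{4.4}$$

$$E_{i+1} \cap (f_j + F_{j+1}) = \emptyset \tag{4.5}$$

$$(\mathbb{F}^\times e_i + E_{i+1}) \cap (f_k + F_{k+1}) \neq \emptyset \tag{4.6}$$

Suppose $j < k$. Then from (4.4) and (4.6) we obtain a contradiction to (4.5). We conclude that $j = k$. Thus $\pi$ is injective.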

For each $\mathbb{F}^{n_i}$ define as above the standard complete flag $(E^i_j)_{j \in [n_i]}$ of $\mathbb{F}^{n_i}$, the complete flag $(F^i_j)_{j \in [n_i]}$ corresponding to the basis given by $g_i$, and the permutation $\pi_i : [n_i] \to [n_i]$. Let $w = (\pi_1, \pi_2, \pi_3) \in W(t)$.

We will prove $w \cdot \max(\mathrm{supp}(g \cdot t)) \subseteq \mathrm{supp}(t)_{\le}$. Let $y \in \max(\mathrm{supp}(g \cdot t))$. Let $x = w \cdot y$. By construction of $\pi_i$, the intersection $E^i_{x_i} \cap (f^i_{y_i} + F^i_{y_i+1})$ is not empty. Choose

$$\tilde{f}^i_{y_i} \in E^i_{x_i} \cap (f^i_{y_i} + F^i_{y_i+1}).$$

Let $t^*$ be the multilinear map $\mathbb{F}^{n_1} \times \mathbb{F}^{n_2} \times \mathbb{F}^{n_3} \to \mathbb{F}$ with $t^*(e_i, e_j, e_k) = t_{ijk}$ for all $i \in [n_1]$, $j \in [n_2]$, $k \in [n_3]$. Then

$$t^*(\tilde{f}^1_{y_1}, \tilde{f}^2_{y_2}, \tilde{f}^3_{y_3}) = t^*(f^1_{y_1}, f^2_{y_2}, f^3_{y_3}) + \sum_{\substack{z \in [n_1] \times [n_2] \times [n_3] \\ z > y}} c_z\, t^*(f^1_{z_1}, f^2_{z_2}, f^3_{z_3}) \tag{4.7}$$

for some $c_z \in \mathbb{F}$. Since $y$ is maximal in $\mathrm{supp}(g \cdot t)$, the sum over $z > y$ in (4.7) equals zero. We conclude $t^*(\tilde{f}^1_{y_1}, \tilde{f}^2_{y_2}, \tilde{f}^3_{y_3}) = t^*(f^1_{y_1}, f^2_{y_2}, f^3_{y_3}) \neq 0$. Thus $t^*(E^1_{x_1} \times E^2_{x_2} \times E^3_{x_3})$ is not zero, and thus $x \in \mathrm{supp}(t)_{\le}$.

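Proof of Theorem 4.8. We prove that $\zeta^\theta$ on oblique tensors is $\otimes$-multiplicative, $\oplus$-additive, $\leqslant$-monotone and normalised to 1 on $\langle 1\rangle = e_1 \otimes e_1 \otimes e_1$. The normalisation $\zeta^\theta(\langle 1\rangle) = 1$ is clear.

We prove $\zeta^\theta$ is $\otimes$-supermultiplicative. Let $s \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$ and let $t \in \mathbb{F}^{m_1} \otimes \mathbb{F}^{m_2} \otimes \mathbb{F}^{m_3}$. Let $P \in \mathcal{P}(\mathrm{supp}(s))$ and $Q \in \mathcal{P}(\mathrm{supp}(t))$. Then the product $P \otimes Q \in \mathcal{P}(\mathrm{supp}(s \otimes t))$ has marginals $P_i \otimes Q_i$. Since $H(P_i \otimes Q_i) = H(P_i) + H(Q_i)$ (Lemma 4.9(i)), we conclude $\zeta^\theta(s)\zeta^\theta(t) \le \zeta^\theta(s \otimes t)$.

We prove $\zeta^\theta$ is $\otimes$-submultiplicative. For a distribution $P$ on a support and $\theta \in \Theta$ we use the notation $H_\theta(P) = \sum_{i=1}^3 \theta(i) H(P_i)$. We naturally identify $\mathrm{supp}(s \otimes t)$ with a subset of $[n_1] \times [n_2] \times [n_3] \times [m_1] \times [m_2] \times [m_3]$. Let $P \in \mathcal{P}(\mathrm{supp}(s \otimes t))$. Let $P_{[3]}$ be the marginal distribution of $P$ on $[n_1] \times [n_2] \times [n_3]$ and let $P_{3+[3]}$ be the marginal distribution of $P$ on $[m_1] \times [m_2] \times [m_3]$. Then $H_\theta(P) \le H_\theta(P_{[3]}) + H_\theta(P_{3+[3]})$ by Lemma 4.9(ii). We conclude $\zeta^\theta(s \otimes t) \le \zeta^\theta(s)\zeta^\theta(t)$.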

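We prove $\zeta^\theta$ is $\oplus$-additive. By definition,

$$\zeta^\theta(s \oplus t) = \max\{2^{H_\theta(P)} : P \in \mathcal{P}(\mathrm{supp}(s \oplus t))\} = \max\Big\{\max_{0 \le p \le 1} 2^{H_\theta(pP \oplus (1-p)Q)} : P \in \mathcal{P}(\mathrm{supp}(s)),\ Q \in \mathcal{P}(\mathrm{supp}(t))\Big\}.$$

From Lemma 4.9(iii) and (iv) follows

$$\begin{aligned}
&\max\Big\{\max_{0 \le p \le 1} 2^{H_\theta(pP \oplus (1-p)Q)} : P \in \mathcal{P}(\mathrm{supp}(s)),\ Q \in \mathcal{P}(\mathrm{supp}(t))\Big\} \\
&\quad= \max\Big\{\max_{0 \le p \le 1} 2^{pH_\theta(P) + (1-p)H_\theta(Q) + h(p)} : P \in \mathcal{P}(\mathrm{supp}(s)),\ Q \in \mathcal{P}(\mathrm{supp}(t))\Big\} \\
&\quad= \max\big\{2^{H_\theta(P)} + 2^{H_\theta(Q)} : P \in \mathcal{P}(\mathrm{supp}(s)),\ Q \in \mathcal{P}(\mathrm{supp}(t))\big\} \\
&\quad= \zeta^\theta(s) + \zeta^\theta(t).
\end{aligned}$$

We conclude $\zeta^\theta(s \oplus t) = \zeta^\theta(s) + \zeta^\theta(t)$.

We prove $\zeta^\theta$ is $\leqslant$-monotone. Let $s \leqslant t$ with $\mathrm{supp}(s)$ and $\mathrm{supp}(t)$ oblique. Then there are linear maps $A_i$ with $s = (A_1 \otimes A_2 \otimes A_3) \cdot t$. If $A_1, A_2, A_3$ are of the form $\mathrm{diag}(1, \ldots, 1, 0, \ldots, 0)$, then $\zeta^\theta(s) \le \zeta^\theta(t)$. Suppose $g = (A_1, A_2, A_3) \in G(t)$. Let $P \in \mathcal{P}(\mathrm{supp}(t))$ maximise $H_\theta$ on $\mathcal{P}(\mathrm{supp}(t))$. Let $\sigma \in W$ be such that $\sigma \cdot P$ has non-increasing marginals. Then $H_\theta(\sigma \cdot P) = H_\theta(P)$ and $\sigma \cdot P$ maximises $H_\theta$ on $\mathcal{P}(\mathrm{supp}(\sigma \cdot t))$. Then $\sigma \cdot P$ maximises $H_\theta$ on $\mathcal{P}(\mathrm{supp}(\sigma \cdot t)_{\le})$ by Lemma 4.12 below. Let $Q \in \mathcal{P}(\mathrm{supp}(g \cdot t))$ maximise $H_\theta$ on $\mathcal{P}(\mathrm{supp}(g \cdot t))$. By Lemma 4.10 there is a $w \in W$ with $w \cdot \mathrm{supp}(g \cdot t) \subseteq \mathrm{supp}(\sigma \cdot t)_{\le}$. Then $H_\theta(w \cdot Q) = H_\theta(Q) \le H_\theta(\sigma \cdot P) = H_\theta(P)$. Thus $\max_{P \in \mathcal{P}(\mathrm{supp}(g \cdot t))} H_\theta(P) \le \max_{P \in \mathcal{P}(\mathrm{supp}(t))} H_\theta(P)$. We conclude $\zeta^\theta(g \cdot t) \le \zeta^\theta(t)$.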

The following two lemmas finish the above proof of Theorem 4.8. Recall that in the proof we defined $H_\theta(P) = \sum_{i=1}^3 \theta(i) H(P_i)$ for $\theta \in \Theta$.

Lemma 4.11 ([Str91, Prop. 2.1]). Let $\Phi \subseteq [n_1] \times [n_2] \times [n_3]$. Let $P \in \mathcal{P}(\Phi)$. Let $\mathrm{supp}(P)$ be the support $\{x \in \Phi : P(x) \neq 0\}$. For $x \in \Phi$ define $h_P(x) = -\sum_{i=1}^3 \theta(i) \log_2 P_i(x_i)$. Then $P$ maximises $H_\theta$ on $\mathcal{P}(\Phi)$ if and only if

$$\forall x \in \mathrm{supp}(P): \quad h_P(x) = \max_{y \in \Phi} h_P(y). \tag{4.8}$$

Proof. We write $H_\theta(P)$ in terms of $h_P$:

$$H_\theta(P) = \sum_{i=1}^3 \theta(i) H(P_i) = \sum_{x \in \mathrm{supp}(P)} P(x) h_P(x). \tag{4.9}$$

For $Q \in \mathcal{P}(\Phi)$,

$$\begin{aligned}
\lim_{\varepsilon \to 0^+} \frac{d}{d\varepsilon} H_\theta\big((1-\varepsilon)P + \varepsilon Q\big)
&= \lim_{\varepsilon \to 0^+} \frac{d}{d\varepsilon} \sum_x \big((1-\varepsilon)P(x) + \varepsilon Q(x)\big)\, h_{(1-\varepsilon)P + \varepsilon Q}(x) \\
&= \sum_x P(x) \Big(\sum_{i=1}^3 \theta(i) \frac{P_i(x_i) - Q_i(x_i)}{P_i(x_i) \ln(2)}\Big) + \sum_x \big(-P(x) + Q(x)\big) h_P(x) \\
&= \sum_x Q(x) h_P(x) - \sum_x P(x) h_P(x).
\end{aligned}$$

Therefore, since $H_\theta$ is continuous and concave, $P$ maximises $H_\theta$ if and only if

$$\forall Q \in \mathcal{P}(\Phi): \quad \sum_x Q(x) h_P(x) - \sum_x P(x) h_P(x) \le 0. \tag{4.10}$$

We will prove (4.10) is equivalent to (4.8). Suppose $\sum_x Q(x) h_P(x) \le \sum_x P(x) h_P(x)$ for every $Q \in \mathcal{P}(\Phi)$. In particular, $h_P(y) \le \sum_x P(x) h_P(x)$ for every $y \in \Phi$, so $\max_{y \in \Phi} h_P(y) \le \sum_x P(x) h_P(x)$. Then $\max_{y \in \Phi} h_P(y) = \sum_x P(x) h_P(x)$. We conclude $\max_{y \in \Phi} h_P(y) = h_P(x)$ for every $x \in \mathrm{supp}(P)$.

Suppose $\max_{y \in \Phi} h_P(y) = h_P(x)$ for every $x \in \mathrm{supp}(P)$. Then $h_P(y) \le h_P(x)$ for every $Q \in \mathcal{P}(\Phi)$, $y \in \mathrm{supp}(Q)$, $x \in \mathrm{supp}(P)$. We conclude $\sum_x Q(x) h_P(x) \le \sum_x P(x) h_P(x)$.
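The optimality criterion (4.8) gives a direct way to test whether a candidate distribution maximises $H_\theta$ on a given support. The sketch below is an illustration, not a solver: the function names are hypothetical, and the convention that $h_P(y) = +\infty$ when some marginal $P_i(y_i)$ vanishes with $\theta(i) > 0$ is an assumption consistent with $-\log_2 0 = \infty$.

```python
from math import inf, log2


def marginal(P, i):
    """Marginal of P on coordinate i (0-based)."""
    out = {}
    for x, p in P.items():
        out[x[i]] = out.get(x[i], 0.0) + p
    return out


def h_P(P, theta, x):
    """h_P(x) = -sum_i theta(i) log2 P_i(x_i), with value +inf if a needed marginal is 0."""
    total = 0.0
    for i in range(3):
        if theta[i] == 0:
            continue
        q = marginal(P, i).get(x[i], 0.0)
        if q == 0:
            return inf
        total -= theta[i] * log2(q)
    return total


def satisfies_criterion(P, Phi, theta, tol=1e-9):
    """Check condition (4.8): h_P is maximal over Phi at every point of supp(P)."""
    values = {y: h_P(P, theta, y) for y in Phi}
    best = max(values.values())
    return all(abs(values[x] - best) <= tol for x in Phi if P.get(x, 0) > 0)


Phi = [(1, 1, 2), (1, 2, 1), (2, 1, 1)]
uniform = {x: 1 / 3 for x in Phi}
point_mass = {(1, 1, 2): 1.0}

print(satisfies_criterion(uniform, Phi, (1 / 3, 1 / 3, 1 / 3)))     # True
print(satisfies_criterion(point_mass, Phi, (1 / 3, 1 / 3, 1 / 3)))  # False
print(satisfies_criterion(uniform, Phi, (1.0, 0.0, 0.0)))           # False
```

For $\theta = (1, 0, 0)$ the uniform distribution fails the criterion, which matches the fact that a distribution with uniform first marginal achieves the larger value $H(P_1) = 1$ on this support.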

Lemma 4.12 ([Str91, Cor. 2.2]). Let $\Phi \subseteq [n_1] \times [n_2] \times [n_3]$. Let $P$ maximise $H_\theta$ on $\mathcal{P}(\Phi)$. Suppose $P_i$ is nonincreasing on $[n_i]$ for each $i \in [3]$. Then $P$ maximises $H_\theta$ on $\mathcal{P}(\Phi_{\le})$, where $\Phi_{\le}$ is the downward closure of $\Phi$ with respect to $\le$.

Proof. We know $P$ satisfies (4.8). We will prove $P$ satisfies (4.8) with $\Phi$ replaced by $\Phi_{\le}$. Then we are done by Lemma 4.11. Let $x \in \Phi_{\le}$. Then $x \le y$ for some $y \in \Phi$. Then $(P_1(x_1), P_2(x_2), P_3(x_3)) \ge (P_1(y_1), P_2(y_2), P_3(y_3))$, since each $P_i$ is nonincreasing. Then $h_P(x) \le h_P(y)$. We conclude $\max_{\Phi_{\le}} h_P \le \max_{\Phi} h_P$. On the other hand, $\Phi \subseteq \Phi_{\le}$. Therefore $\max_{\Phi} h_P \le \max_{\Phi_{\le}} h_P$.

Using the support functionals, Strassen managed to fully compute the asymptotic spectrum of several semirings generated by oblique tensors. We will see an example in Section 5.4.2.

4.5 Upper and lower support functionals $\zeta^\theta$, $\zeta_\theta$

In Section 4.4 we defined the support functionals $\zeta^\theta : \{\text{oblique}\} \to \mathbb{R}_{\ge 0}$ and proved that $\zeta^\theta \in X(\{\text{oblique}\})$. From the general theory of asymptotic spectra (Chapter 2) we know $\zeta^\theta$ is the restriction of some map $\varphi : \{\text{tensors}\} \to \mathbb{R}_{\ge 0}$ in $X(\mathcal{T})$. However, the proof of that fact was non-constructive. In other words, we know that $\zeta^\theta$ can be extended to an element of $X(\mathcal{T})$. In this short section we discuss a candidate extension proposed by Strassen, called the upper support functional. We also discuss a companion called the lower support functional.

For arbitrary $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$ the upper support functional and the lower support functional are defined as

$$\zeta^\theta(t) = \min_{g \in G(t)} \max\{2^{H_\theta(P)} : P \in \mathcal{P}(\mathrm{supp}(g \cdot t))\}$$

$$\zeta_\theta(t) = \max_{g \in G(t)} \max\{2^{H_\theta(P)} : P \in \mathcal{P}(\max(\mathrm{supp}(g \cdot t)))\}$$

with $G(t) = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$ and $H_\theta(P) = \sum_{i=1}^3 \theta(i) H(P_i)$. We summarise the known properties of the upper and lower support functional.

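Theorem 4.13 ([Str91]). Let $s \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$ and $t \in \mathbb{F}^{m_1} \otimes \mathbb{F}^{m_2} \otimes \mathbb{F}^{m_3}$. Let $\theta \in \Theta = \mathcal{P}([3])$.

(i) $\zeta^\theta(\langle n\rangle) = n$ for $n \in \mathbb{N}$.

(ii) $\zeta^\theta(s \oplus t) = \zeta^\theta(s) + \zeta^\theta(t)$.

(iii) $\zeta^\theta(s \otimes t) \le \zeta^\theta(s)\zeta^\theta(t)$.

(iv) If $s \geqslant t$, then $\zeta^\theta(s) \ge \zeta^\theta(t)$.

Theorem 4.14 ([Str91]). Let $s \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$ and $t \in \mathbb{F}^{m_1} \otimes \mathbb{F}^{m_2} \otimes \mathbb{F}^{m_3}$. Let $\theta \in \Theta$.

(i) $\zeta_\theta(\langle n\rangle) = n$ for $n \in \mathbb{N}$.

(ii) $\zeta_\theta(s \oplus t) \ge \zeta_\theta(s) + \zeta_\theta(t)$.

(iii) $\zeta_\theta(s \otimes t) \ge \zeta_\theta(s)\zeta_\theta(t)$.

(iv) If $s \geqslant t$, then $\zeta_\theta(s) \ge \zeta_\theta(t)$.

Theorem 4.15 ([Str91]). $\zeta^\theta(s \otimes t) \ge \zeta^\theta(s)\zeta_\theta(t)$ and $\zeta^\theta(t) \ge \zeta_\theta(t)$ for $\theta \in \Theta$.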

Regarding statement (ii) in Theorem 4.14, Bürgisser [Bur90] shows that the lower support functional $\zeta_\theta$ is not in general additive under the direct sum when $\theta_i > 0$ for all $i$. See also [Str91, Comment (iii)]. In particular, this implies that the upper support functional $\zeta^\theta(t)$ and the lower support functional $\zeta_\theta(t)$ are not equal in general, the upper support functional being additive. In fact, to show that the lower support functional is not additive, Bürgisser first shows that when $\mathbb{F}$ is algebraically closed, the generic value of $\zeta_\theta$ on $\mathbb{F}^n \otimes \mathbb{F}^n \otimes \mathbb{F}^n$ equals $(1 - \min_i \theta_i) \log_2 n + o(n)$. On the other hand, Tobler [Tob91] shows that the generic value of $\zeta^\theta$ on $\mathbb{F}^n \otimes \mathbb{F}^n \otimes \mathbb{F}^n$ equals $\log_2 n$. So even generically, $\zeta^\theta$ and $\zeta_\theta$ are different on $\mathbb{F}^n \otimes \mathbb{F}^n \otimes \mathbb{F}^n$.

For $\theta \in \Theta$ we say $t$ is $\theta$-robust if $\zeta^\theta(t) = \zeta_\theta(t)$. We say $t$ is robust if $t$ is $\theta$-robust for all $\theta \in \Theta$. Let us try to understand what robust tensors look like. Since $\zeta^\theta(t) \ge \zeta_\theta(t)$ always holds (Theorem 4.15), a tensor $t$ is $\theta$-robust if and only if

$$\zeta^\theta(t) \le \zeta_\theta(t). \tag{4.11}$$

The set of $\theta$-robust tensors is closed under $\oplus$ and $\otimes$, since

$$\zeta^\theta(s \oplus t) = \zeta^\theta(s) + \zeta^\theta(t) = \zeta_\theta(s) + \zeta_\theta(t) \le \zeta_\theta(s \oplus t)$$

and

$$\zeta^\theta(s \otimes t) \le \zeta^\theta(s)\zeta^\theta(t) = \zeta_\theta(s)\zeta_\theta(t) \le \zeta_\theta(s \otimes t).$$

For $X \subseteq [n_1] \times [n_2] \times [n_3]$ we use the notation $H_\theta(X) = \max_{P \in \mathcal{P}(X)} H_\theta(P)$. Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3} \setminus \{0\}$. Equation (4.11) means that there are $g, h \in G(t)$ and $P \in \mathcal{P}(\max \mathrm{supp}(h \cdot t))$ such that $H_\theta(\mathrm{supp}(g \cdot t)) \le H_\theta(P)$. In this case we have $\zeta^\theta(t) = \zeta_\theta(t) = 2^{H_\theta(P)}$. In particular, $t$ is $\theta$-robust if there is a $g \in G(t)$ such that the maximisation $H_\theta(\mathrm{supp}(g \cdot t))$ is attained by a $P \in \mathcal{P}(\max(\mathrm{supp}(g \cdot t)))$. This criterion is automatically satisfied for all $\theta$ when $\mathrm{supp}(g \cdot t) = \max(\mathrm{supp}(g \cdot t))$ for some $g \in G(t)$. Suppose $t$ is oblique. Then $\mathrm{supp}(g \cdot t)$ is an antichain for some $g \in G(t)$, and thus $\mathrm{supp}(g \cdot t) = \max \mathrm{supp}(g \cdot t)$. Then $t$ is robust and $\zeta^\theta(t) = \zeta_\theta(t) = 2^{H_\theta(\mathrm{supp}(g \cdot t))}$.

4.6 Asymptotic slice rank

Slice rank is a variation on tensor rank that was introduced by Terence Tao in [Tao16] to study cap sets. We will look at cap sets in Section 5.4. Here we study the relationship between asymptotic slice rank and the support functionals.

Consider the following characterisation of tensor rank. Let a simple tensor be any tensor of the form $v_1 \otimes v_2 \otimes v_3 \in V_1 \otimes V_2 \otimes V_3$ with $v_i \in V_i$ for $i \in [3]$. Then the rank $R(t)$ of $t \in V_1 \otimes V_2 \otimes V_3$ is the smallest number $r$ such that $t$ can be written as a sum of $r$ simple tensors.

Slice rank is defined similarly, but with simple tensors replaced by slices. For $S \subseteq [3]$ let $V_S = \bigotimes_{i \in S} V_i$. For $j \in [3]$ let $\bar{j} = [3] \setminus \{j\}$. A tensor in $V_1 \otimes V_2 \otimes V_3$ is called a slice if it is of the form $v \otimes w$ with $v \in V_j$ and $w \in V_{\bar{j}}$ for some $j \in [3]$ (under the natural reordering of the tensor legs). Let $t \in V_1 \otimes V_2 \otimes V_3$. The slice rank of $t$, denoted by $\mathrm{SR}(t)$, is the smallest number $r$ such that $t$ can be written as a sum of $r$ slices. For example, the tensor

$$W = e_1 \otimes e_1 \otimes e_2 + e_1 \otimes e_2 \otimes e_1 + e_2 \otimes e_1 \otimes e_1 \in \mathbb{F}^2 \otimes \mathbb{F}^2 \otimes \mathbb{F}^2 \tag{4.12}$$

has slice rank 2, since we can write $W = e_1 \otimes (e_1 \otimes e_2 + e_2 \otimes e_1) + e_2 \otimes e_1 \otimes e_1$. In fact, the slice rank of any element in $V_1 \otimes V_2 \otimes V_3$ is at most $\min_i \dim V_i$. The tensor rank of $W$, on the other hand, is known to be 3.
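The two-slice decomposition of $W$ can be verified by expanding it. The sketch below is illustrative: it stores tensors as sparse dictionaries with 1-based indices, and the helpers `slice_leg1` and `add` are hypothetical names. It only confirms the upper bound $\mathrm{SR}(W) \le 2$ given by the displayed decomposition; it does not prove the matching lower bound.

```python
from collections import defaultdict


def slice_leg1(v, w):
    """The slice v (x) w with v in V_1 (index -> coeff) and w in V_2 (x) V_3 (pair -> coeff)."""
    return {(i, j, k): a * b for i, a in v.items() for (j, k), b in w.items()}


def add(*tensors):
    """Sum of sparse tensors, dropping zero coefficients."""
    total = defaultdict(int)
    for t in tensors:
        for idx, c in t.items():
            total[idx] += c
    return {idx: c for idx, c in total.items() if c != 0}


W = {(1, 1, 2): 1, (1, 2, 1): 1, (2, 1, 1): 1}

first = slice_leg1({1: 1}, {(1, 2): 1, (2, 1): 1})  # e1 (x) (e1 (x) e2 + e2 (x) e1)
second = slice_leg1({2: 1}, {(1, 1): 1})            # e2 (x) (e1 (x) e1)

print(add(first, second) == W)  # True
```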

Slice rank is clearly monotone under restriction. The slice rank of the diagonal tensor $\langle r\rangle$ equals $r$ [Tao16]. It follows that subrank is at most slice rank:

$$Q(t) \le \mathrm{SR}(t).$$

The motivation for the introduction of slice rank in [Tao16] was finding upper bounds on subrank $Q(t)$ and asymptotic subrank $\tilde{Q}(t)$.

The main result of this section is the following theorem. Recall that a tensor $t$ is oblique if the support $\mathrm{supp}(g \cdot t)$ is an antichain for some $g \in G(t)$.

Theorem 4.16. Let $t$ be oblique. Then

$$\lim_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n} = \min_{\theta \in \mathcal{P}([3])} \zeta^\theta(t).$$

Our proof of Theorem 4.16 is based on a proof of Tao and Sawin in [TS16] and on discussions of the author with Dion Gijswijt. The explicit connection between asymptotic slice rank and the support functionals is new.

We use Theorem 4.16, before giving its proof, to see that $\mathrm{SR}$ is not submultiplicative and not supermultiplicative under the tensor product $\otimes$. In particular, we cannot use Fekete's lemma (Lemma 2.2) to prove that the limit $\lim_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n}$ exists. Thus the existence of the limit is a non-trivial consequence of Theorem 4.16.

Let $W$ be as in (4.12). Then $\mathrm{SR}(W) = 2$. We have $\zeta^{(1/3,1/3,1/3)}(W) = 2^{h(1/3)} < 2$. From Theorem 4.16 follows $\mathrm{SR}(W^{\otimes n}) \le 2^{n h(1/3) + o(n)}$. We conclude $\mathrm{SR}(W^{\otimes n}) < 2^n$ for $n$ large enough. We conclude $\mathrm{SR}$ is not supermultiplicative. Now it is also clear that slice rank is not the same as (border) subrank, since (border) subrank is supermultiplicative.

Next, the tensors $\sum_{i=1}^n e_i \otimes e_i \otimes 1$, $\sum_{i=1}^n e_i \otimes 1 \otimes e_i$, $\sum_{i=1}^n 1 \otimes e_i \otimes e_i$ have slice rank one, while their tensor product equals the matrix multiplication tensor $\langle n, n, n\rangle$, which has slice rank $n^2$ by Theorem 4.16 and Theorem 5.3 in the next chapter applied to the tight tensor $\langle n, n, n\rangle$. We conclude $\mathrm{SR}$ is not submultiplicative.

Slice rank and hitting set number

We study the hitting set number of the support of a tensor. Let $\Phi \subseteq [n_1] \times [n_2] \times [n_3]$. A hitting set for $\Phi$ is a 3-tuple of sets $A_1 \subseteq [n_1]$, $A_2 \subseteq [n_2]$, $A_3 \subseteq [n_3]$ such that for every $a \in \Phi$ there is an $i \in [3]$ with $a_i \in A_i$. We may think of $\Phi$ as a 3-partite 3-uniform hypergraph. Then the definition of hitting set says every edge $a \in \Phi$ is hit by an element of some $A_i$. A hitting set is also called a vertex cover (every edge being covered by some vertex) or a transversal. The size of the hitting set $(A_1, A_2, A_3)$ is $|A_1| + |A_2| + |A_3|$. The hitting set number $\tau(\Phi)$ is the size of the smallest hitting set for $\Phi$. Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$.

Lemma 4.17. Let $g \in G(t) = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$. Then $\mathrm{SR}(t) \le \tau(\mathrm{supp}(g \cdot t))$.

Proof. This is clear.
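For small supports the hitting set number can be found by exhaustive search. The following sketch is a brute-force illustration (exponential in the number of vertices, and the name `hitting_set_number` is an assumption); it treats each pair (leg, index) as a vertex and looks for the smallest set of vertices that hits every element of $\Phi$.

```python
from itertools import combinations


def hitting_set_number(Phi, dims):
    """Smallest |A1| + |A2| + |A3| such that every a in Phi has a_i in A_i for some i."""
    vertices = [(i, v) for i in range(3) for v in range(1, dims[i] + 1)]
    for size in range(len(vertices) + 1):
        for chosen in combinations(vertices, size):
            chosen = set(chosen)
            if all(any((i, a[i]) in chosen for i in range(3)) for a in Phi):
                return size
    return len(vertices)


supp_W = [(1, 1, 2), (1, 2, 1), (2, 1, 1)]
supp_diag3 = [(1, 1, 1), (2, 2, 2), (3, 3, 3)]

print(hitting_set_number(supp_W, (2, 2, 2)))      # 2
print(hitting_set_number(supp_diag3, (3, 3, 3)))  # 3
```

For the support of $W$ the value 2 agrees with the bound $\mathrm{SR}(W) \le 2$ from Lemma 4.17 in the standard basis.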

Lemma 4.18. Let $g \in G(t)$. Then $\mathrm{SR}(t) \ge \tau(\max(\mathrm{supp}(g \cdot t)))$.

Proof. It is sufficient to consider $g = e$. Let

$$t = \sum_{i=1}^{r_1} v^1_i \otimes u^1_i + \sum_{i=1}^{r_2} v^2_i \otimes u^2_i + \sum_{i=1}^{r_3} v^3_i \otimes u^3_i$$

be a slice decomposition. We may assume $v^j_1, \ldots, v^j_{r_j}$ are linearly independent.

Let $V_j = \mathrm{Span}\{v^j_1, \ldots, v^j_{r_j}\} \subseteq \mathbb{F}^{n_j}$. Let $W_j \subseteq (\mathbb{F}^{n_j})^*$ be the elements in the dual space that vanish on $V_j$. Let $B_j \subseteq W_j$ be a basis with the following property: with respect to the standard basis, the matrix with the elements of $B_j$ as columns is in reduced row echelon form, i.e. each column is of the form $(* \cdots * \ 1 \ 0 \cdots 0)^T$ and the pivot elements (the 1's) are all in different rows. Let $S_j \subseteq [n_j]$ be the indices of the pivot elements. Let $\bar{S}_j = [n_j] \setminus S_j$ be the complement. Then $|\bar{S}_j| = r_j$. We claim $(\bar{S}_1, \bar{S}_2, \bar{S}_3)$ is a hitting set for $\max(\mathrm{supp}(t))$. Then $r_1 + r_2 + r_3 = |\bar{S}_1| + |\bar{S}_2| + |\bar{S}_3| \ge \tau(\max(\mathrm{supp}(t)))$. Let $x \in \max(\mathrm{supp}(t))$. Suppose $x \in S_1 \times S_2 \times S_3$. For every $j \in [3]$ let $\varphi_j \in B_j$ have its pivot element at index $x_j$. Let $\varphi = \varphi_1 \otimes \varphi_2 \otimes \varphi_3$. Then $\varphi \in W_1 \otimes W_2 \otimes W_3$, so $\varphi(t) = 0$. Since $x$ is maximal and each $B_j$ is in reduced row echelon form,

$$\begin{aligned}
\varphi(t) &= \sum_{y \le x} t_y\, \varphi(e_{y_1} \otimes e_{y_2} \otimes e_{y_3}) \\
&= \sum_{y < x} t_y\, \varphi(e_{y_1} \otimes e_{y_2} \otimes e_{y_3}) + t_x\, e_{x_1} \otimes e_{x_2} \otimes e_{x_3} \\
&= \sum_{y < x} s_y\, e_{y_1} \otimes e_{y_2} \otimes e_{y_3} + t_x\, e_{x_1} \otimes e_{x_2} \otimes e_{x_3}
\end{aligned}$$

for some $s_y \in \mathbb{F}$. From $\varphi(t) = 0$ follows $t_x = 0$. This contradicts $x \in \mathrm{supp}(t)$, so $x \notin S_1 \times S_2 \times S_3$, i.e. there is a $j \in [3]$ with $x_j \in \bar{S}_j$.

Asymptotic hitting set number

We now study the asymptotic hitting set number $\tilde{\tau}(\Phi) = \lim_{n \to \infty} \tau(\Phi^{\times n})^{1/n}$.

We will use some basic facts of types and type classes. Let $X$ be a finite set. Let $N \in \mathbb{N}$. An $N$-type on $X$ is a probability distribution $P$ on $X$ with $N \cdot P(x) \in \mathbb{N}$ for all $x \in X$. Let $P$ be an $N$-type on $X$. The type class $T^N_P \subseteq X^N$ is the set of sequences $s = (s_1, \ldots, s_N)$ with $x$ occurring $N \cdot P(x)$ times in $s$ for every $x \in X$, i.e. $|\{i \in [N] : s_i = x\}| = N \cdot P(x)$.

Lemma 4.19. The number of $N$-types on $X$ equals $\binom{N + |X| - 1}{|X| - 1}$. Let $P$ be an $N$-type. The size of the type class $T^N_P$ equals the multinomial coefficient $\binom{N}{NP}$.

Proof. We leave the proof to the reader.

Lemma 4.20. Let $P$ be an $N$-type on $X$. Then

$$\frac{1}{(N+1)^{|X|}}\, 2^{N H(P)} \le \binom{N}{NP} \le 2^{N H(P)}.$$

Proof. See e.g. [CT12, Theorem 11.1.3].
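The counting statements of Lemma 4.19 and the entropy bounds of Lemma 4.20 can be checked on a small instance. The sketch below assumes $N = 12$ and the type $P = (1/2, 1/3, 1/6)$ on a three-element set, given through its counts $(6, 4, 2)$; these values and the helper names are chosen for illustration only.

```python
from math import comb, factorial, log2


def type_class_size(counts):
    """Multinomial coefficient N! / (c_1! c_2! ... c_m!) with N = sum(counts)."""
    size = factorial(sum(counts))
    for c in counts:
        size //= factorial(c)
    return size


def entropy_bits(counts):
    """Shannon entropy (in bits) of the type with the given counts."""
    n = sum(counts)
    return -sum(c / n * log2(c / n) for c in counts if c > 0)


counts = (6, 4, 2)  # N = 12 and P = (1/2, 1/3, 1/6)
N = sum(counts)

size = type_class_size(counts)
upper = 2 ** (N * entropy_bits(counts))
lower = upper / (N + 1) ** len(counts)
number_of_types = comb(N + len(counts) - 1, len(counts) - 1)

print(size)                    # 13860
print(lower <= size <= upper)  # True
print(number_of_types)         # 91
```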

Lemma 4.21. $\log_2 \tilde{\tau}(\Phi) \le \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} H(P_i)$.

Proof. Let $P$ maximise $\min_i H(P_i)$ over $\mathcal{P}(\Phi)$. Let $n \in \mathbb{N}$. We construct a hitting set $(A_1, A_2, A_3)$ for $\Phi^n$ as follows. Let $x \in \Phi^n$. Viewing $x$ as an $n$-tuple of elements in $\Phi$, let $Q \in \mathcal{P}_n(\Phi)$ be the type of $x$ (i.e. the empirical distribution). Let $j \in [3]$ with $H(Q_j) = \min_{i \in [3]} H(Q_i)$. By our choice of $P$ we have

$$H(Q_j) = \min_{i \in [3]} H(Q_i) \le \min_{i \in [3]} H(P_i).$$

Viewing $x$ as a 3-tuple $(x_1, x_2, x_3)$, add $x_j$ to $A_j$. We repeat this for all $x \in \Phi^n$. The final $(A_1, A_2, A_3)$ is a hitting set for $\Phi^n$ by construction. For each $j \in [3]$,

$$|A_j| \le \sum_{Q_j} |T^n_{Q_j}| \le \sum_{Q_j} 2^{n H(Q_j)}$$

where the sum is over $Q_j \in \mathcal{P}_n(\Phi_j)$ with $H(Q_j) \le \min_{i \in [3]} H(P_i)$. Then

$$|A_j| \le |\mathcal{P}_n(\Phi_j)|\, 2^{n \min_i H(P_i)} = \mathrm{poly}(n)\, 2^{n \min_i H(P_i)}.$$

We conclude $|A_1| + |A_2| + |A_3| \le \mathrm{poly}(n)\, 2^{n \min_i H(P_i)}$.
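The construction in this proof is algorithmic and can be run directly for small $n$. The sketch below is an illustration under stated assumptions: it enumerates all of $\Phi^n$ (so it is only feasible for tiny inputs), it uses the fact that the entropy of the marginal $Q_j$ of the type of $x$ equals the empirical entropy of the sequence $x_j$, and the function names are hypothetical.

```python
from collections import Counter
from itertools import product
from math import log2


def empirical_entropy(seq):
    """Entropy (in bits) of the empirical distribution of a sequence."""
    n = len(seq)
    return -sum(c / n * log2(c / n) for c in Counter(seq).values())


def legs(x):
    """View x in Phi^n (an n-tuple of triples) as a 3-tuple (x_1, x_2, x_3) of n-tuples."""
    return [tuple(a[i] for a in x) for i in range(3)]


def type_based_hitting_set(Phi, n):
    """For each x in Phi^n, add x_j to A_j where j minimises the marginal entropy."""
    A = [set(), set(), set()]
    for x in product(Phi, repeat=n):
        xs = legs(x)
        j = min(range(3), key=lambda i: empirical_entropy(xs[i]))
        A[j].add(xs[j])
    return A


def is_hitting_set(A, Phi, n):
    return all(any(legs(x)[i] in A[i] for i in range(3)) for x in product(Phi, repeat=n))


Phi = [(1, 1, 2), (1, 2, 1), (2, 1, 1)]
A = type_based_hitting_set(Phi, 3)
print(is_hitting_set(A, Phi, 3))  # True
```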

Lemma 4.22. $\log_2 \tilde{\tau}(\Phi) \ge \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} H(P_i)$.

Proof. Let $P$ maximise $\min_i H(P_i)$ over $\mathcal{P}(\Phi)$. Let $n \in \mathbb{N}$. Let $(A_1, A_2, A_3)$ be a hitting set for $\Phi^n$. Let $Q \in \mathcal{P}_n(\Phi)$ be an $n$-type with $\min_i H(Q_i) = \min_i H(P_i) - o(1)$. Let $\Psi = T^n_Q \subseteq \Phi^n$ be the set of strings with type $Q$. Then $(A_1, A_2, A_3)$ is a hitting set for $\Psi$. Let $\pi_i : \Psi \to \Phi_i^n : (x_1, x_2, x_3) \mapsto x_i$. Then

$$\Psi = \pi_1^{-1}(A_1) \cup \pi_2^{-1}(A_2) \cup \pi_3^{-1}(A_3).$$

Let $j \in [3]$ with $|\pi_j^{-1}(A_j)| \ge \tfrac{1}{3}|\Psi|$. The fiber $\pi_j^{-1}(a)$ has constant size over $a \in \Psi_j = \pi_j(\Psi)$. Let $c_j = |\pi_j^{-1}(a)|$ be this size. Then

$$|\Psi| = \sum_{a \in \Psi_j} |\pi_j^{-1}(a)| = \sum_{a \in \Psi_j} c_j = |\Psi_j|\, c_j.$$

And

$$|\pi_j^{-1}(A_j)| = \sum_{a \in A_j \cap \Psi_j} |\pi_j^{-1}(a)| = |A_j \cap \Psi_j|\, c_j \le |A_j|\, c_j.$$

Therefore

$$|A_j| \ge \frac{|\pi_j^{-1}(A_j)|}{c_j} \ge \frac{\tfrac{1}{3}|\Psi|}{c_j} = \tfrac{1}{3}|\Psi_j|.$$

We have $|\Psi_j| \ge 2^{n H(Q_j) - o(n)} \ge 2^{n \min_i H(Q_i) - o(n)} \ge 2^{n \min_i H(P_i) - o(n)}$. We conclude $|A_1| + |A_2| + |A_3| \ge |A_j| \ge \tfrac{1}{3}|\Psi_j| \ge \tfrac{1}{3}\, 2^{n \min_i H(P_i) - o(n)}$.

Lemma 4.23. $\log_2 \tilde{\tau}(\Phi) = \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} H(P_i)$.

Proof. This follows directly from the above lemmas.

Asymptotic slice rank

We now combine the above lemmas about slice rank and the asymptotic hitting set number to prove Theorem 4.16. First we have the following basic lemma.

Lemma 4.24. $\min_{\theta \in \Theta} \max_{P \in \mathcal{P}(\Phi)} H_\theta(P) = \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} H(P_i)$.

Proof. Since $H_\theta(P)$ is convex in $\theta$ and concave in $P$, von Neumann's minimax theorem gives $\min_\theta \max_{P \in \mathcal{P}(\Phi)} H_\theta(P) = \max_{P \in \mathcal{P}(\Phi)} \min_\theta H_\theta(P)$. Finally, we use that $\min_\theta H_\theta(P) = \min_i H(P_i)$.
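The last step of this proof uses that $H_\theta(P)$ is linear in $\theta$, so its minimum over the simplex $\Theta$ is attained at a vertex. The following sketch checks this numerically for one distribution; the distribution `P` and the grid resolution are arbitrary choices made for illustration, and the grid search only samples $\Theta$.

```python
from math import log2


def entropy(dist):
    return -sum(p * log2(p) for p in dist.values() if p > 0)


def marginal(P, i):
    out = {}
    for x, p in P.items():
        out[x[i]] = out.get(x[i], 0.0) + p
    return out


P = {(1, 1, 2): 0.5, (1, 2, 1): 0.3, (2, 1, 1): 0.2}
H = [entropy(marginal(P, i)) for i in range(3)]

steps = 20
grid = [(a / steps, b / steps, (steps - a - b) / steps)
        for a in range(steps + 1) for b in range(steps + 1 - a)]
grid_min = min(sum(t * h for t, h in zip(theta, H)) for theta in grid)

print(abs(grid_min - min(H)) < 1e-12)  # True
```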

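Define $f^{\sim}(t) = \limsup_{n \to \infty} f(t^{\otimes n})^{1/n}$ and $f_{\sim}(t) = \liminf_{n \to \infty} f(t^{\otimes n})^{1/n}$.

Lemma 4.25. Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$. Then

$$\max_{g \in G(t)} \max_{P \in \mathcal{P}(\max \mathrm{supp}(g \cdot t))} \min_i 2^{H(P_i)} \le \mathrm{SR}_{\sim}(t) \le \mathrm{SR}^{\sim}(t) \le \min_\theta \zeta^\theta(t).$$

Proof. By definition, $\mathrm{SR}_{\sim}(t) \le \mathrm{SR}^{\sim}(t)$. From Lemma 4.17 follows

$$\mathrm{SR}^{\sim}(t) \le \tilde{\tau}(\mathrm{supp}(g \cdot t))$$

for any $g \in G(t)$. Lemma 4.23 gives $\tilde{\tau}(\mathrm{supp}(g \cdot t)) = \max_{P \in \mathcal{P}(\mathrm{supp}(g \cdot t))} \min_i 2^{H(P_i)}$. Thus, with the help of Lemma 4.24,

$$\mathrm{SR}^{\sim}(t) \le \min_{g \in G(t)} \max_{P \in \mathcal{P}(\mathrm{supp}(g \cdot t))} \min_i 2^{H(P_i)} = \min_\theta \zeta^\theta(t).$$

From Lemma 4.18 follows

$$\tilde{\tau}(\max(\mathrm{supp}(g \cdot t))) \le \mathrm{SR}_{\sim}(t)$$

for any $g \in G(t)$. Lemma 4.23 gives

$$\max_{g \in G(t)} \max_{P \in \mathcal{P}(\max(\mathrm{supp}(g \cdot t)))} \min_i 2^{H(P_i)} \le \mathrm{SR}_{\sim}(t).$$

This proves the lemma.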

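Proof of Theorem 4.16. We may assume $\Phi = \mathrm{supp}(t)$ is oblique. Then, with the help of Lemma 4.24 and Lemma 4.25,

$$\begin{aligned}
\min_{\theta \in \Theta} \zeta^\theta(t) &= \min_{\theta \in \Theta} \zeta_\theta(t) \\
&= \min_{\theta \in \Theta} \max_{P \in \mathcal{P}(\max(\Phi))} 2^{H_\theta(P)} \\
&= \max_{P \in \mathcal{P}(\max(\Phi))} \min_{i \in [3]} 2^{H(P_i)} \\
&\le \max_{g \in G(t)} \max_{P \in \mathcal{P}(\max(\mathrm{supp}(g \cdot t)))} \min_{i \in [3]} 2^{H(P_i)} \\
&\le \mathrm{SR}_{\sim}(t) \\
&\le \mathrm{SR}^{\sim}(t) \\
&\le \min_{\theta \in \Theta} \zeta^\theta(t).
\end{aligned}$$

This proves the claim.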

4.7 Conclusion

The study of asymptotic rank of tensors is motivated by the open problem of finding the exponent of matrix multiplication. Asymptotic subrank has applications in, for example, combinatorics and algebraic property testing. Via the theory of asymptotic spectra, Strassen characterised asymptotic rank and asymptotic subrank in terms of the asymptotic spectrum of tensors. Strassen introduced the gauge points in $X(\mathcal{T})$ and the support functionals in $X(\{\text{oblique}\})$. More precisely, there are the lower support functionals and the upper support functionals. The lower support functionals are not additive and thus cannot be universal spectral points. The upper support functionals may be universal spectral points, but this cannot be shown with the help of the lower support functionals. Finally, we showed that for oblique tensors the asymptotic slice rank exists and equals the minimum value over the support functionals. In the next chapter we will see a subfamily of the oblique 3-tensors for which the support functionals are powerful enough to compute the asymptotic subrank.

Chapter 5

Tight tensors and combinatorial subrank; cap sets

This chapter is based on joint work with Matthias Christandl and Peter Vrana [CVZ16, CVZ18].

5.1 Introduction

In the previous chapter we discussed the gauge points and the support functionals $\zeta^\theta$. The gauge points are in the asymptotic spectrum of all tensors, while the support functionals are in the asymptotic spectrum of oblique tensors.

How “powerful” are the support functionals? We know $\tilde{Q}(t) \le \zeta^\theta(t) \le \tilde{R}(t)$ for oblique $t$. Thus $\max_\theta \zeta^\theta(t) \le \tilde{R}(t)$. In fact, $\max_\theta \zeta^\theta(t)$ is at most the maximum over the gauge points $\max_S \zeta_{(S)}$, and in turn $\max_S \zeta_{(S)}$ is at most $\tilde{R}(t)$. As remarked earlier, it is not known whether $\max_S \zeta_{(S)}$ equals $\tilde{R}(t)$ in general.

On the other hand, we have $\tilde{Q}(t) \le \min_\theta \zeta^\theta(t)$. Do we attain equality here in general: $\tilde{Q}(t) = \min_\theta \zeta^\theta(t)$? The answer is “yes” for the subsemiring of tight 3-tensors. In this chapter we study tight $k$-tensors.

Tight tensors

Let $I_1, \ldots, I_k$ be finite sets. Let $\Phi \subseteq I_1 \times \cdots \times I_k$. We say $\Phi$ is tight if there are injective maps $u_i : I_i \to \mathbb{Z}$ for $i \in [k]$ such that

\[ \forall \alpha \in \Phi : \quad u_1(\alpha_1) + \cdots + u_k(\alpha_k) = 0. \]

We say $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ is tight if there is a $g \in G(t) = \mathrm{GL}_{n_1} \times \cdots \times \mathrm{GL}_{n_k}$ such that the support $\operatorname{supp}(g \cdot t)$ is tight.

Recall that a tensor is oblique if the support is an antichain in some basis. Clearly, tight tensors are oblique. To summarise the families of tensors that we have defined up to now, we have

\[ \{\text{tight}\} \subseteq \{\text{oblique}\} \subseteq \{\text{robust}\} \subseteq \{\theta\text{-robust}\}. \]

Recall that the families of oblique, robust and $\theta$-robust tensors each form a semiring under $\otimes$ and $\oplus$. Tight tensors have the same property [Str91, Section 5]. Another property is that any subset of a tight set is tight.

Example 5.1. Let $k \ge 3$ be fixed. For any integer $n \ge 1$ and $c \in [n]$ the set

\[ \Phi_n(c) = \{\alpha \in \{0, \ldots, n-1\}^k : \alpha_1 + \cdots + \alpha_k = c\} \]

is tight. For any integer $n \ge 2$ and any $c \in [n]$ the set

\[ \Psi_n(c) = \{\alpha \in \{0, \ldots, n-1\}^k : \alpha_1 + \cdots + \alpha_k = c \bmod n\} \]

is not tight (cf. Exercise 15.20 in [BCS97]).

Example 5.2. When $\mathbb{F}$ contains a primitive $n$th root of unity $\zeta$, the tensor

\[ t_n = \sum_{\alpha \in \Psi_n(n-1)} e_{\alpha_1} \otimes \cdots \otimes e_{\alpha_k} \in (\mathbb{F}^n)^{\otimes k}, \]

which has support $\Psi_n(n-1)$, is tight. Namely, the elements $v_j = \sum_{i=1}^{n} \zeta^{ij} e_i$ for $j \in [n]$ form a basis of $\mathbb{F}^n$. Let $g \in G(t_n)$ be the corresponding basis transformation. Then we have $t_n = \sum_{j=1}^{n} v_j \otimes \cdots \otimes v_j$ and we see that the support $\operatorname{supp}(g \cdot t_n) = \{\alpha \in [n]^k : \alpha_1 = \cdots = \alpha_k\}$ is tight. (See also [BCS97, Exercise 15.25].) When the characteristic of $\mathbb{F}$ equals $n$, the tensor $t_n$ is also tight, as we will see in Section 5.4.2.

Combinatorial subrank and the Coppersmith–Winograd method

We care about tight tensors because of a remarkable theorem of Strassen for tight 3-tensors (Theorem 5.3 below). To understand the theorem we need the concept of combinatorial asymptotic subrank (cf. [Str91, Section 5]). We say $D \subseteq I_1 \times \cdots \times I_k$ is a diagonal when any two distinct $\alpha, \beta \in D$ are distinct in all $k$ coordinates. In other words, for elements in $D$, the value at one coordinate uniquely determines the value at the other $k-1$ coordinates. Let $\Phi \subseteq I_1 \times \cdots \times I_k$. We say a diagonal $D \subseteq I_1 \times \cdots \times I_k$ is free for $\Phi$, or simply $D \subseteq \Phi$ is a free diagonal, if $D = \Phi \cap (D_1 \times \cdots \times D_k)$ where $D_i = \{x_i : (x_1, \ldots, x_k) \in D\}$. Define the (combinatorial) subrank $Q(\Phi)$ as the size of the largest free diagonal $D \subseteq \Phi$. For $\Phi \subseteq I_1 \times \cdots \times I_k$ and $\Psi \subseteq J_1 \times \cdots \times J_k$ we naturally define the product $\Phi \times \Psi \subseteq (I_1 \times J_1) \times \cdots \times (I_k \times J_k)$ by

\[ \Phi \times \Psi = \{((\alpha_1, \beta_1), \ldots, (\alpha_k, \beta_k)) : \alpha \in \Phi,\ \beta \in \Psi\}. \]

Define the (combinatorial) asymptotic subrank $\underset{\sim}{Q}(\Phi) = \lim_{n \to \infty} Q(\Phi^{\times n})^{1/n}$. Let $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ and let $\Phi$ be the support of $t$ in the standard basis. Then $Q(\Phi) \le Q(t)$ and $\underset{\sim}{Q}(\Phi) \le \underset{\sim}{Q}(t)$. The number $Q(\Phi)$ may be interpreted as the largest number $n$ such that $\langle n \rangle$ can be obtained from $t$ using a restriction that consists of matrices that have at most one nonzero entry in each row and in each column. (This is called M-restriction in [Str87, Section 6], which stands for monomial restriction.) We may also interpret $\Phi$ as a $k$-partite hypergraph. Then $Q(\Phi)$ is the size of the largest induced $k$-partite matching in $\Phi$.

Let $\Phi \subseteq [n_1] \times \cdots \times [n_k]$ and let $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ be any tensor with support equal to $\Phi$. Then the (asymptotic) subranks of $\Phi$ and $t$ are related as follows:

\[ Q(\Phi) \le Q(t) \quad \text{and} \quad \underset{\sim}{Q}(\Phi) \le \underset{\sim}{Q}(t). \]
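The definition of a free diagonal can be made concrete with a small brute-force computation. The following Python sketch is only an illustration for very small supports (it enumerates all subsets); the function names `is_free_diagonal` and `subrank` are hypothetical and not taken from the thesis.

```python
from itertools import combinations


def is_free_diagonal(D, Phi):
    """Return True if D is a free diagonal for Phi, i.e. D is a diagonal,
    D is contained in Phi, and D = Phi ∩ (D_1 x ... x D_k)."""
    D, Phi = set(D), set(Phi)
    if not D or not D <= Phi:
        return False
    k = len(next(iter(Phi)))
    # Diagonal: two distinct elements differ in every coordinate.
    for a, b in combinations(D, 2):
        if any(a[i] == b[i] for i in range(k)):
            return False
    # Free: the box spanned by the marginal sets meets Phi exactly in D.
    marginals = [{a[i] for a in D} for i in range(k)]
    box = {a for a in Phi if all(a[i] in marginals[i] for i in range(k))}
    return box == D


def subrank(Phi):
    """Combinatorial subrank Q(Phi): size of the largest free diagonal."""
    Phi = list(set(Phi))
    for size in range(len(Phi), 0, -1):
        for D in combinations(Phi, size):
            if is_free_diagonal(D, Phi):
                return size
    return 0


# Support of the W-type tensor: any two elements share a coordinate.
W = {(0, 0, 1), (0, 1, 0), (1, 0, 0)}
# Support of a unit tensor of size 2: already a free diagonal.
UNIT = {(0, 0, 0), (1, 1, 1)}
print(subrank(W), subrank(UNIT))
```

For the W-type support every pair of elements agrees in some coordinate, so only singletons are diagonals, whereas the unit support is itself a free diagonal.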

Strassen proved the following theorem using the method of Coppersmith and Winograd [CW90]. Recall that for $\Phi \subseteq I_1 \times I_2 \times I_3$ we let $\mathcal{P}(\Phi)$ be the set of probability distributions on $\Phi$. For $P \in \mathcal{P}(\Phi)$ let $P_1, P_2, P_3$ be the marginal distributions of $P$ on the 3 components of $I_1 \times I_2 \times I_3$.

Theorem 5.3 ([Str91, Lemma 5.1]). Let $\Phi \subseteq I_1 \times I_2 \times I_3$ be tight. Then

\[ \underset{\sim}{Q}(\Phi) = \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} 2^{H(P_i)}. \tag{5.1} \]

The consequence of Theorem 5.3 is that the support functionals are sufficiently powerful to compute the asymptotic subrank of tight 3-tensors.

Corollary 5.4 ([Str91, Proposition 5.4]). Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$ be tight. Then

\[ \underset{\sim}{Q}(t) = \min_{\theta \in \mathcal{P}([3])} \zeta^\theta(t). \]

Moreover, if $\Phi = \operatorname{supp}(g \cdot t)$ is tight for some $g \in G(t)$, then $\underset{\sim}{Q}(t) = \underset{\sim}{Q}(\Phi)$.
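As a numerical illustration of the right-hand side of (5.1), the sketch below evaluates $\min_i 2^{H(P_i)}$ for distributions on the tight set $\{(0,0,1),(0,1,0),(1,0,0)\}$. The helper names are hypothetical. For this particular set the uniform distribution is a maximiser (a symmetry argument, not spelled out in the text), and the random search only serves as a sanity check of that assumption.

```python
import random
from collections import defaultdict
from math import log2


def entropy(dist):
    """Shannon entropy (base 2) of a dict mapping outcomes to probabilities."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)


def marginal(P, i):
    """Marginal distribution of P on coordinate i."""
    m = defaultdict(float)
    for a, p in P.items():
        m[a[i]] += p
    return dict(m)


def objective(P):
    """The quantity min_i 2^{H(P_i)} from equation (5.1)."""
    k = len(next(iter(P)))
    return min(2 ** entropy(marginal(P, i)) for i in range(k))


PHI = [(0, 0, 1), (0, 1, 0), (1, 0, 0)]
uniform_value = objective({a: 1 / 3 for a in PHI})

# Crude random search over distributions on PHI (sanity check only).
rng = random.Random(0)
best_random = 0.0
for _ in range(500):
    w = [rng.random() for _ in PHI]
    total = sum(w)
    best_random = max(best_random, objective({a: x / total for a, x in zip(PHI, w)}))

print(uniform_value, best_random)
```

The uniform value equals $2^{h(1/3)} = 3/2^{2/3}$, where $h$ is the binary entropy function.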

Remark 5.5. Strassen conjectured in [Str94, Conjecture 5.3] that for the family of tight 3-tensors the support functionals give all spectral points in the asymptotic spectrum $X(\{\text{tight 3-tensors}\})$. In [Str91] numerous examples are given of subfamilies of tight 3-tensors for which this is the case.

Remark 5.6. Equation (5.1) becomes false when we let $\Phi \subseteq I_1 \times \cdots \times I_k$ with $k \ge 4$ and we let the right-hand side of the equation be $\max_{P \in \mathcal{P}(\Phi)} \min_i 2^{H(P_i)}$; see [CVZ16, Example 1138].

New results in this chapter

This chapter is an investigation of tight tensors, combinatorial asymptotic subrank and applications. More precisely, this chapter contains the following new results.

Higher-order Coppersmith–Winograd method. In Section 5.2 we extend Theorem 5.3 to obtain a lower bound for $\underset{\sim}{Q}(\Phi)$ for tight sets $\Phi \subseteq I_1 \times \cdots \times I_k$ with $k \ge 4$. Our lower bound is not known to be optimal in general. We compute examples for which the lower bound is optimal.

Combinatorial degeneration method. In Section 5.3 we further extend the range of application of the Coppersmith–Winograd method via a partial order $\trianglelefteq$ on supports of tensors called combinatorial degeneration. We prove that if $\Phi \trianglelefteq \Psi$, then $\underset{\sim}{Q}(\Phi) \le \underset{\sim}{Q}(\Psi)$. Suppose $\Psi$ is not tight but $\Phi$ is tight; then we may apply the (higher-order) Coppersmith–Winograd method to obtain a lower bound on $\underset{\sim}{Q}(\Phi)$ and thus on $\underset{\sim}{Q}(\Psi)$.

Cap sets. In Section 5.4 we relate the theory of asymptotic spectra, the Coppersmith–Winograd method and the combinatorial degeneration method to the problem of upper bounding the maximum size of cap sets in $\mathbb{F}_p^n$.

Graph tensors. Graph tensors are generalisations of the matrix multiplication tensor $\langle 2, 2, 2 \rangle$ parametrised by graphs. In Section 5.5 we discuss how one can apply the higher-order Coppersmith–Winograd method to obtain upper bounds on the asymptotic rank of complete graph tensors. We also briefly discuss the surgery method, which gives good upper bounds on the asymptotic rank of graph tensors for sparse graphs like cycle graphs.

5.2 Higher-order CW method

In this section we extend Theorem 5.3 to tight $\Phi \subseteq I_1 \times \cdots \times I_k$ with $k \ge 4$. We introduce some notation. Let $\mathcal{P}(\Phi)$ be the set of probability distributions on $\Phi$. For $P \in \mathcal{P}(\Phi)$ let $P_1, \ldots, P_k$ be the marginal distributions of $P$ on the $k$ components of $I_1 \times \cdots \times I_k$. Let $\mathcal{R}(\Phi)$ be the set of all subsets $R \subseteq \Phi^2$ such that $R \not\subseteq \{(x, x) : x \in \Phi\}$ and $R \subseteq \{(x, y) \in \Phi^2 : x_i = y_i\}$ for some $i \in [k]$. For $P \in \mathcal{P}(\Phi)$ and $R \in \mathcal{R}(\Phi)$ let $\mathcal{Q}(R, (P_1, \ldots, P_k))$ be the set of probability distributions $Q$ on $R$ whose marginal distributions on the $2k$ components of $R$ satisfy $Q_i = Q_{k+i} = P_i$ for $i \in [k]$.

Let $I_1, \ldots, I_k$ be finite subsets of $\mathbb{Z}$. The result of this section is a lower bound on the asymptotic subrank of any $\Phi \subseteq I_1 \times \cdots \times I_k$ satisfying $\forall a \in \Phi : \sum_{i=1}^{k} a_i = 0$. For $R \in \mathcal{R}(\Phi)$ let $r(R)$ be the rank over $\mathbb{Q}$ of the matrix with rows $\{x - y : (x, y) \in R\}$.

Theorem 5.7. Let $\Phi \subseteq \mathbb{Z}^k$ be a finite set with $\forall a \in \Phi : \sum_{i=1}^{k} a_i = 0$. Then

\[ \log_2 \underset{\sim}{Q}(\Phi) \ge \max_{P} \min_{R, Q} \; H(P) - (k-2) \frac{H(Q) - H(P)}{r(R)} \]

with $P \in \mathcal{P}(\Phi)$, $R \in \mathcal{R}(\Phi)$ and $Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))$.

5.2.1 Construction

We prepare for the proof of Theorem 5.7 by discussing some basic facts.

Average-free sets

Lemma 5.8. Let $k \in \mathbb{N}$. Let $M \in \mathbb{N}$. We say a subset $B \subseteq \mathbb{Z}/M\mathbb{Z}$ is $(k-1)$-average-free if

\[ \forall x_1, \ldots, x_k \in B : \quad x_1 + \cdots + x_{k-1} = (k-1) x_k \ \Rightarrow\ x_1 = \cdots = x_k. \]

There is a $(k-1)$-average-free set $B \subseteq \mathbb{Z}/M\mathbb{Z}$ of size $|B| = M^{1-o(1)}$.

Proof. There is a set $A \subseteq \{1, \ldots, \lfloor \tfrac{M-1}{k-1} \rfloor\}$ of size $|A| = M^{1-o(1)}$ with

\[ \forall x_1, \ldots, x_k \in A : \quad x_1 + \cdots + x_{k-1} = (k-1) x_k \ \Rightarrow\ x_1 = \cdots = x_k, \tag{5.2} \]

see [VC15, Lemma 10]. Let $B = \{a \bmod M : a \in A\} \subseteq \mathbb{Z}/M\mathbb{Z}$. Then $|B| = |A|$. Let $x_1, \ldots, x_k \in B$ with $x_1 + \cdots + x_{k-1} = (k-1) x_k$. View $x_1, \ldots, x_k$ as elements in $\{1, \ldots, \lfloor \tfrac{M-1}{k-1} \rfloor\}$. Then $x_1 + \cdots + x_{k-1} = (k-1) x_k$ still holds as an equation of integers, since both sides lie between $1$ and $M-1$. From (5.2) follows $x_1 = \cdots = x_k$ in $\mathbb{Z}$ and hence also in $\mathbb{Z}/M\mathbb{Z}$.
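The role of the range restriction $A \subseteq \{1, \ldots, \lfloor (M-1)/(k-1) \rfloor\}$ can be seen in a small brute-force check. The sketch below uses a hypothetical helper `is_average_free`; it tests the defining condition directly and is only feasible for tiny sets.

```python
from itertools import product


def is_average_free(B, M, k):
    """Check whether B, viewed in Z/MZ, is (k-1)-average-free:
    x_1 + ... + x_{k-1} = (k-1) x_k (mod M) must force x_1 = ... = x_k."""
    B = sorted({b % M for b in B})
    for xs in product(B, repeat=k):
        *head, last = xs
        if sum(head) % M == ((k - 1) * last) % M and len(set(xs)) > 1:
            return False
    return True


# k = 3 (so "2-average-free" means free of three-term progressions).
# {1, 2, 4} lies in {1, ..., floor((9 - 1) / 2)} = {1, ..., 4}, as in the proof.
print(is_average_free({1, 2, 4}, M=9, k=3))
# {0, 1, 3} has no integer progression, but modulo 5 we get 0 + 1 = 2 * 3.
print(is_average_free({0, 1, 3}, M=5, k=3))
```

The second example shows how wrap-around modulo $M$ can create a solution that does not exist over the integers, which is exactly what the restriction on $A$ rules out.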

Linear combinations of uniform variables

Lemma 5.9. Let $M$ be a prime. Let $u_1, \ldots, u_n$ be independently uniformly distributed over $\mathbb{Z}/M\mathbb{Z}$. Let $v_1, \ldots, v_m$ be $(\mathbb{Z}/M\mathbb{Z})$-linear combinations of $u_1, \ldots, u_n$. Then the vector $v = (v_1, \ldots, v_m)$ is uniformly distributed over the range of $v$ in $(\mathbb{Z}/M\mathbb{Z})^m$.

Proof. Let $v_i = \sum_j c_{ij} u_j$ with $c_{ij} \in \mathbb{Z}/M\mathbb{Z}$. Then $v = Cu$ with $u = (u_1, \ldots, u_n)$ and $C$ the matrix with entries $C_{ij} = c_{ij}$. Let $y$ be in the image of $C$. Then the cardinality of the preimage $C^{-1}(y)$ equals the cardinality of the kernel of $C$. Indeed, if $Cx = y$, then $C^{-1}(y) = x + \ker(C)$. Since $u$ is uniform, we conclude that $v$ is uniform on the image of $C$.
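Lemma 5.9 can be checked exhaustively for small parameters. The sketch below (function name and example matrix are illustrative choices, not from the thesis) enumerates all $u \in (\mathbb{Z}/M\mathbb{Z})^n$ and counts how often each value of $v = Cu$ occurs; uniformity on the image means that all counts agree.

```python
from collections import Counter
from itertools import product


def image_counts(C, M):
    """Count, for every v in the image of u -> C u over Z/MZ,
    the number of preimages u in (Z/MZ)^n."""
    n = len(C[0])
    counts = Counter()
    for u in product(range(M), repeat=n):
        v = tuple(sum(c * x for c, x in zip(row, u)) % M for row in C)
        counts[v] += 1
    return counts


# Example over Z/5Z: the second row is twice the first, so C has rank 2.
C = [[1, 2, 0],
     [2, 4, 0],
     [0, 1, 1]]
counts = image_counts(C, M=5)
print(len(counts), set(counts.values()))
```

Here the image has $5^2$ elements and each of them has $5^3 / 5^2 = 5$ preimages, the size of the kernel.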

Free diagonals

Lemma 5.10. Let $G$ be a graph with $n$ vertices and $m$ edges. Then $G$ has at least $n - m$ connected components.

Proof. A graph without edges has $n$ connected components. For every edge that we add to the graph, we lose at most one connected component.

Lemma 5.11. Let $I_1, \ldots, I_k$ be finite sets. Let $\Psi \subseteq I_1 \times \cdots \times I_k$. Let

\[ C = \{\{a, b\} \subseteq \Psi : a \ne b,\ \exists i \in [k] : a_i = b_i\}. \]

Then $Q(\Psi) \ge |\Psi| - |C|$. Obviously, the statement remains true if we replace $C$ by the larger set $\{(a, b) \in \Psi^2 : a \ne b,\ \exists i \in [k] : a_i = b_i\}$.

Proof. Let $G = (\Psi, C)$ be the graph with vertex set $\Psi$ and edge set $C$. Let $\Gamma \subseteq \Psi$ contain exactly one vertex per connected component of $G$. The vertices in $\Gamma$ are pairwise not adjacent. So $\Gamma$ is a diagonal. Of course $\Gamma \subseteq \Psi \cap (\Gamma_1 \times \cdots \times \Gamma_k)$. Let $a \in \Psi \cap (\Gamma_1 \times \cdots \times \Gamma_k)$. Let $x_1, \ldots, x_k \in \Gamma$ with

\[ (x_1)_1 = a_1,\quad (x_2)_2 = a_2,\quad \ldots,\quad (x_k)_k = a_k. \]

Then $x_1, \ldots, x_k$ are all adjacent or equal to $a$ in $G$, i.e. they are all in the same connected component. Then $x_1 = \cdots = x_k$, since $\Gamma$ contains precisely one vertex per connected component. So $a = x_1 = \cdots = x_k$. So $a \in \Gamma$. We conclude that $\Gamma \supseteq \Psi \cap (\Gamma_1 \times \cdots \times \Gamma_k)$. Finally, $|\Gamma| \ge |\Psi| - |C|$ by Lemma 5.10.
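The proof of Lemma 5.11 is constructive, and the construction can be sketched in a few lines: build the graph on $\Psi$, pick one representative per connected component, and verify that the result is a free diagonal. The function names and the example set below are hypothetical; the union-find bookkeeping is just one convenient way to compute connected components.

```python
from itertools import combinations


def shares_coordinate(a, b):
    return any(x == y for x, y in zip(a, b))


def free_diagonal_from_components(Psi):
    """Return (Gamma, number_of_edges), where Gamma contains one vertex per
    connected component of the graph on Psi whose edges join distinct
    elements that agree in some coordinate."""
    Psi = sorted(set(Psi))
    parent = {a: a for a in Psi}

    def find(a):
        while parent[a] != a:
            a = parent[a]
        return a

    edges = 0
    for a, b in combinations(Psi, 2):
        if shares_coordinate(a, b):
            edges += 1
            parent[find(a)] = find(b)  # merge the two components
    return {find(a) for a in Psi}, edges


def is_free_diagonal(D, Phi):
    """D is a diagonal and D = Phi ∩ (D_1 x ... x D_k)."""
    k = len(next(iter(Phi)))
    if any(shares_coordinate(a, b) for a, b in combinations(D, 2)):
        return False
    marg = [{a[i] for a in D} for i in range(k)]
    return {a for a in Phi if all(a[i] in marg[i] for i in range(k))} == set(D)


PSI = {(0, 0, 0), (1, 1, 1), (1, 2, 2), (3, 3, 3), (3, 4, 5), (6, 4, 7)}
GAMMA, EDGES = free_diagonal_from_components(PSI)
print(len(GAMMA), len(PSI) - EDGES, is_free_diagonal(GAMMA, PSI))
```

In this example the graph has three edges and three components, so the bound $|\Gamma| \ge |\Psi| - |C|$ holds with equality.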

We now give the proof of Theorem 5.7. We repeat some notation from above. Let $k \ge 3$. Let $\Phi \subseteq \mathbb{Z}^k$ be a finite set. Let $\mathcal{P}(\Phi)$ be the set of probability distributions on $\Phi$. For $P \in \mathcal{P}(\Phi)$ let $P_1, \ldots, P_k$ be the marginal distributions of $P$ on the $k$ components of $\mathbb{Z}^k$. Let $\mathcal{R}(\Phi)$ be the set of all subsets $R \subseteq \Phi^2$ such that $R \not\subseteq \{(x, x) : x \in \Phi\}$ and $R \subseteq \{(x, y) \in \Phi^2 : x_i = y_i\}$ for some $i \in [k]$. For $P \in \mathcal{P}(\Phi)$ and $R \in \mathcal{R}(\Phi)$ let $\mathcal{Q}(R, (P_1, \ldots, P_k))$ be the set of probability distributions $Q$ on $R$ whose marginal distributions on the $2k$ components of $R$ satisfy $Q_i = Q_{k+i} = P_i$ for $i \in [k]$. For $R \in \mathcal{R}(\Phi)$ let $r(R)$ be the rank over $\mathbb{Q}$ of the matrix with rows

\[ \{x - y : (x, y) \in R\}. \]

For any prime $M$ let $r_M(R)$ be the rank over $\mathbb{Z}/M\mathbb{Z}$ of the same matrix.

Theorem (Theorem 5.7). Let $\Phi \subseteq \mathbb{Z}^k$ be a finite set with $\forall a \in \Phi : \sum_{i=1}^{k} a_i = 0$. Then

\[ \log_2 \underset{\sim}{Q}(\Phi) \ge \max_{P} \min_{R, Q} \; H(P) - (k-2) \frac{H(Q) - H(P)}{r(R)} \]

with $P \in \mathcal{P}(\Phi)$, $R \in \mathcal{R}(\Phi)$ and $Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))$.

Proof. Let $P$ be a rational probability distribution on $\Phi$, i.e. $\forall a \in \Phi : P(a) \in \mathbb{Q}$.

Choice of parameters

This proof involves a variable $N$ that we will let go to infinity and a prime number $M$ that depends on $N$. For the sake of rigor, we first set the dependence of $M$ on $N$ and make sure that $N$ is large enough for $M$ to have good properties.

Let $n \in \mathbb{N}$ be such that $P$ is an $n$-type, i.e. $\forall a \in \Phi : nP(a) \in \mathbb{N}$. Let $N = tn$ be a multiple of $n$. Let

\[ f(N) = \log_2 \Bigl( 2^{|\Phi|^2} \max_{R \in \mathcal{R}(\Phi)} \binom{N + |R| - 1}{|R| - 1} \Bigr) \in o(N). \tag{5.3} \]

Let

\[ g(N) = |\Phi| \log_2(N + 1) \in o(N). \]

By Lemma 4.20,

\[ 2^{N H(P) - g(N)} \le \binom{N}{NP}. \tag{5.4} \]

Let

\[ \mu(N) = \max_{R, Q} \frac{H(Q) - H(P) + (1 + g(N) + f(N)) \frac{1}{N}}{r(R)} \tag{5.5} \]

with $R \in \mathcal{R}(\Phi)$ and $Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))$. Let $M$ be a prime with

\[ \lceil 2^{\mu(N) N} \rceil \le M \le 2 \lceil 2^{\mu(N) N} \rceil. \tag{5.6} \]

Such a prime exists by Bertrand's postulate, see e.g. [AZ14]. We can make $M$ arbitrarily large by choosing $N$ large enough. Choose $N = tn$ large enough such that

\[ M > k - 1, \tag{5.7} \]

\[ \forall R \in \mathcal{R}(\Phi) : \quad r_M(R) = r(R). \tag{5.8} \]

We will later let $t$, and thus $N$, go to infinity.

Restrict to marginal type classes

The set $\Phi^{\otimes N}$ is a finite subset of $(\mathbb{Z}^N)^k$. Let $a \in \Phi^{\otimes N}$. Then we have that $a_i = ((a_i)_1, \ldots, (a_i)_N) \in \mathbb{Z}^N$ for $i \in [k]$. We restrict to those $a$ for which $a_i$ is in the type class $T^N_{P_i}$ for all $i \in [k]$. Thus let

\[ \Psi = \Phi^{\otimes N} \cap (T^N_{P_1} \times \cdots \times T^N_{P_k}). \]

We prove a lower bound on the size of $\Psi$. Let $(s_1, \ldots, s_N) \in T^N_P$. Then $s_j \in \Phi$ for $j \in [N]$ and $((s_1)_i, \ldots, (s_N)_i) \in T^N_{P_i}$ for $i \in [k]$. So

\[ \bigl( ((s_1)_1, \ldots, (s_N)_1), \ldots, ((s_1)_k, \ldots, (s_N)_k) \bigr) \in \Phi^{\otimes N} \cap (T^N_{P_1} \times \cdots \times T^N_{P_k}) = \Psi. \]

Thus $|\Psi| \ge |T^N_P|$. By Lemma 4.19, $|T^N_P| = \binom{N}{NP}$. By Lemma 4.20, $\binom{N}{NP} \ge 2^{N H(P) - g(N)}$. Therefore

\[ |\Psi| \ge 2^{N H(P) - g(N)}. \tag{5.9} \]
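The size of a type class and the bound used in (5.9) can be checked numerically. The sketch below computes $|T^N_P|$ as a multinomial coefficient and compares it with $2^{NH(P) - g(N)}$ and with the standard upper bound $2^{NH(P)}$; here the number of outcomes plays the role of $|\Phi|$ in $g(N)$. The function names are hypothetical.

```python
from math import factorial, log2


def type_class_size(counts):
    """Number of sequences of length N = sum(counts) in which outcome j
    occurs exactly counts[j] times (a multinomial coefficient)."""
    size = factorial(sum(counts))
    for c in counts:
        size //= factorial(c)
    return size


def entropy_bounds(counts):
    """Return (lower, size, upper) with lower = 2^(N H(P) - g(N)),
    upper = 2^(N H(P)), P the empirical distribution counts / N and
    g(N) = (number of outcomes) * log2(N + 1)."""
    N = sum(counts)
    H = -sum((c / N) * log2(c / N) for c in counts if c > 0)
    g = len(counts) * log2(N + 1)
    return 2 ** (N * H - g), type_class_size(counts), 2 ** (N * H)


lower, size, upper = entropy_bounds([4, 4, 4])
print(lower, size, upper)
```

For the uniform 12-type on three outcomes this gives a type class of size 34650, between roughly $3^{12}/13^3$ and $3^{12}$.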


Hashing

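Let $u_1, \ldots, u_{k-1}, v_1, \ldots, v_N \in \mathbb{Z}/M\mathbb{Z}$. For $i \in [k]$ let

\[
h_i : \mathbb{Z}^N \to \mathbb{Z}/M\mathbb{Z}, \qquad
x \mapsto
\begin{cases}
u_i + \sum_{j=1}^{N} x_j v_j & \text{for } 1 \le i \le k-1, \\[4pt]
\dfrac{1}{k-1} \Bigl( u_1 + \cdots + u_{k-1} - \sum_{j=1}^{N} x_j v_j \Bigr) & \text{for } i = k.
\end{cases}
\]

Note that $k-1$ is invertible in $\mathbb{Z}/M\mathbb{Z}$ by (5.7). Let $a \in \Psi$. Then $((a_1)_j, \ldots, (a_k)_j) \in \Phi$ for $j \in [N]$. So $\sum_{i=1}^{k} (a_i)_j = 0$ for every $j \in [N]$. Thus

\[ \sum_{i=1}^{k} \sum_{j=1}^{N} (a_i)_j v_j = \sum_{j=1}^{N} v_j \sum_{i=1}^{k} (a_i)_j = 0. \]

Therefore

\[ h_1(a_1) + \cdots + h_{k-1}(a_{k-1}) = (k-1) h_k(a_k). \]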
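The hashing identity can be verified on random instances. The sketch below is an illustration under assumed parameters: the set `PHI` (a small set of integer triples summing to zero), the prime `M = 101` and the helper name `make_hashes` are all illustrative choices. It relies on Python's built-in `pow(x, -1, M)` for the modular inverse (available from Python 3.8).

```python
import random


def make_hashes(k, M, N, rng):
    """Draw u_1..u_{k-1}, v_1..v_N uniformly from Z/MZ and return h(i, x)
    implementing the maps h_i (with i running from 1 to k)."""
    u = [rng.randrange(M) for _ in range(k - 1)]
    v = [rng.randrange(M) for _ in range(N)]
    inv = pow(k - 1, -1, M)  # k - 1 is invertible because M is a prime > k - 1

    def h(i, x):
        dot = sum(xj * vj for xj, vj in zip(x, v)) % M
        if i < k:
            return (u[i - 1] + dot) % M
        return (inv * (sum(u) - dot)) % M

    return h


def identity_holds(seed, k=3, M=101, N=8):
    rng = random.Random(seed)
    PHI = [(0, 1, -1), (1, -1, 0), (-1, 0, 1)]  # every element sums to zero
    word = [rng.choice(PHI) for _ in range(N)]  # an element of Phi^{⊗N}
    # a[i] collects the i-th coordinates of the N chosen elements.
    a = [tuple(s[i] for s in word) for i in range(k)]
    h = make_hashes(k, M, N, rng)
    lhs = sum(h(i, a[i - 1]) for i in range(1, k)) % M
    rhs = ((k - 1) * h(k, a[k - 1])) % M
    return lhs == rhs


print(all(identity_holds(seed) for seed in range(100)))
```

The identity holds for every choice of the random parameters, not just on average, which is what allows the restriction to an average-free set in the next step.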

Restrict to average-free set

Let $B \subseteq \mathbb{Z}/M\mathbb{Z}$ be a $(k-1)$-average-free set of size

\[ |B| \ge M^{1 - \kappa(M)} \quad \text{with } \kappa(M) \in o(1), \tag{5.10} \]

meaning

\[ \forall x_1, \ldots, x_k \in B : \quad x_1 + \cdots + x_{k-1} = (k-1) x_k \ \Rightarrow\ x_1 = \cdots = x_k \tag{5.11} \]

(Lemma 5.8). Let $\Psi' \subseteq \Psi$ be the subset

\[ \Psi' = \{a \in \Psi : \forall i \in [k]\ h_i(a_i) \in B\}. \]

Let $a \in \Psi'$. Then $a \in \Psi$, so

\[ h_1(a_1) + \cdots + h_{k-1}(a_{k-1}) = (k-1) h_k(a_k). \]

Since $h_i(a_i) \in B$ for every $i \in [k]$, (5.11) implies

\[ h_1(a_1) = \cdots = h_k(a_k). \]

Probabilistic method

Clearly $Q(\Phi^{\otimes N}) \ge Q(\Psi) \ge Q(\Psi')$. Let

\[ C' = \{(a, b) \in \Psi'^2 : a \ne b,\ \exists i \in [k] : a_i = b_i\}. \]

Let $X = |\Psi'|$ and $Y = |C'|$. By Lemma 5.11,

\[ Q(\Psi') \ge X - Y. \]

Let $u_1, \ldots, u_{k-1}, v_1, \ldots, v_N$ be independent uniformly random variables over the field $\mathbb{Z}/M\mathbb{Z}$. Then $X$ and $Y$ are random variables. Then, for some choice of $u_1, \ldots, u_{k-1}, v_1, \ldots, v_N$,

\[ Q(\Psi') \ge \mathbb{E}[X - Y] = \mathbb{E}[X] - \mathbb{E}[Y], \]

where the expectation is over $u_1, \ldots, u_{k-1}, v_1, \ldots, v_N$. We will prove

\[ \mathbb{E}[X] = |B|\, |\Psi|\, M^{-(k-1)}, \tag{5.12} \]

\[ \mathbb{E}[Y] \le |B| \max_{R, Q} 2^{N H(Q) + f(N)} M^{-(k-1) - r(R)}, \tag{5.13} \]

with $f(N)$ as defined in (5.3) and $R \in \mathcal{R}(\Phi)$, $Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))$. Before proving (5.12) and (5.13) we derive the final bound.

Derivation of final bound

From (5.12) and (5.13) follows

\[ \mathbb{E}[X] - \mathbb{E}[Y] \ge |B|\, |\Psi|\, M^{-(k-1)} - |B| \max_{R, Q} 2^{N H(Q) + f(N)} M^{-(k-1) - r(R)}. \]

We factor out $|B|$, $|\Psi|$ and $M^{-(k-1)}$:

\[ \mathbb{E}[X] - \mathbb{E}[Y] \ge |B|\, |\Psi|\, M^{-(k-1)} \Bigl( 1 - \frac{1}{|\Psi|} \max_{R, Q} 2^{N H(Q) + f(N)} M^{-r(R)} \Bigr). \]

From our choice of $\mu(N)$ from (5.5),

\[ \mu(N) = \max_{R, Q} \frac{H(Q) - H(P) + (1 + g(N) + f(N)) \frac{1}{N}}{r(R)}, \]

follows

\[ \max_{R, Q} 2^{N (H(Q) - H(P) - r(R) \mu(N)) + g(N) + f(N)} \le \frac{1}{2}. \tag{5.14} \]

Apply $|B| \ge M^{1 - \kappa(M)}$ from (5.10) and $|\Psi| \ge 2^{N H(P) - g(N)}$ from (5.9) to get

\[
\begin{aligned}
\mathbb{E}[X] - \mathbb{E}[Y]
&\ge M^{1 - \kappa(M)}\, 2^{N H(P) - g(N)}\, M^{-(k-1)} \cdot \Bigl( 1 - 2^{-N H(P) + g(N)} \max_{R, Q} 2^{N H(Q) + f(N)} M^{-r(R)} \Bigr) \\
&\ge M^{-(k - 2 + \kappa(M))}\, 2^{N H(P) - g(N)} \cdot \Bigl( 1 - \max_{R, Q} 2^{N H(Q) - N H(P) + g(N) + f(N)} M^{-r(R)} \Bigr).
\end{aligned}
\]

(Here we used (5.14) to see that the second factor is nonnegative.) Apply the bounds $2^{\mu(N) N} \le M \le 2^{\mu(N) N + 2}$ from (5.6) to get

\[
\begin{aligned}
\mathbb{E}[X] - \mathbb{E}[Y]
&\ge (2^{\mu(N) N + 2})^{-(k - 2 + \kappa(M))}\, 2^{N H(P) - g(N)} \cdot \Bigl( 1 - \max_{R, Q} 2^{N H(Q) - N H(P) + g(N) + f(N)} (2^{\mu(N) N})^{-r(R)} \Bigr) \\
&= 2^{N (H(P) - (k - 2 + \kappa(M)) \mu(N)) - 2(k - 2 + \kappa(M)) - g(N)} \cdot \Bigl( 1 - \max_{R, Q} 2^{N (H(Q) - H(P) - r(R) \mu(N)) + g(N) + f(N)} \Bigr).
\end{aligned}
\]

Using (5.14) we get

\[
\begin{aligned}
\mathbb{E}[X] - \mathbb{E}[Y]
&\ge 2^{N (H(P) - (k - 2 + \kappa(M)) \mu(N)) - 2(k - 2 + \kappa(M)) - g(N)} \bigl( 1 - \tfrac{1}{2} \bigr) \\
&= 2^{N (H(P) - (k - 2 + \kappa(M)) \mu(N)) - 2(k - 2 + \kappa(M)) - g(N) - 1}.
\end{aligned}
\]

Then

\[
\begin{aligned}
\frac{1}{N} \log_2 Q(\Phi^{\otimes N})
&\ge \frac{1}{N} \log_2 (\mathbb{E}[X] - \mathbb{E}[Y]) \\
&\ge H(P) - (k - 2 + \kappa(M)) \max_{R, Q} \frac{H(Q) - H(P) + (1 + g(N) + f(N)) \frac{1}{N}}{r(R)} - \frac{2(k - 2 + \kappa(M)) + g(N) + 1}{N}.
\end{aligned}
\]

We let $t$, and thus $N$, go to infinity and obtain

\[ \log_2 \underset{\sim}{Q}(\Phi) \ge H(P) - (k - 2) \max_{R, Q} \frac{H(Q) - H(P)}{r(R)}. \]

This lower bound holds for any rational probability distribution $P$ on $\Phi$ and, by continuity, for any real probability distribution $P$ on $\Phi$.

It remains to prove (5.12) and (5.13). We do this in the lemmas below.

Lemma 5.12. $\mathbb{E}[X] = |B|\, |\Psi|\, M^{-(k-1)}$.

Proof. Let $a \in \Psi$. Then $h_1(a_1) + \cdots + h_{k-1}(a_{k-1}) = (k-1) h_k(a_k)$. The following four statements are equivalent:

\[ a \in \Psi' \]

\[ \forall i \in [k] : \ h_i(a_i) \in B \]

\[ \exists b \in B : \ h_1(a_1) = \cdots = h_k(a_k) = b \]

\[ \exists b \in B : \ h_1(a_1) = \cdots = h_{k-1}(a_{k-1}) = b. \]

Therefore

\[ \mathbb{P}[a \in \Psi'] = \sum_{b \in B} \mathbb{P}[h_1(a_1) = \cdots = h_{k-1}(a_{k-1}) = b]. \]

For $b \in B$,

\[ \mathbb{P}[h_1(a_1) = \cdots = h_{k-1}(a_{k-1}) = b] = (M^{-1})^{k-1}. \]

We conclude

\[
\begin{aligned}
\mathbb{E}[X] &= \sum_{a \in \Psi} \mathbb{P}[a \in \Psi'] \\
&= \sum_{a \in \Psi} \sum_{b \in B} \mathbb{P}[h_1(a_1) = \cdots = h_{k-1}(a_{k-1}) = b] \\
&= \sum_{a \in \Psi} \sum_{b \in B} (M^{-1})^{k-1} \\
&= |\Psi|\, |B|\, M^{-(k-1)}.
\end{aligned}
\]

This proves the lemma.

Lemma 5.13. $\mathbb{E}[Y] \le |B| \max_{R, Q} 2^{N H(Q) + f(N)} M^{-(k-1) - r(R)}$.

Proof. Let

\[ C = \{(a, a') \in \Psi^2 : a \ne a',\ \exists i \in [k] : a_i = a'_i\}. \]

Let $(a, a') \in C$. The following statements are equivalent:

\[ (a, a') \in C' \tag{5.15} \]

\[ a, a' \in \Psi' \tag{5.16} \]

\[ \forall i \in [k] : \ h_i(a_i), h_i(a'_i) \in B \tag{5.17} \]

\[ \exists b \in B : \ h_1(a_1) = \cdots = h_k(a_k) = h_1(a'_1) = \cdots = h_k(a'_k) = b. \tag{5.18} \]

Therefore

\[
\begin{aligned}
\mathbb{E}[Y] &= \sum_{(a, a') \in C} \mathbb{P}[(a, a') \in C'] \\
&= \sum_{(a, a') \in C} \sum_{b \in B} \mathbb{P}[h_1(a_1) = \cdots = h_k(a_k) = h_1(a'_1) = \cdots = h_k(a'_k) = b].
\end{aligned}
\]

Let $(a, a') \in C$. Then $h_i(a_i)$ and $h_i(a'_i)$ are $(\mathbb{Z}/M\mathbb{Z})$-linear combinations of $u_1, \ldots, u_{k-1}, v_1, \ldots, v_N$. The random variable

\[ \bigl( h_1(a_1), \ldots, h_k(a_k), h_1(a'_1), \ldots, h_k(a'_k) \bigr) \]

is uniformly distributed over the image subspace $V \subseteq (\mathbb{Z}/M\mathbb{Z})^{2k}$ (Lemma 5.9). Let $b \in B$. Then $(b, \ldots, b) \in V$, since $u_1 = \cdots = u_{k-1} = b$, $v_1 = \cdots = v_N = 0$ is a valid assignment. Therefore

\[ \mathbb{P}[h_1(a_1) = \cdots = h_k(a_k) = h_1(a'_1) = \cdots = h_k(a'_k) = b] = |V|^{-1}. \]

And $|V|$ equals $M$ to the power the rank of the matrix

\[
\begin{pmatrix}
1 & 0 & \cdots & 0 & \frac{1}{k-1} & 1 & 0 & \cdots & 0 & \frac{1}{k-1} \\
0 & 1 & & 0 & \frac{1}{k-1} & 0 & 1 & & 0 & \frac{1}{k-1} \\
\vdots & & \ddots & \vdots & \vdots & \vdots & & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & 1 & \frac{1}{k-1} & 0 & 0 & \cdots & 1 & \frac{1}{k-1} \\
a_1 & a_2 & \cdots & a_{k-1} & -\frac{a_k}{k-1} & a'_1 & a'_2 & \cdots & a'_{k-1} & -\frac{a'_k}{k-1}
\end{pmatrix}
\tag{5.19}
\]

over $\mathbb{Z}/M\mathbb{Z}$, with $a_1, \ldots, a_k, a'_1, \ldots, a'_k$ thought of as column vectors in $(\mathbb{Z}/M\mathbb{Z})^N$. With column operations we transform (5.19) into

\[
\begin{pmatrix}
0 & 0 & \cdots & 0 & 0 & 1 & 0 & \cdots & 0 & 0 \\
0 & 0 & \cdots & 0 & 0 & 0 & 1 & & 0 & 0 \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & 0 & 0 & 0 & 0 & \cdots & 1 & 0 \\
a_1 - a'_1 & a_2 - a'_2 & \cdots & a_{k-1} - a'_{k-1} & a_k - a'_k & a'_1 & a'_2 & \cdots & a'_{k-1} & 0
\end{pmatrix}.
\tag{5.20}
\]

Matrix (5.20) has rank equal to $k - 1$ plus $r_M(a, a') = \operatorname{rk}(A(a, a'))$, where

\[ A(a, a') = \begin{pmatrix} a_1 - a'_1 & a_2 - a'_2 & \cdots & a_k - a'_k \end{pmatrix}. \]

We obtain

\[ \mathbb{E}[Y] \le \sum_{(a, a') \in C} \sum_{b \in B} M^{-(k - 1 + r_M(a, a'))}. \]

Since the summands are independent of $b$, we get

\[ \mathbb{E}[Y] \le |B| \sum_{(a, a') \in C} M^{-(k - 1 + r_M(a, a'))}. \]

Let $(a, a') \in C$. Consider the rows of $A(a, a')$. The $N$ rows are of the form $x_i - y_i$ with $(x_i, y_i) \in \Phi^2$. Let $s = ((x_1, y_1), \ldots, (x_N, y_N))$. Let $R = \{(x_1, y_1), \ldots, (x_N, y_N)\}$. We have $r_M(a, a') = r_M(R)$ and $r_M(R) = r(R)$ by (5.8). Let $Q$ be the $N$-type with $\operatorname{supp}(Q) = R$ and $s \in T^N_Q$. From $a \ne a'$ follows $R \not\subseteq \{(x, x) : x \in \Phi\}$. From $\exists i \in [k] : a_i = a'_i$ follows $\exists i \in [k] : R \subseteq \{(x, y) : x_i = y_i\}$. From $a, a' \in T^N_{P_1} \times \cdots \times T^N_{P_k}$ follows $Q_i = Q_{k+i} = P_i$ for all $i \in [k]$. We thus have

\[ \mathbb{E}[Y] \le |B| \sum_{R \in \mathcal{R}(\Phi)} \ \sum_{\substack{Q \in \mathcal{Q}(R, (P_1, \ldots, P_k)) \\ \operatorname{supp}(Q) = R \\ Q \text{ is } N\text{-type}}} \ \sum_{s \in T^N_Q} M^{-(k - 1 + r(R))}. \]

The number of $N$-types $Q$ with $\operatorname{supp}(Q) = R$ is at most the number of $N$-types on $R$, which is at most $\binom{N + |R| - 1}{|R| - 1}$ (Lemma 4.19). For any $Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))$, $|T^N_Q| \le 2^{N H(Q)}$ (Lemma 4.19). Therefore

\[ \mathbb{E}[Y] \le |B| \sum_{R \in \mathcal{R}(\Phi)} \binom{N + |R| - 1}{|R| - 1} \max_{Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))} 2^{N H(Q)} M^{-(k - 1 + r(R))}. \]

Also $|\mathcal{R}(\Phi)| \le 2^{|\Phi|^2}$. Therefore

\[ \mathbb{E}[Y] \le |B|\, 2^{|\Phi|^2} \max_{R \in \mathcal{R}(\Phi)} \binom{N + |R| - 1}{|R| - 1} \max_{Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))} 2^{N H(Q)} M^{-(k - 1 + r(R))}. \]

We conclude that

\[ \mathbb{E}[Y] \le |B| \max_{R, Q} 2^{N H(Q) + f(N)} M^{-(k-1) - r(R)}. \]

This proves the lemma.

5.2.2 Computational remarks

The following two lemmas are helpful when applying Theorem 5.7. We leave the proofs to the reader.

Lemma 5.14. Let $P \in \mathcal{P}(\Phi)$. Let $R, R' \in \mathcal{R}(\Phi)$ with $R \subseteq R'$ and $r(R) = r(R')$. Then

\[ \max_{Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))} \frac{H(Q) - H(P)}{r(R)} \le \max_{Q \in \mathcal{Q}(R', (P_1, \ldots, P_k))} \frac{H(Q) - H(P)}{r(R')}. \]

Lemma 5.15. Let $R \in \mathcal{R}(\Phi)$. There is an equivalence relation $R' \in \mathcal{R}(\Phi)$ with $R \subseteq R'$ and $r(R) = r(R')$.

5.2.3 Examples: type sets

We discuss some examples. The first example we will use to get good upper bounds on the asymptotic rank of complete graph tensors in Section 5.5. We focus on one family of examples that is parametrised by partitions. Let $\lambda \vdash k$ be an integer partition of $k$ with $d$ parts. Let

\[ \Phi_\lambda = \{a \in \{0, 1, \ldots, d-1\}^k : \operatorname{type}(a) = \lambda\}. \]

The set $\Phi_\lambda$ is tight.

Theorem 5.16. $\log_2 \underset{\sim}{Q}(\Phi_{(2,2)}) = 1$.

Proof. Let $\Phi = \Phi_{(2,2)}$. Clearly $\underset{\sim}{Q}(\Phi) \le 2$. After relabelling, $\forall a \in \Phi : \sum_{i=1}^{k} a_i = 0$. We may thus apply Theorem 5.7. Let $P$ be the uniform probability distribution on $\Phi$. Then $H(P) = \log_2 6$.

Let $R \in \mathcal{R}(\Phi)$. We may assume that

\[ R \subseteq \{(1,1,0,0), (1,0,1,0), (1,0,0,1)\}^2 \cup \{(0,0,1,1), (0,1,0,1), (0,1,1,0)\}^2. \]

We may assume $R$ is an equivalence relation (Lemma 5.15). Let $(x, y) \in R$. Let $R' = R \cup \{((1,1,1,1) - x, (1,1,1,1) - y)\}$. Then $R \subseteq R'$ and $R' \in \mathcal{R}(\Phi)$ and $r(R) = r(R')$. We may thus assume that if $(x, y) \in R$, then also $((1,1,1,1) - x, (1,1,1,1) - y) \in R$ (Lemma 5.14).

Let $S = \{(1,1,0,0), (1,0,1,0), (1,0,0,1)\}$. By the above observation it suffices to consider equivalence relations on $S$. There are three types of such equivalence relations.

Type (3): all three elements of $S$ are equivalent. Then $|R| = 18$ and $r(R) = 2$.

Type (2, 1): two elements of $S$ are equivalent and inequivalent to the third element (which is equivalent to itself). Then $|R| = 10$ and $r(R) = 1$.

Type (1, 1, 1): all elements of $S$ are inequivalent. Then $R \subseteq \{(x, x) : x \in \Phi\}$, which is a contradiction.

For type (3) and (2, 1), the uniform probability distribution $Q$ on $R$ has marginals $Q_i = Q_{4+i} = P_i$ for $i \in [4]$. The uniform $Q$ is optimal. Then $H(Q) = \log_2 |R|$. Let $R_{(3)}$ and $R_{(2,1)}$ be equivalence relations of type (3) and (2, 1). Then

\[
\begin{aligned}
\log_2 \underset{\sim}{Q}(\Phi)
&\ge \min \Bigl\{ H(P) - \frac{2}{r(R_{(3)})} \bigl( \log_2 |R_{(3)}| - H(P) \bigr),\ H(P) - \frac{2}{r(R_{(2,1)})} \bigl( \log_2 |R_{(2,1)}| - H(P) \bigr) \Bigr\} \\
&= \min \Bigl\{ \log_2 6 - \tfrac{2}{2} (\log_2 18 - \log_2 6),\ \log_2 6 - \tfrac{2}{1} (\log_2 10 - \log_2 6) \Bigr\} \\
&= \min \bigl\{ 1,\ \log_2 \tfrac{54}{25} \bigr\} = 1.
\end{aligned}
\]

This proves the theorem.
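The two cases in the proof can be recomputed mechanically. The sketch below builds the relations of type (3) and type (2, 1) (closed under complementation, as in the proof), computes $|R|$ and the rank $r(R)$ over $\mathbb{Q}$ with exact fractions, and evaluates $H(P) - \frac{k-2}{r(R)}(\log_2 |R| - H(P))$ for the uniform $Q$. It takes the optimality of the uniform $Q$ from the proof as given; all function names are hypothetical.

```python
from fractions import Fraction
from math import log2


def rank_over_Q(rows):
    """Rank of an integer matrix over the rationals (Gaussian elimination)."""
    rows = [[Fraction(x) for x in r] for r in rows]
    rank = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(len(rows)):
            if i != rank and rows[i][c] != 0:
                f = rows[i][c] / rows[rank][c]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[rank])]
        rank += 1
    return rank


S = [(1, 1, 0, 0), (1, 0, 1, 0), (1, 0, 0, 1)]


def complement(x):
    return tuple(1 - c for c in x)


def relation_from_blocks(blocks):
    """Equivalence relation on S with the given blocks, together with its
    image under x -> (1,1,1,1) - x."""
    R = set()
    for block in blocks:
        for x in block:
            for y in block:
                R.add((x, y))
                R.add((complement(x), complement(y)))
    return R


def bound(R, k=4, H_P=log2(6)):
    """Return (value of the bound for uniform Q on R, r(R))."""
    diffs = [[xi - yi for xi, yi in zip(x, y)] for x, y in R]
    r = rank_over_Q(diffs)
    return H_P - (k - 2) * (log2(len(R)) - H_P) / r, r


R3 = relation_from_blocks([S])
R21 = relation_from_blocks([S[:2], S[2:]])
print(len(R3), bound(R3), len(R21), bound(R21))
```

The relabelling that makes all elements of $\Phi$ sum to zero shifts every coordinate by a constant, so it does not change the differences $x - y$ and hence does not affect $r(R)$.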

Theorem 5.17. $\log_2 \underset{\sim}{Q}(\Phi_{(0^{k-1} 1)}) = h(1/k)$.

Proof. We refer to [CVZ16].

With Srinivasan Arunachalam and Peter Vrana we have the following unpublished result.

Theorem 5.18. $\log_2 \underset{\sim}{Q}(\Phi_{(0^{k/2} 1^{k/2})}) = 1$.

5.3 Combinatorial degeneration method

In this section we extend the (higher-order) Coppersmith–Winograd method via a preorder called combinatorial degeneration. Suppose $\Psi \subseteq I_1 \times \cdots \times I_k$ is not tight, but has a tight subset $\Phi \subseteq \Psi$. In the rest of this section we focus on obtaining a lower bound on $\underset{\sim}{Q}(\Psi)$ via $\Phi$. This has an application in the context of tri-colored sum-free sets (Section 5.4.2), for example.

Definition 5.19 ([BCS97]). Let $\Phi \subseteq \Psi \subseteq I_1 \times \cdots \times I_k$. We say that $\Phi$ is a combinatorial degeneration of $\Psi$, and write $\Psi \trianglerighteq \Phi$, if there are maps $u_i : I_i \to \mathbb{Z}$ ($i \in [k]$) such that for all $\alpha \in I_1 \times \cdots \times I_k$: if $\alpha \in \Psi \setminus \Phi$, then $\sum_{i=1}^{k} u_i(\alpha_i) > 0$, and if $\alpha \in \Phi$, then $\sum_{i=1}^{k} u_i(\alpha_i) = 0$. Note that the maps $u_i$ need not be injective.

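Combinatorial degeneration gets its name from the following standard proposition, see e.g. [BCS97, Proposition 15.30].

Proposition 5.20. Let $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$. Let $\Psi = \operatorname{supp}(t)$. Let $\Phi \subseteq \Psi$ such that $\Psi \trianglerighteq \Phi$. Then $t \trianglerighteq t|_\Phi$.

Proposition 5.20 brings us only slightly closer to our goal. Namely, given $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ with $\Psi = \operatorname{supp}(t)$ and given $\Phi \subseteq \Psi$ such that $\Psi \trianglerighteq \Phi$, it follows directly from Proposition 5.20 that $t \trianglerighteq t|_\Phi$ and thus $\underset{\sim}{Q}(t) \ge \underset{\sim}{Q}(t|_\Phi)$. This, however, does not give us a lower bound on the combinatorial asymptotic subrank $\underset{\sim}{Q}(\Psi)$. The following theorem does. Our theorem extends a result in [KSS16].

Theorem 5.21. Let $\Phi \subseteq \Psi \subseteq I_1 \times \cdots \times I_k$. If $\Psi \trianglerighteq \Phi$, then

\[ \underset{\sim}{Q}(\Psi) \ge \underset{\sim}{Q}(\Phi). \]

Lemma 5.22. Let $\Phi \subseteq \Psi \subseteq I_1 \times \cdots \times I_k$. If $\Psi \trianglerighteq \Phi$, then $\underset{\sim}{Q}(\Psi) \ge Q(\Phi)$.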

Proof. Pick maps $u_i : I_i \to \mathbb{Z}$ such that

\[ \sum_{i=1}^{k} u_i(\alpha_i) = 0 \ \text{ for } \alpha \in \Phi, \qquad \sum_{i=1}^{k} u_i(\alpha_i) > 0 \ \text{ for } \alpha \in \Psi \setminus \Phi. \]

Let $D$ be a free diagonal in $\Phi$ with $|D| = Q(\Phi)$ and let

\[ w_i = \sum_{x \in D_i} u_i(x). \]

Let $n \in \mathbb{N}$ and define

\[ W_i = \Bigl\{ (x_1, \ldots, x_{n|D|}) \in I_i^{\times n|D|} : \sum_{j=1}^{n|D|} u_i(x_j) = n w_i \Bigr\}. \]

Then

\[ \Psi^{\times n|D|} \cap (W_1 \times \cdots \times W_k) = \Phi^{\times n|D|} \cap (W_1 \times \cdots \times W_k). \]

The inclusion $\supseteq$ is clear. To show $\subseteq$, let $(x_1, \ldots, x_k) \in \Psi^{\times n|D|} \cap (W_1 \times \cdots \times W_k)$. Write $x_i = (x_{i,1}, x_{i,2}, \ldots, x_{i,n|D|})$ and consider the $n|D| \times k$ matrix of evaluations

\[
\begin{pmatrix}
u_1(x_{1,1}) & u_2(x_{2,1}) & \cdots & u_k(x_{k,1}) \\
u_1(x_{1,2}) & u_2(x_{2,2}) & \cdots & u_k(x_{k,2}) \\
\vdots & \vdots & & \vdots \\
u_1(x_{1,n|D|}) & u_2(x_{2,n|D|}) & \cdots & u_k(x_{k,n|D|})
\end{pmatrix}.
\]

The sum of the $i$th column is $n w_i$ by definition of $W_i$, and $\sum_{i=1}^{k} n w_i = 0$ since $D \subseteq \Phi$ is a diagonal. The row sums are nonnegative by definition of the maps $u_1, \ldots, u_k$. We conclude that the row sums are zero. Therefore $(x_1, \ldots, x_k)$ is an element of $\Phi^{\times n|D|}$.

Since $D$ is a free diagonal in $\Phi$, $D^{\times n|D|}$ is a free diagonal in $\Phi^{\times n|D|}$ and also $D^{\times n|D|} \cap (W_1 \times \cdots \times W_k)$ is a free diagonal in $\Phi^{\times n|D|} \cap (W_1 \times \cdots \times W_k)$, which in turn is equal to $\Psi^{\times n|D|} \cap (W_1 \times \cdots \times W_k)$. Therefore $D^{\times n|D|} \cap (W_1 \times \cdots \times W_k)$ is also a free diagonal in $\Psi^{\times n|D|}$, i.e.

\[ Q(\Psi^{\times n|D|}) \ge |D^{\times n|D|} \cap (W_1 \times \cdots \times W_k)|. \]

In the set $D^{\times n|D|}$ consider the strings with uniform type, i.e. where all $|D|$ elements of $D$ occur exactly $n$ times. These are clearly in $W_1 \times \cdots \times W_k$ and their number is $\binom{n|D|}{n, \ldots, n}$. Therefore

\[ Q(\Psi^{\times n|D|}) \ge \binom{n|D|}{n, \ldots, n} = |D|^{n|D| - o(n)}, \]

which implies $\underset{\sim}{Q}(\Psi) = \lim_{n \to \infty} Q(\Psi^{\times n|D|})^{\frac{1}{n|D|}} \ge |D|$.

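Proof of Theorem 5.21. We have $\underset{\sim}{Q}(\Psi) = \lim_{n \to \infty} \underset{\sim}{Q}(\Psi^{\times n})^{1/n}$. It follows from Lemma 5.22 that

\[ \lim_{n \to \infty} \underset{\sim}{Q}(\Psi^{\times n})^{1/n} \ge \lim_{n \to \infty} Q(\Phi^{\times n})^{1/n}. \]

The right-hand side is $\underset{\sim}{Q}(\Phi)$.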

5.4 Cap sets

A subset $A \subseteq (\mathbb{Z}/3\mathbb{Z})^n$ is called a cap set if any line in $A$ is a point, a line being a triple of points of the form $(u, u + v, u + 2v)$. Until recently it was not known whether the maximal size of a cap set in $(\mathbb{Z}/3\mathbb{Z})^n$ grows like $3^{n - o(n)}$ or like $c^{n - o(n)}$ for some $c < 3$. Gijswijt and Ellenberg in [EG17], inspired by the work of Croot, Lev and Pach in [CLP17], settled this question, showing that $c \le 3(207 + 33\sqrt{33})^{1/3}/8 \approx 2.755$. Tao realised in [Tao16] that the cap set question may naturally be phrased as the problem of computing the size of the largest main diagonal in powers of the “cap set tensor” $\sum_\alpha e_{\alpha_1} \otimes e_{\alpha_2} \otimes e_{\alpha_3}$, where the sum is over $\alpha_1, \alpha_2, \alpha_3 \in \mathbb{F}_3$ with $\alpha_1 + \alpha_2 + \alpha_3 = 0$. Here main diagonal refers to a subset $A$ of the basis elements such that restricting the cap set tensor to $A \times A \times A$ gives the tensor $\sum_{v \in A} v \otimes v \otimes v$. We show that the cap set tensor is in the $\mathrm{GL}_3(\mathbb{F}_3)^{\times 3}$ orbit of the “reduced polynomial multiplication tensor”, which was studied in [Str91], and we show how recent results follow from this connection, using Theorem 5.21.

5.4.1 Reduced polynomial multiplication

Let $t_n$ be the tensor $\sum_\alpha e_{\alpha_1} \otimes e_{\alpha_2} \otimes e_{\alpha_3}$, where the sum is over $(\alpha_1, \alpha_2, \alpha_3)$ in $\{0, 1, \ldots, n-1\}^3$ such that $\alpha_1 + \alpha_2 = \alpha_3$. We call $t_n$ the reduced polynomial multiplication tensor, since $t_n$ is essentially the structure tensor of the algebra $\mathbb{F}[x]/(x^n)$ of univariate polynomials modulo the ideal generated by $x^n$. The support of $t_n$ equals

\[ \{(\alpha_1, \alpha_2, \alpha_3) \in \{0, \ldots, n-1\}^3 \mid \alpha_1 + \alpha_2 = \alpha_3\}, \]

which via $\alpha_3 \mapsto n - 1 - \alpha_3$ we may identify with the set

\[ \Phi_n = \{(\alpha_1, \alpha_2, \alpha_3) \in \{0, \ldots, n-1\}^3 \mid \alpha_1 + \alpha_2 + \alpha_3 = n - 1\}. \tag{5.21} \]

The support $\Phi_n$ is tight (cf. Example 5.1). Strassen proves in [Str91, Theorem 6.7], using Corollary 5.4, that $\underset{\sim}{Q}(t_n) = \underset{\sim}{Q}(\Phi_n) = z(n)$, where $z(n)$ is defined as

\[ z(n) = \frac{\gamma^n - 1}{\gamma - 1}\, \gamma^{-2(n-1)/3} \tag{5.22} \]

with $\gamma$ equal to the unique positive real solution of the equation $\frac{1}{\gamma - 1} - \frac{n}{\gamma^n - 1} = \frac{n-1}{3}$.

The following table contains values of $z(n)$ for small $n$. See also [Str91, Table 1].

n     z(n), rounded    z(n), exact
2     1.88988          3/2^{2/3} = 2^{h(1/3)}
3     2.75510          3(207 + 33\sqrt{33})^{1/3}/8
4     3.61072
5     4.46158
6     5.30973
7     6.15620
8     7.00155
9     7.84612
10    8.69012

In fact, [Str91, Theorem 6.7] says that the asymptotic spectrum of $t_n$ is completely determined by the support functionals, and that the possible values that the spectral points can take on $t_n$ form the closed interval $[z(n), n]$ (cf. Remark 2.21):

\[ X(\mathbb{N}[t_n]) = \{\zeta^\theta|_{\mathbb{N}[t_n]} : \theta \in \mathcal{P}([3])\}, \qquad \{\phi(t_n) : \phi \in X(\mathbb{N}[t_n])\} = [z(n), n]. \]
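The values $z(n)$ in the table can be reproduced numerically from (5.22). The sketch below solves the defining equation for $\gamma$ by bisection on an interval to the right of $1$; the bracketing interval $[1.001, 10]$, the iteration count and the function names are choices made for this illustration, not part of the source.

```python
def gamma_for(n):
    """Solve 1/(g - 1) - n/(g**n - 1) = (n - 1)/3 for g > 1 by bisection."""
    target = (n - 1) / 3

    def f(g):
        return 1 / (g - 1) - n / (g ** n - 1) - target

    lo, hi = 1.001, 10.0  # f(lo) > 0 > f(hi) for the small n considered here
    for _ in range(200):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def z(n):
    """The quantity z(n) from equation (5.22)."""
    g = gamma_for(n)
    return (g ** n - 1) / (g - 1) * g ** (-2 * (n - 1) / 3)


for n in range(2, 11):
    print(n, round(z(n), 5))
```

For $n = 2$ the equation reduces to $1/(\gamma + 1) = 1/3$, so $\gamma = 2$ and $z(2) = 3/2^{2/3}$; for $n = 3$ it reduces to $2\gamma^2 - \gamma - 4 = 0$, which leads to the closed form in the table.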

5.4.2 Cap sets

We turn to cap sets.

Definition 5.23. A three-term progression-free set is a set $A \subseteq (\mathbb{Z}/m\mathbb{Z})^n$ satisfying the following: for all $(x_1, x_2, x_3) \in A^{\times 3}$, there are $u, v \in (\mathbb{Z}/m\mathbb{Z})^n$ such that $(x_1, x_2, x_3) = (u, u + v, u + 2v)$ if and only if $x_1 = x_2 = x_3$. Let $r_3((\mathbb{Z}/m\mathbb{Z})^n)$ be the size of the largest three-term progression-free set in $(\mathbb{Z}/m\mathbb{Z})^n$ and define the regularisation $\underset{\sim}{r}_3(\mathbb{Z}/m\mathbb{Z}) = \lim_{n \to \infty} r_3((\mathbb{Z}/m\mathbb{Z})^n)^{1/n}$.

A three-term progression-free set in $(\mathbb{Z}/3\mathbb{Z})^n$ is called a cap or cap set. We next discuss an asymmetric variation on three-term progression-free sets, called tri-colored sum-free sets, which are potentially larger. They are interesting since all known upper bound techniques for the size of three-term progression-free sets turn out to be upper bounds on the size of tri-colored sum-free sets.

Definition 5.24. Let $G$ be an abelian group. Let $\Gamma \subseteq G \times G \times G$. For $i \in [3]$ we define the marginal sets $\Gamma_i = \{x \in G : \exists \alpha \in \Gamma,\ \alpha_i = x\}$. We say $\Gamma$ is tri-colored sum-free if the following holds: the set $\Gamma$ is a diagonal, and for any $\alpha \in \Gamma_1 \times \Gamma_2 \times \Gamma_3$, $\alpha_1 + \alpha_2 + \alpha_3 = 0$ if and only if $\alpha \in \Gamma$. (Recall that $\Gamma \subseteq I_1 \times I_2 \times I_3$ is a diagonal when any two distinct $\alpha, \beta \in \Gamma$ are distinct in all coordinates.) Let $s_3(G)$ be the size of the largest tri-colored sum-free set in $G \times G \times G$ and define the regularisation $\underset{\sim}{s}_3(G) = \lim_{n \to \infty} s_3(G^{\times n})^{1/n}$.

Equivalently, $\Gamma \subseteq G \times G \times G$ is a tri-colored sum-free set if and only if $\Gamma$ is a free diagonal in $\{\alpha \in G \times G \times G : \alpha_1 + \alpha_2 + \alpha_3 = 0\}$.

54 Cap sets 83

If the set A sube G = (ZmZ)n is three-term progression-free then the setΓ = (a aminus2a) a isin A sube G times G times G is tri-colored sum-free Therefore wehave ˜r3(ZmZ) le ˜s3(ZmZ)
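The passage from a progression-free set to a tricolored sum-free set can be checked by brute force on a small example. The sketch below is illustrative only: the helper names and the choice of the four-point cap in $(\mathbb{Z}/3\mathbb{Z})^2$ are assumptions made for this example, and the sets are assumed to be nonempty.

```python
from itertools import product

def add(x, y, m):
    return tuple((a + b) % m for a, b in zip(x, y))

def scale(c, x, m):
    return tuple((c * a) % m for a in x)

def is_progression_free(A, m):
    """Only constant triples (x, x, x) in A^3 have the form (u, u+v, u+2v)."""
    for x1, x2, x3 in product(A, repeat=3):
        # (x1, x2, x3) = (u, u+v, u+2v) for some u, v iff x1 + x3 = 2*x2.
        is_progression = add(x1, x3, m) == scale(2, x2, m)
        if is_progression != (x1 == x2 == x3):
            return False
    return True

def is_tricolored_sum_free(Gamma, m):
    """Gamma is a diagonal, and a triple of marginals sums to 0 iff it lies in Gamma."""
    Gamma = set(Gamma)
    marginals = [{alpha[i] for alpha in Gamma} for i in range(3)]
    if any(len(marginal) != len(Gamma) for marginal in marginals):
        return False  # two distinct elements agree in some coordinate
    zero = tuple(0 for _ in next(iter(Gamma))[0])
    for a1, a2, a3 in product(*marginals):
        sums_to_zero = add(add(a1, a2, m), a3, m) == zero
        if sums_to_zero != ((a1, a2, a3) in Gamma):
            return False
    return True

# A cap of size 4 in (Z/3Z)^2 and the associated set {(a, a, -2a)}.
cap = [(0, 0), (0, 1), (1, 0), (1, 1)]
Gamma = [(a, a, scale(-2, a, 3)) for a in cap]
print(is_progression_free(cap, 3), is_tricolored_sum_free(Gamma, 3))
```

The whole line $\{0, 1, 2\} \subseteq \mathbb{Z}/3\mathbb{Z}$ is a simple negative example: it contains the progression $(0, 1, 2)$ and is rejected by `is_progression_free`.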

We summarise the recent history of results on cap sets. For clarity we focus on $m = 3$; we refer the reader to the references for the general results. Edel in [Ede04] proved the lower bound $2.21739 \le \tilde r_3(\mathbb{Z}/3\mathbb{Z})$. In [EG17] Ellenberg and Gijswijt proved the upper bound

$$\tilde r_3(\mathbb{Z}/3\mathbb{Z}) \le 3(207 + 33\sqrt{33})^{1/3}/8 \approx 2.755.$$

Blasiak et al. [BCC+17] proved that in fact

$$\tilde s_3(\mathbb{Z}/3\mathbb{Z}) \le 3(207 + 33\sqrt{33})^{1/3}/8.$$

This upper bound was shown to be an equality in [KSS16, Nor16, Peb16].

Theorem 5.25. $\tilde s_3(\mathbb{Z}/3\mathbb{Z}) = 3(207 + 33\sqrt{33})^{1/3}/8$.

We reprove Theorem 5.25 by proving that $\tilde s_3(\mathbb{Z}/m\mathbb{Z})$ equals the asymptotic subrank $z(m)$ of $t_m$ discussed in Section 5.4.1 when $m$ is a prime power. The significance of our proof lies in the explicit connection to the framework of asymptotic spectra, and not in the obtained value, which also for prime powers $m$ was already computed in [BCC+17, KSS16, Nor16, Peb16].

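Proof. We will prove $\tilde s_3(\mathbb{Z}/m\mathbb{Z}) = z(m)$ when $m$ is a prime power. By definition, $\tilde s_3(\mathbb{Z}/m\mathbb{Z})$ equals the asymptotic subrank of the set

$$\{\alpha \in \{0, \ldots, m-1\}^3 : \alpha_1 + \alpha_2 + \alpha_3 = 0 \bmod m\},$$

which via $\alpha_3 \mapsto \alpha_3 - (m-1)$ we may identify with the set

$$\Psi_m = \{\alpha \in \{0, \ldots, m-1\}^3 : \alpha_1 + \alpha_2 + \alpha_3 = m-1 \bmod m\},$$

and so $\tilde s_3(\mathbb{Z}/m\mathbb{Z}) = \tilde Q(\Psi_m)$. Let

$$\Phi_m = \{\alpha \in \{0, \ldots, m-1\}^3 : \alpha_1 + \alpha_2 + \alpha_3 = m-1\}.$$

We know $\tilde Q(\Phi_m) = z(m)$ (Section 5.4.1). We will show that $\tilde Q(\Phi_m) = \tilde Q(\Psi_m)$ when $m$ is a prime power. This proves the theorem.

We prove $\tilde Q(\Phi_m) \le \tilde Q(\Psi_m)$. There is a combinatorial degeneration $\Phi_m \trianglelefteq \Psi_m$. Indeed, let $u_i : \{0, \ldots, m-1\} \to \{0, \ldots, m-1\}$ be the identity map. If $\alpha \in \Phi_m$, then $\sum_{i=1}^3 u_i(\alpha_i) = m-1$, and if $\alpha \in \Psi_m \setminus \Phi_m$, then $\sum_{i=1}^3 u_i(\alpha_i)$ equals $m-1$ plus a positive multiple of $m$. This means Theorem 5.21 applies and we thus obtain $\tilde Q(\Phi_m) \le \tilde Q(\Psi_m)$. This proves the claim.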

We show $\tilde Q(\Psi_m) \le \tilde Q(\Phi_m)$ when $m$ is a power of the prime $p$. Let $\mathbb{F} = \mathbb{F}_p$. Let $f_m \in \mathbb{F}^m \otimes \mathbb{F}^m \otimes \mathbb{F}^m$ have support $\Psi_m$ with all nonzero coefficients equal to 1. Obviously, $\tilde Q(\Psi_m) \le \tilde Q(f_m)$. To compute $\tilde Q(f_m)$, we show that there is a basis in which the support of $f_m$ equals the tight set $\Phi_m$. Then $\tilde Q(f_m) = \tilde Q(\Phi_m)$ (Corollary 5.4). This implies the claim. We prepare to give the basis (which is the same basis as used in [BCC+17]). First observe that the rule $x \mapsto \binom{x}{a}$ gives a well-defined map $\mathbb{Z}/m\mathbb{Z} \to \mathbb{Z}/p\mathbb{Z}$, since for $a \in \{0, 1, \ldots, m-1\}$, if $x = y \bmod m$, then $\binom{x}{a} = \binom{y}{a} \bmod p$ by Lucas' theorem. Let $(e_x)_x$ be the standard basis of $\mathbb{F}^m$. The elements $\bigl(\sum_{x \in \mathbb{Z}/m\mathbb{Z}} \binom{x}{a} e_x\bigr)_{a \in \mathbb{Z}/m\mathbb{Z}}$ form a basis of $\mathbb{F}^m$, since the matrix $\bigl(\binom{x}{a}\bigr)_{a,x}$ is upper triangular with ones on the diagonal. We will now rewrite $f_m$ in the basis $\bigl((\sum_x \binom{x}{a} e_x)_a, (\sum_y \binom{y}{b} e_y)_b, (\sum_z \binom{z}{c} e_z)_c\bigr)$. Observe that $\binom{x}{m-1}$ equals 1 if and only if $x$ equals $m-1$ (and equals 0 otherwise), and hence

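$$f_m = \sum_{\substack{x,y,z \in \mathbb{Z}/m\mathbb{Z}:\\ x+y+z = m-1}} e_x \otimes e_y \otimes e_z = \sum_{x,y,z \in \mathbb{Z}/m\mathbb{Z}} \binom{x+y+z}{m-1}\, e_x \otimes e_y \otimes e_z.$$

The identity $\binom{x+y+z}{w} = \sum \binom{x}{a}\binom{y}{b}\binom{z}{c}$, with sum over $a, b, c \in \{0, 1, \ldots, m-1\}$ such that $a + b + c = w$, is true (this is Vandermonde's identity, for $w \le m-1$), and thus

$$\sum_{x,y,z \in \mathbb{Z}/m\mathbb{Z}} \binom{x+y+z}{m-1}\, e_x \otimes e_y \otimes e_z = \sum_{x,y,z \in \mathbb{Z}/m\mathbb{Z}} \; \sum_{\substack{a,b,c \in \{0,1,\ldots,m-1\}:\\ a+b+c = m-1}} \binom{x}{a}\binom{y}{b}\binom{z}{c}\, e_x \otimes e_y \otimes e_z. \tag{5.23}$$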

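We may simply rewrite (5.23) as

$$\sum_{\substack{a,b,c \in \{0,1,\ldots,m-1\}:\\ a+b+c = m-1}} \Bigl(\sum_{x \in \mathbb{Z}/m\mathbb{Z}} \binom{x}{a} e_x\Bigr) \otimes \Bigl(\sum_{y \in \mathbb{Z}/m\mathbb{Z}} \binom{y}{b} e_y\Bigr) \otimes \Bigl(\sum_{z \in \mathbb{Z}/m\mathbb{Z}} \binom{z}{c} e_z\Bigr).$$

Therefore, with respect to the basis $\bigl((\sum_x \binom{x}{a} e_x)_a, (\sum_y \binom{y}{b} e_y)_b, (\sum_z \binom{z}{c} e_z)_c\bigr)$, the support of $f_m$ equals the tight set $\Phi_m$. (And even stronger, $f_m$ is isomorphic to the tensor $\mathbb{F}[x]/(x^m)$ of Section 5.4.1.)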
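The change of basis in this proof amounts to a coefficient identity that can be verified by brute force for small prime powers. The sketch below checks that, modulo $p$, the coefficient of $e_x \otimes e_y \otimes e_z$ in $\sum_{a+b+c=m-1} \bigl(\sum_x \binom{x}{a} e_x\bigr) \otimes \bigl(\sum_y \binom{y}{b} e_y\bigr) \otimes \bigl(\sum_z \binom{z}{c} e_z\bigr)$ is the indicator of $x + y + z = m-1 \bmod m$. The function name `support_matches_phi` is a hypothetical helper introduced for this illustration.

```python
from itertools import product
from math import comb

def support_matches_phi(m, p):
    """Check that sum_{a+b+c=m-1} C(x,a) C(y,b) C(z,c) mod p is the
    indicator of x + y + z = m-1 mod m, for all x, y, z in {0, ..., m-1}."""
    for x, y, z in product(range(m), repeat=3):
        coefficient = sum(
            comb(x, a) * comb(y, b) * comb(z, m - 1 - a - b)
            for a in range(m)
            for b in range(m - a)  # keeps c = m-1-a-b nonnegative
        ) % p
        expected = 1 if (x + y + z) % m == m - 1 else 0
        if coefficient != expected:
            return False
    return True

for m, p in [(3, 3), (4, 2), (5, 5), (9, 3)]:
    print(m, p, support_matches_phi(m, p))
```

The prime power assumption matters: for $m = 6$ and $p = 2$ the check fails, because $x + y + z = 11$ satisfies the congruence while $\binom{11}{5} = 462$ is even.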

Remark 5.26. Why did we reprove the cap set result Theorem 5.25? Our motivation, being interested in the asymptotic spectrum of tensors, was to see if the techniques in the cap set papers are stronger than the Strassen support functionals, i.e. whether they give any new spectral points. Above we have seen that the cap set result itself can be proven with the support functionals. In fact, we show in Section 4.6 that for oblique tensors the asymptotic slice-rank, which was introduced in [Tao16] to give a concise proof of [EG17], equals the minimum value over the support functionals. In Section 6.11 we show that for all complex tensors, asymptotic slice-rank equals the minimum value of the quantum functionals.

5.5 Graph tensors

In this section we briefly discuss the application that motivated us to prove Theorem 5.7 in [CVZ16], namely upper bounding the asymptotic rank of so-called graph tensors. Graph tensors are defined as follows.

Let $G = (V, E)$ be a graph (or hypergraph) with vertex set $V$ and edge set $E$. Let $n \in \mathbb{N}$. Let $(b_i)_{i \in [n]}$ be the standard basis of $\mathbb{F}^n$. We define the graph tensor $T_n(G)$ as

$$T_n(G) = \sum_{i \in [n]^E} \bigotimes_{v \in V} \Bigl(\bigotimes_{e \in E:\, v \in e} b_{i_e}\Bigr),$$

seen as a $|V|$-tensor. Given a vertex $v \in V$, let $d(v)$ denote the degree of $v$, that is, $d(v)$ equals the number of edges $e \in E$ that contain $v$. Then $T_n(G)$ is naturally in $\bigotimes_{v \in V} (\mathbb{F}^n)^{\otimes d(v)}$. We write $T(G)$ for $T_2(G)$. For example, for the complete graph on four vertices $K_4$, the graph tensor is the tensor product of the graph tensors of the six edges of $K_4$ (the pictograms of $K_4$ and of its edges are omitted here), that is,

$$T(K_4) = \sum_{i \in \{0,1\}^6} (b_{i_1} \otimes b_{i_2} \otimes b_{i_5}) \otimes (b_{i_2} \otimes b_{i_3} \otimes b_{i_6}) \otimes (b_{i_3} \otimes b_{i_4} \otimes b_{i_5}) \otimes (b_{i_1} \otimes b_{i_4} \otimes b_{i_6}),$$

living in $(\mathbb{C}^8)^{\otimes 4}$. Let $K_k$ be the complete graph on $k$ vertices. The $2 \times 2$ matrix multiplication tensor $\langle 2, 2, 2\rangle$ equals the tensor $T(K_3)$. Define the exponent $\omega(T(G)) = \log_2 \tilde R(T(G))$. We study the exponent per edge $\tau(T(G)) = \omega(T(G))/|E(G)|$.
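To make the definition of $T_n(G)$ concrete, the following sketch enumerates the support of a graph tensor: one term per assignment $i \in [n]^E$, with each vertex receiving the indices of its incident edges. The function names and the encoding of a basis element as a tuple of edge indices are assumptions made for this illustration.

```python
from itertools import combinations, product

def graph_tensor_support(vertices, edges, n=2):
    """Support of T_n(G): one basis term per assignment i in [n]^E.

    Each term is a tuple with one entry per vertex; the entry for v lists
    the indices i_e of the edges e containing v (in the order of `edges`).
    """
    terms = set()
    for assignment in product(range(n), repeat=len(edges)):
        term = tuple(
            tuple(i for e, i in zip(edges, assignment) if v in e)
            for v in vertices
        )
        terms.add(term)
    return terms

def complete_graph(k):
    vertices = list(range(k))
    return vertices, list(combinations(vertices, 2))

k3 = graph_tensor_support(*complete_graph(3))
k4 = graph_tensor_support(*complete_graph(4))
print(len(k3), len(k4))
```

For $K_3$ every vertex carries two binary indices, so the tensor lives in $(\mathbb{F}^4)^{\otimes 3}$ and has $2^3$ terms, matching the matrix multiplication tensor $\langle 2, 2, 2\rangle$. For $K_4$ every vertex carries three binary indices, matching the local dimension 8 above.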

Our result is an upper bound on $\tau(T(K_4))$ in terms of the combinatorial asymptotic subrank $\tilde Q(\Phi_{(2,2)})$, which we studied in Theorem 5.16.

Theorem 5.27. For any $q \ge 1$,

$$\tau(T(K_4)) \le \log_q\Bigl(\frac{q+2}{\tilde Q(\Phi_{(2,2)})}\Bigr).$$

Proof. We apply a generalisation of the laser method. See [CVZ16].

Corollary 5.28. Let $k \ge 4$. Then $\tau(T(K_k)) \le 0.772943$.

Proof. In the bound of Theorem 5.27 we plug in the value $\tilde Q(\Phi_{(2,2)}) = 2$ from Theorem 5.16. Then we optimise over $q$ to obtain the value $0.772943$. By a "covering argument" we can show that $\tau(T(K_k))$ is non-increasing when $k$ increases.
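The optimisation over $q$ in this proof is a one-line computation. The sketch below evaluates $\log_q((q+2)/2)$, assuming that $q$ ranges over the integers (an assumption made here; $q = 1$ is skipped because the base-1 logarithm is undefined) and that a search up to $q = 100$ suffices.

```python
from math import log

def bound(q, subrank=2.0):
    """Right-hand side log_q((q + 2) / subrank) of the bound on tau(T(K_4))."""
    return log((q + 2) / subrank) / log(q)

best_q = min(range(2, 101), key=bound)
best_value = bound(best_q)
print(best_q, round(best_value, 6))
```

Under these assumptions the minimum is attained at $q = 7$, where the bound equals $\log_7(9/2)$, which rounds to the value $0.772943$ quoted in the corollary.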

For $k \ge 4$, Corollary 5.28 improves the upper bound $\tau(T(K_k)) \le 0.790955$ that can be derived from the well-known upper bound of Le Gall [LG14] on the exponent of matrix multiplication $\omega = \omega(T(K_3))$.

A standard "flattening argument" (i.e. using the gauge points from the asymptotic spectrum) yields the lower bound $\tau(T(K_k)) \ge \frac{1}{2}\frac{k}{k-1}$ if $k$ is even and $\tau(T(K_k)) \ge \frac{1}{2}\frac{k+1}{k}$ if $k$ is odd. As a consequence, if the exponent of matrix multiplication $\omega$ equals 2, then $\tau(T(K_4)) = \tau(T(K_3)) = \frac{2}{3}$. We raise the following question: is there a $k \ge 5$ such that $\tau(T(K_k)) < \frac{2}{3}$?

Tensor surgery; cycle graphs

For graph tensors given by sparse graphs, good upper bounds on the asymptotic rank can be obtained with an entirely different method called tensor surgery, which we introduced in [CZ18]. As an illustration, let me mention the results we obtained for cycle graphs with tensor surgery. Recall $\omega = \log_2 \tilde R(\langle 2, 2, 2\rangle) = \log_2 \tilde R(T(C_3))$. Let $\omega_k = \log_2 \tilde R(T(C_k))$. First observe that $\omega_k = k$ for even $k$. For odd $k$, trivially $k - 1 \le \omega_k \le k$. We prove the following.

Theorem 5.29. For $k, \ell$ odd, $\omega_{k+\ell-1} \le \omega_k + \omega_\ell$.

Corollary 5.30. Let $k \ge 5$ odd. Then $\omega_k \le \omega_{k-2} + \omega_3$ and thus $\omega_k \le \frac{k-1}{2}\omega$.

Corollary 5.31. If $\omega = 2$, then $\omega_k = k - 1$ for all odd $k$.

See [CZ18] for the proofs.

5.6 Conclusion

Tight tensors are a subfamily of the oblique tensors. For tight 3-tensors, the minimum over the support functionals equals the asymptotic subrank. This is proven via the Coppersmith–Winograd method. The construction is in fact of a very combinatorial nature. In this chapter we studied the combinatorial notion of subrank. We proved that combinatorial subrank is monotone under combinatorial degeneration. We studied the cap set problem via the support functionals. We extended the Coppersmith–Winograd method to higher-order tensors and applied this method to study graph tensors.

Chapter 6

Universal points in the asymptotic spectrum of tensors; entanglement polytopes, moment polytopes

This chapter is based on joint work with Matthias Christandl and Peter Vrana [CVZ18].

6.1 Introduction

In Chapter 4, following Strassen, we introduced the asymptotic spectrum of tensors $X(\mathcal{T}) = X(\mathcal{T}, \leqslant)$ for $\mathcal{T}$ the semiring of $k$-tensors over $\mathbb{F}$ for some fixed integer $k$ and field $\mathbb{F}$, with addition given by direct sum $\oplus$, multiplication given by tensor product $\otimes$, and preorder $\leqslant$ given by restriction (or degeneration). The asymptotic spectrum characterises the asymptotic rank $\tilde R$ and the asymptotic subrank $\tilde Q$. We have seen that the asymptotic rank plays an important role in algebraic complexity theory: the asymptotic rank of the matrix multiplication tensor $\langle 2, 2, 2\rangle = \sum_{i,j,k \in [2]} e_{ij} \otimes e_{jk} \otimes e_{ki} \in \mathbb{F}^4 \otimes \mathbb{F}^4 \otimes \mathbb{F}^4$ characterises the exponent of the arithmetic complexity of multiplying two $n \times n$ matrices over $\mathbb{F}$, that is, $\tilde R(\langle 2, 2, 2\rangle) = 2^\omega$. We have also seen in Chapter 5 how one may use the asymptotic subrank to upper bound the size of combinatorial objects like, for example, cap sets in $\mathbb{F}_3^n$.

New results in this chapter

So far, the only elements we have seen in $X(\mathcal{T})$ (i.e. universal spectral points, cf. Section 2.13) are the gauge points (Section 4.3). Besides that, we have seen in Section 4.4 that the Strassen support functionals $\zeta^\theta$ are in $X(\{\text{oblique}\})$. In this chapter we introduce for the first time an explicit infinite family of universal spectral points (over the complex numbers), the quantum functionals. Our new insight is to use the moment polytope. Given a tensor $t \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$, the moment polytope $P(t)$ is a convex polytope that carries representation-theoretic information about $t$. The quantum functionals are defined as maximisations over moment polytopes.

Let me immediately put a disclaimer. The quantum functionals do not give a new lower bound on the asymptotic rank of matrix multiplication $\langle 2, 2, 2\rangle$; namely, the quantum functionals give the same lower bound as the gauge points. Also, the quantum functionals being defined for tensors over complex numbers only, we do not expect to get new upper bounds on the size of combinatorial objects that are "like cap sets".

So what have we gained? Arguably, we have found the "right" viewpoint on how to construct universal spectral points for tensors. (In fact, after writing our paper [CVZ18] we realised that Strassen had begun a study of moment polytopes in the appendix of the German survey [Str05]. Strassen did not construct new universal spectral points, however, not in that publication at least.) If there are more universal spectral points, then our viewpoint may lead the way to finding them. Moreover, whereas no efficient algorithm is known for evaluating the support functionals, the moment polytope viewpoint may open the way to having efficient algorithms for evaluating the quantum functionals.

In Sections 6.2–6.7 we work towards the construction of the quantum functionals and we give a proof that they are universal spectral points. In Sections 6.8–6.10 we compare the quantum functionals and the support functionals, and in Section 6.11 we relate asymptotic slice rank to the quantum functionals.

In this chapter we will focus on 3-tensors, but the theory naturally generalises to $k$-tensors.

6.2 Schur–Weyl duality

For background on representation theory we refer to [Kra84], [Ful97] and [GW09]. Let $S_n$ be the symmetric group on $n$ symbols. Let $S_n$ act on the tensor space $(\mathbb{C}^d)^{\otimes n}$ by permuting the tensor legs,

$$\pi \cdot v_1 \otimes \cdots \otimes v_n = v_{\pi^{-1}(1)} \otimes \cdots \otimes v_{\pi^{-1}(n)}, \quad \pi \in S_n.$$

Let $\mathrm{GL}_d$ be the general linear group of $\mathbb{C}^d$. Let $\mathrm{GL}_d$ act on $(\mathbb{C}^d)^{\otimes n}$ via the diagonal embedding $\mathrm{GL}_d \to \mathrm{GL}_d^{\times n} : g \mapsto (g, \ldots, g)$,

$$g \cdot v_1 \otimes \cdots \otimes v_n = (g v_1) \otimes \cdots \otimes (g v_n), \quad g \in \mathrm{GL}_d.$$

The actions of $S_n$ and $\mathrm{GL}_d$ commute, so we have a well-defined action of the product group $S_n \times \mathrm{GL}_d$ on $(\mathbb{C}^d)^{\otimes n}$. Schur–Weyl duality describes the decomposition of the space $(\mathbb{C}^d)^{\otimes n}$ into a direct sum of irreducible $S_n \times \mathrm{GL}_d$ representations. This decomposition is

$$(\mathbb{C}^d)^{\otimes n} \cong \bigoplus_{\lambda \vdash_d n} [\lambda] \otimes S_\lambda(\mathbb{C}^d) \tag{6.1}$$

with $[\lambda]$ an irreducible $S_n$ representation of type $\lambda$ and $S_\lambda(\mathbb{C}^d)$ an irreducible $\mathrm{GL}_d$-representation of type $\lambda$ when $\ell(\lambda) \le d$ and 0 when $\ell(\lambda) > d$. We use the notation $\lambda \vdash_d n$ for the partitions of $n$ with at most $d$ parts. Let

$$P_\lambda : (\mathbb{C}^d)^{\otimes n} \to (\mathbb{C}^d)^{\otimes n}$$

be the equivariant projector onto the isotypical component of type $\lambda$, i.e. onto the subspace of $(\mathbb{C}^d)^{\otimes n}$ isomorphic to $[\lambda] \otimes S_\lambda(\mathbb{C}^d)$. The projector $P_\lambda$ is given by the action of the group algebra element

$$P_\lambda = \Bigl(\frac{\dim[\lambda]}{n!}\Bigr)^2 \sum_{T \in \mathrm{Tab}(\lambda)} c_T \in \mathbb{C}[S_n],$$

where $\mathrm{Tab}(\lambda)$ is the set of Young tableaux of shape $\lambda$ filled with $[n]$, and with $c_T$ the Young symmetrizer

$$c_T = \sum_{\sigma \in C(T)} \mathrm{sgn}(\sigma)\,\sigma \sum_{\pi \in R(T)} \pi,$$

where $C(T), R(T) \subseteq S_n$ are the subgroups of permutations inside columns and permutations inside rows, respectively. The element $P_\lambda$ is a minimal central idempotent in $\mathbb{C}[S_n]$, and $\sum_{\lambda \vdash n} P_\lambda = e$.
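The formula for $P_\lambda$ can be tested directly in a small group algebra. The sketch below works in $\mathbb{C}[S_3]$ with exact rational arithmetic, builds $P_\lambda$ from the Young symmetrizers of all fillings, and checks the idempotent and completeness properties. The encoding of permutations as tuples on $\{0, \ldots, n-1\}$, the convention for composing them, and all function names are assumptions of this illustration; $\dim[\lambda]$ is obtained by counting standard fillings, a standard fact not derived in the text.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def compose(p, q):
    """Permutation p∘q on {0, ..., n-1}: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(p)))

def sign(p):
    inversions = sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))
    return -1 if inversions % 2 else 1

def multiply(a, b):
    """Product in the group algebra; elements are dicts {permutation: coefficient}."""
    out = {}
    for p, x in a.items():
        for q, y in b.items():
            r = compose(p, q)
            out[r] = out.get(r, 0) + x * y
    return {p: c for p, c in out.items() if c != 0}

def rows_of(filling, shape):
    rows, start = [], 0
    for length in shape:
        rows.append(filling[start:start + length])
        start += length
    return rows

def projector(shape):
    """(dim[shape]/n!)^2 times the sum of Young symmetrizers c_T over all fillings T."""
    n = sum(shape)
    group = list(permutations(range(n)))
    total, standard = {}, 0
    for filling in group:
        rows = rows_of(filling, shape)
        cols = [[r[j] for r in rows if len(r) > j] for j in range(shape[0])]
        row_of = {x: i for i, r in enumerate(rows) for x in r}
        col_of = {x: j for j, c in enumerate(cols) for x in c}
        # dim[shape] equals the number of standard fillings.
        if all(list(r) == sorted(r) for r in rows) and all(c == sorted(c) for c in cols):
            standard += 1
        R = {g: 1 for g in group if all(row_of[g[x]] == row_of[x] for x in range(n))}
        C = {g: sign(g) for g in group if all(col_of[g[x]] == col_of[x] for x in range(n))}
        for g, c in multiply(C, R).items():
            total[g] = total.get(g, 0) + c
    scale = Fraction(standard, factorial(n)) ** 2
    return {g: scale * c for g, c in total.items() if c != 0}

shapes = [(3,), (2, 1), (1, 1, 1)]
P = {shape: projector(shape) for shape in shapes}
print(all(multiply(P[s], P[s]) == P[s] for s in shapes))
```

For the shape $(2, 1)$ this yields $\frac{2}{3}e - \frac{1}{3}(c + c^2)$ with $c$ a 3-cycle, which agrees with the character formula for the central idempotent of the two-dimensional representation of $S_3$.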

Back to the decomposition of $(\mathbb{C}^d)^{\otimes n}$. We need a handle on the size of the components in the direct sum decomposition (6.1). For our application it is good to think of $d$ as a constant and $n$ as a large number. The number of summands in the direct sum decomposition (6.1) is upper bounded by a polynomial in $n$,

$$|\{\lambda \vdash_d n\}| \le (n+1)^d,$$

i.e. there are only few summands compared to the total dimension $d^n$. There are the following well-known bounds on the dimensions of the irreducible representations $[\lambda]$ and $S_\lambda(\mathbb{C}^d)$ that make up the summands:

$$\frac{n!}{\prod_{\ell=1}^d (\lambda_\ell + d - \ell)!} \le \dim[\lambda] \le \frac{n!}{\prod_{\ell=1}^d \lambda_\ell!} \tag{6.2}$$

$$\dim S_\lambda(\mathbb{C}^d) \le (n+1)^{d(d-1)/2}. \tag{6.3}$$

Let $p \in \mathbb{R}^n$ be a probability vector, i.e. $\sum_{i=1}^n p_i = 1$ and $p_i \ge 0$ for $i \in [n]$. Let $H(p)$ be the Shannon entropy of the probability vector $p$,

$$H(p) = \sum_{i=1}^n p_i \log_2 \frac{1}{p_i}.$$

For $\alpha \in [0, 1]$ let $h(\alpha) = H((\alpha, 1-\alpha))$ be the binary entropy. For a partition $\lambda = (\lambda_1, \ldots, \lambda_\ell) \vdash n$ let $\bar\lambda = \lambda/n = (\lambda_1/n, \ldots, \lambda_\ell/n)$ be the probability vector obtained by normalising $\lambda$.

Let $\lambda \vdash n$. For $N \in \mathbb{N}$ let $N\lambda = (N\lambda_1, N\lambda_2, \ldots, N\lambda_\ell)$ be the stretched partition. We see that, asymptotically in the stretching factor $N$, the dimension of $[N\lambda]$ behaves like a multinomial coefficient, and

$$2^{N n H(\bar\lambda) - o(N)} \le \dim[N\lambda] \le 2^{N n H(\bar\lambda)}. \tag{6.4}$$
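The bounds (6.2) and the asymptotics (6.4) are easy to probe numerically. The following sketch computes $\dim[\lambda]$ with the hook length formula (a standard fact that is assumed here, not derived in the text) and compares it with the two sides of (6.2) and with the entropy rate in (6.4). The function names are illustrative.

```python
from fractions import Fraction
from math import factorial, log2

def dim_specht(shape):
    """dim[shape] via the hook length formula."""
    shape = [part for part in shape if part > 0]
    n = sum(shape)
    hooks = 1
    for i, row in enumerate(shape):
        for j in range(row):
            arm = row - j - 1
            leg = sum(1 for r in shape[i + 1:] if r > j)
            hooks *= arm + leg + 1
    return factorial(n) // hooks

def entropy(p):
    return sum(-x * log2(x) for x in p if x > 0)

def check_62(shape, d):
    """Bounds (6.2): n!/prod (l_i + d - i)! <= dim <= n!/prod l_i!."""
    lam = list(shape) + [0] * (d - len(shape))
    n = sum(lam)
    lower, upper = Fraction(factorial(n)), Fraction(factorial(n))
    for i, part in enumerate(lam, start=1):
        lower /= factorial(part + d - i)
        upper /= factorial(part)
    return lower <= dim_specht(shape) <= upper

def rate(shape, N):
    """(1/(N n)) log2 dim[N*shape], which (6.4) says tends to H(shape/n)."""
    n = sum(shape)
    return log2(dim_specht([N * part for part in shape])) / (N * n)

print(check_62((2, 1), 2), check_62((3, 2, 1), 3))
print(rate((2, 1), 200), entropy([2 / 3, 1 / 3]))
```

For $\lambda = (2, 1)$ the rate stays below $H(\bar\lambda) = h(1/3)$ and approaches it as $N$ grows, in line with (6.4).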

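6.3 Kronecker and Littlewood–Richardson coefficients $g_{\lambda\mu\nu}$, $c^\lambda_{\mu\nu}$

Let $\mu, \nu \vdash n$. Let $S_n \to S_n \times S_n : \pi \mapsto (\pi, \pi)$ be the diagonal embedding. Consider the decomposition of the tensor product $[\mu] \otimes [\nu]$ restricted along the diagonal embedding,

$$[\mu] \otimes [\nu] \downarrow^{S_n \times S_n}_{S_n} \cong \bigoplus_{\lambda \vdash n} \mathrm{Hom}_{S_n}([\lambda], [\mu] \otimes [\nu]) \otimes [\lambda].$$

Define the Kronecker coefficient

$$g_{\lambda\mu\nu} = \dim \mathrm{Hom}_{S_n}([\lambda], [\mu] \otimes [\nu]),$$

i.e. $g_{\lambda\mu\nu}$ is the multiplicity of $[\lambda]$ in $[\mu] \otimes [\nu]$.

Let $\lambda$ be a partition with at most $a + b$ parts. Let $\mathrm{GL}_a \times \mathrm{GL}_b \to \mathrm{GL}_{a+b} : (A, B) \mapsto A \oplus B$ be the block-diagonal embedding. Consider the decomposition of the representation $S_\lambda(\mathbb{C}^{a+b})$ restricted along the block-diagonal embedding,

$$S_\lambda(\mathbb{C}^{a+b}) \downarrow^{\mathrm{GL}_{a+b}}_{\mathrm{GL}_a \times \mathrm{GL}_b} \cong \bigoplus_{\mu \vdash_a,\, \nu \vdash_b} H^\lambda_{\mu\nu} \otimes S_\mu(\mathbb{C}^a) \otimes S_\nu(\mathbb{C}^b)$$

with

$$H^\lambda_{\mu\nu} = \mathrm{Hom}_{\mathrm{GL}_a \times \mathrm{GL}_b}(S_\mu(\mathbb{C}^a) \otimes S_\nu(\mathbb{C}^b), S_\lambda(\mathbb{C}^{a+b})).$$

Define the Littlewood–Richardson coefficient $c^\lambda_{\mu\nu} = \dim H^\lambda_{\mu\nu}$.

For partitions $\lambda, \lambda'$ define $\lambda + \lambda'$ elementwise. The Kronecker and the Littlewood–Richardson coefficients have the following semigroup property (see e.g. [CHM07]).

Lemma 6.1. Let $\lambda, \mu, \nu, \alpha, \beta, \gamma$ be partitions.

(i) If $g_{\lambda\mu\nu} > 0$ and $g_{\alpha\beta\gamma} > 0$, then $g_{\lambda+\alpha,\, \mu+\beta,\, \nu+\gamma} > 0$.

(ii) If $c^\lambda_{\mu\nu} > 0$ and $c^\alpha_{\beta\gamma} > 0$, then $c^{\lambda+\alpha}_{\mu+\beta,\, \nu+\gamma} > 0$.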

6.4 Entropy inequalities

The semigroup properties imply the following lemma. Of this lemma, the first statement can be found in a paper by Christandl and Mitchison [CM06], while we do not know of any source that explicitly states the second statement. For the convenience of the reader we give the proofs of both statements.

Lemma 6.2. Let $\lambda, \mu, \nu$ be partitions.

(i) If $g_{\lambda\mu\nu} > 0$, then $H(\bar\lambda) \le H(\bar\mu) + H(\bar\nu)$.

(ii) If $c^\lambda_{\mu\nu} > 0$, then $H(\bar\lambda) \le \frac{|\mu|}{|\mu|+|\nu|} H(\bar\mu) + \frac{|\nu|}{|\mu|+|\nu|} H(\bar\nu) + h\bigl(\frac{|\mu|}{|\mu|+|\nu|}\bigr)$.

Proof. (i) Let $g_{\lambda\mu\nu} > 0$. Suppose $\lambda, \mu, \nu \vdash n$. Let $N \in \mathbb{N}$. Then Lemma 6.1 implies $g_{N\lambda, N\mu, N\nu} > 0$. This means $\mathrm{Hom}_{S_{nN}}([N\lambda], [N\mu] \otimes [N\nu]) \ne 0$, which implies $\dim[N\lambda] \le \dim[N\mu] \dim[N\nu]$. From (6.4) we have the dimension bounds

$$2^{N n H(\bar\lambda) - o(N)} \le \dim[N\lambda],$$
$$\dim[N\mu] \le 2^{N n H(\bar\mu)},$$
$$\dim[N\nu] \le 2^{N n H(\bar\nu)}.$$

Thus $N n H(\bar\lambda) - o(N) \le N n H(\bar\mu) + N n H(\bar\nu)$. Divide by $Nn$ and let $N$ go to infinity to get $H(\bar\lambda) \le H(\bar\mu) + H(\bar\nu)$.

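(ii) We restrict the decomposition

$$(\mathbb{C}^{a+b})^{\otimes n} \cong \bigoplus_{\lambda \vdash_{a+b} n} [\lambda] \otimes S_\lambda(\mathbb{C}^{a+b})$$

along the block-diagonal embedding to get

$$(\mathbb{C}^{a+b})^{\otimes n} \downarrow^{\mathrm{GL}_{a+b}}_{\mathrm{GL}_a \times \mathrm{GL}_b} \cong \bigoplus_{\lambda \vdash_{a+b} n} [\lambda] \otimes S_\lambda(\mathbb{C}^{a+b}) \downarrow^{\mathrm{GL}_{a+b}}_{\mathrm{GL}_a \times \mathrm{GL}_b}$$
$$\cong \bigoplus_{\lambda \vdash_{a+b} n} [\lambda] \otimes \bigoplus_{\mu \vdash_a,\, \nu \vdash_b} \mathbb{C}^{c^\lambda_{\mu\nu}} \otimes S_\mu(\mathbb{C}^a) \otimes S_\nu(\mathbb{C}^b)$$
$$\cong \bigoplus_{\mu \vdash_a,\, \nu \vdash_b} \Bigl(\bigoplus_{\lambda \vdash_{a+b} n} [\lambda] \otimes \mathbb{C}^{c^\lambda_{\mu\nu}}\Bigr) \otimes S_\mu(\mathbb{C}^a) \otimes S_\nu(\mathbb{C}^b).$$

On the other hand,

$$(\mathbb{C}^{a+b})^{\otimes n} \downarrow\; \cong (\mathbb{C}^a \oplus \mathbb{C}^b)^{\otimes n} \downarrow\; \cong (\mathbb{C}^a)^{\otimes n} \oplus ((\mathbb{C}^a)^{\otimes n-1} \otimes \mathbb{C}^b) \oplus \cdots \oplus (\mathbb{C}^b)^{\otimes n} \downarrow$$
$$\cong \bigoplus_{k=0}^n \mathbb{C}^{\binom{n}{k}} \otimes \bigoplus_{\mu \vdash_a k} ([\mu] \otimes S_\mu(\mathbb{C}^a)) \otimes \bigoplus_{\nu \vdash_b n-k} ([\nu] \otimes S_\nu(\mathbb{C}^b))$$
$$\cong \bigoplus_{k=0}^n \bigoplus_{\mu \vdash_a k,\, \nu \vdash_b n-k} \bigl(\mathbb{C}^{\binom{n}{k}} \otimes [\mu] \otimes [\nu]\bigr) \otimes S_\mu(\mathbb{C}^a) \otimes S_\nu(\mathbb{C}^b).$$

Suppose $c^\lambda_{\mu\nu} > 0$. Comparing the above expressions gives the inequality $\dim[\lambda] \le \binom{n}{|\mu|} \dim[\mu] \dim[\nu]$. By the semigroup property Lemma 6.1 we have $c^{N\lambda}_{N\mu, N\nu} > 0$ for all $N \in \mathbb{N}$. Thus $\dim[N\lambda] \le \binom{Nn}{N|\mu|} \dim[N\mu] \dim[N\nu]$ for all $N \in \mathbb{N}$. Then from (6.4) follows

$$2^{N n H(\bar\lambda) - o(N)} \le 2^{N n h(|\mu|/n)}\, 2^{N |\mu| H(\bar\mu)}\, 2^{N |\nu| H(\bar\nu)}.$$

We conclude $H(\bar\lambda) \le h\bigl(\frac{|\mu|}{n}\bigr) + \frac{|\mu|}{n} H(\bar\mu) + \frac{|\nu|}{n} H(\bar\nu)$.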

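Let $x = (x^{(1)}, x^{(2)}, x^{(3)})$ be a triple of probability vectors $x^{(i)} \in \mathbb{R}^{n_i}$. Let $\theta \in \Theta$ be a weighting. Let $H_\theta(x)$ be the $\theta$-weighted average of the Shannon entropies of the probability vectors $x^{(1)}$, $x^{(2)}$ and $x^{(3)}$,

$$H_\theta(x) = \theta(1) H(x^{(1)}) + \theta(2) H(x^{(2)}) + \theta(3) H(x^{(3)}).$$

(Note that this notation is slightly different from the notation used in Chapter 4.) We will use the notation $\lambda \vdash_3 n$ to say that $\lambda$ is a triple of partitions of $n$, i.e. $\lambda$ equals $(\lambda^{(1)}, \lambda^{(2)}, \lambda^{(3)})$ where each $\lambda^{(i)}$ is a partition of $n$. We write $\bar\lambda$ for the normalised triple $(\overline{\lambda^{(1)}}, \overline{\lambda^{(2)}}, \overline{\lambda^{(3)}})$.

Lemma 6.3. Let $\lambda, \mu, \nu$ be three triples of partitions.

(i) If $g_{\lambda^{(i)}\mu^{(i)}\nu^{(i)}} > 0$ for all $i$, then $2^{H_\theta(\bar\lambda)} \le 2^{H_\theta(\bar\mu)}\, 2^{H_\theta(\bar\nu)}$.

(ii) If $\mu \vdash_3 m$, $\nu \vdash_3 n - m$ and $c^{\lambda^{(i)}}_{\mu^{(i)}\nu^{(i)}} > 0$ for all $i$, then $2^{H_\theta(\bar\lambda)} \le 2^{H_\theta(\bar\mu)} + 2^{H_\theta(\bar\nu)}$.

Proof. (i) Suppose $g_{\lambda^{(i)}\mu^{(i)}\nu^{(i)}} > 0$ for all $i$. Then $H(\overline{\lambda^{(i)}}) \le H(\overline{\mu^{(i)}}) + H(\overline{\nu^{(i)}})$ for all $i$ by Lemma 6.2. Thus $\sum_i \theta(i) H(\overline{\lambda^{(i)}}) \le \sum_i \theta(i) H(\overline{\mu^{(i)}}) + \sum_i \theta(i) H(\overline{\nu^{(i)}})$. Then $H_\theta(\bar\lambda) \le H_\theta(\bar\mu) + H_\theta(\bar\nu)$. We conclude $2^{H_\theta(\bar\lambda)} \le 2^{H_\theta(\bar\mu)}\, 2^{H_\theta(\bar\nu)}$.

(ii) Suppose $c^{\lambda^{(i)}}_{\mu^{(i)}\nu^{(i)}} > 0$ for all $i$. Then $H(\overline{\lambda^{(i)}}) \le \frac{m}{n} H(\overline{\mu^{(i)}}) + \frac{n-m}{n} H(\overline{\nu^{(i)}}) + h\bigl(\frac{m}{n}\bigr)$ by Lemma 6.2. We take the $\theta$-weighted average to get $H_\theta(\bar\lambda) \le \frac{m}{n} H_\theta(\bar\mu) + \frac{n-m}{n} H_\theta(\bar\nu) + h\bigl(\frac{m}{n}\bigr)$. We conclude $2^{H_\theta(\bar\lambda)} \le 2^{H_\theta(\bar\mu)} + 2^{H_\theta(\bar\nu)}$ by Lemma 4.9(iv).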

6.5 Hilbert spaces and density operators

Endow the vector space $\mathbb{C}^n$ with a hermitian inner product (one may take the standard hermitian inner product $\langle u, v\rangle = \sum_{i=1}^n \overline{u_i} v_i$ for $u, v \in \mathbb{C}^n$, where $\overline{\,\cdot\,}$ denotes taking the complex conjugate), so that it is a Hilbert space.

Let $(V_1, \langle\cdot,\cdot\rangle)$ and $(V_2, \langle\cdot,\cdot\rangle)$ be Hilbert spaces. On $V_1 \oplus V_2$ we define the inner product by $\langle u_1 \oplus u_2, v_1 \oplus v_2\rangle = \langle u_1, v_1\rangle + \langle u_2, v_2\rangle$. On $V_1 \otimes V_2$ we define the inner product by $\langle u_1 \otimes u_2, v_1 \otimes v_2\rangle = \langle u_1, v_1\rangle \langle u_2, v_2\rangle$ and extending linearly.

Let $V$ be a Hilbert space. A positive semidefinite hermitian operator $\rho : V \to V$ with trace one is called a density operator. The sequence of eigenvalues of a density operator $\rho$ is a probability vector. Let $\mathrm{spec}(\rho) = (p_1, \ldots, p_n)$ be the sequence of eigenvalues of $\rho$, ordered non-increasingly, $p_1 \ge \cdots \ge p_n$.

Let $V_1$ and $V_2$ be Hilbert spaces. Given a density operator $\rho$ on $V_1 \otimes V_2$, the reduced density operator $\rho_1 = \mathrm{tr}_2\, \rho$ is uniquely defined by the property that $\mathrm{tr}(\rho_1 X_1) = \mathrm{tr}(\rho (X_1 \otimes \mathrm{Id}_{V_2}))$ for all operators $X_1$ on $V_1$. The operator $\rho_1$ is again a density operator. The operation $\mathrm{tr}_2$ is called the partial trace over $V_2$. Explicitly, $\rho_1$ is given by $\langle e_i, \rho_1(e_j)\rangle = \sum_\ell \langle e_i \otimes f_\ell, \rho(e_j \otimes f_\ell)\rangle$, where the $e_i$ are some orthonormal basis of $V_1$ and the $f_\ell$ are some orthonormal basis of $V_2$ (the statement is independent of basis choice).

Let $V_i$ be a Hilbert space and consider the tensor product $V_1 \otimes V_2 \otimes V_3$. Associate with $t \in V_1 \otimes V_2 \otimes V_3$ the dual element $t^* = \langle t, \cdot\rangle \in (V_1 \otimes V_2 \otimes V_3)^*$. Then

$$\rho^t = t t^* / \langle t, t\rangle = t \langle t, \cdot\rangle / \langle t, t\rangle$$

is a density operator on $V_1 \otimes V_2 \otimes V_3$. Viewing $\rho^t$ as a density operator on the regrouped space $V_1 \otimes (V_2 \otimes V_3)$, we may take the partial trace of $\rho^t$ over $V_2 \otimes V_3$ as described above. We denote the resulting density operator by $\rho^t_1 = \mathrm{tr}_{23}\, \rho^t$. We similarly define $\rho^t_2 = \mathrm{tr}_{13}\, \rho^t$ and $\rho^t_3 = \mathrm{tr}_{12}\, \rho^t$.
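The reduced density operators of a small tensor can be computed by hand from the explicit formula for the partial trace. The sketch below does this for the first marginal of a tensor in $\mathbb{C}^2 \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$, using the standard basis and the closed-form eigenvalues of a $2 \times 2$ hermitian matrix. The nested-list encoding `t[i][j][k]`, the function names, and the example tensor $e_0 \otimes e_0 \otimes e_1 + e_0 \otimes e_1 \otimes e_0 + e_1 \otimes e_0 \otimes e_0$ are assumptions made for this illustration.

```python
from math import log2, sqrt

def first_marginal(t):
    """Reduced density matrix rho_1 of t in C^2 (x) C^n2 (x) C^n3 (nested lists t[i][j][k])."""
    norm = sum(abs(c) ** 2 for plane in t for row in plane for c in row)
    rho = [[0j, 0j], [0j, 0j]]
    for i in range(2):
        for j in range(2):
            # <e_i, rho_1 e_j> = sum_{k,l} t[i][k][l] * conj(t[j][k][l]) / <t, t>
            rho[i][j] = sum(
                t[i][k][l] * t[j][k][l].conjugate()
                for k in range(len(t[0])) for l in range(len(t[0][0]))
            ) / norm
    return rho

def spectrum_2x2(rho):
    """Eigenvalues of a 2x2 hermitian matrix, ordered non-increasingly."""
    a, d = rho[0][0].real, rho[1][1].real
    mean = (a + d) / 2
    radius = sqrt(((a - d) / 2) ** 2 + abs(rho[0][1]) ** 2)
    return (mean + radius, mean - radius)

def entropy(p):
    return sum(-x * log2(x) for x in p if x > 0)

# Example tensor with three terms; W[i][j][k] is the coefficient of e_i (x) e_j (x) e_k.
W = [[[0, 1], [1, 0]], [[1, 0], [0, 0]]]
spec = spectrum_2x2(first_marginal(W))
print(spec, entropy(spec))
```

For this example the first marginal is diagonal with spectrum $(2/3, 1/3)$, whose Shannon entropy is $h(1/3)$.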

6.6 Moment polytopes $P(t)$

We give a brief introduction to moment polytopes. We refer to [Nes84, Bri87, Fra02, Wal14] for more information. We begin with the general setting and then specialise to orbit closures in tensor spaces.

6.6.1 General setting

Let $G$ be a connected reductive algebraic group. (We refer to Kraft [Kra84] and Humphreys [Hum75] for an introduction to algebraic groups.) Fix a maximal torus $T \subseteq G$ and a Borel subgroup $T \subseteq B \subseteq G$. We have the character group $X(T)$, the Weyl group $W$, the root system $\Phi \subseteq X(T)$ and the system of positive roots $\Phi_+ \subseteq \Phi$. For $\lambda, \mu \in X(T)$ we set $\lambda \preccurlyeq \mu$ if $\mu - \lambda$ is a sum of positive roots. Let $V$ be a rational $G$-representation. The restriction of the action of $G$ to $T$ gives a decomposition

$$V = \bigoplus_{\lambda \in X(T)} V_\lambda, \qquad V_\lambda = \{v \in V : \forall t \in T,\ t \cdot v = \lambda(t) v\}.$$

This decomposition is called the weight decomposition of $V$. The $\lambda \in X(T)$ with $V_\lambda \ne 0$ are called the weights of $V$ with respect to $T$. The $V_\lambda$ are the weight spaces of $V$. For $v \in V$, let $v_\lambda$ be the component of $v$ in $V_\lambda$. Let $\mathrm{supp}(v) = \{\lambda : v_\lambda \ne 0\}$.

Let $E$ be the real vector space $E = X(T) \otimes \mathbb{R}$. The Weyl group $W$ acts on $X(T)$ and thus on $E$. We enlarge $\preccurlyeq$ to a partial order on $E$ as follows. For $x, y \in E$, let $x \preccurlyeq y$ if $y - x$ is a nonnegative linear combination of positive roots. Let $D \subseteq E$ be the positive Weyl chamber. For every $x \in E$ the orbit $W \cdot x$ intersects the positive Weyl chamber $D$ in exactly one point, which we denote by $\mathrm{dom}(x)$.

Let $V$ be a finite-dimensional rational $G$-module. Let $\chi \in X(T) \cap D$ be a dominant character. We denote the $\chi$-isotypical component of $V$ with $V_{(\chi)}$. Let $Z \subseteq V$ be a Zariski closed set. We denote the coordinate ring of $Z$ with $\mathbb{C}[Z]$. We denote the degree $d$ part of $\mathbb{C}[Z]$ with $\mathbb{C}[Z]_d$. If $Z$ is $G$-stable, then $\mathbb{C}[Z]_d$ is a $G$-module.

Definition 6.4. Let $V$ be a rational $G$-module and $Z \subseteq V$ a nontrivial irreducible closed $G$-stable cone. The moment polytope of $Z$, denoted by

$$P(Z),$$

is defined as the Euclidean closure in $E$ of the set

$$R(Z) = \{\chi/d : (\mathbb{C}[Z]_d)_{(\chi^*)} \ne 0\}$$

of normalised characters $\chi/d$ for which the $\chi^*$-isotypical component $(\mathbb{C}[Z]_d)_{(\chi^*)}$ is not zero.

Theorem 6.5 (Mumford–Ness [Nes84], Brion [Bri87], Franz [Fra02]). The moment polytope is indeed a convex polytope, and it is equal to the image of the so-called moment map intersected with the positive Weyl chamber,

$$P(Z) = \mu(Z \setminus 0) \cap D.$$

Let $Z = \overline{G \cdot v}$ be the orbit closure (in the Zariski topology) of a vector $v \in V \setminus 0$, and suppose $\overline{G \cdot v}$ is a cone.

Lemma 6.6 (See e.g. [Str05]). Suppose $\overline{G \cdot v}$ is a cone. Then

$$R(\overline{G \cdot v}) = \{\chi/d : (\mathbb{C}[\overline{G \cdot v}]_d)_{(\chi^*)} \ne 0\} = \{\chi/d : (\mathrm{lin}(G \cdot v^{\otimes d}))_{(\chi)} \ne 0\}.$$

6.6.2 Tensor spaces

We specialise to 3-tensors. Let $V = V_1 \otimes V_2 \otimes V_3$ with $V_i = \mathbb{C}^{n_i}$. Let

$$G = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3},$$
$$T = T_1 \times T_2 \times T_3,$$

with $T_i$ the diagonal matrices in $\mathrm{GL}_{n_i}$. The weight decomposition of $V$ is the decomposition with respect to the standard basis elements $e_{x_1} \otimes e_{x_2} \otimes e_{x_3}$, where $x \in [n_1] \times [n_2] \times [n_3]$. The support $\mathrm{supp}(v)$ is the support of $v$ with respect to the standard basis.

In the current setting there is a beautiful rephrasing of Theorem 6.5 in terms of ordered spectra of reduced density matrices. Recall from Section 6.5 that for $v \in V \setminus 0$ we have a density matrix $\rho^v$ and reduced density matrices $\rho^v_i$, of which we may take the non-increasingly ordered spectra $\mathrm{spec}(\rho^v_i)$.

Theorem 6.7 (Walter–Doran–Gross–Christandl [WDGC13]). Let $Z \subseteq V$ be a nontrivial irreducible closed $G$-stable cone. Then

$$P(Z) = \{(\mathrm{spec}\, \rho^z_1, \mathrm{spec}\, \rho^z_2, \mathrm{spec}\, \rho^z_3) : z \in Z \setminus 0\}.$$

Let $v \in V \setminus 0$. We consider the moment polytope of the orbit closure $Z = \overline{G \cdot v}$. In this setting Lemma 6.6 specialises to the following.

Lemma 6.8 (See e.g. [Str05]).

$$R(\overline{G \cdot v}) = \{\chi/d : (\mathbb{C}[\overline{G \cdot v}]_d)_{(\chi^*)} \ne 0\} = \{\chi/d : (\mathrm{lin}(G \cdot v^{\otimes d}))_{(\chi)} \ne 0\} = \{\chi/d : P_\chi v^{\otimes d} \ne 0\},$$

where $P_\chi = P_{\chi^{(1)}} \otimes P_{\chi^{(2)}} \otimes P_{\chi^{(3)}}$ with $P_{\chi^{(i)}} : V_i^{\otimes d} \to V_i^{\otimes d}$ the projector onto the isotypical component of type $\chi^{(i)}$ discussed in Section 6.2.

On the other hand, Theorem 6.7 immediately gives a description of the moment polytope $P(\overline{G \cdot v})$ in terms of ordered spectra of reduced density matrices.

Theorem 6.9. Let $v \in V \setminus 0$. Then

$$P(\overline{G \cdot v}) = \{(\mathrm{spec}\, \rho^u_1, \mathrm{spec}\, \rho^u_2, \mathrm{spec}\, \rho^u_3) : u \in \overline{G \cdot v} \setminus 0\}.$$

Summarising, we have two descriptions of the moment polytope: a representation-theoretic or invariant-theoretic description (Lemma 6.8) and a quantum marginal spectra description (Theorem 6.9). These two descriptions are the key to proving the properties of the quantum functionals that we need.

6.7 Quantum functionals $F^\theta(t)$

We will now define the quantum functionals and prove that they are universal spectral points.

Let $p = (p_1, \ldots, p_n) \in \mathbb{R}^n$ be a probability vector, i.e. $\sum_{i=1}^n p_i = 1$ and $p_i \ge 0$ for all $i \in [n]$. Recall that $H(p)$ denotes the Shannon entropy of the probability vector $p$, $H(p) = \sum_{i=1}^n p_i \log_2 (1/p_i)$. Let $x = (x^{(1)}, x^{(2)}, x^{(3)})$ be a triple of probability vectors $x^{(i)} \in \mathbb{R}^{n_i}$. Let $\theta \in \Theta$ be a weighting. Recall that $H_\theta(x)$ denotes the $\theta$-weighted average of the Shannon entropies of the three probability vectors $x^{(1)}, x^{(2)}, x^{(3)}$,

$$H_\theta(x) = \theta(1) H(x^{(1)}) + \theta(2) H(x^{(2)}) + \theta(3) H(x^{(3)}).$$

Let $V = \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$. Let $G = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$. Let $v \in V \setminus 0$. We use the notation $P(v) = P(\overline{G \cdot v})$ for the moment polytope of the orbit closure of $v$.

Definition 6.10. For $\theta \in \Theta$ and $v \in V \setminus 0$ let

$$F^\theta(v) = \max\{2^{H_\theta(x)} : x \in P(v)\}.$$

Let $F^\theta(0) = 0$. We call the functions $F^\theta$ the quantum functionals. The name quantum functional comes from the fact that the moment polytope $P(t)$ consists of triples of quantum marginal entropies.

Theorem 6.11. Let $\mathcal{T}$ be the semiring of 3-tensors over $\mathbb{C}$. Let $\leqslant$ be the restriction preorder. For $\theta \in \Theta$,

$$F^\theta \in X(\mathcal{T}, \leqslant).$$

In other words, $F^\theta$ is a semiring homomorphism $\mathcal{T} \to \mathbb{R}_{\ge 0}$ which is monotone under restriction $\leqslant$. In fact, $F^\theta$ is even monotone under degeneration.

Remark 6.12. The results in this chapter generalise to $k$-tensors over $\mathbb{C}$. In our paper [CVZ18] we discuss this general situation in detail, and make a distinction between upper quantum functionals and lower quantum functionals.

Let $p \in \mathbb{R}^n$ and $q \in \mathbb{R}^m$ be probability vectors. The tensor product $p \otimes q \in \mathbb{R}^{nm}$ defined by

$$p \otimes q = (p_i q_j : i \in [n], j \in [m])$$

is a probability vector. The direct sum $p \oplus q \in \mathbb{R}^{n+m}$ defined by

$$p \oplus q = (p_1, \ldots, p_n, q_1, \ldots, q_m)$$

is a nonnegative vector, and for $\alpha \in [0, 1]$ the weighted direct sum $\alpha p \oplus (1-\alpha) q$ is again a probability vector.

Let $x = (x^{(1)}, x^{(2)}, x^{(3)})$ and $y = (y^{(1)}, y^{(2)}, y^{(3)})$ be triples of probability vectors. We define the tensor product $x \otimes y$ elementwise,

$$x \otimes y = (x^{(1)} \otimes y^{(1)}, x^{(2)} \otimes y^{(2)}, x^{(3)} \otimes y^{(3)}).$$

We define the direct sum $x \oplus y$ elementwise,

$$x \oplus y = (x^{(1)} \oplus y^{(1)}, x^{(2)} \oplus y^{(2)}, x^{(3)} \oplus y^{(3)}).$$

For $x \otimes y$ and $x \oplus y$ to be in the moment polytope we will need to reorder the components non-increasingly. For a triple of probability vectors $x = (x^{(1)}, x^{(2)}, x^{(3)})$ let

$$\mathrm{dom}(x)$$

be the triple of probability vectors obtained from $x$ by reordering the components of each $x^{(i)}$ such that they become non-increasing. Let $\mathrm{dom}(S) = \{\mathrm{dom}(x) : x \in S\}$.

For $v \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$ we will use the notation $G(v) = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$ to denote the group that naturally corresponds to the space that $v$ lives in. We will use the notation $P(v) = P(\overline{G(v) \cdot v})$ for the moment polytope of the orbit closure of $v$.

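Theorem 6.13. Let $s \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$ and $t \in \mathbb{C}^{m_1} \otimes \mathbb{C}^{m_2} \otimes \mathbb{C}^{m_3}$.

(i) $\mathrm{dom}\bigl(P(s) \otimes P(t)\bigr) \subseteq P(s \otimes t)$

(ii) $\forall \alpha \in [0, 1]$: $\mathrm{dom}\bigl(\alpha P(s) \oplus (1-\alpha) P(t)\bigr) \subseteq P(s \oplus t)$

(iii) If $s, t \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3} \setminus 0$ and $s \in \overline{G(t) \cdot t}$, then $P(s) \subseteq P(t)$.

(iv) $P(s \oplus 0) = P(s) \oplus 0$.

(v) $P(\langle 1\rangle) = \{((1), (1), (1))\}$ with $\langle 1\rangle = e_1 \otimes e_1 \otimes e_1 \in \mathbb{C}^1 \otimes \mathbb{C}^1 \otimes \mathbb{C}^1$.

Proof. To prove statements (i) and (ii), let $x \in P(s)$ and $y \in P(t)$. Then there are elements $a \in \overline{G(s) \cdot s}$ and $b \in \overline{G(t) \cdot t}$ with ordered marginal spectra $x$ and $y$,

$$x = (\mathrm{spec}\, \rho^a_1, \mathrm{spec}\, \rho^a_2, \mathrm{spec}\, \rho^a_3),$$
$$y = (\mathrm{spec}\, \rho^b_1, \mathrm{spec}\, \rho^b_2, \mathrm{spec}\, \rho^b_3).$$

We prove statement (i). We have $a \otimes b \in \overline{G(s \otimes t) \cdot s \otimes t}$. Thus

$$\mathrm{dom}(x \otimes y) = (\mathrm{spec}\, \rho^{a \otimes b}_1, \mathrm{spec}\, \rho^{a \otimes b}_2, \mathrm{spec}\, \rho^{a \otimes b}_3) \in P(s \otimes t).$$

We conclude $\mathrm{dom}(P(s) \otimes P(t)) \subseteq P(s \otimes t)$.

We prove statement (ii). Let $\alpha \in [0, 1]$. Define the tensor $u(\alpha) \in \mathbb{C}^{n_1+m_1} \otimes \mathbb{C}^{n_2+m_2} \otimes \mathbb{C}^{n_3+m_3}$ by

$$u(\alpha) = \frac{\sqrt{\alpha}}{\sqrt{\langle a, a\rangle}}\, a \oplus \frac{\sqrt{1-\alpha}}{\sqrt{\langle b, b\rangle}}\, b.$$

Then $u(\alpha) \in \overline{G(s \oplus t) \cdot s \oplus t}$. We have $\rho^{u(\alpha)}_i = \alpha \rho^a_i \oplus (1-\alpha) \rho^b_i$. From the observation

$$\mathrm{spec}(\alpha \rho^a_i \oplus (1-\alpha) \rho^b_i) = \mathrm{dom}(\alpha x \oplus (1-\alpha) y)$$

follows $\mathrm{dom}(\alpha x \oplus (1-\alpha) y) \in P(\overline{G(s \oplus t) \cdot s \oplus t})$. We conclude

$$\mathrm{dom}(\alpha P(s) \oplus (1-\alpha) P(t)) \subseteq P(s \oplus t).$$

We have thus proven statements (i) and (ii).

We prove statement (iii). Let $G = G(t) = G(s)$. Let $s \in \overline{G \cdot t}$. Then $\overline{G \cdot s} \subseteq \overline{G \cdot t}$, so we have a $G$-equivariant restriction map $\mathbb{C}[\overline{G \cdot t}] \twoheadrightarrow \mathbb{C}[\overline{G \cdot s}]$ on the coordinate rings. Let $\chi/d \in R(\overline{G \cdot s})$ with $(\mathbb{C}[\overline{G \cdot s}]_d)_{(\chi^*)} \ne 0$. Then also $(\mathbb{C}[\overline{G \cdot t}]_d)_{(\chi^*)} \ne 0$ by Schur's lemma. Thus $\chi/d \in R(\overline{G \cdot t}) \subseteq P(\overline{G \cdot t})$. We conclude $P(s) \subseteq P(t)$.

We prove statement (iv). Let $\chi/d \in R(\overline{G(s \oplus 0) \cdot (s \oplus 0)})$ with $P_\chi (s \oplus 0)^{\otimes d} \ne 0$. Recall from Section 6.2 that $P_\chi$ is given by the action of an element in the group algebra $\mathbb{C}[S_d]$, which we also denoted by $P_\chi$. From this viewpoint we see that also $P_\chi s^{\otimes d} \ne 0$. So $\chi/d \in R(\overline{G(s) \cdot s})$.

Statement (v) is a direct observation.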

Corollary 6.14.

(i) $F^\theta(s) F^\theta(t) \le F^\theta(s \otimes t)$

(ii) $F^\theta(s) + F^\theta(t) \le F^\theta(s \oplus t)$

(iii) If $s \trianglelefteq t$, then $F^\theta(s) \le F^\theta(t)$.

(iv) $F^\theta(\langle 1\rangle) = 1$.

Proof. (i) Let $x \in P(s)$ and $y \in P(t)$. Then $\mathrm{dom}(x \otimes y) \in P(s \otimes t)$ by Theorem 6.13. It is a basic fact that $H_\theta(x) + H_\theta(y) = H_\theta(x \otimes y)$ (Lemma 4.9), so $2^{H_\theta(x)}\, 2^{H_\theta(y)} = 2^{H_\theta(x \otimes y)}$. We conclude $F^\theta(s) F^\theta(t) \le F^\theta(s \otimes t)$.

(ii) Let $x \in P(s)$ and $y \in P(t)$. Then by Theorem 6.13, for all $\alpha \in [0, 1]$,

$$\mathrm{dom}(\alpha x \oplus (1-\alpha) y) \in P(s \oplus t).$$

It is a basic fact that $\alpha H_\theta(x) + (1-\alpha) H_\theta(y) + h(\alpha) = H_\theta(\alpha x \oplus (1-\alpha) y)$ (Lemma 4.9). Thus for any $\alpha \in [0, 1]$ we have $2^{\alpha H_\theta(x) + (1-\alpha) H_\theta(y) + h(\alpha)} \le F^\theta(s \oplus t)$. Using Lemma 4.9(iv) we conclude $F^\theta(s) + F^\theta(t) \le F^\theta(s \oplus t)$.

(iii) This follows from statements (iii) and (iv) of Theorem 6.13, since by definition degeneration $s \trianglelefteq t$ means $s \oplus 0 \in \overline{G(t \oplus 0) \cdot (t \oplus 0)}$.

(iv) This follows from statement (v) of Theorem 6.13.
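The two entropy facts used in this proof, and the optimisation over $\alpha$ that turns the bound in (ii) into a sum, can be checked numerically. The sketch below does so for single probability vectors. The closed form $\alpha = 2^a/(2^a + 2^b)$ for the maximiser is my reading of what Lemma 4.9(iv) provides and should be treated as an assumption; the function names and sample vectors are illustrative.

```python
from math import log2

def H(p):
    return sum(-x * log2(x) for x in p if x > 0)

def h(alpha):
    return H((alpha, 1 - alpha))

def tensor(p, q):
    return [x * y for x in p for y in q]

def weighted_sum(alpha, p, q):
    return [alpha * x for x in p] + [(1 - alpha) * y for y in q]

def best_alpha(a, b):
    """Assumed maximiser of alpha*a + (1-alpha)*b + h(alpha); the maximum is log2(2^a + 2^b)."""
    return 2 ** a / (2 ** a + 2 ** b)

p, q = [0.5, 0.3, 0.2], [0.7, 0.3]
a, b = H(p), H(q)

# Additivity under tensor product.
print(H(tensor(p, q)), a + b)

# Entropy of a weighted direct sum.
alpha = 0.3
print(H(weighted_sum(alpha, p, q)), alpha * a + (1 - alpha) * b + h(alpha))

# Optimising over alpha recovers 2^a + 2^b.
alpha_star = best_alpha(a, b)
print(2 ** (alpha_star * a + (1 - alpha_star) * b + h(alpha_star)), 2 ** a + 2 ** b)
```

The last line illustrates why the family of bounds indexed by $\alpha$ in part (ii) yields additivity: at $\alpha = 2^a/(2^a + 2^b)$ the exponent equals $\log_2(2^a + 2^b)$.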

67 Quantum functionals F θ(t) 99

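Theorem 6.15.

(i) $R(s\otimes t) \subseteq \{\lambda/N : \exists\, \mu/N \in R(s),\ \nu/N \in R(t),\ g_{\lambda^{(i)}\mu^{(i)}\nu^{(i)}} > 0 \text{ for all } i\}$.

(ii) $R(s\oplus t) \subseteq \{\lambda/N : \exists\, \mu/m \in R(s),\ \nu/(N-m) \in R(t),\ c^{\lambda^{(i)}}_{\mu^{(i)}\nu^{(i)}} > 0 \text{ for all } i\}$.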

Proof. (i) Let $s \in V_1\otimes V_2\otimes V_3$ and let $t \in W_1\otimes W_2\otimes W_3$. Let $\lambda/N \in R(s\otimes t)$ with $P_\lambda (s\otimes t)^{\otimes N} \neq 0$. Let $\pi$ be the natural reordering map

\[ \pi : ((V_1\otimes W_1)\otimes(V_2\otimes W_2)\otimes(V_3\otimes W_3))^{\otimes N} \to (V_1\otimes V_2\otimes V_3)^{\otimes N} \otimes (W_1\otimes W_2\otimes W_3)^{\otimes N}. \]

Then

\[ (s\otimes t)^{\otimes N} = \sum_{\mu,\nu} \pi^{-1}(P_\mu\otimes P_\nu)\pi (s\otimes t)^{\otimes N}. \]

Let $\mu, \nu \vdash_3 N$ with $P_\lambda \pi^{-1}(P_\mu\otimes P_\nu)\pi(s\otimes t)^{\otimes N} \neq 0$. Then $P_\mu s^{\otimes N} \neq 0$ and $P_\nu t^{\otimes N} \neq 0$, i.e. $\mu/N \in R(s)$ and $\nu/N \in R(t)$. Moreover, $P_\lambda \pi^{-1}(P_\mu\otimes P_\nu)\pi \neq 0$, which means the Kronecker coefficients $g_{\lambda^{(i)}\mu^{(i)}\nu^{(i)}}$ are nonzero.

(ii) Let $\lambda/N \in R(s\oplus t)$ with $P_\lambda (s\oplus t)^{\otimes N} \neq 0$. Let us expand $(s\oplus t)^{\otimes N}$ as

\[ (s\oplus t)^{\otimes N} = s^{\otimes N} \oplus (s^{\otimes N-1}\otimes t) \oplus \cdots \oplus t^{\otimes N}. \]

Then $P_\lambda$ does not vanish on some summand, which we may assume to be of the form $s^{\otimes m}\otimes t^{\otimes N-m}$. Let $\pi$ be the natural projection

\[ \pi : ((V_1\oplus W_1)\otimes(V_2\oplus W_2)\otimes(V_3\oplus W_3))^{\otimes N} \to (V_1\otimes V_2\otimes V_3)^{\otimes m} \otimes (W_1\otimes W_2\otimes W_3)^{\otimes N-m}. \]

Let $\mu, \nu$ with $P_\lambda \pi^{-1}(P_\mu\otimes P_\nu)\pi(s\oplus t)^{\otimes N} \neq 0$. Then $P_\mu s^{\otimes m} \neq 0$ and $P_\nu t^{\otimes N-m} \neq 0$. Moreover, $P_\lambda \pi^{-1}(P_\mu\otimes P_\nu)\pi \neq 0$. Therefore, the Littlewood–Richardson coefficients $c^{\lambda^{(i)}}_{\mu^{(i)}\nu^{(i)}}$ are nonzero.

Corollary 6.16.

(i) $F^\theta(s\otimes t) \le F^\theta(s)F^\theta(t)$.

(ii) $F^\theta(s\oplus t) \le F^\theta(s) + F^\theta(t)$.

Proof. (i) Let $\lambda/N \in R(s\otimes t)$. By Theorem 6.15 there is a $\mu/N \in R(s)$ and a $\nu/N \in R(t)$ such that the Kronecker coefficient $g_{\lambda^{(i)}\mu^{(i)}\nu^{(i)}}$ is nonzero for every $i$. Then $2^{H_\theta(\mu)} \le F^\theta(s)$ and $2^{H_\theta(\nu)} \le F^\theta(t)$ by definition of $F^\theta$. The Kronecker coefficients being nonzero implies

\[ 2^{H_\theta(\lambda)} \le 2^{H_\theta(\mu)}2^{H_\theta(\nu)} \]

by Lemma 6.3. We conclude $F^\theta(s\otimes t) \le F^\theta(s)F^\theta(t)$.

(ii) Let $\lambda/N \in R(s\oplus t)$. Then by Theorem 6.15 there are $\mu/m \in R(s)$ and $\nu/(N-m) \in R(t)$ such that the Littlewood–Richardson coefficient $c^{\lambda^{(i)}}_{\mu^{(i)}\nu^{(i)}}$ is nonzero for every $i$. This means

\[ 2^{H_\theta(\lambda)} \le 2^{H_\theta(\mu)} + 2^{H_\theta(\nu)} \]

by Lemma 6.3. We conclude $F^\theta(s\oplus t) \le F^\theta(s) + F^\theta(t)$.

Proof of Theorem 6.11. Corollary 6.14 and Corollary 6.16 together prove Theorem 6.11.

6.8 Outer approximation

In this section we discuss an outer approximation of $\mathbf{P}(t)$. We will use this outer approximation to show that the quantum functionals are at most the support functionals.

Let $\preccurlyeq$ be the dominance order, i.e. majorization order, on triples of probability vectors. For any set $S \subseteq \mathbb{R}^{n_1}\times\mathbb{R}^{n_2}\times\mathbb{R}^{n_3}$ of triples of probability vectors, let $S^{\preccurlyeq}$ denote the upward closure with respect to $\preccurlyeq$,

\[ S^{\preccurlyeq} = \{ y \in \mathbb{R}^{n_1}\times\mathbb{R}^{n_2}\times\mathbb{R}^{n_3} : \exists\, x \in S,\ x \preccurlyeq y \}. \]

Let $\operatorname{conv}(S)$ denote the convex hull of $S$ in $\mathbb{R}^{n_1}\times\mathbb{R}^{n_2}\times\mathbb{R}^{n_3}$. Recall that for $x \in S$ we defined $\operatorname{dom}(x)$ as the triple of probability vectors obtained from $x = (x^{(1)}, x^{(2)}, x^{(3)})$ by reordering the components of each $x^{(i)}$ such that they become nonincreasing, and $\operatorname{dom}(S) = \{\operatorname{dom}(x) : x \in S\}$.
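The two operations just recalled can be sketched in a few lines. This is only an illustration: the helper names are hypothetical, and it assumes that the dominance order on triples is majorization applied to each of the three components separately.

```python
from fractions import Fraction


def dom(x):
    """Sort each probability vector of the triple x in nonincreasing order."""
    return tuple(tuple(sorted(component, reverse=True)) for component in x)


def majorized_by(p, q):
    """True if p is majorized by q: all prefix sums of sorted p are at most those of sorted q."""
    length = max(len(p), len(q))
    p_sorted = sorted(p, reverse=True) + [0] * (length - len(p))
    q_sorted = sorted(q, reverse=True) + [0] * (length - len(q))
    p_sum = q_sum = 0
    for p_i, q_i in zip(p_sorted, q_sorted):
        p_sum += p_i
        q_sum += q_i
        if p_sum > q_sum:
            return False
    return True


def dominated_by(x, y):
    """True if x is below y in the (assumed componentwise) dominance order."""
    return all(majorized_by(x_i, y_i) for x_i, y_i in zip(x, y))


if __name__ == "__main__":
    half, quarter = Fraction(1, 2), Fraction(1, 4)
    x = ((half, half), (half, half), (half, half))
    y = ((quarter, 3 * quarter), (half, half), (1, 0))
    print(dom(y), dominated_by(x, y))
```

Exact fractions are used so that the prefix-sum comparisons are not affected by rounding.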

Theorem 6.17 (Strassen [Str05]). Let $v \in V \setminus 0$. Then

\[ \mathbf{P}(v) \subseteq (\operatorname{dom}\operatorname{conv}\operatorname{supp} v)^{\preccurlyeq}. \tag{6.5} \]

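Proof. We give the proof for the convenience of the reader. Let $\chi/d \in R(\overline{G\cdot v})$. Then $(\operatorname{lin}(G\cdot v^{\otimes d}))_{(\chi)} \neq 0$. Let $M_\chi \subseteq \operatorname{lin}(G\cdot v^{\otimes d})$ be a simple $G$-submodule with highest weight $\chi$. Let $N \subseteq V^{\otimes d}$ be the $G$-module complement, $N \oplus M_\chi = V^{\otimes d}$. Then $v^{\otimes d}$ is not in $N$. Let $v = \bigoplus_{\gamma\in\operatorname{supp} v} v_\gamma$ be the weight decomposition. Then $v^{\otimes d}$ is a sum of tensor products of the $v_\gamma$. At least one summand is not in $N$, say of weight $\eta = \sum_\gamma d_\gamma \gamma$ with $\sum_\gamma d_\gamma = d$. The projection $V^{\otimes d} \to M_\chi$ along $N$ maps this summand onto a nonzero weight vector of weight $\eta$. So $\eta$ is a weight of $M_\chi$. Then also $\operatorname{dom}(\eta)$ is a weight of $M_\chi$. Since $\chi$ is the highest weight of $M_\chi$, $\operatorname{dom}(\eta) \preccurlyeq \chi$. Then $\operatorname{dom}(\eta/d) \preccurlyeq \chi/d$. We have $\eta/d = \sum_\gamma \frac{d_\gamma}{d}\gamma \in \operatorname{conv}\operatorname{supp} v$. We conclude $R(\overline{G\cdot v}) \subseteq (\operatorname{dom}\operatorname{conv}\operatorname{supp} v)^{\preccurlyeq}$ and thus $\mathbf{P}(\overline{G\cdot v}) \subseteq (\operatorname{dom}\operatorname{conv}\operatorname{supp} v)^{\preccurlyeq}$.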


6.9 Inner approximation for free tensors

In this section we discuss an inner approximation for the moment polytope of a free tensor. We will use this inner approximation in the next section to prove that the quantum functionals coincide with the support functionals when restricted to free tensors. We will also prove that not all tensors are free.

We say a set $\Phi \subseteq [n_1]\times[n_2]\times[n_3]$ is free if every two different elements of $\Phi$ differ in at least two coordinates, in other words, if the elements of $\Phi$ have Hamming distance at least two. We say $v \in V = \mathbb{C}^{n_1}\otimes\mathbb{C}^{n_2}\otimes\mathbb{C}^{n_3}$ is free if for some $g \in G(v) = \mathrm{GL}_{n_1}\times\mathrm{GL}_{n_2}\times\mathrm{GL}_{n_3}$ the support $\operatorname{supp}(g\cdot v) \subseteq [n_1]\times[n_2]\times[n_3]$ is free. (Free is called schlicht in [Str05].)

Theorem 6.18 (Strassen [Str05]). Let $v \in V\setminus 0$ with $\operatorname{supp}(v)$ free. Then

\[ \operatorname{dom}\operatorname{conv}\operatorname{supp} v \subseteq \mathbf{P}(v). \]

Proof. We refer to [Str05].

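Corollary 6.19. Let $v \in V\setminus 0$ with $\operatorname{supp}(v)$ free. Then

\[ \mathbf{P}(v)^{\preccurlyeq} = (\operatorname{dom}\operatorname{conv}\operatorname{supp} v)^{\preccurlyeq}. \]

Proof. By Theorem 6.18, $\operatorname{dom}\operatorname{conv}\operatorname{supp} v \subseteq \mathbf{P}(v)$. We take the upward closure on both sides to get $(\operatorname{dom}\operatorname{conv}\operatorname{supp} v)^{\preccurlyeq} \subseteq \mathbf{P}(v)^{\preccurlyeq}$. On the other hand, from Theorem 6.17 it follows that $\mathbf{P}(v)^{\preccurlyeq} \subseteq (\operatorname{dom}\operatorname{conv}\operatorname{supp} v)^{\preccurlyeq}$.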

Remark 6.20. Recall that $v \in V$ is oblique if the support $\operatorname{supp}(g\cdot v)$ is an antichain for some $g \in G(v)$ (Section 4.4). Such antichains are free, so oblique tensors are free. Thus $\{\text{tight}\} \subseteq \{\text{oblique}\} \subseteq \{\text{free}\}$. Like the tight tensors and oblique tensors, free tensors form a semigroup under $\otimes$ and $\oplus$.

Proposition 6.21. For $n \ge 5$ there exists a tensor that is not free in $\mathbb{C}^n\otimes\mathbb{C}^n\otimes\mathbb{C}^n$.

Proof. We upper bound the maximal size of a free support. Let $\Phi \subseteq [n]\times[n]\times[n]$ be free. Any two distinct elements in $\Phi$ are still distinct if we forget the third coordinate of each. Therefore, $|\Phi| = |\{(\alpha_1, \alpha_2) : \alpha \in \Phi\}| \le n^2$. (This is a special case of the Singleton bound [Sin64] from coding theory. This upper bound is tight, since $\Phi = \{(a,b,c) : a,b,c \in [n],\ c = a + b \bmod n\}$ is free and has size $n^2$.) Second, we apply the following observation of Bürgisser [Bür90, page 3]. Let

\[ Z_n = \{ t \in \mathbb{C}^n\otimes\mathbb{C}^n\otimes\mathbb{C}^n : \exists\, g \in G(t),\ |\operatorname{supp}(g\cdot t)| < n^3 - 3n^2 \}. \]

Let $Y_n = \mathbb{C}^n\otimes\mathbb{C}^n\otimes\mathbb{C}^n \setminus Z_n$. Then the set $Y_n$ is Zariski open and nonempty. Now let $n \ge 5$ and let $t \in Y_n$. Then for all $g \in G(t)$ we have $|\operatorname{supp}(g\cdot t)| \ge n^3 - 3n^2 > n^2$. We conclude $t$ is not free.
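The combinatorial ingredients of this proof are easy to check by machine. The sketch below uses hypothetical helper names and identifies $[n]$ with $\{0, \ldots, n-1\}$ for convenience (an assumption of the sketch only). It tests freeness via pairwise Hamming distance, builds the support $c = a + b \bmod n$, and compares the two counting bounds.

```python
from itertools import combinations


def hamming_distance(a, b):
    """Number of coordinates in which the tuples a and b differ."""
    return sum(1 for a_i, b_i in zip(a, b) if a_i != b_i)


def is_free(support):
    """True if every two distinct elements differ in at least two coordinates."""
    return all(hamming_distance(a, b) >= 2 for a, b in combinations(support, 2))


def cyclic_support(n):
    """The support {(a, b, c) : c = a + b mod n}, with indices 0..n-1."""
    return {(a, b, (a + b) % n) for a in range(n) for b in range(n)}


def generic_support_lower_bound(n):
    """The bound n^3 - 3n^2 on the support size of tensors in Y_n."""
    return n ** 3 - 3 * n ** 2


if __name__ == "__main__":
    for n in range(2, 8):
        phi = cyclic_support(n)
        print(n, is_free(phi), len(phi), generic_support_lower_bound(n) > n ** 2)
```

The last column turns true exactly from $n = 5$ onwards, which is where the proposition starts to apply.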


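6.10 Quantum functionals versus support functionals

We discussed the support functionals $\zeta^\theta \in X(\{\text{oblique 3-tensors over } \mathbb{F}\})$ in Chapter 4. We recall its definition over $\mathbb{C}$. Let $V = \mathbb{C}^{n_1}\otimes\mathbb{C}^{n_2}\otimes\mathbb{C}^{n_3}$. For $\theta \in \Theta = \mathcal{P}([3])$ and $t \in V\setminus 0$ with $\operatorname{supp}(t)$ oblique,

\[ \zeta^\theta(t) = \max\{2^{H_\theta(P)} : P \in \mathcal{P}(\operatorname{supp}(t))\}. \]

We also discussed an extension of $\zeta^\theta$ to all 3-tensors over $\mathbb{C}$, the upper support functional,

\[ \zeta^\theta(t) = \min_{g\in G(t)} \max\{2^{H_\theta(P)} : P \in \mathcal{P}(\operatorname{supp}(g\cdot t))\}. \]

We know $\zeta^\theta(s\otimes t) \le \zeta^\theta(s)\zeta^\theta(t)$, $\zeta^\theta(s\oplus t) = \zeta^\theta(s) + \zeta^\theta(t)$, $\zeta^\theta(\langle 1\rangle) = 1$ and $s \leqslant t \Rightarrow \zeta^\theta(s) \le \zeta^\theta(t)$ for any $s, t \in V$.

The set $\operatorname{conv}\operatorname{supp}(g\cdot t)$ is the set of marginals of probability distributions on $\operatorname{supp}(g\cdot t)$. Thus $\operatorname{dom}\operatorname{conv}\operatorname{supp}(g\cdot t)$ is the set of ordered marginals of probability distributions on $\operatorname{supp}(g\cdot t)$. Therefore,

\[ \zeta^\theta(t) = \min_{g\in G(t)} \max_{x\in S(g\cdot t)} 2^{H_\theta(x)} \]

with $S(w) = \operatorname{dom}\operatorname{conv}\operatorname{supp} w$. Let $X \subseteq \mathbb{R}^{n_1}\times\mathbb{R}^{n_2}\times\mathbb{R}^{n_3}$ be a set of triples of probability vectors. From Schur-concavity of the Shannon entropy function it follows that $\max_{x\in X} 2^{H_\theta(x)} = \max_{x\in X^{\preccurlyeq}} 2^{H_\theta(x)}$: every element added by the upward closure majorizes an element of $X$ and therefore has at most the same entropy. Also $H_\theta(x) = H_\theta(\operatorname{dom} x)$.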
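The two entropy facts used at the end of this paragraph can be illustrated numerically. The sketch assumes $H_\theta(x) = \sum_j \theta(j) H(x^{(j)})$ with $H$ the Shannon entropy in bits, which is the form appearing in the proof of Lemma 6.27; the function names are hypothetical.

```python
import math


def shannon_entropy(p):
    """Shannon entropy in bits of a probability vector p."""
    return -sum(p_i * math.log2(p_i) for p_i in p if p_i > 0)


def weighted_entropy(theta, x):
    """H_theta(x) = sum_j theta(j) * H(x^(j)) for a triple x of probability vectors."""
    return sum(weight * shannon_entropy(component) for weight, component in zip(theta, x))


def dom(x):
    """Sort each component of the triple in nonincreasing order."""
    return tuple(tuple(sorted(component, reverse=True)) for component in x)


if __name__ == "__main__":
    theta = (0.5, 0.25, 0.25)
    x = ((0.2, 0.8), (0.5, 0.5), (0.1, 0.6, 0.3))
    # Reordering the entries of each marginal does not change H_theta.
    print(weighted_entropy(theta, x), weighted_entropy(theta, dom(x)))
    # (0.5, 0.5) is majorized by (0.75, 0.25), and its entropy is larger.
    print(shannon_entropy((0.5, 0.5)), shannon_entropy((0.75, 0.25)))
```

The second comparison is one instance of Schur-concavity: moving up in the majorization order can only lower the entropy, so the upward closure does not change the maximum.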

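Theorem 6.22. $\zeta^\theta(t) \ge F^\theta(t)$.

Proof. Let $g \in G(t)$ such that

\[ \max_{x\in S} 2^{H_\theta(x)} = \zeta^\theta(t) \]

with $S = \operatorname{dom}\operatorname{conv}\operatorname{supp}(g\cdot t)$. We have

\[ \max_{x\in S} 2^{H_\theta(x)} = \max_{x\in S^{\preccurlyeq}} 2^{H_\theta(x)}. \]

By Theorem 6.17, $\mathbf{P}(t) \subseteq S^{\preccurlyeq}$. We conclude $F^\theta(t) \le \zeta^\theta(t)$.

Theorem 6.23. Let $t \in V$ be free. Then $\zeta^\theta(t) = F^\theta(t)$.

Proof. We know from Theorem 6.22 that $\zeta^\theta(t) \ge F^\theta(t)$. We prove $\zeta^\theta(t) \le F^\theta(t)$. Let $g \in G(t)$ such that $\operatorname{supp}(g\cdot t)$ is free. Let $S = \operatorname{dom}\operatorname{conv}\operatorname{supp}(g\cdot t)$. Then $\zeta^\theta(t) \le \max_{x\in S} 2^{H_\theta(x)} = \max_{x\in S^{\preccurlyeq}} 2^{H_\theta(x)}$. By Theorem 6.18 we have $S^{\preccurlyeq} = \mathbf{P}(t)^{\preccurlyeq}$. We conclude $\zeta^\theta(t) \le F^\theta(t)$.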


We can show that the regularised upper support functional equals the quantum functional. As a consequence, the quantum functional is at least the lower support functional, which was discussed in Chapter 4.

Theorem 6.24. $\lim_{n\to\infty} \zeta^\theta(t^{\otimes n})^{1/n} = F^\theta(t)$.

Proof. We refer the reader to [CVZ18].

Corollary 6.25. $F^\theta(v) \ge \zeta_\theta(v)$.

Proof. By Theorem 6.24, $F^\theta(v) = \lim_{n\to\infty} \zeta^\theta(v^{\otimes n})^{1/n}$. We know $\zeta^\theta(v) \ge \zeta_\theta(v)$ by Theorem 4.15 and thus $\lim_{n\to\infty} \zeta^\theta(v^{\otimes n})^{1/n} \ge \lim_{n\to\infty} \zeta_\theta(v^{\otimes n})^{1/n}$. The lower support functional $\zeta_\theta$ is supermultiplicative under $\otimes$ (Theorem 4.14), so

\[ \lim_{n\to\infty} \zeta_\theta(v^{\otimes n})^{1/n} \ge \zeta_\theta(v). \]

Combining these three inequalities proves the corollary.

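6.11 Asymptotic slice rank

We proved in Section 4.6 that for oblique $t \in \mathbb{F}^{n_1}\otimes\mathbb{F}^{n_2}\otimes\mathbb{F}^{n_3}$ the asymptotic slice rank $\lim_{n\to\infty} \mathrm{SR}(t^{\otimes n})^{1/n}$ exists and equals $\min_{\theta\in\Theta} \zeta^\theta(t)$ with $\Theta = \mathcal{P}([3])$. In this section we prove the analogous statement for the quantum functionals.

Theorem 6.26. Let $t \in \mathbb{C}^{n_1}\otimes\mathbb{C}^{n_2}\otimes\mathbb{C}^{n_3}$. Then

\[ \lim_{n\to\infty} \mathrm{SR}(t^{\otimes n})^{1/n} = \min_{\theta\in\Theta} F^\theta(t). \]

We work towards the proof of Theorem 6.26. Let $t \in \mathbb{C}^{n_1}\otimes\mathbb{C}^{n_2}\otimes\mathbb{C}^{n_3} \setminus 0$. Let $E^\theta(t) = \log_2 F^\theta(t)$.

Lemma 6.27. For any $\varepsilon > 0$ there is an $n_0 \in \mathbb{N}$ such that for all $n \ge n_0$ there is a $\lambda/n \in R(t)$ with $\min_{i\in[3]} H(\lambda^{(i)}) \ge \min_{\theta\in\Theta} E^\theta(t) - \varepsilon$.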

Proof. By definition,

\[ \min_{\theta\in\Theta} E^\theta(t) = \min_{\theta\in\Theta} \max_{x\in\mathbf{P}(t)} \sum_{j\in[3]} \theta(j) H(x^{(j)}). \]

By von Neumann's minimax theorem, the right-hand side equals

\[ \max_{x\in\mathbf{P}(t)} \min_{\theta\in\Theta} \sum_{j\in[3]} \theta(j) H(x^{(j)}), \]

which equals

\[ \max_{x\in\mathbf{P}(t)} \min_{j\in[3]} H(x^{(j)}). \]

Let $\varepsilon > 0$. Let $\mu/m \in R(t)$ with $\min_{j\in[3]} H(\mu^{(j)}) \ge \min_{\theta\in\Theta} E^\theta(t) - \varepsilon/2$. We will use two facts. We have $(P_{(1)}\otimes P_{(1)}\otimes P_{(1)})t = t \neq 0$. The triples of partitions $\lambda$ with $P_\lambda t^{\otimes n} \neq 0$ for some $n$ form a semigroup. Let $n \in \mathbb{N}$. We can write $n = qm + r$ with $q, r \in \mathbb{N}$, $0 \le r < m$. Let $\lambda^{(j)} = q\mu^{(j)} + (r)$. Then by the semigroup property, $P_\lambda t^{\otimes n} \neq 0$, i.e. $\lambda/n \in R(t)$. We have

\[ \tfrac{1}{n}(q\mu^{(j)} + (r)) = \tfrac{qm}{n}\mu^{(j)} + \tfrac{r}{n}(r), \]

where on the right-hand side $\mu^{(j)}$ and $(r)$ are understood as normalised to probability vectors. By concavity of Shannon entropy,

\[ H(\tfrac{1}{n}(q\mu^{(j)} + (r))) = H(\tfrac{qm}{n}\mu^{(j)} + \tfrac{r}{n}(r)) \ge \tfrac{qm}{n} H(\mu^{(j)}) \ge (1 - \tfrac{m}{n}) H(\mu^{(j)}). \]

When $n$ is large enough, $(1 - \tfrac{m}{n})H(\mu^{(j)})$ is at least $H(\mu^{(j)}) - \varepsilon/2$. Let $n_0 \in \mathbb{N}$ such that this is the case for all $j \in [3]$.
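The scaling step of this proof can be traced on a concrete partition. The sketch below is an illustration only; the function names are hypothetical and the entropy of a partition is taken to be the Shannon entropy of the partition divided by its size. It builds $\lambda = q\mu + (r)$ by adding $r$ to the first part of $q\mu$ and compares the two sides of the concavity bound.

```python
import math


def normalized_entropy(partition):
    """Shannon entropy in bits of the partition divided by its size."""
    n = sum(partition)
    return -sum((p / n) * math.log2(p / n) for p in partition if p > 0)


def scale_partition(mu, n):
    """Return lambda = q*mu + (r), where n = q*m + r and m is the size of mu."""
    m = sum(mu)
    q, r = divmod(n, m)
    lam = [q * part for part in mu]
    lam[0] += r  # adding the one-row partition (r) increases the first part by r
    return lam


if __name__ == "__main__":
    mu = [2, 1]
    m = sum(mu)
    for n in (3, 10, 100, 1000):
        lam = scale_partition(mu, n)
        lower_bound = (1 - m / n) * normalized_entropy(mu)
        print(n, lam, normalized_entropy(lam), lower_bound)
```

As $n$ grows, the lower bound approaches the entropy of $\mu/m$, which is the behaviour the lemma needs.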

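Lemma 6.28. Let $\lambda/n \in R(t)$. Then $\mathrm{SR}(t^{\otimes n}) \ge \min_{i\in[3]} \dim[\lambda^{(i)}]$.

Proof. We have the restriction $t^{\otimes n} \ge P_\lambda t^{\otimes n} \neq 0$. Choose rank-one projections $A_j$ in the vector spaces $S_{\lambda^{(j)}}(\mathbb{C}^{n_j})$ with

\[ s = (\mathrm{id}_{[\lambda^{(1)}]}\otimes A_1)\otimes(\mathrm{id}_{[\lambda^{(2)}]}\otimes A_2)\otimes(\mathrm{id}_{[\lambda^{(3)}]}\otimes A_3)\, P_\lambda t^{\otimes n} \neq 0. \]

The tensor $s$ is invariant under $S_n$ acting diagonally on $(\mathbb{C}^{n_1})^{\otimes n}\otimes(\mathbb{C}^{n_2})^{\otimes n}\otimes(\mathbb{C}^{n_3})^{\otimes n}$. Thus the marginal spectra $\operatorname{spec} \rho^s_i$ are uniform. This implies $s$ is semistable. From [BCC+17, Theorem 4.6] it follows that $\mathrm{SR}(s)$ equals $\min_{i\in[3]} \dim[\lambda^{(i)}]$.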

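Lemma 6.29. $\liminf_{n\to\infty} \mathrm{SR}(t^{\otimes n})^{1/n} \ge \min_{\theta\in\Theta} F^\theta(t)$.

Proof. Let $\varepsilon > 0$. For $n$ large enough, choose $\lambda/n \in R(t)$ as in Lemma 6.27. By Lemma 6.28, $\mathrm{SR}(t^{\otimes n}) \ge \min_{i\in[3]} \dim[\lambda^{(i)}]$. The right-hand side we lower bound by

\[ \min_{i\in[3]} \dim[\lambda^{(i)}] \ge \min_{i\in[3]} 2^{nH(\lambda^{(i)})}2^{-o(n)} \ge 2^{n(\min_{\theta\in\Theta} E^\theta(t) - \varepsilon)}2^{-o(n)}. \]

Then $\liminf_{n\to\infty} \mathrm{SR}(t^{\otimes n})^{1/n} \ge 2^{\min_{\theta\in\Theta} E^\theta(t) - \varepsilon}$.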
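To see the exponential growth rate of $\dim[\lambda]$ concretely, one can compute the dimension with the hook length formula and compare $\frac{1}{n}\log_2 \dim[\lambda]$ with the entropy of $\lambda/n$. This is a sketch with hypothetical function names; the hook length formula is a standard tool that is not taken from this chapter, and the example only looks at two-row partitions.

```python
import math


def irrep_dimension(partition):
    """Dimension of the S_n-representation [lambda] via the hook length formula."""
    parts = [p for p in partition if p > 0]
    n = sum(parts)
    # conjugate[j] = number of rows of the diagram that reach column j
    conjugate = [sum(1 for p in parts if p > j) for j in range(parts[0])] if parts else []
    hook_product = 1
    for i, row in enumerate(parts):
        for j in range(row):
            arm = row - j - 1
            leg = conjugate[j] - i - 1
            hook_product *= arm + leg + 1
    return math.factorial(n) // hook_product


def normalized_entropy(partition):
    """Shannon entropy in bits of the partition divided by its size."""
    n = sum(partition)
    return -sum((p / n) * math.log2(p / n) for p in partition if p > 0)


if __name__ == "__main__":
    for k in (1, 10, 100):
        lam = [2 * k, k]
        n = sum(lam)
        rate = math.log2(irrep_dimension(lam)) / n
        print(n, rate, normalized_entropy(lam))
```

For these partitions the rate stays below the entropy and approaches it as $n$ grows, in line with the $2^{nH(\lambda)}2^{-o(n)}$ lower bound used above.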

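Lemma 6.30. $\limsup_{n\to\infty} \mathrm{SR}(t^{\otimes n})^{1/n} \le F^\theta(t)$.

Proof. Let $n \in \mathbb{N}$. Define $s_1, s_2, s_3 \in (\mathbb{C}^{n_1}\otimes\mathbb{C}^{n_2}\otimes\mathbb{C}^{n_3})^{\otimes n}$ by

\[ s_1 = \Bigl(\sum_{\substack{\lambda^{(1)}\vdash n \\ H(\lambda^{(1)}) \le E^\theta(t)}} P_{\lambda^{(1)}}\otimes\mathrm{Id}\otimes\mathrm{Id}\Bigr) t^{\otimes n}, \]

\[ s_2 = \Bigl(\sum_{\substack{\lambda^{(2)}\vdash n \\ H(\lambda^{(2)}) \le E^\theta(t)}} \mathrm{Id}\otimes P_{\lambda^{(2)}}\otimes\mathrm{Id}\Bigr) (t^{\otimes n} - s_1), \]

\[ s_3 = \Bigl(\sum_{\substack{\lambda^{(3)}\vdash n \\ H(\lambda^{(3)}) \le E^\theta(t)}} \mathrm{Id}\otimes\mathrm{Id}\otimes P_{\lambda^{(3)}}\Bigr) (t^{\otimes n} - s_1 - s_2). \]

Then $t^{\otimes n} = s_1 + s_2 + s_3$. The slice rank of an element in the image of $P_{\lambda^{(1)}}\otimes\mathrm{Id}\otimes\mathrm{Id}$ is at most $\dim([\lambda^{(1)}]\otimes S_{\lambda^{(1)}}(\mathbb{C}^{n_1}))$, which is at most $2^{nH(\lambda^{(1)}) + o(n)}$ (Section 6.2). Similarly for $\mathrm{Id}\otimes P_{\lambda^{(2)}}\otimes\mathrm{Id}$ and $\mathrm{Id}\otimes\mathrm{Id}\otimes P_{\lambda^{(3)}}$. The tensor $s_1$ is in the image of the sum $\sum_{\lambda^{(1)}} P_{\lambda^{(1)}}\otimes\mathrm{Id}\otimes\mathrm{Id}$ over $\lambda^{(1)} \vdash n$ with at most $n_1$ parts. There are at most $(n+1)^{n_1}$ such partitions. Thus $\mathrm{SR}(s_1) \le (n+1)^{n_1}2^{nE^\theta(t) + o(n)}$. Similarly for $s_2$ and $s_3$. Therefore,

\[ \limsup_{n\to\infty} \mathrm{SR}(t^{\otimes n})^{1/n} \le \limsup_{n\to\infty} \bigl(3(n+1)^{\max_{i\in[3]} n_i}\, 2^{nE^\theta(t) + o(n)}\bigr)^{1/n}. \tag{6.6} \]

The right-hand side of (6.6) equals $F^\theta(t)$.

Proof of Theorem 6.26. Lemma 6.29 and Lemma 6.30 together prove Theorem 6.26.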

6.12 Conclusion

In this chapter we constructed the first infinite family of spectral points for 3-tensors over $\mathbb{C}$, the quantum functionals. For 30 years the only explicit spectral points known were the gauge points. The constructions in this chapter naturally generalise to higher-order tensors, for which we refer to our paper [CVZ18]. We do not know whether the quantum functionals are all spectral points for 3-tensors over $\mathbb{C}$. Finally, we showed that for complex tensors the asymptotic slice rank exists and equals the minimum value over the quantum functionals.

Chapter 7

Algebraic branching programs: approximation and nondeterminism

This chapter is based on joint work with Karl Bringmann and Christian Ikenmeyer [BIZ17].

7.1 Introduction

The study of asymptotic tensor rank in previous chapters was originally motivated by the study of the complexity of matrix multiplication in the algebraic circuit model, an algebraic model of computation. In this chapter we will study several other algebraic models of computation and algebraic complexity classes.

Formulas, the class $\mathbf{VP}_e$ and the determinant

An (arithmetic) formula is a rooted binary tree whose leaves are each labeled with a variable or a field constant, and whose root and intermediate vertices are labeled with either $+$ (addition) or $\times$ (multiplication). In the natural way, via recursion over the tree structure, a formula computes a multivariate polynomial $f$. The formula size of a multivariate polynomial $f$ is the smallest number of vertices required for any formula to compute $f$. Here is an example of a formula of size 7 computing the polynomial $(3 + x)(3 + y)$:

[Figure: formula tree with leaves $3$, $x$, $3$, $y$, two $+$ vertices and a $\times$ root.]
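The recursive evaluation of a formula can be sketched as follows. The nested-tuple encoding and the function names are choices made for this illustration only; they are not part of the definitions above.

```python
def evaluate(node, assignment):
    """Evaluate a formula given as nested tuples (op, left, right), variable names or constants."""
    if isinstance(node, tuple):
        op, left, right = node
        left_value = evaluate(left, assignment)
        right_value = evaluate(right, assignment)
        return left_value + right_value if op == "+" else left_value * right_value
    if isinstance(node, str):
        return assignment[node]  # leaf labeled with a variable
    return node  # leaf labeled with a constant


def size(node):
    """Number of vertices of the formula tree."""
    if isinstance(node, tuple):
        return 1 + size(node[1]) + size(node[2])
    return 1


# The example formula (3 + x)(3 + y): four leaves, two additions, one multiplication.
example = ("*", ("+", 3, "x"), ("+", 3, "y"))

if __name__ == "__main__":
    print(size(example), evaluate(example, {"x": 2, "y": 4}))
```

Evaluating at a point is a stand-in for the polynomial itself here; the tree has the seven vertices counted in the text.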

A sequence of multivariate polynomials $(f_n)_{n\in\mathbb{N}}$ is called a family. Valiant in his seminal paper [Val79] introduced the complexity class $\mathbf{VP}_e$ that is defined as the set of all families whose formula size is polynomially bounded. (We say a sequence $(a_n)_n \in \mathbb{N}^{\mathbb{N}}$ of natural numbers is polynomially bounded if there exists a univariate polynomial $q$ such that $a_n \le q(n)$ for all $n$.) For example, the family $((x_1)^n + (x_2)^n + \cdots + (x_n)^n)_n$ is in $\mathbf{VP}_e$, because the formula size of this family grows quadratically.

The smallest known formulas for the determinant family $\det_n$ have size $n^{O(\log n)}$. This follows from Berkowitz' algorithm [Ber84], which gives an algebraic circuit of depth $O(\log^2 n)$, and thus by expanding we get an algebraic formula of depth $O(\log^2 n)$, whose size is then trivially bounded by $2^{O(\log^2 n)} = n^{O(\log n)}$. It is a major open question in algebraic complexity theory whether formulas of polynomially bounded size exist for $\det_n$. This question can be phrased in terms of complexity classes as asking whether or not the inclusion $\mathbf{VP}_e \subseteq \mathbf{VP}_s$ is strict. (We will define $\mathbf{VP}_s$ shortly.)

Motivated by this question, we study the closure class $\overline{\mathbf{VP}_e}$ of families of polynomials that can be approximated arbitrarily closely by families in $\mathbf{VP}_e$ (see Section 7.2.4 for the formal definition). Over the field $\mathbb{R}$ or $\mathbb{C}$ one can think of $\overline{\mathbf{VP}_e}$ as the set of families whose border formula size is polynomially bounded. The border formula size of a polynomial $f$ is the smallest number $c$ such that there exists a sequence $g_i$ of polynomials with formula size at most $c$ and $\lim_{i\to\infty} g_i = f$.

Continuous lower bounds

In algebraic complexity theory, problem instances correspond to vectors $v \in \mathbb{F}^n$. A complexity lower bound often takes the form of a function $f : \mathbb{F}^n \to \mathbb{F}$ that is zero on the vectors of "low complexity" and nonzero on $v$. We refer to Grochow [Gro13] for a discussion of settings where complexity lower bounds are obtained in this way (e.g. [NW97, Raz09, LO15, GKKS13, LMR13, BI13]). Over the complex numbers we can in fact assume that these functions $f$ are continuous [Gro13] (and even so-called highest-weight vector polynomials). If $\mathbf{C}$ and $\mathbf{D}$ are algebraic complexity classes with $\mathbf{C} \subseteq \mathbf{D}$ (for example $\mathbf{C} = \mathbf{VP}_e$ and $\mathbf{D} = \mathbf{VP}_s$), then a proof of separation $\mathbf{D} \not\subseteq \mathbf{C}$ in this continuous manner implies the stronger separation $\mathbf{D} \not\subseteq \overline{\mathbf{C}}$. In our case it is thus natural to aim for the separation $\mathbf{VP}_s \not\subseteq \overline{\mathbf{VP}_e}$ instead of the slightly weaker $\mathbf{VP}_s \not\subseteq \mathbf{VP}_e$, which provides further motivation for studying $\overline{\mathbf{VP}_e}$. This is exactly analogous to the geometric complexity theory approach of Mulmuley and Sohoni (see e.g. [MS01, MS08] and the exposition [BLMW11, Sec. 9]), which aims to prove the separation $\mathbf{VNP} \not\subseteq \overline{\mathbf{VP}_s}$ to attack Valiant's famous conjecture $\mathbf{VP}_s \neq \mathbf{VNP}$ [Val79]. (Here $\mathbf{VNP}$ is the class of p-definable families, see Section 7.2.4.)

New results in this chapter

We prove two new results in this chapter.

Algebraic branching programs of width 2. An algebraic branching program (abp) is a directed acyclic graph with a source vertex $s$ and a sink vertex $t$ that has affine linear forms over the base field $\mathbb{F}$ as edge labels. Moreover, we require that each vertex is labeled with an integer (its layer) and that edges in the abp only point from vertices in layer $i$ to vertices in layer $i + 1$. The width of an abp is the cardinality of its largest layer. The size of an abp is the number of its vertices. The value of an abp is the sum of the values of all $s$–$t$-paths, where the value of an $s$–$t$-path is the product of its edge labels. We say that an abp computes its value. The class $\mathbf{VP}_s$ coincides with the class of families of polynomials that can be computed by abps of polynomially bounded size, see e.g. [Sap16].

For $k \in \mathbb{N}$ we introduce the class $\mathbf{VP}_k$ as the class of families of polynomials computable by width-$k$ abps of polynomially bounded size. It is well-known (see Lemma 7.2) that $\mathbf{VP}_k \subseteq \mathbf{VP}_e$ for all $k \ge 1$. In 1992, Ben-Or and Cleve [BOC92] showed that $\mathbf{VP}_k = \mathbf{VP}_e$ for all $k \ge 3$. In 2011, Allender and Wang [AW16] showed that width-2 abps cannot compute every polynomial, so in particular we have a strict inclusion $\mathbf{VP}_2 \subsetneq \mathbf{VP}_3$.

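We prove that the closure of $\mathbf{VP}_2$ and the closure of $\mathbf{VP}_e$ are equal,

\[ \overline{\mathbf{VP}_2} = \overline{\mathbf{VP}_e}, \tag{7.1} \]

when $\operatorname{char}(\mathbb{F}) \neq 2$. From (7.1) and the result of Allender and Wang it follows directly that the inclusion $\mathbf{VP}_2 \subsetneq \overline{\mathbf{VP}_2}$ is strict. We have thus separated a complexity class from its approximation closure.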

VNP via affine linear forms. Every algebraic complexity class has a nondeterministic closure (see Section 7.2.5 for the definition). The nondeterministic closure of $\mathbf{VP}$ is called $\mathbf{VNP}$, and the nondeterministic closure of $\mathbf{VP}_e$ is called $\mathbf{VNP}_e$. In 1980, Valiant [Val80] proved $\mathbf{VNP}_e = \mathbf{VNP}$. The nondeterministic closure of $\mathbf{VP}_1$ and $\mathbf{VP}_2$ we call $\mathbf{VNP}_1$ and $\mathbf{VNP}_2$. Using interpolation techniques we can deduce $\mathbf{VNP}_2 = \mathbf{VNP}$ from (7.1), provided the field is infinite. Using more sophisticated techniques we prove

\[ \mathbf{VNP}_1 = \mathbf{VNP}. \tag{7.2} \]

From (7.2) easily follows $\mathbf{VP}_1 \subsetneq \mathbf{VNP}_1$. Also, from [AW16] we get $\mathbf{VP}_2 \subsetneq \mathbf{VNP}_2$. We have thus separated complexity classes from their nondeterministic closures.

Further related work

An excellent exposition on the history of small-width computation can be found in [AW16], along with an explicit polynomial that cannot be computed by width-2 abps, namely $x_1x_2 + x_3x_4 + \cdots + x_{15}x_{16}$. Saha, Saptharishi and Saxena in [SSS09, Cor. 14] showed that $x_1x_2 + x_3x_4 + x_5x_6$ cannot be computed by width-2 abps that correspond to the iterated matrix multiplication of upper triangular matrices.

Bürgisser in [Bür04] studied approximations in the model of general algebraic circuits, finding general upper bounds on the error degree. For most algebraic complexity classes $\mathbf{C}$ the relation between $\mathbf{C}$ and $\overline{\mathbf{C}}$ has not been an active object of study. As pointed out recently by Forbes [For16], Nisan's result [Nis91] implies that $\mathbf{C} = \overline{\mathbf{C}}$ for $\mathbf{C}$ being the class of size-$k$ algebraic branching programs on noncommuting variables. A structured study of $\overline{\mathbf{VP}}$ and $\overline{\mathbf{VP}_s}$ was started in [GMQ16]. Much work in lower bounds for algebraic approximation algorithms has been done in the area of bilinear complexity, dating back to [BCRL79, Str83, Lic84] and more recently e.g. [Lan06, LO15, HIL13, Zui17, LM16a].

This chapter is organised as follows. In Section 7.2 we discuss definitions and basic results. In Section 7.3 we prove that the approximation closure of $\mathbf{VP}_2$ equals the approximation closure of $\mathbf{VP}_e$, i.e. $\overline{\mathbf{VP}_2} = \overline{\mathbf{VP}_e}$. In Section 7.4 we prove that the nondeterminism closure of $\mathbf{VP}_1$ equals $\mathbf{VNP}$.

7.2 Definitions and basic results

We briefly recall the definition of circuits, formulas and branching programs, and we recall the definition of the corresponding complexity classes. Then we discuss some straightforward relationships among these classes, and review the proof of a theorem by Ben-Or and Cleve, which inspired our work. Finally, we discuss the approximation closure and the nondeterminism closure for algebraic complexity classes.

7.2.1 Computational models

Let $x_1, x_2, \ldots$ be formal variables. By $\mathbb{F}[x]$ we mean the ring of polynomials over $\mathbb{F}$ with variables $x_1, x_2, \ldots, x_k$ with $k$ large enough.

A circuit is a directed acyclic graph $G$ with one or more source vertices and one sink vertex. Each source vertex is labelled by a variable $x_i$ or a constant $c \in \mathbb{F}$. The other vertices are labelled by either $+$ or $\times$ and have in-degree 2 (that is, fan-in 2). Each vertex computes an element in $\mathbb{F}[x]$ by recursion over the graph. The element computed by the sink is the element computed by the circuit. The size of a circuit is the number of vertices.

A formula is a circuit whose graph is a tree.

An algebraic branching program (abp) is a directed acyclic graph with a source vertex $s$ and a sink vertex $t$ that has affine linear forms $\alpha x_i + \beta$, $\alpha, \beta \in \mathbb{F}$, as edge labels. Moreover, we require that each vertex is labeled with an integer (its layer) and that edges in the abp only point from vertices in layer $i$ to vertices in layer $i + 1$. The width of an abp is the cardinality of its largest layer. The size of an abp is the number of its vertices. The value of an abp is the sum of the values of all $s$–$t$-paths, where the value of an $s$–$t$-path is the product of its edge labels. We say that an abp computes its value.

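For example, the following abp has depth 5, width 3 and computes the polynomial $x_1x_2 + x_2 + 2x_1 - 1$.

[Figure: an abp with edge labels $x_1$, $2$, $x_1$, $x_2$ and $-1$.]

An abp $G$ corresponds naturally to an iterated product of matrices: for any two consecutive layers $L_i, L_{i+1}$ in $G$, let $M_i$ be the matrix $(e_{v,w})_{v\in L_i, w\in L_{i+1}}$ with $e_{v,w}$ the label of the edge from $v$ to $w$ (or 0 if there is no edge from $v$ to $w$). Then the value of $G$ equals the product $M_k \cdots M_2 M_1$.

For example, the above abp corresponds to the following iterated matrix product:

\[ \begin{pmatrix} 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} -1 & 0 & 0 \\ 0 & x_2 & 0 \\ 0 & 0 & x_1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ x_1 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}. \]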
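The correspondence between this abp and the matrix product can be checked by evaluating both at integer points. The sketch below uses plain nested lists and hypothetical function names; comparing values at sample points is an illustration, not a symbolic proof.

```python
def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def example_abp_value(x1, x2):
    """Evaluate the iterated matrix product of the example abp at the point (x1, x2)."""
    row = [[1, 1, 1]]
    second = [[-1, 0, 0], [0, x2, 0], [0, 0, x1]]
    first = [[1, 0, 0], [x1, 1, 0], [0, 0, 2]]
    column = [[1], [1], [1]]
    result = mat_mul(mat_mul(mat_mul(row, second), first), column)
    return result[0][0]


def example_polynomial(x1, x2):
    """The polynomial x1*x2 + x2 + 2*x1 - 1 stated in the text."""
    return x1 * x2 + x2 + 2 * x1 - 1


if __name__ == "__main__":
    for point in [(0, 0), (1, 2), (-3, 5)]:
        print(point, example_abp_value(*point), example_polynomial(*point))
```

Each of the three paths through the middle layers contributes one product of edge labels, which is what the row and column vectors of ones sum up.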

7.2.2 Complexity classes $\mathbf{VP}$, $\mathbf{VP}_e$, $\mathbf{VP}_k$

The circuit size of a polynomial $f$ is the size of the smallest circuit computing $f$. The formula size of a polynomial $f$ is the size of the smallest formula computing $f$.

A family is a sequence $(f_n)_{n\in\mathbb{N}}$ of multivariate polynomials over $\mathbb{F}$. A class is a set of families. The class $\mathbf{VP}$ consists of all families $(f_n)$ with circuit size, degree and number of variables in $\operatorname{poly}(n)$. The class $\mathbf{VP}_e$ consists of all families $(f_n)$ with formula size in $\operatorname{poly}(n)$. (The origin of the subscript $e$ in $\mathbf{VP}_e$ is the term "arithmetic expression".) Clearly, $\mathbf{VP}_e \subseteq \mathbf{VP}$.

We introduce classes defined by abps. Let $k \ge 1$. The class $\mathbf{VP}_k$ consists of all families computed by polynomial-size width-$k$ abps with edges labelled by affine linear forms $\sum_i \alpha_i x_i + \beta$ with coefficients $\alpha_i, \beta \in \mathbb{F}$.

We note that the above classes depend on the choice of the ground field $\mathbb{F}$.

In our paper [BIZ17] we make a distinction between three different types of edge labels for abps. The class $\mathbf{VP}_k$ in this chapter corresponds to the class $\mathbf{VP}^{\mathrm{g}}_k$ in [BIZ17].


7.2.3 The theorem of Ben-Or and Cleve

This subsection is about the relations among $\mathbf{VP}_k$ and $\mathbf{VP}_e$.

Lemma 7.1. $\mathbf{VP}_k \subseteq \mathbf{VP}_\ell$ when $k \le \ell$.

Proof. This is clearly true.

Lemma 7.2. $\mathbf{VP}_k \subseteq \mathbf{VP}_e$ for any $k$.

Proof. For the simple proof we refer to [BIZ17].

Ben-Or and Cleve [BOC92] showed that for $k \ge 3$ the classes $\mathbf{VP}_k$ and $\mathbf{VP}_e$ are in fact equal.

Theorem 7.3 (Ben-Or and Cleve [BOC92]). For $k \ge 3$, $\mathbf{VP}_k = \mathbf{VP}_e$.

We will review the construction of Ben-Or and Cleve here, because we will use it to prove Theorem 7.8 and Theorem 7.15. The following depth-reduction lemma for formulas by Brent is a crucial ingredient.

Lemma 7.4 (Brent [Bre74]). Let $f$ be an $n$-variate degree-$d$ polynomial computed by a formula of size $s$. Then $f$ can also be computed by a formula of size $\operatorname{poly}(s, n, d)$ and depth $O(\log s)$.

Proof. See the survey of Saptharishi [Sap16, Lemma 5.5] for a modern proof.

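Proof of Theorem 7.3. Lemma 7.2 says $\mathbf{VP}_k \subseteq \mathbf{VP}_e$. We will prove the inclusion $\mathbf{VP}_e \subseteq \mathbf{VP}_3$, from which follows $\mathbf{VP}_e \subseteq \mathbf{VP}_k$ by Lemma 7.1, and thus $\mathbf{VP}_k = \mathbf{VP}_e$. For a polynomial $h$ define the matrix

\[ M(h) = \begin{pmatrix} 1 & 0 & 0 \\ h & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \]

which as part of an abp looks like

[Figure: partial abp with a single edge labelled $h$.]

We call the following matrices primitive:

• $M(h)$ with $h$ any variable or any constant in $\mathbb{F}$,

• the $3\times 3$ permutation matrices, denoted by $M_\pi$ with $\pi \in S_3$,

• the diagonal matrices $M_{a,b,c} = \operatorname{diag}(a, b, c)$ with $a, b, c \in \mathbb{F}$.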

The entries of the primitives are variables or constants in $\mathbb{F}$, making them suitable to use in the construction of a width-3 abp.

Let $(f_n) \in \mathbf{VP}_e$. Then $f_n$ can be computed by a formula of size $s(n) \in \operatorname{poly}(n)$. By Brent's depth-reduction theorem for formulas (Lemma 7.4), $f_n$ can be computed by a formula of size $\operatorname{poly}(n)$ and depth $d(n) \in O(\log s(n))$.

We will construct a sequence of primitives $A_1, \ldots, A_{m(n)}$ such that

\[ A_1 \cdots A_{m(n)} = \begin{pmatrix} 1 & 0 & 0 \\ f_n & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \]

with $m(n) \in O(4^{d(n)}) = \operatorname{poly}(n)$. Then

\[ f_n(x) = \begin{pmatrix} 1 & 1 & 1 \end{pmatrix} M_{-1,1,0}\, A_1 \cdots A_{m(n)} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \]

so $f_n(x)$ can be computed by a width-3 abp of length $\operatorname{poly}(n)$, proving the theorem.

To explain the construction, let $h$ be a polynomial and consider a formula computing $h$ of depth $d$. The goal is to construct (recursively on the formula structure) primitives $A_1, \ldots, A_m$ such that

\[ A_1 \cdots A_m = \begin{pmatrix} 1 & 0 & 0 \\ h & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \]

with $m \in O(4^d)$.

Suppose $h$ is a variable or a constant. Then $M(h)$ is itself a primitive matrix.

Suppose $h = f + g$ is a sum of two polynomials $f, g$ and suppose $M(f)$ and $M(g)$ can be written as a product of primitives. Then $M(f + g)$ equals a product of primitives, because $M(f + g) = M(f)M(g)$. This can easily be verified directly, or by noting that in the corresponding partial abps the top-bottom paths ($u_i$-$v_j$ paths) have the same value:

[Figure: on top vertices $u_1, u_2, u_3$ and bottom vertices $v_1, v_2, v_3$, the partial abp with edges labelled $f$ and $g$ in sequence is equivalent ($\sim$) to the partial abp with a single edge labelled $f + g$.]

Suppose $h = fg$ is a product of two polynomials $f, g$ and suppose $M(f)$ and $M(g)$ can be written as a product of primitives. Then $M(fg)$ equals a product of primitives, because

\[ M(f\cdot g) = M_{(23)} \bigl( M_{1,-1,1}\, M_{(123)}\, M(g)\, M_{(132)}\, M(f) \bigr)^2 M_{(23)} \]

(here $(23) \in S_3$ denotes the transposition $1 \mapsto 1$, $2 \mapsto 3$, $3 \mapsto 2$ and $(123) \in S_3$ denotes the cyclic shift $1 \mapsto 2$, $2 \mapsto 3$, $3 \mapsto 1$), as can be verified either directly or by checking that in the corresponding partial abps the top-bottom paths ($u_i$-$v_j$ paths) have the same value:

[Figure: on top vertices $u_1, u_2, u_3$ and bottom vertices $v_1, v_2, v_3$, the partial abp with edge labels $f$, $-1$, $g$, $f$, $g$, $-1$ is equivalent ($\sim$) to the partial abp with a single edge labelled $f\cdot g$.]

This completes the construction.

The length $m$ of the construction is $m(h) = 1$ for $h$ a variable or constant, and recursively $m(f + g) = m(f) + m(g)$, $m(f\cdot g) = 2(m(f) + m(g))$, so $m \in O(4^d)$, where $d$ is the depth of the formula computing $h$.
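The matrix identities used in this construction can be checked on integer values of $f$ and $g$. The sketch below uses hypothetical helper names and assumes the convention that the permutation matrix $M_\pi$ sends the $j$-th standard basis vector to the $\pi(j)$-th one; under the opposite convention the product identity would need $(123)$ and $(132)$ swapped.

```python
def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def product(matrices):
    """Left-to-right product of a nonempty list of matrices."""
    result = matrices[0]
    for matrix in matrices[1:]:
        result = mat_mul(result, matrix)
    return result


def M(h):
    return [[1, 0, 0], [h, 1, 0], [0, 0, 1]]


def diag(a, b, c):
    return [[a, 0, 0], [0, b, 0], [0, 0, c]]


def perm_matrix(images):
    """Permutation matrix with a 1 in row images[j] of column j + 1 (1-based images)."""
    matrix = [[0, 0, 0] for _ in range(3)]
    for j, image in enumerate(images):
        matrix[image - 1][j] = 1
    return matrix


M_23 = perm_matrix((1, 3, 2))   # 1 -> 1, 2 -> 3, 3 -> 2
M_123 = perm_matrix((2, 3, 1))  # 1 -> 2, 2 -> 3, 3 -> 1
M_132 = perm_matrix((3, 1, 2))  # inverse of (123)


def M_of_product(f, g):
    """Right-hand side of the identity for M(f*g)."""
    block = product([diag(1, -1, 1), M_123, M(g), M_132, M(f)])
    return product([M_23, block, block, M_23])


def read_out(matrix):
    """The value (1 1 1) M_{-1,1,0} matrix (1 1 1)^T."""
    return product([[[1, 1, 1]], diag(-1, 1, 0), matrix, [[1], [1], [1]]])[0][0]


if __name__ == "__main__":
    f, g = 3, 5
    print(mat_mul(M(f), M(g)) == M(f + g), M_of_product(f, g) == M(f * g), read_out(M(f)))
```

Since all entries involved are polynomials in $f$ and $g$, agreement on a grid of integer values is strong evidence for the identities, although it is not a formal proof.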

The above result of Ben-Or and Cleve (Theorem 7.3) raises the intriguing question whether the inclusion $\mathbf{VP}_2 \subseteq \mathbf{VP}_e$ is strict. Allender and Wang [AW16] show that the inclusion is indeed strict; in fact they show that some polynomials cannot be computed by any width-2 abp.

Theorem 7.5 (Allender and Wang [AW16]). The polynomial

\[ x_1x_2 + x_3x_4 + \cdots + x_{15}x_{16} \]

cannot be computed by any width-2 abp. Therefore, we have the separation of classes $\mathbf{VP}_2 \subsetneq \mathbf{VP}_3 = \mathbf{VP}_e$.


7.2.4 Approximation closure $\overline{\mathbf{C}}$

We define the norm of a complex multivariate polynomial as the sum of the absolute values of its coefficients. This defines a topology on the polynomial ring $\mathbb{C}[x_1, \ldots, x_m]$. Given a complexity measure $L$, say abp size or formula size, there is a natural notion of approximate complexity that is called border complexity. Namely, a polynomial $f \in \mathbb{C}[x]$ has border complexity $L^{\mathrm{top}}$ at most $c$ if there is a sequence of polynomials $g_1, g_2, \ldots$ in $\mathbb{C}[x]$ converging to $f$ such that each $g_i$ satisfies $L(g_i) \le c$. It turns out that for reasonable classes over the field of complex numbers $\mathbb{C}$, this topological notion of approximation is equivalent to what we call algebraic approximation (see e.g. [Bür04]). Namely, a polynomial $f \in \mathbb{C}[x]$ satisfies $L(f)^{\mathrm{alg}} \le c$ iff there are polynomials $f_1, \ldots, f_e \in \mathbb{C}[x]$ such that the polynomial

\[ h = f + \varepsilon f_1 + \varepsilon^2 f_2 + \cdots + \varepsilon^e f_e \in \mathbb{C}[\varepsilon, x] \]

has complexity $L_{\mathbb{C}(\varepsilon)}(h) \le c$, where $\varepsilon$ is a formal variable and $L_{\mathbb{C}(\varepsilon)}(h)$ denotes the complexity of $h$ over the field extension $\mathbb{C}(\varepsilon)$. This algebraic notion of approximation makes sense over any base field and we will use it in the statements and proofs of this chapter.

Definition 7.6. Let $\mathbf{C}(\mathbb{F})$ be a class over the field $\mathbb{F}$. We define the approximation closure $\overline{\mathbf{C}}(\mathbb{F})$ as follows: a family $(f_n)$ over $\mathbb{F}$ is in $\overline{\mathbf{C}}(\mathbb{F})$ if there are polynomials $f_{n,i}(x) \in \mathbb{F}[x]$ and a function $e : \mathbb{N} \to \mathbb{N}$ such that the family $(g_n)$ defined by

\[ g_n(x) = f_n(x) + \varepsilon f_{n,1}(x) + \varepsilon^2 f_{n,2}(x) + \cdots + \varepsilon^{e(n)} f_{n,e(n)}(x) \]

is in $\mathbf{C}(\mathbb{F}(\varepsilon))$. We define the poly-approximation closure $\overline{\mathbf{C}}^{\mathrm{poly}}(\mathbb{F})$ similarly, but with the additional requirement that $e(n) \in \operatorname{poly}(n)$. We call $e(n)$ the error degree.

7.2.5 Nondeterminism closure $N(\mathbf{C})$

We introduce the nondeterminism closure for algebraic complexity classes.

Definition 7.7. Let $\mathbf{C}$ be a class. The class $N(\mathbf{C})$ consists of families $(f_n)$ with the following property: there is a family $(g_n) \in \mathbf{C}$ and $p(n), q(n) \in \operatorname{poly}(n)$ such that

\[ f_n(x) = \sum_{b\in\{0,1\}^{p(n)}} g_{q(n)}(b, x), \]

where $x$ and $b$ denote sequences of variables $x_1, x_2, \ldots$ and $b_1, b_2, \ldots, b_{p(n)}$. We say that $f(x)$ is a hypercube sum over $g$ and that $b_1, b_2, \ldots, b_{p(n)}$ are the hypercube variables. For any subscript $x$, we will use the notation $\mathbf{VNP}_x$ to denote $N(\mathbf{VP}_x)$. We remark that the map $\mathbf{C} \mapsto N(\mathbf{C})$ trivially satisfies all properties of being a Kuratowski closure operator, i.e. $N(\emptyset) = \emptyset$, $\mathbf{C} \subseteq N(\mathbf{C})$, $N(\mathbf{C}\cup\mathbf{D}) = N(\mathbf{C})\cup N(\mathbf{D})$ and $N(N(\mathbf{C})) = N(\mathbf{C})$.
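A hypercube sum is simply a sum over all 0/1 assignments to the hypercube variables. The sketch below evaluates such a sum at a point; the polynomial `selector` is a hypothetical example chosen for this illustration, and no claim is made about which complexity class it belongs to.

```python
from itertools import product


def hypercube_sum(g, p, x):
    """Sum g(b, x) over all b in {0, 1}^p."""
    return sum(g(b, x) for b in product((0, 1), repeat=p))


def selector(b, x):
    """Example g(b, x): the product over i of (b_i * x_i + (1 - b_i))."""
    value = 1
    for b_i, x_i in zip(b, x):
        value *= b_i * x_i + (1 - b_i)
    return value


if __name__ == "__main__":
    x = (2, 3, 5)
    # Each b selects a subset of the variables, so the sum equals the product of (1 + x_i).
    print(hypercube_sum(selector, len(x), x))
```

The sum has $2^{p}$ terms, which is why a polynomially sized $g$ can give rise to a much more complex $f$.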


7.3 Approximation closure of $\mathbf{VP}_2$

We show that every polynomial can be approximated by a width-2 abp. Even better, we show that every polynomial can be approximated by a width-2 abp of size polynomial in the formula size and with error degree polynomial in the formula size. This is the main result of the current chapter.

Theorem 7.8. $\mathbf{VP}_e \subseteq \overline{\mathbf{VP}_2}^{\mathrm{poly}}$ when $\operatorname{char}(\mathbb{F}) \neq 2$.

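Proof. For a polynomial $h$ define the matrix $M(h) = \begin{pmatrix} 1 & 0 \\ h & 1 \end{pmatrix}$. We call the following matrices primitives:

• $M(h)$ with $h$ any variable or constant in $\mathbb{F}$,

• $\begin{pmatrix} \frac{1}{2\varepsilon} & 0 \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} -2\varepsilon & 0 \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} -1 & \varepsilon \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} -1 & -\varepsilon \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$.

The entries of the primitives are variables or constants in the base field $\mathbb{F}(\varepsilon)$, making them suitable to use in a width-2 abp over the base field $\mathbb{F}(\varepsilon)$.

Let $(f_n) \in \mathbf{VP}_e$, so $f_n(x)$ can be computed by a formula of size $s(n) \in \operatorname{poly}(n)$. By Brent's depth reduction theorem for formulas (Lemma 7.4), $f_n$ can be computed by a formula of size $\operatorname{poly}(n)$ and depth $d(n) \in O(\log s(n))$.

We will construct a sequence of primitives $A_1, \ldots, A_{m(n)}$ such that

\[ A_1 \cdots A_{m(n)} = \begin{pmatrix} 1 & 0 \\ f_n & 1 \end{pmatrix} + \varepsilon \begin{pmatrix} f_{n,1,11} & f_{n,1,12} \\ f_{n,1,21} & f_{n,1,22} \end{pmatrix} + \varepsilon^2 \begin{pmatrix} f_{n,2,11} & f_{n,2,12} \\ f_{n,2,21} & f_{n,2,22} \end{pmatrix} + \cdots + \varepsilon^e \begin{pmatrix} f_{n,e,11} & f_{n,e,12} \\ f_{n,e,21} & f_{n,e,22} \end{pmatrix} \]

for some $f_{n,i,jk} \in \mathbb{F}[x]$ with $m(n), e(n) \in O(8^{d(n)}) = \operatorname{poly}(n)$. Then

\[ \begin{pmatrix} 1 & 1 \end{pmatrix} \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} A_1 \cdots A_{m(n)} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = f_n(x) + O(\varepsilon), \]

so $f_n(x)$ can be approximated by a width-2 abp of length $\operatorname{poly}(n)$ and with error degree $\operatorname{poly}(n)$, proving the theorem.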

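We begin with the construction. Let $h$ be a polynomial and consider a formula computing $h$ of depth $d$. The goal is to construct, recursively on the tree structure of the formula, a sequence of primitives $A_1, \ldots, A_m$ such that for some $h_{i,jk} \in \mathbb{F}[x]$

\[ A_1 \cdots A_m = \begin{pmatrix} 1 & 0 \\ h & 1 \end{pmatrix} + \varepsilon \begin{pmatrix} 0 & 0 \\ h_{1,21} & 0 \end{pmatrix} + \varepsilon^2 \begin{pmatrix} h_{2,11} & h_{2,12} \\ h_{2,21} & h_{2,22} \end{pmatrix} + \cdots + \varepsilon^e \begin{pmatrix} h_{e,11} & h_{e,12} \\ h_{e,21} & h_{e,22} \end{pmatrix} \tag{7.3} \]

with $m, e \in O(8^d)$. Notice the particular first-degree error pattern in (7.3), which our recursion will rely on.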

73 Approximation closure of VP2 117

Suppose h is a variable or a constant Then M(h) is itself a primitive satisfy-ing (73)

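Suppose $h = f + g$ is a sum of two polynomials $f, g$, and suppose that

\[
F = \begin{pmatrix} 1 & 0 \\ f & 1 \end{pmatrix} + \varepsilon \begin{pmatrix} 0 & 0 \\ f' & 0 \end{pmatrix} + O(\varepsilon^2), \tag{7.4}
\]

\[
G = \begin{pmatrix} 1 & 0 \\ g & 1 \end{pmatrix} + \varepsilon \begin{pmatrix} 0 & 0 \\ g' & 0 \end{pmatrix} + O(\varepsilon^2) \tag{7.5}
\]

are products of primitives for some $f', g' \in \mathbb{F}[x]$. Then

\[
G \cdot F = \begin{pmatrix} 1 & 0 \\ f + g & 1 \end{pmatrix} + \varepsilon \begin{pmatrix} 0 & 0 \\ f' + g' & 0 \end{pmatrix} + O(\varepsilon^2)
\]

is a product of primitives satisfying (7.3).

Suppose $h = fg$ is a product of two polynomials, and suppose that $F$ and $G$ are of the form (7.4) and (7.5) and are products of primitives. We will construct $M((f+g)^2)$, $M(-f^2)$, $M(-g^2)$ approximately, in such a way that when we use the identity $(f+g)^2 - f^2 - g^2 = 2fg$ the error terms cancel properly. Define the expressions $\mathrm{sq}_+(A)$ and $\mathrm{sq}_-(A)$ by

\[
\mathrm{sq}_\pm(A) = \begin{pmatrix} -\varepsilon & 0 \\ 0 & 1 \end{pmatrix} \cdot A \cdot \begin{pmatrix} -1 & \pm\varepsilon \\ 0 & 1 \end{pmatrix} \cdot A \cdot \begin{pmatrix} \frac{1}{\varepsilon} & 0 \\ 0 & 1 \end{pmatrix}.
\]

Then

\[
\mathrm{sq}_\pm(F) = \begin{pmatrix} 1 \mp \varepsilon f & 0 \\ \pm f^2 + O(\varepsilon) & 1 \pm \varepsilon f \end{pmatrix} + O(\varepsilon^2).
\]

We have

\[
\mathrm{sq}_-(G) \cdot \mathrm{sq}_-(F) \cdot \mathrm{sq}_+(G \cdot F)
= \begin{pmatrix} 1 + \varepsilon g & 0 \\ -g^2 + O(\varepsilon) & 1 - \varepsilon g \end{pmatrix}
\cdot \begin{pmatrix} 1 + \varepsilon f & 0 \\ -f^2 + O(\varepsilon) & 1 - \varepsilon f \end{pmatrix}
\cdot \begin{pmatrix} 1 - \varepsilon (f+g) & 0 \\ (f+g)^2 + O(\varepsilon) & 1 + \varepsilon (f+g) \end{pmatrix}
+ O(\varepsilon^2),
\]

which simplifies to

\[
\mathrm{sq}_-(G) \cdot \mathrm{sq}_-(F) \cdot \mathrm{sq}_+(G \cdot F) = \begin{pmatrix} 1 & 0 \\ 2fg + O(\varepsilon) & 1 \end{pmatrix} + O(\varepsilon^2).
\]

We conclude

\[
\begin{aligned}
&\begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix} \cdot \mathrm{sq}_-(G) \cdot \mathrm{sq}_-(F) \cdot \mathrm{sq}_+(G \cdot F) \cdot \begin{pmatrix} \frac{1}{2} & 0 \\ 0 & 1 \end{pmatrix} \\
&\quad = \begin{pmatrix} -2\varepsilon & 0 \\ 0 & 1 \end{pmatrix} \cdot G \cdot \begin{pmatrix} -1 & -\varepsilon \\ 0 & 1 \end{pmatrix} \cdot G \cdot \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \cdot F \cdot \begin{pmatrix} -1 & -\varepsilon \\ 0 & 1 \end{pmatrix} \cdot F \\
&\qquad \cdot \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \cdot G \cdot F \cdot \begin{pmatrix} -1 & \varepsilon \\ 0 & 1 \end{pmatrix} \cdot G \cdot F \cdot \begin{pmatrix} \frac{1}{2\varepsilon} & 0 \\ 0 & 1 \end{pmatrix} \\
&\quad = \begin{pmatrix} 1 & 0 \\ fg + O(\varepsilon) & 1 \end{pmatrix} + O(\varepsilon^2).
\end{aligned}
\]

Here adjacent diagonal matrices have been merged, for example $\begin{pmatrix} \frac{1}{\varepsilon} & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} -\varepsilon & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$, so that every factor is a primitive or one of $F$, $G$.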
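The product step can be checked mechanically. The following Python sketch multiplies out the product of primitives and copies of $F$ and $G$ displayed above, storing every matrix entry as a Laurent polynomial in $\varepsilon$ with rational coefficients. It is only an illustration under assumptions that are not part of the proof: the polynomials $f, f', g, g'$ are replaced by sample rational numbers, the $O(\varepsilon^2)$ parts of $F$ and $G$ are taken to be zero, and all helper names (`lp`, `shear`, `multiply_gadget`, and so on) are hypothetical.

```python
from fractions import Fraction

# A Laurent polynomial in eps is stored as {exponent: coefficient}.
def lp(*terms):
    """Build a Laurent polynomial from (exponent, coefficient) pairs."""
    out = {}
    for exponent, coeff in terms:
        out[exponent] = out.get(exponent, Fraction(0)) + Fraction(coeff)
    return {e: c for e, c in out.items() if c != 0}

def lp_add(p, q):
    return lp(*p.items(), *q.items())

def lp_mul(p, q):
    return lp(*((e1 + e2, c1 * c2)
                for e1, c1 in p.items() for e2, c2 in q.items()))

def mat_mul(A, B):
    """Product of two 2x2 matrices with Laurent polynomial entries."""
    return [[lp_add(lp_mul(A[i][0], B[0][j]), lp_mul(A[i][1], B[1][j]))
             for j in range(2)] for i in range(2)]

def chain(*mats):
    """Left-to-right product of several 2x2 matrices."""
    result = mats[0]
    for M in mats[1:]:
        result = mat_mul(result, M)
    return result

ZERO, ONE = lp(), lp((0, 1))

def diag(entry):
    """The matrix (entry 0; 0 1)."""
    return [[entry, ZERO], [ZERO, ONE]]

def shear(sign):
    """The primitive (-1  sign*eps; 0  1)."""
    return [[lp((0, -1)), lp((1, sign))], [ZERO, ONE]]

def approx_M(h, h1):
    """(1 0; h 1) + eps*(0 0; h1 0): shape (7.4)/(7.5), zero O(eps^2) part."""
    return [[ONE, ZERO], [lp((0, h), (1, h1)), ONE]]

def multiply_gadget(F, G):
    """The product of primitives and copies of F and G from the product step."""
    GF = mat_mul(G, F)
    return chain(
        diag(lp((1, -2))), G, shear(-1), G,
        diag(lp((0, -1))), F, shear(-1), F,
        diag(lp((0, -1))), GF, shear(+1), GF,
        diag(lp((-1, Fraction(1, 2)))),
    )

def coeff(p, exponent):
    """Coefficient of eps**exponent in the Laurent polynomial p."""
    return p.get(exponent, Fraction(0))

# Sample numbers standing in for the polynomials f, f', g, g' (assumption).
f, f1, g, g1 = 3, 5, 7, 11
P = multiply_gadget(approx_M(f, f1), approx_M(g, g1))
```

Under these assumptions the entries of `P` contain no negative powers of $\varepsilon$, the constant term of the lower-left entry is `f * g`, and the other three entries agree with the identity matrix up to $O(\varepsilon^2)$, which is the error pattern required by (7.3).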

This completes the construction.

The length $m$ of the construction is $m(h) = 1$ for $h$ a variable or constant, and recursively $m(f + g) = m(f) + m(g)$, $m(f \cdot g) = 4(m(f) + m(g)) + 7$. We conclude $m \in O(8^d)$. The error degree $e$ of the construction satisfies the same recursion, so $e \in O(8^d)$.
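To make the bound concrete, the length recursion can be evaluated on explicit formulas. The sketch below is a hypothetical illustration: formulas are encoded as nested tuples (`("leaf", name)`, `("+", left, right)`, `("*", left, right)`), an encoding chosen here and not taken from the text. For a complete tree of product gates of depth $d$ the recursion becomes $m_d = 8 m_{d-1} + 7$ with $m_0 = 1$, which solves to $m_d = 2 \cdot 8^d - 1$.

```python
def length(formula):
    """Number of matrices m(h) produced by the construction for a formula h."""
    if formula[0] == "leaf":
        return 1
    op, left, right = formula
    if op == "+":
        return length(left) + length(right)
    if op == "*":
        return 4 * (length(left) + length(right)) + 7
    raise ValueError(f"unknown gate {op!r}")

def depth(formula):
    """Depth of the formula tree; leaves have depth 0."""
    if formula[0] == "leaf":
        return 0
    return 1 + max(depth(formula[1]), depth(formula[2]))

def full_product_tree(d):
    """Complete binary tree of depth d whose inner gates are all products."""
    if d == 0:
        return ("leaf", "x")
    sub = full_product_tree(d - 1)
    return ("*", sub, sub)

# A small mixed example: (x * y) + z.
mixed = ("+", ("*", ("leaf", "x"), ("leaf", "y")), ("leaf", "z"))
```

Sum gates only add lengths, so the all-product tree is the extreme case at a given depth, and the same induction gives $m \le 2 \cdot 8^d - 1$ for every formula of depth $d$.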

Remark 7.9. The construction in the above proof of Theorem 7.8 is different from the construction in our paper [BIZ17]. The recursion in the above proof is simpler, while the construction in [BIZ17] has a better error degree and has a special form which relates it to a family of polynomials called continuants.

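Corollary 7.10. $\overline{\mathrm{VP}_2} = \overline{\mathrm{VP}_e}$ and $\overline{\mathrm{VP}_2}^{\mathrm{poly}} = \overline{\mathrm{VP}_e}^{\mathrm{poly}}$ when $\mathrm{char}(\mathbb{F}) \neq 2$.

Proof. We have $\mathrm{VP}_2 \subseteq \mathrm{VP}_e$ by Lemma 7.2. Taking closures on both sides, we obtain $\overline{\mathrm{VP}_2} \subseteq \overline{\mathrm{VP}_e}$ and $\overline{\mathrm{VP}_2}^{\mathrm{poly}} \subseteq \overline{\mathrm{VP}_e}^{\mathrm{poly}}$. When $\mathrm{char}(\mathbb{F}) \neq 2$, $\mathrm{VP}_e \subseteq \overline{\mathrm{VP}_2}^{\mathrm{poly}}$ (Theorem 7.8). Taking closures, it follows that $\overline{\mathrm{VP}_e} \subseteq \overline{\mathrm{VP}_2}$ and $\overline{\mathrm{VP}_e}^{\mathrm{poly}} \subseteq \overline{\mathrm{VP}_2}^{\mathrm{poly}}$.

Corollary 7.11. $\overline{\mathrm{VP}_2}^{\mathrm{poly}} = \mathrm{VP}_e$ when $\mathrm{char}(\mathbb{F}) \neq 2$ and $\mathbb{F}$ is infinite.

Proof. By Corollary 7.10, $\overline{\mathrm{VP}_2}^{\mathrm{poly}} = \overline{\mathrm{VP}_e}^{\mathrm{poly}}$. We prove $\overline{\mathrm{VP}_e}^{\mathrm{poly}} = \mathrm{VP}_e$ in Lemma 7.12 below.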

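Lemma 7.12. $\overline{\mathrm{VP}_e}^{\mathrm{poly}} = \mathrm{VP}_e$ when $\mathrm{char}(\mathbb{F}) \neq 2$ and $\mathbb{F}$ is infinite.

Proof. The inclusion $\mathrm{VP}_e \subseteq \overline{\mathrm{VP}_e}^{\mathrm{poly}}$ is trivially true. We prove the other direction. Let $(f_n) \in \overline{\mathrm{VP}_e}^{\mathrm{poly}}$. Then there are polynomials $f_{ni}(x) \in \mathbb{F}[x]$ and $e(n) \in \mathrm{poly}(n)$ such that

\[
f_n(x) + \varepsilon f_{n1}(x) + \varepsilon^2 f_{n2}(x) + \cdots + \varepsilon^{e(n)} f_{ne(n)}(x)
\]

is computed by a poly-size formula $\Gamma$ over $\mathbb{F}(\varepsilon)$. Let $\alpha_0, \alpha_1, \ldots, \alpha_{e(n)}$ be distinct elements in $\mathbb{F}$ such that replacing $\varepsilon$ by $\alpha_j$ in $\Gamma$ is a valid substitution, i.e., not causing division by zero. These $\alpha_j$ exist since our field is infinite by assumption. View

\[
g_n(\varepsilon) = f_n(x) + \varepsilon f_{n1}(x) + \varepsilon^2 f_{n2}(x) + \cdots + \varepsilon^{e(n)} f_{ne(n)}(x)
\]

as a polynomial in $\varepsilon$. The polynomial $g_n(\varepsilon)$ has degree at most $e(n)$, so we can write $g_n(\varepsilon)$ as follows (Lagrange interpolation on $e(n) + 1$ points):

\[
g_n(\varepsilon) = \sum_{j=0}^{e(n)} g_n(\alpha_j) \prod_{\substack{0 \le m \le e(n) \\ m \neq j}} \frac{\varepsilon - \alpha_m}{\alpha_j - \alpha_m}. \tag{7.6}
\]

Clearly, $f_n(x) = g_n(0)$. However, replacing $\varepsilon$ by $0$ in $\Gamma$ is not a valid substitution in general. From (7.6) we see directly how to write $g_n(0)$ as a linear combination of the values $g_n(\alpha_j)$, namely

\[
g_n(0) = \sum_{j=0}^{e(n)} g_n(\alpha_j) \prod_{\substack{0 \le m \le e(n) \\ m \neq j}} \frac{-\alpha_m}{\alpha_j - \alpha_m},
\]

that is,

\[
g_n(0) = \sum_{j=0}^{e(n)} \beta_j\, g_n(\alpha_j) \quad \text{with} \quad \beta_j = \prod_{\substack{0 \le m \le e(n) \\ m \neq j}} \frac{\alpha_m}{\alpha_m - \alpha_j}.
\]

The value $g_n(\alpha_j)$ is computed by the formula $\Gamma$ with $\varepsilon$ replaced by $\alpha_j$, which we denote by $\Gamma|_{\varepsilon = \alpha_j}$. Thus $f_n(x)$ is computed by the poly-size formula $\sum_{j=0}^{e(n)} \beta_j\, \Gamma|_{\varepsilon = \alpha_j}$. We conclude $(f_n) \in \mathrm{VP}_e$.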
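The interpolation step in this proof can be illustrated numerically. The following Python sketch works over the rationals and treats the formula $\Gamma$ as a black box that may be evaluated at nonzero points only; the function names and the toy formula `gamma` are assumptions made for this illustration and do not come from the text.

```python
from fractions import Fraction

def interpolation_weights(alphas):
    """beta_j = product over m != j of alpha_m / (alpha_m - alpha_j)."""
    betas = []
    for j, a_j in enumerate(alphas):
        beta = Fraction(1)
        for m, a_m in enumerate(alphas):
            if m != j:
                beta *= Fraction(a_m) / (Fraction(a_m) - Fraction(a_j))
        betas.append(beta)
    return betas

def value_at_zero(black_box, degree_bound):
    """Recover g(0) from g(alpha_0), ..., g(alpha_e) at nonzero alpha_j."""
    alphas = [Fraction(j) for j in range(1, degree_bound + 2)]
    betas = interpolation_weights(alphas)
    return sum(b * black_box(a) for b, a in zip(betas, alphas))

def gamma(eps):
    """Toy formula over F(eps): equals 7 + 3*eps - 5*eps**2 for eps != 0,
    but substituting eps = 0 directly divides by zero."""
    return (7 * eps + 3 * eps ** 2 - 5 * eps ** 3) / eps
```

Here `value_at_zero(gamma, 2)` combines three evaluations of `gamma` at nonzero points into the constant coefficient, whereas calling `gamma(Fraction(0))` raises `ZeroDivisionError`. Any degree bound of at least the true degree gives the same value.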

Remark 7.13. The statement of Lemma 7.12 also holds with $\mathrm{VP}_e$ replaced with $\mathrm{VP}_s$ or with $\mathrm{VP}$, by a similar proof.

7.4 Nondeterminism closure of $\mathrm{VP}_1$

Recall the definition of $\mathrm{VNP}_x = N(\mathrm{VP}_x)$ from Definition 7.7. Valiant proved the following characterisation of $\mathrm{VNP}$ in his seminal work [Val80]. See also [BCS97, Thm. 21.26], [Bur00, Thm. 2.13] and [MP08, Thm. 2].

Theorem 7.14 (Valiant [Val80]). $\mathrm{VNP}_e = \mathrm{VNP}$.

We strengthen Valiant's characterisation of $\mathrm{VNP}$ from $\mathrm{VNP}_e$ to $\mathrm{VNP}_1$.

Theorem 7.15. $\mathrm{VNP}_1 = \mathrm{VNP}$ when $\mathrm{char}(\mathbb{F}) \neq 2$.

The idea of the proof is "to simulate in $\mathrm{VNP}_1$" the primitives that we used in the proof of $\mathrm{VP}_e \subseteq \mathrm{VP}_3$ (Theorem 7.3).

Proof of Theorem 7.15. Clearly $\mathrm{VNP}_1 \subseteq \mathrm{VNP}$, by Lemma 7.2 and taking the nondeterminism closure $N$. We will prove that $\mathrm{VNP} \subseteq \mathrm{VNP}_1$. Recall that in the proof of $\mathrm{VP}_e \subseteq \mathrm{VP}_3$ (Theorem 7.3) we defined for any polynomial $h$ the matrix

\[
M(h) = \begin{pmatrix} 1 & 0 & 0 \\ h & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\]

and we called the following matrices primitives:

• $M(h)$ with $h$ any variable or any constant in $\mathbb{F}$;

• the $3 \times 3$ permutation matrices, denoted by $M_\pi$ for $\pi \in S_3$;

• the diagonal matrices $M_{a,b,c} = \mathrm{diag}(a, b, c)$ with $a, b, c \in \mathbb{F}$.

In the proof of $\mathrm{VP}_e \subseteq \mathrm{VP}_3$ we constructed, for any family $(f_n) \in \mathrm{VP}_e$, a sequence of primitive matrices $A_{n,1}, \ldots, A_{n,t(n)}$ with $t(n) \in \mathrm{poly}(n)$ such that

\[
f_n(x) = \begin{pmatrix} 1 & 1 & 1 \end{pmatrix} M_{-1,1,0}\, A_{n,1} \cdots A_{n,t(n)} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}. \tag{7.7}
\]

We will show $\mathrm{VP}_e \subseteq \mathrm{VNP}_1$ by constructing a hypercube sum over a width-1 abp that evaluates the right-hand side of (7.7). This implies $\mathrm{VNP}_e \subseteq \mathrm{VNP}_1$ by taking the $N$-closure. Then by Valiant's Theorem 7.14, $\mathrm{VNP} \subseteq \mathrm{VNP}_1$.

Let $f(x)$ be a polynomial and let $A_1, \ldots, A_k$ be primitive matrices such that $f(x)$ is computed as

\[
f(x) = \begin{pmatrix} 1 & 1 & 1 \end{pmatrix} A_k \cdots A_1 \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}.
\]

View this expression as a width-3 abp $G$, with vertex layers labeled as shown in the left-hand diagram in Fig. 7.1. Assume for simplicity that all edges between layers are present, possibly with label 0. The sum of the values of every $s$–$t$ path in $G$ equals $f(x)$:

\[
f(x) = \sum_{j \in [3]^k} A_k[j_k, j_{k-1}] \cdots A_1[j_2, j_1]. \tag{7.8}
\]

We introduce some hypercube variables. To every vertex of $G$, except $s$ and $t$, we associate a bit; the bits in the $i$th layer we call $b_1[i], b_2[i], b_3[i]$. To an $s$–$t$ path in $G$ we associate an assignment of the $b_j[i]$ by setting the bits of vertices visited by the path to 1 and the others to 0. For example, in the right-hand diagram in Fig. 7.1 we show an $s$–$t$ path with the corresponding assignment of the bits $b_j[i]$. The assignments of the $b_j[i]$ corresponding to $s$–$t$ paths are precisely the assignments such that for every $i \in [k]$ exactly one of $b_1[i], b_2[i], b_3[i]$ equals 1.

Figure 7.1: Illustration of the layer labelling and the path labelling used in the proof of Theorem 7.15. (Left: the layers $s, 0, 1, 2, \ldots, k-1, k, t$ with the matrices $A_1, A_2, \ldots, A_k$. Right: an $s$–$t$ path with bit rows 100, 010, 010, 001, 010.)

Let

\[
V(b_1, b_2, b_3) = \prod_{i \in [k]} \bigl( b_1[i] + b_2[i] + b_3[i] \bigr) \prod_{\substack{s, t \in [3] \\ s \neq t}} \bigl( 1 - b_s[i]\, b_t[i] \bigr). \tag{7.9}
\]

Then the assignments of the $b_j[i]$ corresponding to $s$–$t$ paths are precisely the assignments such that $V(b_1, b_2, b_3) = 1$. Otherwise, $V(b_1, b_2, b_3) = 0$.
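The claim that $V$ is the indicator of the assignments coming from paths can be checked by brute force for small $k$. The Python sketch below is an illustration with hypothetical names; a layer is represented as a triple of bits $(b_1[i], b_2[i], b_3[i])$, and the inner product runs over ordered pairs $s \neq t$ as in (7.9).

```python
from itertools import product

def V(layers):
    """Evaluate (7.9) on a list of bit triples, one triple per layer."""
    value = 1
    for b in layers:
        value *= b[0] + b[1] + b[2]
        for s in range(3):
            for t in range(3):
                if s != t:
                    value *= 1 - b[s] * b[t]
    return value

def count_valid(k):
    """Sum of V over all assignments of the 3k bits."""
    total = 0
    for bits in product((0, 1), repeat=3 * k):
        layers = [bits[3 * i: 3 * i + 3] for i in range(k)]
        total += V(layers)
    return total
```

Since $V$ takes the value 1 exactly when every layer has one bit set, `count_valid(k)` counts the $3^k$ ways of choosing one vertex per layer.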

We will write $f(x)$ as a hypercube sum by replacing each $A_i[j_i, j_{i-1}]$ in (7.8) by a product of affine linear forms $S_i(A_i)$ with variables $b$ and $x$:

\[
\sum_b V(b_1, b_2, b_3)\, S_k(A_k) \cdots S_1(A_1).
\]

Define the expression $\mathrm{Eq}(\alpha, \beta) = (1 - \alpha - \beta)(1 - \alpha - \beta)$ for $\alpha, \beta \in \{0, 1\}$. The expression $\mathrm{Eq}(\alpha, \beta)$ evaluates to 1 if $\alpha$ equals $\beta$ and evaluates to 0 otherwise.

• For any variable or constant $x$ define

\[
S_i(M(x)) = \bigl( 1 + (x - 1)(b_1[i-1] - b_1[i]) \bigr) \cdot \bigl( 1 - (1 - b_2[i])\, b_2[i-1] \bigr) \cdot \mathrm{Eq}\bigl( b_3[i-1], b_3[i] \bigr).
\]

• For any permutation $\pi \in S_3$ define

\[
S_i(M_\pi) = \mathrm{Eq}\bigl( b_1[i-1], b_{\pi(1)}[i] \bigr) \cdot \mathrm{Eq}\bigl( b_2[i-1], b_{\pi(2)}[i] \bigr) \cdot \mathrm{Eq}\bigl( b_3[i-1], b_{\pi(3)}[i] \bigr).
\]

• For any constants $a, b, c \in \mathbb{F}$ define

\[
S_i(M_{a,b,c}) = \bigl( a \cdot b_1[i-1] + b \cdot b_2[i-1] + c \cdot b_3[i-1] \bigr) \cdot \mathrm{Eq}\bigl( b_1[i-1], b_1[i] \bigr) \cdot \mathrm{Eq}\bigl( b_2[i-1], b_2[i] \bigr) \cdot \mathrm{Eq}\bigl( b_3[i-1], b_3[i] \bigr).
\]

On an assignment coming from a path, $S_i(A_i)$ evaluates to the entry of $A_i$ whose column is the vertex visited in layer $i-1$ and whose row is the vertex visited in layer $i$. One verifies that

\[
f(x) = \sum_b V(b_1, b_2, b_3)\, S_k(A_k) \cdots S_1(A_1).
\]
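The verification can also be carried out by exhaustive enumeration for a short chain of primitives. The Python sketch below compares the hypercube sum with the matrix expression $(1\ 1\ 1)\, A_k \cdots A_1\, (1\ 1\ 1)^T$. It rests on assumptions made for this illustration: layers are indexed $0, \ldots, k$ and $V$ constrains every one of them, the first factor of $S_i(M(x))$ is taken as $1 + (x-1)(b_1[i-1] - b_1[i])$ (the orientation for which an enumeration of the nine transitions reproduces the entries of $M(x)$), variables are replaced by integers, and the tuple encoding of primitives and all function names are hypothetical.

```python
from itertools import product

def eq(a, b):
    return (1 - a - b) * (1 - a - b)

def V(layers):
    """Indicator that every layer has exactly one bit set, as in (7.9)."""
    value = 1
    for b in layers:
        value *= b[0] + b[1] + b[2]
        for s in range(3):
            for t in range(3):
                if s != t:
                    value *= 1 - b[s] * b[t]
    return value

def S(primitive, prev, cur):
    """S_i(A_i) on the bits of layer i-1 (prev) and layer i (cur)."""
    kind, data = primitive
    if kind == "M":      # M(x) = (1 0 0; x 1 0; 0 0 1), data = x
        return ((1 + (data - 1) * (prev[0] - cur[0]))
                * (1 - (1 - cur[1]) * prev[1])
                * eq(prev[2], cur[2]))
    if kind == "perm":   # data[j] = pi(j), 0-based
        return (eq(prev[0], cur[data[0]]) * eq(prev[1], cur[data[1]])
                * eq(prev[2], cur[data[2]]))
    if kind == "diag":   # data = (a, b, c)
        return ((data[0] * prev[0] + data[1] * prev[1] + data[2] * prev[2])
                * eq(prev[0], cur[0]) * eq(prev[1], cur[1]) * eq(prev[2], cur[2]))
    raise ValueError(kind)

def as_matrix(primitive):
    kind, data = primitive
    if kind == "M":
        return [[1, 0, 0], [data, 1, 0], [0, 0, 1]]
    if kind == "perm":   # maps basis vector e_col to e_{pi(col)}
        return [[1 if data[col] == row else 0 for col in range(3)]
                for row in range(3)]
    if kind == "diag":
        return [[data[r] if r == c else 0 for c in range(3)] for r in range(3)]
    raise ValueError(kind)

def matrix_value(chain):
    """(1 1 1) A_k ... A_1 (1 1 1)^T for chain = [A_1, ..., A_k]."""
    P = [[int(r == c) for c in range(3)] for r in range(3)]
    for A in map(as_matrix, chain):
        P = [[sum(A[r][m] * P[m][c] for m in range(3)) for c in range(3)]
             for r in range(3)]
    return sum(sum(row) for row in P)

def hypercube_sum(chain):
    """Sum over all bit assignments of V times the product of the S_i."""
    k = len(chain)
    total = 0
    for bits in product((0, 1), repeat=3 * (k + 1)):
        layers = [bits[3 * i: 3 * i + 3] for i in range(k + 1)]
        term = V(layers)
        for i, primitive in enumerate(chain, start=1):
            term *= S(primitive, layers[i - 1], layers[i])
        total += term
    return total

example = [("M", 5), ("perm", (1, 2, 0)), ("M", -3), ("diag", (-1, 1, 0))]
```

For the chain `example` both `hypercube_sum` and `matrix_value` run over the same nonzero path weights, so the two functions return the same number; the enumeration visits $2^{15}$ assignments.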

Some of the factors in the expressions for the $S_i(A_i)$ are not affine linear. As a final step we apply the equality

\[
1 + xy = \frac{1}{2} \sum_{c \in \{0,1\}} (x + 1 - 2c)(y + 1 - 2c)
\]

to write these factors as products of affine linear forms, introducing new hypercube variables.
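The equality used in this final step is easy to confirm on sample values. The Python sketch below is a small illustration with a hypothetical function name; it uses exact rational arithmetic, and the factor $\frac{1}{2}$ is where the assumption $\mathrm{char}(\mathbb{F}) \neq 2$ enters.

```python
from fractions import Fraction
from itertools import product

def as_hypercube_sum(x, y):
    """(1/2) * sum over c in {0, 1} of (x + 1 - 2c)(y + 1 - 2c)."""
    return Fraction(1, 2) * sum((x + 1 - 2 * c) * (y + 1 - 2 * c)
                                for c in (0, 1))

# All pairs on a small integer grid, used to compare against 1 + x*y.
grid = list(product(range(-3, 4), repeat=2))
```

For instance, a factor $1 - b_s[i]\, b_t[i]$ of $V$ is obtained by taking $x = -b_s[i]$ and $y = b_t[i]$, which turns it into a sum over one new hypercube variable $c$ of a product of two affine linear forms.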

7.5 Conclusion

We finish with an overview of inclusions, equalities and separations among the classes $\mathrm{VP}_k$, $\mathrm{VP}_e$, $\mathrm{VP}$ and their approximation and nondeterminism closures (when $\mathrm{char}(\mathbb{F}) \neq 2$); see Fig. 7.2. The figure relies on the following two simple lemmas, of which proofs can be found in our paper [BIZ17].

Lemma 7.16 ([BIZ17, Prop. 5.10]). $\overline{\mathrm{VP}_1} = \mathrm{VP}_1$.

Lemma 7.17 ([BIZ17, Prop. 5.11]). $\mathrm{VP}_1 \subsetneq \mathrm{VNP}_1$ when $\mathrm{char}(\mathbb{F}) \neq 2$.

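[Diagram: three rows of classes, namely $\mathrm{VP}_1, \mathrm{VP}_2, \mathrm{VP}_e, \mathrm{VP}$, their approximation closures $\overline{\mathrm{VP}_1}, \overline{\mathrm{VP}_2}, \overline{\mathrm{VP}_e}, \overline{\mathrm{VP}}$, and $\mathrm{VNP}_1, \mathrm{VNP}_2, \mathrm{VNP}_e, \mathrm{VNP}$, connected by the relations $=$, $\subseteq$ and $\subsetneq$, with annotations 7.10, 7.15, 7.16, 7.17, [AW16], [Val79] and [Val80].]

Figure 7.2: Overview of relations among the algebraic complexity classes $\mathrm{VP}_k$, $\mathrm{VP}_e$, $\mathrm{VP}$ and their approximation and nondeterminism closures (when $\mathrm{char}(\mathbb{F})$ is not 2). The relations without reference are either by definition or follow logically from the other relations.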

Bibliography

[AJRS13] Elizabeth S Allman Peter D Jarvis John A Rhodes andJeremy G Sumner Tensor rank invariants inequalities andapplications SIAM J Matrix Anal Appl 34(3)1014ndash1045 2013doi101137120899066 p 14

[Alo98] Noga Alon The Shannon capacity of a union Combinatorica18(3)301ndash310 1998 doi101007PL00009824 p 37

[ASU13] Noga Alon Amir Shpilka and Christopher Umans On sunflowersand matrix multiplication Comput Complexity 22(2)219ndash243Jun 2013 doi101007s00037-013-0060-1 p 48

[AW16] Eric Allender and Fengming Wang. On the power of algebraic branching programs of width two. Comput. Complexity, 25(1):217–253, 2016. doi:10.1007/s00037-015-0114-7. p. 17, 109, 114, 123.

[AZ14] Martin Aigner and Gunter M Ziegler Proofs from The BookSpringer-Verlag Berlin fifth edition 2014doi101007978-3-662-44205-0 p 71

[BC18] Boris Bukh and Christopher Cox On a fractional version ofHaemersrsquo bound arXiv 2018 arXiv180200476 p 41 42

[BCC+17] Jonah Blasiak Thomas Church Henry Cohn Joshua A GrochowEric Naslund William F Sawin and Chris Umans On cap setsand the group-theoretic approach to matrix multiplication DiscreteAnal 2017 arXiv160506702 doi1019086da1245 p 4883 84 104


[BCPZ16] Harry Buhrman Matthias Christandl Christopher Perry andJeroen Zuiddam Clean quantum and classical communicationprotocols Phys Rev Lett 117230503 Dec 2016doi101103PhysRevLett117230503 p 1

[BCRL79] Dario Bini Milvio Capovani Francesco Romani and Grazia LottiO(n27799) complexity for ntimes n approximate matrix multiplicationInf Process Lett 8(5)234ndash235 1979doi1010160020-0190(79)90113-3 p 3 110

[BCS97] Peter Bürgisser, Michael Clausen, and M. Amin Shokrollahi. Algebraic complexity theory, volume 315 of Grundlehren Math. Wiss. Springer-Verlag, Berlin, 1997. doi:10.1007/978-3-662-03338-8. p. 4, 6, 48, 50, 66, 79, 119.

[BCSX10] Arnab Bhattacharyya Victor Chen Madhu Sudan and Ning XieTesting Linear-Invariant Non-linear Properties A Short Reportpages 260ndash268 Springer Berlin Heidelberg Berlin Heidelberg2010 doi101007978-3-642-16367-8_18 p 48

[BCZ17a] Markus Blaser Matthias Christandl and Jeroen Zuiddam Theborder support rank of two-by-two matrix multiplication is sevenarXiv 2017 arXiv170509652 p 1 15

[BCZ17b] Harry Buhrman Matthias Christandl and Jeroen ZuiddamNondeterministic Quantum Communication Complexity the CyclicEquality Game and Iterated Matrix Multiplication In Christos HPapadimitriou editor 8th Innovations in Theoretical ComputerScience Conference (ITCS 2017) pages 241ndash2418 2017arXiv160303757 doi104230LIPIcsITCS201724 p 115

[Ber84] Stuart J Berkowitz On computing the determinant in smallparallel time using a small number of processors Inform ProcessLett 18(3)147ndash150 1984 doi1010160020-0190(84)90018-8p 108

[BI13] Peter Burgisser and Christian Ikenmeyer Explicit lower bounds viageometric complexity theory Proceedings 45th Annual ACMSymposium on Theory of Computing 2013 pages 141ndash150 2013doi10114524886082488627 p 108

[Bin80] Dario Bini Relations between exact and approximate bilinearalgorithms Applications Calcolo 17(1)87ndash97 1980doi101007BF02575865 p 3


[BIZ17] Karl Bringmann, Christian Ikenmeyer, and Jeroen Zuiddam. On Algebraic Branching Programs of Small Width. In Ryan O'Donnell, editor, 32nd Computational Complexity Conference (CCC 2017), pages 20:1–20:31, 2017. doi:10.4230/LIPIcs.CCC.2017.20. p. 1, 107, 111, 112, 118, 122.

[Bla13] Anna Blasiak A graph-theoretic approach to network coding PhDthesis Cornell University 2013 URL httpsecommonscornelledubitstreamhandle181334147ab675pdf p 42

[BLMW11] Peter Burgisser Joseph M Landsberg Laurent Manivel and JerzyWeyman An overview of mathematical issues arising in thegeometric complexity theory approach to VP 6= VNP SIAM JComput 40(4)1179ndash1209 2011 doi101137090765328 p 108

[BOC92] Michael Ben-Or and Richard Cleve Computing algebraic formulasusing a constant number of registers SIAM J Comput21(1)54ndash58 1992 doi1011370221006 p 17 109 112

[BPR+00] Charles H Bennett Sandu Popescu Daniel Rohrlich John ASmolin and Ashish V Thapliyal Exact and asymptotic measuresof multipartite pure-state entanglement Phys Rev A63(1)012307 2000 doi101103PhysRevA63012307 p 48

[Bre74] Richard P Brent The parallel evaluation of general arithmeticexpressions J ACM 21(2)201ndash206 April 1974doi101145321812321815 p 112

[Bri87] Michel Brion Sur lrsquoimage de lrsquoapplication moment In Seminairedrsquoalgebre Paul Dubreil et Marie-Paule Malliavin (Paris 1986)volume 1296 of Lecture Notes in Math pages 177ndash192 SpringerBerlin 1987 doi101007BFb0078526 p 9 93 94

[BS83] Eberhard Becker and Niels and Schwartz Zum Darstellungssatzvon Kadison-Dubois Arch Math (Basel) 40(5)421ndash428 1983doi101007BF01192806 p 7 12 33

[Bur90] Peter Burgisser Degenerationsordnung und Tragerfunktionalbilinearer Abbildungen PhD thesis Universitat Konstanz 1990httpnbn-resolvingdeurnnbndebsz352-opus-20311p 57 101

[Bur00] Peter Bürgisser. Completeness and reduction in algebraic complexity theory, volume 7 of Algorithms and Computation in Mathematics. Springer-Verlag, Berlin, 2000. doi:10.1007/978-3-662-04179-6. p. 119.


[Bur04] Peter Burgisser The complexity of factors of multivariatepolynomials Found Comput Math 4(4)369ndash396 2004doi101007s10208-002-0059-5 p 110 115

[BX15] Arnab Bhattacharyya and Ning Xie Lower bounds for testingtriangle-freeness in boolean functions Comput Complexity24(1)65ndash101 2015 doi101007s00037-014-0092-1 p 48

[BZ17] Jop Briet and Jeroen Zuiddam On the orthogonal rank of Cayleygraphs and impossibility of quantum round elimination QuantumInf Comput 17(1amp2) 2017 URL httpwwwrintonpresscomxxqic17qic-17-120106-0116pdfarXiv160806113 p 2

[CHM07] Matthias Christandl Aram W Harrow and Graeme MitchisonNonzero Kronecker coefficients and what they tell us about spectraComm Math Phys 270(3)575ndash585 2007doi101007s00220-006-0157-3 p 90

[CJZ18] Matthias Christandl Asger Kjaeligrulff Jensen and Jeroen ZuiddamTensor rank is not multiplicative under the tensor product LinearAlgebra Appl 543125ndash139 2018doi101016jlaa201712020 p 2 15

[CKSV16] Suryajith Chillara Mrinal Kumar Ramprasad Saptharishi andV Vinay The chasm at depth four and tensor rank Old resultsnew insights arXiv 2016 arXiv160604200 p 15

[CLP17] Ernie Croot Vsevolod F Lev and Peter Pal Pach Progression-freesets in Zn

4 are exponentially small Ann of Math (2)185(1)331ndash337 2017 doi104007annals201718517 p 4881

[CM06] Matthias Christandl and Graeme Mitchison The spectra ofquantum states and the Kronecker coefficients of the symmetricgroup Comm Math Phys 261(3)789ndash797 2006doi101007s00220-005-1435-1 p 91

[CMR+14] Toby Cubitt Laura Mancinska David E Roberson SimoneSeverini Dan Stahlke and Andreas Winter Bounds onentanglement-assisted source-channel coding via the Lovasz thetanumber and its variants IEEE Trans Inform Theory60(11)7330ndash7344 2014 arXiv13107120doi101109TIT20142349502 p 42


[CT12] Thomas M Cover and Joy A Thomas Elements of informationtheory John Wiley amp Sons 2012 p 60

[CU13] Henry Cohn and Christopher Umans Fast matrix multiplicationusing coherent configurations In Proceedings of the Twenty-FourthAnnual ACM-SIAM Symposium on Discrete Algorithms pages1074ndash1086 SIAM 2013 p 15

[CVZ16] Matthias Christandl Peter Vrana and Jeroen ZuiddamAsymptotic tensor rank of graph tensors beyond matrixmultiplication arXiv 2016 arXiv160907476 p 2 65 67 7985

[CVZ18] Matthias Christandl Peter Vrana and Jeroen Zuiddam Universalpoints in the asymptotic spectrum of tensors In Proceedings of 50thAnnual ACM SIGACT Symposium on the Theory of Computing(STOCrsquo18) ACM New York 2018 arXiv170907851doi10114531887453188766 p 2 47 65 87 88 96 103 105

[CW82] Don Coppersmith and Shmuel Winograd On the asymptoticcomplexity of matrix multiplication SIAM J Comput11(3)472ndash492 1982 doi1011370211038 p 3

[CW87] Don Coppersmith and Shmuel Winograd Matrix multiplication viaarithmetic progressions In Proceedings of the nineteenth annualACM symposium on Theory of computing pages 1ndash6 ACM 1987p 3

[CW90] Don Coppersmith and Shmuel Winograd Matrix multiplication viaarithmetic progressions J Symbolic Comput 9(3)251ndash280 1990doi101016S0747-7171(08)80013-2 p 4 6 8 10 48 67

[CZ18] Matthias Christandl and Jeroen Zuiddam Tensor surgery andtensor rank Comput Complexity Mar 2018doi101007s00037-018-0164-8 p 2 86

[Dra15] Jan Draisma Multilinear Algebra and Applications (lecture notes)2015 URL httpsmathsitesunibechjdraismapublicationsmlapplpdfp 15

[DVC00] Wolfgang Dur Guivre Vidal and Juan Ignacio Cirac Three qubitscan be entangled in two inequivalent ways Phys Rev A (3)62(6)062314 12 2000 doi101103PhysRevA62062314 p 48


[Ede04] Yves Edel Extensions of generalized product caps Des CodesCryptogr 31(1)5ndash14 2004 doi101023A1027365901231p 48 83

[EG17] Jordan S Ellenberg and Dion Gijswijt On large subsets of Fnq with

no three-term arithmetic progression Ann of Math (2)185(1)339ndash343 2017 doi104007annals201718518 p 1048 81 83 84

[FK14] Hu Fu and Robert Kleinberg Improved lower bounds for testingtriangle-freeness in boolean functions via fast matrix multiplicationIn Approximation Randomization and CombinatorialOptimization Algorithms and Techniques (APPROXRANDOM2014) pages 669ndash676 2014doi104230LIPIcsAPPROX-RANDOM2014669 p 48

[For16] Michael Forbes Some concrete questions on the border complexityof polynomials Presentation given at the Workshop on AlgebraicComplexity Theory WACT 2016 in Tel Avivhttpswwwyoutubecomwatchv=1HMogQIHT6Q 2016 p 110

[Fra02] Matthias Franz Moment polytopes of projective G-varieties andtensor products of symmetric group representations J Lie Theory12(2)539ndash549 2002 URLhttpemisamsorgjournalsJLTvol12_no216htmlp 93 94

[Fri17] Tobias Fritz Resource convertibility and ordered commutativemonoids Math Structures Comput Sci 27(6)850ndash938 2017doi101017S0960129515000444 p 37

[Ful97] William Fulton Young tableaux volume 35 of LondonMathematical Society Student Texts Cambridge University PressCambridge 1997 With applications to representation theory andgeometry p 88

[GKKS13] Ankit Gupta Pritish Kamath Neeraj Kayal and RamprasadSaptharishi Approaching the chasm at depth four In 2013 IEEEConference on Computational ComplexitymdashCCC 2013 pages 65ndash73IEEE Computer Soc Los Alamitos CA 2013doi101109CCC201316 p 108

[GMQ16] Joshua A. Grochow, Ketan D. Mulmuley, and Youming Qiao. Boundaries of VP and VNP. In Ioannis Chatzigiannakis, Michael Mitzenmacher, Yuval Rabani, and Davide Sangiorgi, editors, 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016), volume 55, pages 34:1–34:14, 2016. arXiv:1605.02815, doi:10.4230/LIPIcs.ICALP.2016.34. p. 110.

[Gro13] Joshua A Grochow Unifying and generalizing known lower boundsvia geometric complexity theory arXiv 2013 arXiv13046333p 108

[GW09] Roe Goodman and Nolan R Wallach Symmetry representationsand invariants volume 255 of Graduate Texts in MathematicsSpringer Dordrecht 2009 doi101007978-0-387-79852-3p 88

[Hae79] Willem Haemers On some problems of Lovasz concerning theShannon capacity of a graph IEEE Trans Inform Theory25(2)231ndash232 1979 doi101109TIT19791056027 p 37 4042

[Has90] Johan Hastad Tensor rank is NP-complete J Algorithms11(4)644ndash654 1990 doi1010160196-6774(90)90014-6 p 47

[HHHH09] Ryszard Horodecki Pawe l Horodecki Micha l Horodecki and KarolHorodecki Quantum entanglement Rev Modern Phys81(2)865ndash942 2009 doi101103RevModPhys81865 p 48

[HIL13] Jonathan D Hauenstein Christian Ikenmeyer and Joseph MLandsberg Equations for lower bounds on border rank ExpMath 22(4)372ndash383 2013 doi101080105864582013825892p 15 110

[Hum75] James E Humphreys Linear algebraic groups Springer-VerlagNew York-Heidelberg 1975 Graduate Texts in Mathematics No21 p 93

[HX17] Ishay Haviv and Ning Xie Sunflowers and testing triangle-freenessof functions Comput Complexity 26(2)497ndash530 Jun 2017doi101007s00037-016-0138-7 p 48

[Ike13] Christian Ikenmeyer Geometric complexity theory tensor rankand LittlewoodndashRichardson coefficients PhD thesis UniversitatPaderborn 2013 p 14

[Kar72] Richard M Karp Reducibility among combinatorial problems InComplexity of computer computations (Proc Sympos IBM ThomasJ Watson Res Center Yorktown Heights NY 1972) pages85ndash103 Plenum New York 1972 p 36


[Knu94] Donald E Knuth The sandwich theorem Electron J Combin 11994 URL httpwwwcombinatoricsorgVolume_1Abstractsv1i1a1htmlp 41

[Kra84] Hanspeter Kraft Geometrische Methoden in der InvariantentheorieSpringer 1984 doi101007978-3-663-10143-7 p 50 88 93

[KS08] Tali Kaufman and Madhu Sudan Algebraic property testing Therole of invariance In Proceedings of the Fortieth Annual ACMSymposium on Theory of Computing STOC rsquo08 pages 403ndash412New York NY USA 2008 ACMdoi10114513743761374434 p 48

[KSS16] Robert Kleinberg William F Sawin and David E Speyer Thegrowth rate of tri-colored sum-free sets arXiv 2016arXiv160700047 p 48 79 83

[Lan06] Joseph M Landsberg The border rank of the multiplication of2times 2 matrices is seven J Amer Math Soc 19(2)447ndash459 2006doi101090S0894-0347-05-00506-0 p 110

[LG14] Francois Le Gall Powers of tensors and fast matrix multiplicationIn ISSAC 2014mdashProceedings of the 39th International Symposiumon Symbolic and Algebraic Computation pages 296ndash303 ACM NewYork 2014 doi10114526086282608664 p 4 6 8 48 85

[Lic84] Thomas Lickteig A note on border rank Inf Process Lett18(3)173ndash178 1984 doi1010160020-0190(84)90023-1p 110

[LM16a] Joseph M Landsberg and Mateusz Micha lek A 2n2 minus log(n)minus 1lower bound for the border rank of matrix multiplication arXiv2016 arXiv160807486 p 110

[LM16b] Joseph M Landsberg and Mateusz Micha lek Abelian tensorsJ Math Pures Appl 2016 doi101016jmatpur201611004p 14

[LMR13] Joseph M Landsberg Laurent Manivel and Nicolas RessayreHypersurfaces with degenerate duals and the geometric complexitytheory program Comment Math Helv 88(2)469ndash484 2013doi104171CMH292 p 108

[LO15] Joseph M. Landsberg and Giorgio Ottaviani. New lower bounds for the border rank of matrix multiplication. Theory Comput., 11:285–298, 2015. arXiv:1112.6007, doi:10.4086/toc.2015.v011a011. p. 108, 110.

[Lov79] Laszlo Lovasz On the Shannon capacity of a graph IEEE TransInform Theory 25(1)1ndash7 1979 doi101109TIT19791055985p 13 35 41

[Mar08] Murray Marshall Positive polynomials and sums of squaresvolume 146 of Mathematical Surveys and Monographs AmericanMathematical Society Providence RI 2008doi101090surv146 p 34

[MP71] Robert J McEliece and Edward C Posner Hide and seek datastorage and entropy The Annals of Mathematical Statistics42(5)1706ndash1716 1971 doi101214aoms1177693169 p 41

[MP08] Guillaume Malod and Natacha Portier. Characterizing Valiant's algebraic complexity classes. J. Complexity, 24(1):16–38, 2008. doi:10.1016/j.jco.2006.09.006. p. 119.

[MS01] Ketan D Mulmuley and Milind Sohoni Geometric complexitytheory I An approach to the P vs NP and related problemsSIAM J Comput 31(2)496ndash526 2001doi101137S009753970038715X p 14 108

[MS08] Ketan D Mulmuley and Milind Sohoni Geometric complexitytheory II Towards explicit obstructions for embeddings amongclass varieties SIAM J Comput 38(3)1175ndash1206 2008doi101137080718115 p 108

[Nes84] Linda Ness A stratification of the null cone via the moment mapAmer J Math 106(6)1281ndash1329 1984 With an appendix byDavid Mumford doi1023072374395 p 9 93 94

[Nis91] Noam Nisan Lower bounds for non-commutative computation InProceedings of the twenty-third annual ACM symposium on Theoryof computing pages 410ndash418 ACM 1991doi101145103418103462 p 110

[Nor16] Sergey Norin A distribution on triples with maximum entropymarginal arXiv 2016 arXiv160800243 p 83

[NW97] Noam Nisan and Avi Wigderson Lower bounds on arithmeticcircuits via partial derivatives Comput Complexity 6(3)217ndash234199697 doi101007BF01294256 p 108


[Pan78] Victor Ya Pan Strassenrsquos algorithm is not optimal Trilineartechnique of aggregating uniting and canceling for constructingfast algorithms for matrix operations In 19th Annual Symposiumon Foundations of Computer Science (Ann Arbor Mich 1978)pages 166ndash176 IEEE Long Beach Calif 1978 p 3

[Pan80] Victor Ya Pan New fast algorithms for matrix operations SIAMJ Comput 9(2)321ndash342 1980 doi1011370209027 p 3

[Pan81] Victor Ya Pan New combinations of methods for the accelerationof matrix multiplication Comput Math Appl 7(1)73ndash125 1981doi1010160898-1221(81)90009-2 p 3

[Pan84] Victor Ya Pan How to multiply matrices faster volume 179 ofLecture Notes in Computer Science Springer-Verlag Berlin 1984doi1010073-540-13866-8 p 3

[Pan18] Victor Ya Pan Fast feasible and unfeasible matrix multiplicationarXiv 2018 arXiv180404102 p 6

[PD01] Alexander Prestel and Charles N Delzell Positive polynomialsSpringer Monographs in Mathematics Springer-Verlag Berlin2001 From Hilbertrsquos 17th problem to real algebradoi101007978-3-662-04648-7 p 34

[Peb16] Luke Pebody Proof of a conjecture of Kleinberg-Sawin-SpeyerarXiv 2016 arXiv160805740 p 83

[PS98] George Polya and Gabor Szego Problems and theorems inanalysis I Classics in Mathematics Springer-Verlag Berlin 1998Series integral calculus theory of functions Translated from theGerman by Dorothee Aeppli Reprint of the 1978 Englishtranslation doi101007978-3-642-61905-2 p 21

[Raz09] Ran Raz Multi-linear formulas for permanent and determinant areof super-polynomial size J ACM 56(2)Art 8 17 2009doi10114515027931502797 p 108

[Raz13] Ran Raz Tensor-rank and lower bounds for arithmetic formulasJ ACM 60(6)Art 40 15 2013 doi1011452535928 p 14

[Rom82] Francesco Romani Some properties of disjoint sums of tensorsrelated to matrix multiplication SIAM J Comput 11(2)263ndash2671982 doi1011370211020 p 3


[Sap16] Ramprasad Saptharishi A survey of lower bounds in arithmeticcircuit complexity 302 2016 Online survey URLhttpsgithubcomdasarpmarlowerbounds-survey p 6 17109 112

[Sch81] Arnold Schonhage Partial and total matrix multiplication SIAMJ Comput 10(3)434ndash455 1981 p 3

[Sch03] Alexander Schrijver Combinatorial optimization polyhedra andefficiency volume 24 Springer Science amp Business Media 2003p 37 41

[Sha56] Claude E Shannon The zero error capacity of a noisy channelInstitute of Radio Engineers Transactions on Information TheoryIT-2(September)8ndash19 1956 doi101109TIT19561056798p 13 35

[Sha09] Asaf Shapira Greenrsquos conjecture and testing linear-invariantproperties In Proceedings of the Forty-first Annual ACMSymposium on Theory of Computing STOC rsquo09 pages 159ndash166New York NY USA 2009 ACMdoi10114515364141536438 p 48

[Shi16] Yaroslav Shitov How hard is the tensor rank arXiv 2016arXiv161101559 p 47

[Sin64] Richard C Singleton Maximum distance q-nary codes IEEETrans Information Theory IT-10116ndash118 1964doi101109TIT19641053661 p 101

[SOK14] Adam Sawicki Micha l Oszmaniec and Marek Kus Convexity ofmomentum map Morse index and quantum entanglement RevMath Phys 26(3)1450004 39 2014doi101142S0129055X14500044 p 9

[SSS09] Chandan Saha Ramprasad Saptharishi and Nitin Saxena Thepower of depth 2 circuits over algebras In IARCS AnnualConference on Foundations of Software Technology and TheoreticalComputer Science volume 4 pages 371ndash382 2009arXiv09042058 doi104230LIPIcsFSTTCS20092333p 109

[Sto10] Andrew James Stothers On the complexity of matrix multiplicationPhD thesis University of Edinburgh 2010httphdlhandlenet18424734 p 4 6 8 48


[Str69] Volker Strassen Gaussian elimination is not optimal NumerMath 13(4)354ndash356 1969 doi101007BF02165411 p 3 5

[Str83] Volker Strassen Rank and optimal computation of generic tensorsLinear Algebra Appl 5253645ndash685 1983doi1010160024-3795(83)80041-X p 110

[Str86] Volker Strassen The asymptotic spectrum of tensors and theexponent of matrix multiplication In Proceedings of the 27thAnnual Symposium on Foundations of Computer Science SFCS rsquo86pages 49ndash54 Washington DC USA 1986 IEEE Computer Societydoi101109SFCS198652 p 4 7

[Str87] Volker Strassen Relative bilinear complexity and matrixmultiplication J Reine Angew Math 375376406ndash443 1987doi101515crll1987375-376406 p 3 4 49 67

[Str88] Volker Strassen The asymptotic spectrum of tensors J ReineAngew Math 384102ndash152 1988doi101515crll1988384102 p 4 7 12 19 26 27 28 2930 32 33 49 50 51

[Str91] Volker Strassen Degeneration and complexity of bilinear mapssome asymptotic spectra J Reine Angew Math 413127ndash1801991 doi101515crll1991413127 p 3 4 10 48 49 5255 56 57 66 67 81 82

[Str94] Volker Strassen Algebra and complexity In First EuropeanCongress of Mathematics Vol II (Paris 1992) volume 120 ofProgr Math pages 429ndash446 Birkhauser Basel 1994doi101007s10107-008-0221-1 p 67

[Str05] Volker Strassen Komplexitat und Geometrie bilinearerAbbildungen Jahresber Deutsch Math-Verein 107(1)3ndash31 2005p 4 88 94 95 100 101

[Tao08] Terence Tao Structure and randomness pages from year one of amathematical blog American Mathematical Soc 2008 p 48

[Tao16] Terence Tao A symmetric formulation of theCrootndashLevndashPachndashEllenbergndashGijswijt capset boundhttpsterrytaowordpresscom 2016 p 48 58 81 84

[Tob91] Verena Tobler Spezialisierung und Degeneration von TensorenPhD thesis Universitat Konstanz 1991httpnbn-resolvingdeurnnbndebsz352-opus-20324p 57


[TS16] Terence Tao and Will Sawin Notes on the ldquoslice rankrdquo of tensorshttpsterrytaowordpresscom 2016 p 48 58

[Val79] Leslie G. Valiant. Completeness classes in algebra. In Conference Record of the Eleventh Annual ACM Symposium on Theory of Computing (Atlanta, Ga., 1979), pages 249–261. ACM, New York, 1979. doi:10.1145/800135.804419. p. 107, 108, 123.

[Val80] Leslie G. Valiant. Reducibility by algebraic projections. University of Edinburgh, Department of Computer Science, 1980. Internal Report. p. 109, 119, 123.

[VC15] Peter Vrana and Matthias Christandl Asymptotic entanglementtransformation between W and GHZ states J Math Phys56(2)022204 12 2015 arXiv13103244doi10106314908106 p 69

[VDDMV02] F Verstraete J Dehaene B De Moor and H Verschelde Fourqubits can be entangled in nine different ways Phys Rev A (3)65(5 part A)052112 5 2002 doi101103PhysRevA65052112p 48

[Wal14] Michael Walter Multipartite quantum states and their marginalsPhD thesis ETH Zurich 2014 arXiv14106820 p 93

[WDGC13] Michael Walter Brent Doran David Gross and MatthiasChristandl Entanglement polytopes multiparticle entanglementfrom single-particle information Science 340(6137)1205ndash12082013 arXiv12080365 doi101126science1232957 p 8 995

[Wil12] Virginia Vassilevska Williams Multiplying matrices faster thanCoppersmith-Winograd Extended abstract InSTOCrsquo12mdashProceedings of the 2012 ACM Symposium on Theory ofComputing pages 887ndash898 ACM New York 2012doi10114522139772214056 p 4 6 8 48

[Zui17] Jeroen Zuiddam A note on the gap between rank and border rankLinear Algebra Appl 52533ndash44 2017doi101016jlaa201703015 p 2 14 110

[Zui18] Jeroen Zuiddam The asymptotic spectrum of graphs and theShannon capacity arXiv 2018 arXiv180700169 p 35

Glossary

$\langle n \rangle$  $n \times \cdots \times n$ diagonal tensor

$\langle a, b, c \rangle$  matrix multiplication tensor

$G \ast H$  or-product

$G \boxtimes H$  strong graph product, and-product

$\alpha(G)$  stability number

$\overline{\chi}(G)$  clique cover number

$K_k$  complete graph on $k$ vertices

$F^{\theta}(t)$  quantum functional

$G(t)$  $\mathrm{GL}_{n_1} \times \cdots \times \mathrm{GL}_{n_k}$ for $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$

$H(P)$  Shannon entropy of probability distribution $P$

$h(p)$  binary entropy of probability $p \in [0, 1]$

$\tau(\Phi)$  hitting set number

$\tilde{\tau}(\Phi)$  asymptotic hitting set number

$\omega$  matrix multiplication exponent

$\mathbf{P}$  moment polytope

$\mathcal{P}(X)$  the set of probability distributions on $X$

$\mathrm{R}$  rank

$\tilde{\mathrm{R}}$  asymptotic rank

$\underline{\mathrm{R}}(t)$  border rank

$\mathrm{R}(G)$  rank of a graph, clique cover number

$\mathrm{R}(t)$  tensor rank

$\mathrm{SR}(t)$  slice rank

$\mathrm{Q}$  subrank

$\tilde{\mathrm{Q}}$  asymptotic subrank

$\underline{\mathrm{Q}}(t)$  border subrank

$\mathrm{Q}(\Phi)$  combinatorial subrank

$\mathrm{Q}(G)$  subrank of a graph, stability number

$\operatorname{supp}(t)$  support

$\Theta(G)$  Shannon capacity

$\vartheta(G)$  Lovász theta number

$G \sqcup H$  disjoint union

$W(t)$  $S_{n_1} \times \cdots \times S_{n_k}$ for $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$

$\mathbf{X}(S, \leqslant)$  asymptotic spectrum of semiring $S$ with Strassen preorder $\leqslant$

$\zeta_{(S)}(t)$  gauge point

$\zeta^{\theta}(t)$  support functional

Samenvatting

Algebraısche complexiteit asymptotische spectra enverstrengelingspolytopen

Het is welbekend dat de rang van een matrix multiplicatief is onder het Krone-ckerproduct additief onder de directe som genormaliseerd op identiteitsmatricesen niet-stijgend onder vermenigvuldiging van links en van rechts met matricesMatrixrang is zelfs de enige reele parameter met deze vier eigenschappen In 1986initieerde Strassen de studie van de uitbreiding naar tensoren vind alle afbeel-dingen van k-tensoren naar de reele getallen die multiplicatief zijn onder hettensor Kroneckerproduct additief onder de directe som genormaliseerd op ldquoiden-titeitstensorenrdquo en niet-stijgend onder het toepassen van lineaire afbeeldingen opde k tensorfactoren Strassen noemde de verzameling van deze afbeeldingen hetldquoasymptotische spectrum van k-tensorenrdquo Hij bewees als we het asymptotischespectrum begrijpen dan begrijpen we de asymptotische relaties tussen tensorswaaronder de asymptotische subrang en de asymptotische rang In het bijzonderals we het asymptotische spectrum kennen dan kennen we de aritmetische com-plexiteit van matrixvermenigvuldiging een centraal probleem in de algebraıschecomplexiteitstheorie

Een van de hoofdresultaten in dit proefschrift is de eerste expliciete construc-tie van een oneindige familie van elementen in het asymptotische spectrum vancomplexe k-tensoren genaamd de quantumfunctionalen Onze constructie is geba-seerd op informatietheorie en momentpolytopen ook wel verstrengelingspolytopengenoemd Daarnaast bestuderen we onder andere de relatie tussen de recentgeıntroduceerde slice rang en de quantumfunctionalen en we bewijzen dat deldquoasymptotischerdquo slice rang gelijk is aan het minimum over de quantumfunctionalenNaast het bestuderen van de bovengenoemde tensorparameters geven we eenuitbreiding van de CoppersmithndashWinograd-methode (voor het verkrijgen vanondergrenzen op de asymptotische combinatorische subrang) naar hogere-orde

tensoren dwz tensoren van orde minstens 4 We passen deze uitbreiding toeom nieuwe bovengrenzen te krijgen op de asymptotische tensorrang van complete-graaftensoren via de lasermethode (Gezamenlijk werk met Christandl en VranaQIP 2018 STOC 2018)

Als een nieuwe toepassing van de abstracte theorie van asymptotische spectraintroduceren we het asymptotische spectrum van grafen in de grafentheorie Ana-loog aan de situatie voor tensoren geldt als we het asymptotisch spectrum vangrafen begrijpen dan begrijpen we de Shannoncapaciteit een graafparameter diede zero-error-communicatiecomplexiteit van communicatiekanalen karakteriseertMet andere woorden we bewijzen een nieuwe dualiteitsstelling voor de Shannon-capaciteit Voorbeelden van elementen in het asymptotische spectrum van grafenzijn het thetagetal van Lovasz en de fractionele Haemersgrenzen

Tot slot bestuderen we een algebraısch model van berekening genaamd algebraicbranching programs Een algebraic branching program (abp) is het spoor vaneen product van matrices met polynomen van graad hoogstens 1 als elementenDe maximale grootte van de matrices heet de breedte van de abp In 1992bewezen Ben-Or en Cleve dat elk polynoom berekend kan worden door eenbreedte-3 abp met een aantal matrices dat polynomiaal is in de formula size vanhet polynoom Daarentegen bewezen Allender en Wang in 2011 dat sommigepolynomen niet berekend kunnen worden door breedte-2 abps Wij bewijzen dat elkpolynoom benaderd kan worden door een breedte-2 abp met een aantal matricesdat polynomiaal is in de formula size van het polynoom waarbij benaderingwordt bedoeld in de zin van degeneration (Gezamenlijk werk met Ikenmeyer enBringmann CCC 2017 JACM 2018)

Summary

Algebraic complexity, asymptotic spectra and entanglement polytopes

Matrix rank is well known to be multiplicative under the Kronecker product, additive under the direct sum, normalised on identity matrices, and non-increasing under multiplication from the left and from the right by any matrices. In fact, matrix rank is the only real matrix parameter with these four properties. In 1986, Strassen proposed to study the extension to tensors: find all maps from k-tensors to the reals that are multiplicative under the tensor Kronecker product, additive under the direct sum, normalised on "identity tensors", and non-increasing under acting with linear maps on the k tensor factors. Strassen called the collection of these maps the "asymptotic spectrum of k-tensors". He proved that understanding the asymptotic spectrum implies understanding the asymptotic relations among tensors, including the asymptotic subrank and the asymptotic rank. In particular, knowing the asymptotic spectrum means knowing the arithmetic complexity of matrix multiplication, a central problem in algebraic complexity theory.
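For concreteness, the four properties of matrix rank named above can be written out as follows. The notation is an assumption made for this sketch: $\operatorname{rank}$ denotes matrix rank, $I_n$ the $n \times n$ identity matrix, and $P$, $Q$ arbitrary matrices of compatible sizes.

```latex
\begin{align*}
\operatorname{rank}(A \otimes B) &= \operatorname{rank}(A)\,\operatorname{rank}(B)
  && \text{(multiplicative under the Kronecker product)} \\
\operatorname{rank}(A \oplus B) &= \operatorname{rank}(A) + \operatorname{rank}(B)
  && \text{(additive under the direct sum)} \\
\operatorname{rank}(I_n) &= n
  && \text{(normalised on identity matrices)} \\
\operatorname{rank}(P A Q) &\leq \operatorname{rank}(A)
  && \text{(non-increasing under left and right multiplication)}
\end{align*}
```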

One of the main results in this dissertation is the first explicit construction of an infinite family of elements in the asymptotic spectrum of complex k-tensors, called the quantum functionals. Our construction is based on information theory and moment polytopes, i.e. entanglement polytopes. Moreover, among other things, we study the relation of the recently introduced slice rank to the quantum functionals and find that "asymptotic" slice rank equals the minimum over the quantum functionals. Besides studying the above tensor parameters, we extend the Coppersmith–Winograd method (for obtaining asymptotic combinatorial subrank lower bounds) to higher-order tensors, i.e. order at least 4. We apply this generalisation to obtain new upper bounds on the asymptotic tensor rank of complete graph tensors via the laser method. (Joint work with Christandl and Vrana; QIP 2018, STOC 2018.)

In graph theory, as a new instantiation of the abstract theory of asymptotic spectra, we introduce the asymptotic spectrum of graphs. Analogous to the situation for tensors, understanding the asymptotic spectrum of graphs means understanding the Shannon capacity, a graph parameter capturing the zero-error communication complexity of communication channels. In other words, we prove a new duality theorem for Shannon capacity. Some known elements in the asymptotic spectrum of graphs are the Lovász theta number and the fractional Haemers bounds.
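As a small illustration of the Shannon capacity, the sketch below checks the classical lower bound for the 5-cycle. It assumes the standard definition $\Theta(G) = \lim_n \alpha(G^{\boxtimes n})^{1/n}$, which is not spelled out in this summary, and all function names are hypothetical. The set $\{(i, 2i \bmod 5)\}$ is verified to be independent in the strong product $C_5 \boxtimes C_5$, which gives $\alpha(C_5 \boxtimes C_5) \geq 5$ and hence $\Theta(C_5) \geq \sqrt{5}$.

```python
from itertools import combinations

N = 5  # number of vertices of the cycle C_5 (assumed example graph)

def equal_or_adjacent(u, v, n=N):
    """True if u and v are the same vertex or neighbours on the n-cycle."""
    return (u - v) % n in (0, 1, n - 1)

def adjacent_strong(p, q):
    """Adjacency in the strong product: distinct vertices whose coordinates
    are equal or adjacent in both factors."""
    return (p != q
            and equal_or_adjacent(p[0], q[0])
            and equal_or_adjacent(p[1], q[1]))

def is_independent(vertices):
    """True if no two of the given vertices are adjacent in the strong product."""
    return all(not adjacent_strong(p, q) for p, q in combinations(vertices, 2))

# Candidate independent set of size 5 in the strong product of C_5 with itself.
candidate = [(i, (2 * i) % N) for i in range(N)]
print(is_independent(candidate), len(set(candidate)))
```

A single copy of $C_5$ only has independent sets of size 2, so the product gains more than the square of the single-copy value; this is the kind of asymptotic behaviour the Shannon capacity measures.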

Finally, we study an algebraic model of computation called algebraic branching programs. An algebraic branching program (abp) is the trace of a product of matrices with affine linear forms as matrix entries. The maximum size of the matrices is called the width of the abp. In 1992, Ben-Or and Cleve proved that width-3 abps can compute any polynomial efficiently in the formula size. On the other hand, in 2011, Allender and Wang proved that some polynomials cannot be computed by any width-2 abp. We prove that any polynomial can be efficiently approximated by a width-2 abp, where approximation is defined in the sense of degeneration. (Joint work with Ikenmeyer and Bringmann; CCC 2017, JACM 2018.)
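To make the definition concrete, here is a minimal worked example of a width-2 abp. The particular matrices and the polynomial $xy + z$ are chosen only for illustration and are not taken from the dissertation.

```latex
\[
\operatorname{tr}\!\left(
\begin{pmatrix} x & 1 \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} y & 0 \\ 0 & z \end{pmatrix}
\right)
= \operatorname{tr}
\begin{pmatrix} xy & z \\ 0 & z \end{pmatrix}
= xy + z .
\]
```

Every entry of the two factors is an affine linear form in $x, y, z$, and both matrices are $2 \times 2$, so this is an abp of width 2 with two matrices.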

Titles in the ILLC Dissertation Series

ILLC DS-2009-01: Jakub Szymanik
Quantifiers in TIME and SPACE: Computational Complexity of Generalized Quantifiers in Natural Language

ILLC DS-2009-02: Hartmut Fitz
Neural Syntax

ILLC DS-2009-03: Brian Thomas Semmes
A Game for the Borel Functions

ILLC DS-2009-04: Sara L. Uckelman
Modalities in Medieval Logic

ILLC DS-2009-05: Andreas Witzel
Knowledge and Games: Theory and Implementation

ILLC DS-2009-06: Chantal Bax
Subjectivity after Wittgenstein: Wittgenstein's embodied and embedded subject and the debate about the death of man

ILLC DS-2009-07: Kata Balogh
Theme with Variations: A Context-based Analysis of Focus

ILLC DS-2009-08: Tomohiro Hoshi
Epistemic Dynamics and Protocol Information

ILLC DS-2009-09: Olivia Ladinig
Temporal expectations and their violations

ILLC DS-2009-10: Tikitu de Jager
"Now that you mention it, I wonder": Awareness, Attention, Assumption

ILLC DS-2009-11: Michael Franke
Signal to Act: Game Theory in Pragmatics

ILLC DS-2009-12: Joel Uckelman
More Than the Sum of Its Parts: Compact Preference Representation Over Combinatorial Domains

ILLC DS-2009-13: Stefan Bold
Cardinals as Ultrapowers: A Canonical Measure Analysis under the Axiom of Determinacy

ILLC DS-2010-01: Reut Tsarfaty
Relational-Realizational Parsing

ILLC DS-2010-02: Jonathan Zvesper
Playing with Information

ILLC DS-2010-03: Cédric Dégremont
The Temporal Mind: Observations on the logic of belief change in interactive systems

ILLC DS-2010-04: Daisuke Ikegami
Games in Set Theory and Logic

ILLC DS-2010-05: Jarmo Kontinen
Coherence and Complexity in Fragments of Dependence Logic

ILLC DS-2010-06: Yanjing Wang
Epistemic Modelling and Protocol Dynamics

ILLC DS-2010-07: Marc Staudacher
Use theories of meaning between conventions and social norms

ILLC DS-2010-08: Amélie Gheerbrant
Fixed-Point Logics on Trees

ILLC DS-2010-09: Gaëlle Fontaine
Modal Fixpoint Logic: Some Model Theoretic Questions

ILLC DS-2010-10: Jacob Vosmaer
Logic, Algebra and Topology: Investigations into canonical extensions, duality theory and point-free topology

ILLC DS-2010-11: Nina Gierasimczuk
Knowing One's Limits: Logical Analysis of Inductive Inference

ILLC DS-2010-12: Martin Mose Bentzen
Stit, Iit, and Deontic Logic for Action Types

ILLC DS-2011-01: Wouter M. Koolen
Combining Strategies Efficiently: High-Quality Decisions from Conflicting Advice

ILLC DS-2011-02: Fernando Raymundo Velázquez-Quesada
Small steps in dynamics of information

ILLC DS-2011-03: Marijn Koolen
The Meaning of Structure: the Value of Link Evidence for Information Retrieval

ILLC DS-2011-04: Junte Zhang
System Evaluation of Archival Description and Access

ILLC DS-2011-05: Lauri Keskinen
Characterizing All Models in Infinite Cardinalities

ILLC DS-2011-06: Rianne Kaptein
Effective Focused Retrieval by Exploiting Query Context and Document Structure

ILLC DS-2011-07: Jop Briët
Grothendieck Inequalities, Nonlocal Games and Optimization

ILLC DS-2011-08: Stefan Minica
Dynamic Logic of Questions

ILLC DS-2011-09: Raul Andres Leal
Modalities Through the Looking Glass: A study on coalgebraic modal logic and their applications

ILLC DS-2011-10: Lena Kurzen
Complexity in Interaction

ILLC DS-2011-11: Gideon Borensztajn
The neural basis of structure in language

ILLC DS-2012-01: Federico Sangati
Decomposing and Regenerating Syntactic Trees

ILLC DS-2012-02: Markos Mylonakis
Learning the Latent Structure of Translation

ILLC DS-2012-03: Edgar José Andrade Lotero
Models of Language: Towards a practice-based account of information in natural language

ILLC DS-2012-04: Yurii Khomskii
Regularity Properties and Definability in the Real Number Continuum: idealized forcing, polarized partitions, Hausdorff gaps and mad families in the projective hierarchy

ILLC DS-2012-05: David García Soriano
Query-Efficient Computation in Property Testing and Learning Theory

ILLC DS-2012-06: Dimitris Gakis
Contextual Metaphilosophy - The Case of Wittgenstein

ILLC DS-2012-07: Pietro Galliani
The Dynamics of Imperfect Information

ILLC DS-2012-08: Umberto Grandi
Binary Aggregation with Integrity Constraints

ILLC DS-2012-09: Wesley Halcrow Holliday
Knowing What Follows: Epistemic Closure and Epistemic Logic

ILLC DS-2012-10: Jeremy Meyers
Locations, Bodies, and Sets: A model theoretic investigation into nominalistic mereologies

ILLC DS-2012-11: Floor Sietsma
Logics of Communication and Knowledge

ILLC DS-2012-12: Joris Dormans
Engineering emergence: applied theory for game design

ILLC DS-2013-01: Simon Pauw
Size Matters: Grounding Quantifiers in Spatial Perception

ILLC DS-2013-02: Virginie Fiutek
Playing with Knowledge and Belief

ILLC DS-2013-03: Giannicola Scarpa
Quantum entanglement in non-local games, graph parameters and zero-error information theory

ILLC DS-2014-01: Machiel Keestra
Sculpting the Space of Actions: Explaining Human Action by Integrating Intentions and Mechanisms

ILLC DS-2014-02: Thomas Icard
The Algorithmic Mind: A Study of Inference in Action

ILLC DS-2014-03: Harald A. Bastiaanse
Very, Many, Small, Penguins

ILLC DS-2014-04: Ben Rodenhauser
A Matter of Trust: Dynamic Attitudes in Epistemic Logic

ILLC DS-2015-01: María Inés Crespo
Affecting Meaning: Subjectivity and evaluativity in gradable adjectives

ILLC DS-2015-02: Mathias Winther Madsen
The Kid, the Clerk, and the Gambler - Critical Studies in Statistics and Cognitive Science

ILLC DS-2015-03: Shengyang Zhong
Orthogonality and Quantum Geometry: Towards a Relational Reconstruction of Quantum Theory

ILLC DS-2015-04: Sumit Sourabh
Correspondence and Canonicity in Non-Classical Logic

ILLC DS-2015-05: Facundo Carreiro
Fragments of Fixpoint Logics: Automata and Expressiveness

ILLC DS-2016-01: Ivano A. Ciardelli
Questions in Logic

ILLC DS-2016-02: Zoé Christoff
Dynamic Logics of Networks: Information Flow and the Spread of Opinion

ILLC DS-2016-03: Fleur Leonie Bouwer
What do we need to hear a beat? The influence of attention, musical abilities, and accents on the perception of metrical rhythm

ILLC DS-2016-04: Johannes Marti
Interpreting Linguistic Behavior with Possible World Models

ILLC DS-2016-05: Phong Le
Learning Vector Representations for Sentences - The Recursive Deep Learning Approach

ILLC DS-2016-06: Gideon Maillette de Buy Wenniger
Aligning the Foundations of Hierarchical Statistical Machine Translation

ILLC DS-2016-07: Andreas van Cranenburgh
Rich Statistical Parsing and Literary Language

ILLC DS-2016-08: Florian Speelman
Position-based Quantum Cryptography and Catalytic Computation

ILLC DS-2016-09: Teresa Piovesan
Quantum entanglement: insights via graph parameters and conic optimization

ILLC DS-2016-10: Paula Henk
Nonstandard Provability for Peano Arithmetic: A Modal Perspective

ILLC DS-2017-01: Paolo Galeazzi
Play Without Regret

ILLC DS-2017-02: Riccardo Pinosio
The Logic of Kant's Temporal Continuum

ILLC DS-2017-03: Matthijs Westera
Exhaustivity and intonation: a unified theory

ILLC DS-2017-04: Giovanni Cina
Categories for the working modal logician

ILLC DS-2017-05: Shane Noah Steinert-Threlkeld
Communication and Computation: New Questions About Compositionality

ILLC DS-2017-06: Peter Hawke
The Problem of Epistemic Relevance

ILLC DS-2017-07: Aybüke Özgün
Evidence in Epistemic Logic: A Topological Perspective

ILLC DS-2017-08: Raquel Garrido Alhama
Computational Modelling of Artificial Language Learning: Retention, Recognition & Recurrence

ILLC DS-2017-09: Miloš Stanojević
Permutation Forests for Modeling Word Order in Machine Translation

ILLC DS-2018-01: Berit Janssen
Retained or Lost in Transmission? Analyzing and Predicting Stability in Dutch Folk Songs

ILLC DS-2018-02: Hugo Huurdeman
Supporting the Complex Dynamics of the Information Seeking Process

ILLC DS-2018-03: Corina Koolen
Reading beyond the female: The relationship between perception of author gender and literary quality

ILLC DS-2018-04: Jelle Bruineberg
Anticipating Affordances: Intentionality in self-organizing brain-body-environment systems

ILLC DS-2018-05: Joachim Daiber
Typologically Robust Statistical Machine Translation: Understanding and Exploiting Differences and Similarities Between Languages in Machine Translation

ILLC DS-2018-06: Thomas Brochhagen
Signaling under Uncertainty

ILLC DS-2018-07: Julian Schlöder
Assertion and Rejection

ILLC DS-2018-08: Srinivasan Arunachalam
Quantum Algorithms and Learning Theory

ILLC DS-2018-09: Hugo de Holanda Cunha Nobrega
Games for functions: Baire classes, Weihrauch degrees, transfinite computations, and ranks

ILLC DS-2018-10: Chenwei Shi
Reason to Believe

ILLC DS-2018-11: Malvin Gattinger
New Directions in Model Checking Dynamic Epistemic Logic

ILLC DS-2018-12: Julia Ilin
Filtration Revisited: Lattices of Stable Non-Classical Logics

Algebraic complexity, asymptotic spectra and entanglement polytopes

ILLC Dissertation Series DS-2018-13

For further information about ILLC publications, please contact

Institute for Logic, Language and Computation
Universiteit van Amsterdam
Science Park 107
1098 XG Amsterdam
phone: +31-20-525 6051
e-mail: illc@uva.nl
homepage: http://www.illc.uva.nl/

The investigations were supported by the Netherlands Organization for Scientific Research NWO (617.023.116), the European Commission and the QuSoft Research Center for Quantum Software.

Copyright © 2018 by Jeroen Zuiddam

ISBN 978-94-028-1175-9

Academic dissertation

to obtain the degree of doctor at the Universiteit van Amsterdam, on the authority of the Rector Magnificus, prof. dr. ir. K.I.J. Maex, before a committee appointed by the Doctorate Board (College voor Promoties), to be defended in public in the Agnietenkapel on Tuesday 23 October 2018, at 12:00

by

Jeroen Zuiddam

born in Leiderdorp

Doctoral committee

Supervisors:
prof. dr. H.M. Buhrman, Universiteit van Amsterdam
prof. dr. M. Christandl, Københavns Universitet

Other members:
prof. dr. M. Laurent, Tilburg University
prof. dr. E.M. Opdam, Universiteit van Amsterdam
prof. dr. R.M. de Wolf, Universiteit van Amsterdam
dr. J. Briët, CWI Amsterdam
dr. M. Walter, Universiteit van Amsterdam

Faculteit der Natuurwetenschappen, Wiskunde en Informatica

Contents

Acknowledgements

1 Introduction
  1.1 Matrix multiplication
  1.2 The asymptotic spectrum of tensors
  1.3 Higher-order CW method
  1.4 Abstract asymptotic spectra
  1.5 The asymptotic spectrum of graphs
  1.6 Tensor degeneration
  1.7 Combinatorial degeneration
  1.8 Algebraic branching program degeneration
  1.9 Organisation

2 The theory of asymptotic spectra
  2.1 Introduction
  2.2 Semirings and preorders
  2.3 Strassen preorders
  2.4 Asymptotic preorders $\precsim$
  2.5 Maximal Strassen preorders
  2.6 The asymptotic spectrum $\mathbf{X}(S, \leqslant)$
  2.7 The representation theorem
  2.8 Abstract rank and subrank $\mathrm{R}$, $\mathrm{Q}$
  2.9 Topological aspects
  2.10 Uniqueness
  2.11 Subsemirings
  2.12 Subsemirings generated by one element
  2.13 Universal spectral points
  2.14 Conclusion

3 The asymptotic spectrum of graphs; Shannon capacity
  3.1 Introduction
  3.2 The asymptotic spectrum of graphs
    3.2.1 The semiring of graph isomorphism classes $\mathcal{G}$
    3.2.2 Strassen preorder via graph homomorphisms
    3.2.3 The asymptotic spectrum of graphs $\mathbf{X}(\mathcal{G})$
    3.2.4 Shannon capacity $\Theta$
  3.3 Universal spectral points
    3.3.1 Lovász theta number $\vartheta$
    3.3.2 Fractional graph parameters
  3.4 Conclusion

4 The asymptotic spectrum of tensors; matrix multiplication
  4.1 Introduction
  4.2 The asymptotic spectrum of tensors
    4.2.1 The semiring of tensor equivalence classes $\mathcal{T}$
    4.2.2 Strassen preorder via restriction
    4.2.3 The asymptotic spectrum of tensors $\mathbf{X}(\mathcal{T})$
    4.2.4 Asymptotic rank and asymptotic subrank
  4.3 Gauge points $\zeta_{(i)}$
  4.4 Support functionals $\zeta^{\theta}$
  4.5 Upper and lower support functionals $\zeta^{\theta}$, $\zeta_{\theta}$
  4.6 Asymptotic slice rank
  4.7 Conclusion

5 Tight tensors and combinatorial subrank; cap sets
  5.1 Introduction
  5.2 Higher-order Coppersmith–Winograd method
    5.2.1 Construction
    5.2.2 Computational remarks
    5.2.3 Examples: type sets
  5.3 Combinatorial degeneration method
  5.4 Cap sets
    5.4.1 Reduced polynomial multiplication
    5.4.2 Cap sets
  5.5 Graph tensors
  5.6 Conclusion

6 Universal points in the asymptotic spectrum of tensors; entanglement polytopes, moment polytopes
  6.1 Introduction
  6.2 Schur–Weyl duality
  6.3 Kronecker and Littlewood–Richardson coefficients $g_{\lambda\mu\nu}$, $c^{\lambda}_{\mu\nu}$
  6.4 Entropy inequalities
  6.5 Hilbert spaces and density operators
  6.6 Moment polytopes $\mathbf{P}(t)$
    6.6.1 General setting
    6.6.2 Tensor spaces
  6.7 Quantum functionals $F^{\theta}(t)$
  6.8 Outer approximation
  6.9 Inner approximation for free tensors
  6.10 Quantum functionals versus support functionals
  6.11 Asymptotic slice rank
  6.12 Conclusion

7 Algebraic branching programs; approximation and nondeterminism
  7.1 Introduction
  7.2 Definitions and basic results
    7.2.1 Computational models
    7.2.2 Complexity classes $\mathrm{VP}$, $\mathrm{VP}_e$, $\mathrm{VP}_k$
    7.2.3 The theorem of Ben-Or and Cleve
    7.2.4 Approximation closure $\overline{\mathcal{C}}$
    7.2.5 Nondeterminism closure $\mathrm{N}(\mathcal{C})$
  7.3 Approximation closure of $\mathrm{VP}_2$
  7.4 Nondeterminism closure of $\mathrm{VP}_1$
  7.5 Conclusion

Bibliography

Glossary

Samenvatting

Summary

Acknowledgements

First of all, I thank all my coauthors for very fruitful collaboration: Harry Buhrman, Matthias Christandl, Peter Vrana, Jop Briët, Chris Perry, Asger Jensen, Markus Bläser, Christian Ikenmeyer and Karl Bringmann.

Chris Zaal, Leen Torenvliet and Robert Belleman I thank for all their efforts to set up for me the "double bachelor programme" in Mathematics and Computer Science at the University of Amsterdam (UvA) in 2009. This programme, as well as the "webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch national research institute for mathematics and computer science (CWI), made me decide to come to Amsterdam. My enjoyable master thesis project in mathematics with Eric Opdam made me follow the academic path, for which I thank Eric.

Of course, most importantly, I thank my PhD supervisor Harry Buhrman for introducing me to research as a bachelor student, for absorbing me into the Algorithms and Complexity group at CWI, for having enough faith in me to hire me as his PhD student in 2014, and for his general guidance throughout. I feel very lucky for the opportunities and scientific freedom that this has brought me.

Matthias Christandl has been my closest collaborator and mentor since we met in Berkeley in 2014. In practice this meant countless nights of fun Skype sessions between Amsterdam and Copenhagen, countless enjoyable visits to the University of Copenhagen, and countless kitchen table sessions at the Hallinsgade. Thanks, Matthias, for the energy, inspiration and optimism. And thanks, Matthias and Henriette, for the hospitality.

Jop Briët I thank for his general guidance and for lots of inspiration. The polynomial method reading group, which he mainly organised, inspired part of my paper with Matthias Christandl and Peter Vrana on universal points in the asymptotic spectrum of tensors. (This reading group also resulted in Dion Gijswijt's paper on cap sets.) My paper with Jop on round elimination later inspired me to write the paper on the asymptotic spectrum of graphs.

Christian Ikenmeyer I thank for numerous inspiring discussions on algebraic complexity theory and tensors, which greatly influenced my papers on tensor rank and our joint paper with Karl Bringmann on algebraic branching programs.

Peter Vrana I thank for our many enjoyable research collaborations, the results of which form a central part of this dissertation, for his clever insights, and for finding several mathematical mistakes while reading the draft of this dissertation.

Ronald de Wolf I thank for his general advice throughout my PhD and for many suggestions regarding the current version of this dissertation, which will be incorporated in the next version (but not in the printed version, because of the regulations of the University of Amsterdam).

Jop Briët, Monique Laurent, Lex Schrijver, Peter Vrana, Matthias Christandl, Maris Ozols, Michael Walter and Bart Sevenster I thank for helpful discussions regarding the results in Chapter 2 and Chapter 3 of this dissertation.

Srinivasan Arunachalam I thank for sharing the ups and downs during our four years as PhD students at CWI. Florian Speelman, Farrokh Labib, Sven Polak, Bart Litjens and Bart Sevenster I thank for numerous valuable research discussions.

Bikkie Aldeias and Rob van Rooijen I thank for their excellent library services. Martijn Zuiddam and Maris Ozols I thank for proofreading the draft of this dissertation.

Finally, I thank my parents and my brothers and my friends for their support.

Amsterdam, August 31, 2018
Jeroen Zuiddam

Publications

[BCPZ16] Harry Buhrman, Matthias Christandl, Christopher Perry and Jeroen Zuiddam. Clean quantum and classical communication protocols.
Physical Review Letters 117, 230503, 2016.
https://link.aps.org/doi/10.1103/PhysRevLett.117.230503
http://arxiv.org/abs/1605.07948

[BCZ17a] Markus Bläser, Matthias Christandl and Jeroen Zuiddam. The border support rank of two-by-two matrix multiplication is seven.
Manuscript, 2017.
https://arxiv.org/abs/1705.09652

[BCZ17b] Harry Buhrman, Matthias Christandl and Jeroen Zuiddam. Nondeterministic Quantum Communication Complexity: the Cyclic Equality Game and Iterated Matrix Multiplication.
In Christos H. Papadimitriou, editor, 8th Innovations in Theoretical Computer Science Conference (ITCS), 2017.
http://drops.dagstuhl.de/opus/volltexte/2017/8181

[BIZ17] Karl Bringmann, Christian Ikenmeyer and Jeroen Zuiddam. On algebraic branching programs of small width.
In Ryan O'Donnell, editor, 32nd Computational Complexity Conference (CCC), 2017.
https://doi.org/10.4230/LIPIcs.CCC.2017.20
https://arxiv.org/abs/1702.05328
Journal of the ACM, Vol. 65, No. 5, Article 32, 2018.
https://doi.org/10.1145/3209663

[BZ17] Jop Briët and Jeroen Zuiddam. On the orthogonal rank of Cayley graphs and impossibility of quantum round elimination.
Quantum Information and Computation, 2017.
http://www.rintonpress.com/xxqic17/qic-17-12/0106-0116.pdf
https://arxiv.org/abs/1608.06113

[CJZ18] Matthias Christandl, Asger Kjærulff Jensen and Jeroen Zuiddam. Tensor rank is not multiplicative under the tensor product.
Linear Algebra and its Applications 543:125–139, 2018.
https://doi.org/10.1016/j.laa.2017.12.020
https://arxiv.org/abs/1705.09379

[CVZ16] Matthias Christandl, Peter Vrana and Jeroen Zuiddam. Asymptotic tensor rank of graph tensors: beyond matrix multiplication.
Manuscript, 2016.
https://arxiv.org/abs/1609.07476

[CVZ18] Matthias Christandl, Peter Vrana and Jeroen Zuiddam. Universal Points in the Asymptotic Spectrum of Tensors: Extended Abstract.
In Proceedings of 50th Annual ACM SIGACT Symposium on the Theory of Computing (STOC), 2018.
https://doi.org/10.1145/3188745.3188766
https://arxiv.org/abs/1709.07851

[CZ18] Matthias Christandl and Jeroen Zuiddam. Tensor surgery and tensor rank.
(Journal of) computational complexity, 2018.
https://doi.org/10.1007/s00037-018-0164-8
https://arxiv.org/abs/1606.04085

[Zui17] Jeroen Zuiddam. A note on the gap between rank and border rank.
Linear Algebra and its Applications 525:33–44, 2017.
https://doi.org/10.1016/j.laa.2017.03.015
http://arxiv.org/abs/1504.05597

[Zui18] Jeroen Zuiddam. The asymptotic spectrum of graphs and the Shannon capacity.
Manuscript, 2018.
http://arxiv.org/abs/1807.00169

This dissertation is based on the above papers, with primary focus on the four highlighted papers.

Note on the relative contribution of the co-authors: for each paper, the contributions of the co-authors are roughly equal.

Chapter 1

Introduction

Volker Strassen published in 1969 his famous algorithm for multiplying any two $n \times n$ matrices using only $O(n^{2.81})$ rather than $O(n^3)$ arithmetical operations [Str69]. His discovery marked the beginning of a still ongoing line of research in the field of algebraic complexity theory, a line of research that by now touches several fields of mathematics, including algebraic geometry, representation theory, (quantum) information theory and combinatorics. This dissertation is inspired by and contributes to this line of research.

No further progress followed for almost 10 years after Strassen's discovery, despite the fact that "many scientists understood that discovery as a signal to attack the problem and to push the exponent further down" [Pan84]. Then in 1978 Pan improved the exponent from 2.81 to 2.79 [Pan78, Pan80]. One year later, Bini, Capovani, Lotti and Romani improved the exponent to 2.78 by constructing fast "approximative" algorithms for matrix multiplication and making these algorithms exact via the method of interpolation [BCRL79, Bin80]. Cast in the language of tensors, the result of Bini et al. corresponds to what we now call a "border rank" upper bound. The idea of studying approximative complexity or border complexity for algebraic problems has nowadays become an important theme in algebraic complexity theory.

Schönhage then obtained the exponent 2.55 by constructing a fast algorithm for computing many "disjoint" small matrix multiplications and transforming this into an algorithm for one large matrix multiplication [Sch81]. The upper bound was improved shortly after by works of Pan [Pan81], Romani [Rom82], and Coppersmith and Winograd [CW82], resulting in the exponent 2.50. Then in 1987 Strassen published the laser method, with which he obtained the exponent 2.48 [Str87]. The laser method was used in the same year by Coppersmith and Winograd to obtain the exponent 2.38 [CW87]. To do this they invented a method for constructing certain large combinatorial structures. This method, or actually the extended version that Strassen published in [Str91], we now call the Coppersmith–Winograd method. All further improvements on upper bounding the exponent essentially follow the framework of Coppersmith and Winograd, and the improvements do not affect the first two digits after the decimal point [CW90, Sto10, Wil12, LG14].

Define $\omega$ to be the optimal exponent in the complexity of matrix multiplication. We call $\omega$ the exponent of matrix multiplication. To summarise the above historical account on upper bounds: $\omega < 2.38$. On the other hand, the only lower bound we currently have is the trivial lower bound $2 \leq \omega$.

The history of upper bounds on the matrix multiplication exponent $\omega$, which began with Strassen's algorithm and ended with the Strassen laser method and Coppersmith–Winograd method, is well-known and well-documented, see e.g. [BCS97, Section 15.13]. However, there is remarkable work of Strassen on a theory of lower bounds for $\omega$ and similar types of exponents, and this work has received almost no attention. This theory of lower bounds is the theory of asymptotic spectra of tensors and is the topic of a series of papers by Strassen [Str86, Str87, Str88, Str91, Str05].

In the foregoing, the word tensor has popped up twice (namely when we mentioned border rank, and just now when we mentioned asymptotic spectra of tensors), but we have not discussed at all why tensors should be relevant for understanding the complexity of matrix multiplication. First we give a mini course on tensors. A $k$-tensor $t = (t_{i_1 \cdots i_k})_{i_1, \ldots, i_k}$ is a $k$-dimensional array of numbers from some field, say the complex numbers $\mathbb{C}$. Thus a 2-tensor is simply a matrix. A $k$-tensor is called simple if there exist $k$ vectors $v_1, \ldots, v_k$ such that the entries of $t$ are given by the products $t_{i_1 \cdots i_k} = (v_1)_{i_1} \cdots (v_k)_{i_k}$ for all indices $i_j$. The tensor rank of $t$ is the smallest number $n$ such that $t$ can be written as a sum of $n$ simple tensors. Thus the tensor rank of a 2-tensor is simply its matrix rank. Returning to the problem of finding the complexity of matrix multiplication, there is a special 3-tensor, called the matrix multiplication tensor, that encodes the computational problem of multiplying two $2 \times 2$ matrices. This 3-tensor is commonly denoted by $\langle 2, 2, 2 \rangle$. It turns out that the matrix multiplication exponent $\omega$ is exactly the asymptotic rate of growth of the tensor rank of the "Kronecker powers" of the tensor $\langle 2, 2, 2 \rangle$. This important observation follows from the fundamental fact that the computational problem of multiplying matrices is "self-reducible". Namely, we can multiply two matrices by viewing them as block matrices and then perform matrix multiplication at the level of the blocks.

We wrap up this introductory story. To understand the computational complexity of matrix multiplication, one should understand the asymptotic rate of growth of the tensor rank of a certain family of tensors, a family that is obtained by taking powers of a fixed tensor. The theory of asymptotic spectra is the theory of bounds on such asymptotic parameters of tensors.

The main story line of this dissertation concerns the theory of asymptotic spectra. In Section 1.1 of this introduction we discuss in more detail the computational problem of multiplying matrices. In Section 1.2 we discuss the asymptotic spectrum of tensors and discuss a new result: an explicit description of infinitely many elements in the asymptotic spectrum of tensors. In Section 1.3 we consider a new higher-order Coppersmith–Winograd method.

The theory of asymptotic spectra of tensors is a special case of an abstract theory of asymptotic spectra of preordered semirings, which we discuss in Section 1.4. In Section 1.5 we apply this abstract theory to a new setting, namely to graphs. By doing this we obtain a new dual characterisation of the Shannon capacity of graphs.

The second story line of this dissertation is about degeneration, an algebraic kind of approximation related to the concept of border rank of Bini et al. We discuss degeneration in the context of tensors in Section 1.6. There is a combinatorial version of tensor degeneration which we call combinatorial degeneration. We discuss a new result regarding combinatorial degeneration in Section 1.7. Finally, Section 1.8 is about a new result concerning degeneration for algebraic branching programs, an algebraic model of computation.

We finish in Section 1.9 with a discussion of the organisation of this dissertation into chapters.

1.1 Matrix multiplication

In this section we discuss in more detail the computational problem of multiplying two matrices.

Algebraic complexity theory studies algebraic algorithms for algebraic problems. Roughly speaking, algebraic algorithms are algorithms that use only the basic arithmetical operations $+$ and $\times$ over some field, say $\mathbb{R}$ or $\mathbb{C}$. A fundamental example of an algebraic problem is matrix multiplication.

If we multiply two $n \times n$ matrices by computing the inner products between any row of the first matrix and any column of the second matrix, one by one, we need roughly $2 \cdot n^3$ arithmetical operations ($+$ and $\times$). For example, we can multiply two $2 \times 2$ matrices with 12 arithmetical operations, namely 8 multiplications and 4 additions:

\[
\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
\begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix}
=
\begin{pmatrix}
a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} \\
a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22}
\end{pmatrix}.
\]

Since matrix multiplication is a basic operation in linear algebra, it is worthwhile to see if we can do better than $2 \cdot n^3$. In 1969, Strassen [Str69] published a better algorithm. The base routine of Strassen's algorithm is an algorithm for multiplying two $2 \times 2$ matrices with 7 multiplications, 18 additions and certain sign changes:

\[
\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
\begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix}
=
\begin{pmatrix}
x_1 + x_4 - x_5 + x_7 & x_3 + x_5 \\
x_2 + x_4 & x_1 + x_3 - x_2 + x_6
\end{pmatrix}
\]

with

\[
\begin{aligned}
x_1 &= (a_{11} + a_{22})(b_{11} + b_{22}) \\
x_2 &= (a_{21} + a_{22})\, b_{11} \\
x_3 &= a_{11}(b_{12} - b_{22}) \\
x_4 &= a_{22}(-b_{11} + b_{21}) \\
x_5 &= (a_{11} + a_{12})\, b_{22} \\
x_6 &= (-a_{11} + a_{21})(b_{11} + b_{12}) \\
x_7 &= (a_{12} - a_{22})(b_{21} + b_{22}).
\end{aligned}
\]

The general routine of Strassen's algorithm multiplies two $n \times n$ matrices by recursively dividing the matrices into four blocks and applying the base routine to multiply the blocks (this is the self-reducibility of matrix multiplication that we mentioned earlier). The base routine does not assume commutativity of the variables for correctness, so indeed we can take the variables to be matrices. After expanding the recurrence we see that Strassen's algorithm uses $4.7 \cdot n^{\log_2 7} \approx 4.7 \cdot n^{2.81}$ arithmetical operations. Over the years, Strassen's algorithm was improved by many researchers. The best algorithm known today uses $C \cdot n^{2.38}$ arithmetical operations, where $C$ is some constant [CW90, Sto10, Wil12, LG14]. The exponent of matrix multiplication $\omega$ is the infimum over all real numbers $\beta$ such that for some constant $C_\beta$ we can multiply, for any $n \in \mathbb{N}$, any two $n \times n$ matrices with at most $C_\beta \cdot n^\beta$ arithmetical operations. From the above it follows that $\omega \leq 2.38$. From a simple flattening argument it follows that $2 \leq \omega$. We are left with the following well-known open problem: what is the value of the matrix multiplication exponent $\omega$?
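The recursion described above can be sketched in a few lines of Python. This is only an illustrative sketch under simplifying assumptions: matrices are lists of lists, the size $n$ is a power of two, and the recursion goes all the way down to $1 \times 1$ blocks. The helper names (`mat_add`, `mat_sub`, `split`, `naive_mul`, `strassen`) are chosen here for illustration and do not come from the text.

```python
def mat_add(A, B):
    """Entrywise sum of two equally sized square matrices."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def mat_sub(A, B):
    """Entrywise difference of two equally sized square matrices."""
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def naive_mul(A, B):
    """Row-times-column multiplication, roughly 2 * n^3 operations."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def split(M):
    """Split a matrix of even size into its four blocks m11, m12, m21, m22."""
    h = len(M) // 2
    return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
            [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])


def strassen(A, B):
    """Strassen's recursion; assumes len(A) == len(B) is a power of two."""
    if len(A) == 1:
        return [[A[0][0] * B[0][0]]]
    a11, a12, a21, a22 = split(A)
    b11, b12, b21, b22 = split(B)
    # The seven block products x1..x7 of the base routine. The order of the
    # factors matters, because blocks do not commute.
    x1 = strassen(mat_add(a11, a22), mat_add(b11, b22))
    x2 = strassen(mat_add(a21, a22), b11)
    x3 = strassen(a11, mat_sub(b12, b22))
    x4 = strassen(a22, mat_sub(b21, b11))
    x5 = strassen(mat_add(a11, a12), b22)
    x6 = strassen(mat_sub(a21, a11), mat_add(b11, b12))
    x7 = strassen(mat_sub(a12, a22), mat_add(b21, b22))
    c11 = mat_add(mat_sub(mat_add(x1, x4), x5), x7)
    c12 = mat_add(x3, x5)
    c21 = mat_add(x2, x4)
    c22 = mat_add(mat_add(mat_sub(x1, x2), x3), x6)
    # Reassemble the four blocks into one matrix.
    top = [r1 + r2 for r1, r2 in zip(c11, c12)]
    bottom = [r1 + r2 for r1, r2 in zip(c21, c22)]
    return top + bottom
```

For a $4 \times 4$ input, `strassen(A, B)` should agree with `naive_mul(A, B)`. The sketch makes no attempt to be fast in practice; it only mirrors the block structure of the recursion.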

The constant $C$ for the currently best algorithm is impractically large (for a discussion of this issue see e.g. [Pan18]). For a practical fast algorithm one should either improve $C$ or find a balance between $C$ and the exponent of $n$. We will ignore the size of $C$ in this dissertation and focus on the exponent $\omega$. For an overview of the field of algebraic complexity theory the reader should consult [BCS97] and [Sap16].

1.2 The asymptotic spectrum of tensors

We now discuss the theory of asymptotic spectra for tensors.

Let $s$ and $t$ be $k$-tensors over a field $\mathbb{F}$, $s \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$, $t \in \mathbb{F}^{m_1} \otimes \cdots \otimes \mathbb{F}^{m_k}$. We say $s$ restricts to $t$ and write $s \geq t$ if there are linear maps $A_i : \mathbb{F}^{n_i} \to \mathbb{F}^{m_i}$ such that $(A_1 \otimes \cdots \otimes A_k)(s) = t$. Let $[n] = \{1, \ldots, n\}$ for $n \in \mathbb{N}$. We define the product $s \otimes t \in \mathbb{F}^{n_1 m_1} \otimes \cdots \otimes \mathbb{F}^{n_k m_k}$ by $(s \otimes t)_{(i_1, j_1), \ldots, (i_k, j_k)} = s_{i_1 \cdots i_k}\, t_{j_1 \cdots j_k}$ for $i \in [n_1] \times \cdots \times [n_k]$ and $j \in [m_1] \times \cdots \times [m_k]$. This product generalizes the well-known Kronecker product of matrices. We refer to this product as the tensor (Kronecker) product. We define the direct sum $s \oplus t \in \mathbb{F}^{n_1 + m_1} \otimes \cdots \otimes \mathbb{F}^{n_k + m_k}$ by $(s \oplus t)_{\ell_1 \cdots \ell_k} = s_{\ell_1 \cdots \ell_k}$ if $\ell \in [n_1] \times \cdots \times [n_k]$, $(s \oplus t)_{n_1 + \ell_1, \ldots, n_k + \ell_k} = t_{\ell_1 \cdots \ell_k}$ if $\ell \in [m_1] \times \cdots \times [m_k]$, and $(s \oplus t)_{\ell_1 \cdots \ell_k} = 0$ for the remaining indices.
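The two operations just defined can be made concrete with a small sketch. The representation below is an assumption made for illustration only: a tensor is a pair `(shape, entries)` where `entries` maps 0-based index tuples to nonzero values, and a pair index $(i, j)$ of the Kronecker product is flattened to the single index $i \cdot m + j$. The function names are hypothetical.

```python
def unit_tensor(n, k=3):
    """Diagonal tensor <n> of order k: ones at the indices (i, ..., i)."""
    return (n,) * k, {(i,) * k: 1 for i in range(n)}


def kron(s, t):
    """Tensor (Kronecker) product: the entry at ((i1,j1),...,(ik,jk)) is s_i * t_j."""
    (ns, es), (ms, et) = s, t
    shape = tuple(n * m for n, m in zip(ns, ms))
    entries = {}
    for i, sv in es.items():
        for j, tv in et.items():
            # Flatten each pair (i_l, j_l) to the single index i_l * m_l + j_l.
            idx = tuple(a * m + b for a, b, m in zip(i, j, ms))
            entries[idx] = sv * tv
    return shape, entries


def direct_sum(s, t):
    """Direct sum: s sits in the first block, t in the shifted second block."""
    (ns, es), (ms, et) = s, t
    shape = tuple(n + m for n, m in zip(ns, ms))
    entries = dict(es)
    for j, tv in et.items():
        entries[tuple(n + b for n, b in zip(ns, j))] = tv
    return shape, entries
```

With this encoding, `kron(unit_tensor(2), unit_tensor(3))` equals `unit_tensor(6)` and `direct_sum(unit_tensor(2), unit_tensor(3))` equals `unit_tensor(5)`, in line with the fact that diagonal tensors multiply and add like natural numbers.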

The asymptotic restriction problem asks to compute the infimum of all real numbers $\beta \geq 0$ such that for all $n \in \mathbb{N}$

\[
s^{\otimes \beta n + o(n)} \geq t^{\otimes n}.
\]

We may think of the asymptotic restriction problem as having two directions, namely to find

1. obstructions, "certificates" that prohibit $s^{\otimes \beta n + o(n)} \geq t^{\otimes n}$, or

2. constructions, linear maps that carry out $s^{\otimes \beta n + o(n)} \geq t^{\otimes n}$.

Ideally, we would like to find matching obstructions and constructions so that we indeed learn the value of $\beta$.

What do obstructions look like? We set $\beta$ equal to one; it turns out that it is sufficient to understand this case. We say $s$ restricts asymptotically to $t$ and write $s \gtrsim t$ if

\[
s^{\otimes n + o(n)} \geq t^{\otimes n}.
\]

What do obstructions look like for asymptotic restriction $\gtrsim$? More precisely, what do obstructions look like for $\gtrsim$ restricted to a subset $S \subseteq \{k\text{-tensors over } \mathbb{F}\}$? Let us assume $S$ is closed under direct sum and tensor product, and contains the diagonal tensors $\langle n \rangle = \sum_{i=1}^{n} e_i \otimes \cdots \otimes e_i$ for $n \in \mathbb{N}$, where $e_1, \ldots, e_n$ is the standard basis of $\mathbb{F}^n$. Let $X(S)$ be the set of all maps $\phi : S \to \mathbb{R}_{\geq 0}$ that are

(a) monotone under restriction $\geq$

(b) multiplicative under the tensor (Kronecker) product $\otimes$

(c) additive under the direct sum $\oplus$

(d) normalised to $\phi(\langle n \rangle) = n$ at the diagonal tensor $\langle n \rangle$.

The elements $\phi \in X(S)$ are called spectral points of $S$. The set $X(S)$ is called the asymptotic spectrum of $S$.

Spectral points $\phi \in X(S)$ are obstructions. Let $s, t \in S$. If $s \gtrsim t$, then by definition we have a restriction $s^{\otimes n + o(n)} \geq t^{\otimes n}$. Then (a) and (b) imply the inequality $\phi(s)^{n + o(n)} = \phi(s^{\otimes n + o(n)}) \geq \phi(t^{\otimes n}) = \phi(t)^n$. Taking $n$-th roots and letting $n \to \infty$, this implies $\phi(s) \geq \phi(t)$. We negate that statement: if $\phi(s) < \phi(t)$, then not $s \gtrsim t$. In that case $\phi$ is an obstruction to $s \gtrsim t$.

The remarkable fact is that $X(S)$ is a complete set of obstructions for $\gtrsim$. Namely, for $s, t \in S$ the asymptotic restriction $s \gtrsim t$ holds if and only if we have $\phi(s) \geq \phi(t)$ for all spectral points $\phi \in X(S)$. This was proven by Volker Strassen in [Str86, Str88]. His proof uses a theorem of Becker and Schwarz [BS83] which is commonly referred to as the Kadison–Dubois theorem (for historical reasons) or the real representation theorem. (We will say more about this completeness result in Section 1.4.)

Let us introduce tensor rank and subrank, and their asymptotic versions. The tensor rank of $t$ is the size of the smallest diagonal tensor that restricts to $t$, $R(t) = \min\{r \in \mathbb{N} : t \leq \langle r \rangle\}$, and the subrank of $t$ is the size of the largest diagonal tensor to which $t$ restricts, $Q(t) = \max\{r \in \mathbb{N} : \langle r \rangle \leq t\}$. Asymptotic rank is defined as $\tilde{R}(t) = \lim_{n \to \infty} R(t^{\otimes n})^{1/n}$ and asymptotic subrank is defined as $\tilde{Q}(t) = \lim_{n \to \infty} Q(t^{\otimes n})^{1/n}$. From Fekete's lemma it follows that $\tilde{Q}(t) = \sup_n Q(t^{\otimes n})^{1/n}$ and $\tilde{R}(t) = \inf_n R(t^{\otimes n})^{1/n}$. One easily verifies that every spectral point $\phi \in X(S)$ is an upper bound on asymptotic subrank and a lower bound on asymptotic rank for any tensor $t \in S$:

\[
\tilde{Q}(t) \leq \phi(t) \leq \tilde{R}(t).
\]

Strassen used the completeness of $X(S)$ for $\lesssim$ to prove $\tilde{Q}(t) = \min_{\phi \in X(S)} \phi(t)$ and $\tilde{R}(t) = \max_{\phi \in X(S)} \phi(t)$. One should think of these expressions as being dual to the defining expressions for $\tilde{Q}$ and $\tilde{R}$.

We mentioned that Strassen was motivated to study the asymptotic spectrum of tensors by the study of the complexity of matrix multiplication. The precise connection with matrix multiplication is as follows. The matrix multiplication exponent $\omega$ is characterised by the asymptotic rank $\tilde{R}(\langle 2, 2, 2 \rangle)$ of the matrix multiplication tensor

\[
\langle 2, 2, 2 \rangle = \sum_{i, j, k \in [2]} e_{ij} \otimes e_{jk} \otimes e_{ki} \in \mathbb{F}^4 \otimes \mathbb{F}^4 \otimes \mathbb{F}^4
\]

via $\tilde{R}(\langle 2, 2, 2 \rangle) = 2^{\omega}$. We know the trivial lower bound $2 \leq \omega$, see Section 4.3. We know the (nontrivial) upper bound $\omega \leq 2.3728639$, which is by Coppersmith and Winograd [CW90] and improvements by Stothers [Sto10], Williams [Wil12] and Le Gall [LG14]. It may seem that for the study of matrix multiplication only the asymptotic rank $\tilde{R}$ is of interest, and that the asymptotic subrank $\tilde{Q}$ is just a toy parameter. Asymptotic subrank, however, plays an important role in the currently best matrix multiplication algorithms. We will discuss this idea in the context of the asymptotic subrank of so-called complete graph tensors in Section 5.5.
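As a sanity check on the link between this tensor and Strassen's base routine from Section 1.1, the following sketch verifies that the seven products $x_1, \ldots, x_7$ give a decomposition of $\langle 2, 2, 2 \rangle$ into seven simple tensors, so that $R(\langle 2, 2, 2 \rangle) \leq 7$. The encoding is an assumption made here for illustration: the basis vector $e_{ij}$ is given the 0-based flat index $2i + j$, and the coefficient vectors `U`, `V`, `W` are read off from the formulas for $x_1, \ldots, x_7$ and for the four entries of the product matrix.

```python
from itertools import product


def matmul_tensor(n):
    """Support of <n,n,n> = sum_{i,j,k} e_ij (x) e_jk (x) e_ki, all coefficients 1."""
    return {(n * i + j, n * j + k, n * k + i): 1
            for i, j, k in product(range(n), repeat=3)}


# Coefficients of (a11, a12, a21, a22) in the left factor of x1..x7.
U = [(1, 0, 0, 1), (0, 0, 1, 1), (1, 0, 0, 0), (0, 0, 0, 1),
     (1, 1, 0, 0), (-1, 0, 1, 0), (0, 1, 0, -1)]
# Coefficients of (b11, b12, b21, b22) in the right factor of x1..x7.
V = [(1, 0, 0, 1), (1, 0, 0, 0), (0, 1, 0, -1), (-1, 0, 1, 0),
     (0, 0, 0, 1), (1, 1, 0, 0), (0, 0, 1, 1)]
# Coefficient of x_m in the output entries (c11, c12, c21, c22).
W = [(1, 0, 0, 1), (0, 0, 1, -1), (0, 1, 0, 1), (1, 0, 1, 0),
     (-1, 1, 0, 0), (0, 0, 0, 1), (1, 0, 0, 0)]


def strassen_decomposition():
    """Sum of the seven simple tensors u_m (x) v_m (x) w_m, as a sparse dict."""
    total = {}
    for u, v, w in zip(U, V, W):
        # The third tensor factor is indexed by (k, i), i.e. by the transposed
        # output entry, so positions c12 and c21 are swapped.
        wt = (w[0], w[2], w[1], w[3])
        for p, q, r in product(range(4), repeat=3):
            val = u[p] * v[q] * wt[r]
            if val:
                total[(p, q, r)] = total.get((p, q, r), 0) + val
    # Drop entries that cancelled to zero.
    return {idx: val for idx, val in total.items() if val != 0}
```

Under these conventions `strassen_decomposition()` should equal `matmul_tensor(2)`, a tensor with eight nonzero entries written as a sum of seven simple tensors.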

The important message is: understanding the asymptotic spectrum of tensors $X(S)$ means understanding asymptotic restriction $\lesssim$, the asymptotic subrank $\tilde{Q}$ and the asymptotic rank $\tilde{R}$ of tensors. Of course we should now find an explicit description of $X(S)$.

Our main result regarding the asymptotic spectrum of tensors is the explicit description of an infinite family of elements in the asymptotic spectrum of all complex tensors $X(\{\text{complex } k\text{-tensors}\})$, which we call the quantum functionals (Chapter 6). Finding such an infinite family has been an open problem since the work of Strassen. Moment polytopes (studied under the name entanglement polytopes in quantum information theory [WDGC13]) play a key role here. To each tensor $t$ is associated a convex polytope $P(t)$ collecting representation-theoretic information about $t$, called the moment polytope of $t$. (See e.g. [Nes84, Bri87, WDGC13, SOK14].) The moment polytope has two important equivalent descriptions.

Quantum marginal spectra description. We begin with the description of $P(t)$ in terms of quantum marginal spectra.

Let $V$ be a (finite-dimensional) Hilbert space. In quantum information theory, a positive semidefinite hermitian operator $\rho : V \to V$ with trace one is called a density operator. The sequence of eigenvalues of a density operator $\rho$ is a probability vector. We let $\mathrm{spec}(\rho) = (p_1, \ldots, p_n)$ be the sequence of eigenvalues of $\rho$, ordered non-increasingly, $p_1 \geq \cdots \geq p_n$. Let $V_1$ and $V_2$ be Hilbert spaces. Given a density operator $\rho$ on $V_1 \otimes V_2$, the reduced density operator $\rho_1 = \mathrm{tr}_2\, \rho$ is uniquely defined by the property that $\mathrm{tr}(\rho_1 X_1) = \mathrm{tr}(\rho (X_1 \otimes \mathrm{Id}_{V_2}))$ for all operators $X_1$ on $V_1$. The operator $\rho_1$ is again a density operator. The operation $\mathrm{tr}_2$ is called the partial trace over $V_2$. In an explicit form, $\rho_1$ is given by $\langle e_i, \rho_1(e_j) \rangle = \sum_{\ell} \langle e_i \otimes f_\ell, \rho(e_j \otimes f_\ell) \rangle$, where the $e_i$ form a basis of $V_1$ and the $f_i$ form an orthonormal basis of $V_2$ (the statement is independent of basis choice).
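The explicit formula for the partial trace can be tried out on small examples. The sketch below is restricted, as a simplifying assumption, to real vectors in the standard (orthonormal) bases, with the basis vector $e_i \otimes f_\ell$ of $V_1 \otimes V_2$ given the flat index $i \cdot d_2 + \ell$. The function names are hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def pure_state_density(t):
    """Density operator t t^* / <t, t> of a nonzero real vector t, as a matrix."""
    norm = sum(x * x for x in t)
    return [[Fraction(x * y, norm) for y in t] for x in t]


def partial_trace_2(rho, d1, d2):
    """Reduced density operator rho_1 = tr_2(rho) on the first factor.

    Entry (i, j) is the sum over l of rho[(i, l), (j, l)], with the pair
    (i, l) stored at the flat index i * d2 + l.
    """
    return [[sum(rho[i * d2 + l][j * d2 + l] for l in range(d2))
             for j in range(d1)]
            for i in range(d1)]
```

For the vector $e_0 \otimes e_0 + e_1 \otimes e_1$, stored as `[1, 0, 0, 1]`, the reduced density operator comes out as half the identity, with spectrum $(1/2, 1/2)$. For the product vector $e_0 \otimes e_0$, stored as `[1, 0, 0, 0]`, it is the rank-one projector with spectrum $(1, 0)$.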

Let $V_i$ be a Hilbert space and consider the tensor product $V_1 \otimes V_2 \otimes V_3$. Associate with $t \in V_1 \otimes V_2 \otimes V_3$ the dual element $t^* = \langle t, \cdot \rangle \in (V_1 \otimes V_2 \otimes V_3)^*$. Then $\rho^t = t t^* / \langle t, t \rangle = t \langle t, \cdot \rangle / \langle t, t \rangle$ is a density operator on $V_1 \otimes V_2 \otimes V_3$. Viewing $\rho^t$ as a density operator on the regrouped space $V_1 \otimes (V_2 \otimes V_3)$, we may take the partial trace of $\rho^t$ over $V_2 \otimes V_3$ as described above. We denote the resulting density operator by $\rho^t_1 = \mathrm{tr}_{23}\, \rho^t$. We similarly define $\rho^t_2$ and $\rho^t_3$.

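Let $V = V_1 \otimes V_2 \otimes V_3$. Let $G = \mathrm{GL}(V_1) \times \mathrm{GL}(V_2) \times \mathrm{GL}(V_3)$ act naturally on $V$. Let $t \in V \setminus \{0\}$. The moment polytope of $t$ is

\[
P(t) = P(\overline{G \cdot t}) = \{ (\mathrm{spec}(\rho^u_1), \mathrm{spec}(\rho^u_2), \mathrm{spec}(\rho^u_3)) : u \in \overline{G \cdot t} \setminus \{0\} \}.
\]

Here $\overline{G \cdot t}$ denotes the Zariski closure, or equivalently the Euclidean closure, in $V$ of the orbit $G \cdot t = \{ g \cdot t : g \in G \}$.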

Representation-theoretic description. On the other hand, there is a description of $P(t)$ in terms of non-vanishing of representation-theoretic multiplicities. We do not state this description here, but stress that it is crucial for our proofs.

Quantum functionals. For any probability vector $\theta \in \mathbb{R}^k$ (i.e. $\sum_{i=1}^{k} \theta(i) = 1$ and $\theta(i) \geq 0$ for all $i \in [k]$) we define the quantum functional $F^{\theta}$ as an optimisation over the moment polytope:

\[
F^{\theta}(t) = \max \Bigl\{ 2^{\sum_{i=1}^{k} \theta(i) H(x^{(i)})} : (x^{(1)}, \ldots, x^{(k)}) \in P(t) \Bigr\}.
\]

Here $H(y)$ denotes Shannon entropy of the probability vector $y$. We prove that $F^{\theta}$ satisfies properties (a), (b), (c) and (d) for all complex $k$-tensors.

Theorem (Theorem 6.11). $F^{\theta} \in X(\{\text{complex } k\text{-tensors}\})$.

To put our result into context, Strassen in [Str91] constructed elements in the asymptotic spectrum of $S = \{\text{oblique } k\text{-tensors over } \mathbb{F}\}$ with the preorder $\leq|_S$. The set $S$ is a strict and non-generic subset of all $k$-tensors over $\mathbb{F}$. These elements we call the (Strassen) support functionals. On oblique tensors over $\mathbb{C}$, the quantum functionals and the support functionals coincide. An advantage of the support functionals over the quantum functionals is that they are defined over any field. In fact, the support functionals are "powerful enough" to reprove the result of Ellenberg–Gijswijt on cap sets [EG17]. We discuss the support functionals in Section 4.4.

1.3 Higher-order CW method

Recall that in the asymptotic restriction problem we have an obstruction direction and a construction direction. The quantum functionals and the support functionals provide obstructions. Now we look at the construction direction. Constructions are asymptotic transformations $s^{\otimes \beta n + o(n)} \geq t^{\otimes n}$. We restrict attention to the case that $t$ is a diagonal tensor $\langle r \rangle$. Constructions in this case essentially correspond to lower bounds on the asymptotic subrank $\tilde{Q}(s)$. The goal is now to construct good lower bounds on $\tilde{Q}(s)$.

Strassen solved the problem of computing the asymptotic subrank for so-called tight 3-tensors with the Coppersmith–Winograd (CW) method and the support functionals [CW90, Str91]. The CW method is combinatorial. Let us introduce the combinatorial viewpoint. Let $I_1, \ldots, I_k$ be finite sets. We call a set $D \subseteq I_1 \times \cdots \times I_k$ a diagonal if any two distinct elements $a, b \in D$ differ in all $k$ coordinates. Let $\Phi \subseteq I_1 \times \cdots \times I_k$. We call a diagonal $D \subseteq \Phi$ free if $D = \Phi \cap (D_1 \times \cdots \times D_k)$. Here $D_i = \{a_i : a \in D\}$ is the projection of $D$ onto the $i$th coordinate. The subrank $Q(\Phi)$ of $\Phi$ is the size of the largest free diagonal $D \subseteq \Phi$. For two sets $\Phi \subseteq I_1 \times \cdots \times I_k$ and $\Psi \subseteq J_1 \times \cdots \times J_k$ we define the product $\Phi \times \Psi \subseteq (I_1 \times J_1) \times \cdots \times (I_k \times J_k)$ by $\Phi \times \Psi = \{((a_1, b_1), \ldots, (a_k, b_k)) : a \in \Phi, b \in \Psi\}$. The asymptotic subrank is defined as $\tilde{Q}(\Phi) = \lim_{n \to \infty} Q(\Phi^{\times n})^{1/n}$. One may think of $\Phi$ as a $k$-partite hypergraph and of a free diagonal in $\Phi$ as an induced $k$-partite matching.
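For very small sets the definitions of diagonal, free diagonal and combinatorial subrank can be checked by exhaustive search. The following brute-force sketch is for illustration only (it is exponential in $|\Phi|$ and is not the CW method). Elements of $\Phi$ are assumed to be tuples of equal length, and the function names are hypothetical.

```python
from itertools import combinations


def is_diagonal(D):
    """Any two distinct elements of D differ in every coordinate."""
    return all(all(x != y for x, y in zip(a, b))
               for a, b in combinations(D, 2))


def is_free(D, Phi):
    """D equals Phi intersected with the box D_1 x ... x D_k of its projections."""
    k = len(next(iter(Phi)))
    proj = [{a[i] for a in D} for i in range(k)]
    box = {a for a in Phi if all(a[i] in proj[i] for i in range(k))}
    return box == set(D)


def subrank(Phi):
    """Size of the largest free diagonal in Phi, by exhaustive search."""
    Phi = set(Phi)
    for r in range(len(Phi), 0, -1):
        for D in combinations(Phi, r):
            if is_diagonal(D) and is_free(D, Phi):
                return r
    return 0
```

For example, `subrank({(0, 0, 0), (1, 1, 1)})` is 2, since the whole set is a free diagonal. For `{(0, 0), (0, 1), (1, 1)}` the diagonal `{(0, 0), (1, 1)}` is not free because `(0, 1)` lies in its box, so the subrank is 1.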

How does this combinatorial version of subrank relate to the tensor version of subrank that we defined earlier? Let $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$. Expand $t$ in the standard basis, $t = \sum_{i \in [n_1] \times \cdots \times [n_k]} t_i\, e_{i_1} \otimes \cdots \otimes e_{i_k}$. Let $\mathrm{supp}(t)$ be the support of $t$ in the standard basis, $\mathrm{supp}(t) = \{ i \in [n_1] \times \cdots \times [n_k] : t_i \neq 0 \}$. Then $Q(\mathrm{supp}(t)) \leq Q(t)$.

We want to construct large free diagonals. Let $\Phi \subseteq I_1 \times \cdots \times I_k$. We call $\Phi$ tight if there are injective maps $\alpha_i : I_i \to \mathbb{Z}$ such that if $a \in \Phi$, then $\sum_{i=1}^{k} \alpha_i(a_i) = 0$. For a set $X$ let $\mathcal{P}(X)$ be the set of probability distributions on $X$. For $\theta \in \mathcal{P}([k])$ let $H_\theta(\Phi) = \max_{P \in \mathcal{P}(\Phi)} \sum_{i=1}^{k} \theta(i) H(P_i)$, where $H(P_i)$ denotes the Shannon entropy of the $i$th marginal distribution of $P$. In [Str91] Strassen used the CW method and the support functionals to characterise the asymptotic subrank $\tilde{Q}(\Phi)$ for tight $\Phi \subseteq I_1 \times I_2 \times I_3$. He proved the following. Let $\Phi \subseteq I_1 \times I_2 \times I_3$ be tight. Then

\[
\tilde{Q}(\Phi) = \min_{\theta \in \mathcal{P}([3])} 2^{H_\theta(\Phi)} = \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} 2^{H(P_i)}. \tag{1.1}
\]

We study the higher-order regime $\Phi \subseteq I_1 \times \cdots \times I_k$, $k \geq 4$.

Theorem (Theorem 5.7). Let $\Phi \subseteq I_1 \times \cdots \times I_k$ be tight. Then $\tilde{Q}(\Phi)$ is lower bounded by an expression that generalizes the right-hand side of (1.1).

Stating the lower bound requires a few definitions, so we do not state it here. It is not known whether our new lower bound matches the upper bound given by quantum or support functionals.

Using Theorem 5.7 we managed to exactly determine the asymptotic subranks of several new examples. These results we in turn used to obtain upper bounds on the asymptotic rank of so-called complete graph tensors, via a higher-order Strassen laser method.

1.4 Abstract asymptotic spectra

Strassen mainly studied tensors, but he developed an abstract theory of asymptotic spectra in a general setting. In the next section we apply this abstract theory to graphs. We now introduce the abstract theory. One has a semiring $S$ (think of a semiring as a ring without additive inverses) that contains $\mathbb{N}$, and a preorder $\leq$ on $S$ that (1) behaves well with respect to the semiring operations, (2) induces the natural order on $\mathbb{N}$, and (3) for any $a, b \in S$, $b \neq 0$, there is an $r \in \mathbb{N} \subseteq S$ with $a \leq r \cdot b$. We call such a preorder a Strassen preorder. The main theorem is that the asymptotic version $\lesssim$ of the Strassen preorder is characterised by the monotone semiring homomorphisms $S \to \mathbb{R}_{\geq 0}$. For $a, b \in S$, let $a \lesssim b$ if there is a sequence $(x_n) \in \mathbb{N}^{\mathbb{N}}$ with $x_n^{1/n} \to 1$ when $n \to \infty$ and $a^n \leq b^n x_n$ for all $n \in \mathbb{N}$. Let

\[
X = X(S, \leq) = \{ \phi \in \mathrm{Hom}(S, \mathbb{R}_{\geq 0}) : \forall a, b \in S,\ a \leq b \Rightarrow \phi(a) \leq \phi(b) \}.
\]

The set $X$ is called the asymptotic spectrum of $(S, \leq)$.

Theorem (Strassen). $a \lesssim b$ iff $\forall \phi \in X$, $\phi(a) \leq \phi(b)$.

Strassen applies this theorem to study rank and subrank of tensors. We define an abstract notion of rank $R(a) = \min\{ n \in \mathbb{N} : a \leq n \}$ and an abstract notion of subrank $Q(a) = \max\{ m \in \mathbb{N} : m \leq a \}$. We then naturally have an asymptotic rank $\tilde{R}(a) = \lim_{n \to \infty} R(a^n)^{1/n}$ and (under certain mild conditions) an asymptotic subrank $\tilde{Q}(a) = \lim_{n \to \infty} Q(a^n)^{1/n}$. In fact, $\tilde{R}(a) = \inf_n R(a^n)^{1/n}$ and $\tilde{Q}(a) = \sup_n Q(a^n)^{1/n}$ by Fekete's lemma. The theorem implies the following dual characterisations.

Corollary (Section 2.8). If $a \in S$ with $a^k \geq 2$ for some $k \in \mathbb{N}$, then

\[
\tilde{Q}(a) = \min_{\phi \in X} \phi(a).
\]

If $a \in S$ with $\phi(a) \geq 1$ for some $\phi \in X$, then

\[
\tilde{R}(a) = \max_{\phi \in X} \phi(a).
\]

In Chapter 2 we will discuss the abstract theory of asymptotic spectra. We will discuss a proof of the above theorem that is obtained by integrating the proofs of Strassen in [Str88] and the proof of the Kadison–Dubois theorem of Becker and Schwarz in [BS83]. We will also discuss some basic properties of general asymptotic spectra.

1.5 The asymptotic spectrum of graphs

In the previous section we have seen the abstract theory of asymptotic spectra. We now discuss a problem in graph theory where we can apply this abstract theory. Consider a communication channel with input alphabet $\{a, b, c, d, e\}$ and output alphabet $\{1, 2, 3, 4, 5\}$. When the sender gives an input to the channel, the receiver gets an output according to the following diagram, where an outgoing arrow is picked randomly (say uniformly randomly).

[Diagram: the inputs a, b, c, d, e in one column and the outputs 1, 2, 3, 4, 5 in a second column, with arrows from inputs to outputs.]

Output 2 has an incoming arrow from $a$ and an incoming arrow from $b$. We say $a$ and $b$ are confusable, because the receiver cannot know whether $a$ or $b$ was given as an input to the channel. In this channel the pairs of inputs $\{a, b\}$, $\{b, c\}$, $\{c, d\}$, $\{d, e\}$, $\{e, a\}$ are confusable. If we restrict the input set to a subset of pairwise non-confusable letters, say $\{a, c\}$, then we can use the channel to communicate two messages with zero error. It is clear that for this channel any non-confusable set of inputs has size at most two. Can we make better use of the channel if we use the channel twice? Yes: now the input set is the set of two-letter words $\{aa, ab, ac, ad, ae, ba, bb, \ldots\}$ and we have a set of pairwise non-confusable words $\{aa, bc, ce, db, ed\}$, which has size 5. Thus "per channel use" we can send at least $\sqrt{5}$ letters. What happens if we use the channel $n$ times?

The situation is concisely described by drawing the confusability graph of the channel, which has the input letters as vertices and the confusable pairs of input letters as edges. For the above channel the confusability graph is the 5-cycle $C_5$.

[Figure: the 5-cycle $C_5$ on the vertices a, b, c, d, e.]

A subset of inputs that are pairwise non-confusable corresponds to a subset of the vertices in the confusability graph that contains no edges, an independent set. The independence number of any graph $G$ is the size of the largest independent set in $G$, and is denoted by $\alpha(G)$. If $G$ is the confusability graph of some channel, then the confusability graph for using the channel $n$ times is denoted by $G^{\boxtimes n}$ (the graph product $\boxtimes$ is called the strong graph product). The question of how many letters we can send asymptotically translates to computing the limit

\[
\Theta(G) = \lim_{n \to \infty} \alpha(G^{\boxtimes n})^{1/n},
\]

which exists because $\alpha$ is supermultiplicative under $\boxtimes$. The parameter $\Theta(G)$ was introduced by Shannon [Sha56] and is called the Shannon capacity of the graph $G$. Computing the Shannon capacity is a nontrivial problem already for small graphs. Lovász in 1979 [Lov79] computed the value $\Theta(C_5) = \sqrt{5}$ by introducing and evaluating a new graph parameter $\vartheta$ which is now known as the Lovász theta number. Already for the 7-cycle $C_7$ the Shannon capacity is not known.
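The small cases of the channel example can be verified by brute force. In the sketch below the letters $a, \ldots, e$ are encoded as $0, \ldots, 4$ (an encoding chosen here for illustration), two letters are confusable when they are adjacent on the 5-cycle, and two distinct words are confusable when they are equal or confusable in every position, which is the adjacency rule of the strong graph product. The function names are hypothetical, and the exhaustive search is only feasible for such tiny graphs.

```python
from itertools import combinations, product


def cycle_adjacent(u, v, n=5):
    """Adjacency in the n-cycle on the vertices 0, ..., n-1."""
    return (u - v) % n in (1, n - 1)


def confusable(x, y):
    """Distinct words that are equal or adjacent in every coordinate."""
    return x != y and all(a == b or cycle_adjacent(a, b) for a, b in zip(x, y))


def is_independent(words):
    """No two of the given words are confusable."""
    return not any(confusable(x, y) for x, y in combinations(words, 2))


def independence_number(vertices):
    """Largest size of an independent set, by exhaustive search."""
    best = 0
    for r in range(1, len(vertices) + 1):
        if any(is_independent(c) for c in combinations(vertices, r)):
            best = r
        else:
            break  # subsets of independent sets are independent
    return best


def encode(word):
    """Map a word over the letters a..e to a tuple over 0..4."""
    return tuple("abcde".index(ch) for ch in word)


code = [encode(w) for w in ["aa", "bc", "ce", "db", "ed"]]
letters = [(i,) for i in range(5)]
pairs = list(product(range(5), repeat=2))
```

Here `is_independent(code)` confirms that the five two-letter words are pairwise non-confusable, `independence_number(letters)` gives $\alpha(C_5) = 2$, and `independence_number(pairs)` gives $\alpha(C_5^{\boxtimes 2}) = 5$, so no set of six two-letter words works.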

Duality theorem. We propose a new application of the abstract theory of asymptotic spectra to graph theory. The main theorem that results from this is a dual characterisation of the Shannon capacity of graphs. For graphs $G$ and $H$ we say $G \leq H$ if there is a graph homomorphism $\overline{G} \to \overline{H}$, i.e. from the complement of $G$ to the complement of $H$. We show graphs are a semiring under the strong graph product $\boxtimes$ and the disjoint union $\sqcup$, and $\leq$ is a Strassen preorder on this semiring. The rank in this setting is the clique cover number $\overline{\chi}(\cdot) = \chi(\overline{\,\cdot\,})$, i.e. the chromatic number of the complement. The subrank in this setting is the independence number $\alpha(\cdot)$. Let $X(\mathcal{G})$ be the set of semiring homomorphisms from graphs to $\mathbb{R}_{\geq 0}$ that are monotone under $\leq$. From the abstract theory of asymptotic spectra we derive the following duality theorem.

Theorem (Theorem 3.1). $\Theta(G) = \min_{\phi \in X(\mathcal{G})} \phi(G)$.

In Chapter 3 we will prove Theorem 3.1 and we will discuss the known elements in $X(\mathcal{G})$, which are the Lovász theta number and a family of parameters obtained by "fractionalising".

1.6 Tensor degeneration

We move to the second story line that we mentioned earlier: degeneration. Degeneration is a prominent theme in algebraic complexity theory. Roughly speaking, degeneration is an algebraic notion of approximation defined via orbit closures.

For tensors, for example, degeneration is defined as follows. Let $V_1, V_2, V_3$ be finite-dimensional complex vector spaces and let $V = V_1 \otimes V_2 \otimes V_3$ be the tensor product space. Let $G = \mathrm{GL}(V_1) \times \mathrm{GL}(V_2) \times \mathrm{GL}(V_3)$ act naturally on $V$. Let $s, t \in V$. Let $G \cdot t = \{ g \cdot t : g \in G \}$ be the orbit of $t$ under $G$. We say $t$ degenerates to $s$ and write $t \trianglerighteq s$ if $s$ is an element in the orbit closure $\overline{G \cdot t}$. Here the closure is taken with respect to the Zariski topology, or equivalently with respect to the Euclidean topology. One should think of this degeneration $\trianglelefteq$ as a topologically closed version of the restriction preorder $\leq$ for tensors that we defined earlier. Degeneration is a "larger" preorder than restriction in the sense that $s \leq t$ implies $s \trianglelefteq t$.

In several algebraic models of computation, approximative computations correspond to certain degenerations. In some models such an approximative computation can be turned into an exact computation at a small cost, for example using the method of interpolation. The currently fastest matrix multiplication algorithms are constructed in this way, for example.
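A standard small example of such an approximation, which is not taken from this section and is included here only for illustration, is the tensor $W = e_0 \otimes e_0 \otimes e_1 + e_0 \otimes e_1 \otimes e_0 + e_1 \otimes e_0 \otimes e_0$. It is well known to have tensor rank 3, yet it is a limit of tensors that are a combination of only two simple tensors, namely $\frac{1}{\varepsilon}\bigl((e_0 + \varepsilon e_1)^{\otimes 3} - e_0^{\otimes 3}\bigr)$ as $\varepsilon \to 0$. The sketch below, with hypothetical function names, measures the entrywise error of this approximation exactly.

```python
from fractions import Fraction
from itertools import product


def simple(u, v, w):
    """Entries of the simple tensor u (x) v (x) w for vectors of length 2."""
    return {(i, j, k): u[i] * v[j] * w[k]
            for i, j, k in product(range(2), repeat=3)}


# The W tensor: coefficient 1 exactly at the indices containing a single 1.
W = {idx: (1 if sum(idx) == 1 else 0) for idx in product(range(2), repeat=3)}


def approx_W(eps):
    """The tensor ((e0 + eps*e1)^(x)3 - e0^(x)3) / eps, built from two simple tensors."""
    a = [1, eps]
    e0 = [1, 0]
    s1 = simple(a, a, a)
    s0 = simple(e0, e0, e0)
    return {idx: (s1[idx] - s0[idx]) / eps for idx in s1}


def max_error(eps):
    """Largest entrywise distance between the approximation and W."""
    t = approx_W(eps)
    return max(abs(t[idx] - W[idx]) for idx in W)
```

With exact fractions, `max_error(Fraction(1, 10))` is `1/10` and `max_error(Fraction(1, 1000))` is `1/1000`: the error is of order $\varepsilon$, so the approximation converges to $W$ although each approximant uses only two simple tensors.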

On the other hand, it turns out that if a lower bound technique for an algebraic measure of complexity is "continuous", then the lower bounds obtained with this technique are already lower bounds on the approximative version of the complexity measure. This observation turns approximative complexity and degeneration into an interesting topic in itself. A research program in this direction is the geometric complexity theory program of Mulmuley and Sohoni towards separating the algebraic complexity class VP (and related classes) from VNP [MS01] (see also [Ike13]).

In this section we briefly discuss three results related to degeneration of tensors that are not discussed further in this dissertation. Then we will discuss results on combinatorial degeneration in Section 1.7 and algebraic branching program degeneration in Section 1.8.

Ratio of tensor rank and border rank. The approximative or degeneration version of tensor rank is called border rank and is denoted by $\underline{R}$. It has been known since the work of Bini and Strassen that tensor rank $R$ and border rank $\underline{R}$ are different. How much can they be different? In [Zui17] we showed the following lower bound. Let $k \geq 3$. There is a sequence of $k$-tensors $t_n$ in $(\mathbb{C}^{2^n})^{\otimes k}$ such that $R(t_n) / \underline{R}(t_n) \geq k - o(1)$ when $n \to \infty$. This answers a question of Landsberg and Michałek [LM16b] and disproves a conjecture of Rhodes [AJRS13]. Further progress will most likely require the construction of explicit tensors with high tensor rank, which has implications in formula complexity [Raz13].

Border support rank. Support rank is a variation on tensor rank, which has its own approximative version called border support rank. A border support rank upper bound for the matrix multiplication tensor yields an upper bound on the asymptotic complexity. This was shown by Cohn and Umans in the context of the group-theoretic approach towards fast matrix multiplication [CU13]. They asked: what is the border support rank of the smallest matrix multiplication tensor $\langle 2, 2, 2 \rangle$? In [BCZ17a] we showed that it equals seven. Our proof uses the highest-weight vector technique (see also [HIL13]). Our original motivation to study support rank is a connection that we found between support rank and nondeterministic multiparty quantum communication complexity [BCZ17b].

Tensor rank under outer tensor product. We applied degeneration as a tool to study an outer tensor product $\otimes$ on tensors. For $s \in \mathbb{C}^{n_1} \otimes \cdots \otimes \mathbb{C}^{n_k}$ and $t \in \mathbb{C}^{m_1} \otimes \cdots \otimes \mathbb{C}^{m_\ell}$ let $s \otimes t$ be the natural $(k + \ell)$-tensor in $\mathbb{C}^{n_1} \otimes \cdots \otimes \mathbb{C}^{n_k} \otimes \mathbb{C}^{m_1} \otimes \cdots \otimes \mathbb{C}^{m_\ell}$. The outer product and the usual tensor product of $k$-tensors differ by a regrouping of the tensor indices. It is well known that tensor rank is not multiplicative under the usual tensor product of $k$-tensors. In [CJZ18] we showed that tensor rank is already not multiplicative under the outer product, a stronger result. Nonmultiplicativity occurs when taking a power of a tensor whose border rank is strictly smaller than its tensor rank. This answers a question of Draisma [Dra15] and Saptharishi et al. [CKSV16].

1.7 Combinatorial degeneration

In the previous section we introduced the general idea of degeneration and discussed degeneration of tensors. Combinatorial degeneration is the combinatorial analogue of tensor degeneration. Consider sets $\Phi \subseteq \Psi \subseteq I_1 \times \cdots \times I_k$ of $k$-tuples. We say $\Phi$ is a combinatorial degeneration of $\Psi$, and write $\Psi \trianglerighteq \Phi$, if there are maps $u_i : I_i \to \mathbb{Z}$ such that for all $\alpha \in I_1 \times \cdots \times I_k$: if $\alpha \in \Psi \setminus \Phi$, then

$$\sum_{i=1}^k u_i(\alpha_i) > 0,$$

and if $\alpha \in \Phi$, then $\sum_{i=1}^k u_i(\alpha_i) = 0$. We prove that combinatorial asymptotic subrank is nonincreasing under combinatorial degeneration.

Theorem (Theorem 5.21). If $\Psi \trianglerighteq \Phi$, then $\tilde{Q}(\Psi) \geq \tilde{Q}(\Phi)$.

The analogous statement for subrank of tensors is trivially true. The crucial point is that Theorem 5.21 is about combinatorial subrank. As an example, Theorem 5.21 combined with the CW method yields an elegant optimal construction of tri-colored sum-free sets, which are combinatorial objects related to cap sets.
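The defining condition of combinatorial degeneration can be checked mechanically once candidate maps $u_i$ are given. The following is a minimal sketch; the function name `is_combinatorial_degeneration` and the small example (upper-triangular pairs degenerating to the diagonal) are illustrative assumptions and do not come from the dissertation.

```python
def is_combinatorial_degeneration(psi, phi, u):
    """Check whether the maps u = (u_1, ..., u_k) witness that phi is a
    combinatorial degeneration of psi.

    psi, phi: sets of k-tuples with phi a subset of psi.
    u: list of k dicts, u[i] maps elements of I_i to integers.
    """
    psi, phi = set(psi), set(phi)
    if not phi <= psi:
        return False
    for alpha in psi:
        weight = sum(u_i[a_i] for u_i, a_i in zip(u, alpha))
        if alpha in phi and weight != 0:
            return False  # tuples in phi must have weight exactly 0
        if alpha not in phi and weight <= 0:
            return False  # tuples in psi \ phi must have positive weight
    return True


# Hypothetical example with k = 2 and I_1 = I_2 = {0, 1, 2}.
psi = {(i, j) for i in range(3) for j in range(3) if i <= j}
phi = {(i, i) for i in range(3)}
u = [{i: -i for i in range(3)}, {j: j for j in range(3)}]  # weight is j - i
zero = [{i: 0 for i in range(3)}, {j: 0 for j in range(3)}]

print(is_combinatorial_degeneration(psi, phi, u))
print(is_combinatorial_degeneration(psi, phi, zero))
```

Here the weight of $(i, j)$ is $j - i$, which vanishes on the diagonal and is positive strictly above it, so the first check succeeds; the all-zero maps fail because they give weight 0 to tuples outside $\Phi$.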

1.8 Algebraic branching program degeneration

We now consider degeneration in the context of algebraic branching programs. A central theme in algebraic complexity theory is the study of the power of different algebraic models of computation and the study of the corresponding complexity classes. We have already (implicitly) used an algebraic model of computation when we discussed matrix multiplication: circuits.


• A circuit is a directed acyclic graph $G$ with one or more source vertices and one sink vertex. Each source vertex is labelled by a variable $x_i$ or a constant $\alpha \in \mathbb{F}$. The other vertices are labelled by either $+$ or $\times$ and have in-degree 2 (that is, fan-in 2). Each vertex of $G$ naturally computes a polynomial. The value of $G$ is the element computed at the sink vertex. The size of $G$ is the number of vertices. (One may also allow multiple sink vertices in order to compute multiple polynomials, e.g. to compute matrix multiplication.) Here is an example of a circuit computing $xy + 2x + y - 1$.

[Figure: a circuit with source vertices $-1$, $2$, $x$, $y$, two $\times$ vertices, two $+$ vertices, and a $+$ sink vertex.]
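To make the circuit model concrete, here is a small sketch that evaluates a fan-in-2 circuit for $xy + 2x + y - 1$. The wiring below (gate names `g1` to `g4` and `sink`) is an assumption chosen to match the vertex counts of the example; the edges of the original figure are not recoverable from the text.

```python
import operator

# Hypothetical wiring: gate name -> (operation, left input, right input).
CIRCUIT = {
    "g1": (operator.mul, "x", "y"),      # x * y
    "g2": (operator.mul, "2", "x"),      # 2 * x
    "g3": (operator.add, "g1", "g2"),    # xy + 2x
    "g4": (operator.add, "y", "-1"),     # y - 1
    "sink": (operator.add, "g3", "g4"),  # xy + 2x + y - 1
}


def evaluate(circuit, sources, vertex="sink"):
    """Recursively compute the value at a vertex of the circuit."""
    if vertex in sources:
        return sources[vertex]
    op, left, right = circuit[vertex]
    return op(evaluate(circuit, sources, left), evaluate(circuit, sources, right))


def circuit_value(x, y):
    # Source vertices: the variables x, y and the constants 2 and -1.
    sources = {"x": x, "y": y, "2": 2, "-1": -1}
    return evaluate(CIRCUIT, sources)


def circuit_size():
    # Size = number of vertices = 4 source vertices + number of gates.
    return 4 + len(CIRCUIT)


print(circuit_value(3, 5), circuit_size())
```

With this wiring the circuit has 4 source vertices and 5 gates, so its size in the sense above is 9.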

Consider the following two models.

• A formula is a circuit whose graph is a tree.

• An algebraic branching program (abp) is a directed acyclic graph $G$ with one source vertex $s$, one sink vertex $t$ and affine linear forms over the base field $\mathbb{F}$ as edge labels. Moreover, each vertex is labelled with an integer (its layer) and the arrows in the abp point from vertices in layer $i$ to vertices in layer $i + 1$. The cardinality of the largest layer we call the width of the abp. The number of vertices we call the size of the abp. The value of an abp is the sum of the values of all $s$–$t$-paths, where the value of an $s$–$t$-path is the product of its edge labels. We say that an abp computes its value. Here is an example of a width-3 abp computing $xy + 2x + y - 1$.

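[Figure: a width-3 abp with source vertex $s$ and sink vertex $t$; the surviving edge labels are $x$, $2$, $x$, $y$, $-1$.]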
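The value of an abp can be computed layer by layer, summing over incoming edges. The sketch below uses a hypothetical width-3 abp for $xy + 2x + y - 1$; its vertices `a`, `b`, `c` and edge labels are assumptions for illustration and need not coincide with the abp in the original figure.

```python
# Edges of a hypothetical layered abp: (tail, head, affine linear form in x, y).
EDGES = [
    ("s", "a", lambda x, y: x),
    ("s", "b", lambda x, y: y),
    ("s", "c", lambda x, y: -1),
    ("a", "t", lambda x, y: y + 2),
    ("b", "t", lambda x, y: 1),
    ("c", "t", lambda x, y: 1),
]
LAYERS = [["s"], ["a", "b", "c"], ["t"]]


def abp_value(x, y):
    """Sum over all s-t paths of the product of the edge labels."""
    # value[v] = sum over all s-v paths of the product of the labels on the path
    value = {"s": 1}
    for layer in LAYERS[1:]:
        for v in layer:
            value[v] = sum(
                value[tail] * label(x, y)
                for tail, head, label in EDGES
                if head == v
            )
    return value["t"]


def abp_width():
    return max(len(layer) for layer in LAYERS)


print(abp_value(3, 5), abp_width())
```

The three paths contribute $x(y + 2)$, $y$ and $-1$, which sum to $xy + 2x + y - 1$. The layer-by-layer update is the same computation as multiplying the matrices of edge labels between consecutive layers.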


The above models of computation give rise to complexity classes. A complexity class consists of families of multivariate polynomials $(f_n)_n = (f_n(x_1, \ldots, x_{q_n}))_{n \in \mathbb{N}}$ over some fixed field $\mathbb{F}$. We say a family of polynomials $(f_n)_n$ is a p-family if the degree of $f_n$ and the number of variables of $f_n$ grow polynomially in $n$. Let $\mathrm{VP}$ be the class of p-families with polynomially bounded circuit size. Let $\mathrm{VP}_e$ be the class of p-families with polynomially bounded formula size. For $k \in \mathbb{N}$ let $\mathrm{VP}_k$ be the class of families of polynomials computable by width-$k$ abps of polynomially bounded size. Let $\mathrm{VP}_s$ be the class of p-families computable by skew circuits of polynomial size. Skew circuits are a type of circuit between formulas and general circuits. The class $\mathrm{VP}_s$ coincides with the class of families of polynomials computable by abps of polynomially bounded size (see e.g. [Sap16]). Ben-Or and Cleve proved that $\mathrm{VP}_3 = \mathrm{VP}_4 = \cdots = \mathrm{VP}_e$ [BOC92]. Allender and Wang proved $\mathrm{VP}_2 \subsetneq \mathrm{VP}_3$ [AW16]. Thus $\mathrm{VP}_2 \subsetneq \mathrm{VP}_3 = \mathrm{VP}_4 = \cdots = \mathrm{VP}_e \subseteq \mathrm{VP}_s$.

The following separation problem is one of the many open problems regarding algebraic complexity classes: is the inclusion $\mathrm{VP}_e \subseteq \mathrm{VP}_s$ strict? Motivated by this separation problem we study the approximation closure of $\mathrm{VP}_e$. We mentioned that Ben-Or and Cleve proved that formula size is polynomially equivalent to width-3 abp size [BOC92]. Regarding width 2, there are explicit polynomials that cannot be computed by any width-2 abp of any size [AW16]. The abp model has a natural notion of approximation. When we allow approximation in our abps, the situation changes completely.

Theorem (Theorem 7.8). Any polynomial can be approximated by a width-2 abp of size polynomial in the formula size.

In terms of complexity classes this means $\overline{\mathrm{VP}_2} = \overline{\mathrm{VP}_e}$, where $\overline{\,\cdot\,}$ denotes the “approximation closure” of the complexity class. The theorem suggests an approach regarding the separation of $\mathrm{VP}_e$ and $\mathrm{VP}_s$. Namely, superpolynomial lower bounds on formula size may be obtained from superpolynomial lower bounds on approximate width-2 abp size. We moreover study the nondeterminism closure of complexity classes and prove a new characterisation of the complexity class $\mathrm{VNP}$.

1.9 Organisation

This dissertation is divided into chapters as follows. We will begin with the abstract theory of asymptotic spectra in Chapter 2. Then we introduce the asymptotic spectrum of graphs and a new characterisation of the Shannon capacity in Chapter 3. In Chapter 4 we introduce the asymptotic spectrum of tensors, discuss the support functionals of Strassen for oblique tensors, and a characterisation of asymptotic slice rank of oblique tensors as the minimum over the support functionals. In Chapter 5 we discuss tight tensors, the higher-order Coppersmith–Winograd method, the combinatorial degeneration method, and applications to the cap set problem, type sets and graph tensors. In Chapter 6 we introduce an infinite family of elements in the asymptotic spectrum of complex $k$-tensors and characterise the asymptotic slice rank as the minimum over the quantum functionals. Finally, in Chapter 7 we study algebraic branching programs, and approximation closure and nondeterminism closure of algebraic complexity classes.

Chapter 2

The theory of asymptotic spectra

2.1 Introduction

This is an expository chapter about the abstract theory of asymptotic spectra of Volker Strassen [Str88]. The theory studies semirings $S$ that are endowed with a preorder $\leqslant$. The main result, Theorem 2.12, is that under certain conditions the asymptotic version $\lesssim$ of this preorder is characterised by the semiring homomorphisms $S \to \mathbb{R}_{\geq 0}$ that are monotone under $\leqslant$. These monotone homomorphisms make up the “asymptotic spectrum” of $(S, \leqslant)$. For the elements of $S$ we have natural notions of rank and subrank, generalising rank and subrank of tensors. The asymptotic spectrum gives a dual characterisation of the asymptotic versions of rank and subrank. This dual description may be thought of as a “lower bound” method in the sense of computational complexity theory. In Chapter 3 and Chapter 4 we will study two specific pairs $(S, \leqslant)$.

2.2 Semirings and preorders

A (commutative) semiring is a set $S$ with a binary addition operation $+$, a binary multiplication operation $\cdot$, and elements $0, 1 \in S$, such that for all $a, b, c \in S$:

(1) $+$ is associative: $(a + b) + c = a + (b + c)$

(2) $+$ is commutative: $a + b = b + a$

(3) $0 + a = a$

(4) $\cdot$ is associative: $(a \cdot b) \cdot c = a \cdot (b \cdot c)$

(5) $\cdot$ is commutative: $a \cdot b = b \cdot a$

(6) $1 \cdot a = a$

(7) $\cdot$ distributes over $+$: $a \cdot (b + c) = (a \cdot b) + (a \cdot c)$

(8) $0 \cdot a = 0$.

As usual we abbreviate $a \cdot b$ as $ab$. A preorder is a relation $\preccurlyeq$ on a set $X$ such that for all $a, b, c \in X$:

(1) $\preccurlyeq$ is reflexive: $a \preccurlyeq a$

(2) $\preccurlyeq$ is transitive: $a \preccurlyeq b$ and $b \preccurlyeq c$ implies $a \preccurlyeq c$.

As usual, $a \preccurlyeq b$ is the same as $b \succcurlyeq a$. Let $\mathbb{N} = \{0, 1, 2, \ldots\}$ be the set of natural numbers and let $\mathbb{N}_{>0} = \{1, 2, \ldots\}$ be the set of strictly positive natural numbers. We write $\leq$ for the natural order $0 \leq 1 \leq 2 \leq 3 \leq \cdots$ on $\mathbb{N}$.

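2.3 Strassen preorders

Let $S$ be a semiring with $\mathbb{N} \subseteq S$. A preorder $\preccurlyeq$ on $S$ is a Strassen preorder if

(1) $\forall n, m \in \mathbb{N}$: $n \leq m$ iff $n \preccurlyeq m$

(2) $\forall a, b, c, d \in S$: if $a \preccurlyeq b$ and $c \preccurlyeq d$, then $a + c \preccurlyeq b + d$ and $ac \preccurlyeq bd$

(3) $\forall a, b \in S$, $b \neq 0$: $\exists r \in \mathbb{N}$ with $a \preccurlyeq rb$.

Note that condition (2) is equivalent to the condition: $\forall a, b, s \in S$, if $a \preccurlyeq b$, then $a + s \preccurlyeq b + s$ and $as \preccurlyeq bs$.

Let $\preccurlyeq$ be a Strassen preorder on $S$. Then $0 \preccurlyeq 1$ by condition (1). For $a \in S$ we have $a \preccurlyeq a$ by reflexivity, and thus $0 = 0 \cdot a \preccurlyeq 1 \cdot a = a$ by condition (2).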

Examples

We give two examples of a semiring with a Strassen preorder. Proofs and formal definitions are given later.

Graphs. Let $S$ be the set of all (isomorphism classes of) finite simple graphs. Let $G, H \in S$. Let $G \sqcup H$ be the disjoint union of $G$ and $H$. Let $G \boxtimes H$ be the strong graph product of $G$ and $H$ (see Chapter 3). With addition $\sqcup$ and multiplication $\boxtimes$ the set $S$ becomes a semiring. The 0 in $S$ is the graph with no vertices and the 1 in $S$ is the graph with a single vertex. Let $\overline{G}$ be the complement of $G$. Define a preorder $\leqslant$ on $S$ by $G \leqslant H$ if there is a graph homomorphism $\overline{G} \to \overline{H}$. Then $\leqslant$ is a Strassen preorder. We will investigate this semiring further in Chapter 3.
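As a small illustration of multiplication in the graph semiring, the sketch below builds the strong product of two graphs and computes independence numbers by brute force. It assumes the standard definition of the strong product (distinct pairs are adjacent when each coordinate is equal or adjacent); the independence number is a standard graph parameter that is not defined in this section, and the helper names are hypothetical.

```python
from itertools import product


def cycle(n):
    """Adjacency dict of the cycle graph on n vertices."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


def strong_product(g, h):
    """Strong product: distinct (a, b), (c, d) are adjacent iff
    (a == c or a ~ c) and (b == d or b ~ d)."""
    adj = {}
    for a, b in product(g, h):
        adj[(a, b)] = {
            (c, d)
            for c, d in product(g, h)
            if (c, d) != (a, b)
            and (a == c or c in g[a])
            and (b == d or d in h[b])
        }
    return adj


def independence_number(adj):
    """Size of a largest set of pairwise non-adjacent vertices (brute force)."""
    def solve(candidates):
        if not candidates:
            return 0
        v = next(iter(candidates))
        rest = candidates - {v}
        nbrs = adj[v] & candidates
        if not nbrs:
            return 1 + solve(rest)  # an isolated vertex can always be taken
        # either leave v out, or take v and discard its neighbours
        return max(solve(rest), 1 + solve(rest - nbrs))
    return solve(frozenset(adj))


c5 = cycle(5)
c5_squared = strong_product(c5, c5)
print(independence_number(c5), independence_number(c5_squared))
```

For the 5-cycle this gives 2 for the graph itself and 5 for its strong square, so the square root of the second value exceeds the first: a parameter can grow faster under powers than a single evaluation suggests, which is the kind of asymptotic behaviour the theory in this chapter addresses.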


Tensors. Let $\mathbb{F}$ be a field. Let $k \in \mathbb{N}$. Let $S$ be the set of all $k$-tensors over $\mathbb{F}$ with arbitrary format, that is, $S = \bigcup \{\mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k} : n_1, \ldots, n_k \in \mathbb{N}\}$. For $s \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ and $t \in \mathbb{F}^{m_1} \otimes \cdots \otimes \mathbb{F}^{m_k}$ let $s \leqslant t$ if there are linear maps $A_i : \mathbb{F}^{m_i} \to \mathbb{F}^{n_i}$ with $(A_1 \otimes \cdots \otimes A_k)\, t = s$. We identify any $s, t \in S$ for which $s \leqslant t$ and $t \leqslant s$. Let $\oplus$ be the direct sum of $k$-tensors and let $\otimes$ be the tensor product of $k$-tensors (see Chapter 4). With addition $\oplus$ and multiplication $\otimes$ the set $S$ becomes a semiring. The 0 in $S$ is the zero tensor, and the 1 in $S$ is the standard basis element $e_1 \otimes \cdots \otimes e_1 \in \mathbb{F}^1 \otimes \cdots \otimes \mathbb{F}^1$. The preorder $\leqslant$ is a Strassen preorder. We will investigate this semiring further in Chapter 4, Chapter 5 and Chapter 6.
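The restriction preorder on tensors can be made concrete by applying the maps $A_1 \otimes \cdots \otimes A_k$ coordinate-wise. The sketch below stores a tensor as a dictionary from index tuples to coefficients (indices start at 0) and each $A_i$ as a list of rows; this representation and the name `restrict` are assumptions made for illustration.

```python
from itertools import product


def restrict(t, maps):
    """Apply (A_1 ⊗ ... ⊗ A_k) to a tensor t given as {index tuple: coefficient}.

    Each A_i is a matrix stored as a list of rows, with n_i rows and m_i columns.
    """
    out_shape = [len(A) for A in maps]
    s = {}
    for out_idx in product(*(range(n) for n in out_shape)):
        total = 0
        for in_idx, coeff in t.items():
            term = coeff
            for A, i_out, i_in in zip(maps, out_idx, in_idx):
                term *= A[i_out][i_in]
            total += term
        if total != 0:
            s[out_idx] = total
    return s


# Hypothetical example: the diagonal 3-tensor e_0⊗e_0⊗e_0 + e_1⊗e_1⊗e_1.
diagonal = {(0, 0, 0): 1, (1, 1, 1): 1}
projection = [[1, 0]]  # linear map F^2 -> F^1 keeping the first coordinate

print(restrict(diagonal, [projection] * 3))
```

Projecting each factor onto its first coordinate sends the diagonal tensor to $e_1 \otimes e_1 \otimes e_1 \in \mathbb{F}^1 \otimes \mathbb{F}^1 \otimes \mathbb{F}^1$, the element 1 of the semiring, so $1 \leqslant t$ for this particular $t$.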

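2.4 Asymptotic preorders $\precsim$

Definition 2.1. Let $\preccurlyeq$ be a relation on $S$. Define the relation $\precsim$ on $S$ by

$$a_2 \precsim a_1 \quad\text{if}\quad \exists (x_N) \in \mathbb{N}^{\mathbb{N}},\ \inf_N x_N^{1/N} = 1,\ \forall N \in \mathbb{N}:\ a_2^N \preccurlyeq a_1^N x_N. \qquad (2.1)$$

If $\preccurlyeq$ is a Strassen preorder, then we may in (2.1) replace the infimum $\inf_N x_N^{1/N}$ by the limit $\lim_{N \to \infty} x_N^{1/N}$, since we may assume $x_{N+M} \leq x_N x_M$ (if $a_2^N \preccurlyeq a_1^N x_N$ and $a_2^M \preccurlyeq a_1^M x_M$, then $a_2^{N+M} \preccurlyeq a_1^{N+M} x_N x_M$) and then apply Fekete’s lemma (Lemma 2.2).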

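Lemma 2.2 (Fekete’s lemma, see [PS98, No. 98]). Let $x_1, x_2, x_3, \ldots \in \mathbb{R}_{\geq 0}$ satisfy $x_{n+m} \leq x_n + x_m$. Then $\lim_{n \to \infty} x_n/n = \inf_n x_n/n$.

Proof. Let $y = \inf_n x_n/n$. Let $\varepsilon > 0$. Let $m \in \mathbb{N}_{>0}$ with $x_m/m < y + \varepsilon$. Any $n \in \mathbb{N}$ can be written in the form $n = qm + r$, where $r$ is an integer with $0 \leq r \leq m - 1$. Set $x_0 = 0$. Then $x_n = x_{qm+r} \leq x_m + x_m + \cdots + x_m + x_r = q x_m + x_r$. Therefore

$$\frac{x_n}{n} = \frac{x_{qm+r}}{qm+r} \leq \frac{q x_m + x_r}{qm+r} = \frac{x_m}{m} \cdot \frac{qm}{qm+r} + \frac{x_r}{n}.$$

Thus

$$y \leq \frac{x_n}{n} < (y + \varepsilon)\, \frac{qm}{n} + \frac{x_r}{n}.$$

The claim follows because $x_r/n \to 0$ and $qm/n \to 1$ when $n \to \infty$.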
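A quick numerical illustration of Fekete’s lemma: the sequence $x_n = 2n + \sqrt{n}$ is an example chosen here (it does not appear in the source); it is subadditive because $\sqrt{n + m} \leq \sqrt{n} + \sqrt{m}$, and $x_n/n = 2 + 1/\sqrt{n}$ decreases to its infimum 2.

```python
import math


def x(n):
    """An illustrative subadditive sequence: x_n = 2n + sqrt(n)."""
    return 2 * n + math.sqrt(n)


def is_subadditive(seq, limit):
    """Check seq(n + m) <= seq(n) + seq(m) for 1 <= n, m < limit."""
    tolerance = 1e-12  # guards against floating-point rounding
    return all(
        seq(n + m) <= seq(n) + seq(m) + tolerance
        for n in range(1, limit)
        for m in range(1, limit)
    )


ratios = [x(n) / n for n in (1, 10, 100, 10_000, 1_000_000)]
print(is_subadditive(x, 60))
print(ratios)
```

The ratios approach 2 from above without ever reaching it, which shows that the infimum in the lemma need not be attained.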

For $a_1, a_2 \in S$, if $a_1 \preccurlyeq a_2$, then clearly $a_1 \precsim a_2$.

Lemma 2.3. Let $\preccurlyeq$ be a Strassen preorder on $S$. Then $\precsim$ is a Strassen preorder on $S$, the “asymptotic preorder” corresponding to $\preccurlyeq$.

Proof. Let $a, b, c, d \in S$. We verify that $\precsim$ is a preorder. First, reflexivity. We have $a \preccurlyeq a$, so $a^N \preccurlyeq a^N \cdot 1$, so $a \precsim a$.

Second, transitivity. Let $a \precsim b$ and $b \precsim c$. This means $a^N \preccurlyeq b^N x_N$ and $b^N \preccurlyeq c^N y_N$ with $x_N^{1/N} \to 1$ and $y_N^{1/N} \to 1$. Then $a^N \preccurlyeq b^N x_N \preccurlyeq c^N x_N y_N$. Since $(x_N y_N)^{1/N} \to 1$, we conclude $a \precsim c$.

We verify condition (1). Let $n, m \in \mathbb{N}$. If $n \leq m$, then $n \preccurlyeq m$, so $n \precsim m$. If $n \precsim m$, then $n^N \preccurlyeq m^N x_N$, so $n^N \leq m^N x_N$, which implies $n \leq m$.

We verify condition (2). Let $a \precsim b$ and $c \precsim d$. This means $a^N \preccurlyeq b^N x_N$ and $c^N \preccurlyeq d^N y_N$. Thus $a^N c^N \preccurlyeq b^N d^N x_N y_N$, and so $ac \precsim bd$. Assume $x_N$ and $y_N$ are nondecreasing (otherwise replace $x_N$ by $\max_{n \leq N} x_n$, and similarly for $y_N$). Then

$$(a + c)^N = \sum_{m=0}^{N} \binom{N}{m} a^m c^{N-m} \preccurlyeq \sum_{m=0}^{N} \binom{N}{m} b^m d^{N-m} x_m y_{N-m} \preccurlyeq \sum_{m=0}^{N} \binom{N}{m} b^m d^{N-m} x_N y_N = (b + d)^N x_N y_N.$$

Thus $a + c \precsim b + d$.

We verify condition (3). Let $a, b \in S$, $b \neq 0$. Then there is an $r \in \mathbb{N}$ with $a \preccurlyeq rb$, and thus $a \precsim rb$.

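Lemma 2.4. Let $\preccurlyeq$ be a Strassen preorder on $S$. Let $a_1, a_2, b \in S$.

(i) If $a_2 + b \preccurlyeq a_1 + b$, then $a_2 \precsim a_1$.

(ii) If $a_2 b \preccurlyeq a_1 b$ with $b \neq 0$, then $a_2 \precsim a_1$.

(iii) If $a_2 \mathrel{\underset{\sim}{\precsim}} a_1$, then $a_2 \precsim a_1$.

(iv) If $\exists s \in S\ \forall n \in \mathbb{N}$: $n a_2 \preccurlyeq n a_1 + s$, then $a_2 \precsim a_1$.

Proof. (ii) Let $a_2 b \preccurlyeq a_1 b$. By an inductive argument similar to the argument we use to prove (2.4),

$$\forall N \in \mathbb{N}:\ a_2^N b \preccurlyeq a_1^N b. \qquad (2.2)$$

Let $m, r \in \mathbb{N}$ with $1 \preccurlyeq mb \preccurlyeq r$. (We use $b \neq 0$.) From (2.2) follows

$$\forall N \in \mathbb{N}:\ a_2^N \preccurlyeq a_2^N m b \preccurlyeq a_1^N m b \preccurlyeq a_1^N r.$$

Thus we conclude $a_2 \precsim a_1$.

(iii) Let $a_2 \mathrel{\underset{\sim}{\precsim}} a_1$. This means $a_2^N \precsim a_1^N x_N$ with $x_N^{1/N} \to 1$. This in turn means that $(a_2^N)^M \preccurlyeq (a_1^N x_N)^M y_{N,M}$ with $\forall N \in \mathbb{N}$: $y_{N,M}^{1/M} \to 1$, that is,

$$a_2^{NM} \preccurlyeq a_1^{NM} x_N^M y_{N,M}.$$

Choose a sequence $N \mapsto M_N$ such that $(y_{N,M_N})^{1/M_N} \leq 2$, e.g. given $N$ let $M_N$ be the smallest $M$ for which $(y_{N,M})^{1/M} \leq 2$. Then $a_2^{N M_N} \preccurlyeq a_1^{N M_N} x_N^{M_N} y_{N,M_N}$ and

$$(x_N^{M_N} y_{N,M_N})^{1/(N M_N)} = x_N^{1/N} (y_{N,M_N})^{1/(N M_N)} \leq x_N^{1/N}\, 2^{1/N} \to 1.$$

We conclude $a_2 \precsim a_1$.

(iv) Let $s \in S$ with $\forall n \in \mathbb{N}$: $n a_2 \preccurlyeq n a_1 + s$. We may assume $a_1 \neq 0$. Let $k \in \mathbb{N}$ with $s \preccurlyeq k a_1$. Then

$$\forall n \in \mathbb{N}:\ k n a_2 \preccurlyeq k n a_1 + k a_1 = k a_1 (n + 1). \qquad (2.3)$$

Apply (ii) to (2.3) to get

$$\forall n \in \mathbb{N}:\ a_2 n \precsim a_1 (n + 1).$$

By an inductive argument,

$$\forall N \in \mathbb{N}:\ a_2^N \precsim a_2^{N-1} a_1 2 \precsim a_2^{N-2} a_1^2\, 3 \precsim \cdots \precsim a_1^N (N + 1).$$

Since $(N + 1)^{1/N} \to 1$, we have $a_2 \mathrel{\underset{\sim}{\precsim}} a_1$. From (iii) follows $a_2 \precsim a_1$.

(i) Let $a_2 + b \preccurlyeq a_1 + b$. We first prove

$$\forall q \in \mathbb{N}:\ q a_2 + b \preccurlyeq q a_1 + b. \qquad (2.4)$$

By assumption the statement is true for $q = 1$. Suppose the statement is true for $q - 1$. Then

$$q a_2 + b = (q - 1) a_2 + (a_2 + b) \preccurlyeq (q - 1) a_2 + (a_1 + b) = ((q - 1) a_2 + b) + a_1 \preccurlyeq ((q - 1) a_1 + b) + a_1 = q a_1 + b,$$

which proves the statement by induction. Then $\forall n \in \mathbb{N}$: $n a_2 \preccurlyeq n a_1 + b$. From (iv) follows $a_2 \precsim a_1$.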

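2.5 Maximal Strassen preorders

Let $P$ be the set of Strassen preorders on $S$. For $\preccurlyeq_1, \preccurlyeq_2 \in P$ we write $\preccurlyeq_2 \subseteq \preccurlyeq_1$ if for all $a, b \in S$: $a \preccurlyeq_2 b$ implies $a \preccurlyeq_1 b$. (The notation $\preccurlyeq_2 \subseteq \preccurlyeq_1$ is natural if we think of the relations $\preccurlyeq_i$ as sets of pairs $(a, b)$ with $a \preccurlyeq_i b$.)

Lemma 2.5. Let $\preccurlyeq \in P$ with $\preccurlyeq = \precsim$ and $a_2 \not\preccurlyeq a_1$. Then there is an element $\preccurlyeq_{a_1 a_2} \in P$ with $\preccurlyeq \subseteq \preccurlyeq_{a_1 a_2}$ and $a_1 \preccurlyeq_{a_1 a_2} a_2$.

Proof. For $x_1, x_2 \in S$ let

$$x_1 \preccurlyeq_{a_1 a_2} x_2 \quad\text{if}\quad \exists s \in S:\ x_1 + s a_2 \preccurlyeq x_2 + s a_1.$$

The relation $\preccurlyeq_{a_1 a_2}$ is reflexive, since $x + 0 \cdot a_2 \preccurlyeq x + 0 \cdot a_1$. The relation $\preccurlyeq_{a_1 a_2}$ is transitive: if $x_1 \preccurlyeq_{a_1 a_2} x_2$ and $x_2 \preccurlyeq_{a_1 a_2} x_3$, then $x_1 + s a_2 \preccurlyeq x_2 + s a_1$ and $x_2 + t a_2 \preccurlyeq x_3 + t a_1$ for some $s, t \in S$, and so $x_1 + (t + s) a_2 \preccurlyeq x_2 + t a_2 + s a_1 \preccurlyeq x_3 + t a_1 + s a_1 = x_3 + (t + s) a_1$. Thus $x_1 \preccurlyeq_{a_1 a_2} x_3$. We conclude that $\preccurlyeq_{a_1 a_2}$ is a preorder on $S$.

We prove that $\preccurlyeq_{a_1 a_2}$ is a Strassen preorder. If $x_1 \preccurlyeq_{a_1 a_2} x_2$ and $y_1 \preccurlyeq_{a_1 a_2} y_2$, then clearly $x_1 + y_1 \preccurlyeq_{a_1 a_2} x_2 + y_2$. If $x_1 \preccurlyeq_{a_1 a_2} x_2$ and $y \in S$, then $x_1 y \preccurlyeq_{a_1 a_2} x_2 y$. From this follows: if $x_1 \preccurlyeq_{a_1 a_2} x_2$ and $y_1 \preccurlyeq_{a_1 a_2} y_2$, then $x_1 y_1 \preccurlyeq_{a_1 a_2} x_2 y_1 \preccurlyeq_{a_1 a_2} x_2 y_2$.

Let $n, m \in \mathbb{N}$. If $n \leq m$, then $n \preccurlyeq m$, so $n \preccurlyeq_{a_1 a_2} m$. If $n \not\leq m$, then $n \geq m + 1$. Suppose $n \preccurlyeq_{a_1 a_2} m$. Let $s \in S$ with $n + s a_2 \preccurlyeq m + s a_1$. Adding $m + 1 \preccurlyeq n$ gives

$$m + 1 + n + s a_2 \preccurlyeq n + m + s a_1.$$

Since $\preccurlyeq = \precsim$, we may apply Lemma 2.4 (i) to obtain

$$1 + s a_2 \preccurlyeq s a_1. \qquad (2.5)$$

From (2.5) follows $s \neq 0$. From (2.5) also follows

$$s a_2 \preccurlyeq s a_1. \qquad (2.6)$$

Since $\preccurlyeq = \precsim$, we may apply Lemma 2.4 (ii) to (2.6) to obtain the contradiction

$$a_2 \preccurlyeq a_1.$$

Therefore $n \not\preccurlyeq_{a_1 a_2} m$. We conclude that $\preccurlyeq_{a_1 a_2}$ is a Strassen preorder, that is, $\preccurlyeq_{a_1 a_2} \in P$.

Finally, we have $a_1 \preccurlyeq_{a_1 a_2} a_2$, since $a_1 + 1 \cdot a_2 \preccurlyeq a_2 + 1 \cdot a_1$. Also, if $x_1 \preccurlyeq x_2$, then $x_1 + 0 \cdot a_2 \preccurlyeq x_2 + 0 \cdot a_1$, that is, $\preccurlyeq \subseteq \preccurlyeq_{a_1 a_2}$.

Let $\preccurlyeq$ be a Strassen preorder. Let $P_{\preccurlyeq}$ be the set of Strassen preorders containing $\preccurlyeq$, ordered by inclusion $\subseteq$. Let $C \subseteq P_{\preccurlyeq}$ be any chain. Then the union of all preorders in $C$ is an element of $P_{\preccurlyeq}$ and contains all elements of $C$. Therefore, by Zorn’s lemma, $P_{\preccurlyeq}$ contains a maximal element (maximal with respect to inclusion $\subseteq$).

Lemma 2.6. Let $\preccurlyeq$ be maximal in $P$. Then $\preccurlyeq = \precsim$.

Proof. Trivially, $\preccurlyeq \subseteq \precsim$. From Lemma 2.3 we know $\precsim \in P$. From maximality of $\preccurlyeq$ follows $\preccurlyeq = \precsim$.

A relation $\preccurlyeq$ on $S$ is total if for all $a, b \in S$: $a \preccurlyeq b$ or $b \preccurlyeq a$.

Lemma 2.7. Let $\preccurlyeq$ be maximal in $P$. Then $\preccurlyeq$ is total.

Proof. Suppose $\preccurlyeq$ is not total, say $a_1 \not\preccurlyeq a_2$ and $a_2 \not\preccurlyeq a_1$. By Lemma 2.5 (which applies since $\preccurlyeq = \precsim$ by Lemma 2.6) there is an element $\preccurlyeq_{a_1 a_2} \in P$ with $\preccurlyeq \subseteq \preccurlyeq_{a_1 a_2}$ and $a_1 \preccurlyeq_{a_1 a_2} a_2$. Then $\preccurlyeq$ is strictly contained in $\preccurlyeq_{a_1 a_2}$, which contradicts the maximality of $\preccurlyeq$. We conclude $\preccurlyeq$ is total.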

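2.6 The asymptotic spectrum $X(S, \leqslant)$

Definition 2.8. Let $S$ be a semiring with $\mathbb{N} \subseteq S$ and let $\leqslant$ be a Strassen preorder on $S$. Let

$$X(S, \leqslant) = \{\phi \in \mathrm{Hom}(S, \mathbb{R}_{\geq 0}) : a \leqslant b \Rightarrow \phi(a) \leq \phi(b)\}.$$

We call $X(S, \leqslant)$ the asymptotic spectrum of $(S, \leqslant)$. We call the elements of $X(S, \leqslant)$ spectral points.

Lemma 2.9. Let $\preccurlyeq \in P$ be total. There is exactly one semiring homomorphism $\phi : S \to \mathbb{R}_{\geq 0}$ with

$$a \preccurlyeq b \Rightarrow \phi(a) \leq \phi(b).$$

Moreover, if $\preccurlyeq$ is maximal in $P$, then

$$a \preccurlyeq b \Leftrightarrow \phi(a) \leq \phi(b).$$

Proof. Let $\preccurlyeq \in P$ be total. For $a \in S$ define

$$\phi(a) = \inf\{\tfrac{r}{s} : r, s \in \mathbb{N},\ sa \preccurlyeq r\}, \qquad \psi(a) = \sup\{\tfrac{u}{v} : u, v \in \mathbb{N},\ u \preccurlyeq va\}.$$

We prove $\psi(a) \leq \phi(a)$. Let $r, s, u, v \in \mathbb{N}$. Suppose $u \preccurlyeq va$ and $sa \preccurlyeq r$. Then follows $su \preccurlyeq vsa \preccurlyeq vr$. Thus $\tfrac{u}{v} \leq \tfrac{r}{s}$. We prove $\psi(a) \geq \phi(a)$. Suppose $\psi(a) < \phi(a)$. Let $r, s \in \mathbb{N}$ with $\psi(a) < \tfrac{r}{s} < \phi(a)$. Then $sa \not\preccurlyeq r$. From totality follows $sa \succcurlyeq r$. Thus $\psi(a) \geq \tfrac{r}{s}$, which is a contradiction. We conclude $\psi(a) = \phi(a)$.

Let $a, b \in S$. We prove $\phi(a + b) \leq \phi(a) + \phi(b)$. Let $s_a, s_b, r_a, r_b \in \mathbb{N}$. Suppose $s_a a \preccurlyeq r_a$ and $s_b b \preccurlyeq r_b$. Then $s_a s_b a \preccurlyeq s_b r_a$ and $s_a s_b b \preccurlyeq s_a r_b$. By addition, $s_a s_b (a + b) \preccurlyeq s_b r_a + s_a r_b$. Thus $\phi(a + b) \leq \tfrac{r_a}{s_a} + \tfrac{r_b}{s_b}$. We prove $\psi(a + b) \geq \psi(a) + \psi(b)$. Suppose $u_a \preccurlyeq v_a a$ and $u_b \preccurlyeq v_b b$. Then $v_b u_a \preccurlyeq v_a v_b a$ and $v_a u_b \preccurlyeq v_a v_b b$. By addition, $v_b u_a + v_a u_b \preccurlyeq v_a v_b (a + b)$. Thus $\psi(a + b) \geq \tfrac{u_a}{v_a} + \tfrac{u_b}{v_b}$. We thus have additivity.

We prove $\phi(ab) \leq \phi(a)\phi(b)$. Suppose $s_a a \preccurlyeq r_a$ and $s_b b \preccurlyeq r_b$. Then $s_a s_b ab \preccurlyeq r_a r_b$. Thus $\phi(ab) \leq \tfrac{r_a}{s_a} \tfrac{r_b}{s_b}$. We prove $\psi(ab) \geq \psi(a)\psi(b)$. Suppose $u_a \preccurlyeq v_a a$ and $u_b \preccurlyeq v_b b$. Then $u_a u_b \preccurlyeq v_a v_b ab$. Thus $\tfrac{u_a}{v_a} \tfrac{u_b}{v_b} \leq \psi(ab)$. We thus have multiplicativity.

We prove monotonicity: $a \preccurlyeq b \Rightarrow \phi(a) \leq \phi(b)$. Suppose $s_b b \preccurlyeq r_b$. From $a \preccurlyeq b$ follows $s_b a \preccurlyeq s_b b \preccurlyeq r_b$. Thus $\phi(a) \leq \tfrac{r_b}{s_b}$.

We prove $\phi(1) = 1$. Trivially, $1 \preccurlyeq 1$. Therefore $\phi(1) \leq \tfrac{1}{1} = 1$ and $\psi(1) \geq \tfrac{1}{1} = 1$. We prove $\phi(0) = 0$. Trivially, $s_a 0 \preccurlyeq 0$, so $\phi(0) \leq \tfrac{0}{s_a} = 0$. Trivially, $0 \preccurlyeq v_a 0$, so $\phi(0) \geq \tfrac{0}{v_a} = 0$.

We prove the uniqueness of $\phi$. Let $\phi_1, \phi_2$ be semiring homomorphisms $S \to \mathbb{R}_{\geq 0}$ with $a \preccurlyeq b \Rightarrow \phi_i(a) \leq \phi_i(b)$. Suppose $\phi_1(a) < \phi_2(a)$. Let $u, v \in \mathbb{N}$ with $\phi_1(a) < \tfrac{u}{v} < \phi_2(a)$. Then $va \not\preccurlyeq u$, so by totality $va \succcurlyeq u$. Thus $\phi_1(a) \geq \tfrac{u}{v}$, which is a contradiction. This proves uniqueness.

Finally, suppose $\preccurlyeq$ is maximal in $P$. Lemma 2.6 gives $\preccurlyeq = \precsim$. Let $a \not\preccurlyeq b$. From Lemma 2.4 (iv) follows $\exists n$: $na \not\preccurlyeq nb + 1$. By totality, $na \succcurlyeq nb + 1$. Apply $\phi$ to get $\phi(a) \geq \phi(b) + \tfrac{1}{n}$. In particular $\phi(a) > \phi(b)$.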

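Lemma 2.10. The map

$$X(S, \leqslant) \to \{\text{maximal elements in } P_{\leqslant}\} : \phi \mapsto \preccurlyeq_\phi$$

with $a \preccurlyeq_\phi b$ iff $\phi(a) \leq \phi(b)$ is a bijection.

Proof. Let $\phi \in X(S, \leqslant)$. One verifies that $\preccurlyeq_\phi$ is a Strassen preorder and $\leqslant \subseteq \lesssim \subseteq \preccurlyeq_\phi$. Let $\preccurlyeq$ be maximal in $P_{\preccurlyeq_\phi}$. Lemma 2.7 says that $\preccurlyeq$ is total. By Lemma 2.9 there is a $\psi \in X(S, \leqslant)$ with $\preccurlyeq \subseteq \preccurlyeq_\psi$. Clearly $\preccurlyeq_\phi \subseteq \preccurlyeq_\psi$. The uniqueness statement of Lemma 2.9 implies $\phi = \psi$. This means $\preccurlyeq_\phi = \preccurlyeq$, that is, $\preccurlyeq_\phi$ is maximal. We conclude that the map is well defined.

Let $\preccurlyeq$ be maximal in $P_{\leqslant}$. Then $\preccurlyeq$ is total. By Lemma 2.9 there is a $\phi \in X(S, \leqslant)$ with $\preccurlyeq \subseteq \preccurlyeq_\phi$. We conclude the map is surjective.

Let $\phi, \psi \in X(S, \leqslant)$ with $\preccurlyeq_\phi = \preccurlyeq_\psi$. From Lemma 2.9 follows $\phi = \psi$. We conclude the map is injective.

Lemma 2.11. Let $a, b \in S$. Then $a \lesssim b$ iff $a \preccurlyeq b$ for all maximal $\preccurlyeq \in P_{\leqslant}$.

Proof. Let $\preccurlyeq \in P_{\leqslant}$ be maximal. Then $\lesssim \subseteq \precsim = \preccurlyeq$ by Lemma 2.6, so $a \lesssim b$ implies $a \preccurlyeq b$.

Suppose $a \not\lesssim b$. Let $n \in \mathbb{N}_{\geq 1}$ with $na \not\lesssim nb + 1$ (Lemma 2.4 (iv)). By Lemma 2.5 there is an element $\preccurlyeq_{nb+1,\,na} \in P$ with $\lesssim \subseteq \preccurlyeq_{nb+1,\,na}$, and we may assume $\preccurlyeq_{nb+1,\,na}$ is maximal. Then $nb + 1 \preccurlyeq_{nb+1,\,na} na$, and so $a \not\preccurlyeq_{nb+1,\,na} b$.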

2.7 The representation theorem

The following theorem is the main theorem.

Theorem 2.12 ([Str88, Th. 2.4]). Let $S$ be a commutative semiring with $\mathbb{N} \subseteq S$ and let $\leqslant$ be a Strassen preorder on $S$. Let $X = X(S, \leqslant)$ be the set of $\leqslant$-monotone semiring homomorphisms from $S$ to $\mathbb{R}_{\geq 0}$,

$$X = X(S, \leqslant) = \{\phi \in \mathrm{Hom}(S, \mathbb{R}_{\geq 0}) : \forall a, b \in S\ \ a \leqslant b \Rightarrow \phi(a) \leq \phi(b)\}.$$

For $a, b \in S$ let $a \lesssim b$ if there is a sequence $(x_N) \in \mathbb{N}^{\mathbb{N}}$ with $x_N^{1/N} \to 1$ when $N \to \infty$ such that $\forall N \in \mathbb{N}$: $a^N \leqslant b^N x_N$. Then

$$\forall a, b \in S:\quad a \lesssim b \ \text{ iff } \ \forall \phi \in X\ \ \phi(a) \leq \phi(b).$$

Proof. Let $a, b \in S$. Suppose $a \lesssim b$. Then clearly for all $\phi \in X$ we have $\phi(a) \leq \phi(b)$. Suppose $a \not\lesssim b$. By Lemma 2.11 there is a maximal $\preccurlyeq \in P_{\leqslant}$ with $a \not\preccurlyeq b$. By Lemma 2.10 there is a $\phi \in X$ with $\phi(a) > \phi(b)$.


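2.8 Abstract rank and subrank $\mathrm{R}$, $\mathrm{Q}$

We generalise the notions of rank and subrank for tensors to arbitrary semirings with a Strassen preorder. Let $a \in S$. Define the rank

$$\mathrm{R}(a) = \min\{r \in \mathbb{N} : a \leqslant r\}$$

and the subrank

$$\mathrm{Q}(a) = \max\{r \in \mathbb{N} : r \leqslant a\}.$$

Then $\mathrm{Q}(a) \leq \mathrm{R}(a)$. Define the asymptotic rank

$$\tilde{\mathrm{R}}(a) = \lim_{N \to \infty} \mathrm{R}(a^N)^{1/N}.$$

Define the asymptotic subrank

$$\tilde{\mathrm{Q}}(a) = \lim_{N \to \infty} \mathrm{Q}(a^N)^{1/N}.$$

By Fekete’s lemma (Lemma 2.2), asymptotic rank is an infimum and asymptotic subrank is a supremum as follows:

$$\tilde{\mathrm{R}}(a) = \inf_N \mathrm{R}(a^N)^{1/N},$$

$$\tilde{\mathrm{Q}}(a) = \sup_N \mathrm{Q}(a^N)^{1/N} \quad\text{when } a = 0 \text{ or } a \geqslant 1.$$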
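To see these definitions at work in the simplest possible setting, consider the nonnegative rationals with their usual order. This toy semiring is an assumption introduced here for illustration (it is not one of the examples of this chapter); its usual order satisfies the three conditions of a Strassen preorder, rank is the ceiling and subrank is the floor.

```python
from fractions import Fraction
import math


def rank(a):
    """R(a) = min{r in N : a <= r} in the toy semiring (Q_{>=0}, <=)."""
    return math.ceil(a)


def subrank(a):
    """Q(a) = max{r in N : r <= a} in the toy semiring (Q_{>=0}, <=)."""
    return math.floor(a)


def asymptotic_estimates(a, N):
    """Return (R(a^N)^(1/N), Q(a^N)^(1/N)), using exact powers of a."""
    power = a ** N
    return rank(power) ** (1 / N), subrank(power) ** (1 / N)


a = Fraction(3, 2)
for N in (1, 2, 5, 20):
    print(N, asymptotic_estimates(a, N))

print(asymptotic_estimates(Fraction(1, 2), 10))
```

For $a = 3/2$ both estimates squeeze towards $3/2$ as $N$ grows. For $a = 1/2$ every power has rank 1 and subrank 0, which illustrates why the supremum formula for asymptotic subrank carries the side condition $a = 0$ or $a \geqslant 1$.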

Theorem 2.12 implies that the asymptotic rank and asymptotic subrank have the following dual characterisation in terms of the asymptotic spectrum. (This is a straightforward generalisation of [Str88, Th. 3.8].)

Corollary 2.13 (cf. [Str88, Th. 3.8]). For $a \in S$ with $\exists \phi \in X$: $\phi(a) \geq 1$,

$$\tilde{\mathrm{R}}(a) = \max_{\phi \in X} \phi(a).$$

Proof. Let $\phi \in X$. For $N \in \mathbb{N}$, $\mathrm{R}(a^N) \geq \phi(a)^N$. Therefore $\tilde{\mathrm{R}}(a) \geq \phi(a)$, and so $\tilde{\mathrm{R}}(a) \geq \max_{\phi \in X} \phi(a)$. It remains to prove $\tilde{\mathrm{R}}(a) \leq \max_{\phi \in X} \phi(a)$. We let $x = \max_{\phi \in X} \phi(a)$. By assumption $x \geq 1$. By definition of $x$ we have

$$\forall \phi \in X:\ \phi(a) \leq x.$$

Take the $m$th power on both sides:

$$\forall \phi \in X,\ m \in \mathbb{N}:\ \phi(a^m) \leq x^m.$$

Take the ceiling on the right-hand side:

$$\forall \phi \in X,\ m \in \mathbb{N}:\ \phi(a^m) \leq \lceil x^m \rceil.$$

Apply Theorem 2.12 to get asymptotic preorders:

$$\forall m \in \mathbb{N}:\ a^m \lesssim \lceil x^m \rceil.$$

Then by definition of asymptotic preorder,

$$\forall m, N \in \mathbb{N}:\ a^{mN} \leqslant \lceil x^m \rceil^N\, 2^{\varepsilon_{m,N}} \quad\text{for some } \varepsilon_{m,N} \in o(N).$$

Then

$$\forall m, N \in \mathbb{N}:\ \mathrm{R}(a^{mN})^{1/(mN)} \leq \lceil x^m \rceil^{1/m}\, 2^{\varepsilon_{m,N}/(mN)}.$$

From $x \geq 1$ follows $\lceil x^m \rceil^{1/m} \to x$ when $m \to \infty$. Choose $m = m(N)$ with $m(N) \to \infty$ as $N \to \infty$ and $\varepsilon_{m(N),N} \in o(N)$ to get $\tilde{\mathrm{R}}(a) = \inf_N \mathrm{R}(a^N)^{1/N} \leq x$.

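Corollary 2.14 (cf. [Str88, Th. 3.8]). For $a \in S$ with $\exists k \in \mathbb{N}$: $a^k \geqslant 2$,

$$\tilde{\mathrm{Q}}(a) = \min_{\phi \in X} \phi(a).$$

Proof. Let $\phi \in X$. For $N \in \mathbb{N}$, $\mathrm{Q}(a^N) \leq \phi(a)^N$. Therefore $\tilde{\mathrm{Q}}(a) \leq \phi(a)$, so $\tilde{\mathrm{Q}}(a) \leq \min_{\phi \in X} \phi(a)$. It remains to prove $\tilde{\mathrm{Q}}(a) \geq \min_{\phi \in X} \phi(a)$. Let $y = \min_{\phi \in X} \phi(a)$. From the assumption $a^k \geqslant 2$ follows $y > 1$. By definition of $y$ we have

$$\forall \phi \in X:\ \phi(a) \geq y.$$

Take the $m$th power on both sides:

$$\forall \phi \in X,\ m \in \mathbb{N}:\ \phi(a^m) \geq y^m.$$

Take the floor on the right-hand side:

$$\forall \phi \in X,\ m \in \mathbb{N}:\ \phi(a^m) \geq \lfloor y^m \rfloor.$$

Apply Theorem 2.12 to get asymptotic preorders:

$$\forall m \in \mathbb{N}:\ a^m \gtrsim \lfloor y^m \rfloor.$$

Then by definition of asymptotic preorder,

$$\forall m, N \in \mathbb{N}:\ a^{mN}\, 2^{\varepsilon_{m,N}} \geqslant \lfloor y^m \rfloor^N \quad\text{for some } \varepsilon_{m,N} \in o(N).$$

Now we use $a^k \geqslant 2$ to get

$$\forall m, N \in \mathbb{N}:\ a^{mN + k\varepsilon_{m,N}} \geqslant \lfloor y^m \rfloor^N.$$

Then

$$\forall m, N \in \mathbb{N}:\ \mathrm{Q}(a^{mN + k\varepsilon_{m,N}})^{1/(mN + k\varepsilon_{m,N})} \geq \lfloor y^m \rfloor^{N/(mN + k\varepsilon_{m,N})}.$$

Choose $m = m(N)$ with $m(N) \to \infty$ as $N \to \infty$ and $\varepsilon_{m(N),N} \in o(N)$ to obtain $\tilde{\mathrm{Q}}(a) = \sup_N \mathrm{Q}(a^N)^{1/N} \geq y$.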

2.9 Topological aspects

Theorem 2.12 does not tell the full story. Namely, there is also a topological component, which we will now discuss. Let $S$ be a semiring with $\mathbb{N} \subseteq S$. Let $\leqslant$ be a Strassen preorder on $S$. Let $X = X(S, \leqslant)$ be the asymptotic spectrum of $(S, \leqslant)$. For $a \in S$ let

$$\hat{a} : X \to \mathbb{R}_{\geq 0} : \phi \mapsto \phi(a). \qquad (2.7)$$

The map $\hat{a}$ simply evaluates a given homomorphism $\phi$ at $a$. One may think of $\hat{a}$ as the collection $(\phi(a))_{\phi \in X}$ of all evaluations of the elements of $X$ at $a$. Let $\mathbb{R}_{\geq 0}$ have the Euclidean topology. Endow $X$ with the weak topology with respect to the family of functions $\hat{a}$, $a \in S$. That is, endow $X$ with the coarsest topology such that each $\hat{a}$ becomes continuous.

Let $C(X, \mathbb{R}_{\geq 0})$ be the semiring of continuous functions $X \to \mathbb{R}_{\geq 0}$ with addition and multiplication defined pointwise on $X$, that is, $(f + g)(x) = f(x) + g(x)$ and $(f \cdot g)(x) = f(x) g(x)$ for $f, g \in C(X, \mathbb{R}_{\geq 0})$ and $x \in X$. Define the semiring homomorphism

$$\Phi : S \to C(X, \mathbb{R}_{\geq 0}) : a \mapsto \hat{a},$$

which maps $a$ to the evaluator $\hat{a}$ defined in (2.7).

Theorem 2.15 ([Str88, Th. 2.4]).

(i) $X$ is a nonempty compact Hausdorff space.

(ii) $\forall a, b \in S$: $a \lesssim b$ iff $\Phi(a) \leq \Phi(b)$ pointwise on $X$.

(iii) $\Phi(S)$ separates the points of $X$.

Proof. Statement (ii) follows from Theorem 2.12. Statement (iii) is clear. We prove statement (i). We have $2 \not\lesssim 1$, so from Theorem 2.12 follows that $X$ cannot be empty.

For $a \in S$ let $n_a \in \mathbb{N}$ with $a \leqslant n_a$. Then for $\phi \in X$, $\phi(a) \leq n_a$, and so $\phi(a) \in [0, n_a]$. Embed $X \subseteq \prod_{a \in S} [0, n_a]$ as a set via $\phi \mapsto (\phi(a))_{a \in S}$. The set $\prod_{a \in S} [0, n_a]$ with the product topology is compact by the theorem of Tychonoff. To see that $X$ is closed in $\prod_{a \in S} [0, n_a]$ we write $X$ as an intersection of sets,

$$\begin{aligned}
X ={}& \Bigl\{\phi \in \prod_{a \in S} [0, n_a] : \phi(0) = 0\Bigr\} \cap \Bigl\{\phi \in \prod_{a \in S} [0, n_a] : \phi(1) = 1\Bigr\} \\
&\cap \bigcap_{b, c \in S} \Bigl\{\phi \in \prod_{a \in S} [0, n_a] : \phi(b + c) - \phi(b) - \phi(c) = 0\Bigr\} \\
&\cap \bigcap_{b, c \in S} \Bigl\{\phi \in \prod_{a \in S} [0, n_a] : \phi(bc) - \phi(b)\phi(c) = 0\Bigr\} \\
&\cap \bigcap_{b, c \in S :\, b \leqslant c} \Bigl\{\phi \in \prod_{a \in S} [0, n_a] : \phi(b) \leq \phi(c)\Bigr\},
\end{aligned}$$

and we observe that the intersected sets are closed:

$$\begin{aligned}
X ={}& \hat{0}^{-1}(0) \cap \hat{1}^{-1}(1) \\
&\cap \bigcap_{b, c \in S} \bigl(\widehat{b + c} - \hat{b} - \hat{c}\bigr)^{-1}(0) \\
&\cap \bigcap_{b, c \in S} \bigl(\widehat{bc} - \hat{b}\hat{c}\bigr)^{-1}(0) \\
&\cap \bigcap_{b, c \in S :\, b \leqslant c} \bigl(\hat{c} - \hat{b}\bigr)^{-1}([0, \infty)).
\end{aligned}$$

This implies $X$ is also compact.

Let $\phi, \psi \in X$ be distinct. Let $a \in S$ with $\phi(a) \neq \psi(a)$. Then $\hat{a}(\phi) \neq \hat{a}(\psi)$. Let $U \ni \hat{a}(\phi)$, $V \ni \hat{a}(\psi)$ be open and disjoint subsets of $\mathbb{R}_{\geq 0}$. Then $\hat{a}^{-1}(U)$ and $\hat{a}^{-1}(V)$ are open and disjoint subsets of $X$. We conclude that $X$ is Hausdorff.

2.10 Uniqueness

Let $S$ be a semiring with $\mathbb{N} \subseteq S$. Let $\leqslant$ be a Strassen preorder on $S$. Let $X = X(S, \leqslant)$ be the asymptotic spectrum of $(S, \leqslant)$. The object $X$ is unique in the following sense.

Theorem 2.16 ([Str88, Cor. 2.7]). Let $Y$ be a compact Hausdorff space. Let $\Psi : S \to C(Y, \mathbb{R}_{\geq 0})$ be a homomorphism of semirings such that

$$\Psi(S) \text{ separates the points of } Y \qquad (2.8)$$

and

$$\forall a, b \in S:\ a \lesssim b \Leftrightarrow \Psi(a) \leq \Psi(b) \text{ pointwise on } Y. \qquad (2.9)$$

Then there is a unique homeomorphism (continuous bijection with continuous inverse) $h : Y \to X$ such that the diagram

$$\begin{array}{ccc}
 & S & \\
{\scriptstyle \Phi} \swarrow & & \searrow {\scriptstyle \Psi} \\
C(X, \mathbb{R}_{\geq 0}) & \xrightarrow{\ h^*\ } & C(Y, \mathbb{R}_{\geq 0})
\end{array} \qquad (2.10)$$

commutes, where $h^* : \phi \mapsto \phi \circ h$. Namely, let $h : y \mapsto \bigl(a \mapsto \Psi(a)(y)\bigr)$.

Proof. We prove uniqueness. Suppose there are two such homeomorphisms

$$h_1, h_2 : Y \to X.$$

Suppose $x \neq h_2(h_1^{-1}(x))$ for some $x \in X$. Since $\Phi(S)$ separates the points of $X$, there is an $a \in S$ with $\Phi(a)(x) \neq \Phi(a)(h_2(h_1^{-1}(x)))$. Let $y = h_1^{-1}(x) \in Y$. Then $\Phi(a)(h_1(y)) \neq \Phi(a)(h_2(y))$. Since (2.10) commutes, $\Phi(a)(h_1(y)) = \Psi(a)(y)$ and $\Phi(a)(h_2(y)) = \Psi(a)(y)$, a contradiction.

We prove existence. Let $h : Y \to X : y \mapsto (a \mapsto \Psi(a)(y))$. One verifies that $h$ is well defined, continuous and injective, and that the diagram in (2.10) commutes. It remains to show that $h$ is surjective. We know that $\mathbb{Q} \cdot \Phi(S)$ is a $\mathbb{Q}$-subalgebra of $C(X, \mathbb{R})$ which separates points and which contains the nonzero constant function $\Phi(1)$, so by the Stone–Weierstrass theorem, $\mathbb{Q} \cdot \Phi(S)$ is dense in $C(X, \mathbb{R})$ under the sup-norm. Suppose $h$ is not surjective. Then $h(Y) \subsetneq X$ is a proper closed subset. Let $x_0 \in X \setminus h(Y)$ be in the complement. Since $X$ is a compact Hausdorff space, there is a continuous function $f : X \to [-1, 1]$ with

$$f(h(Y)) = 1, \qquad f(x_0) = -1.$$

We know that $f$ can be approximated by elements from $\mathbb{Q} \cdot \Phi(S)$, i.e. let $\varepsilon > 0$, then there are $a_1, a_2 \in S$, $N \in \mathbb{N}$ such that

$$\tfrac{1}{N}\bigl(\Phi(a_1)(x) - \Phi(a_2)(x)\bigr) > 1 - \varepsilon \quad\text{for all } x \in h(Y),$$

$$\tfrac{1}{N}\bigl(\Phi(a_1)(x_0) - \Phi(a_2)(x_0)\bigr) < -1 + \varepsilon.$$

This means $\Psi(a_1) \geq \Psi(a_2)$ pointwise on $Y$, so $a_1 \gtrsim a_2$, but also $\Phi(a_1) \not\geq \Phi(a_2)$ pointwise on $X$, so $a_1 \not\gtrsim a_2$. This is a contradiction.

2.11 Subsemirings

Let $S$ be a subsemiring of a semiring $T$ and let $\leqslant$ be a Strassen preorder on $T$. Then the restriction $\leqslant|_S$ is a Strassen preorder on $S$. How are the asymptotic spectra $X(S, \leqslant|_S)$ and $X(T, \leqslant)$ related? Obviously, for $\phi \in X(T, \leqslant)$ we have $\phi|_S \in X(S, \leqslant|_S)$. In fact, the uniqueness theorem of Section 2.10 implies that all elements of $X(S, \leqslant|_S)$ are restrictions of elements of $X(T, \leqslant)$.

Corollary 2.17. Let $S$ be a subsemiring of a semiring $T$. Let $\leqslant$ be a Strassen preorder on $T$. Then

$$X(S, \leqslant|_S) = X(T, \leqslant)|_S.$$

Proof. Let

$$X = X(S, \leqslant|_S), \qquad \Phi : S \to C(X, \mathbb{R}_{\geq 0}) : a \mapsto \hat{a},$$

and let

$$Y = X(T, \leqslant)|_S = \{\phi|_S : \phi \in X(T, \leqslant)\}, \qquad \Psi : S \to C(Y, \mathbb{R}_{\geq 0}) : a \mapsto \bigl(\phi|_S \mapsto \phi|_S(a)\bigr).$$

Then $Y$ is a compact Hausdorff space. Let $\phi|_S, \psi|_S \in Y$ be distinct. Then there is an $a \in S$ with $\phi|_S(a) \neq \psi|_S(a)$, so (2.8) holds. For $a, b \in S$: $a \lesssim b$ iff $\Phi(a) \leq \Phi(b)$ iff $\Psi(a) \leq \Psi(b)$, so (2.9) holds. Therefore

$$h : X(T, \leqslant)|_S \to X(S, \leqslant|_S) : \phi|_S \mapsto \bigl(a \mapsto \Psi(a)(\phi|_S)\bigr) = \phi|_S$$

is a homeomorphism.

2.12 Subsemirings generated by one element

Let $S$ be a semiring and let $\leqslant$ be a Strassen preorder on $S$. We specialise to the simplest type of subsemiring of $S$. Namely, let $a \in S$ and let

$$\mathbb{N}[a] = \Bigl\{\sum_{i=0}^{k} n_i a^i : k \in \mathbb{N},\ n_i \in \mathbb{N}\Bigr\} \subseteq S$$

be the subsemiring of $S$ generated by $a$. We call $X(\mathbb{N}[a]) = X(\mathbb{N}[a], \leqslant|_{\mathbb{N}[a]})$ the asymptotic spectrum of $a$.

Corollary 2.18 (cf. [Str88]). If $a^k \geqslant 2$ for some $k \in \mathbb{N}$, then

$$\tilde{\mathrm{Q}} \in X(\mathbb{N}[a]).$$

If $\phi(a) \geq 1$ for some $\phi \in X$, then

$$\tilde{\mathrm{R}} \in X(\mathbb{N}[a]).$$

Proof. Let $X = X(\mathbb{N}[a])$. Let $n_1, \ldots, n_q \in \mathbb{N}$. By Corollary 2.14,

$$\tilde{\mathrm{Q}}(a^{n_1} + \cdots + a^{n_q}) = \min_{\phi \in X} \phi(a^{n_1} + \cdots + a^{n_q}).$$

Since $\phi$ is a homomorphism, $\phi(a^{n_1} + \cdots + a^{n_q}) = \phi(a)^{n_1} + \cdots + \phi(a)^{n_q}$. Now we observe that $x^{n_1} + \cdots + x^{n_q}$ is minimised by taking $x$ minimal in the domain. We conclude

$$\tilde{\mathrm{Q}}(a^{n_1} + \cdots + a^{n_q}) = \sum_{i=1}^{q} \bigl(\min_{\phi \in X} \phi(a)\bigr)^{n_i} = \tilde{\mathrm{Q}}(a)^{n_1} + \cdots + \tilde{\mathrm{Q}}(a)^{n_q}.$$

The claim for asymptotic rank $\tilde{\mathrm{R}}$ similarly follows from Corollary 2.13.


Remark 2.19. In general, asymptotic subrank $\tilde{\mathrm{Q}}$ and asymptotic rank $\tilde{\mathrm{R}}$ are not elements of the asymptotic spectrum. We will see an example in Chapter 4, related to the matrix multiplication tensor.

Remark 2.20. Corollary 2.18 is closely related to Schönhage’s $\tau$-theorem for tensors, also called Schönhage’s asymptotic sum inequality. The $\tau$-theorem features in every recent fast matrix multiplication algorithm (i.e. every algorithm based on the laser method).

Remark 2.21. An element $\phi \in X(\mathbb{N}[a])$ is uniquely determined by the value of $\phi(a) \in \mathbb{R}_{\geq 0}$. We may thus identify the asymptotic spectrum $X(\mathbb{N}[a])$ with a compact (i.e. closed and bounded) subset of the nonnegative reals $\mathbb{R}_{\geq 0}$ via $\phi \mapsto \phi(a)$.
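The observation in Remark 2.21 can be sketched in code: once the single number $\phi(a)$ is fixed, the value of $\phi$ on every element $\sum_i n_i a^i$ is forced by the homomorphism property. The representation of elements as nonempty coefficient lists `[n_0, n_1, ...]` is an assumption for illustration (in a concrete $S$, different coefficient lists may denote the same element), and the function names are hypothetical.

```python
def add(p, q):
    """Sum of two elements of N[a], given as coefficient lists."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [x + y for x, y in zip(p, q)]


def mul(p, q):
    """Product of two elements of N[a], given as nonempty coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return out


def spectral_point(value_at_a):
    """The homomorphism on coefficient lists determined by phi(a) = value_at_a."""
    def phi(p):
        return sum(n_i * value_at_a ** i for i, n_i in enumerate(p))
    return phi


phi = spectral_point(3)  # hypothetical value phi(a) = 3
p = [1, 2]               # 1 + 2a
q = [0, 1, 1]            # a + a^2
print(phi(p), phi(q), phi(add(p, q)), phi(mul(p, q)))
```

With $\phi(a) = 3$ this gives $\phi(p) = 7$, $\phi(q) = 12$, $\phi(p + q) = 19$ and $\phi(pq) = 84$, in line with additivity and multiplicativity. Which values of $\phi(a)$ actually occur is exactly the information carried by the asymptotic spectrum of $a$.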

2.13 Universal spectral points

Having discussed the simplest type of subsemiring in the previous section, let us discuss the most difficult type of supersemiring. When applying the theory of asymptotic spectra to some setting, there is a natural largest semiring $S$ in which the objects of study live. For example, we may study the semiring $S$ of all (equivalence classes of) 3-tensors of arbitrary format over $\mathbb{F}$. Or we may study the semiring $S$ of all (isomorphism classes of) finite simple graphs. We refer to the elements of the asymptotic spectrum $X(S)$ of the “ambient” semiring $S$ by the term universal spectral points (cf. [Str88, page 119]). The universal spectral points are the most useful monotone homomorphisms.

2.14 Conclusion

To a semiring $S$ with a Strassen preorder $\leqslant$ we associated an asymptotic preorder $\lesssim$. We proved that this asymptotic preorder is characterised by the $\leqslant$-monotone semiring homomorphisms $S \to \mathbb{R}_{\ge 0}$, which make up the asymptotic spectrum $X(S, \leqslant)$ of $(S, \leqslant)$. For $(S, \leqslant)$ we naturally have a rank function $R : S \to \mathbb{N}$ and a subrank function $Q : S \to \mathbb{N}$. Their asymptotic versions $\tilde{R}(a) = \inf_n R(a^n)^{1/n}$ and $\tilde{Q}(a) = \sup_n Q(a^n)^{1/n}$ coincide with $\max_{\phi \in X(S, \leqslant)} \phi(a)$ and $\min_{\phi \in X(S, \leqslant)} \phi(a)$ respectively, assuming $\exists \phi \in X : \phi(a) \ge 1$ and $\exists k \in \mathbb{N} : a^k \geqslant 2$ respectively. Unfortunately, we have proved the existence of the asymptotic spectrum by nonconstructive means. Explicitly constructing spectral points for a given pair $(S, \leqslant)$ will be a challenging task.

Some remarks about our proof in this chapter. The proof in [Str88] uses the Kadison–Dubois theorem from the paper of Becker and Schwartz [BS83] as a black box. Our presentation basically integrates the proof of Strassen with the proof of Becker and Schwartz. The notions of rank and subrank were in [Str88] only discussed for tensors. We considered the straightforward generalisation to arbitrary semirings with a Strassen preorder. An evident feature of our presentation is that we do not pass from the semiring to its Grothendieck ring, but instead stay in the semiring. In this way we stay close to the "real world" objects. I thank Jop Briët and Lex Schrijver for this idea. There is a large body of literature on the Kadison–Dubois theorem, for which we refer to the modern books by Prestel and Delzell [PD01, Theorem 5.2.6] and Marshall [Mar08, Theorem 5.4.4].

Chapter 3

The asymptotic spectrum of graphs; Shannon capacity

This chapter is based on the manuscript [Zui18]

3.1 Introduction

This chapter is about the Shannon capacity of graphs, which was introduced by Claude Shannon in the context of coding theory [Sha56]. More precisely, we will apply the theory of asymptotic spectra of Chapter 2 to gain a better understanding of Shannon capacity (and other asymptotic properties of graphs).

We first recall the definition of the Shannon capacity of a graph. Let $G$ be a (finite simple) graph with vertex set $V(G)$ and edge set $E(G)$. An independent set or stable set in $G$ is a subset of $V(G)$ that contains no edges. The independence number or stability number $\alpha(G)$ is the cardinality of the largest independent set in $G$. For graphs $G$ and $H$, the and-product $G \boxtimes H$, also called strong graph product, is defined by

\[ V(G \boxtimes H) = V(G) \times V(H) \]

\[ E(G \boxtimes H) = \bigl\{ \{(g, h), (g', h')\} : (\{g, g'\} \in E(G) \text{ or } g = g') \text{ and } (\{h, h'\} \in E(H) \text{ or } h = h') \text{ and } (g, h) \neq (g', h') \bigr\}. \]

The Shannon capacity $\Theta(G)$ is defined as the limit

\[ \Theta(G) = \lim_{N \to \infty} \alpha(G^{\boxtimes N})^{1/N}. \tag{3.1} \]

This limit exists and equals the supremum $\sup_N \alpha(G^{\boxtimes N})^{1/N}$ by Fekete's lemma (Lemma 2.2).
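The definitions above can be checked by brute force on the 5-cycle. The following Python sketch is an illustration only, not part of the original text: the helper names `cycle`, `strong_product` and `independence_number` are our own, graphs are represented as a vertex list together with a set of two-element frozensets, and the exhaustive search is feasible only for very small graphs.

```python
from itertools import combinations, product

def cycle(n):
    """Vertex list and edge set of the cycle graph C_n (hypothetical helper)."""
    return list(range(n)), {frozenset((i, (i + 1) % n)) for i in range(n)}

def strong_product(G, H):
    """Strong (and-) product: distinct pairs are adjacent iff each
    coordinate is either equal or adjacent."""
    (VG, EG), (VH, EH) = G, H
    V = list(product(VG, VH))
    E = set()
    for (g, h), (g2, h2) in combinations(V, 2):
        ok_g = g == g2 or frozenset((g, g2)) in EG
        ok_h = h == h2 or frozenset((h, h2)) in EH
        if ok_g and ok_h:
            E.add(frozenset(((g, h), (g2, h2))))
    return V, E

def independence_number(G):
    """Exact alpha(G) by branching on a vertex of maximum degree
    (exponential time, small graphs only)."""
    V, E = G
    adj = {v: set() for v in V}
    for e in E:
        u, w = tuple(e)
        adj[u].add(w)
        adj[w].add(u)

    def best(cands):
        v = max(cands, key=lambda u: len(adj[u] & cands), default=None)
        if v is None or not (adj[v] & cands):
            return len(cands)  # no edges left: take every remaining vertex
        # either v is left out, or v is taken and its neighbours are discarded
        return max(best(cands - {v}), 1 + best(cands - {v} - adj[v]))

    return best(frozenset(V))

C5 = cycle(5)
alpha_1 = independence_number(C5)
alpha_2 = independence_number(strong_product(C5, C5))
print(alpha_1, alpha_2, alpha_2 ** 0.5)
```

Since $\Theta(G)$ is the supremum of $\alpha(G^{\boxtimes N})^{1/N}$, the value computed for $N = 2$ gives a lower bound on $\Theta(C_5)$ that is strictly better than the one from $N = 1$.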

Computing the Shannon capacity is nontrivial already for small graphs. Lovász in [Lov79] computed the value $\Theta(C_5) = \sqrt{5}$, where $C_k$ denotes the $k$-cycle graph, by introducing and evaluating a new graph parameter $\vartheta$, which is now known as the Lovász theta number. The value of $\Theta(C_7)$, for example, is currently not known. The Shannon capacity $\Theta$ is not known to be hard to compute in the sense of computational complexity. On the other hand, deciding whether $\alpha(G) \ge k$, given a graph $G$ and $k \in \mathbb{N}$, is NP-complete [Kar72].

New result: dual description of Shannon capacity

The new result of this chapter is a dual characterisation of the Shannon capacity of graphs. This characterisation is obtained by applying Strassen's theory of asymptotic spectra of Chapter 2. Thus this chapter also serves as an illustration of the theory of asymptotic spectra.

To state the theorem we need the standard notions graph homomorphism, graph complement and graph disjoint union. Let $G$ and $H$ be graphs. A graph homomorphism $f : G \to H$ is a map $f : V(G) \to V(H)$ such that for all $u, v \in V(G)$, if $\{u, v\} \in E(G)$, then $\{f(u), f(v)\} \in E(H)$. In other words, a graph homomorphism maps edges to edges. The complement $\overline{G}$ of $G$ is defined by

\[ V(\overline{G}) = V(G) \]

\[ E(\overline{G}) = \bigl\{ \{u, v\} : \{u, v\} \notin E(G),\ u \neq v \bigr\}. \]

We define a relation $\leqslant$ on graphs: let $G \leqslant H$ if there is a graph homomorphism $\overline{G} \to \overline{H}$ from the complement of $G$ to the complement of $H$. The disjoint union $G \sqcup H$ is defined by

\[ V(G \sqcup H) = V(G) \sqcup V(H) \]

\[ E(G \sqcup H) = E(G) \sqcup E(H). \]

For $n \in \mathbb{N}$, the complete graph $K_n$ is the graph with $V(K_n) = [n] = \{1, 2, \ldots, n\}$ and $E(K_n) = \{\{i, j\} : i, j \in [n],\ i \neq j\}$. Thus $\overline{K}_0 = K_0$ is the empty graph and $\overline{K}_1 = K_1$ is the graph consisting of a single vertex and no edges.

Theorem 3.1. Let $S \subseteq \{\text{graphs}\}$ be a collection of graphs which is closed under the disjoint union $\sqcup$ and the strong graph product $\boxtimes$, and which contains the graph with a single vertex, $K_1$. Define the asymptotic spectrum $X(S)$ as the set of all maps $\phi : S \to \mathbb{R}_{\ge 0}$ such that, for all $G, H \in S$:

(1) if $G \leqslant H$, then $\phi(G) \le \phi(H)$

(2) $\phi(G \sqcup H) = \phi(G) + \phi(H)$

(3) $\phi(G \boxtimes H) = \phi(G)\phi(H)$

(4) $\phi(K_1) = 1$.

Let $G \lesssim H$ if there is a sequence $(x_N) \in \mathbb{N}^{\mathbb{N}}$ with $x_N^{1/N} \to 1$ when $N \to \infty$ such that for every $N \in \mathbb{N}$

\[ G^{\boxtimes N} \leqslant (H^{\boxtimes N})^{\sqcup x_N} = \underbrace{H^{\boxtimes N} \sqcup \cdots \sqcup H^{\boxtimes N}}_{x_N}. \]

Then

(i) $G \lesssim H$ iff $\forall \phi \in X(S) : \phi(G) \le \phi(H)$

(ii) $\Theta(G) = \min_{\phi \in X(S)} \phi(G)$.

Statement (ii) of Theorem 3.1 is nontrivial in the sense that $\Theta$ is not an element of $X(\{\text{graphs}\})$. Namely, $\Theta$ is not additive under $\sqcup$ by a result of Alon [Alo98], and $\Theta$ is not multiplicative under $\boxtimes$ by a result of Haemers [Hae79]. It turns out that the graph parameter $G \mapsto \max_{\phi \in X(\{\text{graphs}\})} \phi(G)$ is itself an element of $X(\{\text{graphs}\})$ and is equal to the fractional clique cover number $\overline{\chi}_f$ (see Section 3.3.2 and e.g. [Sch03, Eq. (67.112)]). Fritz in [Fri17] proves (independently of Strassen's line of work) a statement that is weaker than Theorem 3.1. Namely, he proves the statement of Theorem 3.1 without the additivity condition (2).

In Section 3.2 we will prove Theorem 3.1 by applying the theory of asymptotic spectra of Chapter 2 to the appropriate semiring and preorder. In Section 3.3 we will discuss the elements in the asymptotic spectrum of graphs $X(\{\text{graphs}\})$ that are currently known to me: the Lovász theta number, the fractional clique cover number, the fractional orthogonal rank of the complement, and the fractional Haemers bounds. We moreover prove a sufficient condition for the "fractionalisation" of a graph parameter to be in the asymptotic spectrum of graphs.

3.2 The asymptotic spectrum of graphs

In this section we prove Theorem 3.1 by applying the theory of asymptotic spectra to the appropriate semiring.

3.2.1 The semiring of graph isomorphism classes $\mathcal{G}$

A graph homomorphism $f : G \to H$ is a graph isomorphism if $f$ is bijective as a map $V(G) \to V(H)$ and bijective as a map $E(G) \to E(H)$. We write $G \cong H$ if there is a graph isomorphism $f : G \to H$. The relation $\cong$ is an equivalence relation on $\{\text{graphs}\}$, which we call isomorphism. For example, the graphs $G$ and $H$ given by

\[ V(G) = \{a, b, c, d\}, \quad E(G) = \bigl\{\{a, b\}, \{b, c\}, \{c, d\}, \{a, d\}\bigr\} \]

\[ V(H) = \{1, 2, 3, 4\}, \quad E(H) = \bigl\{\{1, 3\}, \{2, 3\}, \{2, 4\}, \{1, 4\}\bigr\} \]

are isomorphic. Let $\mathcal{G} = \{\text{graphs}\}/{\cong}$ be the set of equivalence classes in $\{\text{graphs}\}$ under $\cong$, i.e. the isomorphism classes. The relation $\leqslant$ is a preorder on $\mathcal{G}$. Recall that $K_n$ is the complete graph on $n$ vertices, and thus $\overline{K}_n$ is the graph with $n$ vertices and no edges.
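The isomorphism claimed for the two example graphs can be confirmed mechanically. The sketch below is illustrative only (the function name `find_isomorphism` is ours): it simply tries all bijections between the vertex sets, which is adequate for four vertices but not an efficient isomorphism test.

```python
from itertools import permutations

# The two example graphs from the text, edges as two-element frozensets.
VG = ["a", "b", "c", "d"]
EG = {frozenset(e) for e in [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")]}
VH = [1, 2, 3, 4]
EH = {frozenset(e) for e in [(1, 3), (2, 3), (2, 4), (1, 4)]}

def find_isomorphism(VG, EG, VH, EH):
    """Return a bijection V(G) -> V(H) that maps E(G) onto E(H), or None."""
    if len(VG) != len(VH) or len(EG) != len(EH):
        return None
    for image in permutations(VH):
        f = dict(zip(VG, image))
        # the image of the edge set must be exactly the edge set of H
        if {frozenset(f[v] for v in e) for e in EG} == EH:
            return f
    return None

iso = find_isomorphism(VG, EG, VH, EH)
print(iso)
```

Both graphs are 4-cycles, so several bijections qualify; the search returns the first one it encounters.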

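Lemma 3.2. Let $A, B, C \in \{\text{graphs}\}$.

(i) $\sqcup$ and $\boxtimes$ are commutative and associative operations on $\mathcal{G}$.

(ii) $\boxtimes$ distributes over $\sqcup$ on $\mathcal{G}$, i.e. $A \boxtimes (B \sqcup C) = (A \boxtimes B) \sqcup (A \boxtimes C)$.

(iii) $\overline{K}_1 \boxtimes A = A$.

(iv) $K_0 \boxtimes A = K_0$.

(v) $K_0 \sqcup A = A$.

(vi) $\overline{K}_n \sqcup \overline{K}_m = \overline{K}_{n+m}$.

Proof. We leave the proof to the reader.

In other words, Lemma 3.2 says that $(\mathcal{G}, \sqcup, \boxtimes, K_0, K_1)$ is a (commutative) semiring in which the elements $\overline{K}_0, \overline{K}_1, \overline{K}_2, \ldots$ behave like the natural numbers $\mathbb{N}$. We will denote this semiring simply by $\mathcal{G}$.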

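3.2.2 Strassen preorder via graph homomorphisms

Let $\mathcal{G}$ be the semiring of graphs. Recall that $G \leqslant H$ if there is a graph homomorphism $f : \overline{G} \to \overline{H}$.

Lemma 3.3. The preorder $\leqslant$ is a Strassen preorder on $\mathcal{G}$. That is, for graphs $A, B, C, D \in \mathcal{G}$ we have the following.

(i) For $n, m \in \mathbb{N}$, $\overline{K}_n \leqslant \overline{K}_m$ iff $n \le m$.

(ii) If $A \leqslant B$ and $C \leqslant D$, then $A \sqcup C \leqslant B \sqcup D$ and $A \boxtimes C \leqslant B \boxtimes D$.

(iii) For $A, B \in \mathcal{G}$, if $B \neq K_0$, then there is an $r \in \mathbb{N}$ with $A \leqslant \overline{K}_r \boxtimes B$.

Proof. Statement (i) is easy to verify. We prove (ii). Let $f : \overline{A} \to \overline{B}$ and $g : \overline{C} \to \overline{D}$ be graph homomorphisms. Let the map $f \sqcup g : V(A) \sqcup V(C) \to V(B) \sqcup V(D)$ be defined by

\[ (f \sqcup g)(a) = f(a) \text{ for } a \in V(A) \]

\[ (f \sqcup g)(c) = g(c) \text{ for } c \in V(C). \]

One verifies directly that $f \sqcup g$ is a graph homomorphism $\overline{A \sqcup C} \to \overline{B \sqcup D}$. Let the map $f \boxtimes g : V(A) \times V(C) \to V(B) \times V(D)$ be defined by

\[ (f \boxtimes g)(a, c) = (f(a), g(c)). \]

One verifies directly that $f \boxtimes g$ is a graph homomorphism $\overline{A \boxtimes C} \to \overline{B \boxtimes D}$. This proves (ii). We prove (iii). Let $r = |V(A)|$. Then $A \leqslant \overline{K}_r$. By assumption $B \neq K_0$, so $\overline{K}_1 \leqslant B$. Therefore $A \leqslant \overline{K}_r \cong \overline{K}_r \boxtimes \overline{K}_1 \leqslant \overline{K}_r \boxtimes B$. This proves (iii).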

3.2.3 The asymptotic spectrum of graphs $X(\mathcal{G})$

We thus have a semiring $\mathcal{G}$ with a Strassen preorder $\leqslant$. We are therefore in the position to apply the theory of asymptotic spectra (Chapter 2). Let us translate the abstract terminology to this setting.

Let $G \lesssim H$ if there is a sequence $(x_N) \in \mathbb{N}^{\mathbb{N}}$ with $(x_N)^{1/N} \to 1$ such that for every $N \in \mathbb{N}$ we have $G^{\boxtimes N} \leqslant H^{\boxtimes N} \boxtimes \overline{K}_{x_N}$, i.e. $G^{\boxtimes N} \leqslant (H^{\boxtimes N})^{\sqcup x_N}$.

Let $S \subseteq \mathcal{G}$ be a subsemiring. For example, one may take $S = \mathcal{G}$, or one may choose any set $X \subseteq \mathcal{G}$ and let $S = \mathbb{N}[X]$ be the subsemiring of $\mathcal{G}$ generated by $X$ under $\sqcup$ and $\boxtimes$.

The asymptotic spectrum of $S$ is the set $X(S)$ of $\leqslant$-monotone semiring homomorphisms $S \to \mathbb{R}_{\ge 0}$, i.e. all maps $\phi : S \to \mathbb{R}_{\ge 0}$ such that, for all $G, H \in S$:

(1) if $G \leqslant H$, then $\phi(G) \le \phi(H)$

(2) $\phi(G \sqcup H) = \phi(G) + \phi(H)$

(3) $\phi(G \boxtimes H) = \phi(G)\phi(H)$

(4) $\phi(K_1) = 1$.

We call $X(\mathcal{G})$ the asymptotic spectrum of graphs.

Theorem 3.4. Let $G, H \in S$. Then $G \lesssim H$ iff $\forall \phi \in X(S) : \phi(G) \le \phi(H)$.

Proof. By Lemma 3.2 we have a semiring $S$, and by Lemma 3.3 we have a Strassen preorder $\leqslant$, so we may apply Theorem 2.12.

We refer to Chapter 2 for a discussion of the topological properties of $X(S)$.

3.2.4 Shannon capacity $\Theta$

Let us discuss the (asymptotic) rank and (asymptotic) subrank for $(\mathcal{G}, \leqslant)$. Recall that an independent set in $G$ is a subset of $V(G)$ that contains no edges, and the independence number $\alpha(G)$ is the cardinality of the largest independent set in $G$. A colouring of $G$ is an assignment of colours to the elements of $V(G)$ such that adjacent vertices get distinct colours. The chromatic number $\chi(G)$ is the smallest number of colours in any colouring of $G$. The clique cover number $\overline{\chi}(G)$ is defined as the chromatic number of the complement, $\overline{\chi}(G) = \chi(\overline{G})$.

For the semiring $\mathcal{G}$ with preorder $\leqslant$, the abstract definition of subrank of Section 2.8 becomes $Q(G) = \max\{m \in \mathbb{N} : \overline{K}_m \leqslant G\}$, and the abstract definition of rank becomes $R(G) = \min\{n \in \mathbb{N} : G \leqslant \overline{K}_n\}$.

Lemma 3.5.

(i) $\alpha(G) = Q(G)$

(ii) $\overline{\chi}(G) = R(G)$

Proof. We leave the proof to the reader.
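Lemma 3.5 can be sanity-checked on small graphs by evaluating the preorder directly. The following Python sketch is our own illustration, not part of the original argument: `hom_exists` is a naive backtracking search for a graph homomorphism, `leq` implements $G \leqslant H$ as a homomorphism between complements, and `subrank` and `rank` follow the definitions of $Q$ and $R$ above. All function names are hypothetical.

```python
from itertools import combinations

def complement(G):
    V, E = G
    return V, {frozenset(p) for p in combinations(V, 2)} - E

def empty(n):
    """The edgeless graph on n vertices (the complement of K_n)."""
    return list(range(n)), set()

def cycle(n):
    return list(range(n)), {frozenset((i, (i + 1) % n)) for i in range(n)}

def hom_exists(G, H):
    """Backtracking search for a map V(G) -> V(H) sending edges to edges."""
    (VG, EG), (VH, EH) = G, H
    f = {}

    def extend(i):
        if i == len(VG):
            return True
        v = VG[i]
        for w in VH:
            f[v] = w
            # every edge from v to an already mapped vertex must land on an edge of H
            if all(frozenset((f[u], w)) in EH
                   for u in VG[:i] if frozenset((u, v)) in EG):
                if extend(i + 1):
                    return True
        f.pop(v, None)
        return False

    return extend(0)

def leq(G, H):
    """G <= H: there is a homomorphism from complement(G) to complement(H)."""
    return hom_exists(complement(G), complement(H))

def subrank(G):
    return max(m for m in range(len(G[0]) + 1) if leq(empty(m), G))

def rank(G):
    return min(n for n in range(len(G[0]) + 1) if leq(G, empty(n)))

C5 = cycle(5)
print(subrank(C5), rank(C5))
```

For the 5-cycle the two printed numbers agree with the independence number and the clique cover number of $C_5$, as the lemma predicts.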

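We see directly that the asymptotic subrank is the Shannon capacity,

\[ \tilde{Q}(G) = \lim_{N \to \infty} Q(G^{\boxtimes N})^{1/N} = \lim_{N \to \infty} \alpha(G^{\boxtimes N})^{1/N} = \Theta(G), \]

and that the asymptotic rank is the asymptotic clique cover number,

\[ \tilde{R}(G) = \lim_{N \to \infty} R(G^{\boxtimes N})^{1/N} = \lim_{N \to \infty} \overline{\chi}(G^{\boxtimes N})^{1/N} = \tilde{\overline{\chi}}(G). \]

Let $S \subseteq \mathcal{G}$ be a subsemiring. Let $G \in S$.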

Corollary 3.6. $\Theta(G) = \min_{\phi \in X(S)} \phi(G)$.

Proof. Let $G$ be a graph. Either $G = K_0$, or $\overline{K}_1 \leqslant G \leqslant \overline{K}_1$, or $\overline{G}$ contains at least one edge. In the first two cases the claim is clearly true. In the third case $G \geqslant \overline{K}_2$ and we may thus apply Corollary 2.13.

Corollary 3.7. $\tilde{\overline{\chi}}(G) = \max_{\phi \in X(S)} \phi(G)$.

Proof. This is Corollary 2.14.

Remark 3.8. As mentioned earlier, it turns out that $\tilde{\overline{\chi}}$ is in fact itself an element of $X(\mathcal{G})$. See Section 3.3.2. (This is a striking difference with the situation for tensors, which we will discuss in Chapter 4; there both asymptotic rank and asymptotic subrank are not in the asymptotic spectrum, see Remark 4.4.)

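Shannon capacity is not in the asymptotic spectrum

Lemma 3.9. $G \boxtimes \overline{G} \geqslant \overline{K}_{|V(G)|}$.

Proof. Let $D = \{(u, u) : u \in V(G)\}$. Let $(u, u), (v, v) \in D$ be distinct. Then either $\{u, v\} \in E(G)$ or $\{u, v\} \in E(\overline{G})$ (exclusive or), and so $\{(u, u), (v, v)\} \notin E(G \boxtimes \overline{G})$. Therefore the subgraph of $G \boxtimes \overline{G}$ induced by $D$ is isomorphic to $\overline{K}_{|V(G)|}$.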
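The diagonal argument in the proof of Lemma 3.9 is easy to test numerically. The sketch below is illustrative only, with helper names of our own choosing; it builds $G \boxtimes \overline{G}$ explicitly and checks that no two diagonal vertices are adjacent.

```python
from itertools import combinations, product

def complement(V, E):
    return {frozenset(p) for p in combinations(V, 2)} - E

def strong_product_edges(VG, EG, VH, EH):
    """Edge set of the strong product on the vertex set VG x VH."""
    E = set()
    for (g, h), (g2, h2) in combinations(list(product(VG, VH)), 2):
        if (g == g2 or frozenset((g, g2)) in EG) and \
           (h == h2 or frozenset((h, h2)) in EH):
            E.add(frozenset(((g, h), (g2, h2))))
    return E

def diagonal_is_independent(V, E):
    """Check that {(u, u)} is an independent set in G strong-product complement(G)."""
    EP = strong_product_edges(V, E, V, complement(V, E))
    diagonal = [(u, u) for u in V]
    return all(frozenset(pair) not in EP for pair in combinations(diagonal, 2))

# Two small test graphs: the 5-cycle and a path on 4 vertices.
V5 = list(range(5))
C5_edges = {frozenset((i, (i + 1) % 5)) for i in range(5)}
V4 = list(range(4))
P4_edges = {frozenset((i, i + 1)) for i in range(3)}

print(diagonal_is_independent(V5, C5_edges), diagonal_is_independent(V4, P4_edges))
```

The check succeeds for any graph, because two distinct vertices are adjacent in exactly one of $G$ and $\overline{G}$, so one of the two coordinates always blocks the edge.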

Example 3.10. Let $G$ be the Schläfli graph. This is a graph with 27 vertices. Thus $\Theta(G \boxtimes \overline{G}) \ge |V(G)| = 27$. On the other hand, Haemers in [Hae79] showed that $\Theta(G)\Theta(\overline{G}) \le 21$. This implies that the map $\Theta$ is not in $X(\mathcal{G})$, since it is not multiplicative under $\boxtimes$.

3.3 Universal spectral points

The abstract theory of asymptotic spectra of Chapter 2 does not explicitly describe the elements of $X(\mathcal{G})$, i.e. the universal spectral points (cf. Section 2.13). However, several graph parameters from the literature can be shown to be universal spectral points. In fact, recently in [BC18] the first infinite family of universal spectral points was found: the fractional Haemers bounds. We give a brief (and probably incomplete) overview of currently known elements in $X(\mathcal{G})$.

3.3.1 Lovász theta number $\vartheta$

For any real symmetric matrix $A$, let $\Lambda(A)$ be the largest eigenvalue. The Lovász theta number $\vartheta(G)$ is defined as

\[ \vartheta(G) = \min\bigl\{ \Lambda(A) : A \in \mathbb{R}^{V(G) \times V(G)} \text{ symmetric},\ \{u, v\} \notin E(G) \Rightarrow A_{uv} = 1 \bigr\}. \]

The parameter $\vartheta(G)$ was introduced by Lovász in [Lov79]. We refer to [Knu94] and [Sch03] for a survey. It follows from well-known properties that $\vartheta \in X(\mathcal{G})$.

3.3.2 Fractional graph parameters

Besides the Lovász theta number, there are several elements in $X(\mathcal{G})$ that are naturally obtained as fractional versions of $\boxtimes$-submultiplicative, $\sqcup$-subadditive, $\leqslant$-monotone maps $\mathcal{G} \to \mathbb{R}_{\ge 0}$. For any map $\phi : \mathcal{G} \to \mathbb{R}_{\ge 0}$ we define a fractional version $\phi_f$ by

\[ \phi_f(G) = \inf_d \frac{\phi(G \ltimes \overline{K}_d)}{d}, \]

where $\ltimes$ denotes the lexicographic product defined below. We will discuss several fractional parameters from the literature and prove a general theorem about fractional parameters.

Fractional clique cover number

We consider the fractional version of the clique cover number $\overline{\chi}(G) = \chi(\overline{G})$. It is well-known that $\overline{\chi}_f \in X(\mathcal{G})$, see e.g. [Sch03]. The fractional clique cover number $\overline{\chi}_f$ in fact equals the asymptotic clique cover number $\tilde{\overline{\chi}}(G) = \lim_{N \to \infty} \overline{\chi}(G^{\boxtimes N})^{1/N}$, which we introduced in the previous section; see [MP71] and also [Sch03, Th. 67.17].
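To make the fractional construction concrete, the following Python sketch evaluates $\overline{\chi}(G \ltimes \overline{K}_d)/d$ for $G = C_5$ and $d = 1, 2$. It is an illustration under our own assumptions: the helper names are hypothetical, the lexicographic product is implemented from its definition in the section on general fractional parameters, and the chromatic number is found by plain backtracking, which only scales to a handful of vertices. Each ratio is an upper bound on the infimum $\overline{\chi}_f(C_5)$; the code does not compute the infimum itself.

```python
from fractions import Fraction
from itertools import combinations, product

def complement(G):
    V, E = G
    return V, {frozenset(p) for p in combinations(V, 2)} - E

def lex_product(G, H):
    """Lexicographic product: adjacent iff g ~ g', or g = g' and h ~ h'."""
    (VG, EG), (VH, EH) = G, H
    V = list(product(VG, VH))
    E = set()
    for (g, h), (g2, h2) in combinations(V, 2):
        if frozenset((g, g2)) in EG or (g == g2 and frozenset((h, h2)) in EH):
            E.add(frozenset(((g, h), (g2, h2))))
    return V, E

def chromatic_number(G):
    """Smallest k admitting a proper k-colouring (naive backtracking)."""
    V, E = G
    adj = {v: set() for v in V}
    for e in E:
        u, w = tuple(e)
        adj[u].add(w)
        adj[w].add(u)

    def colourable(k, i, col):
        if i == len(V):
            return True
        v = V[i]
        for c in range(k):
            if all(col.get(u) != c for u in adj[v]):
                col[v] = c
                if colourable(k, i + 1, col):
                    return True
                del col[v]
        return False

    return next(k for k in range(len(V) + 1) if colourable(k, 0, {}))

def clique_cover_number(G):
    return chromatic_number(complement(G))

C5 = (list(range(5)), {frozenset((i, (i + 1) % 5)) for i in range(5)})
ratios = {}
for d in (1, 2):
    edgeless_d = (list(range(d)), set())  # the complement of K_d
    ratios[d] = Fraction(clique_cover_number(lex_product(C5, edgeless_d)), d)
print(ratios)
```

Already for $d = 2$ the ratio drops below the integer clique cover number of $C_5$, which is the effect that the fractional version is designed to capture.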

Fractional Haemers bound

Let $\operatorname{rank}(A)$ denote the matrix rank of any matrix $A$. For any set $C$ of matrices define $\operatorname{rank}(C) = \min\{\operatorname{rank}(A) : A \in C\}$. For a field $\mathbb{F}$ and a graph $G$ define the set of matrices

\[ M^{\mathbb{F}}(G) = \bigl\{ A \in \mathbb{F}^{V(G) \times V(G)} : \forall u, v:\ A_{vv} \neq 0,\ (u \neq v \text{ and } \{u, v\} \notin E(G)) \Rightarrow A_{uv} = 0 \bigr\}. \]

Let $R^{\mathbb{F}}(G) = \operatorname{rank}(M^{\mathbb{F}}(G))$. The parameter $R^{\mathbb{F}}(G)$ was introduced by Haemers in [Hae79] and is known as the Haemers bound. The fractional Haemers bound $R^{\mathbb{F}}_f$ was studied by Anna Blasiak in [Bla13] and was recently shown to be $\boxtimes$-multiplicative by Bukh and Cox in [BC18]. From this it is not hard to prove that $R^{\mathbb{F}}_f \in X(\mathcal{G})$. Bukh and Cox in [BC18] furthermore prove a separation result: for any field $\mathbb{F}$ of nonzero characteristic and any $\varepsilon > 0$, there is a graph $G$ such that for any field $\mathbb{F}'$ with $\operatorname{char}(\mathbb{F}) \neq \operatorname{char}(\mathbb{F}')$ the inequality $R^{\mathbb{F}}_f(G) < \varepsilon R^{\mathbb{F}'}_f(G)$ holds. This separation result implies that there are infinitely many elements in $X(\mathcal{G})$.

Fractional orthogonal rank

In [CMR+14] the orthogonal rank $\xi(G)$ and its fractional version, the projective rank $\xi_f(G)$, are studied. It follows easily from results in [CMR+14] that $G \mapsto \xi_f(\overline{G})$ is in $X(\mathcal{G})$.

General fractional parameters

We will prove something general about fractional parameters. Define the lexicographic product $G \ltimes H$ by

\[ V(G \ltimes H) = V(G) \times V(H) \]

\[ E(G \ltimes H) = \bigl\{ \{(g, h), (g', h')\} : \{g, g'\} \in E(G) \text{ or } (g = g' \text{ and } \{h, h'\} \in E(H)) \bigr\}. \]

The lexicographic product satisfies $\overline{G \ltimes H} = \overline{G} \ltimes \overline{H}$. Also define the or-product $G * H$ by

\[ V(G * H) = V(G) \times V(H) \]

\[ E(G * H) = \bigl\{ \{(g, h), (g', h')\} : \{g, g'\} \in E(G) \text{ or } \{h, h'\} \in E(H) \bigr\}. \]

The or-product and the strong graph product are related by $\overline{G * H} = \overline{G} \boxtimes \overline{H}$. The strong graph product gives a subgraph of the lexicographic product, which gives a subgraph of the or-product:

\[ G \boxtimes H \subseteq G \ltimes H \subseteq G * H. \]

Therefore $G * H \leqslant G \ltimes H \leqslant G \boxtimes H$. Finally, $G \ltimes \overline{K}_d = G * \overline{K}_d$ and of course $G \boxtimes \overline{K}_d = G^{\sqcup d}$.
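The relations between the three products can be verified on small examples. The Python sketch below is our own illustration (all names are hypothetical): a single helper `graph_product` builds each product from a Boolean rule, and the identities stated above are then compared as edge sets on a common vertex set.

```python
from itertools import combinations, product

def complement(G):
    V, E = G
    return V, {frozenset(p) for p in combinations(V, 2)} - E

def graph_product(G, H, rule):
    """Product on V(G) x V(H); rule(eg, eh, same_g, same_h) decides adjacency
    of two distinct pairs from edge / equality information per coordinate."""
    (VG, EG), (VH, EH) = G, H
    V = list(product(VG, VH))
    E = set()
    for (g, h), (g2, h2) in combinations(V, 2):
        eg, eh = frozenset((g, g2)) in EG, frozenset((h, h2)) in EH
        if rule(eg, eh, g == g2, h == h2):
            E.add(frozenset(((g, h), (g2, h2))))
    return V, E

def strong(G, H):
    return graph_product(G, H, lambda eg, eh, sg, sh: (eg or sg) and (eh or sh))

def lex(G, H):
    return graph_product(G, H, lambda eg, eh, sg, sh: eg or (sg and eh))

def orp(G, H):
    return graph_product(G, H, lambda eg, eh, sg, sh: eg or eh)

G = (list(range(5)), {frozenset((i, (i + 1) % 5)) for i in range(5)})  # C_5
H = (list(range(3)), {frozenset((0, 1)), frozenset((1, 2))})           # path P_3
d = 3
edgeless = (list(range(d)), set())                                     # complement of K_d

checks = {
    "strong <= lex <= or (as subgraphs)": strong(G, H)[1] <= lex(G, H)[1] <= orp(G, H)[1],
    "complement of or-product": complement(orp(G, H))[1] == strong(complement(G), complement(H))[1],
    "complement of lex product": complement(lex(G, H))[1] == lex(complement(G), complement(H))[1],
    "lex with edgeless equals or": lex(G, edgeless)[1] == orp(G, edgeless)[1],
    "strong with edgeless has d copies": len(strong(G, edgeless)[1]) == d * len(G[1]),
}
print(checks)
```

The last check only compares edge counts, which is a necessary condition for $G \boxtimes \overline{K}_d = G^{\sqcup d}$ rather than a full isomorphism test.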

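We will prove: if $\phi : \mathcal{G} \to \mathbb{R}_{\ge 0}$ is $\boxtimes$-submultiplicative, $\sqcup$-subadditive and $\leqslant$-monotone, then $\phi_f$ is again $\boxtimes$-submultiplicative, $\sqcup$-subadditive and $\leqslant$-monotone. Moreover, if $\phi : \mathcal{G} \to \mathbb{N}$ is $\leqslant$-monotone and satisfies

\[ \forall G, H \in \mathcal{G} : \phi(G \ltimes H) \ge \phi(G \ltimes \overline{K}_{\phi(H)}), \]

then $\phi_f$ is $\ltimes$-supermultiplicative and, more importantly, $\phi_f$ is $\boxtimes$-supermultiplicative.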

Lemma 3.11.

(i) If $\phi$ is $\sqcup$-superadditive, then $\phi_f$ is $\sqcup$-superadditive.

(ii) If $\phi$ is $\leqslant$-monotone, then $\phi_f$ is $\leqslant$-monotone.

(iii) If $\phi$ is $\sqcup$-subadditive and $\leqslant$-monotone, then $\phi_f$ is $\sqcup$-subadditive.

(iv) If $\forall n \in \mathbb{N} : \phi(\overline{K}_n) = n$, then $\forall n \in \mathbb{N} : \phi_f(\overline{K}_n) = n$.

(v) If $\phi$ is $\boxtimes$-submultiplicative and $\leqslant$-monotone, then $\phi_f$ is $\boxtimes$-submultiplicative.

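Proof. Let $G, H \in \mathcal{G}$. Let $d \in \mathbb{N}$.

(i) The lexicographic product distributes over the disjoint union,

\[ (G \sqcup H) \ltimes \overline{K}_d = (G \ltimes \overline{K}_d) \sqcup (H \ltimes \overline{K}_d). \]

By superadditivity,

\[ \phi\bigl((G \ltimes \overline{K}_d) \sqcup (H \ltimes \overline{K}_d)\bigr) \ge \phi(G \ltimes \overline{K}_d) + \phi(H \ltimes \overline{K}_d). \]

Therefore

\[ \begin{aligned}
\phi_f(G \sqcup H) &= \inf_d \frac{\phi((G \sqcup H) \ltimes \overline{K}_d)}{d} \\
&= \inf_d \frac{\phi((G \ltimes \overline{K}_d) \sqcup (H \ltimes \overline{K}_d))}{d} \\
&\ge \inf_d \Bigl( \frac{\phi(G \ltimes \overline{K}_d)}{d} + \frac{\phi(H \ltimes \overline{K}_d)}{d} \Bigr) \\
&\ge \inf_{d_1} \frac{\phi(G \ltimes \overline{K}_{d_1})}{d_1} + \inf_{d_2} \frac{\phi(H \ltimes \overline{K}_{d_2})}{d_2} \\
&= \phi_f(G) + \phi_f(H).
\end{aligned} \]

(ii) Let $G \leqslant H$. Then $G \ltimes \overline{K}_d \leqslant H \ltimes \overline{K}_d$. Thus $\phi(G \ltimes \overline{K}_d) \le \phi(H \ltimes \overline{K}_d)$. Therefore $\phi_f(G) \le \phi_f(H)$.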

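(iii) We have $G \ltimes \overline{K}_d \leqslant G \boxtimes \overline{K}_d = G^{\sqcup d}$. Thus by monotonicity and subadditivity,

\[ \phi(G \ltimes \overline{K}_d) \le d\,\phi(G), \]

and for $d, e \in \mathbb{N}$,

\[ \phi(G \ltimes \overline{K}_{de}) = \phi((G \ltimes \overline{K}_d) \ltimes \overline{K}_e) \le e\,\phi(G \ltimes \overline{K}_d). \]

We use this inequality to get, for $d_1, d_2 \in \mathbb{N}$,

\[ \frac{\phi(G \ltimes \overline{K}_{d_1})}{d_1} + \frac{\phi(H \ltimes \overline{K}_{d_2})}{d_2} \ge \frac{\phi(G \ltimes \overline{K}_{d_1 d_2}) + \phi(H \ltimes \overline{K}_{d_1 d_2})}{d_1 d_2}. \]

From subadditivity follows

\[ \begin{aligned}
\frac{\phi(G \ltimes \overline{K}_{d_1 d_2}) + \phi(H \ltimes \overline{K}_{d_1 d_2})}{d_1 d_2} &\ge \frac{\phi((G \ltimes \overline{K}_{d_1 d_2}) \sqcup (H \ltimes \overline{K}_{d_1 d_2}))}{d_1 d_2} \\
&= \frac{\phi((G \sqcup H) \ltimes \overline{K}_{d_1 d_2})}{d_1 d_2} \\
&\ge \phi_f(G \sqcup H).
\end{aligned} \]

We conclude $\phi_f(G) + \phi_f(H) \ge \phi_f(G \sqcup H)$.

(iv) Let $n \in \mathbb{N}$. Then $\phi_f(\overline{K}_n) = \inf_d \phi(\overline{K}_n \ltimes \overline{K}_d)/d = \inf_d \phi(\overline{K}_{nd})/d = n$.

(v) Let $d_1, d_2 \in \mathbb{N}$. We claim

\[ (G \boxtimes H) \ltimes \overline{K}_{d_1 d_2} \leqslant (G \ltimes \overline{K}_{d_1}) \boxtimes (H \ltimes \overline{K}_{d_2}). \]

This is the same as saying there is a graph homomorphism

\[ \overline{(G \boxtimes H) \ltimes \overline{K}_{d_1 d_2}} \to \overline{(G \ltimes \overline{K}_{d_1}) \boxtimes (H \ltimes \overline{K}_{d_2})}, \]

which is the same as saying there is a graph homomorphism

\[ (\overline{G} * \overline{H}) \ltimes K_{d_1 d_2} \to (\overline{G} \ltimes K_{d_1}) * (\overline{H} \ltimes K_{d_2}), \]

where $*$ denotes the or-product of graphs. One verifies that $(g, h, (i, j)) \mapsto ((g, i), (h, j))$ is such a graph homomorphism, proving the claim. The claim together with monotonicity and submultiplicativity gives

\[ \phi((G \boxtimes H) \ltimes \overline{K}_{d_1 d_2}) \le \phi((G \ltimes \overline{K}_{d_1}) \boxtimes (H \ltimes \overline{K}_{d_2})) \le \phi(G \ltimes \overline{K}_{d_1})\,\phi(H \ltimes \overline{K}_{d_2}). \]

Therefore

\[ \begin{aligned}
\phi_f(G \boxtimes H) &= \inf_d \frac{\phi((G \boxtimes H) \ltimes \overline{K}_d)}{d} \\
&= \inf_{d_1, d_2} \frac{\phi((G \boxtimes H) \ltimes \overline{K}_{d_1 d_2})}{d_1 d_2} \\
&\le \inf_{d_1, d_2} \frac{\phi(G \ltimes \overline{K}_{d_1})}{d_1} \cdot \frac{\phi(H \ltimes \overline{K}_{d_2})}{d_2} \\
&= \phi_f(G)\,\phi_f(H).
\end{aligned} \]

This concludes the proof of the lemma.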

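Lemma 3.12. Let $\phi : \mathcal{G} \to \mathbb{N}$ satisfy

\[ \forall G, H \in \mathcal{G} : \phi(G \ltimes H) \ge \phi(G \ltimes \overline{K}_{\phi(H)}). \tag{3.2} \]

Then

\[ \inf_H \frac{\phi(G \ltimes H)}{\phi(H)} = \inf_d \frac{\phi(G \ltimes \overline{K}_d)}{d}. \]

Proof. From (3.2) follows

\[ \frac{\phi(G \ltimes H)}{\phi(H)} \ge \frac{\phi(G \ltimes \overline{K}_{\phi(H)})}{\phi(H)}, \]

and so

\[ \frac{\phi(G \ltimes H)}{\phi(H)} \ge \inf_d \frac{\phi(G \ltimes \overline{K}_d)}{d}. \]

We take the infimum over $H$ to get

\[ \inf_H \frac{\phi(G \ltimes H)}{\phi(H)} \ge \inf_d \frac{\phi(G \ltimes \overline{K}_d)}{d}. \]

The inequality in the other direction,

\[ \inf_H \frac{\phi(G \ltimes H)}{\phi(H)} \le \inf_d \frac{\phi(G \ltimes \overline{K}_d)}{d}, \]

is trivially true: restrict the infimum on the left-hand side to graphs of the form $H = \overline{K}_d$, for which $\phi(\overline{K}_d) = d$ when $\phi$ is $\overline{K}_n$-normalised as in Theorem 3.14.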

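Lemma 3.13. Let $\phi : \mathcal{G} \to \mathbb{N}$ be $\leqslant$-monotone and satisfy

\[ \forall G, H \in \mathcal{G} : \phi(G \ltimes H) \ge \phi(G \ltimes \overline{K}_{\phi(H)}). \]

Then $\phi_f$ is $\ltimes$- and $\boxtimes$-supermultiplicative.

Proof. Let $A, B \in \mathcal{G}$. We have $A \boxtimes B \geqslant A \ltimes B$, so

\[ \phi_f(A \boxtimes B) \ge \phi_f(A \ltimes B). \]

It remains to show $\phi_f(A \ltimes B) \ge \phi_f(A)\,\phi_f(B)$. We have

\[ \frac{\phi(A \ltimes B \ltimes H)}{\phi(H)} = \frac{\phi(A \ltimes (B \ltimes H))}{\phi(B \ltimes H)} \cdot \frac{\phi(B \ltimes H)}{\phi(H)}, \]

which implies

\[ \frac{\phi(A \ltimes B \ltimes H)}{\phi(H)} \ge \inf_{H'} \frac{\phi(A \ltimes H')}{\phi(H')} \cdot \inf_{H''} \frac{\phi(B \ltimes H'')}{\phi(H'')} = \phi_f(A)\,\phi_f(B), \]

where the last equality is Lemma 3.12. Take the infimum over $H$ to obtain $\phi_f(A \ltimes B) \ge \phi_f(A)\,\phi_f(B)$.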

Theorem 3.14. Let $\phi : \mathcal{G} \to \mathbb{N}$ be $\sqcup$-additive, $\boxtimes$-submultiplicative, $\leqslant$-monotone and $\overline{K}_n$-normalised, and satisfy

\[ \forall G, H \in \mathcal{G} : \phi(G \ltimes H) \ge \phi(G \ltimes \overline{K}_{\phi(H)}). \]

Then $\phi_f$ is in $X(\mathcal{G})$.

Proof. This follows from Lemma 3.11, Lemma 3.12 and Lemma 3.13.

3.4 Conclusion

In this chapter we introduced a new connection between Strassen's theory of asymptotic spectra and the Shannon capacity of graphs. In particular, we characterised the Shannon capacity (which is defined as a supremum) as a minimisation over elements in the asymptotic spectrum of graphs. Known elements in the asymptotic spectrum of graphs include the fractional clique cover number, the Lovász theta number, the projective rank and the fractional Haemers bound. We are left with a clear goal for future work: find all elements in the asymptotic spectrum of graphs.

Chapter 4

The asymptotic spectrum of tensors; exponent of matrix multiplication

This chapter is based on joint work with Matthias Christandl and Peter Vrana [CVZ18]

4.1 Introduction

This chapter is about tensors $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ and their asymptotic properties. The theory of asymptotic spectra of Chapter 2 was developed by Strassen exactly for the purpose of understanding the asymptotic properties of tensors. This chapter is expository and provides the necessary background for understanding Chapter 5 and Chapter 6.

Let us first define the asymptotic properties of interest and discuss some of their applications. We need the concepts restriction, tensor product and diagonal tensor. Let $s \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ and $t \in \mathbb{F}^{m_1} \otimes \cdots \otimes \mathbb{F}^{m_k}$ be tensors. We say $s$ restricts to $t$, and write $s \geqslant t$, if there are linear maps $A_i : \mathbb{F}^{n_i} \to \mathbb{F}^{m_i}$ such that $t = (A_1 \otimes \cdots \otimes A_k) \cdot s$. The tensor product of $s$ and $t$ is the element $s \otimes t \in \mathbb{F}^{n_1 m_1} \otimes \cdots \otimes \mathbb{F}^{n_k m_k}$ with coordinates $(s \otimes t)_{ij} = s_i t_j$. We naturally define the direct sum $s \oplus t \in \mathbb{F}^{n_1 + m_1} \otimes \cdots \otimes \mathbb{F}^{n_k + m_k}$. We define the diagonal tensors $\langle n \rangle = \sum_{i=1}^{n} e_i \otimes \cdots \otimes e_i$ for $n \in \mathbb{N}$, where $e_1, \ldots, e_n$ is the standard basis of $\mathbb{F}^n$. The tensor rank $R(t)$ is the smallest number $n \in \mathbb{N}$ such that $t$ can be written as a sum of $n$ simple tensors, a simple tensor being a tensor of the form $v_1 \otimes \cdots \otimes v_k$. Equivalently, $R(t) = \min\{n \in \mathbb{N} : t \leqslant \langle n \rangle\}$. The asymptotic rank is the regularisation $\tilde{R}(t) = \lim_{n \to \infty} R(t^{\otimes n})^{1/n}$. While tensor rank is known to be hard to compute [Has90, Shi16], we do not know whether asymptotic rank is hard to compute.

The exponent of matrix multiplication

The motivating example for studying asymptotic rank is the problem of finding the exponent of matrix multiplication $\omega$. Recall from the introduction that $\omega$ is the infimum over $a \in \mathbb{R}$ such that two $n \times n$ matrices can be multiplied using $O(n^a)$ arithmetic operations (in the algebraic circuit model). It turns out (see [BCS97]) that $\omega$ is characterised by the asymptotic rank $\tilde{R}(\langle 2, 2, 2 \rangle)$ of the matrix multiplication tensor

\[ \langle 2, 2, 2 \rangle = \sum_{i, j, k \in [2]} e_{ij} \otimes e_{jk} \otimes e_{ki} \in \mathbb{F}^4 \otimes \mathbb{F}^4 \otimes \mathbb{F}^4. \]

Namely, $\tilde{R}(\langle 2, 2, 2 \rangle) = 2^\omega$. We know the trivial lower bound $2 \le \omega$, see Section 4.3. We know the (non-trivial) upper bound $\omega \le 2.3728639$, which is by Coppersmith and Winograd [CW90] and improvements by Stothers, Williams and Le Gall [Sto10, Wil12, LG14].
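As a small worked example of how the identity $\tilde{R}(\langle 2, 2, 2 \rangle) = 2^\omega$ turns rank bounds into bounds on $\omega$, the sketch below uses the tensor rank value 7 quoted in Remark 4.4 together with the general inequality $\tilde{R} \le R$. The variable names are ours, and the computation is plain arithmetic, not a new result.

```python
import math

# Assumption taken from Remark 4.4: the tensor rank of <2,2,2> equals 7.
RANK_222 = 7

# 2**omega = asymptotic rank <= tensor rank, hence omega <= log2(rank).
omega_upper = math.log2(RANK_222)

# Upper bound quoted in the text above, for comparison.
quoted_bound = 2.3728639

print(f"omega <= {omega_upper:.4f} from the rank of <2,2,2>")
print(f"the quoted bound improves on this by {omega_upper - quoted_bound:.4f}")
```

The trivial bounds $2 \le \omega \le 3$ bracket both numbers; the gap between them and the quoted bound shows how much is gained by analysing asymptotic rank rather than the rank of a single small tensor.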

Asymptotic subrank and asymptotic restriction

Besides (asymptotic) rank, we naturally define subrank $Q(t) = \max\{m \in \mathbb{N} : \langle m \rangle \leqslant t\}$ and the asymptotic subrank $\tilde{Q}(t) = \lim_{n \to \infty} Q(t^{\otimes n})^{1/n}$. Moreover, we say $s$ restricts asymptotically to $t$, written $s \gtrsim t$, if there is a sequence of natural numbers $a(n) \in o(n)$ such that for all $n \in \mathbb{N}$

\[ s^{\otimes n} \otimes \langle 2 \rangle^{\otimes a(n)} \geqslant t^{\otimes n}. \]

One can prove (see [Str91]) that

\[ s^{\otimes n} \otimes \langle 2 \rangle^{\otimes o(n)} \geqslant t^{\otimes n} \quad \text{iff} \quad s^{\otimes (n + o(n))} \geqslant t^{\otimes n}. \]

Our goal is to understand asymptotic restriction, asymptotic rank and asymptotic subrank.

More connections: quantum information, combinatorics, algebraic property testing

Besides matrix multiplication, other applications of asymptotic restriction of tensors, asymptotic rank of tensors and asymptotic subrank of tensors include deciding the feasibility of an asymptotic transformation between pure quantum states via stochastic local operations and classical communication (SLOCC) in quantum information theory [BPR+00, DVC00, VDDMV02, HHHH09], bounding the size of combinatorial structures like cap sets and tri-colored sum-free sets in additive combinatorics [Ede04, Tao08, ASU13, CLP17, EG17, Tao16, BCC+17, KSS16, TS16] (see Chapter 5), and bounding the query complexity of certain properties in algebraic property testing [KS08, BCSX10, Sha09, BX15, HX17, FK14].

This chapter is organised as follows. In Section 4.2 we briefly discuss the semiring of tensors, the asymptotic spectrum of tensors, and asymptotic rank and subrank. In Section 4.3 we discuss the gauge points, a simple construction of finitely many elements in the asymptotic spectrum of tensors. In Section 4.4 we discuss the Strassen support functionals, a family of elements in the asymptotic spectrum of "oblique" tensors. This family is parametrised by probability distributions on $[k]$. In Section 4.5 we discuss an extension of the support functionals called the Strassen upper support functionals, which have the potential to be universal. Finally, in Section 4.6 we prove a new result: we show how asymptotic slice rank is related to the support functionals.

4.2 The asymptotic spectrum of tensors

Let us properly set up the semiring of tensors and the asymptotic spectrum. For the proofs we refer to [Str87, Str88, Str91].

4.2.1 The semiring of tensor equivalence classes $\mathcal{T}$

We begin by putting an equivalence relation on tensors. For example, we want to identify isomorphic tensors, and also, for any tensor $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$, we want to identify $t$ with $t \oplus 0$, where $0 \in \mathbb{F}^{m_1} \otimes \cdots \otimes \mathbb{F}^{m_k}$ is a zero tensor of any format.

We say $s$ is isomorphic to $t$, and write $s \cong t$, if there are bijective linear maps $A_i : \mathbb{F}^{m_i} \to \mathbb{F}^{n_i}$ such that $t = (A_1, \ldots, A_k) \cdot s$.

We say $s$ and $t$ are equivalent, and write $s \sim t$, if there are zero tensors $s_0 = 0 \in \mathbb{F}^{a_1} \otimes \cdots \otimes \mathbb{F}^{a_k}$ and $t_0 = 0 \in \mathbb{F}^{b_1} \otimes \cdots \otimes \mathbb{F}^{b_k}$ such that $s \oplus s_0 \cong t \oplus t_0$. The equivalence relation $\sim$ is in fact the equivalence relation generated by the restriction preorder $\leqslant$.

Let $\mathcal{T}$ be the set of $\sim$-equivalence classes of $k$-tensors over $\mathbb{F}$, for some fixed $k$ and field $\mathbb{F}$. The direct sum and the tensor product naturally carry over to $\mathcal{T}$, and $\mathcal{T}$ becomes a semiring with additive unit $\langle 0 \rangle$ and multiplicative unit $\langle 1 \rangle$ (more precisely, the equivalence classes of those tensors, but we will not make this distinction).

4.2.2 Strassen preorder via restriction

Restriction $\leqslant$ induces a partial order on $\mathcal{T}$ which behaves well with respect to the semiring operations, and naturally $n \le m$ if and only if $\langle n \rangle \leqslant \langle m \rangle$. Therefore, restriction $\leqslant$ is a Strassen preorder on $\mathcal{T}$.

4.2.3 The asymptotic spectrum of tensors $X(\mathcal{T})$

Let $S \subseteq \mathcal{T}$ be a subsemiring. Let

\[ X(S) = X(S, \leqslant) = \{ \phi \in \operatorname{Hom}(S, \mathbb{R}_{\ge 0}) : \forall a, b \in S,\ a \leqslant b \Rightarrow \phi(a) \le \phi(b) \}. \]

We call $X(S)$ the asymptotic spectrum of $S$, and we call $X(\mathcal{T})$ the asymptotic spectrum of $k$-tensors over $\mathbb{F}$.

Theorem 4.1 ([Str88]). Let $s, t \in S$. Then $s \lesssim t$ iff $\forall \phi \in X(S) : \phi(s) \le \phi(t)$.

Proof. This follows from Theorem 2.12.

We refer to Chapter 2 for a discussion of the topological properties of $X(S)$.

Remark 4.2. We mention that $X(S)$ may equivalently be defined with degeneration $\trianglerighteq$ instead of restriction $\geqslant$. Over $\mathbb{C}$, we say $f$ degenerates to $g$, written $f \trianglerighteq g$, if $f \cong f'$ and $g \cong g'$ and $g'$ is in the Euclidean closure (or equivalently Zariski closure) of the orbit $\mathrm{GL}_{n_1} \times \cdots \times \mathrm{GL}_{n_k} \cdot f'$. It is a nontrivial fact from algebraic geometry (see [Kra84, Lemma III.2.3.1] or [BCS97]) that there is a degeneration $f \trianglerighteq g$ if and only if there are matrices $A_i$ with entries polynomial in $\varepsilon$ such that $(A_1, \ldots, A_k) \cdot f = \varepsilon^d g + \varepsilon^{d+1} g_1 + \cdots + \varepsilon^{d+e} g_e$ for some elements $g_1, \ldots, g_e$. The latter definition of degeneration is valid when $\mathbb{C}$ is replaced by an arbitrary field $\mathbb{F}$, and that is how degeneration is defined for an arbitrary field. Degeneration is weaker than restriction: $f \geqslant g$ implies $f \trianglerighteq g$. Asymptotically, however, the notions coincide: $f \gtrsim g$ if and only if $f^{\otimes n} \otimes \langle 2 \rangle^{\otimes o(n)} \trianglerighteq g^{\otimes n}$. We mention that, analogous to restriction, degeneration gives rise to border rank and border subrank, $\underline{R}(f) = \min\{r \in \mathbb{N} : f \trianglelefteq \langle r \rangle\}$ and $\underline{Q}(f) = \max\{s \in \mathbb{N} : \langle s \rangle \trianglelefteq f\}$ respectively.

4.2.4 Asymptotic rank and asymptotic subrank

The abstract theory of asymptotic spectra characterises asymptotic subrank and asymptotic rank as follows.

Corollary 4.3. Let $S \subseteq \mathcal{T}$ be a subsemiring. Let $a \in S$. Then

\[ \tilde{Q}(a) = \min_{\phi \in X(S)} \phi(a) \tag{4.1} \]

\[ \tilde{R}(a) = \max_{\phi \in X(S)} \phi(a). \tag{4.2} \]

Proof. Statement (4.2) follows from Corollary 2.13, since either $a = 0$ or $a \geqslant 1$. For statement (4.1), if $a^{\otimes k} \geqslant 2$ for some $k \in \mathbb{N}$, then we apply Corollary 2.14. Otherwise, one can show that $\tilde{Q}(a)$ equals 0 or 1, using the gauge points of the next section (see [Str88, Lemma 3.7]).

Remark 4.4. One verifies that $\tilde{R}$ and $\tilde{Q}$ are $\leqslant$-monotones and have value $n$ on $\langle n \rangle$. They are not universal spectral points, however. Namely, the asymptotic rank of each of the three tensors

\[ \langle 2, 1, 1 \rangle = e_1 \otimes e_1 \otimes 1 + e_2 \otimes e_2 \otimes 1 \in \mathbb{F}^2 \otimes \mathbb{F}^2 \otimes \mathbb{F}^1 \]

\[ \langle 1, 1, 2 \rangle = e_1 \otimes 1 \otimes e_1 + e_2 \otimes 1 \otimes e_2 \in \mathbb{F}^2 \otimes \mathbb{F}^1 \otimes \mathbb{F}^2 \]

\[ \langle 1, 2, 1 \rangle = 1 \otimes e_1 \otimes e_1 + 1 \otimes e_2 \otimes e_2 \in \mathbb{F}^1 \otimes \mathbb{F}^2 \otimes \mathbb{F}^2 \]

equals 2, whereas their tensor product equals the matrix multiplication tensor $\langle 2, 2, 2 \rangle$, whose tensor rank equals 7 and whose asymptotic rank is thus at most 7, i.e. strictly smaller than $2^3 = 8$. Therefore, asymptotic rank is not multiplicative. On the other hand, the asymptotic subrank of each of the above three tensors equals 1, whereas the asymptotic subrank of $\langle 2, 2, 2 \rangle$ equals 4, see Chapter 5. Therefore, asymptotic subrank is not multiplicative.

Goal 4.5. Our goal is now to explicitly describe elements in $X(\mathcal{T})$ (universal spectral points) or, more modestly, to describe elements in $X(S)$ for interesting subsemirings $S \subseteq \mathcal{T}$.

Strassen constructed a finite family of elements in $X(\mathcal{T})$, the gauge points, and an infinite family of elements in $X(\{\text{oblique tensors}\})$, the support functionals. The support functionals are powerful enough to determine the asymptotic subrank of any "tight tensor". Tight tensors are discussed in Chapter 5. In Chapter 6 we construct an infinite family in $X(\{k\text{-tensors over } \mathbb{C}\})$, the quantum functionals. In the rest of this chapter we discuss the gauge points and the support functionals. We will focus on the case $k = 3$ for clarity of exposition.

4.3 Gauge points $\zeta^{(i)}$

Strassen in [Str88] introduced a finite family of elements in $X(\mathcal{T})$ called the gauge points. We focus on 3-tensors, but the construction generalises immediately to $k$-tensors. Let $V_i = \mathbb{F}^{n_i}$. Let $t \in V_1 \otimes V_2 \otimes V_3$. Let $i \in [3]$. Let $\operatorname{flatten}_i(t)$ be the image of $t$ under the grouping $V_1 \otimes V_2 \otimes V_3 \to V_i \otimes (\bigotimes_{j \neq i} V_j)$. We think of $\operatorname{flatten}_i(t)$ as a matrix. Let $\zeta^{(i)} : \mathcal{T} \to \mathbb{N} : t \mapsto \operatorname{rank}(\operatorname{flatten}_i(t))$, with rank denoting matrix rank. We call $\zeta^{(1)}, \zeta^{(2)}, \zeta^{(3)}$ the gauge points. From the properties of matrix rank it follows directly that $\zeta^{(i)}$ is multiplicative under $\otimes$, additive under $\oplus$, monotone under restriction $\leqslant$ (and under degeneration $\trianglelefteq$), and normalised to 1 on $\langle 1 \rangle = e_1 \otimes e_1 \otimes e_1$.

Theorem 4.6. $\zeta^{(1)}, \zeta^{(2)}, \zeta^{(3)} \in X(\mathcal{T})$.
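The gauge points are straightforward to evaluate for explicit tensors. The following Python sketch is our own illustration, not taken from the references: a 3-tensor is stored as a dictionary from index triples to coefficients, `flatten` groups two of the three factors into the column index, and `matrix_rank` performs exact Gaussian elimination over the rationals. Working over $\mathbb{Q}$ is an assumption of the sketch; the text allows an arbitrary field $\mathbb{F}$. All function names are hypothetical.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank over the rationals by Gaussian elimination with exact arithmetic."""
    M = [[Fraction(x) for x in r] for r in rows]
    rank = 0
    ncols = len(M[0]) if M else 0
    for c in range(ncols):
        pivot = next((r for r in range(rank, len(M)) if M[r][c] != 0), None)
        if pivot is None:
            continue
        M[rank], M[pivot] = M[pivot], M[rank]
        for r in range(rank + 1, len(M)):
            factor = M[r][c] / M[rank][c]
            M[r] = [x - factor * y for x, y in zip(M[r], M[rank])]
        rank += 1
    return rank

def flatten(t, dims, axis):
    """Matrix of t with rows indexed by factor `axis` and columns by the other two."""
    others = [a for a in range(3) if a != axis]
    ncols = dims[others[0]] * dims[others[1]]
    M = [[0] * ncols for _ in range(dims[axis])]
    for idx, val in t.items():
        col = idx[others[0]] * dims[others[1]] + idx[others[1]]
        M[idx[axis]][col] = val
    return M

def gauge_points(t, dims):
    return tuple(matrix_rank(flatten(t, dims, a)) for a in range(3))

def matmul_tensor(n):
    """<n,n,n> = sum_{i,j,k} e_ij (x) e_jk (x) e_ki, with e_ij stored at index n*i + j."""
    return {(n * i + j, n * j + k, n * k + i): 1
            for i in range(n) for j in range(n) for k in range(n)}

mm2 = gauge_points(matmul_tensor(2), (4, 4, 4))
diag3 = gauge_points({(i, i, i): 1 for i in range(3)}, (3, 3, 3))
print(mm2, diag3)
```

For the diagonal tensor $\langle 3 \rangle$ every flattening has rank 3, in line with the normalisation property, and for $\langle 2, 2, 2 \rangle$ every flattening has full rank.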

Recall $\tilde{Q}(t) \le \phi(t) \le \tilde{R}(t)$ for $\phi \in X(\mathcal{T})$. In particular, $\max_i \zeta^{(i)}(t) \le \tilde{R}(t)$. We do not know whether $\max_{i \in [3]} \zeta^{(i)}$ equals $\tilde{R}$. To be precise, we do not know any $t$ for which $\max_i \zeta^{(i)}(t) < \tilde{R}(t)$, and we do not know a proof that $\max_i \zeta^{(i)}(t) = \tilde{R}(t)$ for all $t$. There are various families of tensors $t$ for which $\max_i \zeta^{(i)}(t) = \tilde{R}(t)$ is proven. We will see such a family in Section 5.4.2. For the matrix multiplication tensor $\langle 2, 2, 2 \rangle$ we have $4 = \max_i \zeta^{(i)}(\langle 2, 2, 2 \rangle) \le 2^\omega$, so $\max_i \zeta^{(i)}(t) = \tilde{R}(t)$ would imply that the matrix multiplication exponent $\omega$ equals 2.

On the other hand, $\tilde{Q}(t) \le \min_i \zeta^{(i)}(t)$. There exist $t$ for which $\tilde{Q}(t)$ is strictly smaller than $\min_{i \in [3]} \zeta^{(i)}(t)$. To show this strict inequality we need another technique of Strassen: the support functionals. The support functionals are the topic of the next section.

4.4 Support functionals $\zeta^{\theta}$

Strassen in [Str91] constructed an infinite family of elements in the asymptotic spectrum of oblique $k$-tensors called the support functionals. In this section we explain the construction of the support functionals. The support functionals provide the benchmark for our new quantum functionals (Chapter 6) and are relevant in the context of combinatorial problems like the cap set problem (Section 5.4.2). For clarity of exposition we focus on 3-tensors. The ideas extend directly to $k$-tensors.

Oblique tensors are tensors for which, in some basis, the support has the following special structure. Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$. Let $e_1, \ldots, e_{n_i}$ be the standard basis of $\mathbb{F}^{n_i}$. Write $t = \sum_{i,j,k} t_{ijk}\, e_i \otimes e_j \otimes e_k$. Let $[n_i] = \{1, 2, \ldots, n_i\}$. Let $\operatorname{supp}(t) = \{(i,j,k) : t_{ijk} \neq 0\} \subseteq [n_1] \times [n_2] \times [n_3]$ be the support of $t$ with respect to the standard basis. Let $[n_i]$ have the natural ordering $1 < 2 < \cdots < n_i$ and let $[n_1] \times [n_2] \times [n_3]$ have the product order, denoted by $\leq$. That is, $x \leq y$ if $x_i \leq y_i$ for all $i \in [3]$. We call $\operatorname{supp}(t)$ oblique if $\operatorname{supp}(t)$ is an antichain with respect to $\leq$, i.e. if any two distinct elements in $\operatorname{supp}(t)$ are incomparable with respect to $\leq$. We call a tensor $t$ oblique if $\operatorname{supp}(g \cdot t)$ is oblique for some group element $g \in G(t) = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$. The family of oblique tensors is a semiring under $\oplus$ and $\otimes$.
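The antichain condition on a support is easy to test mechanically. The sketch below is illustrative only and uses hypothetical helper names (`leq`, `is_antichain`). It checks the condition for a fixed support in the standard basis, so a negative answer does not show that the tensor fails to be oblique, since the definition allows a change of basis by $g \in G(t)$.

```python
from itertools import combinations

def leq(x, y):
    """Product order: x <= y iff x_i <= y_i in every coordinate."""
    return all(a <= b for a, b in zip(x, y))

def is_antichain(support):
    """True iff no two distinct elements of the support are comparable."""
    return not any(leq(x, y) or leq(y, x)
                   for x, y in combinations(set(support), 2))

# Support of e1(x)e1(x)e2 + e1(x)e2(x)e1 + e2(x)e1(x)e1 (1-based indices)
W_SUPPORT = {(1, 1, 2), (1, 2, 1), (2, 1, 1)}
# Support of the diagonal tensor <2> in the standard basis
UNIT_SUPPORT = {(1, 1, 1), (2, 2, 2)}

print(is_antichain(W_SUPPORT))     # True
print(is_antichain(UNIT_SUPPORT))  # False
```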

Not all tensors are oblique. Obliqueness is not a generic property (see Proposition 6.21). However, many tensors that are of interest in algebraic complexity theory are oblique, notably the matrix multiplication tensors

$$\langle a, b, c\rangle = \sum_{i \in [a]} \sum_{j \in [b]} \sum_{k \in [c]} e_{ij} \otimes e_{jk} \otimes e_{ki} \in \mathbb{F}^{ab} \otimes \mathbb{F}^{bc} \otimes \mathbb{F}^{ca}.$$

For any finite set $X$, let $\mathcal{P}(X)$ be the set of all probability distributions on $X$. For any probability distribution $P \in \mathcal{P}(X)$, the Shannon entropy of $P$ is defined as $H(P) = -\sum_{x \in X} P(x) \log_2 P(x)$, with $0 \log_2 0$ understood as 0. Given finite sets $X_1, \ldots, X_k$ and a probability distribution $P \in \mathcal{P}(X_1 \times \cdots \times X_k)$ on the product set $X_1 \times \cdots \times X_k$, we denote the marginal distribution of $P$ on $X_i$ by $P_i$, that is, $P_i(a) = \sum_{x : x_i = a} P(x)$ for any $a \in X_i$.

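Definition 4.7. Let $\theta \in \Theta := \mathcal{P}([3])$. For $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3} \setminus \{0\}$ with $\operatorname{supp}(t)$ oblique, define

$$\zeta^{\theta}(t) = \max\bigl\{ 2^{\sum_{i=1}^{3} \theta(i) H(P_i)} : P \in \mathcal{P}(\operatorname{supp}(t)) \bigr\}.$$

We call the $\zeta^{\theta}$ for $\theta \in \Theta$ the support functionals.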
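To get a feeling for the definition, one can evaluate the quantity $2^{\sum_i \theta(i) H(P_i)}$ for a single distribution $P$ on a support. The following sketch is illustrative and uses hypothetical helper names. Any one $P$ only gives a lower bound on the maximum in the definition, and the example support $\{(1,1,2),(1,2,1),(2,1,1)\}$ is chosen here purely for illustration.

```python
from collections import defaultdict
from math import log2

def entropy(p):
    """Shannon entropy (base 2) of a distribution given as a dict value -> probability."""
    return -sum(q * log2(q) for q in p.values() if q > 0)

def marginal(P, i):
    """Marginal of P on coordinate i."""
    m = defaultdict(float)
    for x, px in P.items():
        m[x[i]] += px
    return dict(m)

def H_theta(P, theta):
    """Weighted sum of marginal entropies: sum_i theta(i) * H(P_i)."""
    return sum(th * entropy(marginal(P, i)) for i, th in enumerate(theta))

support = [(1, 1, 2), (1, 2, 1), (2, 1, 1)]
P = {x: 1 / 3 for x in support}          # uniform distribution on the support
theta = (1 / 3, 1 / 3, 1 / 3)
lower_bound = 2 ** H_theta(P, theta)     # a lower bound on the maximum over all P
print(round(lower_bound, 4))             # 1.8899
```

Each marginal of the uniform distribution is $(2/3, 1/3)$, so the printed value is $2^{h(1/3)}$ with $h$ the binary entropy function.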

Theorem 4.8. $\zeta^{\theta} \in \mathbf{X}(\{\text{oblique}\})$ for $\theta \in \Theta$.


We work towards the proof of Theorem 4.8. For $p \in [0,1]$ let $h(p)$ be the binary entropy function, $h(p) = -p \log_2 p - (1-p) \log_2 (1-p)$, i.e. $h(p)$ is the Shannon entropy of the probability vector $(p, 1-p)$. The following properties of the Shannon entropy are well known.

Lemma 4.9.

(i) $H(P \otimes Q) = H(P) + H(Q)$ for $P \in \mathcal{P}(X_1)$, $Q \in \mathcal{P}(X_2)$.

(ii) $H(P) \leq H(P_1) + H(P_2)$ for $P \in \mathcal{P}(X_1 \times X_2)$.

(iii) $H(pP \oplus (1-p)Q) = pH(P) + (1-p)H(Q) + h(p)$ for $P, Q \in \mathcal{P}(X)$, $p \in [0,1]$.

(iv) $2^a + 2^b = \max_{0 \leq p \leq 1} 2^{pa + (1-p)b + h(p)}$ for $a, b \in \mathbb{R}$.
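Property (iv) can be checked numerically. The sketch below is an illustration with hypothetical function names. It relies on the observation, which is not stated in the text, that the maximum is attained at $p^* = 2^a / (2^a + 2^b)$, and it compares the exact value with a coarse grid search over $p$.

```python
from math import log2

def h(p):
    """Binary entropy, with h(0) = h(1) = 0."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def rhs(a, b, p):
    """The expression 2^(p*a + (1-p)*b + h(p)) from Lemma 4.9(iv)."""
    return 2 ** (p * a + (1 - p) * b + h(p))

def check(a, b, grid=10_000):
    """Return (2^a + 2^b, value at the claimed maximiser, maximum over a grid of p)."""
    p_star = 2 ** a / (2 ** a + 2 ** b)
    grid_max = max(rhs(a, b, i / grid) for i in range(grid + 1))
    return 2 ** a + 2 ** b, rhs(a, b, p_star), grid_max

exact, at_p_star, grid_max = check(1.5, -0.3)
print(abs(exact - at_p_star) < 1e-9)   # True
print(grid_max <= exact + 1e-9)        # True
```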

For $X \subseteq [n_1] \times [n_2] \times [n_3]$, let $X_{\leq} = \{ y \in [n_1] \times [n_2] \times [n_3] : \exists x \in X,\ y \leq x \}$ be the downward closure of $X$. Let $\max(X) = \{ y \in X : \forall x \in X,\ y \leq x \Rightarrow y = x \}$ be the maximal points of $X$ with respect to $\leq$. Let $S_n$ be the symmetric group of permutations of $[n]$. Then the product group $S_{n_1} \times S_{n_2} \times S_{n_3}$ acts naturally on $[n_1] \times [n_2] \times [n_3]$.
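The downward closure and the set of maximal points are small combinatorial operations. The following sketch, with hypothetical helper names and 1-based indices as in the text, shows one way to compute them for a finite set of triples.

```python
from itertools import product

def leq(x, y):
    """Product order on tuples."""
    return all(a <= b for a, b in zip(x, y))

def downward_closure(X):
    """All y with positive integer coordinates such that y <= x for some x in X."""
    closure = set()
    for x in X:
        closure.update(product(*(range(1, xi + 1) for xi in x)))
    return closure

def maximal_points(X):
    """Elements of X that lie below no other element of X."""
    return {y for y in X if all(x == y or not leq(y, x) for x in X)}

X = {(1, 1, 2), (2, 1, 1), (1, 1, 1)}
print(sorted(maximal_points(X)))     # [(1, 1, 2), (2, 1, 1)]
print(sorted(downward_closure(X)))   # [(1, 1, 1), (1, 1, 2), (2, 1, 1)]
```

Taking the downward closure does not change the set of maximal points, which is the property used when passing between a support and its closure.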

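Lemma 4.10. Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$. For every $g \in G(t)$ there is a triple of permutations $w \in W(t) = S_{n_1} \times S_{n_2} \times S_{n_3}$ with $w \cdot \max(\operatorname{supp}(g \cdot t)) \subseteq \operatorname{supp}(t)_{\leq}$.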

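Proof. We prepare for the construction of $w$. Let $n \in \mathbb{N}$. Let $e_1, \ldots, e_n$ be the standard basis of $\mathbb{F}^n$. Let $g \in \mathrm{GL}_n$. Let $f_1, \ldots, f_n$ with $f_j = g \cdot e_j$ be the transformed basis of $\mathbb{F}^n$. Let $(E_i)_{i \in [n]}$ and $(F_j)_{j \in [n]}$ be the complete flags of $\mathbb{F}^n$ with

$$E_i = \operatorname{Span}\{e_i, e_{i+1}, \ldots, e_n\}, \qquad F_j = \operatorname{Span}\{f_j, f_{j+1}, \ldots, f_n\}.$$

Define the map

$$\pi : [n] \to [n] : j \mapsto \max\{ i \in [n] : E_i \cap (f_j + F_{j+1}) \neq \emptyset \}. \tag{4.3}$$

We prove $\pi$ is injective. Let $j, k \in [n]$ with $j \leq k$ and suppose $i = \pi(j) = \pi(k)$. Let $\mathbb{F}^{\times} = \mathbb{F} \setminus \{0\}$. From (4.3) follows

$$(\mathbb{F}^{\times} e_i + E_{i+1}) \cap (f_j + F_{j+1}) \neq \emptyset, \tag{4.4}$$

$$E_{i+1} \cap (f_j + F_{j+1}) = \emptyset, \tag{4.5}$$

$$(\mathbb{F}^{\times} e_i + E_{i+1}) \cap (f_k + F_{k+1}) \neq \emptyset. \tag{4.6}$$

Suppose $j < k$. Then from (4.4) and (4.6) we obtain a contradiction to (4.5). We conclude that $j = k$. Thus $\pi$ is injective.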


For each $\mathbb{F}^{n_i}$ define, as above, the standard complete flag $(E^i_j)_{j \in [n_i]}$ of $\mathbb{F}^{n_i}$, the complete flag $(F^i_j)_{j \in [n_i]}$ corresponding to the basis given by $g_i$, and the permutation $\pi_i : [n_i] \to [n_i]$. Let $w = (\pi_1, \pi_2, \pi_3) \in W(t)$.

We will prove $w \cdot \max(\operatorname{supp}(g \cdot t)) \subseteq \operatorname{supp}(t)_{\leq}$. Let $y \in \max(\operatorname{supp}(g \cdot t))$. Let $x = w \cdot y$. By construction of $\pi_i$, the intersection $E^i_{x_i} \cap (f^i_{y_i} + F^i_{y_i+1})$ is not empty. Choose

$$\tilde{f}^i_{y_i} \in E^i_{x_i} \cap (f^i_{y_i} + F^i_{y_i+1}).$$

Let $t^*$ be the multilinear map $\mathbb{F}^{n_1} \times \mathbb{F}^{n_2} \times \mathbb{F}^{n_3} \to \mathbb{F}$ with $t^*(e_i, e_j, e_k) = t_{ijk}$ for all $i \in [n_1]$, $j \in [n_2]$, $k \in [n_3]$. Then

$$t^*(\tilde{f}^1_{y_1}, \tilde{f}^2_{y_2}, \tilde{f}^3_{y_3}) = t^*(f^1_{y_1}, f^2_{y_2}, f^3_{y_3}) + \sum_{\substack{z \in [n_1] \times [n_2] \times [n_3] \\ z > y}} c_z\, t^*(f^1_{z_1}, f^2_{z_2}, f^3_{z_3}) \tag{4.7}$$

for some $c_z \in \mathbb{F}$. Since $y$ is maximal in $\operatorname{supp}(g \cdot t)$, the sum over $z > y$ in (4.7) equals zero. We conclude $t^*(\tilde{f}^1_{y_1}, \tilde{f}^2_{y_2}, \tilde{f}^3_{y_3}) = t^*(f^1_{y_1}, f^2_{y_2}, f^3_{y_3}) \neq 0$. Thus $t^*(E^1_{x_1} \times E^2_{x_2} \times E^3_{x_3})$ is not zero, and thus $x \in \operatorname{supp}(t)_{\leq}$.

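Proof of Theorem 4.8. We prove that $\zeta^{\theta}$ on oblique tensors is $\otimes$-multiplicative, $\oplus$-additive, $\leq$-monotone and normalised to 1 on $\langle 1\rangle = e_1 \otimes e_1 \otimes e_1$. The normalisation $\zeta^{\theta}(\langle 1\rangle) = 1$ is clear.

We prove $\zeta^{\theta}$ is $\otimes$-supermultiplicative. Let $s \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$ and let $t \in \mathbb{F}^{m_1} \otimes \mathbb{F}^{m_2} \otimes \mathbb{F}^{m_3}$. Let $P \in \mathcal{P}(\operatorname{supp}(s))$ and $Q \in \mathcal{P}(\operatorname{supp}(t))$. Then the product $P \otimes Q \in \mathcal{P}(\operatorname{supp}(s \otimes t))$ has marginals $P_i \otimes Q_i$. Since $H(P_i \otimes Q_i) = H(P_i) + H(Q_i)$ (Lemma 4.9(i)), we conclude $\zeta^{\theta}(s)\zeta^{\theta}(t) \leq \zeta^{\theta}(s \otimes t)$.

We prove $\zeta^{\theta}$ is $\otimes$-submultiplicative. For a distribution $P$ and $\theta \in \Theta$ we use the notation $H_{\theta}(P) = \sum_{i=1}^{3} \theta(i) H(P_i)$. We naturally identify $\operatorname{supp}(s \otimes t)$ with a subset of $[n_1] \times [n_2] \times [n_3] \times [m_1] \times [m_2] \times [m_3]$. Let $P \in \mathcal{P}(\operatorname{supp}(s \otimes t))$. Let $P_{[3]}$ be the marginal distribution of $P$ on $[n_1] \times [n_2] \times [n_3]$ and let $P_{3+[3]}$ be the marginal distribution of $P$ on $[m_1] \times [m_2] \times [m_3]$. Then $H_{\theta}(P) \leq H_{\theta}(P_{[3]}) + H_{\theta}(P_{3+[3]})$ by Lemma 4.9(ii). We conclude $\zeta^{\theta}(s \otimes t) \leq \zeta^{\theta}(s)\zeta^{\theta}(t)$.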

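We prove $\zeta^{\theta}$ is $\oplus$-additive. By definition,

$$\zeta^{\theta}(s \oplus t) = \max\bigl\{ 2^{H_{\theta}(P)} : P \in \mathcal{P}(\operatorname{supp}(s \oplus t)) \bigr\} = \max\Bigl\{ \max_{0 \leq p \leq 1} 2^{H_{\theta}(pP \oplus (1-p)Q)} : P \in \mathcal{P}(\operatorname{supp}(s)),\ Q \in \mathcal{P}(\operatorname{supp}(t)) \Bigr\}.$$

From Lemma 4.9(iii) and (iv) follows

\begin{align*}
\max\Bigl\{ \max_{0 \leq p \leq 1} 2^{H_{\theta}(pP \oplus (1-p)Q)} &: P \in \mathcal{P}(\operatorname{supp}(s)),\ Q \in \mathcal{P}(\operatorname{supp}(t)) \Bigr\} \\
&= \max\Bigl\{ \max_{0 \leq p \leq 1} 2^{pH_{\theta}(P) + (1-p)H_{\theta}(Q) + h(p)} : P \in \mathcal{P}(\operatorname{supp}(s)),\ Q \in \mathcal{P}(\operatorname{supp}(t)) \Bigr\} \\
&= \max\bigl\{ 2^{H_{\theta}(P)} + 2^{H_{\theta}(Q)} : P \in \mathcal{P}(\operatorname{supp}(s)),\ Q \in \mathcal{P}(\operatorname{supp}(t)) \bigr\} \\
&= \zeta^{\theta}(s) + \zeta^{\theta}(t).
\end{align*}

We conclude $\zeta^{\theta}(s \oplus t) = \zeta^{\theta}(s) + \zeta^{\theta}(t)$.

We prove $\zeta^{\theta}$ is $\leq$-monotone. Let $s \leq t$ with $\operatorname{supp}(s)$ and $\operatorname{supp}(t)$ oblique. Then there are linear maps $A_i$ with $s = (A_1 \otimes A_2 \otimes A_3) \cdot t$. If $A_1, A_2, A_3$ are of the form $\operatorname{diag}(1, \ldots, 1, 0, \ldots, 0)$, then $\zeta^{\theta}(s) \leq \zeta^{\theta}(t)$. Suppose $g = (A_1, A_2, A_3) \in G(t)$. Let $P \in \mathcal{P}(\operatorname{supp}(t))$ maximise $H_{\theta}$ on $\mathcal{P}(\operatorname{supp}(t))$. Let $\sigma \in W(t)$ be such that $\sigma \cdot P$ has non-increasing marginals. Then $H_{\theta}(\sigma \cdot P) = H_{\theta}(P)$ and $\sigma \cdot P$ maximises $H_{\theta}$ on $\mathcal{P}(\operatorname{supp}(\sigma \cdot t))$. Then $\sigma \cdot P$ maximises $H_{\theta}$ on $\mathcal{P}(\operatorname{supp}(\sigma \cdot t)_{\leq})$ by Lemma 4.12 below. Let $Q \in \mathcal{P}(\operatorname{supp}(g \cdot t))$ maximise $H_{\theta}$ on $\mathcal{P}(\operatorname{supp}(g \cdot t))$. By Lemma 4.10 there is a $w \in W(t)$ with $w \cdot \operatorname{supp}(g \cdot t) \subseteq \operatorname{supp}(\sigma \cdot t)_{\leq}$ (recall that $\operatorname{supp}(g \cdot t)$ is an antichain, so it equals its set of maximal points). Then $H_{\theta}(w \cdot Q) = H_{\theta}(Q) \leq H_{\theta}(\sigma \cdot P) = H_{\theta}(P)$. Thus $\max_{P \in \mathcal{P}(\operatorname{supp}(g \cdot t))} H_{\theta}(P) \leq \max_{P \in \mathcal{P}(\operatorname{supp}(t))} H_{\theta}(P)$. We conclude $\zeta^{\theta}(g \cdot t) \leq \zeta^{\theta}(t)$.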

The following two lemmas finish the above proof of Theorem 4.8. Recall that in the proof we defined $H_{\theta}(P) = \sum_{i=1}^{3} \theta(i) H(P_i)$ for $\theta \in \Theta$.

Lemma 4.11 ([Str91, Prop. 2.1]). Let $\Phi \subseteq [n_1] \times [n_2] \times [n_3]$. Let $P \in \mathcal{P}(\Phi)$. Let $\operatorname{supp}(P)$ be the support $\{x \in \Phi : P(x) \neq 0\}$. For $x \in \Phi$ define $h_P(x) = -\sum_{i=1}^{3} \theta(i) \log_2 P_i(x_i)$. Then $P$ maximises $H_{\theta}$ on $\mathcal{P}(\Phi)$ if and only if

$$\forall x \in \operatorname{supp}(P): \quad h_P(x) = \max_{y \in \Phi} h_P(y). \tag{4.8}$$

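Proof. We write $H_{\theta}(P)$ in terms of $h_P$:

$$H_{\theta}(P) = \sum_{i=1}^{3} \theta(i) H(P_i) = \sum_{x \in \operatorname{supp}(P)} P(x) h_P(x). \tag{4.9}$$

For $Q \in \mathcal{P}(\Phi)$,

\begin{align*}
\lim_{\varepsilon \to 0^+} \frac{d}{d\varepsilon} H_{\theta}\bigl((1-\varepsilon)P + \varepsilon Q\bigr)
&= \lim_{\varepsilon \to 0^+} \frac{d}{d\varepsilon} \sum_x \bigl((1-\varepsilon)P(x) + \varepsilon Q(x)\bigr)\, h_{(1-\varepsilon)P + \varepsilon Q}(x) \\
&= \sum_x P(x) \Bigl( \sum_{i=1}^{3} \theta(i) \frac{P_i(x_i) - Q_i(x_i)}{P_i(x_i) \ln(2)} \Bigr) + \sum_x \bigl(-P(x) + Q(x)\bigr) h_P(x) \\
&= \sum_x Q(x) h_P(x) - \sum_x P(x) h_P(x).
\end{align*}

Therefore, since $H_{\theta}$ is continuous and concave, $P$ maximises $H_{\theta}$ if and only if

$$\forall Q \in \mathcal{P}(\Phi): \quad \sum_x Q(x) h_P(x) - \sum_x P(x) h_P(x) \leq 0. \tag{4.10}$$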


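We will prove that (4.10) is equivalent to (4.8). Suppose $\sum_x Q(x) h_P(x) \leq \sum_x P(x) h_P(x)$ for every $Q \in \mathcal{P}(\Phi)$. In particular, taking $Q$ to be the point distribution at $y$, we get $h_P(y) \leq \sum_x P(x) h_P(x)$ for every $y \in \Phi$, so $\max_{y \in \Phi} h_P(y) \leq \sum_x P(x) h_P(x)$. Since the right-hand side is a weighted average of values of $h_P$, it is also at most the maximum, so $\max_{y \in \Phi} h_P(y) = \sum_x P(x) h_P(x)$. We conclude $\max_{y \in \Phi} h_P(y) = h_P(x)$ for every $x \in \operatorname{supp}(P)$.

Suppose $\max_{y \in \Phi} h_P(y) = h_P(x)$ for every $x \in \operatorname{supp}(P)$. Then $h_P(y) \leq h_P(x)$ for every $Q \in \mathcal{P}(\Phi)$, $y \in \operatorname{supp}(Q)$, $x \in \operatorname{supp}(P)$. We conclude $\sum_x Q(x) h_P(x) \leq \sum_x P(x) h_P(x)$.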
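The optimality criterion (4.8) can be tested directly for a candidate distribution. The sketch below is illustrative and uses hypothetical helper names. It treats $h_P(y)$ as $+\infty$ when some marginal $P_i(y_i)$ with $\theta(i) > 0$ vanishes, which is an interpretation assumed here for points outside the support of the marginals.

```python
from collections import defaultdict
from math import inf, log2

def marginals(P, k=3):
    """List of the k marginal distributions of P."""
    ms = [defaultdict(float) for _ in range(k)]
    for x, px in P.items():
        for i in range(k):
            ms[i][x[i]] += px
    return ms

def h_P(P, theta, x):
    """h_P(x) = -sum_i theta(i) * log2 P_i(x_i), with value +inf if a needed marginal is 0."""
    ms = marginals(P, len(theta))
    total = 0.0
    for i, th in enumerate(theta):
        if th == 0:
            continue
        q = ms[i].get(x[i], 0.0)
        if q == 0:
            return inf
        total -= th * log2(q)
    return total

def satisfies_criterion(Phi, P, theta, tol=1e-9):
    """Check condition (4.8): h_P is maximal over Phi at every point of supp(P)."""
    best = max(h_P(P, theta, y) for y in Phi)
    return all(abs(h_P(P, theta, x) - best) <= tol
               for x in Phi if P.get(x, 0) > 0)

Phi = [(1, 1, 2), (1, 2, 1), (2, 1, 1)]
theta = (1 / 3, 1 / 3, 1 / 3)
uniform = {x: 1 / 3 for x in Phi}
point_mass = {(1, 1, 2): 1.0}
print(satisfies_criterion(Phi, uniform, theta))     # True
print(satisfies_criterion(Phi, point_mass, theta))  # False
```

For the uniform weights $\theta$ used here, the uniform distribution on this three-element set passes the criterion by symmetry, while a point mass fails it.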

Lemma 4.12 ([Str91, Cor. 2.2]). Let $\Phi \subseteq [n_1] \times [n_2] \times [n_3]$. Let $P$ maximise $H_{\theta}$ on $\mathcal{P}(\Phi)$. Suppose $P_i$ is nonincreasing on $[n_i]$ for each $i \in [3]$. Then $P$ maximises $H_{\theta}$ on $\mathcal{P}(\Phi_{\leq})$, where $\Phi_{\leq}$ is the downward closure of $\Phi$ with respect to $\leq$.

Proof. We know $P$ satisfies (4.8). We will prove $P$ satisfies (4.8) with $\Phi$ replaced by $\Phi_{\leq}$. Then we are done by Lemma 4.11. Let $x \in \Phi_{\leq}$. Then $x \leq y$ for some $y \in \Phi$. Then $(P_1(x_1), P_2(x_2), P_3(x_3)) \geq (P_1(y_1), P_2(y_2), P_3(y_3))$, since each $P_i$ is nonincreasing. Then $h_P(x) \leq h_P(y)$. We conclude $\max_{\Phi_{\leq}} h_P \leq \max_{\Phi} h_P$. On the other hand, $\Phi \subseteq \Phi_{\leq}$. Therefore $\max_{\Phi} h_P \leq \max_{\Phi_{\leq}} h_P$.

Using the support functionals, Strassen managed to fully compute the asymptotic spectrum of several semirings generated by oblique tensors. We will see an example in Section 5.4.2.

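4.5 Upper and lower support functionals $\zeta^{\theta}$, $\zeta_{\theta}$

In Section 4.4 we defined the support functionals $\zeta^{\theta} : \{\text{oblique}\} \to \mathbb{R}_{\geq 0}$ and proved that $\zeta^{\theta} \in \mathbf{X}(\{\text{oblique}\})$. From the general theory of asymptotic spectra (Chapter 2) we know that $\zeta^{\theta}$ is the restriction of some map $\phi : \{\text{tensors}\} \to \mathbb{R}_{\geq 0}$ in $\mathbf{X}(\mathcal{T})$. In other words, we know that $\zeta^{\theta}$ can be extended to an element of $\mathbf{X}(\mathcal{T})$. However, the proof of that fact was non-constructive. In this short section we discuss a candidate extension proposed by Strassen, called the upper support functional. We also discuss a companion called the lower support functional.

For arbitrary $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$, the upper support functional and the lower support functional are defined as

$$\zeta^{\theta}(t) = \min_{g \in G(t)} \max\bigl\{ 2^{H_{\theta}(P)} : P \in \mathcal{P}(\operatorname{supp}(g \cdot t)) \bigr\},$$

$$\zeta_{\theta}(t) = \max_{g \in G(t)} \max\bigl\{ 2^{H_{\theta}(P)} : P \in \mathcal{P}(\max(\operatorname{supp}(g \cdot t))) \bigr\},$$

with $G(t) = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$ and $H_{\theta}(P) = \sum_{i=1}^{3} \theta(i) H(P_i)$. We summarise the known properties of the upper and lower support functional.

Theorem 4.13 ([Str91]). Let $s \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$ and $t \in \mathbb{F}^{m_1} \otimes \mathbb{F}^{m_2} \otimes \mathbb{F}^{m_3}$. Let $\theta \in \Theta = \mathcal{P}([3])$.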


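(i) $\zeta^{\theta}(\langle n\rangle) = n$ for $n \in \mathbb{N}$.

(ii) $\zeta^{\theta}(s \oplus t) = \zeta^{\theta}(s) + \zeta^{\theta}(t)$.

(iii) $\zeta^{\theta}(s \otimes t) \leq \zeta^{\theta}(s)\zeta^{\theta}(t)$.

(iv) If $s \geq t$, then $\zeta^{\theta}(s) \geq \zeta^{\theta}(t)$.

Theorem 4.14 ([Str91]). Let $s \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$ and $t \in \mathbb{F}^{m_1} \otimes \mathbb{F}^{m_2} \otimes \mathbb{F}^{m_3}$. Let $\theta \in \Theta$.

(i) $\zeta_{\theta}(\langle n\rangle) = n$ for $n \in \mathbb{N}$.

(ii) $\zeta_{\theta}(s \oplus t) \geq \zeta_{\theta}(s) + \zeta_{\theta}(t)$.

(iii) $\zeta_{\theta}(s \otimes t) \geq \zeta_{\theta}(s)\zeta_{\theta}(t)$.

(iv) If $s \geq t$, then $\zeta_{\theta}(s) \geq \zeta_{\theta}(t)$.

Theorem 4.15 ([Str91]). $\zeta^{\theta}(s \otimes t) \geq \zeta^{\theta}(s)\zeta_{\theta}(t)$ and $\zeta^{\theta}(t) \geq \zeta_{\theta}(t)$ for $\theta \in \Theta$.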

Regarding statement (ii) in Theorem 4.14, Bürgisser [Bur90] shows that the lower support functional $\zeta_{\theta}$ is not in general additive under the direct sum when $\theta_i > 0$ for all $i$. See also [Str91, Comment (iii)]. In particular, this implies that the upper support functional $\zeta^{\theta}(t)$ and the lower support functional $\zeta_{\theta}(t)$ are not equal in general, the upper support functional being additive. In fact, to show that the lower support functional is not additive, Bürgisser first shows that when $\mathbb{F}$ is algebraically closed, the generic value of $\zeta_{\theta}$ on $\mathbb{F}^n \otimes \mathbb{F}^n \otimes \mathbb{F}^n$ equals $(1 - \min_i \theta_i) \log_2 n + o(n)$. On the other hand, Tobler [Tob91] shows that the generic value of $\zeta^{\theta}$ on $\mathbb{F}^n \otimes \mathbb{F}^n \otimes \mathbb{F}^n$ equals $\log_2 n$. So even generically, $\zeta^{\theta}$ and $\zeta_{\theta}$ are different on $\mathbb{F}^n \otimes \mathbb{F}^n \otimes \mathbb{F}^n$.

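For $\theta \in \Theta$ we say $t$ is $\theta$-robust if $\zeta^{\theta}(t) = \zeta_{\theta}(t)$. We say $t$ is robust if $t$ is $\theta$-robust for all $\theta \in \Theta$. Let us try to understand what robust tensors look like. Since $\zeta^{\theta}(t) \geq \zeta_{\theta}(t)$ always holds (Theorem 4.15), a tensor $t$ is $\theta$-robust if and only if

$$\zeta^{\theta}(t) \leq \zeta_{\theta}(t). \tag{4.11}$$

The set of $\theta$-robust tensors is closed under $\oplus$ and $\otimes$, since

$$\zeta^{\theta}(s \oplus t) = \zeta^{\theta}(s) + \zeta^{\theta}(t) = \zeta_{\theta}(s) + \zeta_{\theta}(t) \leq \zeta_{\theta}(s \oplus t)$$

and

$$\zeta^{\theta}(s \otimes t) \leq \zeta^{\theta}(s)\zeta^{\theta}(t) = \zeta_{\theta}(s)\zeta_{\theta}(t) \leq \zeta_{\theta}(s \otimes t).$$

For $X \subseteq [n_1] \times [n_2] \times [n_3]$ we use the notation $H_{\theta}(X) = \max_{P \in \mathcal{P}(X)} H_{\theta}(P)$. Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3} \setminus \{0\}$. Equation (4.11) means that there are $g, h \in G(t)$ and $P \in \mathcal{P}(\max \operatorname{supp}(h \cdot t))$ such that $H_{\theta}(\operatorname{supp}(g \cdot t)) \leq H_{\theta}(P)$. In this case we have $\zeta^{\theta}(t) = \zeta_{\theta}(t) = 2^{H_{\theta}(P)}$. In particular, $t$ is $\theta$-robust if there is a $g \in G(t)$ such that the maximisation $H_{\theta}(\operatorname{supp}(g \cdot t))$ is attained by a $P \in \mathcal{P}(\max(\operatorname{supp}(g \cdot t)))$. This criterion is automatically satisfied for all $\theta$ when $\operatorname{supp}(g \cdot t) = \max(\operatorname{supp}(g \cdot t))$ for some $g \in G(t)$. Suppose $t$ is oblique. Then $\operatorname{supp}(g \cdot t)$ is an antichain for some $g \in G(t)$, and thus $\operatorname{supp}(g \cdot t) = \max \operatorname{supp}(g \cdot t)$. Then $t$ is robust and $\zeta^{\theta}(t) = \zeta_{\theta}(t) = 2^{H_{\theta}(\operatorname{supp}(g \cdot t))}$.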

4.6 Asymptotic slice rank

Slice rank is a variation on tensor rank that was introduced by Terence Tao in [Tao16] to study cap sets. We will look at cap sets in Section 5.4. Here we study the relationship between asymptotic slice rank and the support functionals.

Consider the following characterisation of tensor rank. Let a simple tensor be any tensor of the form $v_1 \otimes v_2 \otimes v_3 \in V_1 \otimes V_2 \otimes V_3$ with $v_i \in V_i$ for $i \in [3]$. Then the rank $R(t)$ of $t \in V_1 \otimes V_2 \otimes V_3$ is the smallest number $r$ such that $t$ can be written as a sum of $r$ simple tensors.

Slice rank is defined similarly, but with simple tensors replaced by slices. For $S \subseteq [3]$ let $V_S = \bigotimes_{i \in S} V_i$. For $j \in [3]$ let $\bar{j} = [3] \setminus \{j\}$. A tensor in $V_1 \otimes V_2 \otimes V_3$ is called a slice if it is of the form $v \otimes w$ with $v \in V_j$ and $w \in V_{\bar{j}}$ for some $j \in [3]$ (under the natural reordering of the tensor legs). Let $t \in V_1 \otimes V_2 \otimes V_3$. The slice rank of $t$, denoted by $\mathrm{SR}(t)$, is the smallest number $r$ such that $t$ can be written as a sum of $r$ slices. For example, the tensor

$$W = e_1 \otimes e_1 \otimes e_2 + e_1 \otimes e_2 \otimes e_1 + e_2 \otimes e_1 \otimes e_1 \in \mathbb{F}^2 \otimes \mathbb{F}^2 \otimes \mathbb{F}^2 \tag{4.12}$$

has slice rank 2, since we can write $W = e_1 \otimes (e_1 \otimes e_2 + e_2 \otimes e_1) + e_2 \otimes e_1 \otimes e_1$. In fact, the slice rank of any element in $V_1 \otimes V_2 \otimes V_3$ is at most $\min_i \dim V_i$. The tensor rank of $W$, on the other hand, is known to be 3.
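The two-slice decomposition of $W$ can be verified by expanding it. The following sketch is illustrative, with hypothetical helper names. It stores a vector as a dictionary from an index to a coefficient and a 2-tensor as a dictionary from index pairs to coefficients, and it only confirms that the stated decomposition sums to $W$ (so $\mathrm{SR}(W) \leq 2$), not that one slice is insufficient.

```python
from collections import defaultdict
from itertools import product

def slice_tensor(j, v, w):
    """The slice v (x) w with v on leg j (0-based) and w on the remaining two legs."""
    t = defaultdict(int)
    for (a, va), ((b, c), wbc) in product(v.items(), w.items()):
        idx = [b, c]
        idx.insert(j, a)          # put the index of v back on leg j
        t[tuple(idx)] += va * wbc
    return t

def add(*tensors):
    """Sum of tensors, dropping zero coefficients."""
    total = defaultdict(int)
    for t in tensors:
        for idx, coeff in t.items():
            total[idx] += coeff
    return {idx: c for idx, c in total.items() if c != 0}

W = {(1, 1, 2): 1, (1, 2, 1): 1, (2, 1, 1): 1}
first = slice_tensor(0, {1: 1}, {(1, 2): 1, (2, 1): 1})   # e1 (x) (e1(x)e2 + e2(x)e1)
second = slice_tensor(0, {2: 1}, {(1, 1): 1})             # e2 (x) (e1(x)e1)
print(add(first, second) == W)   # True
```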

Slice rank is clearly monotone under restriction. The slice rank of the diagonal tensor $\langle r\rangle$ equals $r$ [Tao16]. It follows that subrank is at most slice rank:

$$Q(t) \leq \mathrm{SR}(t).$$

The motivation for the introduction of slice rank in [Tao16] was finding upper bounds on subrank $Q(t)$ and asymptotic subrank $\tilde{Q}(t)$.

The main result of this section is the following theorem. Recall that a tensor $t$ is oblique if the support $\operatorname{supp}(g \cdot t)$ is an antichain for some $g \in G(t)$.

Theorem 4.16. Let $t$ be oblique. Then

$$\lim_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n} = \min_{\theta \in \mathcal{P}([3])} \zeta^{\theta}(t).$$

Our proof of Theorem 4.16 is based on a proof of Tao and Sawin in [TS16] and discussions of the author with Dion Gijswijt. The explicit connection between asymptotic slice rank and the support functionals is new.


We use Theorem 4.16, before giving its proof, to see that $\mathrm{SR}$ is not submultiplicative and not supermultiplicative under the tensor product $\otimes$. In particular, we cannot use Fekete's lemma (Lemma 2.2) to prove that the limit $\lim_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n}$ exists. Thus the existence of the limit is a non-trivial consequence of Theorem 4.16.

Let $W$ be as in (4.12). Then $\mathrm{SR}(W) = 2$. We have $\zeta^{(1/3,1/3,1/3)}(W) = 2^{h(1/3)} < 2$. From Theorem 4.16 follows $\mathrm{SR}(W^{\otimes n}) \leq 2^{n h(1/3) + o(n)}$. We conclude $\mathrm{SR}(W^{\otimes n}) < 2^n$ for $n$ large enough. We conclude $\mathrm{SR}$ is not supermultiplicative. Now it is also clear that slice rank is not the same as (border) subrank, since (border) subrank is supermultiplicative.

Next, the tensors $\sum_{i=1}^{n} e_i \otimes e_i \otimes 1$, $\sum_{i=1}^{n} e_i \otimes 1 \otimes e_i$, $\sum_{i=1}^{n} 1 \otimes e_i \otimes e_i$ have slice rank one, while their tensor product equals the matrix multiplication tensor $\langle n, n, n\rangle$, which has slice rank $n^2$ by Theorem 4.16 and Theorem 5.3 in the next chapter applied to the tight tensor $\langle n, n, n\rangle$. We conclude $\mathrm{SR}$ is not submultiplicative.

Slice rank and hitting set number

We study the hitting set number of the support of a tensor. Let $\Phi \subseteq [n_1] \times [n_2] \times [n_3]$. A hitting set for $\Phi$ is a 3-tuple of sets $A_1 \subseteq [n_1]$, $A_2 \subseteq [n_2]$, $A_3 \subseteq [n_3]$ such that for every $a \in \Phi$ there is an $i \in [3]$ with $a_i \in A_i$. We may think of $\Phi$ as a 3-partite 3-uniform hypergraph. Then the definition of hitting set says that every edge $a \in \Phi$ is hit by an element of some $A_i$. A hitting set is also called a vertex cover (every edge being covered by some vertex) or a transversal. The size of the hitting set $(A_1, A_2, A_3)$ is $|A_1| + |A_2| + |A_3|$. The hitting set number $\tau(\Phi)$ is the size of the smallest hitting set for $\Phi$. Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$.
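For small supports the hitting set number can be found by exhaustive search. The sketch below is an illustration only: the function name is hypothetical, and the search is exponential in the number of vertices, so it is suitable only for toy examples.

```python
from itertools import combinations

def hitting_set_number(Phi):
    """Brute-force tau(Phi): fewest (coordinate, value) pairs that hit every edge of Phi."""
    Phi = list(Phi)
    if not Phi:
        return 0
    # A vertex of the 3-partite hypergraph is a pair (coordinate, value).
    vertices = sorted({(i, a[i]) for a in Phi for i in range(3)})
    for size in range(1, len(vertices) + 1):
        for chosen in combinations(vertices, size):
            chosen = set(chosen)
            if all(any((i, a[i]) in chosen for i in range(3)) for a in Phi):
                return size

W_SUPPORT = [(1, 1, 2), (1, 2, 1), (2, 1, 1)]
DIAGONAL = [(i, i, i) for i in range(1, 4)]
print(hitting_set_number(W_SUPPORT))  # 2
print(hitting_set_number(DIAGONAL))   # 3
```

For the first example no single vertex hits all three edges, while the vertex with value 1 in the first coordinate together with the vertex with value 1 in the second coordinate does.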

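Lemma 4.17. Let $g \in G(t) = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$. Then $\mathrm{SR}(t) \leq \tau(\operatorname{supp}(g \cdot t))$.

Proof. This is clear: slice rank is invariant under the action of $G(t)$, and given a hitting set $(A_1, A_2, A_3)$ for $\operatorname{supp}(g \cdot t)$ we can write $g \cdot t$ as a sum of $|A_1| + |A_2| + |A_3|$ slices, one slice of the form $e_a \otimes w$ for each $a \in A_j$.

Lemma 4.18. Let $g \in G(t)$. Then $\mathrm{SR}(t) \geq \tau(\max(\operatorname{supp}(g \cdot t)))$.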

Proof. It is sufficient to consider $g = e$. Let

$$t = \sum_{i=1}^{r_1} v^1_i \otimes u^1_i + \sum_{i=1}^{r_2} v^2_i \otimes u^2_i + \sum_{i=1}^{r_3} v^3_i \otimes u^3_i$$

be a slice decomposition, with $v^j_i \in \mathbb{F}^{n_j}$ on leg $j$ and $u^j_i$ on the remaining two legs. We may assume $v^j_1, \ldots, v^j_{r_j}$ are linearly independent. Let $V_j = \operatorname{Span}\{v^j_1, \ldots, v^j_{r_j}\} \subseteq \mathbb{F}^{n_j}$. Let $W_j \subseteq (\mathbb{F}^{n_j})^*$ be the elements in the dual space that vanish on $V_j$. Let $B_j \subseteq W_j$ be a basis with the following property: with respect to the standard basis, the matrix with the elements of $B_j$ as columns is in echelon form, i.e. each column is of the form $(0 \cdots 0\ 1 \ast \cdots \ast)^T$ and the pivot elements (the 1's) are all in different rows. Let $\bar{S}_j \subseteq [n_j]$ be the indices of the pivot elements. Let $S_j = [n_j] \setminus \bar{S}_j$ be the complement. Then $|S_j| = r_j$. We claim $(S_1, S_2, S_3)$ is a hitting set for $\max(\operatorname{supp}(t))$. Then $r_1 + r_2 + r_3 = |S_1| + |S_2| + |S_3| \geq \tau(\max(\operatorname{supp}(t)))$.

Let $x \in \max(\operatorname{supp}(t))$. Suppose $x \in \bar{S}_1 \times \bar{S}_2 \times \bar{S}_3$. For every $j \in [3]$ let $\phi_j \in B_j$ have its pivot element at index $x_j$, so that $\phi_j(e_{x_j}) = 1$ and $\phi_j(e_a) = 0$ for $a < x_j$. Let $\phi = \phi_1 \otimes \phi_2 \otimes \phi_3$. Then $\phi \in W_1 \otimes W_2 \otimes W_3$, so $\phi(t) = 0$, because every slice in the decomposition is annihilated by one of the factors $\phi_j$. Since $x$ is maximal in $\operatorname{supp}(t)$ and each $B_j$ is in echelon form,

$$\phi(t) = \sum_{y \geq x} t_y\, \phi(e_{y_1} \otimes e_{y_2} \otimes e_{y_3}) = t_x + \sum_{y > x} t_y\, \phi(e_{y_1} \otimes e_{y_2} \otimes e_{y_3}) = t_x,$$

where the last sum vanishes because $t_y = 0$ for all $y > x$. From $\phi(t) = 0$ follows $t_x = 0$. This contradicts $x \in \operatorname{supp}(t)$, so $x \notin \bar{S}_1 \times \bar{S}_2 \times \bar{S}_3$, i.e. there is a $j \in [3]$ with $x_j \in S_j$.

Asymptotic hitting set number

We now study the asymptotic hitting set number $\tilde{\tau}(\Phi) = \lim_{n \to \infty} \tau(\Phi^{\times n})^{1/n}$. We will use some basic facts of types and type classes. Let $X$ be a finite set. Let $N \in \mathbb{N}$. An $N$-type on $X$ is a probability distribution $P$ on $X$ with $N \cdot P(x) \in \mathbb{N}$ for all $x \in X$. Let $P$ be an $N$-type on $X$. The type class $T^N_P \subseteq X^N$ is the set of sequences $s = (s_1, \ldots, s_N)$ with $x$ occurring $N \cdot P(x)$ times in $s$ for every $x \in X$, i.e. $|\{i \in [N] : s_i = x\}| = N \cdot P(x)$.

Lemma 4.19. The number of $N$-types on $X$ equals $\binom{N + |X| - 1}{|X| - 1}$. Let $P$ be an $N$-type. The size of the type class $T^N_P$ equals the multinomial coefficient $\binom{N}{NP}$.

Proof. We leave the proof to the reader.

Lemma 4.20. Let $P$ be an $N$-type on $X$. Then

$$\frac{1}{(N+1)^{|X|}}\, 2^{N H(P)} \leq \binom{N}{NP} \leq 2^{N H(P)}.$$

Proof. See e.g. [CT12, Theorem 11.1.3].
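The counting statements in Lemma 4.19 and the entropy bounds in Lemma 4.20 can be checked exhaustively for small parameters. The following sketch is illustrative, with hypothetical function names. It represents an $N$-type by its vector of counts $(N \cdot P(x))_{x \in X}$ and allows a small relative tolerance for floating-point error.

```python
from itertools import product
from math import comb, factorial, log2

def n_types(N, size):
    """All count vectors (N*P(x))_x of N-types on a set with `size` elements."""
    return [c for c in product(range(N + 1), repeat=size) if sum(c) == N]

def type_class_size(counts):
    """Multinomial coefficient N! / prod_x (N*P(x))!."""
    out = factorial(sum(counts))
    for c in counts:
        out //= factorial(c)
    return out

def entropy_of_counts(counts):
    """Shannon entropy (base 2) of the type with the given counts."""
    N = sum(counts)
    return -sum(c / N * log2(c / N) for c in counts if c > 0)

def check_bounds(N, size):
    """Check the two-sided bound of Lemma 4.20 for every N-type on a set of `size` elements."""
    for counts in n_types(N, size):
        T = type_class_size(counts)
        upper = 2 ** (N * entropy_of_counts(counts))
        lower = upper / (N + 1) ** size
        if not (lower <= T * (1 + 1e-9) and T <= upper * (1 + 1e-9)):
            return False
    return True

print(len(n_types(6, 3)) == comb(6 + 3 - 1, 3 - 1))  # True
print(type_class_size((3, 2, 1)))                    # 60
print(check_bounds(6, 3))                            # True
```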

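Lemma 4.21. $\log_2 \tilde{\tau}(\Phi) \leq \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} H(P_i)$.

Proof. Let $P$ maximise $\min_i H(P_i)$ over $\mathcal{P}(\Phi)$. Let $n \in \mathbb{N}$. We construct a hitting set $(A_1, A_2, A_3)$ for $\Phi^{\times n}$ as follows. Let $x \in \Phi^{\times n}$. Viewing $x$ as an $n$-tuple of elements in $\Phi$, let $Q \in \mathcal{P}_n(\Phi)$ be the type of $x$ (i.e. the empirical distribution), where $\mathcal{P}_n(\Phi)$ denotes the set of $n$-types on $\Phi$. Let $j \in [3]$ with $H(Q_j) = \min_{i \in [3]} H(Q_i)$. By our choice of $P$ we have

$$H(Q_j) = \min_{i \in [3]} H(Q_i) \leq \min_{i \in [3]} H(P_i).$$

Viewing $x$ as a 3-tuple $(x_1, x_2, x_3)$, add $x_j$ to $A_j$. We repeat this for all $x \in \Phi^{\times n}$. The final $(A_1, A_2, A_3)$ is a hitting set for $\Phi^{\times n}$ by construction. For each $j \in [3]$,

$$|A_j| \leq \sum_{Q_j} |T^n_{Q_j}| \leq \sum_{Q_j} 2^{n H(Q_j)},$$

where the sum is over $Q_j \in \mathcal{P}_n(\Phi_j)$ with $H(Q_j) \leq \min_{i \in [3]} H(P_i)$. Then

$$|A_j| \leq |\mathcal{P}_n(\Phi_j)|\, 2^{n \min_i H(P_i)} = \mathrm{poly}(n)\, 2^{n \min_i H(P_i)}.$$

We conclude $|A_1| + |A_2| + |A_3| \leq \mathrm{poly}(n)\, 2^{n \min_i H(P_i)}$.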

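Lemma 4.22. $\log_2 \tilde{\tau}(\Phi) \geq \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} H(P_i)$.

Proof. Let $P$ maximise $\min_i H(P_i)$ over $\mathcal{P}(\Phi)$. Let $n \in \mathbb{N}$. Let $(A_1, A_2, A_3)$ be a hitting set for $\Phi^{\times n}$. Let $Q \in \mathcal{P}_n(\Phi)$ be an $n$-type with $\min_i H(Q_i) = \min_i H(P_i) - o(1)$. Let $\Psi = T^n_Q \subseteq \Phi^{\times n}$ be the set of strings with type $Q$. Then $(A_1, A_2, A_3)$ is a hitting set for $\Psi$. Let $\pi_i : \Psi \to \Phi_i^{\times n} : (x_1, x_2, x_3) \mapsto x_i$ and let $\Psi_i = \pi_i(\Psi)$. Then

$$\Psi = \pi_1^{-1}(A_1) \cup \pi_2^{-1}(A_2) \cup \pi_3^{-1}(A_3).$$

Let $j \in [3]$ with $|\pi_j^{-1}(A_j)| \geq \tfrac{1}{3}|\Psi|$. The fiber $\pi_j^{-1}(a)$ has constant size over $a \in \Psi_j$. Let $c_j = |\pi_j^{-1}(a)|$ be this size. Then

$$|\Psi| = \sum_{a \in \Psi_j} |\pi_j^{-1}(a)| = \sum_{a \in \Psi_j} c_j = |\Psi_j|\, c_j.$$

And

$$|\pi_j^{-1}(A_j)| = \sum_{a \in A_j \cap \Psi_j} |\pi_j^{-1}(a)| = |A_j \cap \Psi_j|\, c_j \leq |A_j|\, c_j.$$

Therefore

$$|A_j| \geq \frac{|\pi_j^{-1}(A_j)|}{c_j} \geq \frac{\tfrac{1}{3}|\Psi|}{c_j} = \tfrac{1}{3}|\Psi_j|.$$

We have $|\Psi_j| \geq 2^{n H(Q_j) - o(n)} \geq 2^{n \min_i H(Q_i) - o(n)} \geq 2^{n \min_i H(P_i) - o(n)}$. We conclude $|A_1| + |A_2| + |A_3| \geq |A_j| \geq \tfrac{1}{3}|\Psi_j| \geq \tfrac{1}{3}\, 2^{n \min_i H(P_i) - o(n)}$.

Lemma 4.23. $\log_2 \tilde{\tau}(\Phi) = \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} H(P_i)$.

Proof. This follows directly from Lemma 4.21 and Lemma 4.22.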


Asymptotic slice rank

We now combine the above lemmas about slice rank and the asymptotic hitting set number to prove Theorem 4.16. First we have the following basic lemma.

Lemma 4.24. $\min_{\theta \in \Theta} \max_{P \in \mathcal{P}(\Phi)} H_{\theta}(P) = \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} H(P_i)$.

Proof. Since $H_{\theta}(P)$ is convex in $\theta$ and concave in $P$, von Neumann's minimax theorem gives $\min_{\theta} \max_{P \in \mathcal{P}(\Phi)} H_{\theta}(P) = \max_{P \in \mathcal{P}(\Phi)} \min_{\theta} H_{\theta}(P)$. Finally, we use that $\min_{\theta} H_{\theta}(P) = \min_i H(P_i)$.

Define $f^{\sim}(t) = \limsup_{n \to \infty} f(t^{\otimes n})^{1/n}$ and $f_{\sim}(t) = \liminf_{n \to \infty} f(t^{\otimes n})^{1/n}$.

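Lemma 4.25. Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$. Then

$$\max_{g \in G(t)} \max_{P \in \mathcal{P}(\max \operatorname{supp}(g \cdot t))} \min_i 2^{H(P_i)} \leq \mathrm{SR}_{\sim}(t) \leq \mathrm{SR}^{\sim}(t) \leq \min_{\theta} \zeta^{\theta}(t).$$

Proof. By definition, $\mathrm{SR}_{\sim}(t) \leq \mathrm{SR}^{\sim}(t)$. From Lemma 4.17 follows

$$\mathrm{SR}^{\sim}(t) \leq \tilde{\tau}(\operatorname{supp}(g \cdot t))$$

for any $g \in G(t)$. Lemma 4.23 gives $\tilde{\tau}(\operatorname{supp}(g \cdot t)) = \max_{P \in \mathcal{P}(\operatorname{supp}(g \cdot t))} \min_i 2^{H(P_i)}$. Thus, with the help of Lemma 4.24,

$$\mathrm{SR}^{\sim}(t) \leq \min_{g \in G(t)} \max_{P \in \mathcal{P}(\operatorname{supp}(g \cdot t))} \min_i 2^{H(P_i)} = \min_{\theta} \zeta^{\theta}(t).$$

From Lemma 4.18 follows

$$\tilde{\tau}(\max(\operatorname{supp}(g \cdot t))) \leq \mathrm{SR}_{\sim}(t)$$

for any $g \in G(t)$. Lemma 4.23 gives

$$\max_{g \in G(t)} \max_{P \in \mathcal{P}(\max(\operatorname{supp}(g \cdot t)))} \min_i 2^{H(P_i)} \leq \mathrm{SR}_{\sim}(t).$$

This proves the lemma.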

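Proof of Theorem 4.16. We may assume $\Phi = \operatorname{supp}(t)$ is oblique. Then, with the help of Lemma 4.24 and Lemma 4.25,

\begin{align*}
\min_{\theta \in \Theta} \zeta^{\theta}(t) &= \min_{\theta \in \Theta} \zeta_{\theta}(t) \\
&= \min_{\theta \in \Theta} \max_{P \in \mathcal{P}(\max(\Phi))} 2^{H_{\theta}(P)} \\
&= \max_{P \in \mathcal{P}(\max(\Phi))} \min_{i \in [3]} 2^{H(P_i)} \\
&\leq \max_{g \in G(t)} \max_{P \in \mathcal{P}(\max(\operatorname{supp}(g \cdot t)))} \min_{i \in [3]} 2^{H(P_i)} \\
&\leq \mathrm{SR}_{\sim}(t) \\
&\leq \mathrm{SR}^{\sim}(t) \\
&\leq \min_{\theta \in \Theta} \zeta^{\theta}(t).
\end{align*}

This proves the claim.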


4.7 Conclusion

The study of asymptotic rank of tensors is motivated by the open problem of finding the exponent of matrix multiplication. Asymptotic subrank has applications in, for example, combinatorics and algebraic property testing. Via the theory of asymptotic spectra, Strassen characterised asymptotic rank and asymptotic subrank in terms of the asymptotic spectrum of tensors. Strassen introduced the gauge points in $\mathbf{X}(\mathcal{T})$ and the support functionals in $\mathbf{X}(\{\text{oblique}\})$. More precisely, there are the lower support functionals and the upper support functionals. The lower support functionals are not additive and can thus not be universal spectral points. The upper support functionals may be universal spectral points, but this cannot be shown with the help of the lower support functionals. Finally, we showed that for oblique tensors the asymptotic slice rank exists and equals the minimum value over the support functionals. In the next chapter we will see a subfamily of the oblique 3-tensors for which the support functionals are powerful enough to compute the asymptotic subrank.

Chapter 5

Tight tensors and combinatorial subrank; cap sets

This chapter is based on joint work with Matthias Christandl and Peter Vrana [CVZ16, CVZ18].

5.1 Introduction

In the previous chapter we discussed the gauge points and the support functionals $\zeta^{\theta}$. The gauge points are in the asymptotic spectrum of all tensors, while the support functionals are in the asymptotic spectrum of oblique tensors.

How "powerful" are the support functionals? We know $\tilde{Q}(t) \leq \zeta^{\theta}(t) \leq \tilde{R}(t)$ for oblique $t$. Thus $\max_{\theta} \zeta^{\theta}(t) \leq \tilde{R}(t)$. In fact, $\max_{\theta} \zeta^{\theta}(t)$ is at most the maximum over the gauge points $\max_S \zeta^{(S)}(t)$, and in turn $\max_S \zeta^{(S)}(t)$ is at most $\tilde{R}(t)$. As remarked earlier, it is not known whether $\max_S \zeta^{(S)}(t)$ equals $\tilde{R}(t)$ in general.

On the other hand, we have $\tilde{Q}(t) \leq \min_{\theta} \zeta^{\theta}(t)$. Do we attain equality here in general, i.e. $\tilde{Q}(t) = \min_{\theta} \zeta^{\theta}(t)$? The answer is "yes" for the subsemiring of tight 3-tensors. In this chapter we study tight $k$-tensors.

Tight tensors

Let $I_1, \ldots, I_k$ be finite sets. Let $\Phi \subseteq I_1 \times \cdots \times I_k$. We say $\Phi$ is tight if there are injective maps $u_i : I_i \to \mathbb{Z}$ for $i \in [k]$ such that

$$\forall \alpha \in \Phi: \quad u_1(\alpha_1) + \cdots + u_k(\alpha_k) = 0.$$

We say $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ is tight if there is a $g \in G(t) = \mathrm{GL}_{n_1} \times \cdots \times \mathrm{GL}_{n_k}$ such that the support $\operatorname{supp}(g \cdot t)$ is tight.

Recall that a tensor is oblique if the support is an antichain in some basis. Clearly, tight tensors are oblique: ordering each $I_i$ via $u_i$, two comparable elements of $\Phi$ whose images both sum to zero must agree in every coordinate. To summarise the families of tensors that we have defined up to now, we have

$$\{\text{tight}\} \subseteq \{\text{oblique}\} \subseteq \{\text{robust}\} \subseteq \{\theta\text{-robust}\}.$$

Recall that the families of oblique, robust and $\theta$-robust tensors each form a semiring under $\otimes$ and $\oplus$. Tight tensors have the same property [Str91, Section 5]. Another property is that any subset of a tight set is tight.

Example 5.1. Let $k \geq 3$ be fixed. For any integer $n \geq 1$ and $c \in [n]$, the set

$$\Phi_n(c) = \{ \alpha \in \{0, \ldots, n-1\}^k : \alpha_1 + \cdots + \alpha_k = c \}$$

is tight. For any integer $n \geq 2$ and any $c \in [n]$, the set

$$\Psi_n(c) = \{ \alpha \in \{0, \ldots, n-1\}^k : \alpha_1 + \cdots + \alpha_k = c \bmod n \}$$

is not tight (cf. Exercise 15.20 in [BCS97]).
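Tightness of $\Phi_n(c)$ can be witnessed by explicit injective maps. The sketch below is an illustration with hypothetical helper names. It assumes the particular witness $u_i(a) = a$ for $i < k$ and $u_k(a) = a - c$, which is one natural choice and not necessarily the one intended in the text, and it only verifies a given witness rather than deciding tightness in general.

```python
from itertools import product

def phi(n, k, c):
    """The set of alpha in {0,...,n-1}^k whose coordinates sum to c."""
    return [a for a in product(range(n), repeat=k) if sum(a) == c]

def is_tight_witness(support, maps):
    """Check that the maps u_i (dicts I_i -> Z) are injective and sum to zero on the support."""
    injective = all(len(set(m.values())) == len(m) for m in maps)
    zero_sum = all(sum(m[ai] for m, ai in zip(maps, a)) == 0 for a in support)
    return injective and zero_sum

def witness_for_phi(n, k, c):
    """Assumed witness: identity on the first k-1 coordinates, shift by -c on the last."""
    maps = [{a: a for a in range(n)} for _ in range(k - 1)]
    maps.append({a: a - c for a in range(n)})
    return maps

support = phi(3, 3, 2)
print(len(support))                                         # 6
print(is_tight_witness(support, witness_for_phi(3, 3, 2)))  # True
```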

Example 5.2. When $\mathbb{F}$ contains a primitive $n$th root of unity $\zeta$, the tensor

$$t_n = \sum_{\alpha \in \Psi_n(n-1)} e_{\alpha_1} \otimes \cdots \otimes e_{\alpha_k} \in (\mathbb{F}^n)^{\otimes k},$$

which has support $\Psi_n(n-1)$, is tight. Namely, the elements $v_j = \sum_{i=1}^{n} \zeta^{ij} e_i$ for $j \in [n]$ form a basis of $\mathbb{F}^n$. Let $g \in G(t_n)$ be the corresponding basis transformation. Then we have $t_n = \sum_{j=1}^{n} v_j \otimes \cdots \otimes v_j$, and we see that the support $\operatorname{supp}(g \cdot t_n) = \{ \alpha \in [n]^k : \alpha_1 = \cdots = \alpha_k \}$ is tight. (See also [BCS97, Exercise 15.25].) When the characteristic of $\mathbb{F}$ equals $n$, the tensor $t_n$ is also tight, as we will see in Section 5.4.2.

Combinatorial subrank and the Coppersmith–Winograd method

We care about tight tensors because of a remarkable theorem of Strassen for tight 3-tensors (Theorem 5.3 below). To understand the theorem we need the concept of combinatorial asymptotic subrank (cf. [Str91, Section 5]). We say $D \subseteq I_1 \times \cdots \times I_k$ is a diagonal when any two distinct $\alpha, \beta \in D$ are distinct in all $k$ coordinates. In other words, for elements in $D$, the value at one coordinate uniquely determines the value at the other $k - 1$ coordinates. Let $\Phi \subseteq I_1 \times \cdots \times I_k$. We say a diagonal $D \subseteq I_1 \times \cdots \times I_k$ is free for $\Phi$, or simply $D \subseteq \Phi$ is a free diagonal, if $D = \Phi \cap (D_1 \times \cdots \times D_k)$, where $D_i = \{ x_i : (x_1, \ldots, x_k) \in D \}$. Define the (combinatorial) subrank $Q(\Phi)$ as the size of the largest free diagonal $D \subseteq \Phi$. For $\Phi \subseteq I_1 \times \cdots \times I_k$ and $\Psi \subseteq J_1 \times \cdots \times J_k$ we naturally define the product $\Phi \times \Psi \subseteq (I_1 \times J_1) \times \cdots \times (I_k \times J_k)$ by

$$\Phi \times \Psi = \{ ((\alpha_1, \beta_1), \ldots, (\alpha_k, \beta_k)) : \alpha \in \Phi,\ \beta \in \Psi \}.$$


Define the (combinatorial) asymptotic subrank $\tilde{Q}(\Phi) = \lim_{n \to \infty} Q(\Phi^{\times n})^{1/n}$. Let $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ and let $\Phi$ be the support of $t$ in the standard basis. Then $Q(\Phi) \leq Q(t)$ and $\tilde{Q}(\Phi) \leq \tilde{Q}(t)$. The number $Q(\Phi)$ may be interpreted as the largest number $n$ such that $\langle n\rangle$ can be obtained from $t$ using a restriction that consists of matrices that have at most one nonzero entry in each row and in each column. (This is called M-restriction in [Str87, Section 6], which stands for monomial restriction.) We may also interpret $\Phi$ as a $k$-partite hypergraph. Then $Q(\Phi)$ is the size of the largest induced $k$-partite matching in $\Phi$.
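For very small supports the combinatorial subrank $Q(\Phi)$ can be computed by brute force directly from the definition of a free diagonal. The following sketch is illustrative, with hypothetical function names, and its running time is exponential in $|\Phi|$.

```python
from itertools import combinations, product

def is_diagonal(D, k):
    """Any two distinct elements of D differ in all k coordinates."""
    return all(len({x[i] for x in D}) == len(D) for i in range(k))

def is_free(D, Phi, k):
    """D equals the restriction of Phi to the box D_1 x ... x D_k."""
    box = [{x[i] for x in D} for i in range(k)]
    restricted = {a for a in Phi if all(a[i] in box[i] for i in range(k))}
    return restricted == set(D)

def subrank(Phi, k=3):
    """Size of the largest free diagonal contained in Phi (brute force)."""
    Phi = list(set(Phi))
    for size in range(len(Phi), 0, -1):
        for D in combinations(Phi, size):
            if is_diagonal(D, k) and is_free(D, Phi, k):
                return size
    return 0

print(subrank([(1, 1, 2), (1, 2, 1), (2, 1, 1)]))     # 1
print(subrank([(1, 1, 1), (2, 2, 2)]))                # 2
print(subrank(list(product(range(2), repeat=3))))     # 1
```

In the last example the pair $(0,0,0), (1,1,1)$ is a diagonal, but it is not free because the full cube contains other points of the box it spans.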

Let $\Phi \subseteq [n_1] \times \cdots \times [n_k]$ and let $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ be any tensor with support equal to $\Phi$. Then the (asymptotic) subranks of $\Phi$ and $t$ are related as follows:

$$Q(\Phi) \leq Q(t) \quad \text{and} \quad \tilde{Q}(\Phi) \leq \tilde{Q}(t).$$

Strassen proved the following theorem using the method of Coppersmith and Winograd [CW90]. Recall that for $\Phi \subseteq I_1 \times I_2 \times I_3$ we let $\mathcal{P}(\Phi)$ be the set of probability distributions on $\Phi$. For $P \in \mathcal{P}(\Phi)$, let $P_1, P_2, P_3$ be the marginal distributions of $P$ on the 3 components of $I_1 \times I_2 \times I_3$.

Theorem 5.3 ([Str91, Lemma 5.1]). Let $\Phi \subseteq I_1 \times I_2 \times I_3$ be tight. Then

$$\tilde{Q}(\Phi) = \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} 2^{H(P_i)}. \tag{5.1}$$

The consequence of Theorem 5.3 is that the support functionals are sufficiently powerful to compute the asymptotic subrank of tight 3-tensors.

Corollary 5.4 ([Str91, Proposition 5.4]). Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$ be tight. Then

$$\tilde{Q}(t) = \min_{\theta \in \mathcal{P}([3])} \zeta^{\theta}(t).$$

Moreover, if $\Phi = \operatorname{supp}(g \cdot t)$ is tight for some $g \in G(t)$, then $\tilde{Q}(t) = \tilde{Q}(\Phi)$.

Remark 5.5. Strassen conjectured in [Str94, Conjecture 5.3] that for the family of tight 3-tensors, the support functionals give all spectral points in the asymptotic spectrum $\mathbf{X}(\{\text{tight 3-tensors}\})$. In [Str91] numerous examples are given of subfamilies of tight 3-tensors for which this is the case.

Remark 5.6. Equation (5.1) becomes false when we let $\Phi \subseteq I_1 \times \cdots \times I_k$ with $k \geq 4$ and we let the right-hand side of the equation be $\max_{P \in \mathcal{P}(\Phi)} \min_i 2^{H(P_i)}$; see [CVZ16, Example 1138].

New results in this chapter

This chapter is an investigation of tight tensors, combinatorial asymptotic subrank, and applications. More precisely, this chapter contains the following new results.


Higher-order CoppersmithndashWinograd method In Section 52 we extendTheorem 53 to obtain a lower bound for ˜Q(Φ) for tight sets Φ sube I1 times middot middot middot times Ikwith k ge 4 Our lower bound is not known to be optimal in general We computeexamples for which the lower bound is optimal

Combinatorial degeneration method In Section 53 we further extend therange of application of the CoppersmithndashWinograd method via a partial order

on supports of tensors called combinatorial degeneration We prove that if Φ Ψthen ˜Q(Φ) le ˜Q(Ψ) Suppose Ψ is not tight but Φ is tight then we may apply the(higher-order) CoppersmithndashWinograd method to obtain a lower bound on ˜Q(Φ)and thus on ˜Q(Ψ)

Cap sets. In Section 5.4 we relate the theory of asymptotic spectra, the Coppersmith–Winograd method and the combinatorial degeneration method to the problem of upper bounding the maximum size of cap sets in $\mathbb{F}_p^n$.

Graph tensors. Graph tensors are generalisations of the matrix multiplication tensor $\langle 2,2,2 \rangle$, parametrised by graphs. In Section 5.5 we discuss how one can apply the higher-order Coppersmith–Winograd method to obtain upper bounds on the asymptotic rank of complete graph tensors. We also briefly discuss the surgery method, which gives good upper bounds on the asymptotic rank of graph tensors for sparse graphs like cycle graphs.

5.2 Higher-order CW method

In this section we extend Theorem 5.3 to tight $\Phi \subseteq I_1 \times \cdots \times I_k$ with $k \ge 4$. We introduce some notation. Let $\mathcal{P}(\Phi)$ be the set of probability distributions on $\Phi$. For $P \in \mathcal{P}(\Phi)$ let $P_1, \ldots, P_k$ be the marginal distributions of $P$ on the $k$ components of $I_1 \times \cdots \times I_k$. Let $\mathcal{R}(\Phi)$ be the set of all subsets $R \subseteq \Phi^2$ such that $R \not\subseteq \{(x,x) : x \in \Phi\}$ and $R \subseteq \{(x,y) \in \Phi^2 : x_i = y_i\}$ for some $i \in [k]$. For $P \in \mathcal{P}(\Phi)$ and $R \in \mathcal{R}(\Phi)$ let $\mathcal{Q}(R, (P_1, \ldots, P_k))$ be the set of probability distributions $Q$ on $R$ whose marginal distributions on the $2k$ components of $R$ satisfy $Q_i = Q_{k+i} = P_i$ for $i \in [k]$.

Let $I_1, \ldots, I_k$ be finite subsets of $\mathbb{Z}$. The result of this section is a lower bound on the asymptotic subrank of any $\Phi \subseteq I_1 \times \cdots \times I_k$ satisfying $\sum_{i=1}^k a_i = 0$ for all $a \in \Phi$. For $R \in \mathcal{R}(\Phi)$ let $r(R)$ be the rank over $\mathbb{Q}$ of the matrix with rows $\{x - y : (x,y) \in R\}$.

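Theorem 5.7. Let $\Phi \subseteq \mathbb{Z}^k$ be a finite set with $\sum_{i=1}^k a_i = 0$ for all $a \in \Phi$. Then

\[ \log_2 \tilde{Q}(\Phi) \ge \max_{P} \min_{R,Q} \Bigl( H(P) - (k-2)\,\frac{H(Q) - H(P)}{r(R)} \Bigr) \]

with $P \in \mathcal{P}(\Phi)$, $R \in \mathcal{R}(\Phi)$ and $Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))$.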

52 Higher-order CoppersmithndashWinograd method 69

5.2.1 Construction

We prepare for the proof of Theorem 5.7 by discussing some basic facts.

Average-free sets

Lemma 5.8. Let $k \in \mathbb{N}$. Let $M \in \mathbb{N}$. We say a subset $B \subseteq \mathbb{Z}/M\mathbb{Z}$ is $(k-1)$-average-free if

\[ \forall x_1, \ldots, x_k \in B: \quad x_1 + \cdots + x_{k-1} = (k-1)x_k \;\Rightarrow\; x_1 = \cdots = x_k. \]

There is a $(k-1)$-average-free set $B \subseteq \mathbb{Z}/M\mathbb{Z}$ of size $|B| = M^{1-o(1)}$.

Proof. There is a set $A \subseteq \{1, \ldots, \lfloor \frac{M-1}{k-1} \rfloor\}$ of size $|A| = M^{1-o(1)}$ with

\[ \forall x_1, \ldots, x_k \in A: \quad x_1 + \cdots + x_{k-1} = (k-1)x_k \;\Rightarrow\; x_1 = \cdots = x_k, \tag{5.2} \]

see [VC15, Lemma 10]. Let $B = \{a \bmod M : a \in A\} \subseteq \mathbb{Z}/M\mathbb{Z}$. Then $|B| = |A|$. Let $x_1, \ldots, x_k \in B$ with $x_1 + \cdots + x_{k-1} = (k-1)x_k$. View $x_1, \ldots, x_k$ as elements in $\{1, \ldots, \lfloor \frac{M-1}{k-1} \rfloor\}$. Then $x_1 + \cdots + x_{k-1} = (k-1)x_k$ still holds as an equation over $\mathbb{Z}$, since both sides lie in $\{1, \ldots, M-1\}$. From (5.2) follows $x_1 = \cdots = x_k$ in $\mathbb{Z}$ and hence also in $\mathbb{Z}/M\mathbb{Z}$.
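The average-free condition can be checked by brute force for small sets. The following is only an illustrative sketch: the function name `is_average_free` and the example sets are assumptions, and the exhaustive search is not the construction of [VC15].

```python
from itertools import product

def is_average_free(B, M, k):
    """Return True if B, a set of residues mod M, is (k-1)-average-free."""
    B = sorted({b % M for b in B})
    for xs in product(B, repeat=k):
        *summands, last = xs
        # A violation is a solution of x_1 + ... + x_{k-1} = (k-1) x_k
        # in Z/MZ in which not all x_i are equal.
        if sum(summands) % M == ((k - 1) * last) % M and len(set(xs)) > 1:
            return False
    return True

small = is_average_free({0, 1}, M=5, k=3)        # no nontrivial solution
bigger = is_average_free({0, 1, 2}, M=5, k=3)    # 0 + 2 = 2 * 1 is a violation
```

The search costs $|B|^k$ steps, so it is only meant for sanity checks on toy parameters.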

Linear combinations of uniform variables

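Lemma 5.9. Let $M$ be a prime. Let $u_1, \ldots, u_n$ be independently uniformly distributed over $\mathbb{Z}/M\mathbb{Z}$. Let $v_1, \ldots, v_m$ be $(\mathbb{Z}/M\mathbb{Z})$-linear combinations of $u_1, \ldots, u_n$. Then the vector $v = (v_1, \ldots, v_m)$ is uniformly distributed over the range of $v$ in $(\mathbb{Z}/M\mathbb{Z})^m$.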

Proof. Let $v_i = \sum_j c_{ij} u_j$ with $c_{ij} \in \mathbb{Z}/M\mathbb{Z}$. Then $v = Cu$ with $u = (u_1, \ldots, u_n)$ and $C$ the matrix with entries $C_{ij} = c_{ij}$. Let $y$ be in the image of $C$. Then the cardinality of the preimage $C^{-1}(y)$ equals the cardinality of the kernel of $C$. Indeed, if $Cx = y$ then $C^{-1}(y) = x + \ker(C)$. Since $u$ is uniform, we conclude that $v$ is uniform on the image of $C$.
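The counting argument can be observed directly on a toy instance. This is a minimal sketch under assumed parameters ($M = 5$, a hypothetical rank-one matrix `C`); it tallies $v = Cu$ over all $u$ and shows that every point of the image is hit equally often.

```python
from collections import Counter
from itertools import product

def image_distribution(C, M):
    """Tally v = C u (mod M) over all u in (Z/MZ)^n."""
    n = len(C[0])
    counts = Counter()
    for u in product(range(M), repeat=n):
        v = tuple(sum(c * x for c, x in zip(row, u)) % M for row in C)
        counts[v] += 1
    return counts

# Hypothetical example: second row is twice the first, so the image is a line.
counts = image_distribution([[1, 2, 0], [2, 4, 0]], M=5)
```

Here the image has 5 points and the kernel has 25 elements, so each image point is counted 25 times.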

Free diagonals

Lemma 5.10. Let $G$ be a graph with $n$ vertices and $m$ edges. Then $G$ has at least $n - m$ connected components.

Proof. A graph without edges has $n$ connected components. For every edge that we add to the graph, we lose at most one connected component.

Lemma 5.11. Let $I_1, \ldots, I_k$ be finite sets. Let $\Psi \subseteq I_1 \times \cdots \times I_k$. Let

\[ C = \{\{a, b\} \subseteq \Psi : a \ne b;\ \exists i \in [k]\ a_i = b_i\}. \]

Then $Q(\Psi) \ge |\Psi| - |C|$. Obviously, the statement remains true if we replace $C$ by the larger set $\{(a,b) \in \Psi^2 : a \ne b;\ \exists i \in [k]\ a_i = b_i\}$.


Proof. Let $G = (\Psi, C)$ be the graph with vertex set $\Psi$ and edge set $C$. Let $\Gamma \subseteq \Psi$ contain exactly one vertex per connected component of $G$. The vertices in $\Gamma$ are pairwise not adjacent. So $\Gamma$ is a diagonal. Of course $\Gamma \subseteq \Psi \cap (\Gamma_1 \times \cdots \times \Gamma_k)$. Let $a \in \Psi \cap (\Gamma_1 \times \cdots \times \Gamma_k)$. Let $x_1, \ldots, x_k \in \Gamma$ with

\[ (x_1)_1 = a_1,\quad (x_2)_2 = a_2,\quad \ldots,\quad (x_k)_k = a_k. \]

Then $x_1, \ldots, x_k$ are all adjacent (or equal) to $a$ in $G$, i.e. they are all in the same connected component. Then $x_1 = \cdots = x_k$, since $\Gamma$ contains precisely one vertex per connected component. So $a = x_1 = \cdots = x_k$. So $a \in \Gamma$. We conclude that $\Gamma \supseteq \Psi \cap (\Gamma_1 \times \cdots \times \Gamma_k)$. Finally, $|\Gamma| \ge |\Psi| - |C|$ by Lemma 5.10.
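The proof is constructive and can be sketched in a few lines. The code below is an illustration under assumptions: the names `free_diagonal` and `is_free_diagonal` and the toy set `psi` are hypothetical, and a union-find structure stands in for "pick one vertex per connected component".

```python
from itertools import combinations, product

def free_diagonal(psi):
    """Pick one representative per connected component of G = (psi, C)."""
    psi = sorted(set(psi))
    parent = {a: a for a in psi}

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]   # path halving
            a = parent[a]
        return a

    num_edges = 0
    for a, b in combinations(psi, 2):
        if any(x == y for x, y in zip(a, b)):   # a and b agree somewhere
            num_edges += 1
            parent[find(a)] = find(b)
    gamma = sorted({find(a) for a in psi})
    return gamma, num_edges

def is_free_diagonal(gamma, psi):
    """Check that gamma is a diagonal and psi meets its bounding box in gamma only."""
    k = len(gamma[0])
    marginals = [{a[i] for a in gamma} for i in range(k)]
    is_diagonal = all(all(x != y for x, y in zip(a, b))
                      for a, b in combinations(gamma, 2))
    return is_diagonal and (set(psi) & set(product(*marginals))) == set(gamma)

psi = [(0, 0, 0), (1, 1, 1), (1, 1, 2), (2, 2, 1)]
gamma, num_edges = free_diagonal(psi)
```

For this toy set there are two edges and two components, so the bound $|\Gamma| \ge |\Psi| - |C|$ is attained.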

We now give the proof of Theorem 5.7. We repeat some notation from above. Let $k \ge 3$. Let $\Phi \subseteq \mathbb{Z}^k$ be a finite set. Let $\mathcal{P}(\Phi)$ be the set of probability distributions on $\Phi$. For $P \in \mathcal{P}(\Phi)$ let $P_1, \ldots, P_k$ be the marginal distributions of $P$ on the $k$ components of $\mathbb{Z}^k$. Let $\mathcal{R}(\Phi)$ be the set of all subsets $R \subseteq \Phi^2$ such that $R \not\subseteq \{(x,x) : x \in \Phi\}$ and $R \subseteq \{(x,y) \in \Phi^2 : x_i = y_i\}$ for some $i \in [k]$. For $P \in \mathcal{P}(\Phi)$ and $R \in \mathcal{R}(\Phi)$ let $\mathcal{Q}(R, (P_1, \ldots, P_k))$ be the set of probability distributions $Q$ on $R$ whose marginal distributions on the $2k$ components of $R$ satisfy $Q_i = Q_{k+i} = P_i$ for $i \in [k]$. For $R \in \mathcal{R}(\Phi)$ let $r(R)$ be the rank over $\mathbb{Q}$ of the matrix with rows

\[ \{x - y : (x,y) \in R\}. \]

For any prime $M$ let $r_M(R)$ be the rank over $\mathbb{Z}/M\mathbb{Z}$ of the same matrix.

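Theorem (Theorem 5.7). Let $\Phi \subseteq \mathbb{Z}^k$ be a finite set with $\sum_{i=1}^k a_i = 0$ for all $a \in \Phi$. Then

\[ \log_2 \tilde{Q}(\Phi) \ge \max_{P} \min_{R,Q} \Bigl( H(P) - (k-2)\,\frac{H(Q) - H(P)}{r(R)} \Bigr) \]

with $P \in \mathcal{P}(\Phi)$, $R \in \mathcal{R}(\Phi)$ and $Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))$.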

Proof. Let $P$ be a rational probability distribution on $\Phi$, i.e. $P(a) \in \mathbb{Q}$ for all $a \in \Phi$.

Choice of parameters

This proof involves a variable $N$ that we will let go to infinity and a prime number $M$ that depends on $N$. For the sake of rigor we first set the dependence of $M$ on $N$ and make sure that $N$ is large enough for $M$ to have good properties.

Let $n \in \mathbb{N}$ such that $P$ is an $n$-type, i.e. $nP(a) \in \mathbb{N}$ for all $a \in \Phi$. Let $N = tn$ be a multiple of $n$. Let

\[ f(N) = \log_2 \Bigl( 2^{|\Phi|^2} \max_{R \in \mathcal{R}(\Phi)} \binom{N + |R| - 1}{|R| - 1} \Bigr) \in o(N). \tag{5.3} \]

Let

\[ g(N) = |\Phi| \log_2(N+1) \in o(N). \]

By Lemma 4.20,

\[ 2^{NH(P) - g(N)} \le \binom{N}{NP}. \tag{5.4} \]

Let

\[ \mu(N) = \max_{R,Q} \frac{H(Q) - H(P) + (1 + g(N) + f(N))\frac{1}{N}}{r(R)} \tag{5.5} \]

with $R \in \mathcal{R}(\Phi)$ and $Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))$. Let $M$ be a prime with

\[ \lceil 2^{\mu(N)N} \rceil \le M \le 2\lceil 2^{\mu(N)N} \rceil. \tag{5.6} \]

Such a prime exists by Bertrand's postulate, see e.g. [AZ14]. We can make $M$ arbitrarily large by choosing $N$ large enough. Choose $N = tn$ large enough such that

\[ M > k - 1, \tag{5.7} \]

\[ \forall R \in \mathcal{R}(\Phi): \quad r_M(R) = r(R). \tag{5.8} \]

We will later let $t$ and thus $N$ go to infinity.

Restrict to marginal type classes

The set $\Phi^{\otimes N}$ is a finite subset of $(\mathbb{Z}^N)^k$. Let $a \in \Phi^{\otimes N}$. Then we have that $a_i = ((a_i)_1, \ldots, (a_i)_N) \in \mathbb{Z}^N$ for $i \in [k]$. We restrict to those $a$ for which $a_i$ is in the type class $T^N_{P_i}$ for all $i \in [k]$. Thus, let

\[ \Psi = \Phi^{\otimes N} \cap (T^N_{P_1} \times \cdots \times T^N_{P_k}). \]

We prove a lower bound on the size of $\Psi$. Let $(s_1, \ldots, s_N) \in T^N_P$. Then $s_j \in \Phi$ for $j \in [N]$ and $((s_1)_i, \ldots, (s_N)_i) \in T^N_{P_i}$ for $i \in [k]$. So

\[ \bigl( ((s_1)_1, \ldots, (s_N)_1), \ldots, ((s_1)_k, \ldots, (s_N)_k) \bigr) \in \Phi^{\otimes N} \cap (T^N_{P_1} \times \cdots \times T^N_{P_k}) = \Psi. \]

Thus $|\Psi| \ge |T^N_P|$. By Lemma 4.19, $|T^N_P| = \binom{N}{NP}$. By Lemma 4.20, $\binom{N}{NP} \ge 2^{NH(P) - g(N)}$. Therefore

\[ |\Psi| \ge 2^{NH(P) - g(N)}. \tag{5.9} \]
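The type-class estimate behind (5.9) is easy to check numerically. The sketch below is illustrative only: the helper names are assumptions, and `counts` stands for the vector $(N P(a))_{a \in \Phi}$, so that `len(counts)` plays the role of $|\Phi|$.

```python
from math import factorial, log2

def type_class_size(counts):
    """Multinomial coefficient N! / prod(c!) for integer counts summing to N."""
    size = factorial(sum(counts))
    for c in counts:
        size //= factorial(c)
    return size

def entropy_bits(counts):
    """Shannon entropy (base 2) of the distribution counts / N."""
    n = sum(counts)
    return -sum(c / n * log2(c / n) for c in counts if c > 0)

def lower_bound_holds(counts):
    """Check log2 |T_P^N| >= N H(P) - g(N) with g(N) = |Phi| log2(N + 1)."""
    n = sum(counts)
    g = len(counts) * log2(n + 1)
    return log2(type_class_size(counts)) >= n * entropy_bits(counts) - g

uniform_on_six = lower_bound_holds([5] * 6)   # P uniform on 6 points, N = 30
```

For the uniform distribution on six points with $N = 6$ the type class has $6! = 720$ elements.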


Hashing

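Let $u_1, \ldots, u_{k-1}, v_1, \ldots, v_N \in \mathbb{Z}/M\mathbb{Z}$. For $i \in [k]$ let

\[ h_i : \mathbb{Z}^N \to \mathbb{Z}/M\mathbb{Z}, \qquad x \mapsto \begin{cases} u_i + \sum_{j=1}^N x_j v_j & \text{for } 1 \le i \le k-1, \\[4pt] \frac{1}{k-1}\Bigl( u_1 + \cdots + u_{k-1} - \sum_{j=1}^N x_j v_j \Bigr) & \text{for } i = k. \end{cases} \]

Note that $k - 1$ is invertible in $\mathbb{Z}/M\mathbb{Z}$ by (5.7). Let $a \in \Psi$. Then $((a_1)_j, \ldots, (a_k)_j) \in \Phi$ for $j \in [N]$. So $\sum_{i=1}^k (a_i)_j = 0$ for every $j \in [N]$. Thus

\[ \sum_{i=1}^k \sum_{j=1}^N (a_i)_j v_j = \sum_{j=1}^N v_j \sum_{i=1}^k (a_i)_j = 0. \]

Therefore

\[ h_1(a_1) + \cdots + h_{k-1}(a_{k-1}) = (k-1)\,h_k(a_k). \]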
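The hashing identity can be exercised on random data. This is a sketch under assumptions: `hashes` and `random_zero_sum_instance` are hypothetical helper names, the parameters are toy values, and the three-argument `pow` with exponent $-1$ requires Python 3.8 or later.

```python
import random

def hashes(a, u, v, M):
    """Evaluate h_1(a_1), ..., h_k(a_k) for a = (a_1, ..., a_k), a_i in Z^N."""
    k = len(a)
    dot = lambda x: sum(xj * vj for xj, vj in zip(x, v)) % M
    inv = pow(k - 1, -1, M)     # exists because M is prime and M > k - 1
    h = [(u[i] + dot(a[i])) % M for i in range(k - 1)]
    h.append(inv * (sum(u) - dot(a[k - 1])) % M)
    return h

def random_zero_sum_instance(k, N, M, rng):
    """Random a with sum_i (a_i)_j = 0 for every j, plus random u and v."""
    a = [[rng.randint(-3, 3) for _ in range(N)] for _ in range(k - 1)]
    a.append([-sum(column) for column in zip(*a)])
    u = [rng.randrange(M) for _ in range(k - 1)]
    v = [rng.randrange(M) for _ in range(N)]
    return a, u, v

rng = random.Random(0)
a, u, v = random_zero_sum_instance(k=4, N=5, M=7, rng=rng)
h = hashes(a, u, v, M=7)
identity_holds = sum(h[:3]) % 7 == (3 * h[3]) % 7
```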

Restrict to average-free set

Let $B \subseteq \mathbb{Z}/M\mathbb{Z}$ be a $(k-1)$-average-free set of size

\[ |B| \ge M^{1 - \kappa(M)} \quad \text{with } \kappa(M) \in o(1), \tag{5.10} \]

meaning

\[ \forall x_1, \ldots, x_k \in B: \quad x_1 + \cdots + x_{k-1} = (k-1)x_k \;\Rightarrow\; x_1 = \cdots = x_k \tag{5.11} \]

(Lemma 5.8). Let $\Psi' \subseteq \Psi$ be the subset

\[ \Psi' = \{a \in \Psi : \forall i \in [k]\ h_i(a_i) \in B\}. \]

Let $a \in \Psi'$. Then $a \in \Psi$, so

\[ h_1(a_1) + \cdots + h_{k-1}(a_{k-1}) = (k-1)\,h_k(a_k). \]

Since $h_i(a_i) \in B$ for every $i \in [k]$, (5.11) implies

\[ h_1(a_1) = \cdots = h_k(a_k). \]

Probabilistic method

Clearly $Q(\Phi^{\otimes N}) \ge Q(\Psi) \ge Q(\Psi')$. Let

\[ C' = \{(a,b) \in \Psi'^2 : a \ne b;\ \exists i \in [k]\ a_i = b_i\}. \]

Let $X = |\Psi'|$ and $Y = |C'|$. By Lemma 5.11,

\[ Q(\Psi') \ge X - Y. \]

Let $u_1, \ldots, u_{k-1}, v_1, \ldots, v_N$ be independent uniformly random variables over the field $\mathbb{Z}/M\mathbb{Z}$. Then $X$ and $Y$ are random variables. Then

\[ Q(\Psi') \ge \mathbb{E}[X - Y] = \mathbb{E}[X] - \mathbb{E}[Y], \]

where the expectation is over $u_1, \ldots, u_{k-1}, v_1, \ldots, v_N$. We will prove

\[ \mathbb{E}[X] = |B|\,|\Psi|\,M^{-(k-1)}, \tag{5.12} \]

\[ \mathbb{E}[Y] \le |B| \max_{R,Q} 2^{NH(Q) + f(N)} M^{-(k-1) - r(R)}, \tag{5.13} \]

with $f(N)$ as defined in (5.3) and $R \in \mathcal{R}(\Phi)$, $Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))$. Before proving (5.12) and (5.13) we derive the final bound.

Derivation of final bound

From (5.12) and (5.13) follows

\[ \mathbb{E}[X] - \mathbb{E}[Y] \ge |B|\,|\Psi|\,M^{-(k-1)} - |B| \max_{R,Q} 2^{NH(Q) + f(N)} M^{-(k-1) - r(R)}. \]

We factor out $|B|$, $|\Psi|$ and $M^{-(k-1)}$:

\[ \mathbb{E}[X] - \mathbb{E}[Y] \ge |B|\,|\Psi|\,M^{-(k-1)} \Bigl( 1 - \frac{1}{|\Psi|} \max_{R,Q} 2^{NH(Q) + f(N)} M^{-r(R)} \Bigr). \]

From our choice of $\mu(N)$ from (5.5),

\[ \mu(N) = \max_{R,Q} \frac{H(Q) - H(P) + (1 + g(N) + f(N))\frac{1}{N}}{r(R)}, \]

follows

\[ \max_{R,Q} 2^{N(H(Q) - H(P) - r(R)\mu(N)) + g(N) + f(N)} \le \frac{1}{2}. \tag{5.14} \]

Apply $|B| \ge M^{1 - \kappa(M)}$ from (5.10) and $|\Psi| \ge 2^{NH(P) - g(N)}$ from (5.9) to get

\[ \begin{aligned} \mathbb{E}[X] - \mathbb{E}[Y] &\ge M^{1 - \kappa(M)}\, 2^{NH(P) - g(N)}\, M^{-(k-1)} \cdot \Bigl( 1 - 2^{-NH(P) + g(N)} \max_{R,Q} 2^{NH(Q) + f(N)} M^{-r(R)} \Bigr) \\ &\ge M^{-(k-2+\kappa(M))}\, 2^{NH(P) - g(N)} \cdot \Bigl( 1 - \max_{R,Q} 2^{NH(Q) - NH(P) + g(N) + f(N)} M^{-r(R)} \Bigr). \end{aligned} \]

(Here we used (5.14) to see that the second factor is nonnegative.) Apply the bounds $2^{\mu(N)N} \le M \le 2^{\mu(N)N + 2}$ from (5.6) to get

\[ \begin{aligned} \mathbb{E}[X] - \mathbb{E}[Y] &\ge \bigl(2^{\mu(N)N + 2}\bigr)^{-(k-2+\kappa(M))}\, 2^{NH(P) - g(N)} \cdot \Bigl( 1 - \max_{R,Q} 2^{NH(Q) - NH(P) + g(N) + f(N)} \bigl(2^{\mu(N)N}\bigr)^{-r(R)} \Bigr) \\ &= 2^{N(H(P) - (k-2+\kappa(M))\mu(N)) - 2(k-2+\kappa(M)) - g(N)} \cdot \Bigl( 1 - \max_{R,Q} 2^{N(H(Q) - H(P) - r(R)\mu(N)) + g(N) + f(N)} \Bigr). \end{aligned} \]

Using (5.14) we get

\[ \begin{aligned} \mathbb{E}[X] - \mathbb{E}[Y] &\ge 2^{N(H(P) - (k-2+\kappa(M))\mu(N)) - 2(k-2+\kappa(M)) - g(N)} \bigl(1 - \tfrac{1}{2}\bigr) \\ &= 2^{N(H(P) - (k-2+\kappa(M))\mu(N)) - 2(k-2+\kappa(M)) - g(N) - 1}. \end{aligned} \]

Then

\[ \begin{aligned} \frac{1}{N} \log_2 Q(\Phi^{\otimes N}) &\ge \frac{1}{N} \log_2 \bigl( \mathbb{E}[X] - \mathbb{E}[Y] \bigr) \\ &\ge H(P) - (k-2+\kappa(M)) \max_{R,Q} \frac{H(Q) - H(P) + (1 + g(N) + f(N))\frac{1}{N}}{r(R)} - \frac{2(k-2+\kappa(M)) + g(N) + 1}{N}. \end{aligned} \]

We let $t$ and thus $N$ go to infinity and obtain

\[ \log_2 \tilde{Q}(\Phi) \ge H(P) - (k-2) \max_{R,Q} \frac{H(Q) - H(P)}{r(R)}. \]

This lower bound holds for any rational probability distribution $P$ on $\Phi$ and, by continuity, for any real probability distribution $P$ on $\Phi$.

It remains to prove (5.12) and (5.13). We do this in the lemmas below.

Lemma 5.12. $\mathbb{E}[X] = |B|\,|\Psi|\,M^{-(k-1)}$.

Proof. Let $a \in \Psi$. Then $h_1(a_1) + \cdots + h_{k-1}(a_{k-1}) = (k-1)h_k(a_k)$. The following four statements are equivalent:

    $a \in \Psi'$
    $\forall i \in [k]: h_i(a_i) \in B$
    $\exists b \in B: h_1(a_1) = \cdots = h_k(a_k) = b$
    $\exists b \in B: h_1(a_1) = \cdots = h_{k-1}(a_{k-1}) = b$

Therefore,

\[ \mathbb{P}[a \in \Psi'] = \sum_{b \in B} \mathbb{P}[h_1(a_1) = \cdots = h_{k-1}(a_{k-1}) = b]. \]

For $b \in B$,

\[ \mathbb{P}[h_1(a_1) = \cdots = h_{k-1}(a_{k-1}) = b] = (M^{-1})^{k-1}. \]

We conclude

\[ \begin{aligned} \mathbb{E}[X] &= \sum_{a \in \Psi} \mathbb{P}[a \in \Psi'] \\ &= \sum_{a \in \Psi} \sum_{b \in B} \mathbb{P}[h_1(a_1) = \cdots = h_{k-1}(a_{k-1}) = b] \\ &= \sum_{a \in \Psi} \sum_{b \in B} (M^{-1})^{k-1} \\ &= |\Psi|\,|B|\,M^{-(k-1)}. \end{aligned} \]

This proves the lemma.

Lemma 5.13. $\mathbb{E}[Y] \le |B| \max_{R,Q} 2^{NH(Q) + f(N)} M^{-(k-1) - r(R)}$.

Proof. Let

\[ C = \{(a, a') \in \Psi^2 : a \ne a';\ \exists i \in [k]\ a_i = a'_i\}. \]

Let $(a, a') \in C$. The following statements are equivalent:

\[ (a, a') \in C' \tag{5.15} \]
\[ a, a' \in \Psi' \tag{5.16} \]
\[ \forall i \in [k]: h_i(a_i), h_i(a'_i) \in B \tag{5.17} \]
\[ \exists b \in B: h_1(a_1) = \cdots = h_k(a_k) = h_1(a'_1) = \cdots = h_k(a'_k) = b \tag{5.18} \]

Therefore,

\[ \begin{aligned} \mathbb{E}[Y] &= \sum_{(a,a') \in C} \mathbb{P}[(a, a') \in C'] \\ &= \sum_{(a,a') \in C} \sum_{b \in B} \mathbb{P}[h_1(a_1) = \cdots = h_k(a_k) = h_1(a'_1) = \cdots = h_k(a'_k) = b]. \end{aligned} \]


Let $(a, a') \in C$. Then $h_i(a_i)$ and $h_i(a'_i)$ are $\mathbb{Z}/M\mathbb{Z}$-linear combinations of $u_1, \ldots, u_{k-1}, v_1, \ldots, v_N$. The random variable

\[ (h_1(a_1), \ldots, h_k(a_k), h_1(a'_1), \ldots, h_k(a'_k)) \]

is uniformly distributed over the image subspace $V \subseteq (\mathbb{Z}/M\mathbb{Z})^{2k}$ (Lemma 5.9). Let $b \in B$. Then $(b, \ldots, b) \in V$, since $u_1 = \cdots = u_{k-1} = b$, $v_1 = \cdots = v_N = 0$ is a valid assignment. Therefore,

\[ \mathbb{P}[h_1(a_1) = \cdots = h_k(a_k) = h_1(a'_1) = \cdots = h_k(a'_k) = b] = |V|^{-1}. \]

And $|V|$ equals $M$ to the power the rank of the matrix

\[ \begin{pmatrix} 1 & 0 & \cdots & 0 & \frac{1}{k-1} & 1 & 0 & \cdots & 0 & \frac{1}{k-1} \\ 0 & 1 & & 0 & \frac{1}{k-1} & 0 & 1 & & 0 & \frac{1}{k-1} \\ \vdots & & \ddots & \vdots & \vdots & \vdots & & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & \frac{1}{k-1} & 0 & 0 & \cdots & 1 & \frac{1}{k-1} \\ a_1 & a_2 & \cdots & a_{k-1} & -\frac{a_k}{k-1} & a'_1 & a'_2 & \cdots & a'_{k-1} & -\frac{a'_k}{k-1} \end{pmatrix} \tag{5.19} \]

over $\mathbb{Z}/M\mathbb{Z}$, with $a_1, \ldots, a_k, a'_1, \ldots, a'_k$ thought of as column vectors in $(\mathbb{Z}/M\mathbb{Z})^N$. With column operations we transform (5.19) into

\[ \begin{pmatrix} 0 & 0 & \cdots & 0 & 0 & 1 & 0 & \cdots & 0 & 0 \\ 0 & 0 & \cdots & 0 & 0 & 0 & 1 & & 0 & 0 \\ \vdots & & & \vdots & \vdots & \vdots & & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 0 & 0 & 0 & 0 & \cdots & 1 & 0 \\ a_1 - a'_1 & a_2 - a'_2 & \cdots & a_{k-1} - a'_{k-1} & a_k - a'_k & a'_1 & a'_2 & \cdots & a'_{k-1} & 0 \end{pmatrix} \tag{5.20} \]

Matrix (5.20) has rank equal to $k - 1$ plus $r_M(a, a') = \mathrm{rk}(A(a, a'))$, where

\[ A(a, a') = \begin{pmatrix} a_1 - a'_1 & a_2 - a'_2 & \cdots & a_k - a'_k \end{pmatrix}. \]

We obtain

\[ \mathbb{E}[Y] \le \sum_{(a,a') \in C} \sum_{b \in B} M^{-(k-1+r_M(a,a'))}. \]

Since the summands are independent of $b$, we get

\[ \mathbb{E}[Y] \le |B| \sum_{(a,a') \in C} M^{-(k-1+r_M(a,a'))}. \]

Let $(a, a') \in C$. Consider the rows of $A(a, a')$. The $N$ rows are of the form $x_i - y_i$ with $(x_i, y_i) \in \Phi^2$. Let $s = ((x_1, y_1), \ldots, (x_N, y_N))$. Let $R = \{(x_1, y_1), \ldots, (x_N, y_N)\}$. We have $r_M(a, a') = r_M(R)$, and $r_M(R) = r(R)$ by (5.8). Let $Q$ be the $N$-type with $\mathrm{supp}(Q) = R$ and $s \in T^N_Q$. From $a \ne a'$ follows $R \not\subseteq \{(x,x) : x \in \Phi\}$. From $\exists i \in [k]\ a_i = a'_i$ follows $\exists i \in [k]\ R \subseteq \{(x,y) : x_i = y_i\}$. From $a, a' \in T^N_{P_1} \times \cdots \times T^N_{P_k}$ follows $Q_i = Q_{k+i} = P_i$ for all $i \in [k]$. We thus have

\[ \mathbb{E}[Y] \le |B| \sum_{R \in \mathcal{R}(\Phi)} \ \sum_{\substack{Q \in \mathcal{Q}(R, (P_1, \ldots, P_k)): \\ \mathrm{supp}(Q) = R,\ Q \text{ is } N\text{-type}}} \ \sum_{s \in T^N_Q} M^{-(k-1+r(R))}. \]

The number of $N$-types $Q$ with $\mathrm{supp}(Q) = R$ is at most the number of $N$-types on $R$, which is at most $\binom{N + |R| - 1}{|R| - 1}$ (Lemma 4.19). For any $Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))$, $|T^N_Q| \le 2^{NH(Q)}$ (Lemma 4.19). Therefore,

\[ \mathbb{E}[Y] \le |B| \sum_{R \in \mathcal{R}(\Phi)} \binom{N + |R| - 1}{|R| - 1} \max_{Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))} 2^{NH(Q)} M^{-(k-1+r(R))}. \]

Also, $|\mathcal{R}(\Phi)| \le 2^{|\Phi|^2}$. Therefore,

\[ \mathbb{E}[Y] \le |B|\, 2^{|\Phi|^2} \max_{R \in \mathcal{R}(\Phi)} \binom{N + |R| - 1}{|R| - 1} \max_{Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))} 2^{NH(Q)} M^{-(k-1+r(R))}. \]

We conclude that

\[ \mathbb{E}[Y] \le |B| \max_{R,Q} 2^{NH(Q) + f(N)} M^{-(k-1) - r(R)}. \]

This proves the lemma.

5.2.2 Computational remarks

The following two lemmas are helpful when applying Theorem 5.7. We leave the proofs to the reader.

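Lemma 5.14. Let $P \in \mathcal{P}(\Phi)$. Let $R, R' \in \mathcal{R}(\Phi)$ with $R \subseteq R'$ and $r(R) = r(R')$. Then

\[ \max_{Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))} \frac{H(Q) - H(P)}{r(R)} \le \max_{Q \in \mathcal{Q}(R', (P_1, \ldots, P_k))} \frac{H(Q) - H(P)}{r(R')}. \]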

Lemma 5.15. Let $R \in \mathcal{R}(\Phi)$. There is an equivalence relation $R' \in \mathcal{R}(\Phi)$ with $R \subseteq R'$ and $r(R) = r(R')$.


5.2.3 Examples: type sets

We discuss some examples. The first example we will use to get good upper bounds on the asymptotic rank of complete graph tensors in Section 5.5. We focus on one family of examples that is parametrised by partitions. Let $\lambda \vdash k$ be an integer partition of $k$ with $d$ parts. Let

\[ \Phi_\lambda = \{a \in \{0, 1, \ldots, d-1\}^k : \mathrm{type}(a) = \lambda\}. \]

The set $\Phi_\lambda$ is tight.

Theorem 5.16. $\log_2 \tilde{Q}(\Phi_{(2,2)}) = 1$.

Proof. Let $\Phi = \Phi_{(2,2)}$. Clearly $\tilde{Q}(\Phi) \le 2$. After relabelling, $\sum_{i=1}^k a_i = 0$ for all $a \in \Phi$. We may thus apply Theorem 5.7. Let $P$ be the uniform probability distribution on $\Phi$. Then $H(P) = \log_2 6$.

Let $R \in \mathcal{R}(\Phi)$. We may assume that

\[ R \subseteq \{(1,1,0,0), (1,0,1,0), (1,0,0,1)\}^2 \cup \{(0,0,1,1), (0,1,0,1), (0,1,1,0)\}^2. \]

We may assume $R$ is an equivalence relation (Lemma 5.15). Let $(x, y) \in R$. Let $R' = R \cup \{((1,1,1,1) - x, (1,1,1,1) - y)\}$. Then $R \subseteq R'$ and $R' \in \mathcal{R}(\Phi)$ and $r(R) = r(R')$. We may thus assume that if $(x, y) \in R$, then also $((1,1,1,1) - x, (1,1,1,1) - y) \in R$ (Lemma 5.14).

Let $S = \{(1,1,0,0), (1,0,1,0), (1,0,0,1)\}$. By the above observation it suffices to consider equivalence relations on $S$. There are three types of such equivalence relations.

Type (3): all three elements of $S$ are equivalent. Then $|R| = 18$ and $r(R) = 2$.

Type (2,1): two elements of $S$ are equivalent and inequivalent to the third element (which is equivalent to itself). Then $|R| = 10$ and $r(R) = 1$.

Type (1,1,1): all elements of $S$ are inequivalent. Then $R \subseteq \{(x,x) : x \in \Phi\}$, which is a contradiction.

For type (3) and (2,1) the uniform probability distribution $Q$ on $R$ has marginals $Q_i = Q_{4+i} = P_i$ for $i \in [4]$. The uniform $Q$ is optimal. Then $H(Q) = \log_2 |R|$. Let $R_{(3)}$ and $R_{(2,1)}$ be equivalence relations of type (3) and (2,1). Then

\[ \begin{aligned} \log_2 \tilde{Q}(\Phi) &\ge \min\Bigl\{ H(P) - \frac{2}{r(R_{(3)})}\bigl( \log_2 |R_{(3)}| - H(P) \bigr),\ H(P) - \frac{2}{r(R_{(2,1)})}\bigl( \log_2 |R_{(2,1)}| - H(P) \bigr) \Bigr\} \\ &= \min\Bigl\{ \log_2 6 - \tfrac{2}{2}(\log_2 18 - \log_2 6),\ \log_2 6 - \tfrac{2}{1}(\log_2 10 - \log_2 6) \Bigr\} \\ &= \min\bigl\{ 1,\ \log_2 \tfrac{54}{25} \bigr\} = 1. \end{aligned} \]

This proves the theorem.
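The sizes, ranks and the final minimum in this proof can be recomputed mechanically. The sketch below is an illustration under assumptions: all names are hypothetical, the rank is computed on the 0/1 labelling (the relabelling with zero coordinate sums only rescales the differences $x - y$, so the rank is unchanged), and $H(Q) = \log_2 |R|$ is taken from the text rather than optimised.

```python
from fractions import Fraction
from math import log2

def rank(rows):
    """Rank over Q by Gaussian elimination with exact fractions."""
    rows = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[rk])]
        rk += 1
    return rk

S = [(1, 1, 0, 0), (1, 0, 1, 0), (1, 0, 0, 1)]

def flip(x):
    """The map x -> (1,1,1,1) - x."""
    return tuple(1 - xi for xi in x)

def relation(blocks):
    """Equivalence relation on S with the given blocks, closed under flip."""
    R = {(x, y) for block in blocks for x in block for y in block}
    return R | {(flip(x), flip(y)) for x, y in R}

def bound(R, k=4, H_P=log2(6)):
    """Return |R|, r(R) and H(P) - (k-2)(log2|R| - H(P)) / r(R)."""
    r = rank([[xi - yi for xi, yi in zip(x, y)] for x, y in R])
    return len(R), r, H_P - (k - 2) * (log2(len(R)) - H_P) / r

type_3 = bound(relation([S]))                # (18, 2, about 1.0)
type_21 = bound(relation([S[:2], S[2:]]))    # (10, 1, log2(54/25))
lower_bound = min(type_3[2], type_21[2])
```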

53 Combinatorial degeneration method 79

Theorem 5.17. $\log_2 \tilde{Q}(\Phi_{(0^{k-1}1)}) = h(1/k)$.

Proof We refer to [CVZ16]

With Srinivasan Arunachalam and Péter Vrana we have the following unpublished result.

Theorem 5.18. $\log_2 \tilde{Q}(\Phi_{(0^{k/2}1^{k/2})}) = 1$.

5.3 Combinatorial degeneration method

In this section we extend the (higher-order) Coppersmith–Winograd method via a preorder called combinatorial degeneration. Suppose $\Psi \subseteq I_1 \times \cdots \times I_k$ is not tight, but has a tight subset $\Phi \subseteq \Psi$. In the rest of this section we focus on obtaining a lower bound on $\tilde{Q}(\Psi)$ via $\Phi$. This has an application in the context of tri-colored sum-free sets (Section 5.4.2), for example.

Definition 5.19 ([BCS97]). Let $\Phi \subseteq \Psi \subseteq I_1 \times \cdots \times I_k$. We say that $\Phi$ is a combinatorial degeneration of $\Psi$, and write $\Psi \trianglerighteq \Phi$, if there are maps $u_i : I_i \to \mathbb{Z}$ ($i \in [k]$) such that for all $\alpha \in I_1 \times \cdots \times I_k$: if $\alpha \in \Psi \setminus \Phi$, then $\sum_{i=1}^k u_i(\alpha_i) > 0$, and if $\alpha \in \Phi$, then $\sum_{i=1}^k u_i(\alpha_i) = 0$. Note that the maps $u_i$ need not be injective.
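For given maps $u_i$ the two conditions of the definition are straightforward to test. The following sketch is illustrative: the function name and the dictionary encoding of the maps are assumptions, and the example sets are the ones that appear later in the cap set discussion (sums equal to $m - 1$ versus sums congruent to $m - 1$ modulo $m$), with the third map shifted by $m - 1$ so that the sums vanish on $\Phi$.

```python
from itertools import product

def is_combinatorial_degeneration(phi, psi, maps):
    """Check both conditions of the definition for the given maps u_1..u_k."""
    phi, psi = set(phi), set(psi)
    if not phi.issubset(psi):
        return False

    def total(alpha):
        return sum(u[x] for u, x in zip(maps, alpha))

    zero_on_phi = all(total(alpha) == 0 for alpha in phi)
    positive_elsewhere = all(total(alpha) > 0 for alpha in psi - phi)
    return zero_on_phi and positive_elsewhere

m = 3
cube = list(product(range(m), repeat=3))
psi = [a for a in cube if sum(a) % m == m - 1]
phi = [a for a in cube if sum(a) == m - 1]
identity = {x: x for x in range(m)}
shifted = {x: x - (m - 1) for x in range(m)}   # assumed normalisation
ok = is_combinatorial_degeneration(phi, psi, [identity, identity, shifted])
```

The checker only verifies a proposed choice of maps; finding suitable maps for a given pair of sets is a separate (linear feasibility) problem that this sketch does not address.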

Combinatorial degeneration gets its name from the following standard proposition, see e.g. [BCS97, Proposition 15.30].

Proposition 5.20. Let $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$. Let $\Psi = \mathrm{supp}(t)$. Let $\Phi \subseteq \Psi$ such that $\Psi \trianglerighteq \Phi$. Then $t \trianglerighteq t|_\Phi$.

Proposition 5.20 brings us only slightly closer to our goal. Namely, given $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ with $\Psi = \mathrm{supp}(t)$ and given $\Phi \subseteq \Psi$ such that $\Psi \trianglerighteq \Phi$, it follows directly from Proposition 5.20 that $t \trianglerighteq t|_\Phi$ and thus $\tilde{Q}(t) \ge \tilde{Q}(t|_\Phi)$. This, however, does not give us a lower bound on the combinatorial asymptotic subrank $\tilde{Q}(\Psi)$. The following theorem does. Our theorem extends a result in [KSS16].

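Theorem 5.21. Let $\Phi \subseteq \Psi \subseteq I_1 \times \cdots \times I_k$. If $\Psi \trianglerighteq \Phi$, then

\[ \tilde{Q}(\Psi) \ge \tilde{Q}(\Phi). \]

Lemma 5.22. Let $\Phi \subseteq \Psi \subseteq I_1 \times \cdots \times I_k$. If $\Psi \trianglerighteq \Phi$, then $\tilde{Q}(\Psi) \ge Q(\Phi)$.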

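Proof. Pick maps $u_i : I_i \to \mathbb{Z}$ such that

\[ \sum_{i=1}^k u_i(\alpha_i) = 0 \ \text{ for } \alpha \in \Phi, \qquad \sum_{i=1}^k u_i(\alpha_i) > 0 \ \text{ for } \alpha \in \Psi \setminus \Phi. \]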


Let $D$ be a free diagonal in $\Phi$ with $|D| = Q(\Phi)$ and let

\[ w_i = \sum_{x \in D_i} u_i(x). \]

Let $n \in \mathbb{N}$ and define

\[ W_i = \Bigl\{ (x_1, \ldots, x_{n|D|}) \in I_i^{\times n|D|} : \sum_{j=1}^{n|D|} u_i(x_j) = n w_i \Bigr\}. \]

Then

\[ \Psi^{\times n|D|} \cap (W_1 \times \cdots \times W_k) = \Phi^{\times n|D|} \cap (W_1 \times \cdots \times W_k). \]

The inclusion $\supseteq$ is clear. To show $\subseteq$, let $(x_1, \ldots, x_k) \in \Psi^{\times n|D|} \cap (W_1 \times \cdots \times W_k)$. Write $x_i = (x_{i1}, x_{i2}, \ldots, x_{i\,n|D|})$ and consider the $n|D| \times k$ matrix of evaluations

\[ \begin{pmatrix} u_1(x_{11}) & u_2(x_{21}) & \cdots & u_k(x_{k1}) \\ u_1(x_{12}) & u_2(x_{22}) & \cdots & u_k(x_{k2}) \\ \vdots & \vdots & & \vdots \\ u_1(x_{1\,n|D|}) & u_2(x_{2\,n|D|}) & \cdots & u_k(x_{k\,n|D|}) \end{pmatrix}. \]

The sum of the $i$th column is $n w_i$ by definition of $W_i$, and $\sum_{i=1}^k n w_i = 0$ because $D \subseteq \Phi$ is a diagonal. The row sums are nonnegative by definition of the maps $u_1, \ldots, u_k$. We conclude that the row sums are zero. Therefore $(x_1, \ldots, x_k)$ is an element of $\Phi^{\times n|D|}$.

Since $D$ is a free diagonal in $\Phi$, $D^{\times n|D|}$ is a free diagonal in $\Phi^{\times n|D|}$, and also $D^{\times n|D|} \cap (W_1 \times \cdots \times W_k)$ is a free diagonal in $\Phi^{\times n|D|} \cap (W_1 \times \cdots \times W_k)$, which in turn is equal to $\Psi^{\times n|D|} \cap (W_1 \times \cdots \times W_k)$. Therefore, $D^{\times n|D|} \cap (W_1 \times \cdots \times W_k)$ is also a free diagonal in $\Psi^{\times n|D|}$, i.e.

\[ Q(\Psi^{\times n|D|}) \ge |D^{\times n|D|} \cap (W_1 \times \cdots \times W_k)|. \]

In the set $D^{\times n|D|}$ consider the strings with uniform type, i.e. where all $|D|$ elements of $D$ occur exactly $n$ times. These are clearly in $W_1 \times \cdots \times W_k$ and their number is $\binom{n|D|}{n, \ldots, n}$. Therefore,

\[ Q(\Psi^{\times n|D|}) \ge \binom{n|D|}{n, \ldots, n} = |D|^{n|D| - o(n)}, \]

which implies $\tilde{Q}(\Psi) = \lim_{n \to \infty} Q(\Psi^{\times n|D|})^{1/(n|D|)} \ge |D|$.

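Proof of Theorem 5.21. We have $\tilde{Q}(\Psi) = \lim_{n \to \infty} \tilde{Q}(\Psi^{\times n})^{1/n}$. It follows from Lemma 5.22 that

\[ \lim_{n \to \infty} \tilde{Q}(\Psi^{\times n})^{1/n} \ge \lim_{n \to \infty} Q(\Phi^{\times n})^{1/n}. \]

The right-hand side is $\tilde{Q}(\Phi)$.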

54 Cap sets 81

5.4 Cap sets

A subset $A \subseteq (\mathbb{Z}/3\mathbb{Z})^n$ is called a cap set if any line in $A$ is a point, a line being a triple of points of the form $(u, u+v, u+2v)$. Until recently it was not known whether the maximal size of a cap set in $(\mathbb{Z}/3\mathbb{Z})^n$ grows like $3^{n - o(n)}$ or like $c^{n - o(n)}$ for some $c < 3$. Gijswijt and Ellenberg in [EG17], inspired by the work of Croot, Lev and Pach in [CLP17], settled this question, showing that $c \le 3(207 + 33\sqrt{33})^{1/3}/8 \approx 2.755$. Tao realised in [Tao16] that the cap set question may naturally be phrased as the problem of computing the size of the largest main diagonal in powers of the "cap set tensor" $\sum_\alpha e_{\alpha_1} \otimes e_{\alpha_2} \otimes e_{\alpha_3}$, where the sum is over $\alpha_1, \alpha_2, \alpha_3 \in \mathbb{F}_3$ with $\alpha_1 + \alpha_2 + \alpha_3 = 0$. Here main diagonal refers to a subset $A$ of the basis elements such that restricting the cap set tensor to $A \times A \times A$ gives the tensor $\sum_{v \in A} v \otimes v \otimes v$. We show that the cap set tensor is in the $\mathrm{GL}_3(\mathbb{F}_3)^{\times 3}$ orbit of the "reduced polynomial multiplication tensor", which was studied in [Str91], and we show how recent results follow from this connection, using Theorem 5.21.

5.4.1 Reduced polynomial multiplication

Let $t_n$ be the tensor $\sum_\alpha e_{\alpha_1} \otimes e_{\alpha_2} \otimes e_{\alpha_3}$, where the sum is over $(\alpha_1, \alpha_2, \alpha_3)$ in $\{0, 1, \ldots, n-1\}^3$ such that $\alpha_1 + \alpha_2 = \alpha_3$. We call $t_n$ the reduced polynomial multiplication tensor, since $t_n$ is essentially the structure tensor of the algebra $\mathbb{F}[x]/(x^n)$ of univariate polynomials modulo the ideal generated by $x^n$. The support of $t_n$ equals

\[ \{(\alpha_1, \alpha_2, \alpha_3) \in \{0, \ldots, n-1\}^3 \mid \alpha_1 + \alpha_2 = \alpha_3\}, \]

which via $\alpha_3 \mapsto n - 1 - \alpha_3$ we may identify with the set

\[ \Phi_n = \{(\alpha_1, \alpha_2, \alpha_3) \in \{0, \ldots, n-1\}^3 \mid \alpha_1 + \alpha_2 + \alpha_3 = n - 1\}. \tag{5.21} \]

The support $\Phi_n$ is tight (cf. Example 5.1). Strassen proves in [Str91, Theorem 6.7], using Corollary 5.4, that $\tilde{Q}(t_n) = \tilde{Q}(\Phi_n) = z(n)$, where $z(n)$ is defined as

\[ z(n) = \frac{\gamma^n - 1}{\gamma - 1}\, \gamma^{-2(n-1)/3} \tag{5.22} \]

with $\gamma$ equal to the unique positive real solution of the equation $\frac{1}{\gamma - 1} - \frac{n}{\gamma^n - 1} = \frac{n-1}{3}$.

The following table contains values of $z(n)$ for small $n$. See also [Str91, Table 1].

 n    z(n) rounded    z(n) exact
 2    1.88988         3/2^{2/3} = 2^{h(1/3)}
 3    2.75510         3(207 + 33√33)^{1/3}/8
 4    3.61072
 5    4.46158
 6    5.30973
 7    6.15620
 8    7.00155
 9    7.84612
10    8.69012
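The values of $z(n)$ can be reproduced numerically from (5.22). The sketch below is one possible way to do so and rests on assumptions: it solves the defining equation for $\gamma$ by bisection on the bracket $(1.0001, 10)$, which is adequate for the small $n$ in the table but is not claimed to work for all $n$, and it relies on the left-hand side being decreasing in $\gamma$ on that bracket.

```python
from math import sqrt

def z(n, iterations=200):
    """Evaluate z(n) from (5.22); gamma is found by bisection."""
    target = (n - 1) / 3

    def lhs(g):
        return 1 / (g - 1) - n / (g**n - 1)

    lo, hi = 1.0001, 10.0        # assumed bracket: lhs(lo) > target > lhs(hi)
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if lhs(mid) > target:
            lo = mid
        else:
            hi = mid
    g = (lo + hi) / 2
    return (g**n - 1) / (g - 1) * g ** (-2 * (n - 1) / 3)

table = {n: round(z(n), 5) for n in range(2, 11)}
```

For $n = 2$ the equation reduces to $1/(\gamma + 1) = 1/3$, so $\gamma = 2$ and $z(2) = 3/2^{2/3}$, which gives a quick consistency check against the closed form in the table.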

In fact, [Str91, Theorem 6.7] says that the asymptotic spectrum of $t_n$ is completely determined by the support functionals, and that the possible values that the spectral points can take on $t_n$ form the closed interval $[z(n), n]$ (cf. Remark 2.21):

\[ X(\mathbb{N}[t_n]) = \{\zeta^\theta|_{\mathbb{N}[t_n]} : \theta \in \mathcal{P}([3])\}, \qquad \{\phi(t_n) : \phi \in X(\mathbb{N}[t_n])\} = [z(n), n]. \]

5.4.2 Cap sets

We turn to cap sets

Definition 5.23. A three-term progression-free set is a set $A \subseteq (\mathbb{Z}/m\mathbb{Z})^n$ satisfying the following. For all $(x_1, x_2, x_3) \in A^{\times 3}$: there are $u, v \in (\mathbb{Z}/m\mathbb{Z})^n$ such that $(x_1, x_2, x_3) = (u, u+v, u+2v)$ if and only if $x_1 = x_2 = x_3$. Let $r_3((\mathbb{Z}/m\mathbb{Z})^n)$ be the size of the largest three-term progression-free set in $(\mathbb{Z}/m\mathbb{Z})^n$ and define the regularisation $\tilde{r}_3(\mathbb{Z}/m\mathbb{Z}) = \lim_{n \to \infty} r_3((\mathbb{Z}/m\mathbb{Z})^n)^{1/n}$.

A three-term progression-free set in $(\mathbb{Z}/3\mathbb{Z})^n$ is called a cap or cap set. We next discuss an asymmetric variation on three-term progression-free sets, called tri-colored sum-free sets, which are potentially larger. They are interesting since all known upper bound techniques for the size of three-term progression-free sets turn out to be upper bounds on the size of tri-colored sum-free sets.

Definition 5.24. Let $G$ be an abelian group. Let $\Gamma \subseteq G \times G \times G$. For $i \in [3]$ we define the marginal sets $\Gamma_i = \{x \in G : \exists \alpha \in \Gamma,\ \alpha_i = x\}$. We say $\Gamma$ is tri-colored sum-free if the following holds. The set $\Gamma$ is a diagonal, and for any $\alpha \in \Gamma_1 \times \Gamma_2 \times \Gamma_3$: $\alpha_1 + \alpha_2 + \alpha_3 = 0$ if and only if $\alpha \in \Gamma$. (Recall that $\Gamma \subseteq I_1 \times I_2 \times I_3$ is a diagonal when any two distinct $\alpha, \beta \in \Gamma$ are distinct in all coordinates.) Let $s_3(G)$ be the size of the largest tri-colored sum-free set in $G \times G \times G$ and define the regularisation

\[ \tilde{s}_3(G) = \lim_{n \to \infty} s_3(G^{\times n})^{1/n}. \]

Equivalently, $\Gamma \subseteq G \times G \times G$ is a tri-colored sum-free set if and only if $\Gamma$ is a free diagonal in $\{\alpha \in G \times G \times G : \alpha_1 + \alpha_2 + \alpha_3 = 0\}$.


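If the set $A \subseteq G = (\mathbb{Z}/m\mathbb{Z})^n$ is three-term progression-free, then the set $\Gamma = \{(a, a, -2a) : a \in A\} \subseteq G \times G \times G$ is tri-colored sum-free. Therefore, we have $\tilde{r}_3(\mathbb{Z}/m\mathbb{Z}) \le \tilde{s}_3(\mathbb{Z}/m\mathbb{Z})$.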
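The passage from a progression-free set to a tri-colored sum-free set can be tested by brute force on small instances. This is a sketch with assumed names (`add`, `is_tricolored_sum_free`, `from_progression_free`) and an assumed toy example: a four-point cap set in $(\mathbb{Z}/3\mathbb{Z})^2$ and, for contrast, a full line.

```python
from itertools import product

def add(*vectors, m):
    """Coordinate-wise sum of vectors over Z/mZ."""
    return tuple(sum(coords) % m for coords in zip(*vectors))

def is_tricolored_sum_free(gamma, m):
    """gamma: a set of triples of vectors in (Z/mZ)^n."""
    gamma = set(gamma)
    n = len(next(iter(gamma))[0])
    zero = (0,) * n
    marginals = [{alpha[i] for alpha in gamma} for i in range(3)]
    # gamma is a diagonal iff each coordinate projection is injective
    is_diagonal = all(len(marginal) == len(gamma) for marginal in marginals)
    return is_diagonal and all(
        (add(*alpha, m=m) == zero) == (alpha in gamma)
        for alpha in product(*marginals))

def from_progression_free(A, m):
    """The set {(a, a, -2a) : a in A}."""
    return {(a, a, tuple(-2 * x % m for x in a)) for a in A}

cap = [(0, 0), (0, 1), (1, 0), (1, 1)]      # a cap set in (Z/3Z)^2
line = [(0, 0), (1, 0), (2, 0)]             # a full line, not a cap set
cap_ok = is_tricolored_sum_free(from_progression_free(cap, 3), 3)
line_ok = is_tricolored_sum_free(from_progression_free(line, 3), 3)
```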

We summarise the recent history of results on cap sets. For clarity we focus on $m = 3$; we refer the reader to the references for the general results. Edel in [Ede04] proved the lower bound $2.21739 \le \tilde{r}_3(\mathbb{Z}/3\mathbb{Z})$. In [EG17] Ellenberg and Gijswijt proved the upper bound

\[ \tilde{r}_3(\mathbb{Z}/3\mathbb{Z}) \le 3(207 + 33\sqrt{33})^{1/3}/8 \approx 2.755. \]

Blasiak et al. [BCC+17] proved that in fact

\[ \tilde{s}_3(\mathbb{Z}/3\mathbb{Z}) \le 3(207 + 33\sqrt{33})^{1/3}/8. \]

This upper bound was shown to be an equality in [KSS16, Nor16, Peb16].

Theorem 5.25. $\tilde{s}_3(\mathbb{Z}/3\mathbb{Z}) = 3(207 + 33\sqrt{33})^{1/3}/8$.

We reprove Theorem 5.25 by proving that $\tilde{s}_3(\mathbb{Z}/m\mathbb{Z})$ equals the asymptotic subrank $z(m)$ of $t_m$ discussed in Section 5.4.1, when $m$ is a prime power. The significance of our proof lies in the explicit connection to the framework of asymptotic spectra, and not in the obtained value, which also for prime powers $m$ was already computed in [BCC+17, KSS16, Nor16, Peb16].

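Proof. We will prove $\tilde{s}_3(\mathbb{Z}/m\mathbb{Z}) = z(m)$ when $m$ is a prime power. By definition, $\tilde{s}_3(\mathbb{Z}/m\mathbb{Z})$ equals the asymptotic subrank of the set

\[ \{\alpha \in \{0, \ldots, m-1\}^3 : \alpha_1 + \alpha_2 + \alpha_3 = 0 \bmod m\}, \]

which via $\alpha_3 \mapsto \alpha_3 - (m-1)$ we may identify with the set

\[ \Psi_m = \{\alpha \in \{0, \ldots, m-1\}^3 : \alpha_1 + \alpha_2 + \alpha_3 = m - 1 \bmod m\}, \]

and so $\tilde{s}_3(\mathbb{Z}/m\mathbb{Z}) = \tilde{Q}(\Psi_m)$. Let

\[ \Phi_m = \{\alpha \in \{0, \ldots, m-1\}^3 : \alpha_1 + \alpha_2 + \alpha_3 = m - 1\}. \]

We know $\tilde{Q}(\Phi_m) = z(m)$ (Section 5.4.1). We will show that $\tilde{Q}(\Phi_m) = \tilde{Q}(\Psi_m)$ when $m$ is a prime power. This proves the theorem.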

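We prove $\tilde{Q}(\Phi_m) \le \tilde{Q}(\Psi_m)$. There is a combinatorial degeneration $\Phi_m \trianglelefteq \Psi_m$. Indeed, let $u_i : \{0, \ldots, m-1\} \to \{0, \ldots, m-1\}$ be the identity map. If $\alpha \in \Phi_m$, then $\sum_{i=1}^3 u_i(\alpha_i) = m - 1$, and if $\alpha \in \Psi_m \setminus \Phi_m$, then $\sum_{i=1}^3 u_i(\alpha_i)$ equals $m - 1$ plus a positive multiple of $m$. (Subtracting the constant $m - 1$ from one of the maps brings this into the form of Definition 5.19.) This means Theorem 5.21 applies and we thus obtain $\tilde{Q}(\Phi_m) \le \tilde{Q}(\Psi_m)$. This proves the claim.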

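We show $\tilde{Q}(\Psi_m) \le \tilde{Q}(\Phi_m)$ when $m$ is a power of the prime $p$. Let $\mathbb{F} = \mathbb{F}_p$. Let $f_m \in \mathbb{F}^m \otimes \mathbb{F}^m \otimes \mathbb{F}^m$ have support $\Psi_m$ with all nonzero coefficients equal to 1. Obviously, $\tilde{Q}(\Psi_m) \le \tilde{Q}(f_m)$. To compute $\tilde{Q}(f_m)$ we show that there is a basis in which the support of $f_m$ equals the tight set $\Phi_m$. Then $\tilde{Q}(f_m) = \tilde{Q}(\Phi_m)$ (Corollary 5.4). This implies the claim. We prepare to give the basis (which is the same basis as used in [BCC+17]). First observe that the rule $x \mapsto \binom{x}{a}$ gives a well-defined map $\mathbb{Z}/m\mathbb{Z} \to \mathbb{Z}/p\mathbb{Z}$, since for $a \in \{0, 1, \ldots, m-1\}$, if $x = y \bmod m$, then $\binom{x}{a} = \binom{y}{a} \bmod p$ by Lucas' theorem. Let $(e_x)_x$ be the standard basis of $\mathbb{F}^m$. The elements $\bigl( \sum_{x \in \mathbb{Z}/m\mathbb{Z}} \binom{x}{a} e_x \bigr)_{a \in \mathbb{Z}/m\mathbb{Z}}$ form a basis of $\mathbb{F}^m$, since the matrix $\bigl( \binom{x}{a} \bigr)_{a,x}$ is upper triangular with ones on the diagonal. We will now rewrite $f_m$ in the basis $\bigl( (\sum_x \binom{x}{a} e_x)_a, (\sum_y \binom{y}{b} e_y)_b, (\sum_z \binom{z}{c} e_z)_c \bigr)$. Observe that $\binom{x}{m-1}$ equals 1 if and only if $x$ equals $m - 1$, and hence

\[ f_m = \sum_{\substack{x,y,z \in \mathbb{Z}/m\mathbb{Z}: \\ x+y+z = m-1}} e_x \otimes e_y \otimes e_z = \sum_{x,y,z \in \mathbb{Z}/m\mathbb{Z}} \binom{x+y+z}{m-1} e_x \otimes e_y \otimes e_z. \]

The identity $\binom{x+y+z}{w} = \sum \binom{x}{a}\binom{y}{b}\binom{z}{c}$, with sum over $a, b, c \in \{0, 1, \ldots, m-1\}$ such that $a + b + c = w$, is true, and thus

\[ \sum_{x,y,z \in \mathbb{Z}/m\mathbb{Z}} \binom{x+y+z}{m-1} e_x \otimes e_y \otimes e_z = \sum_{x,y,z \in \mathbb{Z}/m\mathbb{Z}} \ \sum_{\substack{a,b,c \in \{0,1,\ldots,m-1\}: \\ a+b+c = m-1}} \binom{x}{a}\binom{y}{b}\binom{z}{c}\, e_x \otimes e_y \otimes e_z. \tag{5.23} \]

We may simply rewrite (5.23) as

\[ \sum_{\substack{a,b,c \in \{0,1,\ldots,m-1\}: \\ a+b+c = m-1}} \ \sum_{x \in \mathbb{Z}/m\mathbb{Z}} \binom{x}{a} e_x \otimes \sum_{y \in \mathbb{Z}/m\mathbb{Z}} \binom{y}{b} e_y \otimes \sum_{z \in \mathbb{Z}/m\mathbb{Z}} \binom{z}{c} e_z. \]

Therefore, with respect to the basis $\bigl( (\sum_x \binom{x}{a} e_x)_a, (\sum_y \binom{y}{b} e_y)_b, (\sum_z \binom{z}{c} e_z)_c \bigr)$, the support of $f_m$ equals the tight set $\Phi_m$. (And even stronger, $f_m$ is isomorphic to the tensor $\mathbb{F}[x]/(x^m)$ of Section 5.4.1.)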
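The two arithmetic facts used in this basis change can be checked exhaustively for small prime powers. The sketch below is illustrative (the function name and the chosen values of $m = p^e$ are assumptions): it verifies that $\binom{x+y+z}{m-1} \bmod p$ is the indicator of $x + y + z \equiv m - 1 \pmod m$, and that the Vandermonde-type expansion over $a + b + c = m - 1$ holds over the integers.

```python
from itertools import product
from math import comb

def check_basis_change(p, e):
    """Verify both identities for m = p**e and all x, y, z in {0, ..., m-1}."""
    m = p ** e
    for x, y, z in product(range(m), repeat=3):
        s = x + y + z
        # coefficient of e_x (x) e_y (x) e_z in f_m, viewed in F_p
        coefficient = 1 if s % m == m - 1 else 0
        if comb(s, m - 1) % p != coefficient:
            return False
        # sum over a + b + c = m - 1 of C(x,a) C(y,b) C(z,c)
        expansion = sum(comb(x, a) * comb(y, b) * comb(z, m - 1 - a - b)
                        for a in range(m) for b in range(m - a))
        if expansion != comb(s, m - 1):
            return False
    return True

all_ok = all(check_basis_change(p, e)
             for p, e in [(2, 1), (3, 1), (2, 2), (5, 1), (2, 3), (3, 2)])
```

The first check is where the prime-power assumption enters: $m - 1$ has all base-$p$ digits equal to $p - 1$, which is what makes Lucas' theorem produce an indicator function.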

Remark 5.26. Why did we reprove the cap set result Theorem 5.25? Our motivation, being interested in the asymptotic spectrum of tensors, was to see if the techniques in the cap set papers are stronger than the Strassen support functionals, i.e. whether they give any new spectral points. Above we have seen that the cap set result itself can be proven with the support functionals. In fact, we show in Section 4.6 that for oblique tensors the asymptotic slice-rank, which was introduced in [Tao16] to give a concise proof of [EG17], equals the minimum value over the support functionals. In Section 6.11 we show that for all complex tensors, asymptotic slice-rank equals the minimum value of the quantum functionals.

55 Graph tensors 85

5.5 Graph tensors

In this section we briefly discuss the application that motivated us to prove Theorem 5.7 in [CVZ16], namely upper bounding the asymptotic rank of so-called graph tensors. Graph tensors are defined as follows.

Let $G = (V, E)$ be a graph (or hypergraph) with vertex set $V$ and edge set $E$. Let $n \in \mathbb{N}$. Let $(b_i)_{i \in [n]}$ be the standard basis of $\mathbb{F}^n$. We define the graph tensor $T_n(G)$ as

\[ T_n(G) = \sum_{i \in [n]^E} \ \bigotimes_{v \in V} \Bigl( \bigotimes_{e \in E:\, v \in e} b_{i_e} \Bigr), \]

seen as a $|V|$-tensor. Given a vertex $v \in V$, let $d(v)$ denote the degree of $v$, that is, $d(v)$ equals the number of edges $e \in E$ that contain $v$. Then $T_n(G)$ is naturally in $\bigotimes_{v \in V} (\mathbb{F}^n)^{\otimes d(v)}$. We write $T(G)$ for $T_2(G)$. For example, for the complete graph on four vertices $K_4$, the graph tensor is

[Diagrams not recoverable: $T(K_4)$ drawn as the tensor of the graph $K_4$ and as a tensor product of six graph tensors $T(\cdot)$, one per edge of $K_4$.]

\[ T(K_4) = \sum_{i \in \{0,1\}^6} (b_{i_1} \otimes b_{i_2} \otimes b_{i_5}) \otimes (b_{i_2} \otimes b_{i_3} \otimes b_{i_6}) \otimes (b_{i_3} \otimes b_{i_4} \otimes b_{i_5}) \otimes (b_{i_1} \otimes b_{i_4} \otimes b_{i_6}), \]

living in $(\mathbb{C}^8)^{\otimes 4}$. Let $K_k$ be the complete graph on $k$ vertices. The $2 \times 2$ matrix multiplication tensor $\langle 2,2,2 \rangle$ equals the tensor $T(K_3)$. Define the exponent $\omega(T(G)) = \log_2 \tilde{R}(T(G))$. We study the exponent per edge $\tau(T(G)) = \omega(T(G))/|E(G)|$.
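To make the definition concrete, the support of a graph tensor can be enumerated directly. The sketch below is illustrative only: the function name, the vertex labels and the encoding of each local basis vector as a tuple of edge labels are assumptions.

```python
from itertools import combinations, product

def graph_tensor_support(vertices, edges, n=2):
    """Nonzero positions of T_n(G); every nonzero coefficient equals 1."""
    support = []
    for labels in product(range(n), repeat=len(edges)):
        index = dict(zip(edges, labels))            # edge -> i_e
        support.append(tuple(
            tuple(index[e] for e in edges if v in e)   # local index at vertex v
            for v in vertices))
    return support

V = [1, 2, 3, 4]
E = list(combinations(V, 2))                        # the six edges of K_4
support = graph_tensor_support(V, E)
```

For $K_4$ with $n = 2$ this yields $2^6 = 64$ terms, and each vertex carries a triple of bits, matching the local dimension 8.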

Our result is an upper bound on $\tau(T(K_4))$ in terms of the combinatorial asymptotic subrank $\tilde{Q}(\Phi_{(2,2)})$, which we studied in Theorem 5.16.

Theorem 5.27. For any $q \ge 1$, $\tau(T(K_4)) \le \log_q \Bigl( \dfrac{q + 2}{\tilde{Q}(\Phi_{(2,2)})} \Bigr)$.

Proof. We apply a generalisation of the laser method. See [CVZ16].

Corollary 5.28. Let $k \ge 4$. Then $\tau(T(K_k)) \le 0.772943$.

Proof. In the bound of Theorem 5.27 we plug in the value $\tilde{Q}(\Phi_{(2,2)}) = 2$ from Theorem 5.16. Then we optimise over $q$ to obtain the value 0.772943. By a "covering argument" we can show that $\tau(T(K_k))$ is non-increasing when $k$ increases.
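The optimisation over $q$ is a one-line computation. The sketch below assumes that $q$ ranges over integers $q \ge 2$ (an assumption made here for illustration; the stated value 0.772943 is consistent with it) and uses hypothetical names.

```python
from math import log

def tau_bound(q, subrank=2.0):
    """Right-hand side log_q((q + 2) / subrank) of the bound on tau(T(K_4))."""
    return log((q + 2) / subrank) / log(q)

best_q = min(range(2, 100), key=tau_bound)
best_value = tau_bound(best_q)
```

Under this assumption the minimum is attained at $q = 7$, where the bound equals $\log_7 4.5$.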

For $k \ge 4$, Corollary 5.28 improves the upper bound $\tau(T(K_k)) \le 0.790955$ that can be derived from the well-known upper bound of Le Gall [LG14] on the exponent of matrix multiplication $\omega = \omega(T(K_3))$.


A standard “flattening argument” (i.e. using the gauge points from the asymptotic spectrum) yields the lower bound $\tau(T(K_k)) \ge \frac{1}{2} \frac{k}{k-1}$ if $k$ is even and $\tau(T(K_k)) \ge \frac{1}{2} \frac{k+1}{k}$ if $k$ is odd. As a consequence, if the exponent of matrix multiplication $\omega$ equals 2, then $\tau(T(K_4)) = \tau(T(K_3)) = \frac{2}{3}$. We raise the following question: is there a $k \ge 5$ such that $\tau(T(K_k)) < \frac{2}{3}$?
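The two lower bounds can be checked by counting edges. The sketch below assumes the flattening is taken along a balanced bipartition of the vertices of $K_k$, so that the flattening rank is $2$ to the number of crossing edges; dividing the number of crossing edges by $|E(K_k)|$ then gives the per-edge bound. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def flattening_lower_bound(k):
    """Crossing edges of a balanced cut of K_k divided by the number of edges."""
    left = set(range(k // 2))
    edges = list(combinations(range(k), 2))
    crossing = sum(1 for (u, v) in edges if (u in left) != (v in left))
    return Fraction(crossing, len(edges))
```

For $k = 3$ and $k = 4$ this returns $2/3$, and for $k \ge 5$ it returns a value strictly below $2/3$, which is why the question above is open to both answers.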

Tensor surgery; cycle graphs

For graph tensors given by sparse graphs, good upper bounds on the asymptotic rank can be obtained with an entirely different method called tensor surgery, which we introduced in [CZ18]. As an illustration, let me mention the results we obtained for cycle graphs with tensor surgery. Recall $\omega = \log_2 \tilde{R}(\langle 2, 2, 2 \rangle) = \log_2 \tilde{R}(T(C_3))$. Let $\omega_k = \log_2 \tilde{R}(T(C_k))$. First observe that $\omega_k = k$ for even $k$. For odd $k$, trivially $k - 1 \le \omega_k \le k$. We prove the following.

Theorem 5.29. For $k, \ell$ odd, $\omega_{k+\ell-1} \le \omega_k + \omega_\ell$.

Corollary 5.30. Let $k \ge 5$ odd. Then $\omega_k \le \omega_{k-2} + \omega_3$, and thus $\omega_k \le \frac{k-1}{2}\,\omega$.

Corollary 5.31. If $\omega = 2$, then $\omega_k = k - 1$ for all odd $k$.

See [CZ18] for the proofs.

5.6 Conclusion

Tight tensors are a subfamily of the oblique tensors. For tight 3-tensors, the minimum over the support functionals equals the asymptotic subrank. This is proven via the Coppersmith–Winograd method. The construction is in fact of a very combinatorial nature. In this chapter we studied the combinatorial notion of subrank. We proved that combinatorial subrank is monotone under combinatorial degeneration. We studied the cap set problem via the support functionals. We extended the Coppersmith–Winograd method to higher-order tensors and applied this method to study graph tensors.

Chapter 6

Universal points in the asymptotic spectrum of tensors; entanglement polytopes, moment polytopes

This chapter is based on joint work with Matthias Christandl and Peter Vrana [CVZ18].

6.1 Introduction

In Chapter 4, following Strassen, we introduced the asymptotic spectrum of tensors $X(\mathcal{T}) = X(\mathcal{T}, \leqslant)$ for $\mathcal{T}$ the semiring of $k$-tensors over $\mathbb{F}$, for some fixed integer $k$ and field $\mathbb{F}$, with addition given by direct sum $\oplus$, multiplication given by tensor product $\otimes$, and preorder $\leqslant$ given by restriction (or degeneration). The asymptotic spectrum characterises the asymptotic rank $\tilde{R}$ and the asymptotic subrank $\tilde{Q}$. We have seen that the asymptotic rank plays an important role in algebraic complexity theory: the asymptotic rank of the matrix multiplication tensor $\langle 2, 2, 2 \rangle = \sum_{i,j,k \in [2]} e_{ij} \otimes e_{jk} \otimes e_{ki} \in \mathbb{F}^4 \otimes \mathbb{F}^4 \otimes \mathbb{F}^4$ characterises the exponent $\omega$ of the arithmetic complexity of multiplying two $n \times n$ matrices over $\mathbb{F}$, that is, $\tilde{R}(\langle 2, 2, 2 \rangle) = 2^\omega$. We have also seen in Chapter 5 how one may use the asymptotic subrank to upper bound the size of combinatorial objects like, for example, cap sets in $\mathbb{F}_3^n$.

New results in this chapter

So far, the only elements we have seen in $X(\mathcal{T})$ (i.e. universal spectral points, cf. Section 2.13) are the gauge points (Section 4.3). Besides that, we have seen in Section 4.4 that the Strassen support functionals $\zeta^\theta$ are in $X(\{\text{oblique}\})$. In this chapter we introduce for the first time an explicit infinite family of universal spectral points (over the complex numbers): the quantum functionals. Our new insight is to use the moment polytope. Given a tensor $t \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$, the moment polytope $\mathbf{P}(t)$ is a convex polytope that carries representation-theoretic information about $t$. The quantum functionals are defined as maximisations over moment polytopes.

Let me immediately put a disclaimer. The quantum functionals do not give a new lower bound on the asymptotic rank of matrix multiplication $\langle 2, 2, 2 \rangle$; namely, the quantum functionals give the same lower bound as the gauge points. Also, the quantum functionals being defined for tensors over complex numbers only, we do not expect to get new upper bounds on the size of combinatorial objects that are “like cap sets”.

So what have we gained? Arguably, we have found the “right” viewpoint on how to construct universal spectral points for tensors. (In fact, after writing our paper [CVZ18] we realised that Strassen had begun a study of moment polytopes in the appendix of the German survey [Str05]. Strassen did not construct new universal spectral points, however; not in that publication at least.) If there are more universal spectral points, then our viewpoint may lead the way to finding them. Moreover, whereas no efficient algorithm is known for evaluating the support functionals, the moment polytope viewpoint may open the way to having efficient algorithms for evaluating the quantum functionals.

In Sections 6.2–6.7 we work towards the construction of the quantum functionals, and we give a proof that they are universal spectral points. In Sections 6.8–6.10 we compare the quantum functionals and the support functionals, and in Section 6.11 we relate asymptotic slice rank to the quantum functionals.

In this chapter we will focus on 3-tensors, but the theory naturally generalises to $k$-tensors.

6.2 Schur–Weyl duality

For background on representation theory we refer to [Kra84], [Ful97] and [GW09]. Let $S_n$ be the symmetric group on $n$ symbols. Let $S_n$ act on the tensor space $(\mathbb{C}^d)^{\otimes n}$ by permuting the tensor legs,

\[ \pi \cdot v_1 \otimes \cdots \otimes v_n = v_{\pi^{-1}(1)} \otimes \cdots \otimes v_{\pi^{-1}(n)}, \qquad \pi \in S_n. \]

Let $\mathrm{GL}_d$ be the general linear group of $\mathbb{C}^d$. Let $\mathrm{GL}_d$ act on $(\mathbb{C}^d)^{\otimes n}$ via the diagonal embedding $\mathrm{GL}_d \to \mathrm{GL}_d^{\times n} : g \mapsto (g, \ldots, g)$,

\[ g \cdot v_1 \otimes \cdots \otimes v_n = (g v_1) \otimes \cdots \otimes (g v_n), \qquad g \in \mathrm{GL}_d. \]

The actions of $S_n$ and $\mathrm{GL}_d$ commute, so we have a well-defined action of the product group $S_n \times \mathrm{GL}_d$ on $(\mathbb{C}^d)^{\otimes n}$. Schur–Weyl duality describes the decomposition of the space $(\mathbb{C}^d)^{\otimes n}$ into a direct sum of irreducible $S_n \times \mathrm{GL}_d$-representations. This decomposition is

\[ (\mathbb{C}^d)^{\otimes n} \cong \bigoplus_{\lambda \vdash_d n} [\lambda] \otimes \mathbb{S}_\lambda(\mathbb{C}^d), \tag{6.1} \]

with $[\lambda]$ an irreducible $S_n$-representation of type $\lambda$ and $\mathbb{S}_\lambda(\mathbb{C}^d)$ an irreducible $\mathrm{GL}_d$-representation of type $\lambda$ when $\ell(\lambda) \le d$ and $0$ when $\ell(\lambda) > d$. We use the notation $\lambda \vdash_d n$ for the partitions of $n$ with at most $d$ parts. Let

\[ P_\lambda : (\mathbb{C}^d)^{\otimes n} \to (\mathbb{C}^d)^{\otimes n} \]

be the equivariant projector onto the isotypical component of type $\lambda$, i.e. onto the subspace of $(\mathbb{C}^d)^{\otimes n}$ isomorphic to $[\lambda] \otimes \mathbb{S}_\lambda(\mathbb{C}^d)$. The projector $P_\lambda$ is given by the action of the group algebra element

\[ P_\lambda = \Bigl( \frac{\dim[\lambda]}{n!} \Bigr)^2 \sum_{T \in \mathrm{Tab}(\lambda)} c_T \in \mathbb{C}[S_n], \]

where $\mathrm{Tab}(\lambda)$ is the set of Young tableaux of shape $\lambda$ filled with $[n]$, and with $c_T$ the Young symmetrizer

\[ c_T = \sum_{\sigma \in C(T)} \mathrm{sgn}(\sigma)\, \sigma \sum_{\pi \in R(T)} \pi, \]

where $C(T), R(T) \subseteq S_n$ are the subgroups of permutations inside columns and permutations inside rows, respectively. The element $P_\lambda$ is a minimal central idempotent in $\mathbb{C}[S_n]$, and $\sum_{\lambda \vdash n} P_\lambda = e$.

Back to the decomposition of $(\mathbb{C}^d)^{\otimes n}$. We need a handle on the size of the components in the direct sum decomposition (6.1). For our application it is good to think of $d$ as a constant and $n$ as a large number. The number of summands in the direct sum decomposition (6.1) is upper bounded by a polynomial in $n$,

\[ |\{\lambda \vdash_d n\}| \le (n + 1)^d, \]

i.e. there are only few summands compared to the total dimension $d^n$. There are the following well-known bounds on the dimensions of the irreducible representations $[\lambda]$ and $\mathbb{S}_\lambda(\mathbb{C}^d)$ that make up the summands:

\[ \frac{n!}{\prod_{\ell=1}^{d} (\lambda_\ell + d - \ell)!} \le \dim[\lambda] \le \frac{n!}{\prod_{\ell=1}^{d} \lambda_\ell!}, \tag{6.2} \]

\[ \dim \mathbb{S}_\lambda(\mathbb{C}^d) \le (n + 1)^{d(d-1)/2}. \tag{6.3} \]

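Let $p \in \mathbb{R}^n$ be a probability vector, i.e. $\sum_{i=1}^n p_i = 1$ and $p_i \ge 0$ for $i \in [n]$. Let $H(p)$ be the Shannon entropy of the probability vector $p$,

\[ H(p) = \sum_{i=1}^{n} p_i \log_2 \frac{1}{p_i}. \]

For $\alpha \in [0, 1]$, let $h(\alpha) = H((\alpha, 1 - \alpha))$ be the binary entropy. For a partition $\lambda = (\lambda_1, \ldots, \lambda_\ell) \vdash n$, let $\bar\lambda = \lambda / n = (\lambda_1 / n, \ldots, \lambda_\ell / n)$ be the probability vector obtained by normalising $\lambda$.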
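These three definitions translate directly into code. The sketch below is only an illustration; the function names are hypothetical, and the convention $0 \log_2 (1/0) = 0$ is assumed for zero entries.

```python
import math

def shannon_entropy(p):
    """H(p) = sum_i p_i log2(1/p_i), with zero entries contributing 0."""
    return sum(pi * math.log2(1 / pi) for pi in p if pi > 0)

def binary_entropy(alpha):
    """h(alpha) = H((alpha, 1 - alpha))."""
    return shannon_entropy((alpha, 1 - alpha))

def normalise(partition):
    """The probability vector lambda / n for a partition lambda of n."""
    n = sum(partition)
    return [part / n for part in partition]
```

For instance, the partition $(2, 1, 1)$ of $4$ normalises to $(1/2, 1/4, 1/4)$, which has entropy $3/2$.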


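Let $\lambda \vdash n$. For $N \in \mathbb{N}$, let $N\lambda = (N\lambda_1, N\lambda_2, \ldots, N\lambda_\ell)$ be the stretched partition. We see that, asymptotically in the stretching factor $N$, the dimension of $[N\lambda]$ behaves like a multinomial coefficient, and

\[ 2^{N n H(\bar\lambda) - o(N)} \le \dim[N\lambda] \le 2^{N n H(\bar\lambda)}. \tag{6.4} \]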
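The bounds in (6.4) can be observed numerically for small stretched partitions. The sketch below computes $\dim[\lambda]$ with the hook length formula (a standard fact that is not used in the text above, swapped in here because it is easy to evaluate) and compares it with the exponent $n H(\bar\lambda)$. Function names are hypothetical.

```python
import math

def dim_specht(partition):
    """dim [lambda] via the hook length formula: n! / (product of hook lengths)."""
    n = sum(partition)
    hooks = 1
    for r, row in enumerate(partition):
        for c in range(row):
            arm = row - c - 1  # boxes to the right in the same row
            leg = sum(1 for lower in partition[r + 1:] if lower > c)  # boxes below
            hooks *= arm + leg + 1
    return math.factorial(n) // hooks

def entropy_exponent(partition):
    """n * H(lambda / n), the exponent in the upper bound of (6.4) with N = 1."""
    n = sum(partition)
    return sum(part * math.log2(n / part) for part in partition if part > 0)
```

For $\lambda = (2, 1)$ and stretching factor $N = 100$, the rate $\log_2 \dim[N\lambda] / (3N)$ is already close to $H((2/3, 1/3)) \approx 0.918$ while staying below it.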

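6.3 Kronecker and Littlewood–Richardson coefficients $g_{\lambda\mu\nu}$, $c^\lambda_{\mu\nu}$

Let $\mu, \nu \vdash n$. Let $S_n \to S_n \times S_n : \pi \mapsto (\pi, \pi)$ be the diagonal embedding. Consider the decomposition of the tensor product $[\mu] \otimes [\nu]$ restricted along the diagonal embedding,

\[ [\mu] \otimes [\nu] \downarrow^{S_n \times S_n}_{S_n} \cong \bigoplus_{\lambda \vdash n} \mathrm{Hom}_{S_n}([\lambda], [\mu] \otimes [\nu]) \otimes [\lambda]. \]

Define the Kronecker coefficient

\[ g_{\lambda\mu\nu} = \dim \mathrm{Hom}_{S_n}([\lambda], [\mu] \otimes [\nu]), \]

i.e. $g_{\lambda\mu\nu}$ is the multiplicity of $[\lambda]$ in $[\mu] \otimes [\nu]$.

Let $\lambda$ be a partition with at most $a + b$ parts. Let $\mathrm{GL}_a \times \mathrm{GL}_b \to \mathrm{GL}_{a+b} : (A, B) \mapsto A \oplus B$ be the block-diagonal embedding. Consider the decomposition of the representation $\mathbb{S}_\lambda(\mathbb{C}^{a+b})$ restricted along the block-diagonal embedding,

\[ \mathbb{S}_\lambda(\mathbb{C}^{a+b}) \downarrow^{\mathrm{GL}_{a+b}}_{\mathrm{GL}_a \times \mathrm{GL}_b} \cong \bigoplus_{\mu \vdash_a,\ \nu \vdash_b} H^\lambda_{\mu\nu} \otimes \mathbb{S}_\mu(\mathbb{C}^a) \otimes \mathbb{S}_\nu(\mathbb{C}^b), \]

with

\[ H^\lambda_{\mu\nu} = \mathrm{Hom}_{\mathrm{GL}_a \times \mathrm{GL}_b}\bigl(\mathbb{S}_\mu(\mathbb{C}^a) \otimes \mathbb{S}_\nu(\mathbb{C}^b),\ \mathbb{S}_\lambda(\mathbb{C}^{a+b})\bigr). \]

Define the Littlewood–Richardson coefficient $c^\lambda_{\mu\nu} = \dim H^\lambda_{\mu\nu}$.

For partitions $\lambda, \lambda'$, define $\lambda + \lambda'$ elementwise. The Kronecker and the Littlewood–Richardson coefficients have the following semigroup property (see e.g. [CHM07]).

Lemma 6.1. Let $\lambda, \mu, \nu, \alpha, \beta, \gamma$ be partitions.

(i) If $g_{\lambda\mu\nu} > 0$ and $g_{\alpha\beta\gamma} > 0$, then $g_{\lambda+\alpha,\, \mu+\beta,\, \nu+\gamma} > 0$.

(ii) If $c^\lambda_{\mu\nu} > 0$ and $c^\alpha_{\beta\gamma} > 0$, then $c^{\lambda+\alpha}_{\mu+\beta,\, \nu+\gamma} > 0$.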


6.4 Entropy inequalities

The semigroup properties imply the following lemma. Of this lemma, the first statement can be found in a paper by Christandl and Mitchison [CM06], while we do not know of any source that explicitly states the second statement. For the convenience of the reader, we give the proofs of both statements.

Lemma 6.2. Let $\lambda, \mu, \nu$ be partitions.

(i) If $g_{\lambda\mu\nu} > 0$, then $H(\bar\lambda) \le H(\bar\mu) + H(\bar\nu)$.

(ii) If $c^\lambda_{\mu\nu} > 0$, then $H(\bar\lambda) \le \frac{|\mu|}{|\mu| + |\nu|} H(\bar\mu) + \frac{|\nu|}{|\mu| + |\nu|} H(\bar\nu) + h\bigl(\frac{|\mu|}{|\mu| + |\nu|}\bigr)$.

Proof. (i) Let $g_{\lambda\mu\nu} > 0$. Suppose $\lambda, \mu, \nu \vdash n$. Let $N \in \mathbb{N}$. Then Lemma 6.1 implies $g_{N\lambda,\, N\mu,\, N\nu} > 0$. This means $\mathrm{Hom}_{S_{nN}}([N\lambda], [N\mu] \otimes [N\nu]) \neq 0$, which implies $\dim[N\lambda] \le \dim[N\mu] \dim[N\nu]$. From (6.4) we have the dimension bounds

\[ 2^{N n H(\bar\lambda) - o(N)} \le \dim[N\lambda], \qquad \dim[N\mu] \le 2^{N n H(\bar\mu)}, \qquad \dim[N\nu] \le 2^{N n H(\bar\nu)}. \]

Thus $N n H(\bar\lambda) - o(N) \le N n H(\bar\mu) + N n H(\bar\nu)$. Divide by $N n$ and let $N$ go to infinity to get $H(\bar\lambda) \le H(\bar\mu) + H(\bar\nu)$.

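(ii) We restrict the decomposition

\[ (\mathbb{C}^{a+b})^{\otimes n} \cong \bigoplus_{\lambda \vdash_{a+b} n} [\lambda] \otimes \mathbb{S}_\lambda(\mathbb{C}^{a+b}) \]

along the block-diagonal embedding to get

\begin{align*}
(\mathbb{C}^{a+b})^{\otimes n} \downarrow^{\mathrm{GL}_{a+b}}_{\mathrm{GL}_a \times \mathrm{GL}_b}
&\cong \bigoplus_{\lambda \vdash_{a+b} n} [\lambda] \otimes \mathbb{S}_\lambda(\mathbb{C}^{a+b}) \downarrow^{\mathrm{GL}_{a+b}}_{\mathrm{GL}_a \times \mathrm{GL}_b} \\
&\cong \bigoplus_{\lambda \vdash_{a+b} n} [\lambda] \otimes \bigoplus_{\mu \vdash_a,\ \nu \vdash_b} \mathbb{C}^{c^\lambda_{\mu\nu}} \otimes \mathbb{S}_\mu(\mathbb{C}^a) \otimes \mathbb{S}_\nu(\mathbb{C}^b) \\
&\cong \bigoplus_{\mu \vdash_a,\ \nu \vdash_b} \Bigl( \bigoplus_{\lambda \vdash_{a+b} n} [\lambda] \otimes \mathbb{C}^{c^\lambda_{\mu\nu}} \Bigr) \otimes \mathbb{S}_\mu(\mathbb{C}^a) \otimes \mathbb{S}_\nu(\mathbb{C}^b).
\end{align*}

On the other hand,

\begin{align*}
(\mathbb{C}^{a+b})^{\otimes n} \downarrow
&\cong (\mathbb{C}^a \oplus \mathbb{C}^b)^{\otimes n} \downarrow \\
&\cong (\mathbb{C}^a)^{\otimes n} \oplus \bigl((\mathbb{C}^a)^{\otimes n-1} \otimes \mathbb{C}^b\bigr) \oplus \cdots \oplus (\mathbb{C}^b)^{\otimes n} \downarrow \\
&\cong \bigoplus_{k=0}^{n} \mathbb{C}^{\binom{n}{k}} \otimes \bigoplus_{\mu \vdash_a k} \bigl([\mu] \otimes \mathbb{S}_\mu(\mathbb{C}^a)\bigr) \otimes \bigoplus_{\nu \vdash_b n-k} \bigl([\nu] \otimes \mathbb{S}_\nu(\mathbb{C}^b)\bigr) \\
&\cong \bigoplus_{k=0}^{n} \bigoplus_{\mu \vdash_a k,\ \nu \vdash_b n-k} \bigl( \mathbb{C}^{\binom{n}{k}} \otimes [\mu] \otimes [\nu] \bigr) \otimes \mathbb{S}_\mu(\mathbb{C}^a) \otimes \mathbb{S}_\nu(\mathbb{C}^b).
\end{align*}

Suppose $c^\lambda_{\mu\nu} > 0$. Comparing the above expressions gives the inequality $\dim[\lambda] \le \binom{n}{|\mu|} \dim[\mu] \dim[\nu]$. By the semigroup property (Lemma 6.1) we have $c^{N\lambda}_{N\mu,\, N\nu} > 0$ for all $N \in \mathbb{N}$. Thus $\dim[N\lambda] \le \binom{Nn}{N|\mu|} \dim[N\mu] \dim[N\nu]$ for all $N \in \mathbb{N}$. Then from (6.4), together with the bound $\binom{Nn}{N|\mu|} \le 2^{N n h(|\mu|/n)}$, follows

\[ 2^{N n H(\bar\lambda) - o(N)} \le 2^{N n h(|\mu|/n)}\, 2^{N |\mu| H(\bar\mu)}\, 2^{N |\nu| H(\bar\nu)}. \]

We conclude $H(\bar\lambda) \le h\bigl(\frac{|\mu|}{n}\bigr) + \frac{|\mu|}{n} H(\bar\mu) + \frac{|\nu|}{n} H(\bar\nu)$, where $n = |\mu| + |\nu|$.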

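Let $x = (x^{(1)}, x^{(2)}, x^{(3)})$ be a triple of probability vectors $x^{(i)} \in \mathbb{R}^{n_i}$. Let $\theta \in \Theta$ be a weighting. Let $H_\theta(x)$ be the $\theta$-weighted average of the Shannon entropies of the probability vectors $x^{(1)}$, $x^{(2)}$ and $x^{(3)}$,

\[ H_\theta(x) = \theta(1) H(x^{(1)}) + \theta(2) H(x^{(2)}) + \theta(3) H(x^{(3)}). \]

(Note that this notation is slightly different from the notation used in Chapter 4.) We will use the notation $\lambda \vdash_3 n$ to say that $\lambda$ is a triple of partitions of $n$, i.e. $\lambda$ equals $(\lambda^{(1)}, \lambda^{(2)}, \lambda^{(3)})$ where each $\lambda^{(i)}$ is a partition of $n$. We write $\bar\lambda$ for the normalised triple $(\overline{\lambda^{(1)}}, \overline{\lambda^{(2)}}, \overline{\lambda^{(3)}})$.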

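Lemma 6.3. Let $\lambda, \mu, \nu$ be three triples of partitions.

(i) If $g_{\lambda^{(i)} \mu^{(i)} \nu^{(i)}} > 0$ for all $i$, then $2^{H_\theta(\bar\lambda)} \le 2^{H_\theta(\bar\mu)}\, 2^{H_\theta(\bar\nu)}$.

(ii) If $\mu \vdash_3 m$, $\nu \vdash_3 n - m$ and $c^{\lambda^{(i)}}_{\mu^{(i)} \nu^{(i)}} > 0$ for all $i$, then $2^{H_\theta(\bar\lambda)} \le 2^{H_\theta(\bar\mu)} + 2^{H_\theta(\bar\nu)}$.

Proof. (i) Suppose $g_{\lambda^{(i)} \mu^{(i)} \nu^{(i)}} > 0$ for all $i$. Then $H(\overline{\lambda^{(i)}}) \le H(\overline{\mu^{(i)}}) + H(\overline{\nu^{(i)}})$ for all $i$ by Lemma 6.2. Thus $\sum_i \theta(i) H(\overline{\lambda^{(i)}}) \le \sum_i \theta(i) H(\overline{\mu^{(i)}}) + \sum_i \theta(i) H(\overline{\nu^{(i)}})$. Then $H_\theta(\bar\lambda) \le H_\theta(\bar\mu) + H_\theta(\bar\nu)$. We conclude $2^{H_\theta(\bar\lambda)} \le 2^{H_\theta(\bar\mu)}\, 2^{H_\theta(\bar\nu)}$.

(ii) Suppose $c^{\lambda^{(i)}}_{\mu^{(i)} \nu^{(i)}} > 0$ for all $i$. Then $H(\overline{\lambda^{(i)}}) \le \frac{m}{n} H(\overline{\mu^{(i)}}) + \frac{n-m}{n} H(\overline{\nu^{(i)}}) + h\bigl(\frac{m}{n}\bigr)$ by Lemma 6.2. We take the $\theta$-weighted average to get $H_\theta(\bar\lambda) \le \frac{m}{n} H_\theta(\bar\mu) + \frac{n-m}{n} H_\theta(\bar\nu) + h\bigl(\frac{m}{n}\bigr)$. We conclude $2^{H_\theta(\bar\lambda)} \le 2^{H_\theta(\bar\mu)} + 2^{H_\theta(\bar\nu)}$ by Lemma 4.9(iv).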

6.5 Hilbert spaces and density operators

Endow the vector space $\mathbb{C}^n$ with a hermitian inner product (one may take the standard hermitian inner product $\langle u, v \rangle = \sum_{i=1}^n \overline{u_i}\, v_i$ for $u, v \in \mathbb{C}^n$, where $\overline{\,\cdot\,}$ denotes taking the complex conjugate), so that it is a Hilbert space.


Let $(V_1, \langle \cdot, \cdot \rangle)$ and $(V_2, \langle \cdot, \cdot \rangle)$ be Hilbert spaces. On $V_1 \oplus V_2$ we define the inner product by $\langle u_1 \oplus u_2, v_1 \oplus v_2 \rangle = \langle u_1, v_1 \rangle + \langle u_2, v_2 \rangle$. On $V_1 \otimes V_2$ we define the inner product by $\langle u_1 \otimes u_2, v_1 \otimes v_2 \rangle = \langle u_1, v_1 \rangle \langle u_2, v_2 \rangle$ and extending linearly.

Let $V$ be a Hilbert space. A positive semidefinite hermitian operator $\rho : V \to V$ with trace one is called a density operator. The sequence of eigenvalues of a density operator $\rho$ is a probability vector. Let $\mathrm{spec}(\rho) = (p_1, \ldots, p_n)$ be the sequence of eigenvalues of $\rho$, ordered non-increasingly, $p_1 \ge \cdots \ge p_n$.

Let $V_1$ and $V_2$ be Hilbert spaces. Given a density operator $\rho$ on $V_1 \otimes V_2$, the reduced density operator $\rho_1 = \mathrm{tr}_2\, \rho$ is uniquely defined by the property that $\mathrm{tr}(\rho_1 X_1) = \mathrm{tr}(\rho (X_1 \otimes \mathrm{Id}_{V_2}))$ for all operators $X_1$ on $V_1$. The operator $\rho_1$ is again a density operator. The operation $\mathrm{tr}_2$ is called the partial trace over $V_2$. Explicitly, $\rho_1$ is given by $\langle e_i, \rho_1(e_j) \rangle = \sum_\ell \langle e_i \otimes f_\ell, \rho(e_j \otimes f_\ell) \rangle$, where the $e_i$ are some orthonormal basis of $V_1$ and the $f_\ell$ are some orthonormal basis of $V_2$ (the statement is independent of basis choice).

Let $V_i$ be a Hilbert space and consider the tensor product $V_1 \otimes V_2 \otimes V_3$. Associate with $t \in V_1 \otimes V_2 \otimes V_3$ the dual element $t^* = \langle t, \cdot \rangle \in (V_1 \otimes V_2 \otimes V_3)^*$. Then

\[ \rho^t = \frac{t\, t^*}{\langle t, t \rangle} = \frac{t \langle t, \cdot \rangle}{\langle t, t \rangle} \]

is a density operator on $V_1 \otimes V_2 \otimes V_3$. Viewing $\rho^t$ as a density operator on the regrouped space $V_1 \otimes (V_2 \otimes V_3)$, we may take the partial trace of $\rho^t$ over $V_2 \otimes V_3$ as described above. We denote the resulting density operator by $\rho^t_1 = \mathrm{tr}_{23}\, \rho^t$. We similarly define $\rho^t_2 = \mathrm{tr}_{13}\, \rho^t$ and $\rho^t_3 = \mathrm{tr}_{12}\, \rho^t$.

6.6 Moment polytopes $\mathbf{P}(t)$

We give a brief introduction to moment polytopes. We refer to [Nes84, Bri87, Fra02, Wal14] for more information. We begin with the general setting and then specialise to orbit closures in tensor spaces.

6.6.1 General setting

Let $G$ be a connected reductive algebraic group. (We refer to Kraft [Kra84] and Humphreys [Hum75] for an introduction to algebraic groups.) Fix a maximal torus $T \subseteq G$ and a Borel subgroup $T \subseteq B \subseteq G$. We have the character group $X(T)$, the Weyl group $W$, the root system $\Phi \subseteq X(T)$ and the system of positive roots $\Phi^+ \subseteq \Phi$. For $\lambda, \mu \in X(T)$ we set $\lambda \preccurlyeq \mu$ if $\mu - \lambda$ is a sum of positive roots. Let $V$ be a rational $G$-representation. The restriction of the action of $G$ to $T$ gives a decomposition

\[ V = \bigoplus_{\lambda \in X(T)} V_\lambda, \qquad V_\lambda = \{ v \in V : \forall t \in T,\ t \cdot v = \lambda(t) v \}. \]

This decomposition is called the weight decomposition of $V$. The $\lambda \in X(T)$ with $V_\lambda \neq 0$ are called the weights of $V$ with respect to $T$. The $V_\lambda$ are the weight spaces of $V$. For $v \in V$, let $v_\lambda$ be the component of $v$ in $V_\lambda$. Let $\mathrm{supp}(v) = \{ \lambda : v_\lambda \neq 0 \}$.

Let $E$ be the real vector space $E = X(T) \otimes \mathbb{R}$. The Weyl group $W$ acts on $X(T)$ and thus on $E$. We enlarge $\preccurlyeq$ to a partial order on $E$ as follows. For $x, y \in E$, let $x \preccurlyeq y$ if $y - x$ is a nonnegative linear combination of positive roots. Let $D \subseteq E$ be the positive Weyl chamber. For every $x \in E$, the orbit $W \cdot x$ intersects the positive Weyl chamber $D$ in exactly one point, which we denote by $\mathrm{dom}(x)$.

Let $V$ be a finite-dimensional rational $G$-module. Let $\chi \in X(T) \cap D$ be a dominant character. We denote the $\chi$-isotypical component of $V$ by $V_{(\chi)}$. Let $Z \subseteq V$ be a Zariski closed set. We denote the coordinate ring of $Z$ by $\mathbb{C}[Z]$. We denote the degree $d$ part of $\mathbb{C}[Z]$ by $\mathbb{C}[Z]_d$. If $Z$ is $G$-stable, then $\mathbb{C}[Z]_d$ is a $G$-module.

Definition 6.4. Let $V$ be a rational $G$-module and $Z \subseteq V$ a nontrivial irreducible closed $G$-stable cone. The moment polytope of $Z$, denoted by $\mathbf{P}(Z)$, is defined as the Euclidean closure in $E$ of the set

\[ R(Z) = \{ \chi / d : (\mathbb{C}[Z]_d)_{(\chi^*)} \neq 0 \} \]

of normalised characters $\chi / d$ for which the $\chi^*$-isotypical component $(\mathbb{C}[Z]_d)_{(\chi^*)}$ is not zero.

Theorem 6.5 (Mumford–Ness [Nes84], Brion [Bri87], Franz [Fra02]). The moment polytope is indeed a convex polytope, and it is equal to the image of the so-called moment map intersected with the positive Weyl chamber,

\[ \mathbf{P}(Z) = \mu(Z \setminus 0) \cap D. \]

Let $Z = \overline{G \cdot v}$ be the orbit closure (in the Zariski topology) of a vector $v \in V \setminus 0$, and suppose $\overline{G \cdot v}$ is a cone.

Lemma 6.6 (See e.g. [Str05]). Suppose $\overline{G \cdot v}$ is a cone. Then

\[ R(\overline{G \cdot v}) = \{ \chi / d : (\mathbb{C}[\overline{G \cdot v}]_d)_{(\chi^*)} \neq 0 \} = \{ \chi / d : (\mathrm{lin}(G \cdot v^{\otimes d}))_{(\chi)} \neq 0 \}. \]

6.6.2 Tensor spaces

We specialise to 3-tensors. Let $V = V_1 \otimes V_2 \otimes V_3$ with $V_i = \mathbb{C}^{n_i}$. Let

\[ G = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}, \qquad T = T_1 \times T_2 \times T_3, \]

with $T_i$ the diagonal matrices in $\mathrm{GL}_{n_i}$. The weight decomposition of $V$ is the decomposition with respect to the standard basis elements $e_{x_1} \otimes e_{x_2} \otimes e_{x_3}$, where $x \in [n_1] \times [n_2] \times [n_3]$. The support $\mathrm{supp}(v)$ is the support of $v$ with respect to the standard basis.

In the current setting there is a beautiful rephrasing of Theorem 6.5 in terms of ordered spectra of reduced density matrices. Recall from Section 6.5 that for $v \in V \setminus 0$ we have a density matrix $\rho^v$ and reduced density matrices $\rho^v_i$, of which we may take the non-increasingly ordered spectra $\mathrm{spec}(\rho^v_i)$.

Theorem 6.7 (Walter–Doran–Gross–Christandl [WDGC13]). Let $Z \subseteq V$ be a nontrivial irreducible closed $G$-stable cone. Then

\[ \mathbf{P}(Z) = \{ (\mathrm{spec}\, \rho^z_1, \mathrm{spec}\, \rho^z_2, \mathrm{spec}\, \rho^z_3) : z \in Z \setminus 0 \}. \]

Let $v \in V \setminus 0$. We consider the moment polytope of the orbit closure $Z = \overline{G \cdot v}$. In this setting, Lemma 6.6 specialises to the following.

Lemma 6.8 (See e.g. [Str05]).

\begin{align*}
R(\overline{G \cdot v}) &= \{ \chi / d : (\mathbb{C}[\overline{G \cdot v}]_d)_{(\chi^*)} \neq 0 \} \\
&= \{ \chi / d : (\mathrm{lin}(G \cdot v^{\otimes d}))_{(\chi)} \neq 0 \} \\
&= \{ \chi / d : P_\chi v^{\otimes d} \neq 0 \},
\end{align*}

where $P_\chi = P_{\chi^{(1)}} \otimes P_{\chi^{(2)}} \otimes P_{\chi^{(3)}}$, with $P_{\chi^{(i)}} : V_i^{\otimes d} \to V_i^{\otimes d}$ the projector onto the isotypical component of type $\chi^{(i)}$ discussed in Section 6.2.

On the other hand, Theorem 6.7 immediately gives a description of the moment polytope $\mathbf{P}(\overline{G \cdot v})$ in terms of ordered spectra of reduced density matrices.

Theorem 6.9. Let $v \in V \setminus 0$. Then

\[ \mathbf{P}(\overline{G \cdot v}) = \{ (\mathrm{spec}\, \rho^u_1, \mathrm{spec}\, \rho^u_2, \mathrm{spec}\, \rho^u_3) : u \in \overline{G \cdot v} \setminus 0 \}. \]

Summarising, we have two descriptions of the moment polytope: a representation-theoretic or invariant-theoretic description (Lemma 6.8) and a quantum marginal spectra description (Theorem 6.9). These two descriptions are the key to proving the properties of the quantum functionals that we need.

6.7 Quantum functionals $F^\theta(t)$

We will now define the quantum functionals and prove that they are universal spectral points.

Let $p = (p_1, \ldots, p_n) \in \mathbb{R}^n$ be a probability vector, i.e. $\sum_{i=1}^n p_i = 1$ and $p_i \ge 0$ for all $i \in [n]$. Recall that $H(p)$ denotes the Shannon entropy of the probability vector $p$, $H(p) = \sum_{i=1}^n p_i \log_2 (1 / p_i)$. Let $x = (x^{(1)}, x^{(2)}, x^{(3)})$ be a triple of probability vectors $x^{(i)} \in \mathbb{R}^{n_i}$. Let $\theta \in \Theta$ be a weighting. Recall that $H_\theta(x)$ denotes the $\theta$-weighted average of the Shannon entropies of the three probability vectors $x^{(1)}, x^{(2)}, x^{(3)}$,

\[ H_\theta(x) = \theta(1) H(x^{(1)}) + \theta(2) H(x^{(2)}) + \theta(3) H(x^{(3)}). \]

Let $V = \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$. Let $G = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$. Let $v \in V \setminus 0$. We use the notation $\mathbf{P}(v) = \mathbf{P}(\overline{G \cdot v})$ for the moment polytope of the orbit closure of $v$.

Definition 6.10. For $\theta \in \Theta$ and $v \in V \setminus 0$, let

\[ F^\theta(v) = \max \{ 2^{H_\theta(x)} : x \in \mathbf{P}(v) \}. \]

Let $F^\theta(0) = 0$. We call the functions $F^\theta$ the quantum functionals. The name quantum functional comes from the fact that the moment polytope $\mathbf{P}(t)$ consists of triples of quantum marginal entropies.

Theorem 6.11. Let $\mathcal{T}$ be the semiring of 3-tensors over $\mathbb{C}$. Let $\leqslant$ be the restriction preorder. For $\theta \in \Theta$,

\[ F^\theta \in X(\mathcal{T}, \leqslant). \]

In other words, $F^\theta$ is a semiring homomorphism $\mathcal{T} \to \mathbb{R}_{\ge 0}$ which is monotone under restriction $\leqslant$. In fact, $F^\theta$ is monotone under degeneration $\trianglelefteq$.

Remark 6.12. The results in this chapter generalise to $k$-tensors over $\mathbb{C}$. In our paper [CVZ18] we discuss this general situation in detail and make a distinction between upper quantum functionals and lower quantum functionals.

Let $p \in \mathbb{R}^n$ and $q \in \mathbb{R}^m$ be probability vectors. The tensor product $p \otimes q \in \mathbb{R}^{nm}$, defined by

\[ p \otimes q = (p_i q_j : i \in [n],\ j \in [m]), \]

is a probability vector. The direct sum $p \oplus q \in \mathbb{R}^{n+m}$ is defined by

\[ p \oplus q = (p_1, \ldots, p_n, q_1, \ldots, q_m); \]

for $\alpha \in [0, 1]$ the vector $\alpha p \oplus (1 - \alpha) q$ is a probability vector.

Let $x = (x^{(1)}, x^{(2)}, x^{(3)})$ and $y = (y^{(1)}, y^{(2)}, y^{(3)})$ be triples of probability vectors. We define the tensor product $x \otimes y$ elementwise,

\[ x \otimes y = (x^{(1)} \otimes y^{(1)},\ x^{(2)} \otimes y^{(2)},\ x^{(3)} \otimes y^{(3)}). \]

We define the direct sum $x \oplus y$ elementwise,

\[ x \oplus y = (x^{(1)} \oplus y^{(1)},\ x^{(2)} \oplus y^{(2)},\ x^{(3)} \oplus y^{(3)}). \]

For $x \otimes y$ and $x \oplus y$ to be in the moment polytope we will need to reorder the components non-increasingly. For a triple of probability vectors $x = (x^{(1)}, x^{(2)}, x^{(3)})$, let $\mathrm{dom}(x)$ be the triple of probability vectors obtained from $x$ by reordering the components of each $x^{(i)}$ such that they become non-increasing. Let $\mathrm{dom}(S) = \{ \mathrm{dom}(x) : x \in S \}$.

For $v \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$ we will use the notation $G(v) = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$ to denote the group that naturally corresponds to the space that $v$ lives in. We will use the notation $\mathbf{P}(v) = \mathbf{P}(\overline{G(v) \cdot v})$ for the moment polytope of the orbit closure of $v$.
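The operations on probability vectors defined above, and the entropy identities they satisfy, can be sketched as follows. This is an illustration under assumptions: probability vectors are plain Python lists, all function names are hypothetical, and `best_alpha` encodes the standard identity $\max_\alpha 2^{\alpha a + (1-\alpha) b + h(\alpha)} = 2^a + 2^b$, which is how we read later appeals to Lemma 4.9(iv) (that reading is an assumption, since Lemma 4.9 is not reproduced here).

```python
import math

def H(p):
    """Shannon entropy in bits, with zero entries contributing 0."""
    return sum(pi * math.log2(1 / pi) for pi in p if pi > 0)

def tensor(p, q):
    """p (x) q = (p_i q_j)."""
    return [pi * qj for pi in p for qj in q]

def weighted_direct_sum(alpha, p, q):
    """alpha*p (+) (1 - alpha)*q, which is again a probability vector."""
    return [alpha * pi for pi in p] + [(1 - alpha) * qj for qj in q]

def dom(p):
    """Reorder the entries non-increasingly."""
    return sorted(p, reverse=True)

def best_alpha(a, b):
    """Maximiser of alpha*a + (1 - alpha)*b + h(alpha) over alpha in [0, 1]."""
    return 2 ** a / (2 ** a + 2 ** b)
```

Entropy is additive under `tensor`, picks up the extra term $h(\alpha)$ under `weighted_direct_sum`, and is unchanged by `dom`; these are the facts used when passing from moment polytopes to the functionals $F^\theta$.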

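Theorem 6.13. Let $s \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$ and $t \in \mathbb{C}^{m_1} \otimes \mathbb{C}^{m_2} \otimes \mathbb{C}^{m_3}$.

(i) $\mathrm{dom}\bigl(\mathbf{P}(s) \otimes \mathbf{P}(t)\bigr) \subseteq \mathbf{P}(s \otimes t)$.

(ii) $\forall \alpha \in [0, 1]$: $\mathrm{dom}\bigl(\alpha \mathbf{P}(s) \oplus (1 - \alpha) \mathbf{P}(t)\bigr) \subseteq \mathbf{P}(s \oplus t)$.

(iii) If $s, t \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3} \setminus 0$ and $s \in \overline{G(t) \cdot t}$, then $\mathbf{P}(s) \subseteq \mathbf{P}(t)$.

(iv) $\mathbf{P}(s \oplus 0) = \mathbf{P}(s) \oplus 0$.

(v) $\mathbf{P}(\langle 1 \rangle) = \{((1), (1), (1))\}$, with $\langle 1 \rangle = e_1 \otimes e_1 \otimes e_1 \in \mathbb{C}^1 \otimes \mathbb{C}^1 \otimes \mathbb{C}^1$.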

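Proof. To prove statements (i) and (ii), let $x \in \mathbf{P}(s)$ and $y \in \mathbf{P}(t)$. Then there are elements $a \in \overline{G(s) \cdot s}$ and $b \in \overline{G(t) \cdot t}$ with ordered marginal spectra $x$ and $y$,

\[ x = (\mathrm{spec}\, \rho^a_1, \mathrm{spec}\, \rho^a_2, \mathrm{spec}\, \rho^a_3), \qquad y = (\mathrm{spec}\, \rho^b_1, \mathrm{spec}\, \rho^b_2, \mathrm{spec}\, \rho^b_3). \]

We prove statement (i). We have $a \otimes b \in \overline{G(s \otimes t) \cdot s \otimes t}$. Thus

\[ \mathrm{dom}(x \otimes y) = (\mathrm{spec}\, \rho^{a \otimes b}_1, \mathrm{spec}\, \rho^{a \otimes b}_2, \mathrm{spec}\, \rho^{a \otimes b}_3) \in \mathbf{P}(s \otimes t). \]

We conclude $\mathrm{dom}(\mathbf{P}(s) \otimes \mathbf{P}(t)) \subseteq \mathbf{P}(s \otimes t)$.

We prove statement (ii). Let $\alpha \in [0, 1]$. Define the tensor $u(\alpha) \in \mathbb{C}^{n_1 + m_1} \otimes \mathbb{C}^{n_2 + m_2} \otimes \mathbb{C}^{n_3 + m_3}$ by

\[ u(\alpha) = \frac{\sqrt{\alpha}}{\sqrt{\langle a, a \rangle}}\, a \oplus \frac{\sqrt{1 - \alpha}}{\sqrt{\langle b, b \rangle}}\, b. \]

Then $u(\alpha) \in \overline{G(s \oplus t) \cdot s \oplus t}$. We have $\rho^{u(\alpha)}_i = \alpha \rho^a_i \oplus (1 - \alpha) \rho^b_i$. From the observation

\[ \mathrm{spec}(\alpha \rho^a_i \oplus (1 - \alpha) \rho^b_i) = \mathrm{dom}(\alpha x^{(i)} \oplus (1 - \alpha) y^{(i)}) \]

follows $\mathrm{dom}(\alpha x \oplus (1 - \alpha) y) \in \mathbf{P}(\overline{G(s \oplus t) \cdot s \oplus t})$. We conclude

\[ \mathrm{dom}(\alpha \mathbf{P}(s) \oplus (1 - \alpha) \mathbf{P}(t)) \subseteq \mathbf{P}(s \oplus t). \]

We have thus proven statements (i) and (ii).

We prove statement (iii). Let $G = G(t) = G(s)$. Let $s \in \overline{G \cdot t}$. Then $\overline{G \cdot s} \subseteq \overline{G \cdot t}$, so we have a $G$-equivariant restriction map $\mathbb{C}[\overline{G \cdot s}] \twoheadleftarrow \mathbb{C}[\overline{G \cdot t}]$ on the coordinate rings. Let $\chi / d \in R(\overline{G \cdot s})$ with $(\mathbb{C}[\overline{G \cdot s}]_d)_{(\chi^*)} \neq 0$. Then also $(\mathbb{C}[\overline{G \cdot t}]_d)_{(\chi^*)} \neq 0$ by Schur’s lemma. Thus $\chi / d \in R(\overline{G \cdot t}) \subseteq \mathbf{P}(\overline{G \cdot t})$. We conclude $\mathbf{P}(s) \subseteq \mathbf{P}(t)$.

We prove statement (iv). Let $\chi / d \in R(\overline{G(s \oplus 0) \cdot (s \oplus 0)})$ with $P_\chi (s \oplus 0)^{\otimes d} \neq 0$. Recall from Section 6.2 that $P_\chi$ is given by the action of an element in the group algebra $\mathbb{C}[S_d]$, which we also denoted by $P_\chi$. From this viewpoint we see that also $P_\chi s^{\otimes d} \neq 0$. So $\chi / d \in R(\overline{G(s) \cdot s})$.

Statement (v) is a direct observation.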

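Corollary 6.14.

(i) $F^\theta(s) F^\theta(t) \le F^\theta(s \otimes t)$.

(ii) $F^\theta(s) + F^\theta(t) \le F^\theta(s \oplus t)$.

(iii) If $s \trianglelefteq t$, then $F^\theta(s) \le F^\theta(t)$.

(iv) $F^\theta(\langle 1 \rangle) = 1$.

Proof. (i) Let $x \in \mathbf{P}(s)$ and $y \in \mathbf{P}(t)$. Then $\mathrm{dom}(x \otimes y) \in \mathbf{P}(s \otimes t)$ by Theorem 6.13. It is a basic fact that $H_\theta(x) + H_\theta(y) = H_\theta(x \otimes y) = H_\theta(\mathrm{dom}(x \otimes y))$ (Lemma 4.9), so $2^{H_\theta(x)}\, 2^{H_\theta(y)} = 2^{H_\theta(x \otimes y)}$. We conclude $F^\theta(s) F^\theta(t) \le F^\theta(s \otimes t)$.

(ii) Let $x \in \mathbf{P}(s)$ and $y \in \mathbf{P}(t)$. Then, by Theorem 6.13, for all $\alpha \in [0, 1]$,

\[ \mathrm{dom}(\alpha x \oplus (1 - \alpha) y) \in \mathbf{P}(s \oplus t). \]

It is a basic fact that $\alpha H_\theta(x) + (1 - \alpha) H_\theta(y) + h(\alpha) = H_\theta(\alpha x \oplus (1 - \alpha) y)$ (Lemma 4.9). Thus for any $\alpha \in [0, 1]$ we have $2^{\alpha H_\theta(x) + (1 - \alpha) H_\theta(y) + h(\alpha)} \le F^\theta(s \oplus t)$. Using Lemma 4.9(iv), we conclude $F^\theta(s) + F^\theta(t) \le F^\theta(s \oplus t)$.

(iii) This follows from statements (iii) and (iv) of Theorem 6.13, since by definition degeneration $s \trianglelefteq t$ means $s \oplus 0 \in \overline{G(t \oplus 0) \cdot (t \oplus 0)}$.

(iv) This follows from statement (v) of Theorem 6.13.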


Theorem 6.15.

(i) $R(s \otimes t) \subseteq \{ \lambda / N : \exists\, \mu / N \in R(s),\ \nu / N \in R(t) : g_{\lambda^{(i)} \mu^{(i)} \nu^{(i)}} > 0 \text{ for all } i \}$.

(ii) $R(s \oplus t) \subseteq \{ \lambda / N : \exists\, \mu / m \in R(s),\ \nu / (N - m) \in R(t) : c^{\lambda^{(i)}}_{\mu^{(i)} \nu^{(i)}} > 0 \text{ for all } i \}$.

Proof. (i) Let $s \in V_1 \otimes V_2 \otimes V_3$ and let $t \in W_1 \otimes W_2 \otimes W_3$. Let $\lambda / N \in R(s \otimes t)$ with $P_\lambda (s \otimes t)^{\otimes N} \neq 0$. Let $\pi$ be the natural reordering map

\[ \pi : \bigl((V_1 \otimes W_1) \otimes (V_2 \otimes W_2) \otimes (V_3 \otimes W_3)\bigr)^{\otimes N} \to (V_1 \otimes V_2 \otimes V_3)^{\otimes N} \otimes (W_1 \otimes W_2 \otimes W_3)^{\otimes N}. \]

Then

\[ (s \otimes t)^{\otimes N} = \sum_{\mu, \nu} \pi^{-1} (P_\mu \otimes P_\nu) \pi (s \otimes t)^{\otimes N}. \]

Let $\mu, \nu \vdash_3 N$ with $P_\lambda \pi^{-1} (P_\mu \otimes P_\nu) \pi (s \otimes t)^{\otimes N} \neq 0$. Then $P_\mu s^{\otimes N} \neq 0$ and $P_\nu t^{\otimes N} \neq 0$, i.e. $\mu / N \in R(s)$ and $\nu / N \in R(t)$. Moreover, $P_\lambda \pi^{-1} (P_\mu \otimes P_\nu) \pi \neq 0$, which means the Kronecker coefficients $g_{\lambda^{(i)} \mu^{(i)} \nu^{(i)}}$ are nonzero.

(ii) Let $\lambda / N \in R(s \oplus t)$ with $P_\lambda (s \oplus t)^{\otimes N} \neq 0$. Let us expand $(s \oplus t)^{\otimes N}$ as

\[ (s \oplus t)^{\otimes N} = s^{\otimes N} \oplus (s^{\otimes N-1} \otimes t) \oplus \cdots \oplus t^{\otimes N}. \]

Then $P_\lambda$ does not vanish on some summand, which we may assume to be of the form $s^{\otimes m} \otimes t^{\otimes N-m}$. Let $\pi$ be the natural projection

\[ \pi : \bigl((V_1 \oplus W_1) \otimes (V_2 \oplus W_2) \otimes (V_3 \oplus W_3)\bigr)^{\otimes N} \to (V_1 \otimes V_2 \otimes V_3)^{\otimes m} \otimes (W_1 \otimes W_2 \otimes W_3)^{\otimes N-m}. \]

Let $\mu, \nu$ with $P_\lambda \pi^{-1} (P_\mu \otimes P_\nu) \pi (s \oplus t)^{\otimes N} \neq 0$. Then $P_\mu s^{\otimes m} \neq 0$ and $P_\nu t^{\otimes N-m} \neq 0$. Moreover, $P_\lambda \pi^{-1} (P_\mu \otimes P_\nu) \pi \neq 0$. Therefore the Littlewood–Richardson coefficients $c^{\lambda^{(i)}}_{\mu^{(i)} \nu^{(i)}}$ are nonzero.

Corollary 6.16.

(i) $F^\theta(s \otimes t) \le F^\theta(s) F^\theta(t)$.

(ii) $F^\theta(s \oplus t) \le F^\theta(s) + F^\theta(t)$.

Proof. (i) Let $\lambda / N \in R(s \otimes t)$. By Theorem 6.15 there is a $\mu / N \in R(s)$ and a $\nu / N \in R(t)$ such that the Kronecker coefficient $g_{\lambda^{(i)} \mu^{(i)} \nu^{(i)}}$ is nonzero for every $i$. Then $2^{H_\theta(\bar\mu)} \le F^\theta(s)$ and $2^{H_\theta(\bar\nu)} \le F^\theta(t)$ by definition of $F^\theta$. The Kronecker coefficients being nonzero implies

\[ 2^{H_\theta(\bar\lambda)} \le 2^{H_\theta(\bar\mu)}\, 2^{H_\theta(\bar\nu)} \]

by Lemma 6.3. We conclude $F^\theta(s \otimes t) \le F^\theta(s) F^\theta(t)$.

(ii) Let $\lambda / N \in R(s \oplus t)$. Then by Theorem 6.15 there are $\mu / m \in R(s)$ and $\nu / (N - m) \in R(t)$ such that the Littlewood–Richardson coefficient $c^{\lambda^{(i)}}_{\mu^{(i)} \nu^{(i)}}$ is nonzero for every $i$. This means

\[ 2^{H_\theta(\bar\lambda)} \le 2^{H_\theta(\bar\mu)} + 2^{H_\theta(\bar\nu)} \]

by Lemma 6.3. We conclude $F^\theta(s \oplus t) \le F^\theta(s) + F^\theta(t)$.

Proof of Theorem 6.11. Corollary 6.14 and Corollary 6.16 together prove Theorem 6.11.

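6.8 Outer approximation

In this section we discuss an outer approximation of $\mathbf{P}(t)$. We will use this outer approximation to show that the quantum functionals are at most the support functionals.

Let $\preccurlyeq$ be the dominance order, i.e. majorization order, on triples of probability vectors. For any set $S \subseteq \mathbb{R}^{n_1} \times \mathbb{R}^{n_2} \times \mathbb{R}^{n_3}$ of triples of probability vectors, let $S^{\preccurlyeq}$ denote the upward closure with respect to $\preccurlyeq$,

\[ S^{\preccurlyeq} = \{ y \in \mathbb{R}^{n_1} \times \mathbb{R}^{n_2} \times \mathbb{R}^{n_3} : \exists x \in S,\ x \preccurlyeq y \}. \]

Let $\mathrm{conv}(S)$ denote the convex hull of $S$ in $\mathbb{R}^{n_1} \times \mathbb{R}^{n_2} \times \mathbb{R}^{n_3}$. Recall that for $x \in S$ we defined $\mathrm{dom}(x)$ as the triple of probability vectors obtained from $x = (x^{(1)}, x^{(2)}, x^{(3)})$ by reordering the components of each $x^{(i)}$ such that they become non-increasing, and $\mathrm{dom}(S) = \{ \mathrm{dom}(x) : x \in S \}$.

Theorem 6.17 (Strassen [Str05]). Let $v \in V \setminus 0$. Then

\[ \mathbf{P}(v) \subseteq (\mathrm{dom}\, \mathrm{conv}\, \mathrm{supp}\, v)^{\preccurlyeq}. \tag{6.5} \]

Proof. We give the proof for the convenience of the reader. Let $\chi / d \in R(\overline{G \cdot v})$. Then $(\mathrm{lin}(G \cdot v^{\otimes d}))_{(\chi)} \neq 0$. Let $M_\chi \subseteq \mathrm{lin}(G \cdot v^{\otimes d})$ be a simple $G$-submodule with highest weight $\chi$. Let $N \subseteq V^{\otimes d}$ be the $G$-module complement, $N \oplus M_\chi = V^{\otimes d}$. Then $v^{\otimes d}$ is not in $N$. Let $v = \bigoplus_{\gamma \in \mathrm{supp}\, v} v_\gamma$ be the weight decomposition. Then $v^{\otimes d}$ is a sum of tensor products of the $v_\gamma$. At least one summand is not in $N$, say of weight $\eta = \sum_\gamma d_\gamma \gamma$ with $\sum_\gamma d_\gamma = d$. The projection $V^{\otimes d} \to M_\chi$ along $N$ maps this summand onto a nonzero weight vector of weight $\eta$. So $\eta$ is a weight of $M_\chi$. Then also $\mathrm{dom}(\eta)$ is a weight of $M_\chi$. Since $\chi$ is the highest weight of $M_\chi$, $\mathrm{dom}(\eta) \preccurlyeq \chi$. Then $\mathrm{dom}(\eta / d) \preccurlyeq \chi / d$. We have $\eta / d = \sum_\gamma \frac{d_\gamma}{d} \gamma \in \mathrm{conv}\, \mathrm{supp}\, v$. We conclude $R(\overline{G \cdot v}) \subseteq (\mathrm{dom}\, \mathrm{conv}\, \mathrm{supp}\, v)^{\preccurlyeq}$ and thus $\mathbf{P}(\overline{G \cdot v}) \subseteq (\mathrm{dom}\, \mathrm{conv}\, \mathrm{supp}\, v)^{\preccurlyeq}$.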


6.9 Inner approximation for free tensors

In this section we discuss an inner approximation for the moment polytope of a free tensor. We will use this inner approximation in the next section to prove that the quantum functionals coincide with the support functionals when restricted to free tensors. We will prove that not all tensors are free.

We say a set $\Phi \subseteq [n_1] \times [n_2] \times [n_3]$ is free if every two different elements of $\Phi$ differ in at least two coordinates, in other words, if the elements of $\Phi$ have Hamming distance at least two. We say $v \in V = \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$ is free if for some $g \in G(v) = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$ the support $\mathrm{supp}(g \cdot v) \subseteq [n_1] \times [n_2] \times [n_3]$ is free. (Free is called schlicht in [Str05].)

Theorem 6.18 (Strassen [Str05]). Let $v \in V \setminus 0$ with $\mathrm{supp}(v)$ free. Then

\[ \mathrm{dom}\, \mathrm{conv}\, \mathrm{supp}\, v \subseteq \mathbf{P}(v). \]

Proof. We refer to [Str05].

Corollary 6.19. Let $v \in V \setminus 0$ with $\mathrm{supp}(v)$ free. Then

\[ \mathbf{P}(v)^{\preccurlyeq} = (\mathrm{dom}\, \mathrm{conv}\, \mathrm{supp}\, v)^{\preccurlyeq}. \]

Proof. By Theorem 6.18, $\mathrm{dom}\, \mathrm{conv}\, \mathrm{supp}\, v \subseteq \mathbf{P}(v)$. We take the upward closure on both sides to get $(\mathrm{dom}\, \mathrm{conv}\, \mathrm{supp}\, v)^{\preccurlyeq} \subseteq \mathbf{P}(v)^{\preccurlyeq}$. On the other hand, from Theorem 6.17 follows $\mathbf{P}(v)^{\preccurlyeq} \subseteq (\mathrm{dom}\, \mathrm{conv}\, \mathrm{supp}\, v)^{\preccurlyeq}$.

Remark 6.20. Recall that $v \in V$ is oblique if the support $\mathrm{supp}(g \cdot v)$ is an antichain for some $g \in G(v)$ (Section 4.4). Such antichains are free, so oblique tensors are free. Thus $\{\text{tight}\} \subseteq \{\text{oblique}\} \subseteq \{\text{free}\}$. Like the tight tensors and oblique tensors, free tensors form a semigroup under $\otimes$ and $\oplus$.

Proposition 6.21. For $n \ge 5$ there exists a tensor that is not free in $\mathbb{C}^n \otimes \mathbb{C}^n \otimes \mathbb{C}^n$.

Proof. First, we upper bound the maximal size of a free support. Let $\Phi \subseteq [n] \times [n] \times [n]$ be free. Any two distinct elements in $\Phi$ are still distinct if we forget the third coordinate of each. Therefore $|\Phi| = |\{ (\alpha_1, \alpha_2) : \alpha \in \Phi \}| \le n^2$. (This is a special case of the Singleton bound [Sin64] from coding theory. This upper bound is tight, since $\Phi = \{ (a, b, c) : a, b, c \in [n],\ c = a + b \bmod n \}$ is free and has size $n^2$.) Second, we apply the following observation of Bürgisser [Bur90, page 3]. Let

\[ Z_n = \{ t \in \mathbb{C}^n \otimes \mathbb{C}^n \otimes \mathbb{C}^n : \exists g \in G(t),\ |\mathrm{supp}(g \cdot t)| < n^3 - 3n^2 \}. \]

Let $Y_n = \mathbb{C}^n \otimes \mathbb{C}^n \otimes \mathbb{C}^n \setminus Z_n$. Then the set $Y_n$ is Zariski open and nonempty. Now let $n \ge 5$ and let $t \in Y_n$. Then $\forall g \in G(t)$, $|\mathrm{supp}(g \cdot t)| \ge n^3 - 3n^2 > n^2$. We conclude $t$ is not free.
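The combinatorial part of this proof is easy to test on small cases. The sketch below checks freeness of a support by brute force and builds the support $\{(a, b, a + b \bmod n)\}$ from the proof; indices run over $0, \ldots, n-1$ rather than $[n]$, and the function names are hypothetical.

```python
from itertools import combinations

def is_free(support):
    """True if any two distinct triples differ in at least two coordinates."""
    return all(
        sum(a != b for a, b in zip(x, y)) >= 2
        for x, y in combinations(support, 2)
    )

def latin_square_support(n):
    """The support {(a, b, a + b mod n)} of size n^2, with 0-based indices."""
    return [(a, b, (a + b) % n) for a in range(n) for b in range(n)]
```

The inequality $n^3 - 3n^2 > n^2$ used at the end of the proof holds exactly when $n > 4$, which is where the hypothesis $n \ge 5$ comes from.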


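6.10 Quantum functionals versus support functionals

We discussed the support functionals $\zeta^\theta \in \mathbf{X}(\{\text{oblique 3-tensors over } \mathbb{F}\})$ in Chapter 4. We recall their definition over $\mathbb{C}$. Let $V = \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$. For $\theta \in \Theta = \mathcal{P}([3])$ and $t \in V \setminus 0$ with $\mathrm{supp}(t)$ oblique,

\[ \zeta^\theta(t) = \max\{ 2^{H_\theta(P)} : P \in \mathcal{P}(\mathrm{supp}(t)) \}. \]

We also discussed an extension of $\zeta^\theta$ to all 3-tensors over $\mathbb{C}$, the upper support functional

\[ \zeta^\theta(t) = \min_{g \in G(t)} \max\{ 2^{H_\theta(P)} : P \in \mathcal{P}(\mathrm{supp}(g \cdot t)) \}. \]

We know $\zeta^\theta(s \otimes t) \leq \zeta^\theta(s)\,\zeta^\theta(t)$, $\zeta^\theta(s \oplus t) = \zeta^\theta(s) + \zeta^\theta(t)$, $\zeta^\theta(\langle 1 \rangle) = 1$, and $s \leq t \Rightarrow \zeta^\theta(s) \leq \zeta^\theta(t)$ for any $s, t \in V$.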

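The set $\mathrm{conv}\,\mathrm{supp}(g \cdot t)$ is the set of marginals of probability distributions on $\mathrm{supp}(g \cdot t)$. Thus $\mathrm{dom}\,\mathrm{conv}\,\mathrm{supp}(g \cdot t)$ is the set of ordered marginals of probability distributions on $\mathrm{supp}(g \cdot t)$. Therefore

\[ \zeta^\theta(t) = \min_{g \in G(t)} \max_{x \in S(g \cdot t)} 2^{H_\theta(x)} \]

with $S(w) = \mathrm{dom}\,\mathrm{conv}\,\mathrm{supp}\, w$. Let $X \subseteq \mathbb{R}^{n_1} \times \mathbb{R}^{n_2} \times \mathbb{R}^{n_3}$ be a set of triples of probability vectors. From Schur-concavity of the Shannon entropy function it follows that $\max_{x \in X} 2^{H_\theta(x)} = \max_{x \in X^{\preccurlyeq}} 2^{H_\theta(x)}$. Also $H_\theta(x) = H_\theta(\mathrm{dom}\, x)$.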

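Theorem 6.22. $\zeta^\theta(t) \geq F^\theta(t)$.

Proof. Let $g \in G(t)$ be such that

\[ \max_{x \in S} 2^{H_\theta(x)} = \zeta^\theta(t) \]

with $S = \mathrm{dom}\,\mathrm{conv}\,\mathrm{supp}(g \cdot t)$. We have

\[ \max_{x \in S} 2^{H_\theta(x)} = \max_{x \in S^{\preccurlyeq}} 2^{H_\theta(x)}. \]

By Theorem 6.17, $\mathbf{P}(t) \subseteq S^{\preccurlyeq}$. We conclude $F^\theta(t) \leq \zeta^\theta(t)$.

Theorem 6.23. Let $t \in V$ be free. Then $\zeta^\theta(t) = F^\theta(t)$.

Proof. We know from Theorem 6.22 that $\zeta^\theta(t) \geq F^\theta(t)$. We prove $\zeta^\theta(t) \leq F^\theta(t)$. Let $g \in G(t)$ be such that $\mathrm{supp}(g \cdot t)$ is free. Let $S = \mathrm{dom}\,\mathrm{conv}\,\mathrm{supp}(g \cdot t)$. Then $\zeta^\theta(t) \leq \max_{x \in S} 2^{H_\theta(x)} = \max_{x \in S^{\preccurlyeq}} 2^{H_\theta(x)}$. By Corollary 6.19 (which follows from Theorem 6.18) we have $S^{\preccurlyeq} = \mathbf{P}(t)^{\preccurlyeq}$. We conclude $\zeta^\theta(t) \leq F^\theta(t)$.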


We can show that the regularised upper support functional equals the quantum functional. As a consequence, the quantum functional is at least the lower support functional $\zeta_\theta$, which was discussed in Chapter 4.

Theorem 6.24. $\lim_{n \to \infty} \zeta^\theta(t^{\otimes n})^{1/n} = F^\theta(t)$.

Proof. We refer the reader to [CVZ18].

Corollary 6.25. $F^\theta(v) \geq \zeta_\theta(v)$.

Proof. By Theorem 6.24, $F^\theta(v) = \lim_{n \to \infty} \zeta^\theta(v^{\otimes n})^{1/n}$. We know $\zeta^\theta(v) \geq \zeta_\theta(v)$ by Theorem 4.15, and thus $\lim_{n \to \infty} \zeta^\theta(v^{\otimes n})^{1/n} \geq \lim_{n \to \infty} \zeta_\theta(v^{\otimes n})^{1/n}$. The lower support functional $\zeta_\theta$ is supermultiplicative under $\otimes$ (Theorem 4.14), so

\[ \lim_{n \to \infty} \zeta_\theta(v^{\otimes n})^{1/n} \geq \zeta_\theta(v). \]

Combining the equality with these two inequalities proves the corollary.

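6.11 Asymptotic slice rank

We proved in Section 4.6 that for oblique $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$ the asymptotic slice rank $\lim_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n}$ exists and equals $\min_{\theta \in \Theta} \zeta^\theta(t)$ with $\Theta = \mathcal{P}([3])$. In this section we prove the analogous statement for the quantum functionals.

Theorem 6.26. Let $t \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$. Then

\[ \lim_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n} = \min_{\theta \in \Theta} F^\theta(t). \]

We work towards the proof of Theorem 6.26. Let $t \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3} \setminus 0$. Let $E^\theta(t) = \log_2 F^\theta(t)$.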

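Lemma 6.27. For any $\varepsilon > 0$ there is an $n_0 \in \mathbb{N}$ such that for all $n \geq n_0$ there is a $\lambda$ with $\lambda/n \in R(t)$ and $\min_{i \in [3]} H(\lambda^{(i)}/n) \geq \min_{\theta \in \Theta} E^\theta(t) - \varepsilon$.

Proof. By definition,

\[ \min_{\theta \in \Theta} E^\theta(t) = \min_{\theta \in \Theta} \max_{x \in \mathbf{P}(t)} \sum_{j \in [3]} \theta(j)\, H(x^{(j)}). \]

By von Neumann's minimax theorem, the right-hand side equals

\[ \max_{x \in \mathbf{P}(t)} \min_{\theta \in \Theta} \sum_{j \in [3]} \theta(j)\, H(x^{(j)}), \]

which equals

\[ \max_{x \in \mathbf{P}(t)} \min_{j \in [3]} H(x^{(j)}). \]

Let $\varepsilon > 0$. Let $\mu/m \in R(t)$ with $\min_{j \in [3]} H(\mu^{(j)}/m) \geq \min_{\theta \in \Theta} E^\theta(t) - \varepsilon/2$. We will use two facts. We have $(P_{(1)} \otimes P_{(1)} \otimes P_{(1)})\, t = t \neq 0$. The triples of partitions $\lambda$ with $P_\lambda t^{\otimes n} \neq 0$ for some $n$ form a semigroup. Let $n \in \mathbb{N}$. We can write $n = qm + r$ with $q, r \in \mathbb{N}$, $0 \leq r < m$. Let $\lambda^{(j)} = q\mu^{(j)} + (r)$. Then by the semigroup property $P_\lambda t^{\otimes n} \neq 0$, i.e. $\lambda/n \in R(t)$. We have $\tfrac{1}{n}(q\mu^{(j)} + (r)) = \tfrac{qm}{n} \cdot \tfrac{\mu^{(j)}}{m} + \tfrac{r}{n} \cdot \tfrac{(r)}{r}$. By concavity of Shannon entropy,

\begin{align*}
H\bigl(\tfrac{1}{n}(q\mu^{(j)} + (r))\bigr) &= H\bigl(\tfrac{qm}{n} \cdot \tfrac{\mu^{(j)}}{m} + \tfrac{r}{n} \cdot \tfrac{(r)}{r}\bigr) \\
&\geq \tfrac{qm}{n}\, H(\mu^{(j)}/m) \\
&\geq \bigl(1 - \tfrac{m}{n}\bigr)\, H(\mu^{(j)}/m).
\end{align*}

When $n$ is large enough, $(1 - \tfrac{m}{n})\, H(\mu^{(j)}/m)$ is at least $H(\mu^{(j)}/m) - \varepsilon/2$. Let $n_0 \in \mathbb{N}$ be such that this is the case for all $j \in [3]$.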

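Lemma 6.28. Let $\lambda/n \in R(t)$. Then $\mathrm{SR}(t^{\otimes n}) \geq \min_{i \in [3]} \dim [\lambda^{(i)}]$.

Proof. We have the restriction $t^{\otimes n} \geq P_\lambda t^{\otimes n} \neq 0$. Choose rank-one projections $A_j$ in the vector spaces $S_{\lambda^{(j)}}(\mathbb{C}^{n_j})$ with

\[ s = (\mathrm{id}_{[\lambda^{(1)}]} \otimes A_1) \otimes (\mathrm{id}_{[\lambda^{(2)}]} \otimes A_2) \otimes (\mathrm{id}_{[\lambda^{(3)}]} \otimes A_3)\, P_\lambda t^{\otimes n} \neq 0. \]

The tensor $s$ is invariant under $S_n$ acting diagonally on $(\mathbb{C}^{n_1})^{\otimes n} \otimes (\mathbb{C}^{n_2})^{\otimes n} \otimes (\mathbb{C}^{n_3})^{\otimes n}$. Thus the marginal spectra $\mathrm{spec}\, \rho^s_i$ are uniform. This implies that $s$ is semistable. From [BCC+17, Theorem 4.6] it follows that $\mathrm{SR}(s)$ equals $\min_{i \in [3]} \dim [\lambda^{(i)}]$. Since $s$ is a restriction of $t^{\otimes n}$ and slice rank is monotone under restriction, the lemma follows.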

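Lemma 6.29. $\liminf_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n} \geq \min_{\theta \in \Theta} F^\theta(t)$.

Proof. Let $\varepsilon > 0$. For $n$ large enough, choose $\lambda/n \in R(t)$ as in Lemma 6.27. By Lemma 6.28, $\mathrm{SR}(t^{\otimes n}) \geq \min_{i \in [3]} \dim [\lambda^{(i)}]$. We lower bound the right-hand side by

\[ \min_{i \in [3]} \dim [\lambda^{(i)}] \geq \min_{i \in [3]} 2^{n H(\lambda^{(i)}/n)}\, 2^{-o(n)} \geq 2^{n (\min_{\theta \in \Theta} E^\theta(t) - \varepsilon)}\, 2^{-o(n)}. \]

Then $\liminf_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n} \geq 2^{\min_{\theta \in \Theta} E^\theta(t) - \varepsilon}$. Since $\varepsilon > 0$ was arbitrary, the lemma follows.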

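Lemma 6.30. For every $\theta \in \Theta$, $\limsup_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n} \leq F^\theta(t)$.

Proof. Let $n \in \mathbb{N}$. Define $s_1, s_2, s_3 \in (\mathbb{C}^{n_1})^{\otimes n} \otimes (\mathbb{C}^{n_2})^{\otimes n} \otimes (\mathbb{C}^{n_3})^{\otimes n}$ by

\begin{align*}
s_1 &= \Bigl( \sum_{\substack{\lambda^{(1)} \vdash n \\ H(\lambda^{(1)}/n) \leq E^\theta(t)}} P_{\lambda^{(1)}} \otimes \mathrm{Id} \otimes \mathrm{Id} \Bigr)\, t^{\otimes n}, \\
s_2 &= \Bigl( \sum_{\substack{\lambda^{(2)} \vdash n \\ H(\lambda^{(2)}/n) \leq E^\theta(t)}} \mathrm{Id} \otimes P_{\lambda^{(2)}} \otimes \mathrm{Id} \Bigr)\, (t^{\otimes n} - s_1), \\
s_3 &= \Bigl( \sum_{\substack{\lambda^{(3)} \vdash n \\ H(\lambda^{(3)}/n) \leq E^\theta(t)}} \mathrm{Id} \otimes \mathrm{Id} \otimes P_{\lambda^{(3)}} \Bigr)\, (t^{\otimes n} - s_1 - s_2).
\end{align*}

Then $t^{\otimes n} = s_1 + s_2 + s_3$. The slice rank of an element in the image of $P_{\lambda^{(1)}} \otimes \mathrm{Id} \otimes \mathrm{Id}$ is at most $\dim\bigl([\lambda^{(1)}] \otimes S_{\lambda^{(1)}}(\mathbb{C}^{n_1})\bigr)$, which is at most $2^{n H(\lambda^{(1)}/n) + o(n)}$ (Section 6.2). Similarly for $\mathrm{Id} \otimes P_{\lambda^{(2)}} \otimes \mathrm{Id}$ and $\mathrm{Id} \otimes \mathrm{Id} \otimes P_{\lambda^{(3)}}$. The tensor $s_1$ is in the image of the sum $\sum_{\lambda^{(1)}} P_{\lambda^{(1)}} \otimes \mathrm{Id} \otimes \mathrm{Id}$ over $\lambda^{(1)} \vdash n$ with at most $n_1$ parts. There are at most $(n+1)^{n_1}$ such partitions. Thus $\mathrm{SR}(s_1) \leq (n+1)^{n_1}\, 2^{n E^\theta(t) + o(n)}$. Similarly for $s_2$ and $s_3$. Therefore

\[ \limsup_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n} \leq \limsup_{n \to \infty} \bigl( 3 (n+1)^{\max_{i \in [3]} n_i}\, 2^{n E^\theta(t) + o(n)} \bigr)^{1/n}. \tag{6.6} \]

The right-hand side of (6.6) equals $2^{E^\theta(t)} = F^\theta(t)$.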

Proof of Theorem 6.26. Lemma 6.29 and Lemma 6.30 together prove Theorem 6.26.

6.12 Conclusion

In this chapter we constructed the first infinite family of spectral points for 3-tensors over $\mathbb{C}$: the quantum functionals. For 30 years the only explicit spectral points known were the gauge points. The constructions in this chapter naturally generalise to higher-order tensors, for which we refer to our paper [CVZ18]. We do not know whether the quantum functionals are all spectral points for 3-tensors over $\mathbb{C}$. Finally, we showed that for complex tensors the asymptotic slice rank exists and equals the minimum value over the quantum functionals.

Chapter 7

Algebraic branching programs; approximation and nondeterminism

This chapter is based on joint work with Karl Bringmann and Christian Ikenmeyer [BIZ17].

7.1 Introduction

The study of asymptotic tensor rank in previous chapters was originally motivated by the study of the complexity of matrix multiplication in the algebraic circuit model, an algebraic model of computation. In this chapter we will study several other algebraic models of computation and algebraic complexity classes.

Formulas, the class $\mathrm{VP}_e$ and the determinant

An (arithmetic) formula is a rooted binary tree whose leaves are each labeled with a variable or a field constant and whose root and intermediate vertices are labeled with either $+$ (addition) or $\times$ (multiplication). In the natural way, via recursion over the tree structure, a formula computes a multivariate polynomial $f$. The formula size of a multivariate polynomial $f$ is the smallest number of vertices required for any formula to compute $f$. Here is an example of a formula of size 7 computing the polynomial $(3 + x)(3 + y)$:

[Figure: a formula tree whose root is labeled $\times$, whose two children are labeled $+$, and whose four leaves are labeled $3$, $x$, $3$, $y$.]
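The example formula can be written down and evaluated mechanically. The following is a minimal sketch, assuming a formula is encoded as a nested tuple `(op, left, right)` with leaves being constants or variable names; this encoding is an illustrative choice, not one used in the text.

```python
def size(formula):
    """Number of vertices of a formula given as a nested tuple or a leaf."""
    if isinstance(formula, tuple):
        _, left, right = formula
        return 1 + size(left) + size(right)
    return 1  # a leaf: a constant or a variable name

def evaluate(formula, assignment):
    """Evaluate the formula by recursion over the tree structure."""
    if isinstance(formula, tuple):
        op, left, right = formula
        a, b = evaluate(left, assignment), evaluate(right, assignment)
        return a + b if op == "+" else a * b
    if isinstance(formula, str):
        return assignment[formula]  # variable leaf
    return formula                  # constant leaf

# The formula for (3 + x)(3 + y): four leaves, two + vertices, one * vertex.
example = ("*", ("+", 3, "x"), ("+", 3, "y"))
```

The vertex count of `example` is 7, matching the size stated above.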

A sequence of multivariate polynomials $(f_n)_{n \in \mathbb{N}}$ is called a family. Valiant in his seminal paper [Val79] introduced the complexity class $\mathrm{VP}_e$, defined as the set of all families whose formula size is polynomially bounded. (We say a sequence $(a_n)_n \in \mathbb{N}^{\mathbb{N}}$ of natural numbers is polynomially bounded if there exists a univariate polynomial $q$ such that $a_n \leq q(n)$ for all $n$.) For example, the family $((x_1)^n + (x_2)^n + \cdots + (x_n)^n)_n$ is in $\mathrm{VP}_e$, because the formula size of this family grows quadratically.

The smallest known formulas for the determinant family $\det_n$ have size $n^{O(\log n)}$. This follows from Berkowitz' algorithm [Ber84], which gives an algebraic circuit of depth $O(\log^2 n)$; by expanding we get an algebraic formula of depth $O(\log^2 n)$, whose size is then trivially bounded by $2^{O(\log^2 n)} = n^{O(\log n)}$. It is a major open question in algebraic complexity theory whether formulas of polynomially bounded size exist for $\det_n$. This question can be phrased in terms of complexity classes as asking whether or not the inclusion $\mathrm{VP}_e \subseteq \mathrm{VP}_s$ is strict. (We will define $\mathrm{VP}_s$ shortly.)

Motivated by this question, we study the closure class $\overline{\mathrm{VP}_e}$ of families of polynomials that can be approximated arbitrarily closely by families in $\mathrm{VP}_e$ (see Section 7.2.4 for the formal definition). Over the field $\mathbb{R}$ or $\mathbb{C}$ one can think of $\overline{\mathrm{VP}_e}$ as the set of families whose border formula size is polynomially bounded. The border formula size of a polynomial $f$ is the smallest number $c$ such that there exists a sequence $g_i$ of polynomials with formula size at most $c$ and $\lim_{i \to \infty} g_i = f$.

Continuous lower bounds

In algebraic complexity theory, problem instances correspond to vectors $v \in \mathbb{F}^n$. A complexity lower bound often takes the form of a function $f : \mathbb{F}^n \to \mathbb{F}$ that is zero on the vectors of "low complexity" and nonzero on $v$. We refer to Grochow [Gro13] for a discussion of settings where complexity lower bounds are obtained in this way (e.g. [NW97, Raz09, LO15, GKKS13, LMR13, BI13]). Over the complex numbers we can in fact assume that these functions $f$ are continuous [Gro13] (and even so-called highest-weight vector polynomials). If $C$ and $D$ are algebraic complexity classes with $C \subseteq D$ (for example $C = \mathrm{VP}_e$ and $D = \mathrm{VP}_s$), then a proof of separation $D \not\subseteq C$ in this continuous manner implies the stronger separation $D \not\subseteq \overline{C}$. In our case it is thus natural to aim for the separation $\mathrm{VP}_s \not\subseteq \overline{\mathrm{VP}_e}$ instead of the slightly weaker $\mathrm{VP}_s \not\subseteq \mathrm{VP}_e$, which provides further motivation for studying $\overline{\mathrm{VP}_e}$. This is exactly analogous to the geometric complexity theory approach of Mulmuley and Sohoni (see e.g. [MS01, MS08] and the exposition [BLMW11, Sec. 9]), which aims to prove the separation $\mathrm{VNP} \not\subseteq \overline{\mathrm{VP}_s}$ to attack Valiant's famous conjecture $\mathrm{VP}_s \neq \mathrm{VNP}$ [Val79]. (Here $\mathrm{VNP}$ is the class of p-definable families, see Section 7.2.4.)

New results in this chapter

We prove two new results in this chapter


Algebraic branching programs of width 2. An algebraic branching program (abp) is a directed acyclic graph with a source vertex $s$ and a sink vertex $t$ that has affine linear forms over the base field $\mathbb{F}$ as edge labels. Moreover, we require that each vertex is labeled with an integer (its layer) and that edges in the abp only point from vertices in layer $i$ to vertices in layer $i+1$. The width of an abp is the cardinality of its largest layer. The size of an abp is the number of its vertices. The value of an abp is the sum of the values of all $s$–$t$ paths, where the value of an $s$–$t$ path is the product of its edge labels. We say that an abp computes its value. The class $\mathrm{VP}_s$ coincides with the class of families of polynomials that can be computed by abps of polynomially bounded size, see e.g. [Sap16].

For $k \in \mathbb{N}$ we introduce the class $\mathrm{VP}_k$ as the class of families of polynomials computable by width-$k$ abps of polynomially bounded size. It is well known (see Lemma 7.2) that $\mathrm{VP}_k \subseteq \mathrm{VP}_e$ for all $k \geq 1$. In 1992, Ben-Or and Cleve [BOC92] showed that $\mathrm{VP}_k = \mathrm{VP}_e$ for all $k \geq 3$. In 2011, Allender and Wang [AW16] showed that width-2 abps cannot compute every polynomial, so in particular we have a strict inclusion $\mathrm{VP}_2 \subsetneq \mathrm{VP}_3$.

We prove that the closure of $\mathrm{VP}_2$ and the closure of $\mathrm{VP}_e$ are equal,

\[ \overline{\mathrm{VP}_2} = \overline{\mathrm{VP}_e}, \tag{7.1} \]

when $\mathrm{char}(\mathbb{F}) \neq 2$. From (7.1) and the result of Allender and Wang it follows directly that the inclusion $\mathrm{VP}_2 \subsetneq \overline{\mathrm{VP}_2}$ is strict. We have thus separated a complexity class from its approximation closure.

VNP via affine linear forms. Every algebraic complexity class has a nondeterministic closure (see Section 7.2.5 for the definition). The nondeterministic closure of $\mathrm{VP}$ is called $\mathrm{VNP}$, and the nondeterministic closure of $\mathrm{VP}_e$ is called $\mathrm{VNP}_e$. In 1980, Valiant [Val80] proved $\mathrm{VNP}_e = \mathrm{VNP}$. The nondeterministic closures of $\mathrm{VP}_1$ and $\mathrm{VP}_2$ we call $\mathrm{VNP}_1$ and $\mathrm{VNP}_2$. Using interpolation techniques we can deduce $\mathrm{VNP}_2 = \mathrm{VNP}$ from (7.1), provided the field is infinite. Using more sophisticated techniques we prove

\[ \mathrm{VNP}_1 = \mathrm{VNP}. \tag{7.2} \]

From (7.2) it easily follows that $\mathrm{VP}_1 \subsetneq \mathrm{VNP}_1$. Also, from [AW16] we get $\mathrm{VP}_2 \subsetneq \mathrm{VNP}_2$. We have thus separated complexity classes from their nondeterministic closures.

Further related work

An excellent exposition on the history of small-width computation can be found in [AW16], along with an explicit polynomial that cannot be computed by width-2 abps, namely $x_1x_2 + x_3x_4 + \cdots + x_{15}x_{16}$. Saha, Saptharishi and Saxena in [SSS09, Cor. 14] showed that $x_1x_2 + x_3x_4 + x_5x_6$ cannot be computed by width-2 abps that correspond to the iterated matrix multiplication of upper triangular matrices. Bürgisser in [Bur04] studied approximations in the model of general algebraic circuits, finding general upper bounds on the error degree. For most algebraic complexity classes $C$ the relation between $C$ and $\overline{C}$ has not been an active object of study. As pointed out recently by Forbes [For16], Nisan's result [Nis91] implies that $C = \overline{C}$ for $C$ being the class of size-$k$ algebraic branching programs on noncommuting variables. A structured study of $\overline{\mathrm{VP}}$ and $\overline{\mathrm{VP}_s}$ was started in [GMQ16]. Much work on lower bounds for algebraic approximation algorithms has been done in the area of bilinear complexity, dating back to [BCRL79, Str83, Lic84] and more recently e.g. [Lan06, LO15, HIL13, Zui17, LM16a].

This chapter is organised as follows. In Section 7.2 we discuss definitions and basic results. In Section 7.3 we prove that the approximation closure of $\mathrm{VP}_2$ equals the approximation closure of $\mathrm{VP}_e$, i.e. $\overline{\mathrm{VP}_2} = \overline{\mathrm{VP}_e}$. In Section 7.4 we prove that the nondeterminism closure of $\mathrm{VP}_1$ equals $\mathrm{VNP}$.

7.2 Definitions and basic results

We briefly recall the definitions of circuits, formulas and branching programs, and we recall the definitions of the corresponding complexity classes. Then we discuss some straightforward relationships among these classes and review the proof of a theorem by Ben-Or and Cleve, which inspired our work. Finally, we discuss the approximation closure and the nondeterminism closure for algebraic complexity classes.

7.2.1 Computational models

Let $x_1, x_2, \ldots$ be formal variables. By $\mathbb{F}[x]$ we mean the ring of polynomials over $\mathbb{F}$ with variables $x_1, x_2, \ldots, x_k$ with $k$ large enough.

A circuit is a directed acyclic graph $G$ with one or more source vertices and one sink vertex. Each source vertex is labelled by a variable $x_i$ or a constant $c \in \mathbb{F}$. The other vertices are labelled by either $+$ or $\times$ and have in-degree 2 (that is, fan-in 2). Each vertex computes an element in $\mathbb{F}[x]$ by recursion over the graph. The element computed by the sink is the element computed by the circuit. The size of a circuit is the number of vertices.

A formula is a circuit whose graph is a tree.

An algebraic branching program (abp) is a directed acyclic graph with a source vertex $s$ and a sink vertex $t$ that has affine linear forms $\alpha x_i + \beta$, $\alpha, \beta \in \mathbb{F}$, as edge labels. Moreover, we require that each vertex is labeled with an integer (its layer) and that edges in the abp only point from vertices in layer $i$ to vertices in layer $i+1$. The width of an abp is the cardinality of its largest layer. The size of an abp is the number of its vertices. The value of an abp is the sum of the values of all $s$–$t$ paths, where the value of an $s$–$t$ path is the product of its edge labels. We say that an abp computes its value.

For example, the following abp has depth 5, width 3 and computes the polynomial $x_1x_2 + x_2 + 2x_1 - 1$.

[Figure: a width-3 abp with five layers and edge labels $x_1$, $2$, $x_1$, $x_2$, $-1$.]

An abp $G$ corresponds naturally to an iterated product of matrices: for any two consecutive layers $L_i, L_{i+1}$ in $G$, let $M_i$ be the matrix $(e_{vw})_{v \in L_i, w \in L_{i+1}}$ with $e_{vw}$ the label of the edge from $v$ to $w$ (or 0 if there is no edge from $v$ to $w$). Then the value of $G$ equals the product $M_1 M_2 \cdots M_k$.

For example, the above abp corresponds to the following iterated matrix product:

\[ \begin{pmatrix} 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} -1 & 0 & 0 \\ 0 & x_2 & 0 \\ 0 & 0 & x_1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ x_1 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}. \]
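As a sanity check, the iterated matrix product of the example can be evaluated at integer points and compared with the polynomial $x_1x_2 + x_2 + 2x_1 - 1$. This is a small sketch with hypothetical helper names; it evaluates the abp numerically rather than symbolically.

```python
def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def abp_value(x1, x2):
    """Value of the example abp: row vector, two 3x3 layers, column vector."""
    row = [[1, 1, 1]]
    m_a = [[-1, 0, 0], [0, x2, 0], [0, 0, x1]]
    m_b = [[1, 0, 0], [x1, 1, 0], [0, 0, 2]]
    col = [[1], [1], [1]]
    return mat_mul(mat_mul(mat_mul(row, m_a), m_b), col)[0][0]

def target(x1, x2):
    """The polynomial the abp is claimed to compute."""
    return x1 * x2 + x2 + 2 * x1 - 1
```

Agreement on a grid of integer points with more points per variable than the degree is enough to conclude that the two polynomials coincide.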

7.2.2 Complexity classes $\mathrm{VP}$, $\mathrm{VP}_e$, $\mathrm{VP}_k$

The circuit size of a polynomial $f$ is the size of the smallest circuit computing $f$. The formula size of a polynomial $f$ is the size of the smallest formula computing $f$.

A family is a sequence $(f_n)_{n \in \mathbb{N}}$ of multivariate polynomials over $\mathbb{F}$. A class is a set of families. The class $\mathrm{VP}$ consists of all families $(f_n)$ with circuit size, degree and number of variables in $\mathrm{poly}(n)$. The class $\mathrm{VP}_e$ consists of all families $(f_n)$ with formula size in $\mathrm{poly}(n)$. (The origin of the subscript $e$ in $\mathrm{VP}_e$ is the term "arithmetic expression".) Clearly $\mathrm{VP}_e \subseteq \mathrm{VP}$.

We introduce classes defined by abps. Let $k \geq 1$. The class $\mathrm{VP}_k$ consists of all families computed by polynomial-size width-$k$ abps with edges labelled by affine linear forms $\sum_i \alpha_i x_i + \beta$ with coefficients $\alpha_i, \beta \in \mathbb{F}$.

We note that the above classes depend on the choice of the ground field $\mathbb{F}$.

In our paper [BIZ17] we make a distinction between three different types of edge labels for abps. The class $\mathrm{VP}_k$ in this chapter corresponds to the class $\mathrm{VP}_k^g$ in [BIZ17].


7.2.3 The theorem of Ben-Or and Cleve

This subsection is about the relations among the classes $\mathrm{VP}_k$ and $\mathrm{VP}_e$.

Lemma 7.1. $\mathrm{VP}_k \subseteq \mathrm{VP}_\ell$ when $k \leq \ell$.

Proof. This is clearly true.

Lemma 7.2. $\mathrm{VP}_k \subseteq \mathrm{VP}_e$ for any $k$.

Proof. For the simple proof we refer to [BIZ17].

Ben-Or and Cleve [BOC92] showed that for $k \geq 3$ the classes $\mathrm{VP}_k$ and $\mathrm{VP}_e$ are in fact equal.

Theorem 7.3 (Ben-Or and Cleve [BOC92]). For $k \geq 3$, $\mathrm{VP}_k = \mathrm{VP}_e$.

We will review the construction of Ben-Or and Cleve here, because we will use it to prove Theorem 7.8 and Theorem 7.15. The following depth-reduction lemma for formulas by Brent is a crucial ingredient.

Lemma 7.4 (Brent [Bre74]). Let $f$ be an $n$-variate degree-$d$ polynomial computed by a formula of size $s$. Then $f$ can also be computed by a formula of size $\mathrm{poly}(s, n, d)$ and depth $O(\log s)$.

Proof. See the survey of Saptharishi [Sap16, Lemma 5.5] for a modern proof.

Proof of Theorem 7.3. Lemma 7.2 says $\mathrm{VP}_k \subseteq \mathrm{VP}_e$. We will prove the inclusion $\mathrm{VP}_e \subseteq \mathrm{VP}_3$, from which $\mathrm{VP}_e \subseteq \mathrm{VP}_k$ follows by Lemma 7.1, and thus $\mathrm{VP}_k = \mathrm{VP}_e$. For a polynomial $h$ define the matrix

\[ M(h) = \begin{pmatrix} 1 & 0 & 0 \\ h & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \]

which as part of an abp looks like

[Figure: the part of an abp corresponding to $M(h)$, with edge label $h$.]

We call the following matrices primitives:

• $M(h)$ with $h$ any variable or any constant in $\mathbb{F}$

• the $3 \times 3$ permutation matrices, denoted by $M_\pi$ with $\pi \in S_3$

• the diagonal matrices $M_{a,b,c} = \mathrm{diag}(a, b, c)$ with $a, b, c \in \mathbb{F}$.

The entries of the primitives are variables or constants in $\mathbb{F}$, making them suitable to use in the construction of a width-3 abp.

Let $(f_n) \in \mathrm{VP}_e$. Then $f_n$ can be computed by a formula of size $s(n) \in \mathrm{poly}(n)$. By Brent's depth-reduction theorem for formulas (Lemma 7.4), $f_n$ can be computed by a formula of size $\mathrm{poly}(n)$ and depth $d(n) \in O(\log s(n))$.

We will construct a sequence of primitives $A_1, \ldots, A_{m(n)}$ such that

\[ A_1 \cdots A_{m(n)} = \begin{pmatrix} 1 & 0 & 0 \\ f_n & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \]

with $m(n) \in O(4^{d(n)}) = \mathrm{poly}(n)$. Then

\[ f_n(x) = \begin{pmatrix} 1 & 1 & 1 \end{pmatrix} M_{-1,1,0}\, A_1 \cdots A_{m(n)} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \]

so $f_n(x)$ can be computed by a width-3 abp of length $\mathrm{poly}(n)$, proving the theorem.

To explain the construction, let $h$ be a polynomial and consider a formula computing $h$ of depth $d$. The goal is to construct (recursively on the formula structure) primitives $A_1, \ldots, A_m$ such that

\[ A_1 \cdots A_m = \begin{pmatrix} 1 & 0 & 0 \\ h & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \]

with $m \in O(4^d)$.

Suppose $h$ is a variable or a constant. Then $M(h)$ is itself a primitive matrix.

Suppose $h = f + g$ is a sum of two polynomials $f, g$, and suppose $M(f)$ and $M(g)$ can be written as a product of primitives. Then $M(f + g)$ equals a product of primitives, because $M(f + g) = M(f) M(g)$. This can easily be verified directly, or by noting that in the corresponding partial abps the top-bottom paths ($u_i$-$v_j$ paths) have the same value:

[Figure: two partial abps on top vertices $u_1, u_2, u_3$ and bottom vertices $v_1, v_2, v_3$; the first has edge labels $f$ and $g$ in consecutive layers, the second has the single edge label $f + g$.]

Suppose $h = fg$ is a product of two polynomials $f, g$, and suppose $M(f)$ and $M(g)$ can be written as a product of primitives. Then $M(fg)$ equals a product of primitives, because

\[ M(f \cdot g) = M_{(23)} \bigl( M_{1,-1,1}\, M_{(123)}\, M(g)\, M_{(132)}\, M(f) \bigr)^2 M_{(23)} \]

(here $(23) \in S_3$ denotes the transposition $1 \mapsto 1$, $2 \mapsto 3$, $3 \mapsto 2$, and $(123) \in S_3$ denotes the cyclic shift $1 \mapsto 2$, $2 \mapsto 3$, $3 \mapsto 1$), as can be verified either directly or by checking that in the corresponding partial abps the top-bottom paths ($u_i$-$v_j$ paths) have the same value:

[Figure: two partial abps on top vertices $u_1, u_2, u_3$ and bottom vertices $v_1, v_2, v_3$; the first has edge labels $f$, $-1$, $g$, $f$, $g$, $-1$, the second has the single edge label $f \cdot g$.]

This completes the construction.

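The length $m$ of the construction is $m(h) = 1$ for $h$ a variable or constant, and recursively $m(f + g) = m(f) + m(g)$ and $m(f \cdot g) = 2(m(f) + m(g)) + 8$ (counting the eight permutation and diagonal primitives in the product identity), so $m \in O(4^d)$, where $d$ is the depth of the formula computing $h$.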
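Both matrix identities used in the construction can be checked numerically. The sketch below substitutes integers for $f$ and $g$ and assumes the convention that the permutation matrix $M_\pi$ sends the basis vector $e_i$ to $e_{\pi(i)}$; the text does not fix this convention, so it is an assumption made here, and all function names are illustrative.

```python
def mat_mul(a, b):
    """Product of two 3x3 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def product(mats):
    """Product of a list of 3x3 matrices, in the given order."""
    out = [[int(i == j) for j in range(3)] for i in range(3)]
    for m in mats:
        out = mat_mul(out, m)
    return out

def M(h):
    return [[1, 0, 0], [h, 1, 0], [0, 0, 1]]

def perm(pi):
    """Permutation matrix sending e_i to e_{pi(i)} (assumed convention)."""
    m = [[0] * 3 for _ in range(3)]
    for i, image in pi.items():
        m[image - 1][i - 1] = 1
    return m

def diag(a, b, c):
    return [[a, 0, 0], [0, b, 0], [0, 0, c]]

T23 = perm({1: 1, 2: 3, 3: 2})    # transposition (23)
C123 = perm({1: 2, 2: 3, 3: 1})   # cyclic shift (123)
C132 = perm({1: 3, 2: 1, 3: 2})   # its inverse (132)

def M_product(f, g):
    """Right-hand side of the product identity for M(f * g)."""
    inner = [diag(1, -1, 1), C123, M(g), C132, M(f)]
    return product([T23] + inner + inner + [T23])

def readout(A):
    """Compute (1 1 1) * diag(-1, 1, 0) * A * (1 1 1)^T."""
    B = mat_mul(diag(-1, 1, 0), A)
    return sum(sum(row) for row in B)
```

Under this convention, `product([M(f), M(g)])` equals `M(f + g)`, `M_product(f, g)` equals `M(f * g)`, and `readout(M(h))` returns `h`, matching the three steps of the proof.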

The above result of Ben-Or and Cleve (Theorem 7.3) raises the intriguing question whether the inclusion $\mathrm{VP}_2 \subseteq \mathrm{VP}_e$ is strict. Allender and Wang [AW16] show that the inclusion is indeed strict; in fact, they show that some polynomials cannot be computed by any width-2 abp.

Theorem 7.5 (Allender and Wang [AW16]). The polynomial

\[ x_1x_2 + x_3x_4 + \cdots + x_{15}x_{16} \]

cannot be computed by any width-2 abp. Therefore we have the separation of classes $\mathrm{VP}_2 \subsetneq \mathrm{VP}_3 = \mathrm{VP}_e$.


7.2.4 Approximation closure $\overline{C}$

We define the norm of a complex multivariate polynomial as the sum of the absolute values of its coefficients. This defines a topology on the polynomial ring $\mathbb{C}[x_1, \ldots, x_m]$. Given a complexity measure $L$, say abp size or formula size, there is a natural notion of approximate complexity that is called border complexity. Namely, a polynomial $f \in \mathbb{C}[x]$ has border complexity $L^{\mathrm{top}}$ at most $c$ if there is a sequence of polynomials $g_1, g_2, \ldots$ in $\mathbb{C}[x]$ converging to $f$ such that each $g_i$ satisfies $L(g_i) \leq c$. It turns out that for reasonable classes over the field of complex numbers $\mathbb{C}$ this topological notion of approximation is equivalent to what we call algebraic approximation (see e.g. [Bur04]). Namely, a polynomial $f \in \mathbb{C}[x]$ satisfies $L(f)^{\mathrm{alg}} \leq c$ iff there are polynomials $f_1, \ldots, f_e \in \mathbb{C}[x]$ such that the polynomial

\[ h = f + \varepsilon f_1 + \varepsilon^2 f_2 + \cdots + \varepsilon^e f_e \in \mathbb{C}[\varepsilon, x] \]

has complexity $L_{\mathbb{C}(\varepsilon)}(h) \leq c$, where $\varepsilon$ is a formal variable and $L_{\mathbb{C}(\varepsilon)}(h)$ denotes the complexity of $h$ over the field extension $\mathbb{C}(\varepsilon)$. This algebraic notion of approximation makes sense over any base field, and we will use it in the statements and proofs of this chapter.

Definition 7.6. Let $C(\mathbb{F})$ be a class over the field $\mathbb{F}$. We define the approximation closure $\overline{C}(\mathbb{F})$ as follows: a family $(f_n)$ over $\mathbb{F}$ is in $\overline{C}(\mathbb{F})$ if there are polynomials $f_{n,i}(x) \in \mathbb{F}[x]$ and a function $e : \mathbb{N} \to \mathbb{N}$ such that the family $(g_n)$ defined by

\[ g_n(x) = f_n(x) + \varepsilon f_{n,1}(x) + \varepsilon^2 f_{n,2}(x) + \cdots + \varepsilon^{e(n)} f_{n,e(n)}(x) \]

is in $C(\mathbb{F}(\varepsilon))$. We define the poly-approximation closure $\overline{C}^{\mathrm{poly}}(\mathbb{F})$ similarly, but with the additional requirement that $e(n) \in \mathrm{poly}(n)$. We call $e(n)$ the error degree.

7.2.5 Nondeterminism closure $N(C)$

We introduce the nondeterminism closure for algebraic complexity classes.

Definition 7.7. Let $C$ be a class. The class $N(C)$ consists of families $(f_n)$ with the following property: there is a family $(g_n) \in C$ and $p(n), q(n) \in \mathrm{poly}(n)$ such that

\[ f_n(x) = \sum_{b \in \{0,1\}^{p(n)}} g_{q(n)}(b, x), \]

where $x$ and $b$ denote sequences of variables $x_1, x_2, \ldots$ and $b_1, b_2, \ldots, b_{p(n)}$. We say that $f(x)$ is a hypercube sum over $g$ and that $b_1, b_2, \ldots, b_{p(n)}$ are the hypercube variables. For any subscript $x$ we will use the notation $\mathrm{VNP}_x$ to denote $N(\mathrm{VP}_x)$. We remark that the map $C \mapsto N(C)$ trivially satisfies all properties of being a Kuratowski closure operator, i.e. $N(\emptyset) = \emptyset$, $C \subseteq N(C)$, $N(C \cup D) = N(C) \cup N(D)$ and $N(N(C)) = N(C)$.
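A hypercube sum can be evaluated by brute force for small $p$. The sketch below is illustrative: the polynomial `g` is a hypothetical example, chosen because it is a product of affine linear forms in each variable and its hypercube sum has the closed form $\prod_i (1 + x_i)$.

```python
from itertools import product

def hypercube_sum(g, p, x):
    """Sum g(b, x) over all b in {0,1}^p."""
    return sum(g(b, x) for b in product((0, 1), repeat=p))

def g(b, x):
    """Hypothetical example: prod_i (b_i * x_i + 1 - b_i)."""
    out = 1
    for bi, xi in zip(b, x):
        out *= bi * xi + 1 - bi
    return out

def closed_form(x):
    """prod_i (1 + x_i), the value of the hypercube sum over g."""
    out = 1
    for xi in x:
        out *= 1 + xi
    return out
```

Each factor of `g` contributes $1$ when $b_i = 0$ and $x_i$ when $b_i = 1$, so summing over all $b$ expands the product $\prod_i (1 + x_i)$.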


7.3 Approximation closure of $\mathrm{VP}_2$

We show that every polynomial can be approximated by a width-2 abp. Even better, we show that every polynomial can be approximated by a width-2 abp of size polynomial in the formula size and with error degree polynomial in the formula size. This is the main result of the current chapter.

Theorem 7.8. $\mathrm{VP}_e \subseteq \overline{\mathrm{VP}_2}^{\mathrm{poly}}$ when $\mathrm{char}(\mathbb{F}) \neq 2$.

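Proof. For a polynomial $h$ define the matrix $M(h) = \begin{pmatrix} 1 & 0 \\ h & 1 \end{pmatrix}$. We call the following matrices primitives:

• $M(h)$ with $h$ any variable or constant in $\mathbb{F}$

• $\begin{pmatrix} \frac{1}{2\varepsilon} & 0 \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} -2\varepsilon & 0 \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} -1 & \varepsilon \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} -1 & -\varepsilon \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$.

The entries of the primitives are variables or constants in the base field $\mathbb{F}(\varepsilon)$, making them suitable to use in a width-2 abp over the base field $\mathbb{F}(\varepsilon)$.

Let $(f_n) \in \mathrm{VP}_e$, so $f_n(x)$ can be computed by a formula of size $s(n) \in \mathrm{poly}(n)$. By Brent's depth-reduction theorem for formulas (Lemma 7.4), $f_n$ can be computed by a formula of size $\mathrm{poly}(n)$ and depth $d(n) \in O(\log s(n))$.

We will construct a sequence of primitives $A_1, \ldots, A_{m(n)}$ such that

\[ A_1 \cdots A_{m(n)} = \begin{pmatrix} 1 & 0 \\ f_n & 1 \end{pmatrix} + \varepsilon \begin{pmatrix} f_{n,1,1,1} & f_{n,1,1,2} \\ f_{n,1,2,1} & f_{n,1,2,2} \end{pmatrix} + \varepsilon^2 \begin{pmatrix} f_{n,2,1,1} & f_{n,2,1,2} \\ f_{n,2,2,1} & f_{n,2,2,2} \end{pmatrix} + \cdots + \varepsilon^{e} \begin{pmatrix} f_{n,e,1,1} & f_{n,e,1,2} \\ f_{n,e,2,1} & f_{n,e,2,2} \end{pmatrix} \]

for some $f_{n,i,j,k} \in \mathbb{F}[x]$, with $m(n), e(n) \in O(8^{d(n)}) = \mathrm{poly}(n)$. Then

\[ \begin{pmatrix} 1 & 1 \end{pmatrix} \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} A_1 \cdots A_{m(n)} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = f_n(x) + O(\varepsilon), \]

so $f_n(x)$ can be approximated by a width-2 abp of length $\mathrm{poly}(n)$ and with error degree $\mathrm{poly}(n)$, proving the theorem.

We begin with the construction. Let $h$ be a polynomial and consider a formula computing $h$ of depth $d$. The goal is to construct, recursively on the tree structure of the formula, a sequence of primitives $A_1, \ldots, A_m$ such that for some $h_{i,j,k} \in \mathbb{F}[x]$

\[ A_1 \cdots A_m = \begin{pmatrix} 1 & 0 \\ h & 1 \end{pmatrix} + \varepsilon \begin{pmatrix} 0 & 0 \\ h_{1,2,1} & 0 \end{pmatrix} + \varepsilon^2 \begin{pmatrix} h_{2,1,1} & h_{2,1,2} \\ h_{2,2,1} & h_{2,2,2} \end{pmatrix} + \cdots + \varepsilon^{e} \begin{pmatrix} h_{e,1,1} & h_{e,1,2} \\ h_{e,2,1} & h_{e,2,2} \end{pmatrix} \tag{7.3} \]

with $m, e \in O(8^d)$. Notice the particular first-degree error pattern in (7.3), which our recursion will rely on.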


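Suppose $h$ is a variable or a constant. Then $M(h)$ is itself a primitive satisfying (7.3).

Suppose $h = f + g$ is a sum of two polynomials $f, g$, and suppose that

\[ F = \begin{pmatrix} 1 & 0 \\ f & 1 \end{pmatrix} + \varepsilon \begin{pmatrix} 0 & 0 \\ f' & 0 \end{pmatrix} + O(\varepsilon^2), \tag{7.4} \]

\[ G = \begin{pmatrix} 1 & 0 \\ g & 1 \end{pmatrix} + \varepsilon \begin{pmatrix} 0 & 0 \\ g' & 0 \end{pmatrix} + O(\varepsilon^2) \tag{7.5} \]

are products of primitives for some $f', g' \in \mathbb{F}[x]$. Then

\[ G \cdot F = \begin{pmatrix} 1 & 0 \\ f + g & 1 \end{pmatrix} + \varepsilon \begin{pmatrix} 0 & 0 \\ f' + g' & 0 \end{pmatrix} + O(\varepsilon^2) \]

is a product of primitives satisfying (7.3).

Suppose $h = fg$ is a product of two polynomials, and suppose that $F$ and $G$ are of the form (7.4) and (7.5) and are products of primitives. We will construct $M((f + g)^2)$, $M(-f^2)$, $M(-g^2)$ approximately, in such a way that when we use the identity $(f + g)^2 - f^2 - g^2 = 2fg$, the error terms cancel properly. Define the expressions $\mathrm{sq}_+(A)$ and $\mathrm{sq}_-(A)$ by

\[ \mathrm{sq}_\pm(A) = \begin{pmatrix} -\varepsilon & 0 \\ 0 & 1 \end{pmatrix} \cdot A \cdot \begin{pmatrix} -1 & \pm\varepsilon \\ 0 & 1 \end{pmatrix} \cdot A \cdot \begin{pmatrix} \frac{1}{\varepsilon} & 0 \\ 0 & 1 \end{pmatrix}. \]

Then

\[ \mathrm{sq}_\pm(F) = \begin{pmatrix} 1 \mp \varepsilon f & 0 \\ \pm f^2 + O(\varepsilon) & 1 \pm \varepsilon f \end{pmatrix} + O(\varepsilon^2). \]

We have

\[ \mathrm{sq}_-(G) \cdot \mathrm{sq}_-(F) \cdot \mathrm{sq}_+(G \cdot F) = \begin{pmatrix} 1 + \varepsilon g & 0 \\ -g^2 + O(\varepsilon) & 1 - \varepsilon g \end{pmatrix} \cdot \begin{pmatrix} 1 + \varepsilon f & 0 \\ -f^2 + O(\varepsilon) & 1 - \varepsilon f \end{pmatrix} \cdot \begin{pmatrix} 1 - \varepsilon(f + g) & 0 \\ (f + g)^2 + O(\varepsilon) & 1 + \varepsilon(f + g) \end{pmatrix} + O(\varepsilon^2), \]

which simplifies to

\[ \mathrm{sq}_-(G) \cdot \mathrm{sq}_-(F) \cdot \mathrm{sq}_+(G \cdot F) = \begin{pmatrix} 1 & 0 \\ 2fg + O(\varepsilon) & 1 \end{pmatrix} + O(\varepsilon^2). \]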


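We conclude

\begin{align*}
&\begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix} \cdot \mathrm{sq}_-(G) \cdot \mathrm{sq}_-(F) \cdot \mathrm{sq}_+(G \cdot F) \cdot \begin{pmatrix} \frac{1}{2} & 0 \\ 0 & 1 \end{pmatrix} \\
&\quad = \begin{pmatrix} -2\varepsilon & 0 \\ 0 & 1 \end{pmatrix} \cdot G \cdot \begin{pmatrix} -1 & -\varepsilon \\ 0 & 1 \end{pmatrix} \cdot G \cdot \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \cdot F \cdot \begin{pmatrix} -1 & -\varepsilon \\ 0 & 1 \end{pmatrix} \cdot F \\
&\qquad \cdot \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \cdot G \cdot F \cdot \begin{pmatrix} -1 & \varepsilon \\ 0 & 1 \end{pmatrix} \cdot G \cdot F \cdot \begin{pmatrix} \frac{1}{2\varepsilon} & 0 \\ 0 & 1 \end{pmatrix} \\
&\quad = \begin{pmatrix} 1 & 0 \\ fg + O(\varepsilon) & 1 \end{pmatrix} + O(\varepsilon^2).
\end{align*}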

This completes the construction.

The length $m$ of the construction is $m(h) = 1$ for $h$ a variable or constant, and recursively $m(f + g) = m(f) + m(g)$, $m(f \cdot g) = 4(m(f) + m(g)) + 7$. We conclude $m \in O(8^d)$. The error degree $e$ of the construction satisfies the same recursion, so $e \in O(8^d)$.
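The product step can be illustrated numerically with exact rational arithmetic. This is only a sketch under simplifying assumptions: $f$ and $g$ are replaced by integers, $F$ and $G$ are taken to be exactly $M(f)$ and $M(g)$ (no error terms), and $\varepsilon$ is a small rational number rather than a formal variable. The function names are hypothetical.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def product(mats):
    """Product of a list of 2x2 matrices, in the given order."""
    out = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
    for m in mats:
        out = matmul(out, m)
    return out

def M(h):
    return [[Fraction(1), Fraction(0)], [Fraction(h), Fraction(1)]]

def approx_product(f, g, eps):
    """The thirteen-factor product from the proof, with F = M(f), G = M(g)."""
    F, G = M(f), M(g)
    GF = matmul(G, F)
    chain = [
        [[-2 * eps, 0], [0, 1]], G, [[-1, -eps], [0, 1]], G,
        [[-1, 0], [0, 1]], F, [[-1, -eps], [0, 1]], F,
        [[-1, 0], [0, 1]], GF, [[-1, eps], [0, 1]], GF,
        [[1 / (2 * eps), 0], [0, 1]],
    ]
    return product(chain)

def error(f, g, eps):
    """Distance of the lower-left entry from the product f * g."""
    return abs(approx_product(f, g, eps)[1][0] - f * g)
```

As $\varepsilon$ shrinks, the lower-left entry approaches $fg$ while the diagonal entries approach $1$, in line with the $O(\varepsilon)$ and $O(\varepsilon^2)$ error pattern in (7.3).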

Remark 7.9. The construction in the above proof of Theorem 7.8 is different from the construction in our paper [BIZ17]. The recursion in the above proof is simpler, while the construction in [BIZ17] has a better error degree and has a special form which relates it to a family of polynomials called continuants.

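Corollary 7.10. $\overline{\mathrm{VP}_2} = \overline{\mathrm{VP}_e}$ and $\overline{\mathrm{VP}_2}^{\mathrm{poly}} = \overline{\mathrm{VP}_e}^{\mathrm{poly}}$ when $\mathrm{char}(\mathbb{F}) \neq 2$.

Proof. We have $\mathrm{VP}_2 \subseteq \mathrm{VP}_e$ by Lemma 7.2. Taking closures on both sides, we obtain $\overline{\mathrm{VP}_2} \subseteq \overline{\mathrm{VP}_e}$ and $\overline{\mathrm{VP}_2}^{\mathrm{poly}} \subseteq \overline{\mathrm{VP}_e}^{\mathrm{poly}}$. When $\mathrm{char}(\mathbb{F}) \neq 2$, $\mathrm{VP}_e \subseteq \overline{\mathrm{VP}_2}^{\mathrm{poly}}$ (Theorem 7.8). By taking closures it follows that $\overline{\mathrm{VP}_e} \subseteq \overline{\mathrm{VP}_2}$ and $\overline{\mathrm{VP}_e}^{\mathrm{poly}} \subseteq \overline{\mathrm{VP}_2}^{\mathrm{poly}}$.

Corollary 7.11. $\overline{\mathrm{VP}_2}^{\mathrm{poly}} = \mathrm{VP}_e$ when $\mathrm{char}(\mathbb{F}) \neq 2$ and $\mathbb{F}$ is infinite.

Proof. By Corollary 7.10, $\overline{\mathrm{VP}_2}^{\mathrm{poly}} = \overline{\mathrm{VP}_e}^{\mathrm{poly}}$. We prove $\overline{\mathrm{VP}_e}^{\mathrm{poly}} = \mathrm{VP}_e$ in Lemma 7.12 below.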

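Lemma 7.12. $\overline{\mathrm{VP}_e}^{\mathrm{poly}} = \mathrm{VP}_e$ when $\mathrm{char}(\mathbb{F}) \neq 2$ and $\mathbb{F}$ is infinite.

Proof. The inclusion $\mathrm{VP}_e \subseteq \overline{\mathrm{VP}_e}^{\mathrm{poly}}$ is trivially true. We prove the other direction. Let $(f_n) \in \overline{\mathrm{VP}_e}^{\mathrm{poly}}$. Then there are polynomials $f_{n,i}(x) \in \mathbb{F}[x]$ and $e(n) \in \mathrm{poly}(n)$ such that

\[ f_n(x) + \varepsilon f_{n,1}(x) + \varepsilon^2 f_{n,2}(x) + \cdots + \varepsilon^{e(n)} f_{n,e(n)}(x) \]

is computed by a poly-size formula $\Gamma$ over $\mathbb{F}(\varepsilon)$. Let $\alpha_0, \alpha_1, \ldots, \alpha_{e(n)}$ be distinct elements in $\mathbb{F}$ such that replacing $\varepsilon$ by $\alpha_j$ in $\Gamma$ is a valid substitution, i.e. not causing division by zero. These $\alpha_j$ exist, since our field is infinite by assumption. View

\[ g_n(\varepsilon) = f_n(x) + \varepsilon f_{n,1}(x) + \varepsilon^2 f_{n,2}(x) + \cdots + \varepsilon^{e(n)} f_{n,e(n)}(x) \]

as a polynomial in $\varepsilon$. The polynomial $g_n(\varepsilon)$ has degree at most $e(n)$, so we can write $g_n(\varepsilon)$ as follows (Lagrange interpolation on $e(n) + 1$ points):

\[ g_n(\varepsilon) = \sum_{j=0}^{e(n)} g_n(\alpha_j) \prod_{\substack{0 \leq m \leq e(n) \\ m \neq j}} \frac{\varepsilon - \alpha_m}{\alpha_j - \alpha_m}. \tag{7.6} \]

Clearly $f_n(x) = g_n(0)$. However, replacing $\varepsilon$ by $0$ in $\Gamma$ is not a valid substitution in general. From (7.6) we see directly how to write $g_n(0)$ as a linear combination of the values $g_n(\alpha_j)$, namely

\[ g_n(0) = \sum_{j=0}^{e(n)} g_n(\alpha_j) \prod_{\substack{0 \leq m \leq e(n) \\ m \neq j}} \frac{-\alpha_m}{\alpha_j - \alpha_m}, \]

that is,

\[ g_n(0) = \sum_{j=0}^{e(n)} \beta_j\, g_n(\alpha_j) \quad \text{with} \quad \beta_j = \prod_{\substack{0 \leq m \leq e(n) \\ m \neq j}} \frac{\alpha_m}{\alpha_m - \alpha_j}. \]

The value $g_n(\alpha_j)$ is computed by the formula $\Gamma$ with $\varepsilon$ replaced by $\alpha_j$, which we denote by $\Gamma|_{\varepsilon = \alpha_j}$. Thus $f_n(x)$ is computed by the poly-size formula $\sum_{j=0}^{e(n)} \beta_j\, \Gamma|_{\varepsilon = \alpha_j}$. We conclude $(f_n) \in \mathrm{VP}_e$.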
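The interpolation step can be made concrete with exact rational arithmetic. In the sketch below the polynomial `g` in $\varepsilon$ and the interpolation points are hypothetical examples; the weights $\beta_j$ follow the formula in the proof, and the value at $\varepsilon = 0$ is recovered from evaluations at nonzero points only.

```python
from fractions import Fraction

def extrapolation_weights(alphas):
    """beta_j = prod_{m != j} alpha_m / (alpha_m - alpha_j)."""
    weights = []
    for j, aj in enumerate(alphas):
        beta = Fraction(1)
        for m, am in enumerate(alphas):
            if m != j:
                beta *= Fraction(am) / (am - aj)
        weights.append(beta)
    return weights

def value_at_zero(g, alphas):
    """Recover g(0) as sum_j beta_j * g(alpha_j), without evaluating g at 0."""
    betas = extrapolation_weights(alphas)
    return sum(beta * g(a) for beta, a in zip(betas, alphas))

def g(eps):
    """Hypothetical example: main term 7 plus error terms up to degree 3."""
    return 7 + 5 * eps - 2 * eps**2 + 11 * eps**3

# Four distinct nonzero points suffice for a polynomial of degree 3.
alphas = [1, 2, 3, 4]
```

With these choices `value_at_zero(g, alphas)` returns the constant term of `g`, mirroring how $f_n = g_n(0)$ is obtained from the formulas $\Gamma|_{\varepsilon = \alpha_j}$.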

Remark 7.13. The statement of Lemma 7.12 also holds with $\mathrm{VP}_e$ replaced by $\mathrm{VP}_s$ or by $\mathrm{VP}$, by a similar proof.

7.4 Nondeterminism closure of $\mathrm{VP}_1$

Recall the definition of $\mathrm{VNP}_x = N(\mathrm{VP}_x)$ from Definition 7.7. Valiant proved the following characterisation of $\mathrm{VNP}$ in his seminal work [Val80]. See also [BCS97, Thm. 21.26], [Bur00, Thm. 2.13] and [MP08, Thm. 2].

Theorem 7.14 (Valiant [Val80]). $\mathrm{VNP}_e = \mathrm{VNP}$.

We strengthen Valiant's characterisation of $\mathrm{VNP}$ from $\mathrm{VNP}_e$ to $\mathrm{VNP}_1$.

Theorem 7.15. $\mathrm{VNP}_1 = \mathrm{VNP}$ when $\mathrm{char}(\mathbb{F}) \neq 2$.


The idea of the proof is "to simulate in $\mathrm{VNP}_1$" the primitives that we used in the proof of $\mathrm{VP}_e \subseteq \mathrm{VP}_3$ (Theorem 7.3).

Proof of Theorem 7.15. Clearly $\mathrm{VNP}_1 \subseteq \mathrm{VNP}$ by Lemma 7.2 and taking the nondeterminism closure $N$. We will prove that $\mathrm{VNP} \subseteq \mathrm{VNP}_1$. Recall that in the proof of $\mathrm{VP}_e \subseteq \mathrm{VP}_3$ (Theorem 7.3) we defined for any polynomial $h$ the matrix

\[ M(h) = \begin{pmatrix} 1 & 0 & 0 \\ h & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \]

and we called the following matrices primitives:

• $M(h)$ with $h$ any variable or any constant in $\mathbb{F}$

• the $3 \times 3$ permutation matrices, denoted by $M_\pi$ for $\pi \in S_3$

• the diagonal matrices $M_{a,b,c} = \mathrm{diag}(a, b, c)$ with $a, b, c \in \mathbb{F}$.

In the proof of $\mathrm{VP}_e \subseteq \mathrm{VP}_3$ we constructed, for any family $(f_n) \in \mathrm{VP}_e$, a sequence of primitive matrices $A_{n,1}, \ldots, A_{n,t(n)}$ with $t(n) \in \mathrm{poly}(n)$ such that

\[ f_n(x) = \begin{pmatrix} 1 & 1 & 1 \end{pmatrix} M_{-1,1,0}\, A_{n,1} \cdots A_{n,t(n)} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}. \tag{7.7} \]

We will show $\mathrm{VP}_e \subseteq \mathrm{VNP}_1$ by constructing a hypercube sum over a width-1 abp that evaluates the right-hand side of (7.7). This implies $\mathrm{VNP}_e \subseteq \mathrm{VNP}_1$ by taking the $N$-closure. Then, by Valiant's Theorem 7.14, $\mathrm{VNP} \subseteq \mathrm{VNP}_1$.

Let f(x) be a polynomial and let A1 Ak be primitive matrices suchthat f(x) is computed as

f(x) = ( 1 1 1 )Ak middot middot middotA1

(111

)

View this expression as a width-3 abp G with vertex layers labeled as shown inthe left-hand diagram in Fig 71 Assume for simplicity that all edges betweenlayers are present possibly with label 0 The sum of the values of every sndasht pathin G equals f(x)

\[ f(x) = \sum_{j \in [3]^k} A_k[j_k, j_{k-1}] \cdots A_1[j_2, j_1]. \tag{7.8} \]
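The expansion (7.8) is the usual path-sum reading of a matrix product. The following Python sketch checks it on small integer matrices. The helper names and the example matrices are illustrative and not taken from the text, and the sketch indexes the layers as 0, ..., k (an assumption made here) so that the i-th matrix connects layer i-1 to layer i.

```python
from itertools import product

def matmul(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[r][m] * B[m][c] for m in range(3)) for c in range(3)]
            for r in range(3)]

def value_by_matrices(mats):
    """(1 1 1) A_k ... A_1 (1 1 1)^T for mats = [A_1, ..., A_k]."""
    prod = [[int(r == c) for c in range(3)] for r in range(3)]
    for A in mats:                 # left-multiply, so prod = A_k ... A_1
        prod = matmul(A, prod)
    return sum(prod[r][c] for r in range(3) for c in range(3))

def value_by_paths(mats):
    """Sum over all s-t paths of the product of the edge labels."""
    k = len(mats)
    total = 0
    for j in product(range(3), repeat=k + 1):    # j[i] = vertex used in layer i
        term = 1
        for i in range(1, k + 1):
            term *= mats[i - 1][j[i]][j[i - 1]]  # entry A_i[j_i, j_{i-1}]
        total += term
    return total

example = [
    [[1, 0, 0], [5, 1, 0], [0, 0, 1]],   # M(5)
    [[0, 1, 0], [0, 0, 1], [1, 0, 0]],   # a permutation matrix
    [[2, 0, 0], [0, 3, 0], [0, 0, 7]],   # diag(2, 3, 7)
]
assert value_by_matrices(example) == value_by_paths(example)
```

Both functions compute the sum of all entries of the product $A_k \cdots A_1$; the second one simply expands that sum path by path.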

[Figure 7.1: Illustration of the layer labelling and the path labelling used in the proof of Theorem 7.15. Left-hand diagram: the vertex layers between $s$ and $t$ are labelled $0, 1, 2, \ldots, k-1, k$, with the matrices $A_1, A_2, \ldots, A_k$ marked between the layers. Right-hand diagram: an $s$–$t$ path whose layers carry the bit triples $(1,0,0)$, $(0,1,0)$, $(0,1,0)$, $(0,0,1)$, $(0,1,0)$.]

We introduce some hypercube variables. To every vertex of $G$, except $s$ and $t$, we associate a bit; the bits in the $i$th layer we call $b_1[i], b_2[i], b_3[i]$. To an $s$–$t$ path in $G$ we associate an assignment of the $b_j[i]$ by setting the bits of vertices visited by the path to 1 and the others to 0. For example, in the right-hand diagram in Fig. 7.1 we show an $s$–$t$ path with the corresponding assignment of the bits $b_j[i]$. The assignments of the $b_j[i]$ corresponding to $s$–$t$ paths are precisely the assignments such that for every $i \in [k]$ exactly one of $b_1[i], b_2[i], b_3[i]$ equals 1. Let

\[ V(b_1, b_2, b_3) = \prod_{i \in [k]} \Bigl( \bigl( b_1[i] + b_2[i] + b_3[i] \bigr) \prod_{\substack{s,t \in [3] \\ s \neq t}} \bigl( 1 - b_s[i]\, b_t[i] \bigr) \Bigr). \tag{7.9} \]

Then the assignments of the $b_j[i]$ corresponding to $s$–$t$ paths are precisely the assignments such that $V(b_1, b_2, b_3) = 1$. Otherwise, $V(b_1, b_2, b_3) = 0$.
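The claim that $V$ is the indicator of path assignments can be checked exhaustively for a small number of layers. The Python sketch below does this for two layers; the function names and the choice of two layers are assumptions made for illustration.

```python
from itertools import product

def V(layers):
    """Evaluate the product (7.9) on a list of bit triples, one per layer."""
    value = 1
    for b in layers:
        value *= b[0] + b[1] + b[2]
        for s in range(3):
            for t in range(3):
                if s != t:
                    value *= 1 - b[s] * b[t]
    return value

def is_path_assignment(layers):
    """True when exactly one bit is set in every layer."""
    return all(sum(b) == 1 for b in layers)

k = 2  # number of layers checked; small enough to enumerate all 2**(3k) cases
for bits in product((0, 1), repeat=3 * k):
    layers = [bits[3 * i:3 * i + 3] for i in range(k)]
    assert V(layers) == (1 if is_path_assignment(layers) else 0)
```

A layer with no bit set makes the first factor vanish, and a layer with two or more bits set makes one of the factors $1 - b_s[i] b_t[i]$ vanish.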

We will write $f(x)$ as a hypercube sum by replacing each $A_i[j_i, j_{i-1}]$ in (7.8) by a product of affine linear forms $S_i(A_i)$ with variables $b$ and $x$:

\[ \sum_{b} V(b_1, b_2, b_3)\, S_k(A_k) \cdots S_1(A_1). \]

Define the expression $\mathrm{Eq}(\alpha, \beta) = (1 - \alpha - \beta)(1 - \alpha - \beta)$ for $\alpha, \beta \in \{0, 1\}$. The expression $\mathrm{Eq}(\alpha, \beta)$ evaluates to 1 if $\alpha$ equals $\beta$ and evaluates to 0 otherwise.
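As a quick sanity check, the four Boolean cases of $\mathrm{Eq}$ can be tabulated. The snippet is only an illustration; the name `eq` is chosen here.

```python
def eq(alpha, beta):
    """Eq(alpha, beta) = (1 - alpha - beta)(1 - alpha - beta)."""
    return (1 - alpha - beta) * (1 - alpha - beta)

table = {(a, b): eq(a, b) for a in (0, 1) for b in (0, 1)}
print(table)  # {(0, 0): 1, (0, 1): 0, (1, 0): 0, (1, 1): 1}
```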

• For any variable or constant $x$ define

\[ S_i(M(x)) = \bigl( 1 + (x - 1)(b_1[i-1] - b_1[i]) \bigr) \cdot \bigl( 1 - (1 - b_2[i])\, b_2[i-1] \bigr) \cdot \mathrm{Eq}\bigl( b_3[i-1], b_3[i] \bigr). \]


• For any permutation $\pi \in S_3$ define

\[ S_i(M_\pi) = \mathrm{Eq}\bigl( b_1[i-1], b_{\pi(1)}[i] \bigr) \cdot \mathrm{Eq}\bigl( b_2[i-1], b_{\pi(2)}[i] \bigr) \cdot \mathrm{Eq}\bigl( b_3[i-1], b_{\pi(3)}[i] \bigr). \]

• For any constants $a, b, c \in \mathbb{F}$ define

\[ S_i(M_{a,b,c}) = \bigl( a \cdot b_1[i-1] + b \cdot b_2[i-1] + c \cdot b_3[i-1] \bigr) \cdot \mathrm{Eq}\bigl( b_1[i-1], b_1[i] \bigr) \cdot \mathrm{Eq}\bigl( b_2[i-1], b_2[i] \bigr) \cdot \mathrm{Eq}\bigl( b_3[i-1], b_3[i] \bigr). \]

One verifies that

\[ f(x) = \sum_{b} V(b_1, b_2, b_3)\, S_k(A_k) \cdots S_1(A_1). \]
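The verification can be carried out by brute force on a small example. The Python sketch below is one way to do so under conventions chosen here, not taken from the text: layers are indexed 0, ..., k, the indicator $V$ ranges over all of these layers, the first factor of $S_i(M(x))$ is taken as $1 + (x-1)(b_1[i-1] - b_1[i])$, and $M_\pi$ has its 1-entries in positions $(\pi(j), j)$. With these conventions each $S_i$ reproduces the entry $A_i[j_i, j_{i-1}]$ on path assignments. All function names are hypothetical.

```python
from itertools import product

def eq(a, b):
    return (1 - a - b) * (1 - a - b)

def V(layers):
    """Indicator that every layer has exactly one bit set."""
    value = 1
    for b in layers:
        value *= b[0] + b[1] + b[2]
        for s in range(3):
            for t in range(3):
                if s != t:
                    value *= 1 - b[s] * b[t]
    return value

# Each primitive is returned as a pair (matrix, S), where S(prev, cur)
# takes the bit triples of layers i-1 and i.
def prim_M(x):
    mat = [[1, 0, 0], [x, 1, 0], [0, 0, 1]]
    def S(prev, cur):
        return ((1 + (x - 1) * (prev[0] - cur[0]))
                * (1 - (1 - cur[1]) * prev[1])
                * eq(prev[2], cur[2]))
    return mat, S

def prim_perm(pi):
    # pi is a 0-indexed permutation; the matrix has a 1 at [pi[j]][j].
    mat = [[int(r == pi[c]) for c in range(3)] for r in range(3)]
    def S(prev, cur):
        return (eq(prev[0], cur[pi[0]]) * eq(prev[1], cur[pi[1]])
                * eq(prev[2], cur[pi[2]]))
    return mat, S

def prim_diag(a, b, c):
    mat = [[a, 0, 0], [0, b, 0], [0, 0, c]]
    def S(prev, cur):
        return ((a * prev[0] + b * prev[1] + c * prev[2])
                * eq(prev[0], cur[0]) * eq(prev[1], cur[1]) * eq(prev[2], cur[2]))
    return mat, S

def path_sum(prims):
    """Sum over vertex sequences j_0, ..., j_k of the products of matrix entries."""
    k = len(prims)
    total = 0
    for j in product(range(3), repeat=k + 1):
        term = 1
        for i in range(1, k + 1):
            term *= prims[i - 1][0][j[i]][j[i - 1]]
        total += term
    return total

def hypercube_sum(prims):
    """Sum over all bit assignments b of V(b) * S_k * ... * S_1."""
    k = len(prims)
    total = 0
    for bits in product((0, 1), repeat=3 * (k + 1)):
        layers = [bits[3 * i:3 * i + 3] for i in range(k + 1)]
        term = V(layers)
        for i in range(1, k + 1):
            term *= prims[i - 1][1](layers[i - 1], layers[i])
        total += term
    return total

prims = [prim_M(5), prim_perm((1, 2, 0)), prim_diag(2, 3, 7), prim_M(-4)]
assert hypercube_sum(prims) == path_sum(prims)
```

The enumeration is exponential in the number of layers, so this is only a check on tiny instances, not an evaluation procedure.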

Some of the factors in the expressions for the $S_i(A_i)$ are not affine linear. As a final step we apply the equality

\[ 1 + xy = \frac{1}{2} \sum_{c \in \{0,1\}} (x + 1 - 2c)(y + 1 - 2c) \]

to write these factors as products of affine linear forms, introducing new hypercube variables.
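The equality can be confirmed by expanding both summands: $(x+1)(y+1) + (x-1)(y-1) = 2xy + 2$. The short Python check below evaluates it over the rationals on a grid of integer points; the function name `rhs` and the grid are chosen here for illustration. The factor $\tfrac{1}{2}$ requires that 2 be invertible, consistent with the assumption $\operatorname{char}(\mathbb{F}) \neq 2$ in Theorem 7.15.

```python
from fractions import Fraction
from itertools import product

def rhs(x, y):
    """(1/2) * sum over c in {0, 1} of (x + 1 - 2c)(y + 1 - 2c)."""
    return Fraction(1, 2) * sum((x + 1 - 2 * c) * (y + 1 - 2 * c) for c in (0, 1))

for x, y in product(range(-3, 4), repeat=2):
    assert rhs(x, y) == 1 + x * y
```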

7.5 Conclusion

We finish with an overview of inclusions, equalities and separations among the classes $\mathrm{VP}_k$, $\mathrm{VP}_e$, $\mathrm{VP}$ and their approximation and nondeterminism closures (when $\operatorname{char}(\mathbb{F}) \neq 2$); see Fig. 7.2. The figure relies on the following two simple lemmas, proofs of which can be found in our paper [BIZ17].

Lemma 7.16 ([BIZ17, Prop. 5.10]). $\overline{\mathrm{VP}_1} = \mathrm{VP}_1$.

Lemma 7.17 ([BIZ17, Prop. 5.11]). $\mathrm{VP}_1 \subsetneq \mathrm{VNP}_1$ when $\operatorname{char}(\mathbb{F}) \neq 2$.

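[Figure 7.2: Overview of relations among the algebraic complexity classes $\mathrm{VP}_k$, $\mathrm{VP}_e$, $\mathrm{VP}$ and their approximation and nondeterminism closures (when $\operatorname{char}(\mathbb{F})$ is not 2). The diagram has three rows of classes: $\mathrm{VP}_1, \mathrm{VP}_2, \mathrm{VP}_e, \mathrm{VP}$; their approximation closures; and $\mathrm{VNP}_1, \mathrm{VNP}_2, \mathrm{VNP}_e, \mathrm{VNP}$. The relations carry the reference labels [AW16], 7.17, 7.16, 7.10, 7.15, [Val80] and [Val79]. The relations without reference are either by definition or follow logically from the other relations.]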

Bibliography

[AJRS13] Elizabeth S. Allman, Peter D. Jarvis, John A. Rhodes, and Jeremy G. Sumner. Tensor rank, invariants, inequalities, and applications. SIAM J. Matrix Anal. Appl., 34(3):1014–1045, 2013. doi:10.1137/120899066.

[Alo98] Noga Alon. The Shannon capacity of a union. Combinatorica, 18(3):301–310, 1998. doi:10.1007/PL00009824.

[ASU13] Noga Alon, Amir Shpilka, and Christopher Umans. On sunflowers and matrix multiplication. Comput. Complexity, 22(2):219–243, Jun 2013. doi:10.1007/s00037-013-0060-1.

[AW16] Eric Allender and Fengming Wang. On the power of algebraic branching programs of width two. Comput. Complexity, 25(1):217–253, 2016. doi:10.1007/s00037-015-0114-7.

[AZ14] Martin Aigner and Günter M. Ziegler. Proofs from The Book. Springer-Verlag, Berlin, fifth edition, 2014. doi:10.1007/978-3-662-44205-0.

[BC18] Boris Bukh and Christopher Cox. On a fractional version of Haemers' bound. arXiv, 2018. arXiv:1802.00476.

[BCC+17] Jonah Blasiak, Thomas Church, Henry Cohn, Joshua A. Grochow, Eric Naslund, William F. Sawin, and Chris Umans. On cap sets and the group-theoretic approach to matrix multiplication. Discrete Anal., 2017. arXiv:1605.06702, doi:10.19086/da.1245.


[BCPZ16] Harry Buhrman, Matthias Christandl, Christopher Perry, and Jeroen Zuiddam. Clean quantum and classical communication protocols. Phys. Rev. Lett., 117:230503, Dec 2016. doi:10.1103/PhysRevLett.117.230503.

[BCRL79] Dario Bini, Milvio Capovani, Francesco Romani, and Grazia Lotti. $O(n^{2.7799})$ complexity for $n \times n$ approximate matrix multiplication. Inf. Process. Lett., 8(5):234–235, 1979. doi:10.1016/0020-0190(79)90113-3.

[BCS97] Peter Bürgisser, Michael Clausen, and M. Amin Shokrollahi. Algebraic complexity theory, volume 315 of Grundlehren Math. Wiss. Springer-Verlag, Berlin, 1997. doi:10.1007/978-3-662-03338-8.

[BCSX10] Arnab Bhattacharyya, Victor Chen, Madhu Sudan, and Ning Xie. Testing Linear-Invariant Non-linear Properties: A Short Report, pages 260–268. Springer Berlin Heidelberg, Berlin, Heidelberg, 2010. doi:10.1007/978-3-642-16367-8_18.

[BCZ17a] Markus Bläser, Matthias Christandl, and Jeroen Zuiddam. The border support rank of two-by-two matrix multiplication is seven. arXiv, 2017. arXiv:1705.09652.

[BCZ17b] Harry Buhrman, Matthias Christandl, and Jeroen Zuiddam. Nondeterministic Quantum Communication Complexity: the Cyclic Equality Game and Iterated Matrix Multiplication. In Christos H. Papadimitriou, editor, 8th Innovations in Theoretical Computer Science Conference (ITCS 2017), pages 24:1–24:18, 2017. arXiv:1603.03757, doi:10.4230/LIPIcs.ITCS.2017.24.

[Ber84] Stuart J. Berkowitz. On computing the determinant in small parallel time using a small number of processors. Inform. Process. Lett., 18(3):147–150, 1984. doi:10.1016/0020-0190(84)90018-8.

[BI13] Peter Bürgisser and Christian Ikenmeyer. Explicit lower bounds via geometric complexity theory. Proceedings 45th Annual ACM Symposium on Theory of Computing 2013, pages 141–150, 2013. doi:10.1145/2488608.2488627.

[Bin80] Dario Bini. Relations between exact and approximate bilinear algorithms. Applications. Calcolo, 17(1):87–97, 1980. doi:10.1007/BF02575865.


[BIZ17] Karl Bringmann, Christian Ikenmeyer, and Jeroen Zuiddam. On Algebraic Branching Programs of Small Width. In Ryan O'Donnell, editor, 32nd Computational Complexity Conference (CCC 2017), pages 20:1–20:31, 2017. doi:10.4230/LIPIcs.CCC.2017.20.

[Bla13] Anna Blasiak. A graph-theoretic approach to network coding. PhD thesis, Cornell University, 2013. URL: https://ecommons.cornell.edu/bitstream/handle/1813/34147/ab675.pdf.

[BLMW11] Peter Bürgisser, Joseph M. Landsberg, Laurent Manivel, and Jerzy Weyman. An overview of mathematical issues arising in the geometric complexity theory approach to VP ≠ VNP. SIAM J. Comput., 40(4):1179–1209, 2011. doi:10.1137/090765328.

[BOC92] Michael Ben-Or and Richard Cleve. Computing algebraic formulas using a constant number of registers. SIAM J. Comput., 21(1):54–58, 1992. doi:10.1137/0221006.

[BPR+00] Charles H. Bennett, Sandu Popescu, Daniel Rohrlich, John A. Smolin, and Ashish V. Thapliyal. Exact and asymptotic measures of multipartite pure-state entanglement. Phys. Rev. A, 63(1):012307, 2000. doi:10.1103/PhysRevA.63.012307.

[Bre74] Richard P. Brent. The parallel evaluation of general arithmetic expressions. J. ACM, 21(2):201–206, April 1974. doi:10.1145/321812.321815.

[Bri87] Michel Brion. Sur l'image de l'application moment. In Séminaire d'algèbre Paul Dubreil et Marie-Paule Malliavin (Paris, 1986), volume 1296 of Lecture Notes in Math., pages 177–192. Springer, Berlin, 1987. doi:10.1007/BFb0078526.

[BS83] Eberhard Becker and Niels Schwartz. Zum Darstellungssatz von Kadison-Dubois. Arch. Math. (Basel), 40(5):421–428, 1983. doi:10.1007/BF01192806.

[Bur90] Peter Bürgisser. Degenerationsordnung und Trägerfunktional bilinearer Abbildungen. PhD thesis, Universität Konstanz, 1990. http://nbn-resolving.de/urn:nbn:de:bsz:352-opus-20311.

[Bur00] Peter Bürgisser. Completeness and reduction in algebraic complexity theory, volume 7 of Algorithms and Computation in Mathematics. Springer-Verlag, Berlin, 2000. doi:10.1007/978-3-662-04179-6.


[Bur04] Peter Bürgisser. The complexity of factors of multivariate polynomials. Found. Comput. Math., 4(4):369–396, 2004. doi:10.1007/s10208-002-0059-5.

[BX15] Arnab Bhattacharyya and Ning Xie. Lower bounds for testing triangle-freeness in boolean functions. Comput. Complexity, 24(1):65–101, 2015. doi:10.1007/s00037-014-0092-1.

[BZ17] Jop Briët and Jeroen Zuiddam. On the orthogonal rank of Cayley graphs and impossibility of quantum round elimination. Quantum Inf. Comput., 17(1&2), 2017. URL: http://www.rintonpress.com/xxqic17/qic-17-12/0106-0116.pdf, arXiv:1608.06113.

[CHM07] Matthias Christandl, Aram W. Harrow, and Graeme Mitchison. Nonzero Kronecker coefficients and what they tell us about spectra. Comm. Math. Phys., 270(3):575–585, 2007. doi:10.1007/s00220-006-0157-3.

[CJZ18] Matthias Christandl, Asger Kjærulff Jensen, and Jeroen Zuiddam. Tensor rank is not multiplicative under the tensor product. Linear Algebra Appl., 543:125–139, 2018. doi:10.1016/j.laa.2017.12.020.

[CKSV16] Suryajith Chillara, Mrinal Kumar, Ramprasad Saptharishi, and V. Vinay. The chasm at depth four, and tensor rank: Old results, new insights. arXiv, 2016. arXiv:1606.04200.

[CLP17] Ernie Croot, Vsevolod F. Lev, and Péter Pál Pach. Progression-free sets in $\mathbb{Z}_4^n$ are exponentially small. Ann. of Math. (2), 185(1):331–337, 2017. doi:10.4007/annals.2017.185.1.7.

[CM06] Matthias Christandl and Graeme Mitchison. The spectra of quantum states and the Kronecker coefficients of the symmetric group. Comm. Math. Phys., 261(3):789–797, 2006. doi:10.1007/s00220-005-1435-1.

[CMR+14] Toby Cubitt, Laura Mančinska, David E. Roberson, Simone Severini, Dan Stahlke, and Andreas Winter. Bounds on entanglement-assisted source-channel coding via the Lovász theta number and its variants. IEEE Trans. Inform. Theory, 60(11):7330–7344, 2014. arXiv:1310.7120, doi:10.1109/TIT.2014.2349502.


[CT12] Thomas M. Cover and Joy A. Thomas. Elements of information theory. John Wiley & Sons, 2012.

[CU13] Henry Cohn and Christopher Umans. Fast matrix multiplication using coherent configurations. In Proceedings of the Twenty-Fourth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1074–1086. SIAM, 2013.

[CVZ16] Matthias Christandl, Péter Vrana, and Jeroen Zuiddam. Asymptotic tensor rank of graph tensors: beyond matrix multiplication. arXiv, 2016. arXiv:1609.07476.

[CVZ18] Matthias Christandl, Péter Vrana, and Jeroen Zuiddam. Universal points in the asymptotic spectrum of tensors. In Proceedings of 50th Annual ACM SIGACT Symposium on the Theory of Computing (STOC'18). ACM, New York, 2018. arXiv:1709.07851, doi:10.1145/3188745.3188766.

[CW82] Don Coppersmith and Shmuel Winograd. On the asymptotic complexity of matrix multiplication. SIAM J. Comput., 11(3):472–492, 1982. doi:10.1137/0211038.

[CW87] Don Coppersmith and Shmuel Winograd. Matrix multiplication via arithmetic progressions. In Proceedings of the nineteenth annual ACM symposium on Theory of computing, pages 1–6. ACM, 1987.

[CW90] Don Coppersmith and Shmuel Winograd. Matrix multiplication via arithmetic progressions. J. Symbolic Comput., 9(3):251–280, 1990. doi:10.1016/S0747-7171(08)80013-2.

[CZ18] Matthias Christandl and Jeroen Zuiddam. Tensor surgery and tensor rank. Comput. Complexity, Mar 2018. doi:10.1007/s00037-018-0164-8.

[Dra15] Jan Draisma. Multilinear Algebra and Applications (lecture notes), 2015. URL: https://mathsites.unibe.ch/jdraisma/publications/mlappl.pdf.

[DVC00] Wolfgang Dür, Guifré Vidal, and Juan Ignacio Cirac. Three qubits can be entangled in two inequivalent ways. Phys. Rev. A (3), 62(6):062314, 12, 2000. doi:10.1103/PhysRevA.62.062314.


[Ede04] Yves Edel. Extensions of generalized product caps. Des. Codes Cryptogr., 31(1):5–14, 2004. doi:10.1023/A:1027365901231.

[EG17] Jordan S. Ellenberg and Dion Gijswijt. On large subsets of $\mathbb{F}_q^n$ with no three-term arithmetic progression. Ann. of Math. (2), 185(1):339–343, 2017. doi:10.4007/annals.2017.185.1.8.

[FK14] Hu Fu and Robert Kleinberg. Improved lower bounds for testing triangle-freeness in boolean functions via fast matrix multiplication. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2014), pages 669–676, 2014. doi:10.4230/LIPIcs.APPROX-RANDOM.2014.669.

[For16] Michael Forbes. Some concrete questions on the border complexity of polynomials. Presentation given at the Workshop on Algebraic Complexity Theory WACT 2016 in Tel Aviv, https://www.youtube.com/watch?v=1HMogQIHT6Q, 2016.

[Fra02] Matthias Franz. Moment polytopes of projective G-varieties and tensor products of symmetric group representations. J. Lie Theory, 12(2):539–549, 2002. URL: http://emis.ams.org/journals/JLT/vol.12_no.2/16.html.

[Fri17] Tobias Fritz. Resource convertibility and ordered commutative monoids. Math. Structures Comput. Sci., 27(6):850–938, 2017. doi:10.1017/S0960129515000444.

[Ful97] William Fulton. Young tableaux, volume 35 of London Mathematical Society Student Texts. Cambridge University Press, Cambridge, 1997. With applications to representation theory and geometry.

[GKKS13] Ankit Gupta, Pritish Kamath, Neeraj Kayal, and Ramprasad Saptharishi. Approaching the chasm at depth four. In 2013 IEEE Conference on Computational Complexity, CCC 2013, pages 65–73. IEEE Computer Soc., Los Alamitos, CA, 2013. doi:10.1109/CCC.2013.16.

[GMQ16] Joshua A. Grochow, Ketan D. Mulmuley, and Youming Qiao. Boundaries of VP and VNP. In Ioannis Chatzigiannakis, Michael Mitzenmacher, Yuval Rabani, and Davide Sangiorgi, editors, 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016), volume 55, pages 34:1–34:14, 2016. arXiv:1605.02815, doi:10.4230/LIPIcs.ICALP.2016.34.

[Gro13] Joshua A. Grochow. Unifying and generalizing known lower bounds via geometric complexity theory. arXiv, 2013. arXiv:1304.6333.

[GW09] Roe Goodman and Nolan R. Wallach. Symmetry, representations, and invariants, volume 255 of Graduate Texts in Mathematics. Springer, Dordrecht, 2009. doi:10.1007/978-0-387-79852-3.

[Hae79] Willem Haemers. On some problems of Lovász concerning the Shannon capacity of a graph. IEEE Trans. Inform. Theory, 25(2):231–232, 1979. doi:10.1109/TIT.1979.1056027.

[Has90] Johan Håstad. Tensor rank is NP-complete. J. Algorithms, 11(4):644–654, 1990. doi:10.1016/0196-6774(90)90014-6.

[HHHH09] Ryszard Horodecki, Paweł Horodecki, Michał Horodecki, and Karol Horodecki. Quantum entanglement. Rev. Modern Phys., 81(2):865–942, 2009. doi:10.1103/RevModPhys.81.865.

[HIL13] Jonathan D. Hauenstein, Christian Ikenmeyer, and Joseph M. Landsberg. Equations for lower bounds on border rank. Exp. Math., 22(4):372–383, 2013. doi:10.1080/10586458.2013.825892.

[Hum75] James E. Humphreys. Linear algebraic groups. Springer-Verlag, New York-Heidelberg, 1975. Graduate Texts in Mathematics, No. 21.

[HX17] Ishay Haviv and Ning Xie. Sunflowers and testing triangle-freeness of functions. Comput. Complexity, 26(2):497–530, Jun 2017. doi:10.1007/s00037-016-0138-7.

[Ike13] Christian Ikenmeyer. Geometric complexity theory, tensor rank, and Littlewood–Richardson coefficients. PhD thesis, Universität Paderborn, 2013.

[Kar72] Richard M. Karp. Reducibility among combinatorial problems. In Complexity of computer computations (Proc. Sympos., IBM Thomas J. Watson Res. Center, Yorktown Heights, N.Y., 1972), pages 85–103. Plenum, New York, 1972.


[Knu94] Donald E. Knuth. The sandwich theorem. Electron. J. Combin., 1, 1994. URL: http://www.combinatorics.org/Volume_1/Abstracts/v1i1a1.html.

[Kra84] Hanspeter Kraft. Geometrische Methoden in der Invariantentheorie. Springer, 1984. doi:10.1007/978-3-663-10143-7.

[KS08] Tali Kaufman and Madhu Sudan. Algebraic property testing: The role of invariance. In Proceedings of the Fortieth Annual ACM Symposium on Theory of Computing, STOC '08, pages 403–412, New York, NY, USA, 2008. ACM. doi:10.1145/1374376.1374434.

[KSS16] Robert Kleinberg, William F. Sawin, and David E. Speyer. The growth rate of tri-colored sum-free sets. arXiv, 2016. arXiv:1607.00047.

[Lan06] Joseph M. Landsberg. The border rank of the multiplication of $2 \times 2$ matrices is seven. J. Amer. Math. Soc., 19(2):447–459, 2006. doi:10.1090/S0894-0347-05-00506-0.

[LG14] François Le Gall. Powers of tensors and fast matrix multiplication. In ISSAC 2014: Proceedings of the 39th International Symposium on Symbolic and Algebraic Computation, pages 296–303. ACM, New York, 2014. doi:10.1145/2608628.2608664.

[Lic84] Thomas Lickteig. A note on border rank. Inf. Process. Lett., 18(3):173–178, 1984. doi:10.1016/0020-0190(84)90023-1.

[LM16a] Joseph M. Landsberg and Mateusz Michałek. A $2n^2 - \log(n) - 1$ lower bound for the border rank of matrix multiplication. arXiv, 2016. arXiv:1608.07486.

[LM16b] Joseph M. Landsberg and Mateusz Michałek. Abelian tensors. J. Math. Pures Appl., 2016. doi:10.1016/j.matpur.2016.11.004.

[LMR13] Joseph M. Landsberg, Laurent Manivel, and Nicolas Ressayre. Hypersurfaces with degenerate duals and the geometric complexity theory program. Comment. Math. Helv., 88(2):469–484, 2013. doi:10.4171/CMH/292.

[LO15] Joseph M. Landsberg and Giorgio Ottaviani. New lower bounds for the border rank of matrix multiplication. Theory Comput., 11:285–298, 2015. arXiv:1112.6007, doi:10.4086/toc.2015.v011a011.

[Lov79] László Lovász. On the Shannon capacity of a graph. IEEE Trans. Inform. Theory, 25(1):1–7, 1979. doi:10.1109/TIT.1979.1055985.

[Mar08] Murray Marshall. Positive polynomials and sums of squares, volume 146 of Mathematical Surveys and Monographs. American Mathematical Society, Providence, RI, 2008. doi:10.1090/surv/146.

[MP71] Robert J. McEliece and Edward C. Posner. Hide and seek, data storage, and entropy. The Annals of Mathematical Statistics, 42(5):1706–1716, 1971. doi:10.1214/aoms/1177693169.

[MP08] Guillaume Malod and Natacha Portier. Characterizing Valiant's algebraic complexity classes. J. Complexity, 24(1):16–38, 2008. doi:10.1016/j.jco.2006.09.006.

[MS01] Ketan D. Mulmuley and Milind Sohoni. Geometric complexity theory. I. An approach to the P vs. NP and related problems. SIAM J. Comput., 31(2):496–526, 2001. doi:10.1137/S009753970038715X.

[MS08] Ketan D. Mulmuley and Milind Sohoni. Geometric complexity theory. II. Towards explicit obstructions for embeddings among class varieties. SIAM J. Comput., 38(3):1175–1206, 2008. doi:10.1137/080718115.

[Nes84] Linda Ness. A stratification of the null cone via the moment map. Amer. J. Math., 106(6):1281–1329, 1984. With an appendix by David Mumford. doi:10.2307/2374395.

[Nis91] Noam Nisan. Lower bounds for non-commutative computation. In Proceedings of the twenty-third annual ACM symposium on Theory of computing, pages 410–418. ACM, 1991. doi:10.1145/103418.103462.

[Nor16] Sergey Norin. A distribution on triples with maximum entropy marginal. arXiv, 2016. arXiv:1608.00243.

[NW97] Noam Nisan and Avi Wigderson. Lower bounds on arithmetic circuits via partial derivatives. Comput. Complexity, 6(3):217–234, 1996/97. doi:10.1007/BF01294256.


[Pan78] Victor Ya. Pan. Strassen's algorithm is not optimal. Trilinear technique of aggregating, uniting and canceling for constructing fast algorithms for matrix operations. In 19th Annual Symposium on Foundations of Computer Science (Ann Arbor, Mich., 1978), pages 166–176. IEEE, Long Beach, Calif., 1978.

[Pan80] Victor Ya. Pan. New fast algorithms for matrix operations. SIAM J. Comput., 9(2):321–342, 1980. doi:10.1137/0209027.

[Pan81] Victor Ya. Pan. New combinations of methods for the acceleration of matrix multiplication. Comput. Math. Appl., 7(1):73–125, 1981. doi:10.1016/0898-1221(81)90009-2.

[Pan84] Victor Ya. Pan. How to multiply matrices faster, volume 179 of Lecture Notes in Computer Science. Springer-Verlag, Berlin, 1984. doi:10.1007/3-540-13866-8.

[Pan18] Victor Ya. Pan. Fast feasible and unfeasible matrix multiplication. arXiv, 2018. arXiv:1804.04102.

[PD01] Alexander Prestel and Charles N. Delzell. Positive polynomials. Springer Monographs in Mathematics. Springer-Verlag, Berlin, 2001. From Hilbert's 17th problem to real algebra. doi:10.1007/978-3-662-04648-7.

[Peb16] Luke Pebody. Proof of a conjecture of Kleinberg-Sawin-Speyer. arXiv, 2016. arXiv:1608.05740.

[PS98] George Pólya and Gábor Szegő. Problems and theorems in analysis. I. Classics in Mathematics. Springer-Verlag, Berlin, 1998. Series, integral calculus, theory of functions. Translated from the German by Dorothee Aeppli. Reprint of the 1978 English translation. doi:10.1007/978-3-642-61905-2.

[Raz09] Ran Raz. Multi-linear formulas for permanent and determinant are of super-polynomial size. J. ACM, 56(2):Art. 8, 17, 2009. doi:10.1145/1502793.1502797.

[Raz13] Ran Raz. Tensor-rank and lower bounds for arithmetic formulas. J. ACM, 60(6):Art. 40, 15, 2013. doi:10.1145/2535928.

[Rom82] Francesco Romani. Some properties of disjoint sums of tensors related to matrix multiplication. SIAM J. Comput., 11(2):263–267, 1982. doi:10.1137/0211020.


[Sap16] Ramprasad Saptharishi. A survey of lower bounds in arithmetic circuit complexity 3.0.2, 2016. Online survey. URL: https://github.com/dasarpmar/lowerbounds-survey.

[Sch81] Arnold Schönhage. Partial and total matrix multiplication. SIAM J. Comput., 10(3):434–455, 1981.

[Sch03] Alexander Schrijver. Combinatorial optimization: polyhedra and efficiency, volume 24. Springer Science & Business Media, 2003.

[Sha56] Claude E. Shannon. The zero error capacity of a noisy channel. Institute of Radio Engineers, Transactions on Information Theory, IT-2(September):8–19, 1956. doi:10.1109/TIT.1956.1056798.

[Sha09] Asaf Shapira. Green's conjecture and testing linear-invariant properties. In Proceedings of the Forty-first Annual ACM Symposium on Theory of Computing, STOC '09, pages 159–166, New York, NY, USA, 2009. ACM. doi:10.1145/1536414.1536438.

[Shi16] Yaroslav Shitov. How hard is the tensor rank? arXiv, 2016. arXiv:1611.01559.

[Sin64] Richard C. Singleton. Maximum distance q-nary codes. IEEE Trans. Information Theory, IT-10:116–118, 1964. doi:10.1109/TIT.1964.1053661.

[SOK14] Adam Sawicki, Michał Oszmaniec, and Marek Kuś. Convexity of momentum map, Morse index, and quantum entanglement. Rev. Math. Phys., 26(3):1450004, 39, 2014. doi:10.1142/S0129055X14500044.

[SSS09] Chandan Saha, Ramprasad Saptharishi, and Nitin Saxena. The power of depth 2 circuits over algebras. In IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science, volume 4, pages 371–382, 2009. arXiv:0904.2058, doi:10.4230/LIPIcs.FSTTCS.2009.2333.

[Sto10] Andrew James Stothers. On the complexity of matrix multiplication. PhD thesis, University of Edinburgh, 2010. http://hdl.handle.net/1842/4734.


[Str69] Volker Strassen. Gaussian elimination is not optimal. Numer. Math., 13(4):354–356, 1969. doi:10.1007/BF02165411.

[Str83] Volker Strassen. Rank and optimal computation of generic tensors. Linear Algebra Appl., 52/53:645–685, 1983. doi:10.1016/0024-3795(83)80041-X.

[Str86] Volker Strassen. The asymptotic spectrum of tensors and the exponent of matrix multiplication. In Proceedings of the 27th Annual Symposium on Foundations of Computer Science, SFCS '86, pages 49–54, Washington, DC, USA, 1986. IEEE Computer Society. doi:10.1109/SFCS.1986.52.

[Str87] Volker Strassen. Relative bilinear complexity and matrix multiplication. J. Reine Angew. Math., 375/376:406–443, 1987. doi:10.1515/crll.1987.375-376.406.

[Str88] Volker Strassen. The asymptotic spectrum of tensors. J. Reine Angew. Math., 384:102–152, 1988. doi:10.1515/crll.1988.384.102.

[Str91] Volker Strassen. Degeneration and complexity of bilinear maps: some asymptotic spectra. J. Reine Angew. Math., 413:127–180, 1991. doi:10.1515/crll.1991.413.127.

[Str94] Volker Strassen. Algebra and complexity. In First European Congress of Mathematics, Vol. II (Paris, 1992), volume 120 of Progr. Math., pages 429–446. Birkhäuser, Basel, 1994. doi:10.1007/s10107-008-0221-1.

[Str05] Volker Strassen. Komplexität und Geometrie bilinearer Abbildungen. Jahresber. Deutsch. Math.-Verein., 107(1):3–31, 2005.

[Tao08] Terence Tao. Structure and randomness: pages from year one of a mathematical blog. American Mathematical Soc., 2008.

[Tao16] Terence Tao. A symmetric formulation of the Croot–Lev–Pach–Ellenberg–Gijswijt capset bound. https://terrytao.wordpress.com, 2016.

[Tob91] Verena Tobler. Spezialisierung und Degeneration von Tensoren. PhD thesis, Universität Konstanz, 1991. http://nbn-resolving.de/urn:nbn:de:bsz:352-opus-20324.


[TS16] Terence Tao and Will Sawin. Notes on the "slice rank" of tensors. https://terrytao.wordpress.com, 2016.

[Val79] Leslie G. Valiant. Completeness classes in algebra. In Conference Record of the Eleventh Annual ACM Symposium on Theory of Computing (Atlanta, Ga., 1979), pages 249–261. ACM, New York, 1979. doi:10.1145/800135.804419.

[Val80] Leslie G. Valiant. Reducibility by algebraic projections. University of Edinburgh, Department of Computer Science, 1980. Internal Report.

[VC15] Péter Vrana and Matthias Christandl. Asymptotic entanglement transformation between W and GHZ states. J. Math. Phys., 56(2):022204, 12, 2015. arXiv:1310.3244, doi:10.1063/1.4908106.

[VDDMV02] F. Verstraete, J. Dehaene, B. De Moor, and H. Verschelde. Four qubits can be entangled in nine different ways. Phys. Rev. A (3), 65(5, part A):052112, 5, 2002. doi:10.1103/PhysRevA.65.052112.

[Wal14] Michael Walter. Multipartite quantum states and their marginals. PhD thesis, ETH Zurich, 2014. arXiv:1410.6820.

[WDGC13] Michael Walter, Brent Doran, David Gross, and Matthias Christandl. Entanglement polytopes: multiparticle entanglement from single-particle information. Science, 340(6137):1205–1208, 2013. arXiv:1208.0365, doi:10.1126/science.1232957.

[Wil12] Virginia Vassilevska Williams. Multiplying matrices faster than Coppersmith-Winograd [extended abstract]. In STOC'12: Proceedings of the 2012 ACM Symposium on Theory of Computing, pages 887–898. ACM, New York, 2012. doi:10.1145/2213977.2214056.

[Zui17] Jeroen Zuiddam. A note on the gap between rank and border rank. Linear Algebra Appl., 525:33–44, 2017. doi:10.1016/j.laa.2017.03.015.

[Zui18] Jeroen Zuiddam. The asymptotic spectrum of graphs and the Shannon capacity. arXiv, 2018. arXiv:1807.00169.

Glossary

$\langle n \rangle$: $n \times \cdots \times n$ diagonal tensor

$\langle a, b, c \rangle$: matrix multiplication tensor

$G * H$: or-product

$G \boxtimes H$: strong graph product, and-product

$\alpha(G)$: stability number

$\overline{\chi}(G)$: clique cover number

$K_k$: complete graph on $k$ vertices

$F^{\theta}(t)$: quantum functional

$G(t)$: $\mathrm{GL}_{n_1} \times \cdots \times \mathrm{GL}_{n_k}$ for $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$

$H(P)$: Shannon entropy of probability distribution $P$

$h(p)$: binary entropy of probability $p \in [0, 1]$

$\tau(\Phi)$: hitting set number

$\tilde{\tau}(\Phi)$: asymptotic hitting set number

$\omega$: matrix multiplication exponent

$\mathbf{P}$: moment polytope

$\mathcal{P}(X)$: the set of probability distributions on $X$

$\mathrm{R}$: rank

$\tilde{\mathrm{R}}$: asymptotic rank

$\underline{\mathrm{R}}(t)$: border rank

$\mathrm{R}(G)$: rank of a graph, clique cover number

$\mathrm{R}(t)$: tensor rank

$\mathrm{SR}(t)$: slice rank

$\mathrm{Q}$: subrank

$\tilde{\mathrm{Q}}$: asymptotic subrank

$\underline{\mathrm{Q}}(t)$: border subrank

$\mathrm{Q}(\Phi)$: combinatorial subrank

$\mathrm{Q}(G)$: subrank of a graph, stability number

$\operatorname{supp}(t)$: support

$\Theta(G)$: Shannon capacity

$\vartheta(G)$: Lovász theta number

$G \sqcup H$: disjoint union

$W(t)$: $S_{n_1} \times \cdots \times S_{n_k}$ for $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$

$\mathbf{X}(S, \leq)$: asymptotic spectrum of semiring $S$ with Strassen preorder $\leq$

$\zeta^{(S)}(t)$: gauge point

$\zeta^{\theta}(t)$: support functional

Samenvatting (Dutch summary, in translation)

Algebraic complexity, asymptotic spectra and entanglement polytopes

It is well known that the rank of a matrix is multiplicative under the Kronecker product, additive under the direct sum, normalised on identity matrices, and non-increasing under multiplication from the left and from the right by matrices. Matrix rank is in fact the only real parameter with these four properties. In 1986 Strassen initiated the study of the extension to tensors: find all maps from k-tensors to the real numbers that are multiplicative under the tensor Kronecker product, additive under the direct sum, normalised on "identity tensors", and non-increasing under applying linear maps to the k tensor factors. Strassen called the set of these maps the "asymptotic spectrum of k-tensors". He proved: if we understand the asymptotic spectrum, then we understand the asymptotic relations among tensors, including the asymptotic subrank and the asymptotic rank. In particular, if we know the asymptotic spectrum, then we know the arithmetic complexity of matrix multiplication, a central problem in algebraic complexity theory.

One of the main results of this dissertation is the first explicit construction of an infinite family of elements in the asymptotic spectrum of complex k-tensors, called the quantum functionals. Our construction is based on information theory and moment polytopes, also called entanglement polytopes. In addition we study, among other things, the relation between the recently introduced slice rank and the quantum functionals, and we prove that the "asymptotic" slice rank equals the minimum over the quantum functionals. Besides studying the tensor parameters mentioned above, we give an extension of the Coppersmith–Winograd method (for obtaining lower bounds on the asymptotic combinatorial subrank) to higher-order tensors, that is, tensors of order at least 4. We apply this extension to obtain new upper bounds on the asymptotic tensor rank of complete graph tensors via the laser method. (Joint work with Christandl and Vrana; QIP 2018, STOC 2018.)

As a new application of the abstract theory of asymptotic spectra, we introduce the asymptotic spectrum of graphs in graph theory. Analogous to the situation for tensors, if we understand the asymptotic spectrum of graphs, then we understand the Shannon capacity, a graph parameter that characterises the zero-error communication complexity of communication channels. In other words, we prove a new duality theorem for the Shannon capacity. Examples of elements in the asymptotic spectrum of graphs are the Lovász theta number and the fractional Haemers bounds.

Finally, we study an algebraic model of computation called algebraic branching programs. An algebraic branching program (abp) is the trace of a product of matrices with polynomials of degree at most 1 as entries. The maximum size of the matrices is called the width of the abp. In 1992 Ben-Or and Cleve proved that every polynomial can be computed by a width-3 abp with a number of matrices that is polynomial in the formula size of the polynomial. In contrast, Allender and Wang proved in 2011 that some polynomials cannot be computed by width-2 abps. We prove that every polynomial can be approximated by a width-2 abp with a number of matrices that is polynomial in the formula size of the polynomial, where approximation is meant in the sense of degeneration. (Joint work with Ikenmeyer and Bringmann; CCC 2017, JACM 2018.)

Summary

Algebraic complexity asymptotic spectra andentanglement polytopes

Matrix rank is well-known to be multiplicative under the Kronecker productadditive under the direct sum normalised on identity matrices and non-increasingunder multiplying from the left and from the right by any matrices In fact matrixrank is the only real matrix parameter with these four properties In 1986 Strassenproposed to study the extension to tensors find all maps from k-tensors to thereals that are multiplicative under the tensor Kronecker product additive underthe direct sum normalised on ldquoidentity tensorsrdquo and non-increasing under actingwith linear maps on the k tensor factors Strassen called the collection of thesemaps the ldquoasymptotic spectrum of k-tensorsrdquo He proved that understandingthe asymptotic spectrum implies understanding the asymptotic relations amongtensors including the asymptotic subrank and the asymptotic rank In particularknowing the asymptotic spectrum means knowing the arithmetic complexity ofmatrix multiplication a central problem in algebraic complexity theory

One of the main results in this dissertation is the first explicit construction of an infinite family of elements in the asymptotic spectrum of complex k-tensors, called the quantum functionals. Our construction is based on information theory and moment polytopes, i.e. entanglement polytopes. Moreover, among other things, we study the relation of the recently introduced slice rank to the quantum functionals and find that "asymptotic" slice rank equals the minimum over the quantum functionals. Besides studying the above tensor parameters, we extend the Coppersmith–Winograd method (for obtaining asymptotic combinatorial subrank lower bounds) to higher-order tensors, i.e. order at least 4. We apply this generalisation to obtain new upper bounds on the asymptotic tensor rank of complete graph tensors via the laser method. (Joint work with Christandl and Vrana; QIP 2018, STOC 2018.)

In graph theory, as a new instantiation of the abstract theory of asymptotic spectra, we introduce the asymptotic spectrum of graphs. Analogous to the situation for tensors, understanding the asymptotic spectrum of graphs means understanding the Shannon capacity, a graph parameter capturing the zero-error communication complexity of communication channels. In different words, we prove a new duality theorem for Shannon capacity. Some known elements in the asymptotic spectrum of graphs are the Lovász theta number and the fractional Haemers bounds.

Finally, we study an algebraic model of computation called algebraic branching programs. An algebraic branching program (abp) is the trace of a product of matrices with affine linear forms as matrix entries. The maximum size of the matrices is called the width of the abp. In 1992, Ben-Or and Cleve proved that width-3 abps can compute any polynomial efficiently in the formula size. On the other hand, in 2011 Allender and Wang proved that some polynomials cannot be computed by any width-2 abp. We prove that any polynomial can be efficiently approximated by a width-2 abp, where approximation is defined in the sense of degeneration. (Joint work with Ikenmeyer and Bringmann; CCC 2017, JACM 2018.)

Titles in the ILLC Dissertation Series

ILLC DS-2009-01: Jakub Szymanik
Quantifiers in TIME and SPACE: Computational Complexity of Generalized Quantifiers in Natural Language

ILLC DS-2009-02: Hartmut Fitz
Neural Syntax

ILLC DS-2009-03: Brian Thomas Semmes
A Game for the Borel Functions

ILLC DS-2009-04: Sara L. Uckelman
Modalities in Medieval Logic

ILLC DS-2009-05: Andreas Witzel
Knowledge and Games: Theory and Implementation

ILLC DS-2009-06: Chantal Bax
Subjectivity after Wittgenstein: Wittgenstein's embodied and embedded subject and the debate about the death of man

ILLC DS-2009-07: Kata Balogh
Theme with Variations: A Context-based Analysis of Focus

ILLC DS-2009-08: Tomohiro Hoshi
Epistemic Dynamics and Protocol Information

ILLC DS-2009-09: Olivia Ladinig
Temporal expectations and their violations

ILLC DS-2009-10: Tikitu de Jager
"Now that you mention it, I wonder": Awareness, Attention, Assumption

ILLC DS-2009-11: Michael Franke
Signal to Act: Game Theory in Pragmatics

ILLC DS-2009-12: Joel Uckelman
More Than the Sum of Its Parts: Compact Preference Representation Over Combinatorial Domains

ILLC DS-2009-13: Stefan Bold
Cardinals as Ultrapowers: A Canonical Measure Analysis under the Axiom of Determinacy

ILLC DS-2010-01: Reut Tsarfaty
Relational-Realizational Parsing

ILLC DS-2010-02: Jonathan Zvesper
Playing with Information

ILLC DS-2010-03: Cédric Dégremont
The Temporal Mind: Observations on the logic of belief change in interactive systems

ILLC DS-2010-04: Daisuke Ikegami
Games in Set Theory and Logic

ILLC DS-2010-05: Jarmo Kontinen
Coherence and Complexity in Fragments of Dependence Logic

ILLC DS-2010-06: Yanjing Wang
Epistemic Modelling and Protocol Dynamics

ILLC DS-2010-07: Marc Staudacher
Use theories of meaning between conventions and social norms

ILLC DS-2010-08: Amélie Gheerbrant
Fixed-Point Logics on Trees

ILLC DS-2010-09: Gaëlle Fontaine
Modal Fixpoint Logic: Some Model Theoretic Questions

ILLC DS-2010-10: Jacob Vosmaer
Logic, Algebra and Topology: Investigations into canonical extensions, duality theory and point-free topology

ILLC DS-2010-11: Nina Gierasimczuk
Knowing One's Limits: Logical Analysis of Inductive Inference

ILLC DS-2010-12: Martin Mose Bentzen
Stit, Iit, and Deontic Logic for Action Types

ILLC DS-2011-01: Wouter M. Koolen
Combining Strategies Efficiently: High-Quality Decisions from Conflicting Advice

ILLC DS-2011-02: Fernando Raymundo Velázquez-Quesada
Small steps in dynamics of information

ILLC DS-2011-03: Marijn Koolen
The Meaning of Structure: the Value of Link Evidence for Information Retrieval

ILLC DS-2011-04: Junte Zhang
System Evaluation of Archival Description and Access

ILLC DS-2011-05: Lauri Keskinen
Characterizing All Models in Infinite Cardinalities

ILLC DS-2011-06: Rianne Kaptein
Effective Focused Retrieval by Exploiting Query Context and Document Structure

ILLC DS-2011-07: Jop Briët
Grothendieck Inequalities, Nonlocal Games and Optimization

ILLC DS-2011-08: Stefan Minica
Dynamic Logic of Questions

ILLC DS-2011-09: Raul Andres Leal
Modalities Through the Looking Glass: A study on coalgebraic modal logic and their applications

ILLC DS-2011-10: Lena Kurzen
Complexity in Interaction

ILLC DS-2011-11: Gideon Borensztajn
The neural basis of structure in language

ILLC DS-2012-01: Federico Sangati
Decomposing and Regenerating Syntactic Trees

ILLC DS-2012-02: Markos Mylonakis
Learning the Latent Structure of Translation

ILLC DS-2012-03: Edgar José Andrade Lotero
Models of Language: Towards a practice-based account of information in natural language

ILLC DS-2012-04: Yurii Khomskii
Regularity Properties and Definability in the Real Number Continuum: idealized forcing, polarized partitions, Hausdorff gaps and mad families in the projective hierarchy

ILLC DS-2012-05: David García Soriano
Query-Efficient Computation in Property Testing and Learning Theory

ILLC DS-2012-06: Dimitris Gakis
Contextual Metaphilosophy - The Case of Wittgenstein

ILLC DS-2012-07: Pietro Galliani
The Dynamics of Imperfect Information

ILLC DS-2012-08: Umberto Grandi
Binary Aggregation with Integrity Constraints

ILLC DS-2012-09: Wesley Halcrow Holliday
Knowing What Follows: Epistemic Closure and Epistemic Logic

ILLC DS-2012-10: Jeremy Meyers
Locations, Bodies, and Sets: A model theoretic investigation into nominalistic mereologies

ILLC DS-2012-11: Floor Sietsma
Logics of Communication and Knowledge

ILLC DS-2012-12: Joris Dormans
Engineering emergence: applied theory for game design

ILLC DS-2013-01: Simon Pauw
Size Matters: Grounding Quantifiers in Spatial Perception

ILLC DS-2013-02: Virginie Fiutek
Playing with Knowledge and Belief

ILLC DS-2013-03: Giannicola Scarpa
Quantum entanglement in non-local games, graph parameters and zero-error information theory

ILLC DS-2014-01: Machiel Keestra
Sculpting the Space of Actions: Explaining Human Action by Integrating Intentions and Mechanisms

ILLC DS-2014-02: Thomas Icard
The Algorithmic Mind: A Study of Inference in Action

ILLC DS-2014-03: Harald A. Bastiaanse
Very, Many, Small, Penguins

ILLC DS-2014-04: Ben Rodenhäuser
A Matter of Trust: Dynamic Attitudes in Epistemic Logic

ILLC DS-2015-01: María Inés Crespo
Affecting Meaning: Subjectivity and evaluativity in gradable adjectives

ILLC DS-2015-02: Mathias Winther Madsen
The Kid, the Clerk, and the Gambler - Critical Studies in Statistics and Cognitive Science

ILLC DS-2015-03: Shengyang Zhong
Orthogonality and Quantum Geometry: Towards a Relational Reconstruction of Quantum Theory

ILLC DS-2015-04: Sumit Sourabh
Correspondence and Canonicity in Non-Classical Logic

ILLC DS-2015-05: Facundo Carreiro
Fragments of Fixpoint Logics: Automata and Expressiveness

ILLC DS-2016-01: Ivano A. Ciardelli
Questions in Logic

ILLC DS-2016-02: Zoé Christoff
Dynamic Logics of Networks: Information Flow and the Spread of Opinion

ILLC DS-2016-03: Fleur Leonie Bouwer
What do we need to hear a beat? The influence of attention, musical abilities, and accents on the perception of metrical rhythm

ILLC DS-2016-04: Johannes Marti
Interpreting Linguistic Behavior with Possible World Models

ILLC DS-2016-05: Phong Le
Learning Vector Representations for Sentences - The Recursive Deep Learning Approach

ILLC DS-2016-06: Gideon Maillette de Buy Wenniger
Aligning the Foundations of Hierarchical Statistical Machine Translation

ILLC DS-2016-07: Andreas van Cranenburgh
Rich Statistical Parsing and Literary Language

ILLC DS-2016-08: Florian Speelman
Position-based Quantum Cryptography and Catalytic Computation

ILLC DS-2016-09: Teresa Piovesan
Quantum entanglement: insights via graph parameters and conic optimization

ILLC DS-2016-10: Paula Henk
Nonstandard Provability for Peano Arithmetic: A Modal Perspective

ILLC DS-2017-01: Paolo Galeazzi
Play Without Regret

ILLC DS-2017-02: Riccardo Pinosio
The Logic of Kant's Temporal Continuum

ILLC DS-2017-03: Matthijs Westera
Exhaustivity and intonation: a unified theory

ILLC DS-2017-04: Giovanni Cinà
Categories for the working modal logician

ILLC DS-2017-05: Shane Noah Steinert-Threlkeld
Communication and Computation: New Questions About Compositionality

ILLC DS-2017-06: Peter Hawke
The Problem of Epistemic Relevance

ILLC DS-2017-07: Aybüke Özgün
Evidence in Epistemic Logic: A Topological Perspective

ILLC DS-2017-08: Raquel Garrido Alhama
Computational Modelling of Artificial Language Learning: Retention, Recognition & Recurrence

ILLC DS-2017-09: Miloš Stanojević
Permutation Forests for Modeling Word Order in Machine Translation

ILLC DS-2018-01: Berit Janssen
Retained or Lost in Transmission? Analyzing and Predicting Stability in Dutch Folk Songs

ILLC DS-2018-02: Hugo Huurdeman
Supporting the Complex Dynamics of the Information Seeking Process

ILLC DS-2018-03: Corina Koolen
Reading beyond the female: The relationship between perception of author gender and literary quality

ILLC DS-2018-04: Jelle Bruineberg
Anticipating Affordances: Intentionality in self-organizing brain-body-environment systems

ILLC DS-2018-05: Joachim Daiber
Typologically Robust Statistical Machine Translation: Understanding and Exploiting Differences and Similarities Between Languages in Machine Translation

ILLC DS-2018-06: Thomas Brochhagen
Signaling under Uncertainty

ILLC DS-2018-07: Julian Schlöder
Assertion and Rejection

ILLC DS-2018-08: Srinivasan Arunachalam
Quantum Algorithms and Learning Theory

ILLC DS-2018-09: Hugo de Holanda Cunha Nobrega
Games for functions: Baire classes, Weihrauch degrees, transfinite computations, and ranks

ILLC DS-2018-10: Chenwei Shi
Reason to Believe

ILLC DS-2018-11: Malvin Gattinger
New Directions in Model Checking Dynamic Epistemic Logic

ILLC DS-2018-12: Julia Ilin
Filtration Revisited: Lattices of Stable Non-Classical Logics

ILLC Dissertation Series DS-2018-13

For further information about ILLC-publications, please contact

Institute for Logic, Language and Computation
Universiteit van Amsterdam
Science Park 107
1098 XG Amsterdam
phone: +31-20-525 6051
e-mail: illc@uva.nl
homepage: http://www.illc.uva.nl/

The investigations were supported by the Netherlands Organization for Scientific Research NWO (617.023.116), the European Commission and the QuSoft Research Center for Quantum Software.

Copyright © 2018 by Jeroen Zuiddam

ISBN 978-94-028-1175-9

Algebraic complexity, asymptotic spectra and entanglement polytopes

Academic dissertation

submitted to obtain the degree of doctor at the Universiteit van Amsterdam, on the authority of the Rector Magnificus, prof. dr. ir. K.I.J. Maex, to be defended in public before a committee appointed by the College voor Promoties, in the Agnietenkapel on Tuesday 23 October 2018 at 12:00

by

Jeroen Zuiddam

born in Leiderdorp

Doctoral committee

Supervisors:
prof. dr. H.M. Buhrman, Universiteit van Amsterdam
prof. dr. M. Christandl, Københavns Universitet

Other members:
prof. dr. M. Laurent, Tilburg University
prof. dr. E.M. Opdam, Universiteit van Amsterdam
prof. dr. R.M. de Wolf, Universiteit van Amsterdam
dr. J. Briët, CWI Amsterdam
dr. M. Walter, Universiteit van Amsterdam

Faculteit der Natuurwetenschappen, Wiskunde en Informatica

Contents

Acknowledgements

1 Introduction
  1.1 Matrix multiplication
  1.2 The asymptotic spectrum of tensors
  1.3 Higher-order CW method
  1.4 Abstract asymptotic spectra
  1.5 The asymptotic spectrum of graphs
  1.6 Tensor degeneration
  1.7 Combinatorial degeneration
  1.8 Algebraic branching program degeneration
  1.9 Organisation

2 The theory of asymptotic spectra
  2.1 Introduction
  2.2 Semirings and preorders
  2.3 Strassen preorders
  2.4 Asymptotic preorders $\lesssim$
  2.5 Maximal Strassen preorders
  2.6 The asymptotic spectrum $X(S, \leq)$
  2.7 The representation theorem
  2.8 Abstract rank and subrank $R$, $Q$
  2.9 Topological aspects
  2.10 Uniqueness
  2.11 Subsemirings
  2.12 Subsemirings generated by one element
  2.13 Universal spectral points
  2.14 Conclusion

3 The asymptotic spectrum of graphs; Shannon capacity
  3.1 Introduction
  3.2 The asymptotic spectrum of graphs
    3.2.1 The semiring of graph isomorphism classes $\mathcal{G}$
    3.2.2 Strassen preorder via graph homomorphisms
    3.2.3 The asymptotic spectrum of graphs $X(\mathcal{G})$
    3.2.4 Shannon capacity $\Theta$
  3.3 Universal spectral points
    3.3.1 Lovász theta number $\vartheta$
    3.3.2 Fractional graph parameters
  3.4 Conclusion

4 The asymptotic spectrum of tensors; matrix multiplication
  4.1 Introduction
  4.2 The asymptotic spectrum of tensors
    4.2.1 The semiring of tensor equivalence classes $\mathcal{T}$
    4.2.2 Strassen preorder via restriction
    4.2.3 The asymptotic spectrum of tensors $X(\mathcal{T})$
    4.2.4 Asymptotic rank and asymptotic subrank
  4.3 Gauge points $\zeta_{(i)}$
  4.4 Support functionals $\zeta^{\theta}$
  4.5 Upper and lower support functionals $\zeta^{\theta}$, $\zeta_{\theta}$
  4.6 Asymptotic slice rank
  4.7 Conclusion

5 Tight tensors and combinatorial subrank; cap sets
  5.1 Introduction
  5.2 Higher-order Coppersmith–Winograd method
    5.2.1 Construction
    5.2.2 Computational remarks
    5.2.3 Examples: type sets
  5.3 Combinatorial degeneration method
  5.4 Cap sets
    5.4.1 Reduced polynomial multiplication
    5.4.2 Cap sets
  5.5 Graph tensors
  5.6 Conclusion

6 Universal points in the asymptotic spectrum of tensors; entanglement polytopes, moment polytopes
  6.1 Introduction
  6.2 Schur–Weyl duality
  6.3 Kronecker and Littlewood–Richardson coefficients $g_{\lambda\mu\nu}$, $c^{\lambda}_{\mu\nu}$
  6.4 Entropy inequalities
  6.5 Hilbert spaces and density operators
  6.6 Moment polytopes $P(t)$
    6.6.1 General setting
    6.6.2 Tensor spaces
  6.7 Quantum functionals $F^{\theta}(t)$
  6.8 Outer approximation
  6.9 Inner approximation for free tensors
  6.10 Quantum functionals versus support functionals
  6.11 Asymptotic slice rank
  6.12 Conclusion

7 Algebraic branching programs; approximation and nondeterminism
  7.1 Introduction
  7.2 Definitions and basic results
    7.2.1 Computational models
    7.2.2 Complexity classes $\mathrm{VP}$, $\mathrm{VP}_e$, $\mathrm{VP}_k$
    7.2.3 The theorem of Ben-Or and Cleve
    7.2.4 Approximation closure $\overline{\mathcal{C}}$
    7.2.5 Nondeterminism closure $\mathrm{N}(\mathcal{C})$
  7.3 Approximation closure of $\mathrm{VP}_2$
  7.4 Nondeterminism closure of $\mathrm{VP}_1$
  7.5 Conclusion

Bibliography

Glossary

Samenvatting

Summary

Acknowledgements

First of all, I thank all my coauthors for very fruitful collaboration: Harry Buhrman, Matthias Christandl, Peter Vrana, Jop Briët, Chris Perry, Asger Jensen, Markus Bläser, Christian Ikenmeyer and Karl Bringmann.

Chris Zaal, Leen Torenvliet and Robert Belleman I thank for all their efforts to set up for me the "double bachelor programme" in Mathematics and Computer Science at the University of Amsterdam (UvA) in 2009. This programme, as well as the "webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch national research institute for mathematics and computer science (CWI), made me decide to come to Amsterdam. My enjoyable master thesis project in mathematics with Eric Opdam made me follow the academic path, for which I thank Eric.

Of course, most importantly, I thank my PhD supervisor Harry Buhrman, for introducing me to research as a bachelor student, for absorbing me into the Algorithms and Complexity group at CWI, for having enough faith in me to hire me as his PhD student in 2014, and for his general guidance throughout. I feel very lucky for the opportunities and scientific freedom that this has brought me.

Matthias Christandl has been my closest collaborator and mentor since we met in Berkeley in 2014. In practice this meant countless nights of fun Skype sessions between Amsterdam and Copenhagen, countless enjoyable visits to the University of Copenhagen, and countless kitchen table sessions at the Hallinsgade. Thanks Matthias for the energy, inspiration and optimism. And thanks Matthias and Henriette for the hospitality.

Jop Briët I thank for his general guidance and for lots of inspiration. The polynomial method reading group, which he mainly organised, inspired part of my paper with Matthias Christandl and Peter Vrana on universal points in the asymptotic spectrum of tensors. (This reading group also resulted in Dion Gijswijt's paper on cap sets.) My paper with Jop on round elimination later inspired me to write the paper on the asymptotic spectrum of graphs.

Christian Ikenmeyer I thank for numerous inspiring discussions on algebraic complexity theory and tensors, which greatly influenced my papers on tensor rank and our joint paper with Karl Bringmann on algebraic branching programs.

Peter Vrana I thank for our many enjoyable research collaborations, the results of which form a central part of this dissertation, for his clever insights, and for finding several mathematical mistakes while reading the draft of this dissertation.

Ronald de Wolf I thank for his general advice throughout my PhD and for many suggestions regarding the current version of this dissertation, which will be incorporated in the next version (but not in the printed version because of the regulations of the University of Amsterdam).

Jop Briët, Monique Laurent, Lex Schrijver, Peter Vrana, Matthias Christandl, Māris Ozols, Michael Walter and Bart Sevenster I thank for helpful discussions regarding the results in Chapter 2 and Chapter 3 of this dissertation.

Srinivasan Arunachalam I thank for sharing the ups and downs during our four years as PhD students at CWI. Florian Speelman, Farrokh Labib, Sven Polak, Bart Litjens and Bart Sevenster I thank for numerous valuable research discussions.

Bikkie Aldeias and Rob van Rooijen I thank for their excellent library services.

Martijn Zuiddam and Māris Ozols I thank for proofreading the draft of this dissertation.

Finally, I thank my parents, my brothers and my friends for their support.

Amsterdam, August 31, 2018
Jeroen Zuiddam

Publications

[BCPZ16] Harry Buhrman, Matthias Christandl, Christopher Perry and Jeroen Zuiddam. Clean quantum and classical communication protocols.
Physical Review Letters 117:230503, 2016.
https://link.aps.org/doi/10.1103/PhysRevLett.117.230503
http://arxiv.org/abs/1605.07948

[BCZ17a] Markus Bläser, Matthias Christandl and Jeroen Zuiddam. The border support rank of two-by-two matrix multiplication is seven.
Manuscript, 2017.
https://arxiv.org/abs/1705.09652

[BCZ17b] Harry Buhrman, Matthias Christandl and Jeroen Zuiddam. Nondeterministic Quantum Communication Complexity: the Cyclic Equality Game and Iterated Matrix Multiplication.
In Christos H. Papadimitriou, editor, 8th Innovations in Theoretical Computer Science Conference (ITCS), 2017.
http://drops.dagstuhl.de/opus/volltexte/2017/8181

[BIZ17] Karl Bringmann, Christian Ikenmeyer and Jeroen Zuiddam. On algebraic branching programs of small width.
In Ryan O'Donnell, editor, 32nd Computational Complexity Conference (CCC), 2017.
https://doi.org/10.4230/LIPIcs.CCC.2017.20
https://arxiv.org/abs/1702.05328
Journal of the ACM, Vol. 65, No. 5, Article 32, 2018.
https://doi.org/10.1145/3209663

[BZ17] Jop Briët and Jeroen Zuiddam. On the orthogonal rank of Cayley graphs and impossibility of quantum round elimination.
Quantum Information and Computation, 2017.
http://www.rintonpress.com/xxqic17/qic-17-12/0106-0116.pdf
https://arxiv.org/abs/1608.06113

[CJZ18] Matthias Christandl, Asger Kjærulff Jensen and Jeroen Zuiddam. Tensor rank is not multiplicative under the tensor product.
Linear Algebra and its Applications 543:125–139, 2018.
https://doi.org/10.1016/j.laa.2017.12.020
https://arxiv.org/abs/1705.09379

[CVZ16] Matthias Christandl, Peter Vrana and Jeroen Zuiddam. Asymptotic tensor rank of graph tensors: beyond matrix multiplication.
Manuscript, 2016.
https://arxiv.org/abs/1609.07476

[CVZ18] Matthias Christandl, Peter Vrana and Jeroen Zuiddam. Universal Points in the Asymptotic Spectrum of Tensors: Extended Abstract.
In Proceedings of 50th Annual ACM SIGACT Symposium on the Theory of Computing (STOC), 2018.
https://doi.org/10.1145/3188745.3188766
https://arxiv.org/abs/1709.07851

[CZ18] Matthias Christandl and Jeroen Zuiddam. Tensor surgery and tensor rank.
(Journal of) computational complexity, 2018.
https://doi.org/10.1007/s00037-018-0164-8
https://arxiv.org/abs/1606.04085

[Zui17] Jeroen Zuiddam. A note on the gap between rank and border rank.
Linear Algebra and its Applications 525:33–44, 2017.
https://doi.org/10.1016/j.laa.2017.03.015
http://arxiv.org/abs/1504.05597

[Zui18] Jeroen Zuiddam. The asymptotic spectrum of graphs and the Shannon capacity.
Manuscript, 2018.
http://arxiv.org/abs/1807.00169

This dissertation is based on the above papers, with primary focus on the four highlighted papers.

Note on the relative contribution of the co-authors: for each paper, the contribution of the co-authors is approximately equal.

Chapter 1

Introduction

Volker Strassen published in 1969 his famous algorithm for multiplying any two $n \times n$ matrices using only $O(n^{2.81})$ rather than $O(n^3)$ arithmetical operations [Str69]. His discovery marked the beginning of a still ongoing line of research in the field of algebraic complexity theory, a line of research that by now touches several fields of mathematics, including algebraic geometry, representation theory, (quantum) information theory and combinatorics. This dissertation is inspired by and contributes to this line of research.

No further progress followed for almost 10 years after Strassen's discovery, despite the fact that "many scientists understood that discovery as a signal to attack the problem and to push the exponent further down" [Pan84]. Then in 1978 Pan improved the exponent from 2.81 to 2.79 [Pan78, Pan80]. One year later, Bini, Capovani, Lotti and Romani improved the exponent to 2.78 by constructing fast "approximative" algorithms for matrix multiplication and making these algorithms exact via the method of interpolation [BCRL79, Bin80]. Cast in the language of tensors, the result of Bini et al. corresponds to what we now call a "border rank" upper bound. The idea of studying approximative complexity or border complexity for algebraic problems has nowadays become an important theme in algebraic complexity theory.

Schönhage then obtained the exponent 2.55 by constructing a fast algorithm for computing many "disjoint" small matrix multiplications and transforming this into an algorithm for one large matrix multiplication [Sch81]. The upper bound was improved shortly after by works of Pan [Pan81], Romani [Rom82] and Coppersmith and Winograd [CW82], resulting in the exponent 2.50. Then in 1987 Strassen published the laser method, with which he obtained the exponent 2.48 [Str87]. The laser method was used in the same year by Coppersmith and Winograd to obtain the exponent 2.38 [CW87]. To do this, they invented a method for constructing certain large combinatorial structures. This method, or actually the extended version that Strassen published in [Str91], we now call the Coppersmith–Winograd method. All further improvements on upper bounding the exponent essentially follow the framework of Coppersmith and Winograd, and the improvements do not affect the first two digits after the decimal point [CW90, Sto10, Wil12, LG14].

Define $\omega$ to be the optimal exponent in the complexity of matrix multiplication. We call $\omega$ the exponent of matrix multiplication. To summarise the above historical account on upper bounds: $\omega < 2.38$. On the other hand, the only lower bound we currently have is the trivial lower bound $2 \leq \omega$.

The history of upper bounds on the matrix multiplication exponent $\omega$, which began with Strassen's algorithm and ended with the Strassen laser method and Coppersmith–Winograd method, is well-known and well-documented, see e.g. [BCS97, Section 15.13]. However, there is remarkable work of Strassen on a theory of lower bounds for $\omega$ and similar types of exponents, and this work has received almost no attention. This theory of lower bounds is the theory of asymptotic spectra of tensors and is the topic of a series of papers by Strassen [Str86, Str87, Str88, Str91, Str05].

In the foregoing, the word tensor has popped up twice (namely when we mentioned border rank, and just now when we mentioned asymptotic spectra of tensors), but we have not discussed at all why tensors should be relevant for understanding the complexity of matrix multiplication. First we give a mini course on tensors. A $k$-tensor $t = (t_{i_1 \cdots i_k})_{i_1, \ldots, i_k}$ is a $k$-dimensional array of numbers from some field, say the complex numbers $\mathbb{C}$. Thus a 2-tensor is simply a matrix. A $k$-tensor is called simple if there exist $k$ vectors $v_1, \ldots, v_k$ such that the entries of $t$ are given by the products $t_{i_1 \cdots i_k} = (v_1)_{i_1} \cdots (v_k)_{i_k}$ for all indices $i_j$. The tensor rank of $t$ is the smallest number $n$ such that $t$ can be written as a sum of $n$ simple tensors. Thus the tensor rank of a 2-tensor is simply its matrix rank. Returning to the problem of finding the complexity of matrix multiplication, there is a special 3-tensor, called the matrix multiplication tensor, that encodes the computational problem of multiplying two $2 \times 2$ matrices. This 3-tensor is commonly denoted by $\langle 2, 2, 2 \rangle$. It turns out that the matrix multiplication exponent $\omega$ is exactly the asymptotic rate of growth of the tensor rank of the "Kronecker powers" of the tensor $\langle 2, 2, 2 \rangle$. This important observation follows from the fundamental fact that the computational problem of multiplying matrices is "self-reducible". Namely, we can multiply two matrices by viewing them as block matrices and then perform matrix multiplication at the level of the blocks.
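To make the matrix multiplication tensor concrete, here is a minimal Python sketch. It assumes one common indexing convention, in which the entry of $\langle n, n, n \rangle$ at position $((i,j),(j,k),(i,k))$ equals 1 and all other entries are 0; other conventions differ by a relabelling of indices. The helper names `matmul_tensor`, `apply_tensor` and `kronecker_square` are hypothetical and chosen only for this illustration. Each nonzero entry is a simple tensor (a product of standard basis vectors), so counting the nonzero entries shows that the tensor rank of $\langle 2, 2, 2 \rangle$ is at most 8. The last function illustrates self-reducibility: the Kronecker square of $\langle 2, 2, 2 \rangle$, with index pairs merged blockwise, is $\langle 4, 4, 4 \rangle$.

```python
from itertools import product


def matmul_tensor(n):
    """Nonzero positions of <n,n,n>, stored as triples ((i,j),(j,k),(i,k)).

    Assumed convention: every listed position holds the entry 1.
    """
    return {((i, j), (j, k), (i, k)) for i, j, k in product(range(n), repeat=3)}


def apply_tensor(t, A, B, n):
    """Contract the tensor with the entries of A and B; the result is A times B."""
    C = [[0] * n for _ in range(n)]
    for (i, j), (j2, k), (i2, k2) in t:
        C[i2][k2] += A[i][j] * B[j2][k]
    return C


def kronecker_square(t, n):
    """Kronecker product of t with itself, merging index pairs blockwise."""
    def merge(outer, inner):
        # outer = (row, col) of a block, inner = (row, col) inside the block
        return (outer[0] * n + inner[0], outer[1] * n + inner[1])

    return {
        (merge(a1, a2), merge(b1, b2), merge(c1, c2))
        for (a1, b1, c1) in t
        for (a2, b2, c2) in t
    }


t2 = matmul_tensor(2)
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
product_AB = apply_tensor(t2, A, B, 2)   # the ordinary matrix product of A and B
rank_upper_bound = len(t2)               # 8 simple tensors suffice
is_self_reducible = kronecker_square(t2, 2) == matmul_tensor(4)
```

The sparse set-of-positions representation is chosen only to keep the sketch dependency-free; a dense array would work equally well.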

We wrap up this introductory story. To understand the computational complexity of matrix multiplication, one should understand the asymptotic rate of growth of the tensor rank of a certain family of tensors, a family that is obtained by taking powers of a fixed tensor. The theory of asymptotic spectra is the theory of bounds on such asymptotic parameters of tensors.

The main story line of this dissertation concerns the theory of asymptotic spectra. In Section 1.1 of this introduction we discuss in more detail the computational problem of multiplying matrices. In Section 1.2 we discuss the asymptotic spectrum of tensors and discuss a new result: an explicit description of infinitely many elements in the asymptotic spectrum of tensors. In Section 1.3 we consider a new higher-order Coppersmith–Winograd method.

The theory of asymptotic spectra of tensors is a special case of an abstract theory of asymptotic spectra of preordered semirings, which we discuss in Section 1.4. In Section 1.5 we apply this abstract theory to a new setting, namely to graphs. By doing this, we obtain a new dual characterisation of the Shannon capacity of graphs.

The second story line of this dissertation is about degeneration, an algebraic kind of approximation related to the concept of border rank of Bini et al. We discuss degeneration in the context of tensors in Section 1.6. There is a combinatorial version of tensor degeneration which we call combinatorial degeneration. We discuss a new result regarding combinatorial degeneration in Section 1.7. Finally, Section 1.8 is about a new result concerning degeneration for algebraic branching programs, an algebraic model of computation.

We finish in Section 1.9 with a discussion of the organisation of this dissertation into chapters.

1.1 Matrix multiplication

In this section we discuss in more detail the computational problem of multiplying two matrices.

Algebraic complexity theory studies algebraic algorithms for algebraic problems. Roughly speaking, algebraic algorithms are algorithms that use only the basic arithmetical operations $+$ and $\times$ over some field, say $\mathbb{R}$ or $\mathbb{C}$. A fundamental example of an algebraic problem is matrix multiplication.

If we multiply two $n \times n$ matrices by computing the inner products between any row of the first matrix and any column of the second matrix, one by one, we need roughly $2 \cdot n^3$ arithmetical operations ($+$ and $\times$). For example, we can multiply two $2 \times 2$ matrices with 12 arithmetical operations, namely 8 multiplications and 4 additions:

$$\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} = \begin{pmatrix} a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} \\ a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22} \end{pmatrix}.$$

Since matrix multiplication is a basic operation in linear algebra, it is worthwhile to see if we can do better than $2 \cdot n^3$. In 1969, Strassen [Str69] published a better algorithm. The base routine of Strassen’s algorithm is an algorithm for multiplying two $2 \times 2$ matrices with 7 multiplications, 18 additions and certain sign changes:

$$\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} = \begin{pmatrix} x_1 + x_4 - x_5 + x_7 & x_3 + x_5 \\ x_2 + x_4 & x_1 + x_3 - x_2 + x_6 \end{pmatrix}$$

with

$$\begin{aligned}
x_1 &= (a_{11} + a_{22})(b_{11} + b_{22}) \\
x_2 &= (a_{21} + a_{22})\, b_{11} \\
x_3 &= a_{11}(b_{12} - b_{22}) \\
x_4 &= a_{22}(-b_{11} + b_{21}) \\
x_5 &= (a_{11} + a_{12})\, b_{22} \\
x_6 &= (-a_{11} + a_{21})(b_{11} + b_{12}) \\
x_7 &= (a_{12} - a_{22})(b_{21} + b_{22}).
\end{aligned}$$
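The seven products above can be checked mechanically. The following is a minimal Python sketch, not part of the original text, that applies the base routine to $2 \times 2$ blocks (so the "variables" are themselves matrices, and no commutativity is used) and compares the result with ordinary multiplication. The helper names `mat_mul`, `assemble` and `check_strassen` are illustrative choices.

```python
import random

def mat_mul(A, B):
    """Row-times-column product of two square matrices (lists of lists)."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_sub(A, B):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def strassen_base(a11, a12, a21, a22, b11, b12, b21, b22):
    """Base routine: 7 block multiplications, factors always kept in a-then-b order."""
    x1 = mat_mul(mat_add(a11, a22), mat_add(b11, b22))
    x2 = mat_mul(mat_add(a21, a22), b11)
    x3 = mat_mul(a11, mat_sub(b12, b22))
    x4 = mat_mul(a22, mat_sub(b21, b11))
    x5 = mat_mul(mat_add(a11, a12), b22)
    x6 = mat_mul(mat_sub(a21, a11), mat_add(b11, b12))
    x7 = mat_mul(mat_sub(a12, a22), mat_add(b21, b22))
    c11 = mat_add(mat_sub(mat_add(x1, x4), x5), x7)
    c12 = mat_add(x3, x5)
    c21 = mat_add(x2, x4)
    c22 = mat_add(mat_add(mat_sub(x1, x2), x3), x6)
    return c11, c12, c21, c22

def assemble(c11, c12, c21, c22):
    """Glue four blocks into one matrix [[c11, c12], [c21, c22]]."""
    top = [r1 + r2 for r1, r2 in zip(c11, c12)]
    bottom = [r1 + r2 for r1, r2 in zip(c21, c22)]
    return top + bottom

def check_strassen(seed=0):
    """Compare the base routine on random 2x2 integer blocks with the naive product."""
    rng = random.Random(seed)
    rand = lambda: [[rng.randint(-9, 9) for _ in range(2)] for _ in range(2)]
    blocks_a = [rand() for _ in range(4)]
    blocks_b = [rand() for _ in range(4)]
    expected = mat_mul(assemble(*blocks_a), assemble(*blocks_b))
    return assemble(*strassen_base(*blocks_a, *blocks_b)) == expected
```

Because the blocks are multiplied as matrices, a successful check also illustrates why the routine can be applied recursively.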

The general routine of Strassen’s algorithm multiplies two $n \times n$ matrices by recursively dividing the matrices into four blocks and applying the base routine to multiply the blocks (this is the self-reducibility of matrix multiplication that we mentioned earlier). The base routine does not assume commutativity of the variables for correctness, so indeed we can take the variables to be matrices. After expanding the recurrence, we see that Strassen’s algorithm uses $4.7 \cdot n^{\log_2 7} \approx 4.7 \cdot n^{2.81}$ arithmetical operations. Over the years, Strassen’s algorithm was improved by many researchers. The best algorithm known today uses $C \cdot n^{2.38}$ arithmetical operations, where $C$ is some constant [CW90, Sto10, Wil12, LG14]. The exponent of matrix multiplication $\omega$ is the infimum over all real numbers $\beta$ such that for some constant $C_\beta$ we can multiply, for any $n \in \mathbb{N}$, any two $n \times n$ matrices with at most $C_\beta \cdot n^\beta$ arithmetical operations. From the above it follows that $\omega \leq 2.38$. From a simple flattening argument it follows that $2 \leq \omega$. We are left with the following well-known open problem: what is the value of the matrix multiplication exponent $\omega$?
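To make "expanding the recurrence" concrete, here is a small Python sketch that counts operations for $n = 2^k$ under one particular assumption: the recursion is carried all the way down to $1 \times 1$ blocks. Under that assumption the count satisfies $T(2^k) = 7\,T(2^{k-1}) + 18 \cdot 4^{k-1}$ with $T(1) = 1$, which solves to $7 \cdot 7^k - 6 \cdot 4^k$. The leading constant depends on how the recursion is terminated and on how sizes that are not powers of two are handled, so this sketch is not expected to reproduce the constant 4.7 quoted above. The function names are illustrative.

```python
def strassen_ops(k):
    """Operations to multiply two 2^k x 2^k matrices, recursing down to 1 x 1
    blocks (an assumption of this sketch)."""
    if k == 0:
        return 1  # a single scalar multiplication
    half = 2 ** (k - 1)
    # 7 recursive block products plus 18 additions of half x half blocks
    return 7 * strassen_ops(k - 1) + 18 * half * half

def naive_ops(k):
    """Row-times-column rule: n^2 entries, each using n multiplications
    and n - 1 additions, for n = 2^k."""
    n = 2 ** k
    return n * n * (2 * n - 1)
```

For $k = 1$ the naive count is 12, matching the 8 multiplications and 4 additions mentioned earlier, while the recursive count is $7 + 18 = 25$; the recursive scheme only pays off for larger $n$.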

The constant $C$ for the currently best algorithm is impractically large (for a discussion of this issue see e.g. [Pan18]). For a practical fast algorithm one should either improve $C$ or find a balance between $C$ and the exponent of $n$. We will ignore the size of $C$ in this dissertation and focus on the exponent $\omega$. For an overview of the field of algebraic complexity theory the reader should consult [BCS97] and [Sap16].

1.2 The asymptotic spectrum of tensors

We now discuss the theory of asymptotic spectra for tensors.

Let $s$ and $t$ be $k$-tensors over a field $\mathbb{F}$, $s \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$, $t \in \mathbb{F}^{m_1} \otimes \cdots \otimes \mathbb{F}^{m_k}$. We say $s$ restricts to $t$, and write $s \geq t$, if there are linear maps $A_i : \mathbb{F}^{n_i} \to \mathbb{F}^{m_i}$ such that $(A_1 \otimes \cdots \otimes A_k)(s) = t$. Let $[n] = \{1, \ldots, n\}$ for $n \in \mathbb{N}$. We define the product $s \otimes t \in \mathbb{F}^{n_1 m_1} \otimes \cdots \otimes \mathbb{F}^{n_k m_k}$ by $(s \otimes t)_{(i_1, j_1), \ldots, (i_k, j_k)} = s_{i_1 \cdots i_k}\, t_{j_1 \cdots j_k}$ for $i \in [n_1] \times \cdots \times [n_k]$ and $j \in [m_1] \times \cdots \times [m_k]$. This product generalizes the well-known Kronecker product of matrices. We refer to this product as the tensor (Kronecker) product. We define the direct sum $s \oplus t \in \mathbb{F}^{n_1 + m_1} \otimes \cdots \otimes \mathbb{F}^{n_k + m_k}$ by $(s \oplus t)_{\ell_1 \cdots \ell_k} = s_{\ell_1 \cdots \ell_k}$ if $\ell \in [n_1] \times \cdots \times [n_k]$, $(s \oplus t)_{n_1 + \ell_1, \ldots, n_k + \ell_k} = t_{\ell_1 \cdots \ell_k}$ if $\ell \in [m_1] \times \cdots \times [m_k]$, and $(s \oplus t)_{\ell_1 \cdots \ell_k} = 0$ for the remaining indices.
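As an illustration of the two operations just defined, the following Python sketch stores a tensor as a pair (dictionary of nonzero entries, shape), with 0-based indices. The encoding of an index pair $(i, j)$ as the single number $i \cdot m + j$ and the function names are choices made for this sketch only.

```python
from itertools import product

def diagonal_tensor(n, k=3):
    """The diagonal tensor <n> of order k: entry 1 at (i, ..., i), 0 elsewhere."""
    return {(i,) * k: 1 for i in range(n)}, (n,) * k

def kronecker(s, t):
    """Tensor (Kronecker) product: leg by leg, the pair (i, j) becomes i * m + j."""
    (s_entries, s_shape), (t_entries, t_shape) = s, t
    shape = tuple(n * m for n, m in zip(s_shape, t_shape))
    entries = {}
    for (i, x), (j, y) in product(s_entries.items(), t_entries.items()):
        index = tuple(ia * m + ja for ia, ja, m in zip(i, j, t_shape))
        entries[index] = x * y
    return entries, shape

def direct_sum(s, t):
    """Direct sum: s sits in the first corner block, t in the opposite one."""
    (s_entries, s_shape), (t_entries, t_shape) = s, t
    shape = tuple(n + m for n, m in zip(s_shape, t_shape))
    entries = dict(s_entries)
    for j, y in t_entries.items():
        entries[tuple(n + ja for n, ja in zip(s_shape, j))] = y
    return entries, shape
```

With this encoding, `kronecker(diagonal_tensor(2), diagonal_tensor(3))` is the diagonal tensor of size 6 and `direct_sum(diagonal_tensor(2), diagonal_tensor(3))` is the diagonal tensor of size 5, in line with the normalisation, multiplicativity and additivity conditions that appear below.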


The asymptotic restriction problem asks to compute the infimum of all real numbers $\beta \geq 0$ such that for all $n \in \mathbb{N}$

$$s^{\otimes \beta n + o(n)} \geq t^{\otimes n}.$$

We may think of the asymptotic restriction problem as having two directions, namely to find

1. obstructions, “certificates” that prohibit $s^{\otimes \beta n + o(n)} \geq t^{\otimes n}$, or

2. constructions, linear maps that carry out $s^{\otimes \beta n + o(n)} \geq t^{\otimes n}$.

Ideally, we would like to find matching obstructions and constructions, so that we indeed learn the value of $\beta$.

What do obstructions look like? We set $\beta$ equal to one; it turns out that it is sufficient to understand this case. We say $s$ restricts asymptotically to $t$, and write $s \gtrsim t$, if

$$s^{\otimes n + o(n)} \geq t^{\otimes n}.$$

What do obstructions look like for asymptotic restriction $\gtrsim$? More precisely, what do obstructions look like for $\gtrsim$ restricted to a subset $S \subseteq \{k\text{-tensors over } \mathbb{F}\}$? Let us assume $S$ is closed under direct sum and tensor product, and contains the diagonal tensors $\langle n \rangle = \sum_{i=1}^n e_i \otimes \cdots \otimes e_i$ for $n \in \mathbb{N}$, where $e_1, \ldots, e_n$ is the standard basis of $\mathbb{F}^n$. Let $X(S)$ be the set of all maps $\phi : S \to \mathbb{R}_{\geq 0}$ that are

(a) monotone under restriction $\geq$,

(b) multiplicative under the tensor Kronecker product $\otimes$,

(c) additive under the direct sum $\oplus$,

(d) normalised to $\phi(\langle n \rangle) = n$ at the diagonal tensor $\langle n \rangle$.

The elements $\phi \in X(S)$ are called spectral points of $S$. The set $X(S)$ is called the asymptotic spectrum of $S$.

Spectral points $\phi \in X(S)$ are obstructions. Let $s, t \in S$. If $s \gtrsim t$, then by definition we have a restriction $s^{\otimes n + o(n)} \geq t^{\otimes n}$. Then (a) and (b) imply the inequality $\phi(s)^{n + o(n)} = \phi(s^{\otimes n + o(n)}) \geq \phi(t^{\otimes n}) = \phi(t)^n$. Taking $n$th roots and letting $n \to \infty$, this implies $\phi(s) \geq \phi(t)$. We negate that statement: if $\phi(s) < \phi(t)$, then not $s \gtrsim t$. In that case, $\phi$ is an obstruction to $s \gtrsim t$.
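For a concrete instance of properties (a) to (d), one can look at the case $k = 2$, where tensors are matrices and matrix rank is monotone under restriction, multiplicative under the Kronecker product, additive under the direct sum, and equal to $n$ on the $n \times n$ identity. The Python sketch below checks (b), (c) and (d) on small examples over the rationals and uses rank as an obstruction in the sense just described. It is an illustration added here, not a construction from the text; all function names are illustrative.

```python
from fractions import Fraction

def rank(matrix):
    """Matrix rank over the rationals, by Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c] / rows[r][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def kron(A, B):
    """Kronecker product of two matrices."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def direct_sum(A, B):
    """Block-diagonal matrix with blocks A and B."""
    wa, wb = len(A[0]), len(B[0])
    return [row + [0] * wb for row in A] + [[0] * wa + row for row in B]

def identity(n):
    """The diagonal 2-tensor <n>."""
    return [[int(i == j) for j in range(n)] for i in range(n)]

def is_obstruction(phi, s, t):
    """phi(s) < phi(t) rules out an asymptotic restriction from s to t."""
    return phi(s) < phi(t)
```

For example, with `S = [[1, 2], [2, 4]]` (rank 1) and `T = [[1, 0, 1], [0, 1, 1], [1, 1, 2]]` (rank 2), `is_obstruction(rank, S, T)` holds, so `S` does not restrict asymptotically to `T`.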

The remarkable fact is that $X(S)$ is a complete set of obstructions for $\gtrsim$. Namely, for $s, t \in S$ the asymptotic restriction $s \gtrsim t$ holds if and only if we have $\phi(s) \geq \phi(t)$ for all spectral points $\phi \in X(S)$. This was proven by Volker Strassen in [Str86, Str88]. His proof uses a theorem of Becker and Schwarz [BS83], which is commonly referred to as the Kadison–Dubois theorem (for historical reasons) or the real representation theorem. (We will say more about this completeness result in Section 1.4.)

Let us introduce tensor rank and subrank, and their asymptotic versions. The tensor rank of $t$ is the size of the smallest diagonal tensor that restricts to $t$, $R(t) = \min\{r \in \mathbb{N} : t \leq \langle r \rangle\}$, and the subrank of $t$ is the size of the largest diagonal tensor to which $t$ restricts, $Q(t) = \max\{r \in \mathbb{N} : \langle r \rangle \leq t\}$. Asymptotic rank is defined as $\tilde{R}(t) = \lim_{n \to \infty} R(t^{\otimes n})^{1/n}$ and asymptotic subrank is defined as $\tilde{Q}(t) = \lim_{n \to \infty} Q(t^{\otimes n})^{1/n}$. From Fekete’s lemma it follows that $\tilde{Q}(t) = \sup_n Q(t^{\otimes n})^{1/n}$ and $\tilde{R}(t) = \inf_n R(t^{\otimes n})^{1/n}$. One easily verifies that every spectral point $\phi \in X(S)$ is an upper bound on asymptotic subrank and a lower bound on asymptotic rank for any tensor $t \in S$:

$$\tilde{Q}(t) \leq \phi(t) \leq \tilde{R}(t).$$

Strassen used the completeness of $X(S)$ for $\lesssim$ to prove $\tilde{Q}(t) = \min_{\phi \in X(S)} \phi(t)$ and $\tilde{R}(t) = \max_{\phi \in X(S)} \phi(t)$. One should think of these expressions as being dual to the defining expressions for $\tilde{Q}$ and $\tilde{R}$.

We mentioned that Strassen was motivated to study the asymptotic spectrum of tensors by the study of the complexity of matrix multiplication. The precise connection with matrix multiplication is as follows. The matrix multiplication exponent $\omega$ is characterised by the asymptotic rank $\tilde{R}(\langle 2, 2, 2 \rangle)$ of the matrix multiplication tensor

$$\langle 2, 2, 2 \rangle = \sum_{i, j, k \in [2]} e_{ij} \otimes e_{jk} \otimes e_{ki} \in \mathbb{F}^4 \otimes \mathbb{F}^4 \otimes \mathbb{F}^4$$

via $\tilde{R}(\langle 2, 2, 2 \rangle) = 2^\omega$. We know the trivial lower bound $2 \leq \omega$, see Section 4.3. We know the (nontrivial) upper bound $\omega \leq 2.3728639$, which is by Coppersmith and Winograd [CW90] and improvements by Stothers [Sto10], Williams [Wil12] and Le Gall [LG14]. It may seem that for the study of matrix multiplication only the asymptotic rank $\tilde{R}$ is of interest, and that the asymptotic subrank $\tilde{Q}$ is just a toy parameter. Asymptotic subrank, however, plays an important role in the currently best matrix multiplication algorithms. We will discuss this idea in the context of the asymptotic subrank of so-called complete graph tensors in Section 5.5.
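The tensor $\langle 2, 2, 2 \rangle$ is small enough to write down explicitly. The Python sketch below builds its support, encoding the double index $(i, j)$ as $i \cdot n + j$ (a choice made for this sketch), and inspects the flattening that groups the second and third legs together. The rows of this $4 \times 16$ flattening have pairwise disjoint supports, so its matrix rank is 4. Since tensor rank is at least the rank of any flattening, this is one way to see a lower bound of the kind behind $2 \leq \omega$; the function names are illustrative.

```python
from itertools import product

def matrix_multiplication_tensor(n=2):
    """Support of <n, n, n> = sum over i, j, k of e_ij (x) e_jk (x) e_ki.
    All nonzero coefficients equal 1; the pair (i, j) is encoded as i * n + j."""
    return {(i * n + j, j * n + k, k * n + i)
            for i, j, k in product(range(n), repeat=3)}

def flattening_rows(support):
    """Flatten along the first leg: row index -> set of (second, third) column pairs."""
    rows = {}
    for a, b, c in support:
        rows.setdefault(a, set()).add((b, c))
    return rows

def rows_have_disjoint_supports(rows):
    """True if no column is used by two different rows."""
    columns = [col for cols in rows.values() for col in cols]
    return len(columns) == len(set(columns))
```

When the rows are nonzero and have disjoint supports, they are linearly independent, so the number of rows equals the flattening rank.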

The important message is: understanding the asymptotic spectrum of tensors $X(S)$ means understanding asymptotic restriction $\lesssim$, the asymptotic subrank $\tilde{Q}$ and the asymptotic rank $\tilde{R}$ of tensors. Of course, we should now find an explicit description of $X(S)$.

Our main result regarding the asymptotic spectrum of tensors is the explicit description of an infinite family of elements in the asymptotic spectrum of all complex tensors $X(\{\text{complex } k\text{-tensors}\})$, which we call the quantum functionals (Chapter 6). Finding such an infinite family has been an open problem since the work of Strassen. Moment polytopes (studied under the name entanglement polytopes in quantum information theory [WDGC13]) play a key role here. To each tensor $t$ is associated a convex polytope $P(t)$ collecting representation-theoretic information about $t$, called the moment polytope of $t$. (See e.g. [Nes84, Bri87, WDGC13, SOK14].) The moment polytope has two important equivalent descriptions.

Quantum marginal spectra description. We begin with the description of $P(t)$ in terms of quantum marginal spectra.

Let $V$ be a (finite-dimensional) Hilbert space. In quantum information theory, a positive semidefinite hermitian operator $\rho : V \to V$ with trace one is called a density operator. The sequence of eigenvalues of a density operator $\rho$ is a probability vector. We let $\mathrm{spec}(\rho) = (p_1, \ldots, p_n)$ be the sequence of eigenvalues of $\rho$, ordered non-increasingly, $p_1 \geq \cdots \geq p_n$. Let $V_1$ and $V_2$ be Hilbert spaces. Given a density operator $\rho$ on $V_1 \otimes V_2$, the reduced density operator $\rho_1 = \mathrm{tr}_2\, \rho$ is uniquely defined by the property that $\mathrm{tr}(\rho_1 X_1) = \mathrm{tr}(\rho (X_1 \otimes \mathrm{Id}_{V_2}))$ for all operators $X_1$ on $V_1$. The operator $\rho_1$ is again a density operator. The operation $\mathrm{tr}_2$ is called the partial trace over $V_2$. In an explicit form, $\rho_1$ is given by $\langle e_i, \rho_1(e_j) \rangle = \sum_\ell \langle e_i \otimes f_\ell, \rho(e_j \otimes f_\ell) \rangle$, where the $e_i$ form a basis of $V_1$ and the $f_\ell$ form an orthonormal basis of $V_2$ (the statement is independent of basis choice).

Let $V_i$ be a Hilbert space and consider the tensor product $V_1 \otimes V_2 \otimes V_3$. Associate with $t \in V_1 \otimes V_2 \otimes V_3$ the dual element $t^* = \langle t, \cdot \rangle \in (V_1 \otimes V_2 \otimes V_3)^*$. Then $\rho^t = t t^* / \langle t, t \rangle = t \langle t, \cdot \rangle / \langle t, t \rangle$ is a density operator on $V_1 \otimes V_2 \otimes V_3$. Viewing $\rho^t$ as a density operator on the regrouped space $V_1 \otimes (V_2 \otimes V_3)$, we may take the partial trace of $\rho^t$ over $V_2 \otimes V_3$ as described above. We denote the resulting density operator by $\rho^t_1 = \mathrm{tr}_{23}\, \rho^t$. We similarly define $\rho^t_2$ and $\rho^t_3$.

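Let $V = V_1 \otimes V_2 \otimes V_3$. Let $G = \mathrm{GL}(V_1) \times \mathrm{GL}(V_2) \times \mathrm{GL}(V_3)$ act naturally on $V$. Let $t \in V \setminus 0$. The moment polytope of $t$ is

$$P(t) = P(\overline{G \cdot t}) = \{(\mathrm{spec}(\rho^u_1), \mathrm{spec}(\rho^u_2), \mathrm{spec}(\rho^u_3)) : u \in \overline{G \cdot t} \setminus 0\}.$$

Here $\overline{G \cdot t}$ denotes the Zariski closure, or equivalently the Euclidean closure, in $V$ of the orbit $G \cdot t = \{g \cdot t : g \in G\}$.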

Representation-theoretic description. On the other hand, there is a description of $P(t)$ in terms of non-vanishing of representation-theoretic multiplicities. We do not state this description here, but stress that it is crucial for our proofs.

Quantum functionals. For any probability vector $\theta \in \mathbb{R}^k$ (i.e. $\sum_{i=1}^k \theta(i) = 1$ and $\theta(i) \geq 0$ for all $i \in [k]$) we define the quantum functional $F^\theta$ as an optimisation over the moment polytope:

$$F^\theta(t) = \max\Bigl\{ 2^{\sum_{i=1}^k \theta(i) H(x^{(i)})} : (x^{(1)}, \ldots, x^{(k)}) \in P(t) \Bigr\}.$$

Here $H(y)$ denotes Shannon entropy of the probability vector $y$. We prove that $F^\theta$ satisfies properties (a), (b), (c) and (d) for all complex $k$-tensors.

Theorem (Theorem 6.11). $F^\theta \in X(\{\text{complex } k\text{-tensors}\})$.

To put our result into context: Strassen in [Str91] constructed elements in the asymptotic spectrum of $S = \{\text{oblique } k\text{-tensors over } \mathbb{F}\}$ with the preorder $\leq|_S$. The set $S$ is a strict and non-generic subset of all $k$-tensors over $\mathbb{F}$. These elements we call the (Strassen) support functionals. On oblique tensors over $\mathbb{C}$, the quantum functionals and the support functionals coincide. An advantage of the support functionals over the quantum functionals is that they are defined over any field. In fact, the support functionals are “powerful enough” to reprove the result of Ellenberg–Gijswijt on cap sets [EG17]. We discuss the support functionals in Section 4.4.

1.3 Higher-order CW method

Recall that in the asymptotic restriction problem we have an obstruction direction and a construction direction. The quantum functionals and the support functionals provide obstructions. Now we look at the construction direction. Constructions are asymptotic transformations $s^{\otimes \beta n + o(n)} \geq t^{\otimes n}$. We restrict attention to the case that $t$ is a diagonal tensor $\langle r \rangle$. Constructions in this case essentially correspond to lower bounds on the asymptotic subrank $\tilde{Q}(s)$. The goal is now to construct good lower bounds on $\tilde{Q}(s)$.

Strassen solved the problem of computing the asymptotic subrank for so-called tight 3-tensors with the Coppersmith–Winograd (CW) method and the support functionals [CW90, Str91]. The CW method is combinatorial. Let us introduce the combinatorial viewpoint. Let $I_1, \ldots, I_k$ be finite sets. We call a set $D \subseteq I_1 \times \cdots \times I_k$ a diagonal if any two distinct elements $a, b \in D$ differ in all $k$ coordinates. Let $\Phi \subseteq I_1 \times \cdots \times I_k$. We call a diagonal $D \subseteq \Phi$ free if $D = \Phi \cap (D_1 \times \cdots \times D_k)$. Here $D_i = \{a_i : a \in D\}$ is the projection of $D$ onto the $i$th coordinate. The subrank $Q(\Phi)$ of $\Phi$ is the size of the largest free diagonal $D \subseteq \Phi$. For two sets $\Phi \subseteq I_1 \times \cdots \times I_k$ and $\Psi \subseteq J_1 \times \cdots \times J_k$ we define the product $\Phi \times \Psi \subseteq (I_1 \times J_1) \times \cdots \times (I_k \times J_k)$ by $\Phi \times \Psi = \{((a_1, b_1), \ldots, (a_k, b_k)) : a \in \Phi, b \in \Psi\}$. The asymptotic subrank is defined as $\tilde{Q}(\Phi) = \lim_{n \to \infty} Q(\Phi^{\times n})^{1/n}$. One may think of $\Phi$ as a $k$-partite hypergraph and of a free diagonal in $\Phi$ as an induced $k$-partite matching.
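For very small $\Phi$ the combinatorial subrank can be found by exhaustive search. The following Python sketch implements the definitions of diagonal and free diagonal literally; it is exponential in $|\Phi|$ and is meant only to make the definitions concrete. The function names are illustrative.

```python
from itertools import combinations

def is_diagonal(D):
    """Any two distinct elements of D differ in all coordinates."""
    return all(all(x != y for x, y in zip(a, b)) for a, b in combinations(D, 2))

def is_free_diagonal(D, Phi):
    """D is a diagonal inside Phi, and Phi intersected with the box
    D_1 x ... x D_k (the coordinate projections of D) is exactly D."""
    D, Phi = set(D), set(Phi)
    if not D <= Phi or not is_diagonal(D):
        return False
    if not D:
        return True
    k = len(next(iter(D)))
    projections = [{a[i] for a in D} for i in range(k)]
    box = {a for a in Phi if all(a[i] in projections[i] for i in range(k))}
    return box == D

def subrank(Phi):
    """Size of the largest free diagonal in Phi, by brute force."""
    Phi = list(Phi)
    for r in range(len(Phi), 0, -1):
        if any(is_free_diagonal(D, Phi) for D in combinations(Phi, r)):
            return r
    return 0
```

For instance, the set $\{(0,0,0), (1,1,1)\}$ is itself a free diagonal, so its subrank is 2, whereas in $\{(0,0,1), (0,1,0), (1,0,0)\}$ any two elements agree in some coordinate, so the subrank is 1.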

How does this combinatorial version of subrank relate to the tensor version of subrank that we defined earlier? Let $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$. Expand $t$ in the standard basis, $t = \sum_{i \in [n_1] \times \cdots \times [n_k]} t_i\, e_{i_1} \otimes \cdots \otimes e_{i_k}$. Let $\mathrm{supp}(t)$ be the support of $t$ in the standard basis, $\mathrm{supp}(t) = \{i \in [n_1] \times \cdots \times [n_k] : t_i \neq 0\}$. Then $Q(\mathrm{supp}(t)) \leq Q(t)$.

We want to construct large free diagonals. Let $\Phi \subseteq I_1 \times \cdots \times I_k$. We call $\Phi$ tight if there are injective maps $\alpha_i : I_i \to \mathbb{Z}$ such that if $a \in \Phi$, then $\sum_{i=1}^k \alpha_i(a_i) = 0$. For a set $X$, let $\mathcal{P}(X)$ be the set of probability distributions on $X$. For $\theta \in \mathcal{P}([k])$, let $H_\theta(\Phi) = \max_{P \in \mathcal{P}(\Phi)} \sum_{i=1}^k \theta(i) H(P_i)$, where $H(P_i)$ denotes the Shannon entropy of the $i$th marginal distribution of $P$. In [Str91], Strassen used the CW method and the support functionals to characterise the asymptotic subrank $\tilde{Q}(\Phi)$ for tight $\Phi \subseteq I_1 \times I_2 \times I_3$. He proved the following. Let $\Phi \subseteq I_1 \times I_2 \times I_3$ be tight. Then

$$\tilde{Q}(\Phi) = \min_{\theta \in \mathcal{P}([3])} 2^{H_\theta(\Phi)} = \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} 2^{H(P_i)}. \tag{1.1}$$

We study the higher-order regime $\Phi \subseteq I_1 \times \cdots \times I_k$, $k \geq 4$.

Theorem (Theorem 5.7). Let $\Phi \subseteq I_1 \times \cdots \times I_k$ be tight. Then $\tilde{Q}(\Phi)$ is lower bounded by an expression that generalizes the right-hand side of (1.1).

Stating the lower bound requires a few definitions, so we do not state it here. It is not known whether our new lower bound matches the upper bound given by quantum or support functionals.

Using Theorem 5.7 we managed to exactly determine the asymptotic subranks of several new examples. These results we in turn used to obtain upper bounds on the asymptotic rank of so-called complete graph tensors, via a higher-order Strassen laser method.

1.4 Abstract asymptotic spectra

Strassen mainly studied tensors, but he developed an abstract theory of asymptotic spectra in a general setting. In the next section we apply this abstract theory to graphs. We now introduce the abstract theory. One has a semiring $S$ (think of a semiring as a ring without additive inverses) that contains $\mathbb{N}$, and a preorder $\leq$ on $S$ that (1) behaves well with respect to the semiring operations, (2) induces the natural order on $\mathbb{N}$, and (3) for any $a, b \in S$, $b \neq 0$, there is an $r \in \mathbb{N} \subseteq S$ with $a \leq r \cdot b$. We call such a preorder a Strassen preorder. The main theorem is that the asymptotic version $\lesssim$ of the Strassen preorder is characterised by the monotone semiring homomorphisms $S \to \mathbb{R}_{\geq 0}$. For $a, b \in S$, let $a \lesssim b$ if there is a sequence $(x_n) \in \mathbb{N}^{\mathbb{N}}$ with $x_n^{1/n} \to 1$ when $n \to \infty$ and $a^n \leq b^n x_n$ for all $n \in \mathbb{N}$. Let

$$X = X(S, \leq) = \{\phi \in \mathrm{Hom}(S, \mathbb{R}_{\geq 0}) : \forall a, b \in S,\ a \leq b \Rightarrow \phi(a) \leq \phi(b)\}.$$

The set $X$ is called the asymptotic spectrum of $(S, \leq)$.

Theorem (Strassen). $a \lesssim b$ iff $\forall \phi \in X$, $\phi(a) \leq \phi(b)$.

Strassen applies this theorem to study rank and subrank of tensors. We define an abstract notion of rank $R(a) = \min\{n \in \mathbb{N} : a \leq n\}$ and an abstract notion of subrank $Q(a) = \max\{m \in \mathbb{N} : m \leq a\}$. We then naturally have an asymptotic rank $\tilde{R}(a) = \lim_{n \to \infty} R(a^n)^{1/n}$ and (under certain mild conditions) an asymptotic subrank $\tilde{Q}(a) = \lim_{n \to \infty} Q(a^n)^{1/n}$. In fact, $\tilde{R}(a) = \inf_n R(a^n)^{1/n}$ and $\tilde{Q}(a) = \sup_n Q(a^n)^{1/n}$ by Fekete’s lemma. The theorem implies the following dual characterisations.

Corollary (Section 2.8). If $a \in S$ with $a^k \geq 2$ for some $k \in \mathbb{N}$, then

$$\tilde{Q}(a) = \min_{\phi \in X} \phi(a).$$

If $a \in S$ with $\phi(a) \geq 1$ for some $\phi \in X$, then

$$\tilde{R}(a) = \max_{\phi \in X} \phi(a).$$

In Chapter 2 we will discuss the abstract theory of asymptotic spectra. We will discuss a proof of the above theorem that is obtained by integrating the proofs of Strassen in [Str88] and the proof of the Kadison–Dubois theorem of Becker and Schwarz in [BS83]. We will also discuss some basic properties of general asymptotic spectra.

1.5 The asymptotic spectrum of graphs

In the previous section we have seen the abstract theory of asymptotic spectra. We now discuss a problem in graph theory where we can apply this abstract theory. Consider a communication channel with input alphabet $\{a, b, c, d, e\}$ and output alphabet $\{1, 2, 3, 4, 5\}$. When the sender gives an input to the channel, the receiver gets an output according to the following diagram, where an outgoing arrow is picked randomly (say uniformly randomly).

[Diagram: the inputs a, b, c, d, e in one column and the outputs 1, 2, 3, 4, 5 in another, with arrows from each input to its possible outputs.]

Output 2 has an incoming arrow from $a$ and an incoming arrow from $b$. We say $a$ and $b$ are confusable, because the receiver cannot know whether $a$ or $b$ was given as an input to the channel. In this channel the pairs of inputs $\{a, b\}$, $\{b, c\}$, $\{c, d\}$, $\{d, e\}$, $\{e, a\}$ are confusable. If we restrict the input set to a subset of pairwise non-confusable letters, say $\{a, c\}$, then we can use the channel to communicate two messages with zero error. It is clear that for this channel any non-confusable set of inputs has size at most two. Can we make better use of the channel if we use the channel twice? Yes: now the input set is the set of two-letter words $\{aa, ab, ac, ad, ae, ba, bb, \ldots\}$ and we have a set of pairwise non-confusable words $\{aa, bc, ce, db, ed\}$, which has size 5. Thus “per channel use” we can send at least $\sqrt{5}$ letters. What happens if we use the channel $n$ times?
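The claim that the five words $aa, bc, ce, db, ed$ are pairwise non-confusable is easy to verify by machine. In the Python sketch below, two words are taken to be confusable when, in every position, their letters are equal or confusable (this is the convention behind using the channel twice); the function names are illustrative.

```python
from itertools import combinations

LETTERS = "abcde"

def confusable_letters(x, y):
    """Equal letters, or neighbours on the cycle a-b-c-d-e-a."""
    distance = (LETTERS.index(x) - LETTERS.index(y)) % 5
    return distance in (0, 1, 4)

def confusable_words(u, v):
    """Two words can be confused if every position can be confused."""
    return all(confusable_letters(x, y) for x, y in zip(u, v))

def is_zero_error_code(words):
    """True if no two distinct words in the list are confusable."""
    return not any(confusable_words(u, v) for u, v in combinations(words, 2))

CODE = ["aa", "bc", "ce", "db", "ed"]
```

Calling `is_zero_error_code(CODE)` confirms the example, while a set such as `["aa", "ab"]` fails because the two words agree in the first letter and have neighbouring second letters.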

The situation is concisely described by drawing the confusability graph of the channel, which has the input letters as vertices and the confusable pairs of input letters as edges. For the above channel the confusability graph is the 5-cycle $C_5$.

[Figure: the 5-cycle $C_5$ with vertices a, b, c, d, e.]

A subset of inputs that are pairwise non-confusable corresponds to a subset of the vertices in the confusability graph that contains no edges, an independent set. The independence number of any graph $G$ is the size of the largest independent set in $G$, and is denoted by $\alpha(G)$. If $G$ is the confusability graph of some channel, then the confusability graph for using the channel $n$ times is denoted by $G^{\boxtimes n}$ (the graph product $\boxtimes$ is called the strong graph product). The question of how many letters we can send asymptotically translates to computing the limit

$$\Theta(G) = \lim_{n \to \infty} \alpha(G^{\boxtimes n})^{1/n},$$

which exists because $\alpha$ is supermultiplicative under $\boxtimes$. The parameter $\Theta(G)$ was introduced by Shannon [Sha56] and is called the Shannon capacity of the graph $G$. Computing the Shannon capacity is a nontrivial problem already for small graphs. Lovász in 1979 [Lov79] computed the value $\Theta(C_5) = \sqrt{5}$ by introducing and evaluating a new graph parameter $\vartheta$ which is now known as the Lovász theta number. Already for the 7-cycle $C_7$ the Shannon capacity is not known.
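For small graphs, the independence number of a strong product can be computed exactly. The following Python sketch represents a graph as a dictionary from vertices to neighbour sets, builds the strong product, and computes $\alpha$ with a simple branching search (either a vertex is left out, or it is taken and its neighbours are discarded). The representation and function names are choices made for this sketch, and the search is only practical for small graphs.

```python
from itertools import product

def cycle_graph(n):
    """The n-cycle as a dict: vertex -> set of neighbours."""
    return {v: {(v - 1) % n, (v + 1) % n} for v in range(n)}

def strong_product(G, H):
    """Distinct (g, h) and (g2, h2) are adjacent iff g, g2 are equal or adjacent
    in G and h, h2 are equal or adjacent in H."""
    P = {}
    for g, h in product(G, H):
        closed = {(g2, h2) for g2 in G[g] | {g} for h2 in H[h] | {h}}
        P[(g, h)] = closed - {(g, h)}
    return P

def independence_number(G):
    """Exact independence number by branching on a vertex of maximum degree."""
    def solve(vertices):
        if not vertices:
            return 0
        v = max(vertices, key=lambda u: len(G[u] & vertices))
        rest = vertices - {v}
        if not G[v] & vertices:
            return 1 + solve(rest)  # isolated vertex: always take it
        return max(solve(rest), 1 + solve(rest - G[v]))
    return solve(frozenset(G))
```

On the 5-cycle this gives $\alpha(C_5) = 2$ and $\alpha(C_5 \boxtimes C_5) = 5$, consistent with the two-letter code of size 5 described above.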

Duality theorem. We propose a new application of the abstract theory of asymptotic spectra to graph theory. The main theorem that results from this is a dual characterisation of the Shannon capacity of graphs. For graphs $G$ and $H$ we say $G \leq H$ if there is a graph homomorphism $\overline{G} \to \overline{H}$, i.e. from the complement of $G$ to the complement of $H$. We show graphs are a semiring under the strong graph product $\boxtimes$ and the disjoint union $\sqcup$, and $\leq$ is a Strassen preorder on this semiring. The rank in this setting is the clique cover number $\overline{\chi}(\cdot) = \chi(\overline{\,\cdot\,})$, i.e. the chromatic number of the complement. The subrank in this setting is the independence number $\alpha(\cdot)$. Let $X(\mathcal{G})$ be the set of semiring homomorphisms from graphs to $\mathbb{R}_{\geq 0}$ that are monotone under $\leq$. From the abstract theory of asymptotic spectra we derive the following duality theorem.

Theorem (Theorem 3.1). $\Theta(G) = \min_{\phi \in X(\mathcal{G})} \phi(G)$.

In Chapter 3 we will prove Theorem 3.1 and we will discuss the known elements in $X(\mathcal{G})$, which are the Lovász theta number and a family of parameters obtained by “fractionalising”.


1.6 Tensor degeneration

We move to the second story line that we mentioned earlier: degeneration. Degeneration is a prominent theme in algebraic complexity theory. Roughly speaking, degeneration is an algebraic notion of approximation defined via orbit closures.

For tensors, for example, degeneration is defined as follows. Let $V_1, V_2, V_3$ be finite-dimensional complex vector spaces and let $V = V_1 \otimes V_2 \otimes V_3$ be the tensor product space. Let $G = \mathrm{GL}(V_1) \times \mathrm{GL}(V_2) \times \mathrm{GL}(V_3)$ act naturally on $V$. Let $s, t \in V$. Let $G \cdot t = \{g \cdot t : g \in G\}$ be the orbit of $t$ under $G$. We say $t$ degenerates to $s$, and write $t \trianglerighteq s$, if $s$ is an element in the orbit closure $\overline{G \cdot t}$. Here the closure is taken with respect to the Zariski topology, or equivalently with respect to the Euclidean topology. One should think of this degeneration $\trianglelefteq$ as a topologically closed version of the restriction preorder $\leq$ for tensors that we defined earlier. Degeneration is a “larger” preorder than restriction in the sense that $s \leq t$ implies $s \trianglelefteq t$.

In several algebraic models of computation, approximative computations correspond to certain degenerations. In some models, such an approximative computation can be turned into an exact computation at a small cost, for example using the method of interpolation. The currently fastest matrix multiplication algorithms are constructed in this way, for example.

On the other hand, it turns out that if a lower bound technique for an algebraic measure of complexity is “continuous”, then the lower bounds obtained with this technique are already lower bounds on the approximative version of the complexity measure. This observation turns approximative complexity and degeneration into an interesting topic itself. A research program in this direction is the geometric complexity theory program of Mulmuley and Sohoni towards separating the algebraic complexity class VP (and related classes) from VNP [MS01] (see also [Ike13]).

In this section we briefly discuss three results related to degeneration of tensors that are not discussed further in this dissertation. Then we will discuss results on combinatorial degeneration in Section 1.7 and algebraic branching program degeneration in Section 1.8.

Ratio of tensor rank and border rank. The approximative or degeneration version of tensor rank is called border rank and is denoted by $\underline{R}$. It has been known since the work of Bini and Strassen that tensor rank $R$ and border rank $\underline{R}$ are different. How much can they be different? In [Zui17] we showed the following lower bound. Let $k \geq 3$. There is a sequence of $k$-tensors $t_n$ in $(\mathbb{C}^{2^n})^{\otimes k}$ such that $R(t_n)/\underline{R}(t_n) \geq k - o(1)$ when $n \to \infty$. This answers a question of Landsberg and Michałek [LM16b] and disproves a conjecture of Rhodes [AJRS13]. Further progress will most likely require the construction of explicit tensors with high tensor rank, which has implications in formula complexity [Raz13].

Border support rank. Support rank is a variation on tensor rank, which has its own approximative version called border support rank. A border support rank upper bound for the matrix multiplication tensor yields an upper bound on the asymptotic complexity. This was shown by Cohn and Umans in the context of the group theoretic approach towards fast matrix multiplication [CU13]. They asked: what is the border support rank of the smallest matrix multiplication tensor $\langle 2, 2, 2 \rangle$? In [BCZ17a] we showed that it equals seven. Our proof uses the highest-weight vector technique (see also [HIL13]). Our original motivation to study support rank is a connection that we found between support rank and nondeterministic multiparty quantum communication complexity [BCZ17b].

Tensor rank under outer tensor product. We applied degeneration as a tool to study an outer tensor product on tensors. For $s \in \mathbb{C}^{n_1} \otimes \cdots \otimes \mathbb{C}^{n_k}$ and $t \in \mathbb{C}^{m_1} \otimes \cdots \otimes \mathbb{C}^{m_\ell}$, the outer tensor product of $s$ and $t$ is the natural $(k + \ell)$-tensor in $\mathbb{C}^{n_1} \otimes \cdots \otimes \mathbb{C}^{n_k} \otimes \mathbb{C}^{m_1} \otimes \cdots \otimes \mathbb{C}^{m_\ell}$. The outer tensor product and the tensor Kronecker product differ by a regrouping of the tensor indices. It is well known that tensor rank is not multiplicative under the Kronecker product. In [CJZ18] we showed that tensor rank is already not multiplicative under the outer tensor product, a stronger result. Nonmultiplicativity occurs when taking a power of a tensor whose border rank is strictly smaller than its tensor rank. This answers a question of Draisma [Dra15] and Saptharishi et al. [CKSV16].

1.7 Combinatorial degeneration

In the previous section we introduced the general idea of degeneration and discussed degeneration of tensors. Combinatorial degeneration is the combinatorial analogue of tensor degeneration. Consider sets $\Phi \subseteq \Psi \subseteq I_1 \times \cdots \times I_k$ of $k$-tuples. We say $\Phi$ is a combinatorial degeneration of $\Psi$, and write $\Psi \trianglerighteq \Phi$, if there are maps $u_i : I_i \to \mathbb{Z}$ such that for all $\alpha \in I_1 \times \cdots \times I_k$, if $\alpha \in \Psi \setminus \Phi$, then $\sum_{i=1}^k u_i(\alpha_i) > 0$, and if $\alpha \in \Phi$, then $\sum_{i=1}^k u_i(\alpha_i) = 0$. We prove that combinatorial asymptotic subrank is nonincreasing under combinatorial degeneration.
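Checking that given maps $u_i$ witness a combinatorial degeneration is a direct computation. In the Python sketch below each $u_i$ is a dictionary from $I_i$ to integers; the function name and the small example are illustrative and not taken from the text.

```python
def is_combinatorial_degeneration(Psi, Phi, u):
    """Check that the maps u = (u_1, ..., u_k) witness that Phi is a
    combinatorial degeneration of Psi: total weight 0 on Phi and
    strictly positive weight on the rest of Psi."""
    Psi, Phi = set(Psi), set(Phi)
    if not Phi <= Psi:
        return False

    def weight(alpha):
        return sum(u_i[a] for u_i, a in zip(u, alpha))

    return (all(weight(alpha) == 0 for alpha in Phi)
            and all(weight(alpha) > 0 for alpha in Psi - Phi))

# A small example with k = 3 (chosen for this sketch).
PSI = {(0, 0, 0), (1, 1, 1), (0, 1, 1)}
PHI = {(0, 0, 0), (1, 1, 1)}
U = ({0: 0, 1: -1}, {0: 0, 1: 1}, {0: 0, 1: 0})
```

Here the weights of $(0,0,0)$ and $(1,1,1)$ are $0$, and the weight of $(0,1,1)$ is $1$, so `is_combinatorial_degeneration(PSI, PHI, U)` holds. Only the verification of a given witness is shown; finding suitable maps $u_i$ is a separate problem.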

Theorem (Theorem 5.21). If $\Psi \trianglerighteq \Phi$, then $\tilde{Q}(\Psi) \geq \tilde{Q}(\Phi)$.

The analogous statement for subrank of tensors is trivially true. The crucial point is that Theorem 5.21 is about combinatorial subrank. As an example, Theorem 5.21 combined with the CW method yields an elegant optimal construction of tri-colored sum-free sets, which are combinatorial objects related to cap sets.

1.8 Algebraic branching program degeneration

We now consider degeneration in the context of algebraic branching programs. A central theme in algebraic complexity theory is the study of the power of different algebraic models of computation and the study of the corresponding complexity classes. We have already (implicitly) used an algebraic model of computation when we discussed matrix multiplication: circuits.


• A circuit is a directed acyclic graph $G$ with one or more source vertices and one sink vertex. Each source vertex is labelled by a variable $x_i$ or a constant $\alpha \in \mathbb{F}$. The other vertices are labelled by either $+$ or $\times$ and have in-degree 2 (that is, fan-in 2). Each vertex of $G$ naturally computes a polynomial. The value of $G$ is the element computed at the sink vertex. The size of $G$ is the number of vertices. (One may also allow multiple sink vertices in order to compute multiple polynomials, e.g. to compute matrix multiplication.) Here is an example of a circuit computing $xy + 2x + y - 1$.

[Figure: a circuit with source vertices $-1$, $2$, $x$, $y$, followed by a layer of two $\times$ vertices, a layer of two $+$ vertices, and a $+$ sink vertex.]
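A circuit of this kind can be evaluated by visiting the gates in topological order. The Python sketch below does this for a circuit computing $xy + 2x + y - 1$ with the same gate counts as in the example (two $\times$ gates, two $+$ gates and a $+$ sink). The exact wiring of the original figure is not recoverable from the text, so the wiring in `GATES`, like the gate names, is an assumption made for this sketch.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}

def evaluate_circuit(sources, gates, sink):
    """Evaluate a fan-in-2 circuit.
    sources: dict name -> value; gates: list of (name, op, left, right)
    in topological order; sink: name of the output vertex."""
    values = dict(sources)
    for name, op, left, right in gates:
        values[name] = OPS[op](values[left], values[right])
    return values[sink]

# Hypothetical wiring: (x*y + 2*x) + (y + (-1)).
GATES = [
    ("g1", "*", "x", "y"),
    ("g2", "*", "two", "x"),
    ("g3", "+", "g1", "g2"),
    ("g4", "+", "y", "minus_one"),
    ("g5", "+", "g3", "g4"),
]

def example_circuit(x, y):
    """Value of the example circuit at the point (x, y)."""
    sources = {"x": x, "y": y, "two": 2, "minus_one": -1}
    return evaluate_circuit(sources, GATES, "g5")

# Size in the sense of the definition: 4 source vertices plus 5 gates.
CIRCUIT_SIZE = 4 + len(GATES)
```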

Consider the following two models.

• A formula is a circuit whose graph is a tree.

• An algebraic branching program (abp) is a directed acyclic graph $G$ with one source vertex $s$, one sink vertex $t$ and affine linear forms over the base field $\mathbb{F}$ as edge labels. Moreover, each vertex is labeled with an integer (its layer) and the arrows in the abp point from vertices in layer $i$ to vertices in layer $i + 1$. The cardinality of the largest layer we call the width of the abp. The number of vertices we call the size of the abp. The value of an abp is the sum of the values of all $s$–$t$-paths, where the value of an $s$–$t$-path is the product of its edge labels. We say that an abp computes its value. Here is an example of a width-3 abp computing $xy + 2x + y - 1$.

[Figure: a width-3 abp from $s$ to $t$ with edge labels $x$, $2$, $x$, $y$ and $-1$.]
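The value of an abp can be computed layer by layer: the value at a vertex is the sum, over incoming edges, of the value at the tail times the edge label. The Python sketch below evaluates an abp at a numerical point in this way. The example abp is not the width-3 abp of the figure (whose structure is not recoverable from the text) but a hypothetical width-2 abp for the same polynomial, based on $xy + 2x + y - 1 = x \cdot (y + 2) + 1 \cdot (y - 1)$; all names are illustrative.

```python
def affine(const=0, **coeffs):
    """An affine linear form const + sum of coeff * variable, as a function of a point."""
    return lambda point: const + sum(c * point[v] for v, c in coeffs.items())

def abp_value(layers, edges, point):
    """Sum over all s-t paths of the product of edge labels, evaluated at `point`.
    layers: list of lists of vertex names, from [s] to [t];
    edges: dict (u, v) -> affine form, with u and v in consecutive layers."""
    value = {layers[0][0]: 1}
    for previous, current in zip(layers, layers[1:]):
        for v in current:
            value[v] = sum(value[u] * edges[(u, v)](point)
                           for u in previous if (u, v) in edges)
    return value[layers[-1][0]]

# Hypothetical width-2 abp: paths s -> p -> t and s -> q -> t.
LAYERS = [["s"], ["p", "q"], ["t"]]
EDGES = {
    ("s", "p"): affine(x=1),       # label x
    ("p", "t"): affine(2, y=1),    # label y + 2
    ("s", "q"): affine(1),         # label 1
    ("q", "t"): affine(-1, y=1),   # label y - 1
}

def example_abp(x, y):
    """Value of the example abp at the point (x, y)."""
    return abp_value(LAYERS, EDGES, {"x": x, "y": y})
```

The layer-by-layer evaluation is the same computation as multiplying the matrices of edge labels between consecutive layers, which is why abps are closely tied to iterated matrix multiplication.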


The above models of computation give rise to complexity classes. A complexity class consists of families of multivariate polynomials $(f_n)_n = (f_n(x_1, \ldots, x_{q_n}))_{n \in \mathbb{N}}$ over some fixed field $\mathbb{F}$. We say a family of polynomials $(f_n)_n$ is a p-family if the degree of $f_n$ and the number of variables of $f_n$ grow polynomially in $n$. Let VP be the class of p-families with polynomially bounded circuit size. Let $\mathrm{VP}_e$ be the class of p-families with polynomially bounded formula size. For $k \in \mathbb{N}$, let $\mathrm{VP}_k$ be the class of families of polynomials computable by width-$k$ abps of polynomially bounded size. Let $\mathrm{VP}_s$ be the class of p-families computable by skew circuits of polynomial size. Skew circuits are a type of circuits between formulas and general circuits. The class $\mathrm{VP}_s$ coincides with the class of families of polynomials computable by abps of polynomially bounded size (see e.g. [Sap16]). Ben-Or and Cleve proved that $\mathrm{VP}_3 = \mathrm{VP}_4 = \cdots = \mathrm{VP}_e$ [BOC92]. Allender and Wang proved $\mathrm{VP}_2 \subsetneq \mathrm{VP}_3$ [AW16]. Thus $\mathrm{VP}_2 \subsetneq \mathrm{VP}_3 = \mathrm{VP}_4 = \cdots = \mathrm{VP}_e \subseteq \mathrm{VP}_s$. The following separation problem is one of the many open problems regarding algebraic complexity classes: is the inclusion $\mathrm{VP}_e \subseteq \mathrm{VP}_s$ strict? Motivated by this separation problem we study the approximation closure of $\mathrm{VP}_e$. We mentioned that Ben-Or and Cleve proved that formula size is polynomially equivalent to width-3 abp size [BOC92]. Regarding width 2, there are explicit polynomials that cannot be computed by any width-2 abp of any size [AW16]. The abp model has a natural notion of approximation. When we allow approximation in our abps, the situation changes completely.

Theorem (Theorem 7.8). Any polynomial can be approximated by a width-2 abp of size polynomial in the formula size.

In terms of complexity classes this means $\overline{\mathrm{VP}_2} = \overline{\mathrm{VP}_e}$, where $\overline{\,\cdot\,}$ denotes the “approximation closure” of the complexity class. The theorem suggests an approach regarding the separation of $\mathrm{VP}_e$ and $\mathrm{VP}_s$. Namely, superpolynomial lower bounds on formula size may be obtained from superpolynomial lower bounds on approximate width-2 abp size. We moreover study the nondeterminism closure of complexity classes and prove a new characterisation of the complexity class VNP.

1.9 Organisation

This dissertation is divided into chapters as follows. We will begin with the abstract theory of asymptotic spectra in Chapter 2. Then we introduce the asymptotic spectrum of graphs and a new characterisation of the Shannon capacity in Chapter 3. In Chapter 4 we introduce the asymptotic spectrum of tensors, discuss the support functionals of Strassen for oblique tensors, and give a characterisation of asymptotic slice rank of oblique tensors as the minimum over the support functionals. In Chapter 5 we discuss tight tensors, the higher-order Coppersmith–Winograd method, the combinatorial degeneration method, and applications to the cap set problem, type sets and graph tensors. In Chapter 6 we introduce an infinite family of elements in the asymptotic spectrum of complex $k$-tensors and characterise the asymptotic slice rank as the minimum over the quantum functionals. Finally, in Chapter 7 we study algebraic branching programs, and the approximation closure and nondeterminism closure of algebraic complexity classes.

Chapter 2

The theory of asymptotic spectra

2.1 Introduction

This is an expository chapter about the abstract theory of asymptotic spectra of Volker Strassen [Str88]. The theory studies semirings $S$ that are endowed with a preorder $\leqslant$. The main result, Theorem 2.12, is that under certain conditions the asymptotic version $\lesssim$ of this preorder is characterised by the semiring homomorphisms $S \to \mathbb{R}_{\geq 0}$ that are monotone under $\leqslant$. These monotone homomorphisms make up the "asymptotic spectrum" of $(S, \leqslant)$. For the elements of $S$ we have natural notions of rank and subrank, generalising rank and subrank of tensors. The asymptotic spectrum gives a dual characterisation of the asymptotic versions of rank and subrank. This dual description may be thought of as a "lower bound" method in the sense of computational complexity theory. In Chapter 3 and Chapter 4 we will study two specific pairs $(S, \leqslant)$.

2.2 Semirings and preorders

A (commutative) semiring is a set $S$ with a binary addition operation $+$, a binary multiplication operation $\cdot$, and elements $0, 1 \in S$, such that for all $a, b, c \in S$:

(1) $+$ is associative: $(a + b) + c = a + (b + c)$

(2) $+$ is commutative: $a + b = b + a$

(3) $0 + a = a$

(4) $\cdot$ is associative: $(a \cdot b) \cdot c = a \cdot (b \cdot c)$

(5) $\cdot$ is commutative: $a \cdot b = b \cdot a$

(6) $1 \cdot a = a$

(7) $\cdot$ distributes over $+$: $a \cdot (b + c) = (a \cdot b) + (a \cdot c)$

(8) $0 \cdot a = 0$.

As usual we abbreviate $a \cdot b$ as $ab$. A preorder is a relation $\preccurlyeq$ on a set $X$ such that for all $a, b, c \in X$:

(1) $\preccurlyeq$ is reflexive: $a \preccurlyeq a$

(2) $\preccurlyeq$ is transitive: $a \preccurlyeq b$ and $b \preccurlyeq c$ implies $a \preccurlyeq c$.

As usual, $a \preccurlyeq b$ is the same as $b \succcurlyeq a$. Let $\mathbb{N} = \{0, 1, 2, \ldots\}$ be the set of natural numbers and let $\mathbb{N}_{>0} = \{1, 2, \ldots\}$ be the set of strictly positive natural numbers. We write $\leq$ for the natural order $0 \leq 1 \leq 2 \leq 3 \leq \cdots$ on $\mathbb{N}$.

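2.3 Strassen preorders

Let $S$ be a semiring with $\mathbb{N} \subseteq S$. A preorder $\preccurlyeq$ on $S$ is a Strassen preorder if

(1) $\forall n, m \in \mathbb{N}$: $n \leq m$ iff $n \preccurlyeq m$

(2) $\forall a, b, c, d \in S$: if $a \preccurlyeq b$ and $c \preccurlyeq d$, then $a + c \preccurlyeq b + d$ and $ac \preccurlyeq bd$

(3) $\forall a, b \in S$, $b \neq 0$: $\exists r \in \mathbb{N}$ with $a \preccurlyeq rb$.

Note that condition (2) is equivalent to the condition: $\forall a, b, s \in S$, if $a \preccurlyeq b$, then $a + s \preccurlyeq b + s$ and $as \preccurlyeq bs$.

Let $\preccurlyeq$ be a Strassen preorder on $S$. Then $0 \preccurlyeq 1$ by condition (1). For $a \in S$ we have $a \preccurlyeq a$ by reflexivity, and thus $0 \preccurlyeq a$ by condition (2).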

Examples

We give two examples of a semiring with a Strassen preorder. Proofs and formal definitions are given later.

Graphs. Let $S$ be the set of all (isomorphism classes of) finite simple graphs. Let $G, H \in S$. Let $G \sqcup H$ be the disjoint union of $G$ and $H$. Let $G \boxtimes H$ be the strong graph product of $G$ and $H$ (see Chapter 3). With addition $\sqcup$ and multiplication $\boxtimes$ the set $S$ becomes a semiring. The 0 in $S$ is the graph with no vertices and the 1 in $S$ is the graph with a single vertex. Let $\overline{G}$ be the complement of $G$. Define a preorder $\leqslant$ on $S$ by $G \leqslant H$ if there is a graph homomorphism $\overline{G} \to \overline{H}$. Then $\leqslant$ is a Strassen preorder. We will investigate this semiring further in Chapter 3.

Tensors. Let $\mathbb{F}$ be a field. Let $k \in \mathbb{N}$. Let $S$ be the set of all $k$-tensors over $\mathbb{F}$ with arbitrary format, that is, $S = \bigcup \{\mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k} : n_1, \ldots, n_k \in \mathbb{N}\}$. For $s \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ and $t \in \mathbb{F}^{m_1} \otimes \cdots \otimes \mathbb{F}^{m_k}$, let $s \leqslant t$ if there are linear maps $A_i : \mathbb{F}^{m_i} \to \mathbb{F}^{n_i}$ with $(A_1 \otimes \cdots \otimes A_k) t = s$. We identify any $s, t \in S$ for which $s \leqslant t$ and $t \leqslant s$. Let $\oplus$ be the direct sum of $k$-tensors and let $\otimes$ be the tensor product of $k$-tensors (see Chapter 4). With addition $\oplus$ and multiplication $\otimes$ the set $S$ becomes a semiring. The 0 in $S$ is the zero tensor and the 1 in $S$ is the standard basis element $e_1 \otimes \cdots \otimes e_1 \in \mathbb{F}^1 \otimes \cdots \otimes \mathbb{F}^1$. The preorder $\leqslant$ is a Strassen preorder. We will investigate this semiring further in Chapter 4, Chapter 5 and Chapter 6.
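As a small illustration of the tensor example, consider the special case $k = 2$ over the rationals (this specialisation is an assumption made here for illustration; the text treats general $k$). A 2-tensor is a matrix, and $s \leqslant t$ holds exactly when $\operatorname{rank}(s) \leq \operatorname{rank}(t)$. The sketch below checks on one pair of matrices that matrix rank is additive under the direct sum and multiplicative under the Kronecker (tensor) product, which is the behaviour one expects of a monotone semiring homomorphism. The helper names `rank`, `direct_sum` and `kronecker` are hypothetical and not taken from the text.

```python
from fractions import Fraction

def rank(matrix):
    """Rank over the rationals, computed by Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    rk = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        # Find a row at or below position rk with a nonzero entry in this column.
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[rk])]
        rk += 1
    return rk

def direct_sum(a, b):
    """Block-diagonal matrix with blocks a and b."""
    wa, wb = len(a[0]), len(b[0])
    return [row + [0] * wb for row in a] + [[0] * wa + row for row in b]

def kronecker(a, b):
    """Kronecker product of two matrices given as lists of lists."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

s = [[1, 2], [2, 4]]        # rank 1
t = [[1, 0, 1], [0, 1, 1]]  # rank 2

print(rank(direct_sum(s, t)) == rank(s) + rank(t))
print(rank(kronecker(s, t)) == rank(s) * rank(t))
```

For $k \geq 3$ no single function plays this role, which is why the later chapters study whole families of monotone homomorphisms.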

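2.4 Asymptotic preorders $\precsim$

Definition 2.1. Let $\preccurlyeq$ be a relation on $S$. Define the relation $\precsim$ on $S$ by

\[
a_2 \precsim a_1 \quad\text{if}\quad \exists (x_N) \in \mathbb{N}^{\mathbb{N}},\ \inf_N x_N^{1/N} = 1,\ \forall N \in \mathbb{N}:\ a_2^N \preccurlyeq a_1^N x_N. \tag{2.1}
\]

If $\preccurlyeq$ is a Strassen preorder, then we may in (2.1) replace the infimum $\inf_N x_N^{1/N}$ by the limit $\lim_{N \to \infty} x_N^{1/N}$, since we may assume $x_{N+M} \leq x_N x_M$ (if $a_2^N \preccurlyeq a_1^N x_N$ and $a_2^M \preccurlyeq a_1^M x_M$, then $a_2^{N+M} \preccurlyeq a_1^{N+M} x_N x_M$) and then apply Fekete's lemma (Lemma 2.2).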

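Lemma 2.2 (Fekete's lemma, see [PS98, No. 98]). Let $x_1, x_2, x_3, \ldots \in \mathbb{R}_{\geq 0}$ satisfy $x_{n+m} \leq x_n + x_m$. Then $\lim_{n \to \infty} x_n/n = \inf_n x_n/n$.

Proof. Let $y = \inf_n x_n/n$. Let $\varepsilon > 0$. Let $m \in \mathbb{N}_{>0}$ with $x_m/m < y + \varepsilon$. Any $n \in \mathbb{N}$ can be written in the form $n = qm + r$, where $r$ is an integer with $0 \leq r \leq m - 1$. Set $x_0 = 0$. Then $x_n = x_{qm+r} \leq x_m + x_m + \cdots + x_m + x_r = q x_m + x_r$. Therefore

\[
\frac{x_n}{n} = \frac{x_{qm+r}}{qm+r} \leq \frac{q x_m + x_r}{qm + r} = \frac{x_m}{m} \cdot \frac{qm}{qm+r} + \frac{x_r}{n}.
\]

Thus

\[
y \leq \frac{x_n}{n} < (y + \varepsilon) \frac{qm}{n} + \frac{x_r}{n}.
\]

The claim follows because $x_r/n \to 0$ and $qm/n \to 1$ when $n \to \infty$.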
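A quick numerical illustration of Fekete's lemma, as a sketch under the assumption that we pick the concrete subadditive sequence $x_n = n + \sqrt{n}$ (this choice is ours, not the text's). Here $x_n / n = 1 + 1/\sqrt{n}$, so both the limit and the infimum equal 1.

```python
import math

def x(n):
    """An illustrative subadditive sequence: sqrt(n + m) <= sqrt(n) + sqrt(m)."""
    return n + math.sqrt(n)

def is_subadditive(seq, limit):
    """Check seq(n + m) <= seq(n) + seq(m) for 1 <= n, m < limit (small float slack)."""
    return all(seq(n + m) <= seq(n) + seq(m) + 1e-12
               for n in range(1, limit) for m in range(1, limit))

# The ratios x_n / n decrease towards their infimum 1.
ratios = [x(n) / n for n in (1, 10, 100, 10_000, 1_000_000)]

print(is_subadditive(x, 50))
print(ratios)
```

In the setting of (2.1) the lemma is applied to $\log x_N$, which is subadditive whenever $x_{N+M} \leq x_N x_M$.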

For $a_1, a_2 \in S$, if $a_1 \preccurlyeq a_2$, then clearly $a_1 \precsim a_2$.

Lemma 2.3. Let $\preccurlyeq$ be a Strassen preorder on $S$. Then $\precsim$ is a Strassen preorder on $S$, the "asymptotic preorder" corresponding to $\preccurlyeq$.

Proof. Let $a, b, c, d \in S$. We verify that $\precsim$ is a preorder.

First, reflexivity. We have $a \preccurlyeq a$, so $a^N \preccurlyeq a^N \cdot 1$, so $a \precsim a$.

Second, transitivity. Let $a \precsim b$ and $b \precsim c$. This means $a^N \preccurlyeq b^N x_N$ and $b^N \preccurlyeq c^N y_N$ with $x_N^{1/N} \to 1$ and $y_N^{1/N} \to 1$. Then $a^N \preccurlyeq b^N x_N \preccurlyeq c^N x_N y_N$. Since $(x_N y_N)^{1/N} \to 1$, we conclude $a \precsim c$.

We verify condition (1). Let $n, m \in \mathbb{N}$. If $n \leq m$, then $n \preccurlyeq m$, so $n \precsim m$. If $n \precsim m$, then $n^N \preccurlyeq m^N x_N$, so $n^N \leq m^N x_N$, which implies $n \leq m$.

We verify condition (2). Let $a \precsim b$ and $c \precsim d$. This means $a^N \preccurlyeq b^N x_N$ and $c^N \preccurlyeq d^N y_N$. Thus $a^N c^N \preccurlyeq b^N d^N x_N y_N$, and so $ac \precsim bd$. Assume $x_N$ and $y_N$ are nondecreasing (otherwise set $x_N = \max_{n \leq N} x_n$, and likewise for $y_N$). Then

\[
(a + c)^N = \sum_{m=0}^{N} \binom{N}{m} a^m c^{N-m} \preccurlyeq \sum_{m=0}^{N} \binom{N}{m} b^m d^{N-m} x_m y_{N-m} \preccurlyeq \sum_{m=0}^{N} \binom{N}{m} b^m d^{N-m} x_N y_N = (b + d)^N x_N y_N.
\]

Thus $a + c \precsim b + d$.

We verify (3). Let $a, b \in S$, $b \neq 0$. Then there is an $r \in \mathbb{N}$ with $a \preccurlyeq rb$, and thus $a \precsim rb$.

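Lemma 2.4. Let $\preccurlyeq$ be a Strassen preorder on $S$. Let $a_1, a_2, b \in S$. Write $\underset{\sim}{\precsim}$ for the asymptotic preorder corresponding to $\precsim$.

(i) If $a_2 + b \preccurlyeq a_1 + b$, then $a_2 \precsim a_1$.

(ii) If $a_2 b \preccurlyeq a_1 b$ with $b \neq 0$, then $a_2 \precsim a_1$.

(iii) If $a_2 \underset{\sim}{\precsim} a_1$, then $a_2 \precsim a_1$.

(iv) If $\exists s \in S\ \forall n \in \mathbb{N}$: $n a_2 \preccurlyeq n a_1 + s$, then $a_2 \precsim a_1$.

Proof. (ii) Let $a_2 b \preccurlyeq a_1 b$. By an inductive argument similar to the argument we use to prove (2.4),

\[
\forall N \in \mathbb{N}:\ a_2^N b \preccurlyeq a_1^N b. \tag{2.2}
\]

Let $m, r \in \mathbb{N}$ with $1 \preccurlyeq mb \preccurlyeq r$. (We use $b \neq 0$.) From (2.2) follows

\[
\forall N \in \mathbb{N}:\ a_2^N \preccurlyeq a_2^N mb \preccurlyeq a_1^N mb \preccurlyeq a_1^N r.
\]

Thus we conclude $a_2 \precsim a_1$.

(iii) Let $a_2 \underset{\sim}{\precsim} a_1$. This means $a_2^N \precsim a_1^N x_N$ with $x_N^{1/N} \to 1$. This in turn means that $(a_2^N)^M \preccurlyeq (a_1^N x_N)^M y_{N,M}$ with $\forall N \in \mathbb{N}$: $y_{N,M}^{1/M} \to 1$ as $M \to \infty$, that is,

\[
a_2^{NM} \preccurlyeq a_1^{NM} x_N^M y_{N,M}.
\]

Choose a sequence $N \mapsto M_N$ such that $(y_{N,M_N})^{1/M_N} \leq 2$; e.g. given $N$, let $M_N$ be the smallest $M$ for which $(y_{N,M})^{1/M} \leq 2$. Then $a_2^{N M_N} \preccurlyeq a_1^{N M_N} x_N^{M_N} y_{N,M_N}$ and

\[
\bigl(x_N^{M_N} y_{N,M_N}\bigr)^{1/(N M_N)} = x_N^{1/N} (y_{N,M_N})^{1/(N M_N)} \leq x_N^{1/N} 2^{1/N} \to 1.
\]

We conclude $a_2 \precsim a_1$.

(iv) Let $s \in S$ with $\forall n \in \mathbb{N}$: $n a_2 \preccurlyeq n a_1 + s$. We may assume $a_1 \neq 0$. Let $k \in \mathbb{N}$ with $s \preccurlyeq k a_1$. Then

\[
\forall n \in \mathbb{N}:\ kn a_2 \preccurlyeq kn a_1 + k a_1 = k a_1 (n + 1). \tag{2.3}
\]

Apply (ii) to (2.3) to get

\[
\forall n \in \mathbb{N}:\ a_2 n \precsim a_1 (n + 1).
\]

By an inductive argument,

\[
\forall N \in \mathbb{N}:\ a_2^N \precsim a_2^{N-1} a_1 \cdot 2 \precsim a_2^{N-2} a_1^2 \cdot 3 \precsim \cdots \precsim a_1^N (N + 1).
\]

Since $(N + 1)^{1/N} \to 1$, we have $a_2 \underset{\sim}{\precsim} a_1$. From (iii) follows $a_2 \precsim a_1$.

(i) Let $a_2 + b \preccurlyeq a_1 + b$. We first prove

\[
\forall q \in \mathbb{N}:\ q a_2 + b \preccurlyeq q a_1 + b. \tag{2.4}
\]

By assumption, the statement is true for $q = 1$. Suppose the statement is true for $q - 1$. Then

\[
q a_2 + b = (q - 1) a_2 + (a_2 + b) \preccurlyeq (q - 1) a_2 + (a_1 + b) = ((q - 1) a_2 + b) + a_1 \preccurlyeq ((q - 1) a_1 + b) + a_1 = q a_1 + b,
\]

which proves the statement by induction. Then $\forall n \in \mathbb{N}$: $n a_2 \preccurlyeq n a_1 + b$. From (iv) follows $a_2 \precsim a_1$.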

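2.5 Maximal Strassen preorders

Let $\mathcal{P}$ be the set of Strassen preorders on $S$. For $\preccurlyeq_1, \preccurlyeq_2 \in \mathcal{P}$ we write $\preccurlyeq_2 \subseteq \preccurlyeq_1$ if for all $a, b \in S$: $a \preccurlyeq_2 b$ implies $a \preccurlyeq_1 b$. (The notation $\preccurlyeq_2 \subseteq \preccurlyeq_1$ is natural if we think of the relations $\preccurlyeq_i$ as sets of pairs $(a, b)$ with $a \preccurlyeq_i b$.)

Lemma 2.5. Let $\preccurlyeq \in \mathcal{P}$ with $\preccurlyeq = \precsim$ and $a_2 \not\preccurlyeq a_1$. Then there is an element $\preccurlyeq_{a_1 a_2} \in \mathcal{P}$ with $\preccurlyeq \subseteq \preccurlyeq_{a_1 a_2}$ and $a_1 \preccurlyeq_{a_1 a_2} a_2$.

Proof. For $x_1, x_2 \in S$, let

\[
x_1 \preccurlyeq_{a_1 a_2} x_2 \quad\text{if}\quad \exists s \in S:\ x_1 + s a_2 \preccurlyeq x_2 + s a_1.
\]

The relation $\preccurlyeq_{a_1 a_2}$ is reflexive, since $x + 0 \cdot a_2 \preccurlyeq x + 0 \cdot a_1$. The relation $\preccurlyeq_{a_1 a_2}$ is transitive: if $x_1 \preccurlyeq_{a_1 a_2} x_2$ and $x_2 \preccurlyeq_{a_1 a_2} x_3$, then $x_1 + s a_2 \preccurlyeq x_2 + s a_1$ and $x_2 + t a_2 \preccurlyeq x_3 + t a_1$ for some $s, t \in S$, and so $x_1 + (t + s) a_2 \preccurlyeq x_2 + t a_2 + s a_1 \preccurlyeq x_3 + t a_1 + s a_1 = x_3 + (t + s) a_1$. Thus $x_1 \preccurlyeq_{a_1 a_2} x_3$. We conclude that $\preccurlyeq_{a_1 a_2}$ is a preorder on $S$.

We prove that $\preccurlyeq_{a_1 a_2}$ is a Strassen preorder. If $x_1 \preccurlyeq_{a_1 a_2} x_2$ and $y_1 \preccurlyeq_{a_1 a_2} y_2$, then clearly $x_1 + y_1 \preccurlyeq_{a_1 a_2} x_2 + y_2$. If $x_1 \preccurlyeq_{a_1 a_2} x_2$ and $y \in S$, then $x_1 y \preccurlyeq_{a_1 a_2} x_2 y$. From this follows: if $x_1 \preccurlyeq_{a_1 a_2} x_2$ and $y_1 \preccurlyeq_{a_1 a_2} y_2$, then $x_1 y_1 \preccurlyeq_{a_1 a_2} x_2 y_1 \preccurlyeq_{a_1 a_2} x_2 y_2$.

Let $n, m \in \mathbb{N}$. If $n \leq m$, then $n \preccurlyeq m$, so $n \preccurlyeq_{a_1 a_2} m$. If $n \not\leq m$, then $n \geq m + 1$. Suppose $n \preccurlyeq_{a_1 a_2} m$. Let $s \in S$ with $n + s a_2 \preccurlyeq m + s a_1$. Adding $m + 1 \preccurlyeq n$ gives

\[
m + 1 + n + s a_2 \preccurlyeq n + m + s a_1.
\]

Since $\preccurlyeq = \precsim$, we may apply Lemma 2.4 (i) to obtain

\[
1 + s a_2 \preccurlyeq s a_1. \tag{2.5}
\]

From (2.5) follows $s \neq 0$. From (2.5) also follows

\[
s a_2 \preccurlyeq s a_1. \tag{2.6}
\]

Since $\preccurlyeq = \precsim$, we may apply Lemma 2.4 (ii) to (2.6) to obtain the contradiction

\[
a_2 \preccurlyeq a_1.
\]

Therefore $n \not\preccurlyeq_{a_1 a_2} m$. We conclude that $\preccurlyeq_{a_1 a_2}$ is a Strassen preorder, that is, $\preccurlyeq_{a_1 a_2} \in \mathcal{P}$.

Finally, we have $a_1 \preccurlyeq_{a_1 a_2} a_2$, since $a_1 + 1 \cdot a_2 \preccurlyeq a_2 + 1 \cdot a_1$. Also, if $x_1 \preccurlyeq x_2$, then $x_1 + 0 \cdot a_2 \preccurlyeq x_2 + 0 \cdot a_1$, that is, $\preccurlyeq \subseteq \preccurlyeq_{a_1 a_2}$.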

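Let $\preccurlyeq$ be a Strassen preorder. Let $\mathcal{P}_{\preccurlyeq}$ be the set of Strassen preorders containing $\preccurlyeq$, ordered by inclusion $\subseteq$. Let $C \subseteq \mathcal{P}_{\preccurlyeq}$ be any chain. Then the union of all preorders in $C$ is an element of $\mathcal{P}_{\preccurlyeq}$ and contains all elements of $C$. Therefore, by Zorn's lemma, $\mathcal{P}_{\preccurlyeq}$ contains a maximal element (maximal with respect to inclusion $\subseteq$).

Lemma 2.6. Let $\preccurlyeq$ be maximal in $\mathcal{P}$. Then $\preccurlyeq = \precsim$.

Proof. Trivially $\preccurlyeq \subseteq \precsim$. From Lemma 2.3 we know $\precsim \in \mathcal{P}$. From maximality of $\preccurlyeq$ follows $\preccurlyeq = \precsim$.

A relation $\preccurlyeq$ on $S$ is total if for all $a, b \in S$: $a \preccurlyeq b$ or $b \preccurlyeq a$.

Lemma 2.7. Let $\preccurlyeq$ be maximal in $\mathcal{P}$. Then $\preccurlyeq$ is total.

Proof. Suppose $\preccurlyeq$ is not total, say $a_1 \not\preccurlyeq a_2$ and $a_2 \not\preccurlyeq a_1$. Since $\preccurlyeq = \precsim$ by Lemma 2.6, Lemma 2.5 applies: there is an element $\preccurlyeq_{a_1 a_2} \in \mathcal{P}$ with $\preccurlyeq \subseteq \preccurlyeq_{a_1 a_2}$ and $a_1 \preccurlyeq_{a_1 a_2} a_2$. Then $\preccurlyeq$ is strictly contained in $\preccurlyeq_{a_1 a_2}$, which contradicts the maximality of $\preccurlyeq$. We conclude $\preccurlyeq$ is total.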

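2.6 The asymptotic spectrum $\mathbf{X}(S, \leqslant)$

Definition 2.8. Let $S$ be a semiring with $\mathbb{N} \subseteq S$ and let $\leqslant$ be a Strassen preorder on $S$. Let

\[
\mathbf{X}(S, \leqslant) = \{\varphi \in \mathrm{Hom}(S, \mathbb{R}_{\geq 0}) : a \leqslant b \Rightarrow \varphi(a) \leq \varphi(b)\}.
\]

We call $\mathbf{X}(S, \leqslant)$ the asymptotic spectrum of $(S, \leqslant)$. We call the elements of $\mathbf{X}(S, \leqslant)$ spectral points.

Lemma 2.9. Let $\preccurlyeq \in \mathcal{P}$ be total. There is exactly one semiring homomorphism $\varphi : S \to \mathbb{R}_{\geq 0}$ with

\[
a \preccurlyeq b \Rightarrow \varphi(a) \leq \varphi(b).
\]

Moreover, if $\preccurlyeq$ is maximal in $\mathcal{P}$, then

\[
a \preccurlyeq b \Leftrightarrow \varphi(a) \leq \varphi(b).
\]

Proof. Let $\preccurlyeq \in \mathcal{P}$ be total. For $a \in S$ define

\[
\varphi(a) = \inf\Bigl\{\frac{r}{s} : r, s \in \mathbb{N},\ sa \preccurlyeq r\Bigr\}, \qquad
\psi(a) = \sup\Bigl\{\frac{u}{v} : u, v \in \mathbb{N},\ u \preccurlyeq va\Bigr\}.
\]

We prove $\psi(a) \leq \varphi(a)$. Let $r, s, u, v \in \mathbb{N}$. Suppose $u \preccurlyeq va$ and $sa \preccurlyeq r$. Then $su \preccurlyeq vsa \preccurlyeq vr$. Thus $u/v \leq r/s$. We prove $\psi(a) \geq \varphi(a)$. Suppose $\psi(a) < \varphi(a)$. Let $r, s \in \mathbb{N}$ with $\psi(a) < r/s < \varphi(a)$. Then $sa \not\preccurlyeq r$. From totality follows $sa \succcurlyeq r$. Thus $\psi(a) \geq r/s$, which is a contradiction. We conclude $\psi(a) = \varphi(a)$.

Let $a, b \in S$. We prove $\varphi(a + b) \leq \varphi(a) + \varphi(b)$. Let $s_a, s_b, r_a, r_b \in \mathbb{N}$. Suppose $s_a a \preccurlyeq r_a$ and $s_b b \preccurlyeq r_b$. Then $s_a s_b a \preccurlyeq s_b r_a$ and $s_a s_b b \preccurlyeq s_a r_b$. By addition, $s_a s_b (a + b) \preccurlyeq s_b r_a + s_a r_b$. Thus $\varphi(a + b) \leq \frac{r_a}{s_a} + \frac{r_b}{s_b}$. We prove $\psi(a + b) \geq \psi(a) + \psi(b)$. Suppose $u_a \preccurlyeq v_a a$ and $u_b \preccurlyeq v_b b$. Then $v_b u_a \preccurlyeq v_a v_b a$ and $v_a u_b \preccurlyeq v_a v_b b$. By addition, $v_b u_a + v_a u_b \preccurlyeq v_a v_b (a + b)$. Thus $\psi(a + b) \geq \frac{u_a}{v_a} + \frac{u_b}{v_b}$. We thus have additivity.

We prove $\varphi(ab) \leq \varphi(a) \varphi(b)$. Suppose $s_a a \preccurlyeq r_a$ and $s_b b \preccurlyeq r_b$. Then $s_a s_b ab \preccurlyeq r_a r_b$. Thus $\varphi(ab) \leq \frac{r_a}{s_a} \frac{r_b}{s_b}$. We prove $\psi(ab) \geq \psi(a) \psi(b)$. Suppose $u_a \preccurlyeq v_a a$ and $u_b \preccurlyeq v_b b$. Then $u_a u_b \preccurlyeq v_a v_b ab$. Thus $\frac{u_a}{v_a} \frac{u_b}{v_b} \leq \psi(ab)$. We thus have multiplicativity.

We prove monotonicity: $a \preccurlyeq b \Rightarrow \varphi(a) \leq \varphi(b)$. Suppose $s_b b \preccurlyeq r_b$. From $a \preccurlyeq b$ follows $s_b a \preccurlyeq s_b b \preccurlyeq r_b$. Thus $\varphi(a) \leq \frac{r_b}{s_b}$.

We prove $\varphi(1) = 1$. Trivially $1 \preccurlyeq 1$. Therefore $\varphi(1) \leq \frac{1}{1} = 1$ and $\psi(1) \geq \frac{1}{1} = 1$.

We prove $\varphi(0) = 0$. Trivially $s_a 0 \preccurlyeq 0$, so $\varphi(0) \leq \frac{0}{s_a} = 0$. Trivially $0 \preccurlyeq v_a 0$, so $\varphi(0) \geq \frac{0}{v_a} = 0$.

We prove the uniqueness of $\varphi$. Let $\varphi_1, \varphi_2$ be semiring homomorphisms $S \to \mathbb{R}_{\geq 0}$ with $a \preccurlyeq b \Rightarrow \varphi_i(a) \leq \varphi_i(b)$. Suppose $\varphi_1(a) < \varphi_2(a)$. Let $u, v \in \mathbb{N}$ with $\varphi_1(a) < \frac{u}{v} < \varphi_2(a)$. Then $va \not\preccurlyeq u$, so by totality $va \succcurlyeq u$. Thus $\varphi_1(a) \geq \frac{u}{v}$, which is a contradiction. This proves uniqueness.

Finally, suppose $\preccurlyeq$ is maximal in $\mathcal{P}$. Lemma 2.6 gives $\preccurlyeq = \precsim$. Let $a \not\preccurlyeq b$. From Lemma 2.4 (iv) follows $\exists n$: $na \not\preccurlyeq nb + 1$. By totality, $na \succcurlyeq nb + 1$. Apply $\varphi$ to get $\varphi(a) \geq \varphi(b) + \frac{1}{n}$. In particular, $\varphi(a) > \varphi(b)$.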

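Lemma 2.10. The map

\[
\mathbf{X}(S, \leqslant) \to \{\text{maximal elements in } \mathcal{P}_{\leqslant}\} : \varphi \mapsto \preccurlyeq_\varphi
\]

with $a \preccurlyeq_\varphi b$ iff $\varphi(a) \leq \varphi(b)$ is a bijection.

Proof. Let $\varphi \in \mathbf{X}(S, \leqslant)$. One verifies that $\preccurlyeq_\varphi$ is a Strassen preorder and $\leqslant \subseteq \lesssim \subseteq \preccurlyeq_\varphi$. Let $\preccurlyeq$ be maximal in $\mathcal{P}_{\preccurlyeq_\varphi}$. Lemma 2.7 says that $\preccurlyeq$ is total. By Lemma 2.9 there is a $\psi \in \mathbf{X}(S, \leqslant)$ with $\preccurlyeq \subseteq \preccurlyeq_\psi$. Clearly $\preccurlyeq_\varphi \subseteq \preccurlyeq_\psi$. The uniqueness statement of Lemma 2.9 implies $\varphi = \psi$. This means $\preccurlyeq_\varphi = \preccurlyeq$, that is, $\preccurlyeq_\varphi$ is maximal. We conclude that the map is well defined.

Let $\preccurlyeq$ be maximal in $\mathcal{P}_{\leqslant}$. Then $\preccurlyeq$ is total. By Lemma 2.9 there is a $\varphi \in \mathbf{X}(S, \leqslant)$ with $\preccurlyeq \subseteq \preccurlyeq_\varphi$. We conclude the map is surjective.

Let $\varphi, \psi \in \mathbf{X}(S, \leqslant)$ with $\preccurlyeq_\varphi = \preccurlyeq_\psi$. From Lemma 2.9 follows $\varphi = \psi$. We conclude the map is injective.

Lemma 2.11. Let $a, b \in S$. Then $a \lesssim b$ iff $a \preccurlyeq b$ for all maximal $\preccurlyeq \in \mathcal{P}_{\leqslant}$.

Proof. Let $\preccurlyeq \in \mathcal{P}_{\leqslant}$ be maximal. Then $\lesssim \subseteq \precsim = \preccurlyeq$ by Lemma 2.6, so $a \lesssim b$ implies $a \preccurlyeq b$.

Suppose $a \not\lesssim b$. Let $n \in \mathbb{N}_{\geq 1}$ with $na \not\lesssim nb + 1$ (Lemma 2.4 (iv)). By Lemma 2.5 there is an element $\preccurlyeq_{nb+1,\,na} \in \mathcal{P}$ with $\lesssim \subseteq \preccurlyeq_{nb+1,\,na}$, and we may assume $\preccurlyeq_{nb+1,\,na}$ is maximal. Then $nb + 1 \preccurlyeq_{nb+1,\,na} na$, and so $a \not\preccurlyeq_{nb+1,\,na} b$.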

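2.7 The representation theorem

The following theorem is the main theorem.

Theorem 2.12 ([Str88, Th. 2.4]). Let $S$ be a commutative semiring with $\mathbb{N} \subseteq S$ and let $\leqslant$ be a Strassen preorder on $S$. Let $\mathbf{X} = \mathbf{X}(S, \leqslant)$ be the set of $\leqslant$-monotone semiring homomorphisms from $S$ to $\mathbb{R}_{\geq 0}$,

\[
\mathbf{X} = \mathbf{X}(S, \leqslant) = \{\varphi \in \mathrm{Hom}(S, \mathbb{R}_{\geq 0}) : \forall a, b \in S\ a \leqslant b \Rightarrow \varphi(a) \leq \varphi(b)\}.
\]

For $a, b \in S$, let $a \lesssim b$ if there is a sequence $(x_N) \in \mathbb{N}^{\mathbb{N}}$ with $x_N^{1/N} \to 1$ when $N \to \infty$ such that $\forall N \in \mathbb{N}$: $a^N \leqslant b^N x_N$. Then

\[
\forall a, b \in S:\quad a \lesssim b \ \text{ iff }\ \forall \varphi \in \mathbf{X}\ \varphi(a) \leq \varphi(b).
\]

Proof. Let $a, b \in S$. Suppose $a \lesssim b$. Then clearly for all $\varphi \in \mathbf{X}$ we have $\varphi(a) \leq \varphi(b)$. Suppose $a \not\lesssim b$. By Lemma 2.11 there is a maximal $\preccurlyeq \in \mathcal{P}_{\leqslant}$ with $a \not\preccurlyeq b$. By Lemma 2.10 there is a $\varphi \in \mathbf{X}$ with $\varphi(a) > \varphi(b)$.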

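2.8 Abstract rank and subrank $\mathrm{R}$, $\mathrm{Q}$

We generalise the notions of rank and subrank for tensors to arbitrary semirings with a Strassen preorder. Let $a \in S$. Define the rank

\[
\mathrm{R}(a) = \min\{r \in \mathbb{N} : a \leqslant r\}
\]

and the subrank

\[
\mathrm{Q}(a) = \max\{r \in \mathbb{N} : r \leqslant a\}.
\]

Then $\mathrm{Q}(a) \leq \mathrm{R}(a)$. Define the asymptotic rank

\[
\underset{\sim}{\mathrm{R}}(a) = \lim_{N \to \infty} \mathrm{R}(a^N)^{1/N}.
\]

Define the asymptotic subrank

\[
\underset{\sim}{\mathrm{Q}}(a) = \lim_{N \to \infty} \mathrm{Q}(a^N)^{1/N}.
\]

By Fekete's lemma (Lemma 2.2), asymptotic rank is an infimum and asymptotic subrank is a supremum, as follows:

\[
\underset{\sim}{\mathrm{R}}(a) = \inf_N \mathrm{R}(a^N)^{1/N},
\]

\[
\underset{\sim}{\mathrm{Q}}(a) = \sup_N \mathrm{Q}(a^N)^{1/N} \quad\text{when } a = 0 \text{ or } a \geqslant 1.
\]

Theorem 2.12 implies that the asymptotic rank and asymptotic subrank have the following dual characterisation in terms of the asymptotic spectrum. (This is a straightforward generalisation of [Str88, Th. 3.8].)

Corollary 2.13 (cf. [Str88, Th. 3.8]). For $a \in S$ with $\exists \varphi \in \mathbf{X}$: $\varphi(a) \geq 1$,

\[
\underset{\sim}{\mathrm{R}}(a) = \max_{\varphi \in \mathbf{X}} \varphi(a).
\]

Proof. Let $\varphi \in \mathbf{X}$. For $N \in \mathbb{N}$, $\mathrm{R}(a^N) \geq \varphi(a)^N$. Therefore $\underset{\sim}{\mathrm{R}}(a) \geq \varphi(a)$, and so $\underset{\sim}{\mathrm{R}}(a) \geq \max_{\varphi \in \mathbf{X}} \varphi(a)$. It remains to prove $\underset{\sim}{\mathrm{R}}(a) \leq \max_{\varphi \in \mathbf{X}} \varphi(a)$. We let $x = \max_{\varphi \in \mathbf{X}} \varphi(a)$. By assumption $x \geq 1$. By definition of $x$ we have

\[
\forall \varphi \in \mathbf{X}:\ \varphi(a) \leq x.
\]

Take the $m$th power on both sides:

\[
\forall \varphi \in \mathbf{X},\ m \in \mathbb{N}:\ \varphi(a^m) \leq x^m.
\]

Take the ceiling on the right-hand side:

\[
\forall \varphi \in \mathbf{X},\ m \in \mathbb{N}:\ \varphi(a^m) \leq \lceil x^m \rceil.
\]

Apply Theorem 2.12 to get asymptotic preorders

\[
\forall m \in \mathbb{N}:\ a^m \lesssim \lceil x^m \rceil.
\]

Then, by definition of asymptotic preorder,

\[
\forall m, N \in \mathbb{N}:\ a^{mN} \leqslant \lceil x^m \rceil^N 2^{\varepsilon_{m,N}} \quad\text{for some } \varepsilon_{m,N} \in o(N).
\]

Then

\[
\forall m, N \in \mathbb{N}:\ \mathrm{R}(a^{mN})^{1/mN} \leq \lceil x^m \rceil^{1/m} 2^{\varepsilon_{m,N}/mN}.
\]

From $x \geq 1$ follows $\lceil x^m \rceil^{1/m} \to x$ when $m \to \infty$. Choose $m = m(N)$ with $m(N) \to \infty$ as $N \to \infty$ and $\varepsilon_{m(N),N} \in o(N)$ to get $\underset{\sim}{\mathrm{R}}(a) = \inf_N \mathrm{R}(a^N)^{1/N} \leq x$.

Corollary 2.14 (cf. [Str88, Th. 3.8]). For $a \in S$ with $\exists k \in \mathbb{N}$: $a^k \geqslant 2$,

\[
\underset{\sim}{\mathrm{Q}}(a) = \min_{\varphi \in \mathbf{X}} \varphi(a).
\]

Proof. Let $\varphi \in \mathbf{X}$. For $N \in \mathbb{N}$, $\mathrm{Q}(a^N) \leq \varphi(a)^N$. Therefore $\underset{\sim}{\mathrm{Q}}(a) \leq \varphi(a)$, so $\underset{\sim}{\mathrm{Q}}(a) \leq \min_{\varphi \in \mathbf{X}} \varphi(a)$. It remains to prove $\underset{\sim}{\mathrm{Q}}(a) \geq \min_{\varphi \in \mathbf{X}} \varphi(a)$. Let $y = \min_{\varphi \in \mathbf{X}} \varphi(a)$. From the assumption $a^k \geqslant 2$ follows $y > 1$. By definition of $y$ we have

\[
\forall \varphi \in \mathbf{X}:\ \varphi(a) \geq y.
\]

Take the $m$th power on both sides:

\[
\forall \varphi \in \mathbf{X},\ m \in \mathbb{N}:\ \varphi(a^m) \geq y^m.
\]

Take the floor on the right-hand side:

\[
\forall \varphi \in \mathbf{X},\ m \in \mathbb{N}:\ \varphi(a^m) \geq \lfloor y^m \rfloor.
\]

Apply Theorem 2.12 to get asymptotic preorders

\[
\forall m \in \mathbb{N}:\ a^m \gtrsim \lfloor y^m \rfloor.
\]

Then, by definition of asymptotic preorder,

\[
\forall m, N \in \mathbb{N}:\ a^{mN} 2^{\varepsilon_{m,N}} \geqslant \lfloor y^m \rfloor^N \quad\text{for some } \varepsilon_{m,N} \in o(N).
\]

Now we use $a^k \geqslant 2$ to get

\[
\forall m, N \in \mathbb{N}:\ a^{mN + k\varepsilon_{m,N}} \geqslant \lfloor y^m \rfloor^N.
\]

Then

\[
\forall m, N \in \mathbb{N}:\ \mathrm{Q}\bigl(a^{mN + k\varepsilon_{m,N}}\bigr)^{\frac{1}{mN + k\varepsilon_{m,N}}} \geq \lfloor y^m \rfloor^{\frac{N}{mN + k\varepsilon_{m,N}}}.
\]

Choose $m = m(N)$ with $m(N) \to \infty$ as $N \to \infty$ and $\varepsilon_{m(N),N} \in o(N)$ to obtain $\underset{\sim}{\mathrm{Q}}(a) = \sup_N \mathrm{Q}(a^N)^{1/N} \geq y$.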

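2.9 Topological aspects

Theorem 2.12 does not tell the full story. Namely, there is also a topological component, which we will now discuss. Let $S$ be a semiring with $\mathbb{N} \subseteq S$. Let $\leqslant$ be a Strassen preorder on $S$. Let $\mathbf{X} = \mathbf{X}(S, \leqslant)$ be the asymptotic spectrum of $(S, \leqslant)$. For $a \in S$, let

\[
\hat{a} : \mathbf{X} \to \mathbb{R}_{\geq 0} : \varphi \mapsto \varphi(a). \tag{2.7}
\]

The map $\hat{a}$ simply evaluates a given homomorphism $\varphi$ at $a$. One may think of $\hat{a}$ as the collection $(\varphi(a))_{\varphi \in \mathbf{X}}$ of all evaluations of the elements of $\mathbf{X}$ at $a$. Let $\mathbb{R}_{\geq 0}$ have the Euclidean topology. Endow $\mathbf{X}$ with the weak topology with respect to the family of functions $\hat{a}$, $a \in S$. That is, endow $\mathbf{X}$ with the coarsest topology such that each $\hat{a}$ becomes continuous.

Let $C(\mathbf{X}, \mathbb{R}_{\geq 0})$ be the semiring of continuous functions $\mathbf{X} \to \mathbb{R}_{\geq 0}$ with addition and multiplication defined pointwise on $\mathbf{X}$, that is, $(f + g)(x) = f(x) + g(x)$ and $(f \cdot g)(x) = f(x) g(x)$ for $f, g \in C(\mathbf{X}, \mathbb{R}_{\geq 0})$ and $x \in \mathbf{X}$. Define the semiring homomorphism

\[
\Phi : S \to C(\mathbf{X}, \mathbb{R}_{\geq 0}) : a \mapsto \hat{a},
\]

which maps $a$ to the evaluator $\hat{a}$ defined in (2.7).

Theorem 2.15 ([Str88, Th. 2.4]).

(i) $\mathbf{X}$ is a nonempty compact Hausdorff space.

(ii) $\forall a, b \in S$: $a \lesssim b$ iff $\Phi(a) \leq \Phi(b)$ pointwise on $\mathbf{X}$.

(iii) $\Phi(S)$ separates the points of $\mathbf{X}$.

Proof. Statement (ii) follows from Theorem 2.12. Statement (iii) is clear. We prove statement (i). We have $2 \not\lesssim 1$, so from Theorem 2.12 follows that $\mathbf{X}$ cannot be empty.

For $a \in S$, let $n_a \in \mathbb{N}$ with $a \leqslant n_a$. Then for $\varphi \in \mathbf{X}$, $\varphi(a) \leq n_a$, and so $\varphi(a) \in [0, n_a]$. Embed $\mathbf{X} \subseteq \prod_{a \in S} [0, n_a]$ as a set via $\varphi \mapsto (\varphi(a))_{a \in S}$. The set $\prod_{a \in S} [0, n_a]$ with the product topology is compact by the theorem of Tychonoff. To see that $\mathbf{X}$ is closed in $\prod_{a \in S} [0, n_a]$, we write $\mathbf{X}$ as an intersection of sets,

\[
\begin{aligned}
\mathbf{X} ={}& \Bigl\{\varphi \in \prod_{a \in S} [0, n_a] : \varphi(0) = 0\Bigr\} \cap \Bigl\{\varphi \in \prod_{a \in S} [0, n_a] : \varphi(1) = 1\Bigr\} \\
&\cap \bigcap_{b, c \in S} \Bigl\{\varphi \in \prod_{a \in S} [0, n_a] : \varphi(b + c) - \varphi(b) - \varphi(c) = 0\Bigr\} \\
&\cap \bigcap_{b, c \in S} \Bigl\{\varphi \in \prod_{a \in S} [0, n_a] : \varphi(bc) - \varphi(b)\varphi(c) = 0\Bigr\} \\
&\cap \bigcap_{\substack{b, c \in S:\\ b \leqslant c}} \Bigl\{\varphi \in \prod_{a \in S} [0, n_a] : \varphi(b) \leq \varphi(c)\Bigr\},
\end{aligned}
\]

and we observe that the intersected sets are closed:

\[
\begin{aligned}
\mathbf{X} ={}& \hat{0}^{-1}(0) \cap \hat{1}^{-1}(1) \\
&\cap \bigcap_{b, c \in S} \bigl(\widehat{b + c} - \hat{b} - \hat{c}\bigr)^{-1}(0) \\
&\cap \bigcap_{b, c \in S} \bigl(\widehat{bc} - \hat{b}\hat{c}\bigr)^{-1}(0) \\
&\cap \bigcap_{\substack{b, c \in S:\\ b \leqslant c}} \bigl(\hat{c} - \hat{b}\bigr)^{-1}([0, \infty)).
\end{aligned}
\]

This implies $\mathbf{X}$ is also compact.

Let $\varphi, \psi \in \mathbf{X}$ be distinct. Let $a \in S$ with $\varphi(a) \neq \psi(a)$. Then $\hat{a}(\varphi) \neq \hat{a}(\psi)$. Let $U \ni \hat{a}(\varphi)$ and $V \ni \hat{a}(\psi)$ be open and disjoint subsets of $\mathbb{R}_{\geq 0}$. Then $\hat{a}^{-1}(U)$ and $\hat{a}^{-1}(V)$ are open and disjoint subsets of $\mathbf{X}$. We conclude that $\mathbf{X}$ is Hausdorff.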

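2.10 Uniqueness

Let $S$ be a semiring with $\mathbb{N} \subseteq S$. Let $\leqslant$ be a Strassen preorder on $S$. Let $\mathbf{X} = \mathbf{X}(S, \leqslant)$ be the asymptotic spectrum of $(S, \leqslant)$. The object $\mathbf{X}$ is unique in the following sense.

Theorem 2.16 ([Str88, Cor. 2.7]). Let $\mathbf{Y}$ be a compact Hausdorff space. Let $\Psi : S \to C(\mathbf{Y}, \mathbb{R}_{\geq 0})$ be a homomorphism of semirings such that

\[
\Psi(S) \text{ separates the points of } \mathbf{Y} \tag{2.8}
\]

and

\[
\forall a, b \in S:\ a \lesssim b \Leftrightarrow \Psi(a) \leq \Psi(b) \text{ pointwise on } \mathbf{Y}. \tag{2.9}
\]

Then there is a unique homeomorphism (continuous bijection with continuous inverse) $h : \mathbf{Y} \to \mathbf{X}$ such that the diagram

\[
\begin{array}{ccc}
 & S & \\
 {\scriptstyle \Phi} \swarrow & & \searrow {\scriptstyle \Psi} \\
 C(\mathbf{X}, \mathbb{R}_{\geq 0}) & \xrightarrow{\ h^*\ } & C(\mathbf{Y}, \mathbb{R}_{\geq 0})
\end{array}
\tag{2.10}
\]

commutes, where $h^* : \varphi \mapsto \varphi \circ h$. Namely, let $h : y \mapsto (a \mapsto \Psi(a)(y))$.

Proof. We prove uniqueness. Suppose there are two such homeomorphisms

\[
h_1, h_2 : \mathbf{Y} \to \mathbf{X}.
\]

Suppose $x \neq h_2(h_1^{-1}(x))$ for some $x \in \mathbf{X}$. Since $\Phi(S)$ separates the points of $\mathbf{X}$, there is an $a \in S$ with $\Phi(a)(x) \neq \Phi(a)(h_2(h_1^{-1}(x)))$. Let $y = h_1^{-1}(x) \in \mathbf{Y}$. Then $\Phi(a)(h_1(y)) \neq \Phi(a)(h_2(y))$. Since (2.10) commutes, $\Phi(a)(h_1(y)) = \Psi(a)(y)$ and $\Phi(a)(h_2(y)) = \Psi(a)(y)$, a contradiction.

We prove existence. Let $h : \mathbf{Y} \to \mathbf{X} : y \mapsto (a \mapsto \Psi(a)(y))$. One verifies that $h$ is well defined, continuous and injective, and that the diagram in (2.10) commutes. It remains to show that $h$ is surjective. We know that $\mathbb{Q} \cdot \Phi(S)$ is a $\mathbb{Q}$-subalgebra of $C(\mathbf{X}, \mathbb{R})$ which separates points and which contains the nonzero constant function $\Phi(1)$, so by the Stone–Weierstrass theorem, $\mathbb{Q} \cdot \Phi(S)$ is dense in $C(\mathbf{X}, \mathbb{R})$ under the sup-norm. Suppose $h$ is not surjective. Then $h(\mathbf{Y}) \subsetneq \mathbf{X}$ is a proper closed subset. Let $x_0 \in \mathbf{X} \setminus h(\mathbf{Y})$ be in the complement. Since $\mathbf{X}$ is a compact Hausdorff space, there is a continuous function $f : \mathbf{X} \to [-1, 1]$ with

\[
f(h(\mathbf{Y})) = 1, \qquad f(x_0) = -1.
\]

We know that $f$ can be approximated by elements from $\mathbb{Q} \cdot \Phi(S)$, i.e. let $\varepsilon > 0$, then there are $a_1, a_2 \in S$, $N \in \mathbb{N}$ such that

\[
\tfrac{1}{N} \bigl(\Phi(a_1)(x) - \Phi(a_2)(x)\bigr) > 1 - \varepsilon \quad\text{for all } x \in h(\mathbf{Y}),
\]

\[
\tfrac{1}{N} \bigl(\Phi(a_1)(x_0) - \Phi(a_2)(x_0)\bigr) < -1 + \varepsilon.
\]

This means $\Psi(a_1) \geq \Psi(a_2)$ pointwise on $\mathbf{Y}$, so $a_1 \gtrsim a_2$, but also $\Phi(a_1) \not\geq \Phi(a_2)$ pointwise on $\mathbf{X}$, so $a_1 \not\gtrsim a_2$. This is a contradiction.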

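2.11 Subsemirings

Let $S$ be a subsemiring of a semiring $T$ and let $\leqslant$ be a Strassen preorder on $T$. Then the restriction $\leqslant|_S$ is a Strassen preorder on $S$. How are the asymptotic spectra $\mathbf{X}(S, \leqslant|_S)$ and $\mathbf{X}(T, \leqslant)$ related? Obviously, for $\varphi \in \mathbf{X}(T, \leqslant)$ we have $\varphi|_S \in \mathbf{X}(S, \leqslant|_S)$. In fact, the uniqueness theorem of Section 2.10 implies that all elements of $\mathbf{X}(S, \leqslant|_S)$ are restrictions of elements of $\mathbf{X}(T, \leqslant)$.

Corollary 2.17. Let $S$ be a subsemiring of a semiring $T$. Let $\leqslant$ be a Strassen preorder on $T$. Then

\[
\mathbf{X}(S, \leqslant|_S) = \mathbf{X}(T, \leqslant)|_S.
\]

Proof. Let

\[
\mathbf{X} = \mathbf{X}(S, \leqslant|_S), \qquad \Phi : S \to C(\mathbf{X}, \mathbb{R}_{\geq 0}) : a \mapsto \hat{a},
\]

and let

\[
\mathbf{Y} = \mathbf{X}(T, \leqslant)|_S = \{\varphi|_S : \varphi \in \mathbf{X}(T, \leqslant)\}, \qquad \Psi : S \to C(\mathbf{Y}, \mathbb{R}_{\geq 0}) : a \mapsto \bigl(\varphi|_S \mapsto \varphi|_S(a)\bigr).
\]

Then $\mathbf{Y}$ is a compact Hausdorff space. Let $\varphi|_S, \psi|_S \in \mathbf{Y}$ be distinct. Then there is an $a \in S$ with $\varphi|_S(a) \neq \psi|_S(a)$, so (2.8) holds. For $a, b \in S$: $a \lesssim b$ iff $\Phi(a) \leq \Phi(b)$ iff $\Psi(a) \leq \Psi(b)$, so (2.9) holds. Therefore

\[
h : \mathbf{X}(T, \leqslant)|_S \to \mathbf{X}(S, \leqslant|_S) : \varphi|_S \mapsto \bigl(a \mapsto \Psi(a)(\varphi|_S)\bigr) = \varphi|_S
\]

is a homeomorphism.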

2.12 Subsemirings generated by one element

Let $S$ be a semiring and let $\leqslant$ be a Strassen preorder on $S$. We specialise to the simplest type of subsemiring of $S$. Namely, let $a \in S$ and let

\[
\mathbb{N}[a] = \Bigl\{\sum_{i=0}^{k} n_i a^i : k \in \mathbb{N},\ n_i \in \mathbb{N}\Bigr\} \subseteq S
\]

be the subsemiring of $S$ generated by $a$. We call $\mathbf{X}(\mathbb{N}[a]) = \mathbf{X}(\mathbb{N}[a], \leqslant|_{\mathbb{N}[a]})$ the asymptotic spectrum of $a$.

Corollary 2.18 (cf. [Str88]). If $a^k \geqslant 2$ for some $k \in \mathbb{N}$, then

\[
\underset{\sim}{\mathrm{Q}} \in \mathbf{X}(\mathbb{N}[a]).
\]

If $\varphi(a) \geq 1$ for some $\varphi \in \mathbf{X}$, then

\[
\underset{\sim}{\mathrm{R}} \in \mathbf{X}(\mathbb{N}[a]).
\]

Proof. Let $\mathbf{X} = \mathbf{X}(\mathbb{N}[a])$. Let $n_1, \ldots, n_q \in \mathbb{N}$. By Corollary 2.14,

\[
\underset{\sim}{\mathrm{Q}}(a^{n_1} + \cdots + a^{n_q}) = \min_{\varphi \in \mathbf{X}} \varphi(a^{n_1} + \cdots + a^{n_q}).
\]

Since $\varphi$ is a homomorphism, $\varphi(a^{n_1} + \cdots + a^{n_q}) = \varphi(a)^{n_1} + \cdots + \varphi(a)^{n_q}$. Now we observe that $x^{n_1} + \cdots + x^{n_q}$ is nondecreasing in $x \geq 0$, and hence is minimised by taking $x$ minimal in the domain. We conclude

\[
\underset{\sim}{\mathrm{Q}}(a^{n_1} + \cdots + a^{n_q}) = \sum_{i=1}^{q} \Bigl(\min_{\varphi \in \mathbf{X}} \varphi(a)\Bigr)^{n_i} = \underset{\sim}{\mathrm{Q}}(a)^{n_1} + \cdots + \underset{\sim}{\mathrm{Q}}(a)^{n_q}.
\]

The claim for asymptotic rank $\underset{\sim}{\mathrm{R}}$ similarly follows from Corollary 2.13.

Remark 2.19. In general, asymptotic subrank $\underset{\sim}{\mathrm{Q}}$ and asymptotic rank $\underset{\sim}{\mathrm{R}}$ are not elements of the asymptotic spectrum. We will see an example in Chapter 4, related to the matrix multiplication tensor.

Remark 2.20. Corollary 2.18 is closely related to Schönhage's $\tau$-theorem for tensors, also called Schönhage's asymptotic sum inequality. The $\tau$-theorem features in every recent fast matrix multiplication algorithm (i.e. every algorithm based on the laser method).

Remark 2.21. An element $\varphi \in \mathbf{X}(\mathbb{N}[a])$ is uniquely determined by the value of $\varphi(a) \in \mathbb{R}_{\geq 0}$. We may thus identify the asymptotic spectrum $\mathbf{X}(\mathbb{N}[a])$ with a compact (i.e. closed and bounded) subset of the nonnegative reals $\mathbb{R}_{\geq 0}$ via $\varphi \mapsto \varphi(a)$.

2.13 Universal spectral points

Having discussed the simplest type of subsemiring in the previous section, let us discuss the most difficult type of supersemiring. When applying the theory of asymptotic spectra to some setting, there is a natural largest semiring $S$ in which the objects of study live. For example, we may study the semiring $S$ of all (equivalence classes of) 3-tensors of arbitrary format over $\mathbb{F}$. Or we may study the semiring $S$ of all (isomorphism classes of) finite simple graphs. We refer to the elements of the asymptotic spectrum $\mathbf{X}(S)$ of the "ambient" semiring $S$ by the term universal spectral points (cf. [Str88, page 119]). The universal spectral points are the most useful monotone homomorphisms.

2.14 Conclusion

To a semiring $S$ with a Strassen preorder $\leqslant$ we associated an asymptotic preorder $\lesssim$. We proved that this asymptotic preorder is characterised by the $\leqslant$-monotone semiring homomorphisms $S \to \mathbb{R}_{\geq 0}$, which make up the asymptotic spectrum $\mathbf{X}(S, \leqslant)$ of $(S, \leqslant)$. For $(S, \leqslant)$ we naturally have a rank function $\mathrm{R} : S \to \mathbb{N}$ and a subrank function $\mathrm{Q} : S \to \mathbb{N}$. Their asymptotic versions $\underset{\sim}{\mathrm{R}}(a) = \inf_n \mathrm{R}(a^n)^{1/n}$ and $\underset{\sim}{\mathrm{Q}}(a) = \sup_n \mathrm{Q}(a^n)^{1/n}$ coincide with $\max_{\varphi \in \mathbf{X}(S, \leqslant)} \varphi(a)$ and $\min_{\varphi \in \mathbf{X}(S, \leqslant)} \varphi(a)$ respectively, assuming $\exists \varphi \in \mathbf{X}$: $\varphi(a) \geq 1$ and $\exists k \in \mathbb{N}$: $a^k \geqslant 2$ respectively. Unfortunately, we have proved the existence of the asymptotic spectrum by nonconstructive means. Explicitly constructing spectral points for a given pair $(S, \leqslant)$ will be a challenging task.

Some remarks about our proof in this chapter. The proof in [Str88] uses the Kadison–Dubois theorem from the paper of Becker and Schwartz [BS83] as a black box. Our presentation basically integrates the proof of Strassen with the proof of Becker and Schwartz. The notions of rank and subrank were in [Str88] only discussed for tensors. We considered the straightforward generalisation to arbitrary semirings with a Strassen preorder. An evident feature of our presentation is that we do not pass from the semiring to its Grothendieck ring, but instead stay in the semiring. In this way we stay close to the "real world" objects. I thank Jop Briët and Lex Schrijver for this idea. There is a large body of literature on the Kadison–Dubois theorem, for which we refer to the modern books by Prestel and Delzell [PD01, Theorem 5.2.6] and Marshall [Mar08, Theorem 5.4.4].

Chapter 3

The asymptotic spectrum of graphs; Shannon capacity

This chapter is based on the manuscript [Zui18].

3.1 Introduction

This chapter is about the Shannon capacity of graphs, which was introduced by Claude Shannon in the context of coding theory [Sha56]. More precisely, we will apply the theory of asymptotic spectra of Chapter 2 to gain a better understanding of Shannon capacity (and other asymptotic properties of graphs).

We first recall the definition of the Shannon capacity of a graph. Let $G$ be a (finite simple) graph with vertex set $V(G)$ and edge set $E(G)$. An independent set or stable set in $G$ is a subset of $V(G)$ that contains no edges. The independence number or stability number $\alpha(G)$ is the cardinality of the largest independent set in $G$. For graphs $G$ and $H$, the and-product $G \boxtimes H$, also called strong graph product, is defined by

\[
\begin{aligned}
V(G \boxtimes H) &= V(G) \times V(H) \\
E(G \boxtimes H) &= \bigl\{\{(g, h), (g', h')\} : \bigl(\{g, g'\} \in E(G) \text{ or } g = g'\bigr) \text{ and } \bigl(\{h, h'\} \in E(H) \text{ or } h = h'\bigr) \text{ and } (g, h) \neq (g', h')\bigr\}.
\end{aligned}
\]

The Shannon capacity $\Theta(G)$ is defined as the limit

\[
\Theta(G) = \lim_{N \to \infty} \alpha(G^{\boxtimes N})^{1/N}. \tag{3.1}
\]

This limit exists and equals the supremum $\sup_N \alpha(G^{\boxtimes N})^{1/N}$ by Fekete's lemma (Lemma 2.2).
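To make the definition concrete, the following sketch looks at the 5-cycle $C_5$. It computes $\alpha(C_5)$ by brute force and verifies that a specific set of five vertices is independent in $C_5 \boxtimes C_5$, so that $\alpha(C_5^{\boxtimes 2}) \geq 5$ and hence $\Theta(C_5) \geq \sqrt{5} > \alpha(C_5)$. The vertex labelling $0, \ldots, 4$ and all function names are assumptions made for this illustration.

```python
from itertools import combinations

def cycle_edges(k):
    """Edge set of the k-cycle on vertices 0, ..., k-1."""
    return {frozenset((i, (i + 1) % k)) for i in range(k)}

def adjacent_strong(u, v, edges_g, edges_h):
    """Adjacency in the strong product, following the definition of its edge set."""
    (g, h), (g2, h2) = u, v
    ok_g = g == g2 or frozenset((g, g2)) in edges_g
    ok_h = h == h2 or frozenset((h, h2)) in edges_h
    return u != v and ok_g and ok_h

def independence_number(vertices, adjacent):
    """Brute-force independence number; only suitable for very small graphs."""
    vertices = list(vertices)
    for size in range(len(vertices), 0, -1):
        for subset in combinations(vertices, size):
            if not any(adjacent(u, v) for u, v in combinations(subset, 2)):
                return size
    return 0

e5 = cycle_edges(5)
alpha_c5 = independence_number(range(5), lambda u, v: frozenset((u, v)) in e5)

# Candidate independent set of size 5 in the strong product of C5 with itself.
candidate = [(i, (2 * i) % 5) for i in range(5)]
candidate_is_independent = not any(
    adjacent_strong(u, v, e5, e5) for u, v in combinations(candidate, 2))

print(alpha_c5, candidate_is_independent)
```

The check only gives a lower bound on $\Theta(C_5)$; a matching upper bound needs a different argument.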

Computing the Shannon capacity is nontrivial already for small graphs. Lovász in [Lov79] computed the value $\Theta(C_5) = \sqrt{5}$, where $C_k$ denotes the $k$-cycle graph, by introducing and evaluating a new graph parameter $\vartheta$ which is now known as the Lovász theta number. The value of $\Theta(C_7)$, for example, is currently not known. The Shannon capacity $\Theta$ is not known to be hard to compute in the sense of computational complexity. On the other hand, deciding whether $\alpha(G) \leq k$, given a graph $G$ and $k \in \mathbb{N}$, is NP-complete [Kar72].

New result: dual description of Shannon capacity

The new result of this chapter is a dual characterisation of the Shannon capacity of graphs. This characterisation is obtained by applying Strassen's theory of asymptotic spectra of Chapter 2. Thus this chapter also serves as an illustration of the theory of asymptotic spectra.

To state the theorem we need the standard notions graph homomorphism, graph complement and graph disjoint union. Let $G$ and $H$ be graphs. A graph homomorphism $f : G \to H$ is a map $f : V(G) \to V(H)$ such that for all $u, v \in V(G)$, if $\{u, v\} \in E(G)$, then $\{f(u), f(v)\} \in E(H)$. In other words, a graph homomorphism maps edges to edges. The complement $\overline{G}$ of $G$ is defined by

\[
\begin{aligned}
V(\overline{G}) &= V(G) \\
E(\overline{G}) &= \bigl\{\{u, v\} : \{u, v\} \notin E(G),\ u \neq v\bigr\}.
\end{aligned}
\]

We define a relation $\leqslant$ on graphs: let $G \leqslant H$ if there is a graph homomorphism $\overline{G} \to \overline{H}$ from the complement of $G$ to the complement of $H$. The disjoint union $G \sqcup H$ is defined by

\[
\begin{aligned}
V(G \sqcup H) &= V(G) \sqcup V(H) \\
E(G \sqcup H) &= E(G) \sqcup E(H).
\end{aligned}
\]

For $n \in \mathbb{N}$, the complete graph $K_n$ is the graph with $V(K_n) = [n] = \{1, 2, \ldots, n\}$ and $E(K_n) = \{\{i, j\} : i, j \in [n],\ i \neq j\}$. Thus $\overline{K_0} = K_0$ is the empty graph and $\overline{K_1} = K_1$ is the graph consisting of a single vertex and no edges.
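The relation $G \leqslant H$ can be tested by brute force on very small graphs. The sketch below assumes graphs are given by a vertex count $n$ (vertices $0, \ldots, n-1$) and a set of two-element frozensets as edges; this encoding and the names `complement`, `has_homomorphism` and `leq` are choices made here for illustration only.

```python
from itertools import combinations, product

def complement(n_vertices, edges):
    """Edge set of the complement graph on vertices 0, ..., n_vertices - 1."""
    return {frozenset(p) for p in combinations(range(n_vertices), 2)} - set(edges)

def has_homomorphism(n_g, edges_g, n_h, edges_h):
    """Brute-force search for a map V(G) -> V(H) that sends edges to edges."""
    for f in product(range(n_h), repeat=n_g):
        if all(frozenset((f[u], f[v])) in edges_h for u, v in map(tuple, edges_g)):
            return True
    return False

def leq(n_g, edges_g, n_h, edges_h):
    """G <= H iff there is a homomorphism from complement(G) to complement(H)."""
    return has_homomorphism(n_g, complement(n_g, edges_g),
                            n_h, complement(n_h, edges_h))

no_edges = set()
c5 = {frozenset((i, (i + 1) % 5)) for i in range(5)}

# Edgeless graphs behave like natural numbers under the relation.
print(leq(2, no_edges, 3, no_edges), leq(3, no_edges, 2, no_edges))
# The edgeless graph on r vertices is below C5 exactly when r <= alpha(C5) = 2.
print(leq(2, no_edges, 5, c5), leq(3, no_edges, 5, c5))
```

The search enumerates all $|V(H)|^{|V(G)|}$ maps, so it is meant only as a check of the definition on toy instances.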

Theorem 3.1. Let $S \subseteq \{\text{graphs}\}$ be a collection of graphs which is closed under the disjoint union $\sqcup$ and the strong graph product $\boxtimes$, and which contains the graph with a single vertex, $K_1$. Define the asymptotic spectrum $\mathbf{X}(S)$ as the set of all maps $\varphi : S \to \mathbb{R}_{\geq 0}$ such that, for all $G, H \in S$:

(1) if $G \leqslant H$, then $\varphi(G) \leq \varphi(H)$

(2) $\varphi(G \sqcup H) = \varphi(G) + \varphi(H)$

(3) $\varphi(G \boxtimes H) = \varphi(G) \varphi(H)$

(4) $\varphi(K_1) = 1$.

Let $G \lesssim H$ if there is a sequence $(x_N) \in \mathbb{N}^{\mathbb{N}}$ with $x_N^{1/N} \to 1$ when $N \to \infty$ such that for every $N \in \mathbb{N}$

\[
G^{\boxtimes N} \leqslant (H^{\boxtimes N})^{\sqcup x_N} = \underbrace{H^{\boxtimes N} \sqcup \cdots \sqcup H^{\boxtimes N}}_{x_N}.
\]

Then

(i) $G \lesssim H$ iff $\forall \varphi \in \mathbf{X}(S)$: $\varphi(G) \leq \varphi(H)$

(ii) $\Theta(G) = \min_{\varphi \in \mathbf{X}(S)} \varphi(G)$.

Statement (ii) of Theorem 3.1 is nontrivial in the sense that $\Theta$ is not an element of $\mathbf{X}(\{\text{graphs}\})$. Namely, $\Theta$ is not additive under $\sqcup$ by a result of Alon [Alo98], and $\Theta$ is not multiplicative under $\boxtimes$ by a result of Haemers [Hae79]. It turns out that the graph parameter $G \mapsto \max_{\varphi \in \mathbf{X}(\{\text{graphs}\})} \varphi(G)$ is itself an element of $\mathbf{X}(\{\text{graphs}\})$ and is equal to the fractional clique cover number $\overline{\chi}_f$ (see Section 3.3.2 and e.g. [Sch03, Eq. (67.112)]). Fritz in [Fri17] proves (independently of Strassen's line of work) a statement that is weaker than Theorem 3.1. Namely, he proves the statement of Theorem 3.1 without the additivity condition (2).

In Section 3.2 we will prove Theorem 3.1 by applying the theory of asymptotic spectra of Chapter 2 to the appropriate semiring and preorder. In Section 3.3 we will discuss the elements in the asymptotic spectrum of graphs $\mathbf{X}(\{\text{graphs}\})$ that are currently known to me: the Lovász theta number, the fractional clique cover number, the fractional orthogonal rank of the complement, and the fractional Haemers bounds. We moreover prove a sufficient condition for the "fractionalisation" of a graph parameter to be in the asymptotic spectrum of graphs.

3.2 The asymptotic spectrum of graphs

In this section we prove Theorem 3.1 by applying the theory of asymptotic spectra to the appropriate semiring.

3.2.1 The semiring of graph isomorphism classes $\mathcal{G}$

A graph homomorphism $f : G \to H$ is a graph isomorphism if $f$ is bijective as a map $V(G) \to V(H)$ and bijective as a map $E(G) \to E(H)$. We write $G \cong H$ if there is a graph isomorphism $f : G \to H$. The relation $\cong$ is an equivalence relation on $\{\text{graphs}\}$, which we call isomorphism. For example, the graphs $G$ and $H$ given by

\[ V(G) = \{a, b, c, d\}, \quad E(G) = \{\{a,b\}, \{b,c\}, \{c,d\}, \{a,d\}\} \]
\[ V(H) = \{1, 2, 3, 4\}, \quad E(H) = \{\{1,3\}, \{2,3\}, \{2,4\}, \{1,4\}\} \]

are isomorphic. Let $\mathcal{G} = \{\text{graphs}\}/{\cong}$ be the set of equivalence classes in $\{\text{graphs}\}$ under $\cong$, i.e. the isomorphism classes. The relation $\le$ is a preorder on $\mathcal{G}$. Recall that $K_n$ is the complete graph on $n$ vertices, and thus $\overline{K_n}$ is the graph with $n$ vertices and no edges.
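As an illustration of the isomorphism example above, the following minimal sketch searches all bijections between the two vertex sets and tests whether one of them maps the edge set of $G$ onto that of $H$. The edge lists are read off the example as consecutive pairs (an assumption about the garbled display), and the helper name `is_isomorphic` is hypothetical.

```python
from itertools import permutations

def is_isomorphic(vertices_g, edges_g, vertices_h, edges_h):
    """Brute-force test: try every bijection V(G) -> V(H)."""
    if len(vertices_g) != len(vertices_h) or len(edges_g) != len(edges_h):
        return False
    eg = {frozenset(e) for e in edges_g}
    eh = {frozenset(e) for e in edges_h}
    for image in permutations(vertices_h):
        f = dict(zip(vertices_g, image))
        # f is an isomorphism when it maps E(G) exactly onto E(H)
        if {frozenset(f[v] for v in e) for e in eg} == eh:
            return True
    return False

# The two 4-cycles from the example (pairs assumed from the text)
V_G = ["a", "b", "c", "d"]
E_G = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")]
V_H = [1, 2, 3, 4]
E_H = [(1, 3), (2, 3), (2, 4), (1, 4)]

print(is_isomorphic(V_G, E_G, V_H, E_H))
```

The search is factorial in the number of vertices, so it is only meant for small examples like this one.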

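Lemma 3.2. Let $A, B, C \in \{\text{graphs}\}$.

(i) $\sqcup$ and $\boxtimes$ are commutative and associative operations on $\mathcal{G}$.

(ii) $\boxtimes$ distributes over $\sqcup$ on $\mathcal{G}$, i.e. $A \boxtimes (B \sqcup C) = (A \boxtimes B) \sqcup (A \boxtimes C)$.

(iii) $\overline{K_1} \boxtimes A = A$

(iv) $\overline{K_0} \boxtimes A = \overline{K_0}$

(v) $\overline{K_0} \sqcup A = A$

(vi) $\overline{K_n} \sqcup \overline{K_m} = \overline{K_{n+m}}$

Proof. We leave the proof to the reader.

In other words, Lemma 3.2 says that $(\mathcal{G}, \sqcup, \boxtimes, \overline{K_0}, \overline{K_1})$ is a (commutative) semiring in which the elements $\overline{K_0}, \overline{K_1}, \overline{K_2}, \ldots$ behave like the natural numbers $\mathbb{N}$. We will denote this semiring simply by $\mathcal{G}$.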

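3.2.2 Strassen preorder via graph homomorphisms

Let $\mathcal{G}$ be the semiring of graphs. Recall that $G \le H$ if there is a graph homomorphism $f : \overline{G} \to \overline{H}$.

Lemma 3.3. The preorder $\le$ is a Strassen preorder on $\mathcal{G}$. That is, for graphs $A, B, C, D \in \mathcal{G}$ we have the following.

(i) For $n, m \in \mathbb{N}$, $\overline{K_n} \le \overline{K_m}$ iff $n \le m$.

(ii) If $A \le B$ and $C \le D$, then $A \sqcup C \le B \sqcup D$ and $A \boxtimes C \le B \boxtimes D$.

(iii) For $A, B \in \mathcal{G}$, if $B \neq \overline{K_0}$, then there is an $r \in \mathbb{N}$ with $A \le \overline{K_r} \boxtimes B$.

Proof. Statement (i) is easy to verify. We prove (ii). Let $f : \overline{A} \to \overline{B}$ and $g : \overline{C} \to \overline{D}$ be graph homomorphisms. Let the map $f \sqcup g : V(A) \sqcup V(C) \to V(B) \sqcup V(D)$ be defined by

\[ (f \sqcup g)(a) = f(a) \text{ for } a \in V(A), \qquad (f \sqcup g)(c) = g(c) \text{ for } c \in V(C). \]

One verifies directly that $f \sqcup g$ is a graph homomorphism $\overline{A \sqcup C} \to \overline{B \sqcup D}$. Let the map $f \boxtimes g : V(A) \times V(C) \to V(B) \times V(D)$ be defined by

\[ (f \boxtimes g)(a, c) = (f(a), g(c)). \]

One verifies directly that $f \boxtimes g$ is a graph homomorphism $\overline{A \boxtimes C} \to \overline{B \boxtimes D}$. This proves (ii). We prove (iii). Let $r = |V(A)|$. Then $A \le \overline{K_r}$. By assumption $B \neq \overline{K_0}$, so $\overline{K_1} \le B$. Therefore $A \le \overline{K_r} \cong \overline{K_r} \boxtimes \overline{K_1} \le \overline{K_r} \boxtimes B$. This proves (iii).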

3.2.3 The asymptotic spectrum of graphs $X(\mathcal{G})$

We thus have a semiring $\mathcal{G}$ with a Strassen preorder $\le$. We are therefore in the position to apply the theory of asymptotic spectra (Chapter 2). Let us translate the abstract terminology to this setting.

Let $G \lesssim H$ if there is a sequence $(x_N) \in \mathbb{N}^{\mathbb{N}}$ with $(x_N)^{1/N} \to 1$ such that for every $N \in \mathbb{N}$ we have $G^{\boxtimes N} \le H^{\boxtimes N} \boxtimes \overline{K_{x_N}}$, i.e. $G^{\boxtimes N} \le (H^{\boxtimes N})^{\sqcup x_N}$.

Let $S \subseteq \mathcal{G}$ be a subsemiring. For example, one may take $S = \mathcal{G}$, or one may choose any set $X \subseteq \mathcal{G}$ and let $S = \mathbb{N}[X]$ be the subsemiring of $\mathcal{G}$ generated by $X$ under $\sqcup$ and $\boxtimes$.

The asymptotic spectrum of $S$ is the set $X(S)$ of $\le$-monotone semiring homomorphisms $S \to \mathbb{R}_{\ge 0}$, i.e. all maps $\phi : S \to \mathbb{R}_{\ge 0}$ such that, for all $G, H \in S$,

(1) if $G \le H$, then $\phi(G) \le \phi(H)$

(2) $\phi(G \sqcup H) = \phi(G) + \phi(H)$

(3) $\phi(G \boxtimes H) = \phi(G)\phi(H)$

(4) $\phi(K_1) = 1$.

We call $X(\mathcal{G})$ the asymptotic spectrum of graphs.

Theorem 3.4. Let $G, H \in S$. Then $G \lesssim H$ iff $\forall \phi \in X(S)$: $\phi(G) \le \phi(H)$.

Proof. By Lemma 3.2 we have a semiring $S$, and by Lemma 3.3 we have a Strassen preorder $\le$, so we may apply Theorem 2.12.

We refer to Chapter 2 for a discussion of the topological properties of $X(S)$.

3.2.4 Shannon capacity $\Theta$

Let us discuss the (asymptotic) rank and (asymptotic) subrank for $(\mathcal{G}, \le)$. Recall that an independent set in $G$ is a subset of $V(G)$ that contains no edges, and the independence number $\alpha(G)$ is the cardinality of the largest independent set in $G$. A colouring of $G$ is an assignment of colours to the elements of $V(G)$ such that connected vertices get distinct colours. The chromatic number $\chi(G)$ is the smallest number of colours in any colouring of $G$. The clique cover number $\overline{\chi}(G)$ is defined as the chromatic number of the complement, $\overline{\chi}(G) = \chi(\overline{G})$.

For the semiring $\mathcal{G}$ with preorder $\le$, the abstract definition of subrank of Section 2.8 becomes $Q(G) = \max\{m \in \mathbb{N} : \overline{K_m} \le G\}$ and the abstract definition of rank becomes $R(G) = \min\{n \in \mathbb{N} : G \le \overline{K_n}\}$.

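Lemma 3.5.

(i) $\alpha(G) = Q(G)$

(ii) $\overline{\chi}(G) = R(G)$

Proof. We leave the proof to the reader.

We see directly that the asymptotic subrank is the Shannon capacity,

\[ \tilde{Q}(G) = \lim_{N \to \infty} Q(G^{\boxtimes N})^{1/N} = \lim_{N \to \infty} \alpha(G^{\boxtimes N})^{1/N} = \Theta(G), \]

and that the asymptotic rank is the asymptotic clique cover number,

\[ \tilde{R}(G) = \lim_{N \to \infty} R(G^{\boxtimes N})^{1/N} = \lim_{N \to \infty} \overline{\chi}(G^{\boxtimes N})^{1/N} = \tilde{\overline{\chi}}(G). \]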
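To make the independence number, the clique cover number and the strong product concrete, here is a small brute-force sketch for the 5-cycle $C_5$ (a standard example that is not taken from the text above). All function names are hypothetical, and the exhaustive searches are only suitable for very small graphs.

```python
from itertools import product

def cycle(n):
    """The cycle C_n as (vertex list, set of 2-element frozensets)."""
    return list(range(n)), {frozenset((i, (i + 1) % n)) for i in range(n)}

def strong_product(g, h):
    """Distinct pairs are adjacent when each coordinate is equal or adjacent."""
    (vg, eg), (vh, eh) = g, h
    verts = list(product(vg, vh))
    def eq_or_adj(u, v, edges):
        return u == v or frozenset((u, v)) in edges
    edges = {frozenset((p, q))
             for i, p in enumerate(verts) for q in verts[i + 1:]
             if eq_or_adj(p[0], q[0], eg) and eq_or_adj(p[1], q[1], eh)}
    return verts, edges

def independence_number(g):
    """Exact alpha(G): branch on a vertex, either skipping it or taking it
    and discarding its neighbours."""
    verts, edges = g
    nbrs = {v: set() for v in verts}
    for e in edges:
        u, v = tuple(e)
        nbrs[u].add(v)
        nbrs[v].add(u)
    def solve(cands):
        if not cands:
            return 0
        v, rest = cands[0], cands[1:]
        skip = solve(rest)
        take = 1 + solve([u for u in rest if u not in nbrs[v]])
        return max(skip, take)
    return solve(list(verts))

def clique_cover_number(g):
    """Chromatic number of the complement, by trying k = 1, 2, ... colours."""
    verts, edges = g
    comp = [(u, v) for i, u in enumerate(verts) for v in verts[i + 1:]
            if frozenset((u, v)) not in edges]
    for k in range(1, len(verts) + 1):
        for colours in product(range(k), repeat=len(verts)):
            c = dict(zip(verts, colours))
            if all(c[u] != c[v] for u, v in comp):
                return k
    return 0

c5 = cycle(5)
print(independence_number(c5))                      # alpha(C5)
print(independence_number(strong_product(c5, c5)))  # alpha of the strong square
print(clique_cover_number(c5))                      # clique cover number of C5
```

Since the Shannon capacity is a supremum over powers, the value of $\alpha$ on the strong square yields a lower bound on $\Theta(C_5)$ after taking a square root, while the clique cover number is an upper bound.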

Let $S \subseteq \mathcal{G}$ be a subsemiring. Let $G \in S$.

Corollary 3.6. $\Theta(G) = \min_{\phi \in X(S)} \phi(G)$.

Proof. Let $G$ be a graph. Either $G = \overline{K_0}$, or $\overline{K_1} \le G \le \overline{K_1}$, or $\overline{G}$ contains at least one edge. In the first two cases the claim is clearly true. In the third case $G \ge \overline{K_2}$ and we may thus apply Corollary 2.13.

Corollary 3.7. $\tilde{\overline{\chi}}(G) = \max_{\phi \in X(S)} \phi(G)$.

Proof. This is Corollary 2.14.

Remark 3.8. As mentioned earlier, it turns out that $\tilde{\overline{\chi}}$ is in fact itself an element of $X(\mathcal{G})$. See Section 3.3.2. (This is a striking difference with the situation for tensors, which we will discuss in Chapter 4: there both asymptotic rank and asymptotic subrank are not in the asymptotic spectrum, see Remark 4.4.)

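Shannon capacity is not in the asymptotic spectrum

Lemma 3.9. $G \boxtimes \overline{G} \ge \overline{K_{|V(G)|}}$.

Proof. Let $D = \{(u, u) : u \in V(G)\}$. Let $(u, u), (v, v) \in D$ be distinct. Then either $\{u, v\} \in E(G)$ or $\{u, v\} \in E(\overline{G})$ (exclusive or), and so $\{(u, u), (v, v)\} \notin E(G \boxtimes \overline{G})$. Therefore the subgraph in $G \boxtimes \overline{G}$ induced by $D$ is isomorphic to $\overline{K_{|V(G)|}}$.

Example 3.10. Let $G$ be the Schläfli graph. This is a graph with 27 vertices. Thus $\Theta(G \boxtimes \overline{G}) \ge |V(G)| = 27$. On the other hand, Haemers in [Hae79] showed that $\Theta(G)\Theta(\overline{G}) \le 21$. This implies the map $\Theta$ is not in $X(\mathcal{G})$, since it is not multiplicative under $\boxtimes$.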
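The diagonal argument of Lemma 3.9 can be checked mechanically on small graphs. The sketch below tests that no two diagonal vertices $(u,u)$, $(v,v)$ are adjacent in the strong product of a graph with its complement; the helper names are hypothetical and the test graphs are arbitrary small examples, not taken from the text.

```python
from itertools import combinations

def complement(verts, edges):
    """Edge set of the complement graph on the same vertex set."""
    return {frozenset(p) for p in combinations(verts, 2)} - set(edges)

def adjacent_in_strong_product(p, q, eg, eh):
    """Adjacency of product vertices p = (g, h), q = (g2, h2) in the strong product."""
    def eq_or_adj(u, v, edges):
        return u == v or frozenset((u, v)) in edges
    return p != q and eq_or_adj(p[0], q[0], eg) and eq_or_adj(p[1], q[1], eh)

def diagonal_is_independent(verts, edges):
    """True when {(u, u)} is an independent set in G strong-product complement(G)."""
    edges = {frozenset(e) for e in edges}
    co = complement(verts, edges)
    diag = [(u, u) for u in verts]
    return not any(adjacent_in_strong_product(p, q, edges, co)
                   for p, q in combinations(diag, 2))

c5 = (list(range(5)), [(i, (i + 1) % 5) for i in range(5)])
p4 = ([0, 1, 2, 3], [(0, 1), (1, 2), (2, 3)])
print(diagonal_is_independent(*c5), diagonal_is_independent(*p4))
```

The check succeeds for every graph because each pair of distinct vertices is an edge in exactly one of $G$ and $\overline{G}$, which is the exclusive-or step in the proof.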


3.3 Universal spectral points

The abstract theory of asymptotic spectra of Chapter 2 does not explicitly describe the elements of $X(\mathcal{G})$, i.e. the universal spectral points (cf. Section 2.13). However, several graph parameters from the literature can be shown to be universal spectral points. In fact, recently in [BC18] the first infinite family of universal spectral points was found: the fractional Haemers bounds. We give a brief (and probably incomplete) overview of currently known elements in $X(\mathcal{G})$.

3.3.1 Lovász theta number $\vartheta$

For any real symmetric matrix $A$ let $\Lambda(A)$ be the largest eigenvalue. The Lovász theta number $\vartheta(G)$ is defined as

\[ \vartheta(G) = \min\{\Lambda(A) : A \in \mathbb{R}^{V(G) \times V(G)} \text{ symmetric},\ \{u, v\} \notin E(G) \Rightarrow A_{uv} = 1\}. \]

The parameter $\vartheta(G)$ was introduced by Lovász in [Lov79]. We refer to [Knu94] and [Sch03] for a survey. It follows from well-known properties that $\vartheta \in X(\mathcal{G})$.
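Because $\vartheta(G)$ is a minimum over feasible matrices, evaluating $\Lambda(A)$ at any single feasible $A$ gives an upper bound. The following sketch does this for the 5-cycle with the one-parameter family $A = J + x \cdot \mathrm{Adj}(C_5)$, where $J$ is the all-ones matrix; this family, the diagonal convention $A_{uu} = 1$, and the function names are illustrative assumptions, not taken from the text. The matrix is a symmetric circulant, so its eigenvalues have a closed cosine form and no linear algebra library is needed.

```python
import math

def circulant_eigenvalues(first_row):
    """Eigenvalues of a real symmetric circulant matrix with the given first row."""
    n = len(first_row)
    return [sum(c * math.cos(2 * math.pi * j * k / n) for j, c in enumerate(first_row))
            for k in range(n)]

def theta_upper_bound_c5(x):
    """Largest eigenvalue of A = J + x * Adj(C5).

    Entries are 1 on the diagonal and on non-edges (the feasibility constraint)
    and 1 + x on the edges {i, i+1}."""
    first_row = [1.0, 1.0 + x, 1.0, 1.0, 1.0 + x]
    return max(circulant_eigenvalues(first_row))

golden = (1 + math.sqrt(5)) / 2
x_star = -5 / (2 + golden)   # balances the two largest eigenvalues
print(theta_upper_bound_c5(x_star), math.sqrt(5))
```

At `x_star` the bound evaluates to $\sqrt{5}$ up to rounding, which matches the known value of the Lovász theta number of the 5-cycle.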

3.3.2 Fractional graph parameters

Besides the Lovász theta number, there are several elements in $X(\mathcal{G})$ that are naturally obtained as fractional versions of $\boxtimes$-submultiplicative, $\sqcup$-subadditive, $\le$-monotone maps $\mathcal{G} \to \mathbb{R}_{\ge 0}$. For any map $\phi : \mathcal{G} \to \mathbb{R}_{\ge 0}$ we define a fractional version $\phi_f$ by

\[ \phi_f(G) = \inf_d \frac{\phi(G \ltimes \overline{K_d})}{d}, \]

where $\ltimes$ is the lexicographic product defined below. We will discuss several fractional parameters from the literature and prove a general theorem about fractional parameters.

Fractional clique cover number

We consider the fractional version of the clique cover number $\overline{\chi}(G) = \chi(\overline{G})$. It is well-known that $\overline{\chi}_f \in X(\mathcal{G})$, see e.g. [Sch03]. The fractional clique cover number $\overline{\chi}_f$ in fact equals the asymptotic clique cover number $\tilde{\overline{\chi}}(G) = \lim_{N \to \infty} \overline{\chi}(G^{\boxtimes N})^{1/N}$, which we introduced in the previous section; see [MP71] and also [Sch03, Th. 67.17].

Fractional Haemers bound

Let $\mathrm{rank}(A)$ denote the matrix rank of any matrix $A$. For any set $C$ of matrices define $\mathrm{rank}(C) = \min\{\mathrm{rank}(A) : A \in C\}$. For a field $\mathbb{F}$ and a graph $G$ define the set of matrices

\[ M^{\mathbb{F}}(G) = \{A \in \mathbb{F}^{V(G) \times V(G)} : \forall u, v:\ A_{vv} \neq 0,\ \{u, v\} \notin E(G) \Rightarrow A_{uv} = 0\}. \]

Let $R^{\mathbb{F}}(G) = \mathrm{rank}(M^{\mathbb{F}}(G))$. The parameter $R^{\mathbb{F}}(G)$ was introduced by Haemers in [Hae79] and is known as the Haemers bound. The fractional Haemers bound $R^{\mathbb{F}}_f$ was studied by Anna Blasiak in [Bla13] and was recently shown to be $\boxtimes$-multiplicative by Bukh and Cox in [BC18]. From this it is not hard to prove that $R^{\mathbb{F}}_f \in X(\mathcal{G})$. Bukh and Cox in [BC18] furthermore prove a separation result: for any field $\mathbb{F}$ of nonzero characteristic and any $\varepsilon > 0$, there is a graph $G$ such that for any field $\mathbb{F}'$ with $\mathrm{char}(\mathbb{F}) \neq \mathrm{char}(\mathbb{F}')$ the inequality $R^{\mathbb{F}}_f(G) < \varepsilon R^{\mathbb{F}'}_f(G)$ holds. This separation result implies that there are infinitely many elements in $X(\mathcal{G})$.

Fractional orthogonal rank

In [CMR+14] the orthogonal rank $\xi(G)$ and its fractional version, the projective rank $\xi_f(G)$, are studied. It easily follows from results in [CMR+14] that $G \mapsto \xi_f(\overline{G})$ is in $X(\mathcal{G})$.

General fractional parameters

We will prove something general about fractional parameters. Define the lexicographic product $G \ltimes H$ by

\[ V(G \ltimes H) = V(G) \times V(H) \]
\[ E(G \ltimes H) = \{\{(g, h), (g', h')\} : \{g, g'\} \in E(G) \text{ or } (g = g' \text{ and } \{h, h'\} \in E(H))\}. \]

The lexicographic product satisfies $\overline{G \ltimes H} = \overline{G} \ltimes \overline{H}$. Also define the or-product $G * H$ by

\[ V(G * H) = V(G) \times V(H) \]
\[ E(G * H) = \{\{(g, h), (g', h')\} : \{g, g'\} \in E(G) \text{ or } \{h, h'\} \in E(H)\}. \]

The or-product and the strong graph product are related by $\overline{G * H} = \overline{G} \boxtimes \overline{H}$. The strong graph product gives a subgraph of the lexicographic product, which gives a subgraph of the or-product:

\[ G \boxtimes H \subseteq G \ltimes H \subseteq G * H. \]

Therefore $G * H \le G \ltimes H \le G \boxtimes H$. Finally, $G \ltimes \overline{K_d} = G * \overline{K_d}$ and of course $G \boxtimes \overline{K_d} = G^{\sqcup d}$.
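The three products and the relations between them can be verified on a small example. The sketch below builds the edge sets of the strong, lexicographic and or-products for two tiny graphs (chosen arbitrarily for illustration; all names are hypothetical) and checks the subgraph chain together with the two complement identities.

```python
from itertools import combinations, product

def complement(verts, edges):
    """Edge set of the complement graph on the same vertex set."""
    return {frozenset(p) for p in combinations(verts, 2)} - edges

def adj(u, v, edges):
    return frozenset((u, v)) in edges

def strong(p, q, eg, eh):
    (g, h), (g2, h2) = p, q
    return (g == g2 or adj(g, g2, eg)) and (h == h2 or adj(h, h2, eh))

def lexicographic(p, q, eg, eh):
    (g, h), (g2, h2) = p, q
    return adj(g, g2, eg) or (g == g2 and adj(h, h2, eh))

def or_product(p, q, eg, eh):
    (g, h), (g2, h2) = p, q
    return adj(g, g2, eg) or adj(h, h2, eh)

def product_edges(vg, eg, vh, eh, rule):
    """Edges of a product on V(G) x V(H), for one of the adjacency rules above."""
    verts = list(product(vg, vh))
    return {frozenset((p, q)) for p, q in combinations(verts, 2) if rule(p, q, eg, eh)}

# Example graphs: a path on three vertices and an edge plus an isolated vertex
VG, EG = [0, 1, 2], {frozenset((0, 1)), frozenset((1, 2))}
VH, EH = ["x", "y", "z"], {frozenset(("x", "y"))}
product_vertices = list(product(VG, VH))

s = product_edges(VG, EG, VH, EH, strong)
l = product_edges(VG, EG, VH, EH, lexicographic)
o = product_edges(VG, EG, VH, EH, or_product)
print(len(s), len(l), len(o))
print(s <= l <= o)
```

The subset test on edge sets corresponds to the subgraph chain, and comparing `complement(product_vertices, l)` with the lexicographic product of the complements checks the first identity in the same way.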

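We will prove: if $\phi : \mathcal{G} \to \mathbb{R}_{\ge 0}$ is $\boxtimes$-submultiplicative, $\sqcup$-subadditive and $\le$-monotone, then $\phi_f$ is again $\boxtimes$-submultiplicative, $\sqcup$-subadditive and $\le$-monotone. Moreover, if $\phi : \mathcal{G} \to \mathbb{N}$ is $\le$-monotone and satisfies

\[ \forall G, H \in \mathcal{G}: \quad \phi(G \ltimes H) \ge \phi(G \ltimes \overline{K_{\phi(H)}}), \]

then $\phi_f$ is $\ltimes$-supermultiplicative and, more importantly, $\phi_f$ is $\boxtimes$-supermultiplicative.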


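Lemma 3.11.

(i) If $\phi$ is $\sqcup$-superadditive, then $\phi_f$ is $\sqcup$-superadditive.

(ii) If $\phi$ is $\le$-monotone, then $\phi_f$ is $\le$-monotone.

(iii) If $\phi$ is $\sqcup$-subadditive and $\le$-monotone, then $\phi_f$ is $\sqcup$-subadditive.

(iv) If $\forall n \in \mathbb{N}$: $\phi(\overline{K_n}) = n$, then $\forall n \in \mathbb{N}$: $\phi_f(\overline{K_n}) = n$.

(v) If $\phi$ is $\boxtimes$-submultiplicative and $\le$-monotone, then $\phi_f$ is $\boxtimes$-submultiplicative.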

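Proof. Let $G, H \in \mathcal{G}$. Let $d \in \mathbb{N}$.

(i) The lexicographic product distributes over the disjoint union,

\[ (G \sqcup H) \ltimes \overline{K_d} = (G \ltimes \overline{K_d}) \sqcup (H \ltimes \overline{K_d}). \]

By superadditivity,

\[ \phi((G \ltimes \overline{K_d}) \sqcup (H \ltimes \overline{K_d})) \ge \phi(G \ltimes \overline{K_d}) + \phi(H \ltimes \overline{K_d}). \]

Therefore

\begin{align*}
\phi_f(G \sqcup H) &= \inf_d \frac{\phi((G \sqcup H) \ltimes \overline{K_d})}{d} \\
&= \inf_d \frac{\phi((G \ltimes \overline{K_d}) \sqcup (H \ltimes \overline{K_d}))}{d} \\
&\ge \inf_d \left( \frac{\phi(G \ltimes \overline{K_d})}{d} + \frac{\phi(H \ltimes \overline{K_d})}{d} \right) \\
&\ge \inf_{d_1} \frac{\phi(G \ltimes \overline{K_{d_1}})}{d_1} + \inf_{d_2} \frac{\phi(H \ltimes \overline{K_{d_2}})}{d_2} \\
&= \phi_f(G) + \phi_f(H).
\end{align*}

(ii) Let $G \le H$. Then $G \ltimes \overline{K_d} \le H \ltimes \overline{K_d}$. Thus $\phi(G \ltimes \overline{K_d}) \le \phi(H \ltimes \overline{K_d})$. Therefore $\phi_f(G) \le \phi_f(H)$.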

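(iii) We have $G \ltimes \overline{K_d} \le G \boxtimes \overline{K_d} = G^{\sqcup d}$. Thus, by monotonicity and subadditivity,

\[ \phi(G \ltimes \overline{K_d}) \le d\,\phi(G), \]

and for $d, e \in \mathbb{N}$,

\[ \phi(G \ltimes \overline{K_{de}}) = \phi((G \ltimes \overline{K_d}) \ltimes \overline{K_e}) \le e\,\phi(G \ltimes \overline{K_d}). \]

We use this inequality to get, for $d_1, d_2 \in \mathbb{N}$,

\[ \frac{\phi(G \ltimes \overline{K_{d_1}})}{d_1} + \frac{\phi(H \ltimes \overline{K_{d_2}})}{d_2} \ge \frac{\phi(G \ltimes \overline{K_{d_1 d_2}}) + \phi(H \ltimes \overline{K_{d_1 d_2}})}{d_1 d_2}. \]

From subadditivity follows

\begin{align*}
\frac{\phi(G \ltimes \overline{K_{d_1 d_2}}) + \phi(H \ltimes \overline{K_{d_1 d_2}})}{d_1 d_2}
&\ge \frac{\phi((G \ltimes \overline{K_{d_1 d_2}}) \sqcup (H \ltimes \overline{K_{d_1 d_2}}))}{d_1 d_2} \\
&= \frac{\phi((G \sqcup H) \ltimes \overline{K_{d_1 d_2}})}{d_1 d_2} \\
&\ge \phi_f(G \sqcup H).
\end{align*}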

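We conclude $\phi_f(G) + \phi_f(H) \ge \phi_f(G \sqcup H)$.

(iv) Let $n \in \mathbb{N}$. Then $\phi_f(\overline{K_n}) = \inf_d \phi(\overline{K_n} \ltimes \overline{K_d})/d = \inf_d \phi(\overline{K_{nd}})/d = n$.

(v) Let $d_1, d_2 \in \mathbb{N}$. We claim

\[ (G \boxtimes H) \ltimes \overline{K_{d_1 d_2}} \le (G \ltimes \overline{K_{d_1}}) \boxtimes (H \ltimes \overline{K_{d_2}}). \]

This is the same as saying there is a graph homomorphism

\[ \overline{(G \boxtimes H) \ltimes \overline{K_{d_1 d_2}}} \to \overline{(G \ltimes \overline{K_{d_1}}) \boxtimes (H \ltimes \overline{K_{d_2}})}, \]

which is the same as saying there is a graph homomorphism

\[ (\overline{G} * \overline{H}) \ltimes K_{d_1 d_2} \to (\overline{G} \ltimes K_{d_1}) * (\overline{H} \ltimes K_{d_2}), \]

where $*$ denotes the or-product of graphs. One verifies that $(g, h, (i, j)) \mapsto ((g, i), (h, j))$ is such a graph homomorphism, proving the claim. The claim together with monotonicity and submultiplicativity gives

\[ \phi((G \boxtimes H) \ltimes \overline{K_{d_1 d_2}}) \le \phi((G \ltimes \overline{K_{d_1}}) \boxtimes (H \ltimes \overline{K_{d_2}})) \le \phi(G \ltimes \overline{K_{d_1}})\,\phi(H \ltimes \overline{K_{d_2}}). \]

Therefore

\begin{align*}
\phi_f(G \boxtimes H) &= \inf_d \frac{\phi((G \boxtimes H) \ltimes \overline{K_d})}{d} \\
&= \inf_{d_1, d_2} \frac{\phi((G \boxtimes H) \ltimes \overline{K_{d_1 d_2}})}{d_1 d_2} \\
&\le \inf_{d_1, d_2} \frac{\phi(G \ltimes \overline{K_{d_1}})}{d_1} \cdot \frac{\phi(H \ltimes \overline{K_{d_2}})}{d_2} \\
&= \phi_f(G)\,\phi_f(H).
\end{align*}

This concludes the proof of the lemma.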

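Lemma 3.12. Let $\phi : \mathcal{G} \to \mathbb{N}$ satisfy

\[ \forall G, H \in \mathcal{G}: \quad \phi(G \ltimes H) \ge \phi(G \ltimes \overline{K_{\phi(H)}}). \tag{3.2} \]

Then

\[ \inf_H \frac{\phi(G \ltimes H)}{\phi(H)} = \inf_d \frac{\phi(G \ltimes \overline{K_d})}{d}. \]

Proof. From (3.2) follows

\[ \frac{\phi(G \ltimes H)}{\phi(H)} \ge \frac{\phi(G \ltimes \overline{K_{\phi(H)}})}{\phi(H)}, \]

and so

\[ \frac{\phi(G \ltimes H)}{\phi(H)} \ge \inf_d \frac{\phi(G \ltimes \overline{K_d})}{d}. \]

We take the infimum over $H$ to get

\[ \inf_H \frac{\phi(G \ltimes H)}{\phi(H)} \ge \inf_d \frac{\phi(G \ltimes \overline{K_d})}{d}. \]

The inequality in the other direction,

\[ \inf_H \frac{\phi(G \ltimes H)}{\phi(H)} \le \inf_d \frac{\phi(G \ltimes \overline{K_d})}{d}, \]

is trivially true.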

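Lemma 3.13. Let $\phi : \mathcal{G} \to \mathbb{N}$ be $\le$-monotone and satisfy

\[ \forall G, H \in \mathcal{G}: \quad \phi(G \ltimes H) \ge \phi(G \ltimes \overline{K_{\phi(H)}}). \]

Then $\phi_f$ is $\ltimes$- and $\boxtimes$-supermultiplicative.

Proof. Let $A, B \in \mathcal{G}$. We have $A \boxtimes B \ge A \ltimes B$, so

\[ \phi_f(A \boxtimes B) \ge \phi_f(A \ltimes B). \]

It remains to show $\phi_f(A \ltimes B) \ge \phi_f(A)\phi_f(B)$. We have

\[ \frac{\phi(A \ltimes B \ltimes H)}{\phi(H)} = \frac{\phi(A \ltimes (B \ltimes H))}{\phi(B \ltimes H)} \cdot \frac{\phi(B \ltimes H)}{\phi(H)}, \]

which implies

\[ \frac{\phi(A \ltimes B \ltimes H)}{\phi(H)} \ge \inf_{H'} \frac{\phi(A \ltimes H')}{\phi(H')} \cdot \inf_{H''} \frac{\phi(B \ltimes H'')}{\phi(H'')} = \phi_f(A)\,\phi_f(B). \]

Take the infimum over $H$ to obtain $\phi_f(A \ltimes B) \ge \phi_f(A)\phi_f(B)$.

Theorem 3.14. Let $\phi : \mathcal{G} \to \mathbb{N}$ be $\sqcup$-additive, $\boxtimes$-submultiplicative, $\le$-monotone and $\overline{K_n}$-normalised, and satisfy

\[ \forall G, H \in \mathcal{G}: \quad \phi(G \ltimes H) \ge \phi(G \ltimes \overline{K_{\phi(H)}}). \]

Then $\phi_f$ is in $X(\mathcal{G})$.

Proof. This follows from Lemma 3.11, Lemma 3.12 and Lemma 3.13.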


3.4 Conclusion

In this chapter we introduced a new connection between Strassen's theory of asymptotic spectra and the Shannon capacity of graphs. In particular, we characterised the Shannon capacity (which is defined as a supremum) as a minimisation over elements in the asymptotic spectrum of graphs. Known elements in the asymptotic spectrum of graphs include the fractional clique cover number, the Lovász theta number, the projective rank and the fractional Haemers bound. We are left with a clear goal for future work: find all elements in the asymptotic spectrum of graphs.

Chapter 4

The asymptotic spectrum of tensors; exponent of matrix multiplication

This chapter is based on joint work with Matthias Christandl and Peter Vrana [CVZ18].

4.1 Introduction

This chapter is about tensors $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ and their asymptotic properties. The theory of asymptotic spectra of Chapter 2 was developed by Strassen exactly for the purpose of understanding the asymptotic properties of tensors. This chapter is expository and provides the necessary background for understanding Chapter 5 and Chapter 6.

Let us first define the asymptotic properties of interest and discuss some of their applications. We need the concepts restriction, tensor product and diagonal tensor. Let $s \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ and $t \in \mathbb{F}^{m_1} \otimes \cdots \otimes \mathbb{F}^{m_k}$ be tensors. We say $s$ restricts to $t$, and write $s \ge t$, if there are linear maps $A_i : \mathbb{F}^{n_i} \to \mathbb{F}^{m_i}$ such that $t = (A_1 \otimes \cdots \otimes A_k) \cdot s$. The tensor product of $s$ and $t$ is the element $s \otimes t \in \mathbb{F}^{n_1 m_1} \otimes \cdots \otimes \mathbb{F}^{n_k m_k}$ with coordinates $(s \otimes t)_{ij} = s_i t_j$. We naturally define the direct sum $s \oplus t \in \mathbb{F}^{n_1 + m_1} \otimes \cdots \otimes \mathbb{F}^{n_k + m_k}$. We define the diagonal tensors $\langle n \rangle = \sum_{i=1}^n e_i \otimes \cdots \otimes e_i$ for $n \in \mathbb{N}$, where $e_1, \ldots, e_n$ is the standard basis of $\mathbb{F}^n$. The tensor rank $R(t)$ is the smallest number $n \in \mathbb{N}$ such that $t$ can be written as a sum of $n$ simple tensors, a simple tensor being a tensor of the form $v_1 \otimes \cdots \otimes v_k$. Equivalently, $R(t) = \min\{n \in \mathbb{N} : t \le \langle n \rangle\}$. The asymptotic rank is the regularisation $\tilde{R}(t) = \lim_{n \to \infty} R(t^{\otimes n})^{1/n}$. While tensor rank is known to be hard to compute [Has90, Shi16], we do not know whether asymptotic rank is hard to compute.

The exponent of matrix multiplication

The motivating example for studying asymptotic rank is the problem of finding the exponent of matrix multiplication $\omega$. Recall from the introduction that $\omega$ is the infimum over $a \in \mathbb{R}$ such that two $n \times n$ matrices can be multiplied using $O(n^a)$ arithmetic operations (in the algebraic circuit model). It turns out (see [BCS97]) that $\omega$ is characterised by the asymptotic rank $\tilde{R}(\langle 2, 2, 2 \rangle)$ of the matrix multiplication tensor

\[ \langle 2, 2, 2 \rangle = \sum_{i, j, k \in [2]} e_{ij} \otimes e_{jk} \otimes e_{ki} \in \mathbb{F}^4 \otimes \mathbb{F}^4 \otimes \mathbb{F}^4. \]

Namely, $\tilde{R}(\langle 2, 2, 2 \rangle) = 2^\omega$. We know the trivial lower bound $2 \le \omega$, see Section 4.3. We know the (non-trivial) upper bound $\omega \le 2.3728639$, which is by Coppersmith and Winograd [CW90] and improvements by Stothers, Williams and Le Gall [Sto10, Wil12, LG14].

Asymptotic subrank and asymptotic restriction

Besides (asymptotic) rank, we naturally define subrank $Q(t) = \max\{m \in \mathbb{N} : \langle m \rangle \le t\}$ and the asymptotic subrank $\tilde{Q}(t) = \lim_{n \to \infty} Q(t^{\otimes n})^{1/n}$. Moreover, we say $s$ restricts asymptotically to $t$, written $s \gtrsim t$, if there is a sequence of natural numbers $a(n) \in o(n)$ such that for all $n \in \mathbb{N}$

\[ s^{\otimes n} \otimes \langle 2 \rangle^{\otimes a(n)} \ge t^{\otimes n}. \]

One can prove (see [Str91]) that

\[ s^{\otimes n} \otimes \langle 2 \rangle^{\otimes o(n)} \ge t^{\otimes n} \quad \text{iff} \quad s^{\otimes (n + o(n))} \ge t^{\otimes n}. \]

Our goal is to understand asymptotic restriction, asymptotic rank and asymptotic subrank.

More connections: quantum information, combinatorics, algebraic property testing

Besides matrix multiplication, other applications of asymptotic restriction of tensors, asymptotic rank of tensors and asymptotic subrank of tensors include deciding the feasibility of an asymptotic transformation between pure quantum states via stochastic local operations and classical communication (slocc) in quantum information theory [BPR+00, DVC00, VDDMV02, HHHH09], bounding the size of combinatorial structures like cap sets and tri-colored sum-free sets in additive combinatorics [Ede04, Tao08, ASU13, CLP17, EG17, Tao16, BCC+17, KSS16, TS16] (see Chapter 5), and bounding the query complexity of certain properties in algebraic property testing [KS08, BCSX10, Sha09, BX15, HX17, FK14].

This chapter is organised as follows. In Section 4.2 we briefly discuss the semiring of tensors, the asymptotic spectrum of tensors, and asymptotic rank and subrank. In Section 4.3 we discuss the gauge points, a simple construction of finitely many elements in the asymptotic spectrum of tensors. In Section 4.4 we discuss the Strassen support functionals, a family of elements in the asymptotic spectrum of "oblique" tensors. This family is parametrised by probability distributions on $[k]$. In Section 4.5 we discuss an extension of the support functionals called the Strassen upper support functionals, which have the potential to be universal. Finally, in Section 4.6 we prove a new result: we show how asymptotic slice rank is related to the support functionals.

4.2 The asymptotic spectrum of tensors

Let us properly set up the semiring of tensors and the asymptotic spectrum. For the proofs we refer to [Str87, Str88, Str91].

4.2.1 The semiring of tensor equivalence classes $\mathcal{T}$

We begin by putting an equivalence relation on tensors. For example, we want to identify isomorphic tensors, and also, for any tensor $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$, we want to identify $t$ with $t \oplus 0$, where $0 \in \mathbb{F}^{m_1} \otimes \cdots \otimes \mathbb{F}^{m_k}$ is a zero tensor of any format.

We say $s$ is isomorphic to $t$, and write $s \cong t$, if there are bijective linear maps $A_i : \mathbb{F}^{m_i} \to \mathbb{F}^{n_i}$ such that $t = (A_1 \otimes \cdots \otimes A_k) \cdot s$.

We say $s$ and $t$ are equivalent, and write $s \sim t$, if there are zero tensors $s_0 = 0 \in \mathbb{F}^{a_1} \otimes \cdots \otimes \mathbb{F}^{a_k}$ and $t_0 = 0 \in \mathbb{F}^{b_1} \otimes \cdots \otimes \mathbb{F}^{b_k}$ such that $s \oplus s_0 \cong t \oplus t_0$. The equivalence relation $\sim$ is in fact the equivalence relation generated by the restriction preorder $\le$.

Let $\mathcal{T}$ be the set of $\sim$-equivalence classes of $k$-tensors over $\mathbb{F}$, for some fixed $k$ and field $\mathbb{F}$. The direct sum and the tensor product naturally carry over to $\mathcal{T}$, and $\mathcal{T}$ becomes a semiring with additive unit $\langle 0 \rangle$ and multiplicative unit $\langle 1 \rangle$ (more precisely, the equivalence classes of those tensors, but we will not make this distinction).

4.2.2 Strassen preorder via restriction

Restriction $\le$ induces a partial order on $\mathcal{T}$ which behaves well with respect to the semiring operations, and naturally $n \le m$ if and only if $\langle n \rangle \le \langle m \rangle$. Therefore restriction $\le$ is a Strassen preorder on $\mathcal{T}$.

4.2.3 The asymptotic spectrum of tensors $X(\mathcal{T})$

Let $S \subseteq \mathcal{T}$ be a subsemiring. Let

\[ X(S) = X(S, \le) = \{\phi \in \mathrm{Hom}(S, \mathbb{R}_{\ge 0}) : \forall a, b \in S,\ a \le b \Rightarrow \phi(a) \le \phi(b)\}. \]

We call $X(S)$ the asymptotic spectrum of $S$, and we call $X(\mathcal{T})$ the asymptotic spectrum of $k$-tensors over $\mathbb{F}$.

Theorem 4.1 ([Str88]). Let $s, t \in S$. Then $s \lesssim t$ iff $\forall \phi \in X(S)$: $\phi(s) \le \phi(t)$.

Proof. This follows from Theorem 2.12.

We refer to Chapter 2 for a discussion of the topological properties of $X(S)$.

Remark 4.2. We mention that $X(S)$ may equivalently be defined with degeneration $\trianglerighteq$ instead of restriction $\ge$. Over $\mathbb{C}$, we say $f$ degenerates to $g$, written $f \trianglerighteq g$, if $f \cong f'$ and $g \cong g'$ and $g'$ is in the Euclidean closure (or equivalently Zariski closure) of the orbit $\mathrm{GL}_{n_1} \times \cdots \times \mathrm{GL}_{n_k} \cdot f'$. It is a nontrivial fact from algebraic geometry (see [Kra84, Lemma III.2.3.1] or [BCS97]) that there is a degeneration $f \trianglerighteq g$ if and only if there are matrices $A_i$ with entries polynomial in $\varepsilon$ such that $(A_1, \ldots, A_k) \cdot f = \varepsilon^d g + \varepsilon^{d+1} g_1 + \cdots + \varepsilon^{d+e} g_e$ for some elements $g_1, \ldots, g_e$. The latter definition of degeneration is valid when $\mathbb{C}$ is replaced by an arbitrary field $\mathbb{F}$, and that is how degeneration is defined for an arbitrary field. Degeneration is weaker than restriction: $f \ge g$ implies $f \trianglerighteq g$. Asymptotically, however, the notions coincide: $f \gtrsim g$ if and only if $f^{\otimes n} \otimes \langle 2 \rangle^{\otimes o(n)} \trianglerighteq g^{\otimes n}$. We mention that, analogous to restriction, degeneration gives rise to border rank and border subrank, $\underline{R}(f) = \min\{r \in \mathbb{N} : f \trianglelefteq \langle r \rangle\}$ and $\underline{Q}(f) = \max\{s \in \mathbb{N} : \langle s \rangle \trianglelefteq f\}$ respectively.

4.2.4 Asymptotic rank and asymptotic subrank

The abstract theory of asymptotic spectra characterises asymptotic subrank and asymptotic rank as follows.

Corollary 4.3. Let $S \subseteq \mathcal{T}$ be a subsemiring. Let $a \in S$. Then

\[ \tilde{Q}(a) = \min_{\phi \in X(S)} \phi(a) \tag{4.1} \]

\[ \tilde{R}(a) = \max_{\phi \in X(S)} \phi(a). \tag{4.2} \]

Proof. Statement (4.2) follows from Corollary 2.13, since either $a = 0$ or $a \ge 1$. For statement (4.1), if $a^{\otimes k} \ge 2$ for some $k \in \mathbb{N}$, then we apply Corollary 2.14. Otherwise, one can show that $\tilde{Q}(a)$ equals 0 or 1 using the gauge points of the next section (see [Str88, Lemma 3.7]).

Remark 4.4. One verifies that $\tilde{R}$ and $\tilde{Q}$ are $\le$-monotones and have value $n$ on $\langle n \rangle$. They are not universal spectral points, however. Namely, the asymptotic rank of each of the three tensors

\[ \langle 2, 1, 1 \rangle = e_1 \otimes e_1 \otimes 1 + e_2 \otimes e_2 \otimes 1 \in \mathbb{F}^2 \otimes \mathbb{F}^2 \otimes \mathbb{F}^1 \]
\[ \langle 1, 1, 2 \rangle = e_1 \otimes 1 \otimes e_1 + e_2 \otimes 1 \otimes e_2 \in \mathbb{F}^2 \otimes \mathbb{F}^1 \otimes \mathbb{F}^2 \]
\[ \langle 1, 2, 1 \rangle = 1 \otimes e_1 \otimes e_1 + 1 \otimes e_2 \otimes e_2 \in \mathbb{F}^1 \otimes \mathbb{F}^2 \otimes \mathbb{F}^2 \]

equals 2, whereas their tensor product equals the matrix multiplication tensor $\langle 2, 2, 2 \rangle$, whose tensor rank equals 7 and whose asymptotic rank is thus at most 7, i.e. strictly smaller than $2^3$. Therefore asymptotic rank is not multiplicative. On the other hand, the asymptotic subrank of each of the above three tensors equals 1, whereas the asymptotic subrank of $\langle 2, 2, 2 \rangle$ equals 4, see Chapter 5. Therefore asymptotic subrank is not multiplicative.
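The upper bound 7 on the tensor rank of $\langle 2, 2, 2 \rangle$ corresponds to a bilinear algorithm that multiplies two $2 \times 2$ matrices with seven multiplications. The following sketch uses Strassen's classical formulas (a well-known construction, named here as an illustration rather than as part of the remark) and compares the result with the schoolbook product; the function names are hypothetical.

```python
def strassen_2x2(a, b):
    """Multiply 2x2 matrices with 7 multiplications (Strassen's formulas)."""
    (a11, a12), (a21, a22) = a
    (b11, b12), (b21, b22) = b
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]

def naive_2x2(a, b):
    """Schoolbook product, 8 multiplications."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, 2], [3, 4]]
B = [[5, -6], [7, 8]]
print(strassen_2x2(A, B) == naive_2x2(A, B))
```

Each of the seven products is one simple tensor in a decomposition of $\langle 2, 2, 2 \rangle$, which is why the count of multiplications bounds the tensor rank from above.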

Goal 4.5. Our goal is now to explicitly describe elements in $X(\mathcal{T})$ (universal spectral points), or, more modestly, to describe elements in $X(S)$ for interesting subsemirings $S \subseteq \mathcal{T}$.

Strassen constructed a finite family of elements in $X(\mathcal{T})$, the gauge points, and an infinite family of elements in $X(\{\text{oblique tensors}\})$, the support functionals. The support functionals are powerful enough to determine the asymptotic subrank of any "tight tensor". Tight tensors are discussed in Chapter 5. In Chapter 6 we construct an infinite family in $X(\{k\text{-tensors over } \mathbb{C}\})$, the quantum functionals. In the rest of this chapter we discuss the gauge points and the support functionals. We will focus on the case $k = 3$ for clarity of exposition.

4.3 Gauge points $\zeta^{(i)}$

Strassen in [Str88] introduced a finite family of elements in $X(\mathcal{T})$ called the gauge points. We focus on 3-tensors, but the construction generalises immediately to $k$-tensors. Let $V_i = \mathbb{F}^{n_i}$. Let $t \in V_1 \otimes V_2 \otimes V_3$. Let $i \in [3]$. Let $\mathrm{flatten}_i(t)$ be the image of $t$ under the grouping $V_1 \otimes V_2 \otimes V_3 \to V_i \otimes (\bigotimes_{j \neq i} V_j)$. We think of $\mathrm{flatten}_i(t)$ as a matrix. Let $\zeta^{(i)} : \mathcal{T} \to \mathbb{N} : t \mapsto \mathrm{rank}(\mathrm{flatten}_i(t))$, with rank denoting matrix rank. We call $\zeta^{(1)}, \zeta^{(2)}, \zeta^{(3)}$ the gauge points. From the properties of matrix rank follows directly that $\zeta^{(i)}$ is multiplicative under $\otimes$, additive under $\oplus$, monotone under restriction $\le$ (and under degeneration $\trianglelefteq$), and normalised to 1 on $\langle 1 \rangle = e_1 \otimes e_1 \otimes e_1$.

Theorem 4.6. $\zeta^{(1)}, \zeta^{(2)}, \zeta^{(3)} \in X(\mathcal{T})$.

Recall $\tilde{Q}(t) \le \phi(t) \le \tilde{R}(t)$ for $\phi \in X(\mathcal{T})$. In particular, $\max_i \zeta^{(i)}(t) \le \tilde{R}(t)$. We do not know whether $\max_{i \in [3]} \zeta^{(i)}$ equals $\tilde{R}$. To be precise, we do not know any $t$ for which $\max_i \zeta^{(i)}(t) < \tilde{R}(t)$, and we do not know a proof that $\max_i \zeta^{(i)}(t) = \tilde{R}(t)$ for all $t$. There are various families of tensors $t$ for which $\max_i \zeta^{(i)}(t) = \tilde{R}(t)$ is proven. We will see such a family in Section 5.4.2. For the matrix multiplication tensor $\langle 2, 2, 2 \rangle$ we have $4 = \max_i \zeta^{(i)}(\langle 2, 2, 2 \rangle) \le 2^\omega$, so $\max_i \zeta^{(i)}(t) = \tilde{R}(t)$ would imply that the matrix multiplication exponent $\omega$ equals 2.

On the other hand, $\tilde{Q}(t) \le \min_i \zeta^{(i)}(t)$. There exist $t$ for which $\tilde{Q}(t)$ is strictly smaller than $\min_{i \in [3]} \zeta^{(i)}(t)$. To show this strict inequality we need another technique of Strassen: the support functionals. The support functionals are the topic of the next section.
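The gauge points are flattening ranks, so they can be computed with ordinary Gaussian elimination. The sketch below represents a 3-tensor as a dictionary from index triples to coefficients, flattens it along each leg, and computes the three ranks exactly over the rationals. Working over $\mathbb{Q}$, the sparse encoding, and all function names are assumptions made for illustration; the text allows an arbitrary field.

```python
from fractions import Fraction
from itertools import product

def matmul_tensor(a, b, c):
    """<a,b,c> as a sparse dict: coefficient 1 at the index triple (ij, jk, ki)."""
    return {((i, j), (j, k), (k, i)): 1
            for i, j, k in product(range(a), range(b), range(c))}

def flatten(tensor, mode):
    """Matrix with rows indexed by leg `mode` and columns by the other two legs.
    Indices that never occur would only add zero rows or columns, so they are omitted."""
    rows = sorted({idx[mode] for idx in tensor})
    cols = sorted({idx[:mode] + idx[mode + 1:] for idx in tensor})
    mat = [[Fraction(0)] * len(cols) for _ in rows]
    for idx, val in tensor.items():
        mat[rows.index(idx[mode])][cols.index(idx[:mode] + idx[mode + 1:])] = Fraction(val)
    return mat

def rank(mat):
    """Matrix rank over the rationals by Gaussian elimination."""
    mat = [row[:] for row in mat]
    r = 0
    ncols = len(mat[0]) if mat else 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(mat)) if mat[i][col] != 0), None)
        if pivot is None:
            continue
        mat[r], mat[pivot] = mat[pivot], mat[r]
        for i in range(r + 1, len(mat)):
            factor = mat[i][col] / mat[r][col]
            mat[i] = [x - factor * y for x, y in zip(mat[i], mat[r])]
        r += 1
    return r

def gauge_points(tensor):
    """The three flattening ranks of a 3-tensor."""
    return [rank(flatten(tensor, mode)) for mode in range(3)]

print(gauge_points(matmul_tensor(2, 2, 2)))
```

For the matrix multiplication tensor all three flattenings have full rank 4, which is the value used in the comparison with $2^\omega$ above.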

4.4 Support functionals $\zeta^\theta$

Strassen in [Str91] constructed an infinite family of elements in the asymptotic spectrum of oblique $k$-tensors called the support functionals. In this section we explain the construction of the support functionals. The support functionals provide the benchmark for our new quantum functionals (Chapter 6) and are relevant in the context of combinatorial problems like the cap set problem (Section 5.4.2). For clarity of exposition we focus on 3-tensors. The ideas extend directly to $k$-tensors.

Oblique tensors are tensors for which, in some basis, the support has the following special structure. Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$. Let $e_1, \ldots, e_{n_i}$ be the standard basis of $\mathbb{F}^{n_i}$. Write $t = \sum_{i,j,k} t_{ijk}\, e_i \otimes e_j \otimes e_k$. Let $[n_i] = \{1, 2, \ldots, n_i\}$. Let $\mathrm{supp}(t) = \{(i, j, k) : t_{ijk} \neq 0\} \subseteq [n_1] \times [n_2] \times [n_3]$ be the support of $t$ with respect to the standard basis. Let $[n_i]$ have the natural ordering $1 < 2 < \cdots < n_i$, and let $[n_1] \times [n_2] \times [n_3]$ have the product order, denoted by $\le$. That is, $x \le y$ if $x_i \le y_i$ holds for all $i \in [3]$. We call $\mathrm{supp}(t)$ oblique if $\mathrm{supp}(t)$ is an antichain with respect to $\le$, i.e. if any two elements in $\mathrm{supp}(t)$ are incomparable with respect to $\le$. We call a tensor $t$ oblique if $\mathrm{supp}(g \cdot t)$ is oblique for some group element $g \in G(t) = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$. The family of oblique tensors is a semiring under $\oplus$ and $\otimes$.

Not all tensors are oblique. Obliqueness is not a generic property (see Proposition 6.21). However, many tensors that are of interest in algebraic complexity theory are oblique, notably the matrix multiplication tensors

\[ \langle a, b, c \rangle = \sum_{i \in [a]} \sum_{j \in [b]} \sum_{k \in [c]} e_{ij} \otimes e_{jk} \otimes e_{ki} \in \mathbb{F}^{ab} \otimes \mathbb{F}^{bc} \otimes \mathbb{F}^{ca}. \]

For any finite set $X$ let $\mathcal{P}(X)$ be the set of all probability distributions on $X$. For any probability distribution $P \in \mathcal{P}(X)$ the Shannon entropy of $P$ is defined as $H(P) = -\sum_{x \in X} P(x) \log_2 P(x)$, with $0 \log_2 0$ understood as 0. Given finite sets $X_1, \ldots, X_k$ and a probability distribution $P \in \mathcal{P}(X_1 \times \cdots \times X_k)$ on the product set $X_1 \times \cdots \times X_k$, we denote the marginal distribution of $P$ on $X_i$ by $P_i$, that is, $P_i(a) = \sum_{x : x_i = a} P(x)$ for any $a \in X_i$.

Definition 4.7. Let $\theta \in \Theta = \mathcal{P}([3])$. For $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3} \setminus \{0\}$ with $\mathrm{supp}(t)$ oblique, define

\[ \zeta^\theta(t) = \max\{2^{\sum_{i=1}^3 \theta(i) H(P_i)} : P \in \mathcal{P}(\mathrm{supp}(t))\}. \]

We call the $\zeta^\theta$ for $\theta \in \Theta$ the support functionals.

Theorem 4.8. $\zeta^\theta \in X(\{\text{oblique}\})$ for $\theta \in \Theta$.
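To see the quantity in Definition 4.7 in action, the sketch below evaluates $2^{\sum_i \theta(i) H(P_i)}$ for one particular distribution $P$: the uniform distribution on the set of index triples $(ij, jk, ki)$ of $\langle 2, 2, 2 \rangle$. Evaluating at a single $P$ gives a lower bound on the maximum in the definition; the choice of $P$, the weights $\theta$ and the function names are illustrative assumptions.

```python
import math
from itertools import product

def entropy(dist):
    """Shannon entropy in bits, with 0 * log2(0) treated as 0."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def marginal(P, i):
    """Marginal distribution of P on the i-th coordinate."""
    out = {}
    for x, p in P.items():
        out[x[i]] = out.get(x[i], 0.0) + p
    return out

def objective(P, theta):
    """2 ** sum_i theta(i) * H(P_i), the quantity maximised in the definition."""
    return 2 ** sum(theta[i] * entropy(marginal(P, i)) for i in range(3))

# Index triples (ij, jk, ki) of the 2x2 matrix multiplication tensor
support = [((i, j), (j, k), (k, i)) for i, j, k in product(range(2), repeat=3)]
uniform = {x: 1 / len(support) for x in support}

print(objective(uniform, (0.5, 0.25, 0.25)))
```

Each marginal of the uniform distribution is uniform on four values, so every $H(P_i)$ equals 2 bits and the objective is 4 for any $\theta$; since no marginal on four values can have entropy above 2, this $P$ already attains the maximum over distributions on this set.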


We work towards the proof of Theorem 4.8. For $p \in [0, 1]$ let $h(p)$ be the binary entropy function $h(p) = -p \log_2 p - (1 - p) \log_2 (1 - p)$, i.e. $h(p)$ is the Shannon entropy of the probability vector $(p, 1 - p)$. The following properties of the Shannon entropy are well-known.

Lemma 4.9.

(i) $H(P \otimes Q) = H(P) + H(Q)$ for $P \in \mathcal{P}(X_1)$, $Q \in \mathcal{P}(X_2)$.

(ii) $H(P) \le H(P_1) + H(P_2)$ for $P \in \mathcal{P}(X_1 \times X_2)$.

(iii) $H(pP \oplus (1 - p)Q) = pH(P) + (1 - p)H(Q) + h(p)$ for $P, Q \in \mathcal{P}(X)$, $p \in [0, 1]$.

(iv) $2^a + 2^b = \max_{0 \le p \le 1} 2^{pa + (1 - p)b + h(p)}$ for $a, b \in \mathbb{R}$.
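Property (iv) is the identity that turns the entropy formula for direct sums into additivity, and it is easy to check numerically. The sketch below compares $2^a + 2^b$ with a grid maximum of the right-hand side and with its value at $p^* = 2^a / (2^a + 2^b)$, which is where the maximum is attained (a short calculus exercise, stated here as an aid and not taken from the text). Function names and the sample values of $a$, $b$ are arbitrary.

```python
import math

def h(p):
    """Binary entropy in bits, with h(0) = h(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def rhs(a, b, p):
    """The expression 2 ** (p*a + (1-p)*b + h(p)) from property (iv)."""
    return 2 ** (p * a + (1 - p) * b + h(p))

def check(a, b, steps=1000):
    """Return (2^a + 2^b, grid maximum of rhs, rhs at the analytic maximiser)."""
    grid_max = max(rhs(a, b, i / steps) for i in range(steps + 1))
    p_star = 2 ** a / (2 ** a + 2 ** b)
    return 2 ** a + 2 ** b, grid_max, rhs(a, b, p_star)

print(check(1.5, -0.7))
```

The three returned numbers agree up to the grid resolution, with the grid maximum never exceeding the left-hand side.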

For $X \subseteq [n_1] \times [n_2] \times [n_3]$ let $X_\le = \{y \in [n_1] \times [n_2] \times [n_3] : \exists x \in X,\ y \le x\}$ be the downward closure of $X$. Let $\max(X) = \{y \in X : \forall x \in X,\ y \le x \Rightarrow y = x\}$ be the maximal points of $X$ with respect to $\le$. Let $S_n$ be the symmetric group of permutations of $[n]$. Then the product group $S_{n_1} \times S_{n_2} \times S_{n_3}$ acts naturally on $[n_1] \times [n_2] \times [n_3]$.

Lemma 4.10. Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$. For every $g \in G(t)$ there is a triple of permutations $w \in W(t) = S_{n_1} \times S_{n_2} \times S_{n_3}$ with $w \cdot \max(\mathrm{supp}(g \cdot t)) \subseteq \mathrm{supp}(t)_\le$.

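Proof. We prepare for the construction of $w$. Let $n \in \mathbb{N}$. Let $e_1, \ldots, e_n$ be the standard basis of $\mathbb{F}^n$. Let $g \in \mathrm{GL}_n$. Let $f_1, \ldots, f_n$ with $f_j = g \cdot e_j$ be the transformed basis of $\mathbb{F}^n$. Let $(E_i)_{i \in [n]}$ and $(F_j)_{j \in [n]}$ be the complete flags of $\mathbb{F}^n$ with

\[ E_i = \mathrm{Span}\{e_i, e_{i+1}, \ldots, e_n\}, \qquad F_j = \mathrm{Span}\{f_j, f_{j+1}, \ldots, f_n\}. \]

Define the map

\[ \pi : [n] \to [n] : j \mapsto \max\{i \in [n] : E_i \cap (f_j + F_{j+1}) \neq \emptyset\}. \tag{4.3} \]

We prove $\pi$ is injective. Let $j, k \in [n]$ with $j \le k$ and suppose $i = \pi(j) = \pi(k)$. Let $\mathbb{F}^\times = \mathbb{F} \setminus \{0\}$. From (4.3) follows

\[ (\mathbb{F}^\times e_i + E_{i+1}) \cap (f_j + F_{j+1}) \neq \emptyset \tag{4.4} \]
\[ E_{i+1} \cap (f_j + F_{j+1}) = \emptyset \tag{4.5} \]
\[ (\mathbb{F}^\times e_i + E_{i+1}) \cap (f_k + F_{k+1}) \neq \emptyset. \tag{4.6} \]

Suppose $j < k$. Then from (4.4) and (4.6) we obtain a contradiction to (4.5). We conclude that $j = k$. Thus $\pi$ is injective.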

For each $\mathbb{F}^{n_i}$ define as above the standard complete flag $(E^i_j)_{j \in [n_i]}$ of $\mathbb{F}^{n_i}$, the complete flag $(F^i_j)_{j \in [n_i]}$ corresponding to the basis given by $g_i$, and the permutation $\pi_i : [n_i] \to [n_i]$. Let $w = (\pi_1, \pi_2, \pi_3) \in W(t)$.

We will prove $w \cdot \max(\mathrm{supp}(g \cdot t)) \subseteq \mathrm{supp}(t)_\le$. Let $y \in \max(\mathrm{supp}(g \cdot t))$. Let $x = w \cdot y$. By construction of $\pi_i$, the intersection $E^i_{x_i} \cap (f^i_{y_i} + F^i_{y_i + 1})$ is not empty. Choose

\[ \tilde{f}^i_{y_i} \in E^i_{x_i} \cap (f^i_{y_i} + F^i_{y_i + 1}). \]

Let $t^*$ be the multilinear map $\mathbb{F}^{n_1} \times \mathbb{F}^{n_2} \times \mathbb{F}^{n_3} \to \mathbb{F}$ with $t^*(e_i, e_j, e_k) = t_{ijk}$ for all $i \in [n_1]$, $j \in [n_2]$, $k \in [n_3]$. Then

\[ t^*(\tilde{f}^1_{y_1}, \tilde{f}^2_{y_2}, \tilde{f}^3_{y_3}) = t^*(f^1_{y_1}, f^2_{y_2}, f^3_{y_3}) + \sum_{\substack{z \in [n_1] \times [n_2] \times [n_3] \\ z > y}} c_z\, t^*(f^1_{z_1}, f^2_{z_2}, f^3_{z_3}) \tag{4.7} \]

for some $c_z \in \mathbb{F}$. Since $y$ is maximal in $\mathrm{supp}(g \cdot t)$, the sum over $z > y$ in (4.7) equals zero. We conclude $t^*(\tilde{f}^1_{y_1}, \tilde{f}^2_{y_2}, \tilde{f}^3_{y_3}) = t^*(f^1_{y_1}, f^2_{y_2}, f^3_{y_3}) \neq 0$. Thus $t^*(E^1_{x_1} \times E^2_{x_2} \times E^3_{x_3})$ is not zero, and thus $x \in \mathrm{supp}(t)_\le$.

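Proof of Theorem 4.8. We prove $\zeta^\theta$ on oblique tensors is $\otimes$-multiplicative, $\oplus$-additive, $\le$-monotone and normalised to 1 on $\langle 1 \rangle = e_1 \otimes e_1 \otimes e_1$. The normalisation $\zeta^\theta(\langle 1 \rangle) = 1$ is clear.

We prove $\zeta^\theta$ is $\otimes$-supermultiplicative. Let $s \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$ and let $t \in \mathbb{F}^{m_1} \otimes \mathbb{F}^{m_2} \otimes \mathbb{F}^{m_3}$. Let $P \in \mathcal{P}(\mathrm{supp}(s))$ and $Q \in \mathcal{P}(\mathrm{supp}(t))$. Then the product $P \otimes Q \in \mathcal{P}(\mathrm{supp}(s \otimes t))$ has marginals $P_i \otimes Q_i$. Since $H(P_i \otimes Q_i) = H(P_i) + H(Q_i)$ (Lemma 4.9(i)), we conclude $\zeta^\theta(s)\zeta^\theta(t) \le \zeta^\theta(s \otimes t)$.

We prove $\zeta^\theta$ is $\otimes$-submultiplicative. For a distribution $P$ and $\theta \in \Theta$ we use the notation $H_\theta(P) = \sum_{i=1}^3 \theta(i) H(P_i)$. We naturally identify $\mathrm{supp}(s \otimes t)$ with a subset of $[n_1] \times [n_2] \times [n_3] \times [m_1] \times [m_2] \times [m_3]$. Let $P \in \mathcal{P}(\mathrm{supp}(s \otimes t))$. Let $P_{[3]}$ be the marginal distribution of $P$ on $[n_1] \times [n_2] \times [n_3]$ and let $P_{3 + [3]}$ be the marginal distribution of $P$ on $[m_1] \times [m_2] \times [m_3]$. Then $H_\theta(P) \le H_\theta(P_{[3]}) + H_\theta(P_{3 + [3]})$ by Lemma 4.9(ii). We conclude $\zeta^\theta(s \otimes t) \le \zeta^\theta(s)\zeta^\theta(t)$.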

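We prove $\zeta^\theta$ is $\oplus$-additive. By definition,
\begin{align*}
\zeta^\theta(s \oplus t) &= \max\bigl\{ 2^{H_\theta(P)} : P \in \mathcal{P}(\operatorname{supp}(s \oplus t)) \bigr\} \\
&= \max\bigl\{ \max_{0 \le p \le 1} 2^{H_\theta(pP \oplus (1-p)Q)} : P \in \mathcal{P}(\operatorname{supp}(s)),\ Q \in \mathcal{P}(\operatorname{supp}(t)) \bigr\}.
\end{align*}
From Lemma 4.9(iii) and (iv) follows
\begin{align*}
&\max\bigl\{ \max_{0 \le p \le 1} 2^{H_\theta(pP \oplus (1-p)Q)} : P \in \mathcal{P}(\operatorname{supp}(s)),\ Q \in \mathcal{P}(\operatorname{supp}(t)) \bigr\} \\
&\quad= \max\bigl\{ \max_{0 \le p \le 1} 2^{pH_\theta(P) + (1-p)H_\theta(Q) + h(p)} : P \in \mathcal{P}(\operatorname{supp}(s)),\ Q \in \mathcal{P}(\operatorname{supp}(t)) \bigr\} \\
&\quad= \max\bigl\{ 2^{H_\theta(P)} + 2^{H_\theta(Q)} : P \in \mathcal{P}(\operatorname{supp}(s)),\ Q \in \mathcal{P}(\operatorname{supp}(t)) \bigr\} \\
&\quad= \zeta^\theta(s) + \zeta^\theta(t).
\end{align*}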

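We conclude $\zeta^\theta(s \oplus t) = \zeta^\theta(s) + \zeta^\theta(t)$.

We prove $\zeta^\theta$ is $\le$-monotone. Let $s \le t$ with $\operatorname{supp}(s)$ and $\operatorname{supp}(t)$ oblique. Then there are linear maps $A_i$ with $s = (A_1 \otimes A_2 \otimes A_3) \cdot t$. If $A_1, A_2, A_3$ are of the form $\operatorname{diag}(1, \ldots, 1, 0, \ldots, 0)$, then $\zeta^\theta(s) \le \zeta^\theta(t)$. Suppose $g = (A_1, A_2, A_3) \in G(t)$. Let $P \in \mathcal{P}(\operatorname{supp}(t))$ maximise $H_\theta$ on $\mathcal{P}(\operatorname{supp}(t))$. Let $\sigma \in W$ be such that $\sigma \cdot P$ has non-increasing marginals. Then $H_\theta(\sigma \cdot P) = H_\theta(P)$ and $\sigma \cdot P$ maximises $H_\theta$ on $\mathcal{P}(\operatorname{supp}(\sigma \cdot t))$. Then $\sigma \cdot P$ maximises $H_\theta$ on $\mathcal{P}(\operatorname{supp}(\sigma \cdot t)_{\le})$ by Lemma 4.12 below. Let $Q \in \mathcal{P}(\operatorname{supp}(g \cdot t))$ maximise $H_\theta$ on $\mathcal{P}(\operatorname{supp}(g \cdot t))$. By Lemma 4.10 there is a $w \in W$ with $w \cdot \operatorname{supp}(g \cdot t) \subseteq \operatorname{supp}(\sigma \cdot t)_{\le}$. Then $H_\theta(w \cdot Q) = H_\theta(Q) \le H_\theta(\sigma \cdot P) = H_\theta(P)$. Thus $\max_{P \in \mathcal{P}(\operatorname{supp}(g \cdot t))} H_\theta(P) \le \max_{P \in \mathcal{P}(\operatorname{supp}(t))} H_\theta(P)$. We conclude $\zeta^\theta(g \cdot t) \le \zeta^\theta(t)$.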

The following two lemmas finish the above proof of Theorem 4.8. Recall that in the proof we defined $H_\theta(P) = \sum_{i=1}^3 \theta(i) H(P_i)$ for $\theta \in \Theta$.

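Lemma 4.11 ([Str91, Prop. 2.1]). Let $\Phi \subseteq [n_1] \times [n_2] \times [n_3]$. Let $P \in \mathcal{P}(\Phi)$. Let $\operatorname{supp}(P)$ be the support $\{x \in \Phi : P(x) \neq 0\}$. For $x \in \Phi$ define $h_P(x) = -\sum_{i=1}^3 \theta(i) \log_2 P_i(x_i)$. Then $P$ maximises $H_\theta$ on $\mathcal{P}(\Phi)$ if and only if
\[ \forall x \in \operatorname{supp}(P): \quad h_P(x) = \max_{y \in \Phi} h_P(y). \tag{4.8} \]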

Proof. We write $H_\theta(P)$ in terms of $h_P$:
\[ H_\theta(P) = \sum_{i=1}^3 \theta(i) H(P_i) = \sum_{x \in \operatorname{supp}(P)} P(x)\, h_P(x). \tag{4.9} \]

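For $Q \in \mathcal{P}(\Phi)$,
\begin{align*}
\lim_{\varepsilon \to 0^+} \frac{d}{d\varepsilon} H_\theta\bigl((1-\varepsilon)P + \varepsilon Q\bigr)
&= \lim_{\varepsilon \to 0^+} \frac{d}{d\varepsilon} \sum_x \bigl((1-\varepsilon)P(x) + \varepsilon Q(x)\bigr)\, h_{(1-\varepsilon)P + \varepsilon Q}(x) \\
&= \sum_x P(x) \Bigl( \sum_{i=1}^3 \theta(i)\, \frac{P_i(x_i) - Q_i(x_i)}{P_i(x_i) \ln(2)} \Bigr) + \sum_x \bigl(-P(x) + Q(x)\bigr)\, h_P(x) \\
&= \sum_x Q(x)\, h_P(x) - \sum_x P(x)\, h_P(x).
\end{align*}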

Therefore, since $H_\theta$ is continuous and concave, $P$ maximises $H_\theta$ if and only if
\[ \forall Q \in \mathcal{P}(\Phi): \quad \sum_x Q(x)\, h_P(x) - \sum_x P(x)\, h_P(x) \le 0. \tag{4.10} \]


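We will prove that (4.10) is equivalent to (4.8). Suppose $\sum_x Q(x) h_P(x) \le \sum_x P(x) h_P(x)$ for every $Q \in \mathcal{P}(\Phi)$. In particular, taking $Q$ to be the point distribution on $y$, we get $h_P(y) \le \sum_x P(x) h_P(x)$ for every $y \in \Phi$, so $\max_{y \in \Phi} h_P(y) \le \sum_x P(x) h_P(x)$. Since the right-hand side is an average of values $h_P(x)$, each of which is at most the maximum, we get $\max_{y \in \Phi} h_P(y) = \sum_x P(x) h_P(x)$. We conclude $\max_{y \in \Phi} h_P(y) = h_P(x)$ for every $x \in \operatorname{supp}(P)$.

Conversely, suppose $\max_{y \in \Phi} h_P(y) = h_P(x)$ for every $x \in \operatorname{supp}(P)$. Then $h_P(y) \le h_P(x)$ for every $Q \in \mathcal{P}(\Phi)$, $y \in \operatorname{supp}(Q)$, $x \in \operatorname{supp}(P)$. We conclude $\sum_x Q(x) h_P(x) \le \sum_x P(x) h_P(x)$.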

Lemma 4.12 ([Str91, Cor. 2.2]). Let $\Phi \subseteq [n_1] \times [n_2] \times [n_3]$. Let $P$ maximise $H_\theta$ on $\mathcal{P}(\Phi)$. Suppose $P_i$ is nonincreasing on $[n_i]$ for each $i \in [3]$. Then $P$ maximises $H_\theta$ on $\mathcal{P}(\Phi_{\le})$, where $\Phi_{\le}$ is the downward closure of $\Phi$ with respect to $\le$.

Proof. We know $P$ satisfies (4.8). We will prove $P$ satisfies (4.8) with $\Phi$ replaced by $\Phi_{\le}$. Then we are done by Lemma 4.11. Let $x \in \Phi_{\le}$. Then $x \le y$ for some $y \in \Phi$. Then $(P_1(x_1), P_2(x_2), P_3(x_3)) \ge (P_1(y_1), P_2(y_2), P_3(y_3))$, since each $P_i$ is nonincreasing. Then $h_P(x) \le h_P(y)$. We conclude $\max_{\Phi_{\le}} h_P \le \max_{\Phi} h_P$. On the other hand, $\Phi \subseteq \Phi_{\le}$. Therefore $\max_{\Phi} h_P \le \max_{\Phi_{\le}} h_P$.

Using the support functionals, Strassen managed to fully compute the asymptotic spectrum of several semirings generated by oblique tensors. We will see an example in Section 5.4.2.

4.5 Upper and lower support functionals $\zeta^\theta$, $\zeta_\theta$

In Section 4.4 we defined the support functionals $\zeta^\theta : \{\text{oblique}\} \to \mathbb{R}_{\ge 0}$ and proved that $\zeta^\theta \in X(\{\text{oblique}\})$. From the general theory of asymptotic spectra (Chapter 2) we know that $\zeta^\theta$ can be extended to an element of $X(\mathcal{T})$, that is, $\zeta^\theta$ is the restriction of some map $\phi : \{\text{tensors}\} \to \mathbb{R}_{\ge 0}$ in $X(\mathcal{T})$. However, the proof of that fact was non-constructive. In this short section we discuss a candidate extension proposed by Strassen, called the upper support functional. We also discuss a companion called the lower support functional.

For arbitrary $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$ the upper support functional and the lower support functional are defined as
\[ \zeta^\theta(t) = \min_{g \in G(t)} \max\bigl\{ 2^{H_\theta(P)} : P \in \mathcal{P}(\operatorname{supp}(g \cdot t)) \bigr\}, \]
\[ \zeta_\theta(t) = \max_{g \in G(t)} \max\bigl\{ 2^{H_\theta(P)} : P \in \mathcal{P}(\max(\operatorname{supp}(g \cdot t))) \bigr\}, \]
with $G(t) = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$ and $H_\theta(P) = \sum_{i=1}^3 \theta(i) H(P_i)$. We summarise the known properties of the upper and lower support functional.

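Theorem 4.13 ([Str91]). Let $s \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$ and $t \in \mathbb{F}^{m_1} \otimes \mathbb{F}^{m_2} \otimes \mathbb{F}^{m_3}$. Let $\theta \in \Theta = \mathcal{P}([3])$.

(i) $\zeta^\theta(\langle n \rangle) = n$ for $n \in \mathbb{N}$.

(ii) $\zeta^\theta(s \oplus t) = \zeta^\theta(s) + \zeta^\theta(t)$.

(iii) $\zeta^\theta(s \otimes t) \le \zeta^\theta(s)\,\zeta^\theta(t)$.

(iv) If $s \ge t$, then $\zeta^\theta(s) \ge \zeta^\theta(t)$.

Theorem 4.14 ([Str91]). Let $s \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$ and $t \in \mathbb{F}^{m_1} \otimes \mathbb{F}^{m_2} \otimes \mathbb{F}^{m_3}$. Let $\theta \in \Theta$.

(i) $\zeta_\theta(\langle n \rangle) = n$ for $n \in \mathbb{N}$.

(ii) $\zeta_\theta(s \oplus t) \ge \zeta_\theta(s) + \zeta_\theta(t)$.

(iii) $\zeta_\theta(s \otimes t) \ge \zeta_\theta(s)\,\zeta_\theta(t)$.

(iv) If $s \ge t$, then $\zeta_\theta(s) \ge \zeta_\theta(t)$.

Theorem 4.15 ([Str91]). $\zeta^\theta(s \otimes t) \ge \zeta^\theta(s)\,\zeta_\theta(t)$ and $\zeta^\theta(t) \ge \zeta_\theta(t)$ for $\theta \in \Theta$.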

Regarding statement (ii) in Theorem 4.14, Bürgisser [Bur90] shows that the lower support functional $\zeta_\theta$ is not in general additive under the direct sum when $\theta_i > 0$ for all $i$. See also [Str91, Comment (iii)]. In particular, this implies that the upper support functional $\zeta^\theta(t)$ and the lower support functional $\zeta_\theta(t)$ are not equal in general, the upper support functional being additive. In fact, to show that the lower support functional is not additive, Bürgisser first shows that when $\mathbb{F}$ is algebraically closed, the generic value of $\zeta_\theta$ on $\mathbb{F}^n \otimes \mathbb{F}^n \otimes \mathbb{F}^n$ equals $(1 - \min_i \theta_i) \log_2 n + o(n)$. On the other hand, Tobler [Tob91] shows that the generic value of $\zeta^\theta$ on $\mathbb{F}^n \otimes \mathbb{F}^n \otimes \mathbb{F}^n$ equals $\log_2 n$. So even generically, $\zeta^\theta$ and $\zeta_\theta$ are different on $\mathbb{F}^n \otimes \mathbb{F}^n \otimes \mathbb{F}^n$.

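For $\theta \in \Theta$ we say $t$ is $\theta$-robust if $\zeta^\theta(t) = \zeta_\theta(t)$. We say $t$ is robust if $t$ is $\theta$-robust for all $\theta \in \Theta$. Let us try to understand what robust tensors look like. Since $\zeta^\theta(t) \ge \zeta_\theta(t)$ always holds (Theorem 4.15), a tensor $t$ is $\theta$-robust if and only if
\[ \zeta^\theta(t) \le \zeta_\theta(t). \tag{4.11} \]
The set of $\theta$-robust tensors is closed under $\oplus$ and $\otimes$, since
\[ \zeta^\theta(s \oplus t) = \zeta^\theta(s) + \zeta^\theta(t) = \zeta_\theta(s) + \zeta_\theta(t) \le \zeta_\theta(s \oplus t) \]
and
\[ \zeta^\theta(s \otimes t) \le \zeta^\theta(s)\,\zeta^\theta(t) = \zeta_\theta(s)\,\zeta_\theta(t) \le \zeta_\theta(s \otimes t). \]
For $X \subseteq [n_1] \times [n_2] \times [n_3]$ we use the notation $H_\theta(X) = \max_{P \in \mathcal{P}(X)} H_\theta(P)$. Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3} \setminus 0$. Equation (4.11) means that there are $g, h \in G(t)$ and $P \in \mathcal{P}(\max \operatorname{supp}(h \cdot t))$ such that $H_\theta(\operatorname{supp}(g \cdot t)) \le H_\theta(P)$. In this case we have $\zeta^\theta(t) = \zeta_\theta(t) = 2^{H_\theta(P)}$. In particular, $t$ is $\theta$-robust if there is a $g \in G(t)$ such that the maximisation $H_\theta(\operatorname{supp}(g \cdot t))$ is attained by a $P \in \mathcal{P}(\max(\operatorname{supp}(g \cdot t)))$. This criterion is automatically satisfied for all $\theta$ when $\operatorname{supp}(g \cdot t) = \max(\operatorname{supp}(g \cdot t))$ for some $g \in G(t)$. Suppose $t$ is oblique. Then $\operatorname{supp}(g \cdot t)$ is an antichain for some $g \in G(t)$, and thus $\operatorname{supp}(g \cdot t) = \max \operatorname{supp}(g \cdot t)$. Then $t$ is robust and $\zeta^\theta(t) = \zeta_\theta(t) = 2^{H_\theta(\operatorname{supp}(g \cdot t))}$.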

4.6 Asymptotic slice rank

Slice rank is a variation on tensor rank that was introduced by Terence Tao in [Tao16] to study cap sets. We will look at cap sets in Section 5.4. Here we study the relationship between asymptotic slice rank and the support functionals.

Consider the following characterisation of tensor rank. Let a simple tensor be any tensor of the form $v_1 \otimes v_2 \otimes v_3 \in V_1 \otimes V_2 \otimes V_3$ with $v_i \in V_i$ for $i \in [k]$, where $k = 3$. Then the rank $R(t)$ of $t \in V_1 \otimes V_2 \otimes V_3$ is the smallest number $r$ such that $t$ can be written as a sum of $r$ simple tensors.

Slice rank is defined similarly, but with simple tensors replaced by slices. For $S \subseteq [k]$ let $V_S = \bigotimes_{i \in S} V_i$. For $j \in [k]$ let $\bar{j} = [k] \setminus \{j\}$. A tensor in $V_1 \otimes V_2 \otimes V_3$ is called a slice if it is of the form $v \otimes w$ with $v \in V_j$ and $w \in V_{\bar{j}}$ for some $j \in [k]$ (under the natural reordering of the tensor legs). Let $t \in V_1 \otimes V_2 \otimes V_3$. The slice rank of $t$, denoted by $\mathrm{SR}(t)$, is the smallest number $r$ such that $t$ can be written as a sum of $r$ slices. For example, the tensor
\[ W = e_1 \otimes e_1 \otimes e_2 + e_1 \otimes e_2 \otimes e_1 + e_2 \otimes e_1 \otimes e_1 \in \mathbb{F}^2 \otimes \mathbb{F}^2 \otimes \mathbb{F}^2 \tag{4.12} \]
has slice rank 2, since we can write $W = e_1 \otimes (e_1 \otimes e_2 + e_2 \otimes e_1) + e_2 \otimes e_1 \otimes e_1$. In fact, the slice rank of any element in $V_1 \otimes V_2 \otimes V_3$ is at most $\min_i \dim V_i$. The tensor rank of $W$, on the other hand, is known to be 3.

Slice rank is clearly monotone under restriction. The slice rank of the diagonal tensor $\langle r \rangle$ equals $r$ [Tao16]. It follows that subrank is at most slice rank:
\[ Q(t) \le \mathrm{SR}(t). \]
The motivation for the introduction of slice rank in [Tao16] was finding upper bounds on subrank $Q(t)$ and asymptotic subrank $\widetilde{Q}(t)$.

The main result of this section is the following theorem. Recall that a tensor $t$ is oblique if the support $\operatorname{supp}(g \cdot t)$ is an antichain for some $g \in G(t)$.

Theorem 4.16. Let $t$ be oblique. Then
\[ \lim_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n} = \min_{\theta \in \mathcal{P}([3])} \zeta^\theta(t). \]

Our proof of Theorem 4.16 is based on a proof of Tao and Sawin in [TS16] and discussions of the author with Dion Gijswijt. The explicit connection between asymptotic slice rank and the support functionals is new.


We use Theorem 4.16, before giving its proof, to see that SR is not submultiplicative and not supermultiplicative under the tensor product $\otimes$. In particular, we cannot use Fekete's lemma (Lemma 2.2) to prove that the limit $\lim_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n}$ exists. Thus the existence of the limit is a non-trivial consequence of Theorem 4.16.

Let $W$ be as in (4.12). Then $\mathrm{SR}(W) = 2$. We have $\zeta^{(1/3, 1/3, 1/3)}(W) = 2^{h(1/3)} < 2$. From Theorem 4.16 follows $\mathrm{SR}(W^{\otimes n}) \le 2^{n h(1/3) + o(n)}$. We conclude $\mathrm{SR}(W^{\otimes n}) < 2^n$ for $n$ large enough. We conclude SR is not supermultiplicative. Now it is also clear that slice rank is not the same as (border) subrank, since (border) subrank is supermultiplicative.
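The value $2^{h(1/3)}$ quoted above can be checked numerically. The following is a minimal sketch, assuming the uniform distribution on $\operatorname{supp}(W)$ is the maximiser of $H_\theta$ for $\theta = (1/3, 1/3, 1/3)$ (plausible by symmetry and concavity, and consistent with the value stated in the text); the helper names `entropy`, `marginal` and `H_theta` are illustrative, not from the source.

```python
from math import log2

def entropy(dist):
    """Shannon entropy (base 2) of a dict mapping outcomes to probabilities."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

def marginal(P, i):
    """Marginal distribution of P (a dict on tuples) on coordinate i."""
    out = {}
    for x, p in P.items():
        out[x[i]] = out.get(x[i], 0.0) + p
    return out

def H_theta(P, theta):
    """Weighted marginal entropy H_theta(P) = sum_i theta(i) H(P_i)."""
    return sum(th * entropy(marginal(P, i)) for i, th in enumerate(theta))

# Support of W = e1 x e1 x e2 + e1 x e2 x e1 + e2 x e1 x e1 in the standard basis.
supp_W = [(1, 1, 2), (1, 2, 1), (2, 1, 1)]
P_uniform = {x: 1 / 3 for x in supp_W}   # assumed maximiser (see lead-in)
theta = (1 / 3, 1 / 3, 1 / 3)

value = 2 ** H_theta(P_uniform, theta)
h_third = -(1 / 3) * log2(1 / 3) - (2 / 3) * log2(2 / 3)   # binary entropy h(1/3)
print(value, 2 ** h_third)   # both are roughly 1.89, strictly below SR(W) = 2
```

Each marginal of the uniform distribution is $(2/3, 1/3)$, so every $H(P_i)$ equals $h(1/3)$ and the weights $\theta(i)$ play no role for this particular $P$.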

Next, the tensors $\sum_{i=1}^n e_i \otimes e_i \otimes 1$, $\sum_{i=1}^n e_i \otimes 1 \otimes e_i$, $\sum_{i=1}^n 1 \otimes e_i \otimes e_i$ have slice rank one, while their tensor product equals the matrix multiplication tensor $\langle n, n, n \rangle$, which has slice rank $n^2$ by Theorem 4.16 and Theorem 5.3 in the next chapter, applied to the tight tensor $\langle n, n, n \rangle$. We conclude SR is not submultiplicative.

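Slice rank and hitting set number

We study the hitting set number of the support of a tensor. Let $\Phi \subseteq [n_1] \times [n_2] \times [n_3]$. A hitting set for $\Phi$ is a 3-tuple of sets $A_1 \subseteq [n_1]$, $A_2 \subseteq [n_2]$, $A_3 \subseteq [n_3]$ such that for every $a \in \Phi$ there is an $i \in [3]$ with $a_i \in A_i$. We may think of $\Phi$ as a 3-partite 3-uniform hypergraph. Then the definition of hitting set says that every edge $a \in \Phi$ is hit by an element of some $A_i$. A hitting set is also called a vertex cover (every edge being covered by some vertex) or a transversal. The size of the hitting set $(A_1, A_2, A_3)$ is $|A_1| + |A_2| + |A_3|$. The hitting set number $\tau(\Phi)$ is the size of the smallest hitting set for $\Phi$. Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$.

Lemma 4.17. Let $g \in G(t) = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$. Then $\mathrm{SR}(t) \le \tau(\operatorname{supp}(g \cdot t))$.

Proof. This is clear: slice rank is invariant under $G(t)$, and given a hitting set $(A_1, A_2, A_3)$ for $\operatorname{supp}(g \cdot t)$, assigning each term of $g \cdot t$ to an index in some $A_j$ that hits it writes $g \cdot t$ as a sum of $|A_1| + |A_2| + |A_3|$ slices.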
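For small supports the hitting set number can be found by exhaustive search. The sketch below is a brute-force illustration (exponential time, only meant for toy examples); the function name `hitting_set_number` and the 0-based indexing are choices made here, not notation from the source.

```python
from itertools import combinations

def hitting_set_number(Phi, sizes):
    """Smallest |A_1| + ... + |A_k| such that every a in Phi has a_i in A_i for some i.

    Phi   : iterable of k-tuples with 0-based entries
    sizes : tuple (n_1, ..., n_k) giving the range of each coordinate
    """
    Phi = list(Phi)
    k = len(sizes)
    # A hitting set is a set of (coordinate, value) pairs, i.e. vertices of the hypergraph.
    vertices = [(i, v) for i, n in enumerate(sizes) for v in range(n)]
    for size in range(len(vertices) + 1):
        for cover in combinations(vertices, size):
            chosen = set(cover)
            if all(any((i, a[i]) in chosen for i in range(k)) for a in Phi):
                return size

# Support of the tensor W from (4.12), written with 0-based indices.
supp_W = [(0, 0, 1), (0, 1, 0), (1, 0, 0)]
tau_W = hitting_set_number(supp_W, (2, 2, 2))
print(tau_W)
```

For the support of $W$ no single vertex hits all three edges, while $A_1 = \{0\}$, $A_2 = \{0\}$ does, so the search returns 2, matching the bound $\mathrm{SR}(W) \le \tau(\operatorname{supp}(W))$ of Lemma 4.17.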

Lemma 4.18. Let $g \in G(t)$. Then $\mathrm{SR}(t) \ge \tau(\max(\operatorname{supp}(g \cdot t)))$.

Proof. It is sufficient to consider $g = e$. Let
\[ t = \sum_{i=1}^{r_1} v^1_i \otimes u^1_i + \sum_{i=1}^{r_2} v^2_i \otimes u^2_i + \sum_{i=1}^{r_3} v^3_i \otimes u^3_i \]
be a slice decomposition. We may assume $v^j_1, \ldots, v^j_{r_j}$ are linearly independent. Let $V_j = \operatorname{Span}\{v^j_1, \ldots, v^j_{r_j}\} \subseteq \mathbb{F}^{n_j}$. Let $W_j \subseteq (\mathbb{F}^{n_j})^*$ be the elements in the dual space that vanish on $V_j$. Let $B_j \subseteq W_j$ be a basis with the following property: with respect to the standard basis, the matrix with the elements of $B_j$ as columns is in reduced row echelon form, i.e. each column is of the form $(* \ \cdots \ * \ 1 \ 0 \ \cdots \ 0)^T$ and the pivot elements (the 1's) are all in different rows. Let $S_j \subseteq [n_j]$ be the indices of the pivot elements. Let $\bar{S}_j = [n_j] \setminus S_j$ be the complement. Then $|\bar{S}_j| = r_j$. We claim $(\bar{S}_1, \bar{S}_2, \bar{S}_3)$ is a hitting set for $\max(\operatorname{supp}(t))$.


Then $r_1 + r_2 + r_3 = |\bar{S}_1| + |\bar{S}_2| + |\bar{S}_3| \ge \tau(\max(\operatorname{supp}(t)))$. Let $x \in \max(\operatorname{supp}(t))$. Suppose $x \in S_1 \times S_2 \times S_3$. For every $j \in [3]$ let $\phi_j \in B_j$ have its pivot element at index $x_j$. Let $\phi = \phi_1 \otimes \phi_2 \otimes \phi_3$. Then $\phi \in W_1 \otimes W_2 \otimes W_3$, so $\phi(t) = 0$. Since $x$ is maximal and each $B_j$ is in reduced row echelon form,

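\begin{align*}
\phi(t) &= \sum_{y \le x} t_y\, \phi(e_{y_1} \otimes e_{y_2} \otimes e_{y_3}) \\
&= \sum_{y < x} t_y\, \phi(e_{y_1} \otimes e_{y_2} \otimes e_{y_3}) + t_x\, e_{x_1} \otimes e_{x_2} \otimes e_{x_3} \\
&= \sum_{y < x} s_y\, e_{y_1} \otimes e_{y_2} \otimes e_{y_3} + t_x\, e_{x_1} \otimes e_{x_2} \otimes e_{x_3}
\end{align*}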

for some $s_y \in \mathbb{F}$. From $\phi(t) = 0$ follows $t_x = 0$. This contradicts $x \in \operatorname{supp}(t)$, so $x \notin S_1 \times S_2 \times S_3$, i.e. there is a $j \in [3]$ with $x_j \in \bar{S}_j$.

Asymptotic hitting set number

We now study the asymptotic hitting set number $\widetilde{\tau}(\Phi) = \lim_{n \to \infty} \tau(\Phi^{\times n})^{1/n}$.

We will use some basic facts of types and type classes. Let $X$ be a finite set. Let $N \in \mathbb{N}$. An $N$-type on $X$ is a probability distribution $P$ on $X$ with $N \cdot P(x) \in \mathbb{N}$ for all $x \in X$. Let $P$ be an $N$-type on $X$. The type class $T^N_P \subseteq X^N$ is the set of sequences $s = (s_1, \ldots, s_N)$ with $x$ occurring $N \cdot P(x)$ times in $s$ for every $x \in X$, i.e. $|\{i \in [N] : s_i = x\}| = N \cdot P(x)$.

Lemma 4.19. The number of $N$-types on $X$ equals $\binom{N + |X| - 1}{|X| - 1}$. Let $P$ be an $N$-type. The size of the type class $T^N_P$ equals the multinomial coefficient $\binom{N}{NP}$.

Proof. We leave the proof to the reader.

Lemma 4.20. Let $P$ be an $N$-type on $X$. Then
\[ \frac{1}{(N+1)^{|X|}}\, 2^{N H(P)} \le \binom{N}{NP} \le 2^{N H(P)}. \]

Proof. See e.g. [CT12, Theorem 11.1.3].
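The counting statements of Lemma 4.19 and the bounds of Lemma 4.20 are easy to sanity-check for small parameters. The sketch below represents an $N$-type by its vector of counts $N \cdot P(x)$; the function names and the choice $N = 5$, $|X| = 3$ are illustrative assumptions, and the small factor `1e-9` only guards against floating-point rounding.

```python
from itertools import product
from math import comb, factorial, log2

def count_vectors(N, m):
    """All N-types on an alphabet of size m, as tuples of counts summing to N."""
    return [c for c in product(range(N + 1), repeat=m) if sum(c) == N]

def multinomial(counts):
    """Multinomial coefficient N! / (c_1! ... c_m!) with N = sum(counts)."""
    result = factorial(sum(counts))
    for c in counts:
        result //= factorial(c)
    return result

def entropy_of_counts(counts):
    """Shannon entropy (base 2) of the type P(x) = counts[x] / N."""
    N = sum(counts)
    return -sum((c / N) * log2(c / N) for c in counts if c > 0)

def type_class_size_brute_force(counts):
    """Count sequences in X^N in which symbol x occurs exactly counts[x] times."""
    N, m = sum(counts), len(counts)
    return sum(1 for s in product(range(m), repeat=N)
               if all(s.count(x) == counts[x] for x in range(m)))

N, m = 5, 3
types = count_vectors(N, m)

# Lemma 4.20: 2^{N H(P)} / (N+1)^{|X|} <= |T^N_P| <= 2^{N H(P)} for every N-type P.
bounds_hold = all(
    2 ** (N * entropy_of_counts(c)) / (N + 1) ** m <= multinomial(c) * (1 + 1e-9)
    and multinomial(c) <= 2 ** (N * entropy_of_counts(c)) * (1 + 1e-9)
    for c in types
)
print(len(types), comb(N + m - 1, m - 1), bounds_hold)
```

Since the type classes partition $X^N$, the multinomial coefficients over all types must also sum to $|X|^N$, which gives one more consistency check.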

Lemma 4.21. $\log_2 \widetilde{\tau}(\Phi) \le \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} H(P_i)$.

Proof. Let $P \in \mathcal{P}(\Phi)$ maximise $\min_i H(P_i)$. Let $n \in \mathbb{N}$. We construct a hitting set $(A_1, A_2, A_3)$ for $\Phi^{\times n}$ as follows. Let $x \in \Phi^{\times n}$. Viewing $x$ as an $n$-tuple of elements in $\Phi$, let $Q \in \mathcal{P}_n(\Phi)$ be the type of $x$ (i.e. the empirical distribution). Let $j \in [3]$ with $H(Q_j) = \min_{i \in [3]} H(Q_i)$. By our choice of $P$ we have
\[ H(Q_j) = \min_{i \in [3]} H(Q_i) \le \min_{i \in [3]} H(P_i). \]
Viewing $x$ as a 3-tuple $(x_1, x_2, x_3)$, add $x_j$ to $A_j$. We repeat this for all $x \in \Phi^{\times n}$. The final $(A_1, A_2, A_3)$ is a hitting set for $\Phi^{\times n}$ by construction. For each $j \in [3]$,
\[ |A_j| \le \sum_{Q_j} |T^n_{Q_j}| \le \sum_{Q_j} 2^{n H(Q_j)}, \]
where the sum is over $Q_j \in \mathcal{P}_n(\Phi_j)$ with $H(Q_j) \le \min_{i \in [3]} H(P_i)$. Then
\[ |A_j| \le |\mathcal{P}_n(\Phi_j)|\, 2^{n \min_i H(P_i)} = \operatorname{poly}(n)\, 2^{n \min_i H(P_i)}. \]
We conclude $|A_1| + |A_2| + |A_3| \le \operatorname{poly}(n)\, 2^{n \min_i H(P_i)}$.

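Lemma 4.22. $\log_2 \widetilde{\tau}(\Phi) \ge \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} H(P_i)$.

Proof. Let $P \in \mathcal{P}(\Phi)$ maximise $\min_i H(P_i)$. Let $n \in \mathbb{N}$. Let $(A_1, A_2, A_3)$ be a hitting set for $\Phi^{\times n}$. Let $Q \in \mathcal{P}_n(\Phi)$ be an $n$-type with $\min_i H(Q_i) = \min_i H(P_i) - o(1)$. Let $\Psi = T^n_Q \subseteq \Phi^{\times n}$ be the set of strings with type $Q$. Then $(A_1, A_2, A_3)$ is a hitting set for $\Psi$. Let $\pi_i : \Psi \to \Phi_i^{\times n} : (x_1, x_2, x_3) \mapsto x_i$ and write $\Psi_i = \pi_i(\Psi)$. Then
\[ \Psi = \pi_1^{-1}(A_1) \cup \pi_2^{-1}(A_2) \cup \pi_3^{-1}(A_3). \]
Let $j \in [3]$ with $|\pi_j^{-1}(A_j)| \ge \tfrac{1}{3}|\Psi|$. The fiber $\pi_j^{-1}(a)$ has constant size over $a \in \Psi_j$. Let $c_j = |\pi_j^{-1}(a)|$ be this size. Then
\[ |\Psi| = \sum_{a \in \Psi_j} |\pi_j^{-1}(a)| = \sum_{a \in \Psi_j} c_j = |\Psi_j|\, c_j \]
and
\[ |\pi_j^{-1}(A_j)| = \sum_{a \in A_j \cap \Psi_j} |\pi_j^{-1}(a)| = |A_j \cap \Psi_j|\, c_j \le |A_j|\, c_j. \]
Therefore
\[ |A_j| \ge \frac{|\pi_j^{-1}(A_j)|}{c_j} \ge \frac{\tfrac{1}{3}|\Psi|}{c_j} = \tfrac{1}{3}|\Psi_j|. \]
We have $|\Psi_j| \ge 2^{n H(Q_j) - o(n)} \ge 2^{n \min_i H(Q_i) - o(n)} \ge 2^{n \min_i H(P_i) - o(n)}$. We conclude $|A_1| + |A_2| + |A_3| \ge |A_j| \ge \tfrac{1}{3}|\Psi_j| \ge \tfrac{1}{3}\, 2^{n \min_i H(P_i) - o(n)}$.

Lemma 4.23. $\log_2 \widetilde{\tau}(\Phi) = \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} H(P_i)$.

Proof. This follows directly from the above lemmas.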


Asymptotic slice rank

We now combine the above lemmas about slice rank and the asymptotic hitting set number to prove Theorem 4.16. First we have the following basic lemma.

Lemma 4.24. $\min_{\theta \in \Theta} \max_{P \in \mathcal{P}(\Phi)} H_\theta(P) = \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} H(P_i)$.

Proof. Since $H_\theta(P)$ is convex in $\theta$ and concave in $P$, von Neumann's minimax theorem gives $\min_\theta \max_{P \in \mathcal{P}(\Phi)} H_\theta(P) = \max_{P \in \mathcal{P}(\Phi)} \min_\theta H_\theta(P)$. Finally, we use that $\min_\theta H_\theta(P) = \min_i H(P_i)$.

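Define $f^{\sim}(t) = \limsup_{n \to \infty} f(t^{\otimes n})^{1/n}$ and $f_{\sim}(t) = \liminf_{n \to \infty} f(t^{\otimes n})^{1/n}$.

Lemma 4.25. Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$. Then
\[ \max_{g \in G(t)} \max_{P \in \mathcal{P}(\max \operatorname{supp}(g \cdot t))} \min_i 2^{H(P_i)} \le \mathrm{SR}_{\sim}(t) \le \mathrm{SR}^{\sim}(t) \le \min_\theta \zeta^\theta(t). \]

Proof. By definition, $\mathrm{SR}_{\sim}(t) \le \mathrm{SR}^{\sim}(t)$. From Lemma 4.17 follows
\[ \mathrm{SR}^{\sim}(t) \le \widetilde{\tau}(\operatorname{supp}(g \cdot t)) \]
for any $g \in G(t)$. Lemma 4.23 gives $\widetilde{\tau}(\operatorname{supp}(g \cdot t)) = \max_{P \in \mathcal{P}(\operatorname{supp}(g \cdot t))} \min_i 2^{H(P_i)}$. Thus, with the help of Lemma 4.24,
\[ \mathrm{SR}^{\sim}(t) \le \min_{g \in G(t)} \max_{P \in \mathcal{P}(\operatorname{supp}(g \cdot t))} \min_i 2^{H(P_i)} = \min_\theta \zeta^\theta(t). \]
From Lemma 4.18 follows
\[ \widetilde{\tau}(\max(\operatorname{supp}(g \cdot t))) \le \mathrm{SR}_{\sim}(t) \]
for any $g \in G(t)$. Lemma 4.23 gives
\[ \max_{g \in G(t)} \max_{P \in \mathcal{P}(\max(\operatorname{supp}(g \cdot t)))} \min_i 2^{H(P_i)} \le \mathrm{SR}_{\sim}(t). \]
This proves the lemma.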

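Proof of Theorem 4.16. We may assume $\Phi = \operatorname{supp}(t)$ is oblique. Then, with the help of Lemma 4.24 and Lemma 4.25,
\begin{align*}
\min_{\theta \in \Theta} \zeta^\theta(t) &= \min_{\theta \in \Theta} \zeta_\theta(t) \\
&= \min_{\theta \in \Theta} \max_{P \in \mathcal{P}(\max(\Phi))} 2^{H_\theta(P)} \\
&= \max_{P \in \mathcal{P}(\max(\Phi))} \min_{i \in [3]} 2^{H(P_i)} \\
&\le \max_{g \in G(t)} \max_{P \in \mathcal{P}(\max(\operatorname{supp}(g \cdot t)))} \min_{i \in [3]} 2^{H(P_i)} \\
&\le \mathrm{SR}_{\sim}(t) \\
&\le \mathrm{SR}^{\sim}(t) \\
&\le \min_{\theta \in \Theta} \zeta^\theta(t).
\end{align*}
This proves the claim.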


4.7 Conclusion

The study of asymptotic rank of tensors is motivated by the open problem of finding the exponent of matrix multiplication. Asymptotic subrank has applications in, for example, combinatorics and algebraic property testing. Via the theory of asymptotic spectra, Strassen characterised asymptotic rank and asymptotic subrank in terms of the asymptotic spectrum of tensors. Strassen introduced the gauge points in $X(\mathcal{T})$ and the support functionals in $X(\{\text{oblique}\})$. More precisely, there are the lower support functionals and the upper support functionals. The lower support functionals are not additive and can thus not be universal spectral points. The upper support functionals may be universal spectral points, but this cannot be shown with the help of the lower support functionals. Finally, we showed that for oblique tensors the asymptotic slice rank exists and equals the minimum value over the support functionals. In the next chapter we will see a subfamily of the oblique 3-tensors for which the support functionals are powerful enough to compute the asymptotic subrank.

Chapter 5

Tight tensors and combinatorial subrank; cap sets

This chapter is based on joint work with Matthias Christandl and Peter Vrana [CVZ16, CVZ18].

5.1 Introduction

In the previous chapter we discussed the gauge points and the support functionals $\zeta^\theta$. The gauge points are in the asymptotic spectrum of all tensors, while the support functionals are in the asymptotic spectrum of oblique tensors.

How “powerful” are the support functionals? We know $\widetilde{Q}(t) \le \zeta^\theta(t) \le \widetilde{R}(t)$ for oblique $t$. Thus $\max_\theta \zeta^\theta(t) \le \widetilde{R}(t)$. In fact, $\max_\theta \zeta^\theta(t)$ is at most the maximum over the gauge points $\max_S \zeta_{(S)}(t)$, and in turn $\max_S \zeta_{(S)}(t)$ is at most $\widetilde{R}(t)$. As remarked earlier, it is not known whether $\max_S \zeta_{(S)}(t)$ equals $\widetilde{R}(t)$ in general.

On the other hand, we have $\widetilde{Q}(t) \le \min_\theta \zeta^\theta(t)$. Do we attain equality here in general, that is, $\widetilde{Q}(t) = \min_\theta \zeta^\theta(t)$? The answer is “yes” for the subsemiring of tight 3-tensors. In this chapter we study tight $k$-tensors.

Tight tensors

Let $I_1, \ldots, I_k$ be finite sets. Let $\Phi \subseteq I_1 \times \cdots \times I_k$. We say $\Phi$ is tight if there are injective maps $u_i : I_i \to \mathbb{Z}$ for $i \in [k]$ such that
\[ \forall \alpha \in \Phi: \quad u_1(\alpha_1) + \cdots + u_k(\alpha_k) = 0. \]
We say $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ is tight if there is a $g \in G(t) = \mathrm{GL}_{n_1} \times \cdots \times \mathrm{GL}_{n_k}$ such that the support $\operatorname{supp}(g \cdot t)$ is tight.

Recall that a tensor is oblique if the support is an antichain in some basis. Clearly, tight tensors are oblique. To summarise the families of tensors that we have defined up to now, we have
\[ \{\text{tight}\} \subseteq \{\text{oblique}\} \subseteq \{\text{robust}\} \subseteq \{\theta\text{-robust}\}. \]
Recall that the families of oblique, robust and $\theta$-robust tensors each form a semiring under $\otimes$ and $\oplus$. Tight tensors have the same property [Str91, Section 5]. Another property is that any subset of a tight set is tight.

Example 5.1. Let $k \ge 3$ be fixed. For any integer $n \ge 1$ and $c \in [n]$ the set
\[ \Phi_n(c) = \{\alpha \in \{0, \ldots, n-1\}^k : \alpha_1 + \cdots + \alpha_k = c\} \]
is tight. For any integer $n \ge 2$ and any $c \in [n]$ the set
\[ \Psi_n(c) = \{\alpha \in \{0, \ldots, n-1\}^k : \alpha_1 + \cdots + \alpha_k = c \bmod n\} \]
is not tight (cf. Exercise 15.20 in [BCS97]).
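The tightness of $\Phi_n(c)$ can be made concrete by exhibiting injective maps $u_i$. The sketch below checks one possible witness, $u_i(a) = a$ for $i < k$ and $u_k(a) = a - c$; this witness and the helper names are assumptions made here for illustration. The same maps fail on $\Psi_n(c)$, which illustrates (but of course does not prove) the non-tightness claim, since that claim concerns all possible choices of injective maps.

```python
from itertools import product

def is_tight_witness(Phi, maps):
    """Check that maps = (u_1, ..., u_k), given as dicts I_i -> Z, are injective
    and satisfy u_1(a_1) + ... + u_k(a_k) = 0 for every a in Phi."""
    injective = all(len(set(u.values())) == len(u) for u in maps)
    zero_sum = all(sum(u[x] for u, x in zip(maps, a)) == 0 for a in Phi)
    return injective and zero_sum

def Phi_n(n, c, k):
    """Tuples in {0..n-1}^k whose coordinates sum to c over the integers."""
    return [a for a in product(range(n), repeat=k) if sum(a) == c]

def Psi_n(n, c, k):
    """Tuples in {0..n-1}^k whose coordinates sum to c modulo n."""
    return [a for a in product(range(n), repeat=k) if sum(a) % n == c % n]

n, c, k = 4, 3, 3
# Hypothetical witness: identity on the first k-1 legs, shift by -c on the last leg.
maps = [{x: x for x in range(n)} for _ in range(k - 1)]
maps.append({x: x - c for x in range(n)})

print(is_tight_witness(Phi_n(n, c, k), maps))   # the witness works for Phi_n(c)
print(is_tight_witness(Psi_n(n, c, k), maps))   # the same maps fail for Psi_n(c)
```

For instance $(3, 3, 1) \in \Psi_4(3)$ has coordinate sum 7, so the witness evaluates to $7 - 3 = 4 \neq 0$ there.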

Example 5.2. When $\mathbb{F}$ contains a primitive $n$th root of unity $\zeta$, the tensor
\[ t_n = \sum_{\alpha \in \Psi_n(n-1)} e_{\alpha_1} \otimes \cdots \otimes e_{\alpha_k} \in (\mathbb{F}^n)^{\otimes k}, \]
which has support $\Psi_n(n-1)$, is tight. Namely, the elements $v_j = \sum_{i=1}^n \zeta^{ij} e_i$ for $j \in [n]$ form a basis of $\mathbb{F}^n$. Let $g \in G(t_n)$ be the corresponding basis transformation. Then we have $t_n = \sum_{j=1}^n v_j \otimes \cdots \otimes v_j$, and we see that the support $\operatorname{supp}(g \cdot t_n) = \{\alpha \in [n]^k : \alpha_1 = \cdots = \alpha_k\}$ is tight. (See also [BCS97, Exercise 15.25].) When the characteristic of $\mathbb{F}$ equals $n$, the tensor $t_n$ is also tight, as we will see in Section 5.4.2.

Combinatorial subrank and the Coppersmith–Winograd method

We care about tight tensors because of a remarkable theorem for tight 3-tensors of Strassen (Theorem 5.3 below). To understand the theorem we need the concept of combinatorial asymptotic subrank (cf. [Str91, Section 5]). We say $D \subseteq I_1 \times \cdots \times I_k$ is a diagonal when any two distinct $\alpha, \beta \in D$ are distinct in all $k$ coordinates. In other words, for elements in $D$, the value at one coordinate uniquely determines the value at the other $k - 1$ coordinates. Let $\Phi \subseteq I_1 \times \cdots \times I_k$. We say a diagonal $D \subseteq I_1 \times \cdots \times I_k$ is free for $\Phi$, or simply $D \subseteq \Phi$ is a free diagonal, if $D = \Phi \cap (D_1 \times \cdots \times D_k)$, where $D_i = \{x_i : (x_1, \ldots, x_k) \in D\}$. Define the (combinatorial) subrank $Q(\Phi)$ as the size of the largest free diagonal $D \subseteq \Phi$. For $\Phi \subseteq I_1 \times \cdots \times I_k$ and $\Psi \subseteq J_1 \times \cdots \times J_k$ we naturally define the product $\Phi \times \Psi \subseteq (I_1 \times J_1) \times \cdots \times (I_k \times J_k)$ by
\[ \Phi \times \Psi = \{((\alpha_1, \beta_1), \ldots, (\alpha_k, \beta_k)) : \alpha \in \Phi,\ \beta \in \Psi\}. \]
Define the (combinatorial) asymptotic subrank $\widetilde{Q}(\Phi) = \lim_{n \to \infty} Q(\Phi^{\times n})^{1/n}$. Let $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ and let $\Phi$ be the support of $t$ in the standard basis. Then $Q(\Phi) \le Q(t)$ and $\widetilde{Q}(\Phi) \le \widetilde{Q}(t)$. The number $Q(\Phi)$ may be interpreted as the largest number $n$ such that $\langle n \rangle$ can be obtained from $t$ using a restriction that consists of matrices that have at most one nonzero entry in each row and in each column. (This is called M-restriction in [Str87, Section 6], which stands for monomial restriction.) We may also interpret $\Phi$ as a $k$-partite hypergraph. Then $Q(\Phi)$ is the size of the largest induced $k$-partite matching in $\Phi$.
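The definitions of diagonal, free diagonal and combinatorial subrank can be turned into a direct exhaustive search for small sets. This is a brute-force sketch (exponential in $|\Phi|$, intended only for toy examples); the function names are illustrative and not taken from the source.

```python
from itertools import combinations

def is_diagonal(D):
    """Any two distinct elements of D differ in every coordinate."""
    k = len(D[0])
    return all(a[i] != b[i] for a, b in combinations(D, 2) for i in range(k))

def is_free(D, Phi):
    """D equals Phi intersected with D_1 x ... x D_k, where D_i is the i-th projection of D."""
    k = len(D[0])
    projections = [{a[i] for a in D} for i in range(k)]
    induced = {a for a in Phi if all(a[i] in projections[i] for i in range(k))}
    return induced == set(D)

def combinatorial_subrank(Phi):
    """Size of the largest free diagonal D contained in Phi (0 for the empty set)."""
    Phi = sorted(set(Phi))
    for size in range(len(Phi), 0, -1):
        for D in combinations(Phi, size):
            if is_diagonal(D) and is_free(D, Phi):
                return size
    return 0

supp_W = [(0, 0, 1), (0, 1, 0), (1, 0, 0)]          # support of W, 0-based
print(combinatorial_subrank(supp_W))
print(combinatorial_subrank([(0, 0, 0), (1, 1, 1)]))
print(combinatorial_subrank([(0, 0, 0), (1, 1, 1), (0, 1, 1)]))
```

In the third example $\{(0,0,0), (1,1,1)\}$ is a diagonal but not a free one, because $(0,1,1)$ also lies in $D_1 \times D_2 \times D_3$; this is the "induced matching" condition mentioned above.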

Let $\Phi \subseteq [n_1] \times \cdots \times [n_k]$ and let $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ be any tensor with support equal to $\Phi$. Then the (asymptotic) subranks of $\Phi$ and $t$ are related as follows:
\[ Q(\Phi) \le Q(t) \quad \text{and} \quad \widetilde{Q}(\Phi) \le \widetilde{Q}(t). \]
Strassen proved the following theorem using the method of Coppersmith and Winograd [CW90]. Recall that for $\Phi \subseteq I_1 \times I_2 \times I_3$ we let $\mathcal{P}(\Phi)$ be the set of probability distributions on $\Phi$. For $P \in \mathcal{P}(\Phi)$ let $P_1, P_2, P_3$ be the marginal distributions of $P$ on the 3 components of $I_1 \times I_2 \times I_3$.

Theorem 5.3 ([Str91, Lemma 5.1]). Let $\Phi \subseteq I_1 \times I_2 \times I_3$ be tight. Then
\[ \widetilde{Q}(\Phi) = \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} 2^{H(P_i)}. \tag{5.1} \]

The consequence of Theorem 5.3 is that the support functionals are sufficiently powerful to compute the asymptotic subrank of tight 3-tensors.

Corollary 5.4 ([Str91, Proposition 5.4]). Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$ be tight. Then
\[ \widetilde{Q}(t) = \min_{\theta \in \mathcal{P}([3])} \zeta^\theta(t). \]
Moreover, if $\Phi = \operatorname{supp}(g \cdot t)$ is tight for some $g \in G(t)$, then $\widetilde{Q}(t) = \widetilde{Q}(\Phi)$.

Remark 5.5. Strassen conjectured in [Str94, Conjecture 5.3] that for the family of tight 3-tensors, the support functionals give all spectral points in the asymptotic spectrum $X(\{\text{tight 3-tensors}\})$. In [Str91] numerous examples are given of subfamilies of tight 3-tensors for which this is the case.

Remark 5.6. Equation (5.1) becomes false when we let $\Phi \subseteq I_1 \times \cdots \times I_k$ with $k \ge 4$ and we let the right-hand side of the equation be $\max_{P \in \mathcal{P}(\Phi)} \min_i 2^{H(P_i)}$, see [CVZ16, Example 1138].

New results in this chapter

This chapter is an investigation of tight tensors, combinatorial asymptotic subrank, and applications. More precisely, this chapter contains the following new results.

Higher-order Coppersmith–Winograd method. In Section 5.2 we extend Theorem 5.3 to obtain a lower bound for $\widetilde{Q}(\Phi)$ for tight sets $\Phi \subseteq I_1 \times \cdots \times I_k$ with $k \ge 4$. Our lower bound is not known to be optimal in general. We compute examples for which the lower bound is optimal.

Combinatorial degeneration method. In Section 5.3 we further extend the range of application of the Coppersmith–Winograd method via a partial order on supports of tensors called combinatorial degeneration. We prove that if $\Phi \trianglelefteq \Psi$ (that is, $\Phi$ lies below $\Psi$ in this order), then $\widetilde{Q}(\Phi) \le \widetilde{Q}(\Psi)$. Suppose $\Psi$ is not tight but $\Phi$ is tight; then we may apply the (higher-order) Coppersmith–Winograd method to obtain a lower bound on $\widetilde{Q}(\Phi)$ and thus on $\widetilde{Q}(\Psi)$.

Cap sets. In Section 5.4 we relate the theory of asymptotic spectra, the Coppersmith–Winograd method and the combinatorial degeneration method to the problem of upper bounding the maximum size of cap sets in $\mathbb{F}_p^n$.

Graph tensors. Graph tensors are generalisations of the matrix multiplication tensor $\langle 2, 2, 2 \rangle$ parametrised by graphs. In Section 5.5 we discuss how one can apply the higher-order Coppersmith–Winograd method to obtain upper bounds on the asymptotic rank of complete graph tensors. We also briefly discuss the surgery method, which gives good upper bounds on the asymptotic rank of graph tensors for sparse graphs like cycle graphs.

5.2 Higher-order CW method

In this section we extend Theorem 5.3 to tight $\Phi \subseteq I_1 \times \cdots \times I_k$ with $k \ge 4$. We introduce some notation. Let $\mathcal{P}(\Phi)$ be the set of probability distributions on $\Phi$. For $P \in \mathcal{P}(\Phi)$ let $P_1, \ldots, P_k$ be the marginal distributions of $P$ on the $k$ components of $I_1 \times \cdots \times I_k$. Let $\mathcal{R}(\Phi)$ be the set of all subsets $R \subseteq \Phi^2$ such that $R \not\subseteq \{(x, x) : x \in \Phi\}$ and $R \subseteq \{(x, y) \in \Phi^2 : x_i = y_i \text{ for some } i \in [k]\}$. For $P \in \mathcal{P}(\Phi)$ and $R \in \mathcal{R}(\Phi)$ let $\mathcal{Q}(R, (P_1, \ldots, P_k))$ be the set of probability distributions $Q$ on $R$ whose marginal distributions on the $2k$ components of $R$ satisfy $Q_i = Q_{k+i} = P_i$ for $i \in [k]$.

Let $I_1, \ldots, I_k$ be finite subsets of $\mathbb{Z}$. The result of this section is a lower bound on the asymptotic subrank of any $\Phi \subseteq I_1 \times \cdots \times I_k$ satisfying $\forall a \in \Phi: \sum_{i=1}^k a_i = 0$. For $R \in \mathcal{R}(\Phi)$ let $r(R)$ be the rank over $\mathbb{Q}$ of the matrix with rows $\{x - y : (x, y) \in R\}$.

Theorem 5.7. Let $\Phi \subseteq \mathbb{Z}^k$ be a finite set with $\forall a \in \Phi: \sum_{i=1}^k a_i = 0$. Then
\[ \log_2 \widetilde{Q}(\Phi) \ge \max_{P} \min_{R, Q} \; H(P) - (k - 2)\, \frac{H(Q) - H(P)}{r(R)} \]
with $P \in \mathcal{P}(\Phi)$, $R \in \mathcal{R}(\Phi)$ and $Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))$.


5.2.1 Construction

We prepare for the proof of Theorem 5.7 by discussing some basic facts.

Average-free sets

Lemma 5.8. Let $k \in \mathbb{N}$. Let $M \in \mathbb{N}$. We say a subset $B \subseteq \mathbb{Z}/M\mathbb{Z}$ is $(k-1)$-average-free if
\[ \forall x_1, \ldots, x_k \in B: \quad x_1 + \cdots + x_{k-1} = (k-1)\,x_k \ \Rightarrow\ x_1 = \cdots = x_k. \]
There is a $(k-1)$-average-free set $B \subseteq \mathbb{Z}/M\mathbb{Z}$ of size $|B| = M^{1 - o(1)}$.

Proof. There is a set $A \subseteq \{1, \ldots, \lfloor \frac{M-1}{k-1} \rfloor\}$ of size $|A| = M^{1 - o(1)}$ with
\[ \forall x_1, \ldots, x_k \in A: \quad x_1 + \cdots + x_{k-1} = (k-1)\,x_k \ \Rightarrow\ x_1 = \cdots = x_k, \tag{5.2} \]
see [VC15, Lemma 10]. Let $B = \{a \bmod M : a \in A\} \subseteq \mathbb{Z}/M\mathbb{Z}$. Then $|B| = |A|$. Let $x_1, \ldots, x_k \in B$ with $x_1 + \cdots + x_{k-1} = (k-1)\,x_k$. View $x_1, \ldots, x_k$ as elements in $\{1, \ldots, \lfloor \frac{M-1}{k-1} \rfloor\}$. Both sides of the equation are then integers between $1$ and $M - 1$, so the congruence modulo $M$ is an equality of integers, that is, $x_1 + \cdots + x_{k-1} = (k-1)\,x_k$ still holds in $\mathbb{Z}$. From (5.2) follows $x_1 = \cdots = x_k$ in $\mathbb{Z}$ and hence also in $\mathbb{Z}/M\mathbb{Z}$.
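The role of the range restriction $\{1, \ldots, \lfloor \frac{M-1}{k-1} \rfloor\}$ in the proof can be seen on a small example. The sketch below is a brute-force checker for the $(k-1)$-average-free property modulo $M$; the function name and the example sets are illustrative assumptions (the example set contains 0, unlike the set $A$ in the proof, which does not matter for the illustration).

```python
from itertools import product

def is_average_free(B, M, k):
    """Check that x_1 + ... + x_{k-1} = (k-1) x_k (mod M) with all x_i in B
    forces x_1 = ... = x_k."""
    B = sorted(set(b % M for b in B))
    for xs in product(B, repeat=k):
        if sum(xs[:-1]) % M == ((k - 1) * xs[-1]) % M and len(set(xs)) > 1:
            return False
    return True

# {0, 1, 4} has no nontrivial solution of x_1 + x_2 = 2 x_3 over the integers ...
print(is_average_free({0, 1, 4}, 100, 3))   # no wrap-around for this large modulus
# ... but modulo 7 we get 0 + 1 = 2 * 4 (mod 7), a wrap-around solution.
print(is_average_free({0, 1, 4}, 7, 3))
```

With the modulus large compared to the elements, sums cannot wrap around and the integer property carries over; with $M = 7$ the element $4$ exceeds $\lfloor (M-1)/(k-1) \rfloor = 3$ and the property is lost.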

Linear combinations of uniform variables

Lemma 5.9. Let $M$ be a prime. Let $u_1, \ldots, u_n$ be independently uniformly distributed over $\mathbb{Z}/M\mathbb{Z}$. Let $v_1, \ldots, v_m$ be $(\mathbb{Z}/M\mathbb{Z})$-linear combinations of $u_1, \ldots, u_n$. Then the vector $v = (v_1, \ldots, v_m)$ is uniformly distributed over the range of $v$ in $(\mathbb{Z}/M\mathbb{Z})^m$.

Proof. Let $v_i = \sum_j c_{ij} u_j$ with $c_{ij} \in \mathbb{Z}/M\mathbb{Z}$. Then $v = Cu$ with $u = (u_1, \ldots, u_n)$ and $C$ the matrix with entries $C_{ij} = c_{ij}$. Let $y$ be in the image of $C$. Then the cardinality of the preimage $C^{-1}(y)$ equals the cardinality of the kernel of $C$. Indeed, if $Cx = y$, then $C^{-1}(y) = x + \ker(C)$. Since $u$ is uniform, we conclude that $v$ is uniform on the image of $C$.
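The argument in the proof (every nonempty fibre of $C$ is a coset of the kernel) can be observed by enumerating all inputs for a small prime. In this sketch the modulus $M = 5$ and the coefficient matrix `C` are arbitrary illustrative choices; the second row is twice the first, so the image is a proper subspace.

```python
from itertools import product
from collections import Counter

def image_counts(C, M):
    """For every u in (Z/MZ)^n, compute v = C u (mod M) and count how often each v occurs."""
    n = len(C[0])
    counts = Counter()
    for u in product(range(M), repeat=n):
        v = tuple(sum(c * x for c, x in zip(row, u)) % M for row in C)
        counts[v] += 1
    return counts

M = 5
C = [[1, 2, 0],
     [2, 4, 0],   # equals 2 * (first row), so C has rank 2 over Z/5Z
     [0, 1, 3]]
counts = image_counts(C, M)
print(len(counts), set(counts.values()))
```

Here the image has $5^2 = 25$ elements and each is hit by exactly $|\ker C| = 5$ of the $125$ inputs, so a uniform $u$ induces the uniform distribution on the image.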

Free diagonals

Lemma 5.10. Let $G$ be a graph with $n$ vertices and $m$ edges. Then $G$ has at least $n - m$ connected components.

Proof. A graph without edges has $n$ connected components. For every edge that we add to the graph, we lose at most one connected component.

Lemma 5.11. Let $I_1, \ldots, I_k$ be finite sets. Let $\Psi \subseteq I_1 \times \cdots \times I_k$. Let
\[ C = \{\{a, b\} \subseteq \Psi : a \neq b,\ \exists i \in [k]: a_i = b_i\}. \]
Then $Q(\Psi) \ge |\Psi| - |C|$. Obviously, the statement remains true if we replace $C$ by the larger set $\{(a, b) \in \Psi^2 : a \neq b,\ \exists i \in [k]: a_i = b_i\}$.

Proof. Let $G = (\Psi, C)$ be the graph with vertex set $\Psi$ and edge set $C$. Let $\Gamma \subseteq \Psi$ contain exactly one vertex per connected component of $G$. The vertices in $\Gamma$ are pairwise not adjacent. So $\Gamma$ is a diagonal. Of course, $\Gamma \subseteq \Psi \cap (\Gamma_1 \times \cdots \times \Gamma_k)$. Let $a \in \Psi \cap (\Gamma_1 \times \cdots \times \Gamma_k)$. Let $x_1, \ldots, x_k \in \Gamma$ with
\[ (x_1)_1 = a_1,\ (x_2)_2 = a_2,\ \ldots,\ (x_k)_k = a_k. \]
Then $x_1, \ldots, x_k$ are all adjacent or equal to $a$ in $G$, i.e. they are all in the same connected component. Then $x_1 = \cdots = x_k$, since $\Gamma$ contains precisely one vertex per connected component. So $a = x_1 = \cdots = x_k$. So $a \in \Gamma$. We conclude that $\Gamma \supseteq \Psi \cap (\Gamma_1 \times \cdots \times \Gamma_k)$. Finally, $|\Gamma| \ge |\Psi| - |C|$ by Lemma 5.10.
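The construction in the proof of Lemma 5.11 is algorithmic: build the graph on $\Psi$, pick one vertex per connected component, and the result is a free diagonal. A minimal sketch using a union-find structure follows; the function names and the example set `Psi` are hypothetical choices for illustration.

```python
from itertools import combinations

def one_vertex_per_component(Psi):
    """Return (gamma, num_edges): one representative per connected component of the
    graph on Psi whose edges join distinct tuples agreeing in some coordinate."""
    Psi = sorted(set(Psi))
    parent = {a: a for a in Psi}

    def find(a):
        # Union-find lookup with path halving.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    num_edges = 0
    for a, b in combinations(Psi, 2):
        if any(x == y for x, y in zip(a, b)):
            num_edges += 1
            parent[find(a)] = find(b)
    gamma = sorted({find(a) for a in Psi})
    return gamma, num_edges

def is_free_diagonal(D, Psi):
    """Check that D is a diagonal and that D = Psi intersected with D_1 x ... x D_k."""
    k = len(D[0])
    diagonal = all(a[i] != b[i] for a, b in combinations(D, 2) for i in range(k))
    projections = [{a[i] for a in D} for i in range(k)]
    induced = {a for a in Psi if all(a[i] in projections[i] for i in range(k))}
    return diagonal and induced == set(D)

Psi = [(0, 0, 0), (1, 1, 1), (2, 2, 0), (3, 3, 3)]
gamma, num_edges = one_vertex_per_component(Psi)
print(gamma, num_edges, is_free_diagonal(gamma, Psi))
```

In this example only $(0,0,0)$ and $(2,2,0)$ share a coordinate, so there is one edge and three components, and the bound $|\Gamma| \ge |\Psi| - |C|$ holds with equality.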

We now give the proof of Theorem 5.7. We repeat some notation from above. Let $k \ge 3$. Let $\Phi \subseteq \mathbb{Z}^k$ be a finite set. Let $\mathcal{P}(\Phi)$ be the set of probability distributions on $\Phi$. For $P \in \mathcal{P}(\Phi)$ let $P_1, \ldots, P_k$ be the marginal distributions of $P$ on the $k$ components of $\mathbb{Z}^k$. Let $\mathcal{R}(\Phi)$ be the set of all subsets $R \subseteq \Phi^2$ such that $R \not\subseteq \{(x, x) : x \in \Phi\}$ and $R \subseteq \{(x, y) \in \Phi^2 : x_i = y_i \text{ for some } i \in [k]\}$. For $P \in \mathcal{P}(\Phi)$ and $R \in \mathcal{R}(\Phi)$ let $\mathcal{Q}(R, (P_1, \ldots, P_k))$ be the set of probability distributions $Q$ on $R$ whose marginal distributions on the $2k$ components of $R$ satisfy $Q_i = Q_{k+i} = P_i$ for $i \in [k]$. For $R \in \mathcal{R}(\Phi)$ let $r(R)$ be the rank over $\mathbb{Q}$ of the matrix with rows
\[ \{x - y : (x, y) \in R\}. \]
For any prime $M$ let $r_M(R)$ be the rank over $\mathbb{Z}/M\mathbb{Z}$ of the same matrix.

Theorem (Theorem 5.7). Let $\Phi \subseteq \mathbb{Z}^k$ be a finite set with $\forall a \in \Phi: \sum_{i=1}^k a_i = 0$. Then
\[ \log_2 \widetilde{Q}(\Phi) \ge \max_{P} \min_{R, Q} \; H(P) - (k - 2)\, \frac{H(Q) - H(P)}{r(R)} \]
with $P \in \mathcal{P}(\Phi)$, $R \in \mathcal{R}(\Phi)$ and $Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))$.

Proof. Let $P$ be a rational probability distribution on $\Phi$, i.e. $\forall a \in \Phi: P(a) \in \mathbb{Q}$.

Choice of parameters

This proof involves a variable $N$ that we will let go to infinity and a prime number $M$ that depends on $N$. For the sake of rigor we first set the dependence of $M$ on $N$ and make sure that $N$ is large enough for $M$ to have good properties.

Let $n \in \mathbb{N}$ be such that $P$ is an $n$-type, i.e. $\forall a \in \Phi: nP(a) \in \mathbb{N}$. Let $N = tn$ be a multiple of $n$. Let
\[ f(N) = \log_2 \Bigl( 2^{|\Phi|^2} \max_{R \in \mathcal{R}(\Phi)} \binom{N + |R| - 1}{|R| - 1} \Bigr) \in o(N). \tag{5.3} \]
Let
\[ g(N) = |\Phi| \log_2(N + 1) \in o(N). \]
By Lemma 4.20,
\[ 2^{N H(P) - g(N)} \le \binom{N}{NP}. \tag{5.4} \]
Let
\[ \mu(N) = \max_{R, Q} \frac{H(Q) - H(P) + (1 + g(N) + f(N)) \frac{1}{N}}{r(R)} \tag{5.5} \]
with $R \in \mathcal{R}(\Phi)$ and $Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))$. Let $M$ be a prime with
\[ \lceil 2^{\mu(N) N} \rceil \le M \le 2 \lceil 2^{\mu(N) N} \rceil. \tag{5.6} \]
Such a prime exists by Bertrand's postulate, see e.g. [AZ14]. We can make $M$ arbitrarily large by choosing $N$ large enough. Choose $N = tn$ large enough such that
\[ M > k - 1, \tag{5.7} \]
\[ \forall R \in \mathcal{R}(\Phi): \quad r_M(R) = r(R). \tag{5.8} \]
We will later let $t$ and thus $N$ go to infinity.

Restrict to marginal type classes

The set $\Phi^{\otimes N}$ is a finite subset of $(\mathbb{Z}^N)^k$. Let $a \in \Phi^{\otimes N}$. Then we have that $a_i = ((a_i)_1, \ldots, (a_i)_N) \in \mathbb{Z}^N$ for $i \in [k]$. We restrict to those $a$ for which $a_i$ is in the type class $T^N_{P_i}$ for all $i \in [k]$. Thus let

\[ \Psi = \Phi^{\otimes N} \cap (T^N_{P_1} \times \cdots \times T^N_{P_k}). \]

We prove a lower bound on the size of $\Psi$. Let $(s_1, \ldots, s_N) \in T^N_P$. Then $s_j \in \Phi$ for $j \in [N]$ and $((s_1)_i, \ldots, (s_N)_i) \in T^N_{P_i}$ for $i \in [k]$. So

\[ \bigl( ((s_1)_1, \ldots, (s_N)_1), \ldots, ((s_1)_k, \ldots, (s_N)_k) \bigr) \in \Phi^{\otimes N} \cap (T^N_{P_1} \times \cdots \times T^N_{P_k}) = \Psi. \]

Thus $|\Psi| \geq |T^N_P|$. By Lemma 4.19, $|T^N_P| = \binom{N}{NP}$. By Lemma 4.20, $\binom{N}{NP} \geq 2^{NH(P) - g(N)}$. Therefore

\[ |\Psi| \geq 2^{NH(P) - g(N)}. \tag{5.9} \]
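As a small numerical illustration of the type-class bound used above, the following Python sketch computes the multinomial coefficient $\binom{N}{NP}$ for one hypothetical distribution and compares it with $2^{NH(P) - g(N)}$ and with $2^{NH(P)}$. The distribution `P`, the value `N = 8` and all function names are our own choices for illustration, not taken from the text.

```python
from fractions import Fraction
from math import factorial, log2


def type_class_size(N, P):
    """Multinomial coefficient N! / prod_a (N * P(a))!, i.e. the size of the type class of P."""
    counts = [N * p for p in P]
    assert all(c.denominator == 1 for c in counts), "P must be an N-type"
    size = factorial(N)
    for c in counts:
        size //= factorial(int(c))  # each intermediate quotient is an integer
    return size


def entropy(P):
    """Shannon entropy in bits."""
    return sum(float(p) * log2(1 / float(p)) for p in P if p > 0)


# Hypothetical example: a distribution on a 3-element set that is an 8-type.
P = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
N = 8
size = type_class_size(N, P)
g = len(P) * log2(N + 1)               # g(N) = |Phi| log2(N + 1)
lower = 2 ** (N * entropy(P) - g)      # lower bound of the form (5.4)
upper = 2 ** (N * entropy(P))          # standard upper bound on a type class
print(lower, size, upper)
```

The lower bound is far from tight for small $N$; only its exponential rate matters in the proof.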


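Hashing

Let $u_1, \ldots, u_{k-1}, v_1, \ldots, v_N \in \mathbb{Z}/M\mathbb{Z}$. For $i \in [k]$, let

\[ h_i : \mathbb{Z}^N \to \mathbb{Z}/M\mathbb{Z} : x \mapsto
\begin{cases}
u_i + \sum_{j=1}^N x_j v_j & \text{for } 1 \leq i \leq k-1, \\[4pt]
\frac{1}{k-1} \bigl( u_1 + \cdots + u_{k-1} - \sum_{j=1}^N x_j v_j \bigr) & \text{for } i = k.
\end{cases} \]

Note that $k-1$ is invertible in $\mathbb{Z}/M\mathbb{Z}$ by (5.7). Let $a \in \Psi$. Then $((a_1)_j, \ldots, (a_k)_j) \in \Phi$ for $j \in [N]$. So $\sum_{i=1}^k (a_i)_j = 0$ for every $j \in [N]$. Thus

\[ \sum_{i=1}^k \sum_{j=1}^N (a_i)_j v_j = \sum_{j=1}^N v_j \sum_{i=1}^k (a_i)_j = 0. \]

Therefore

\[ h_1(a_1) + \cdots + h_{k-1}(a_{k-1}) = (k-1)\, h_k(a_k). \]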
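The hash identity above can be checked numerically. The sketch below, written under our own assumptions (a small hypothetical set `phi` whose elements have coordinate sum zero, and the parameters `k = 4`, `N = 5`, `M = 7`), draws random $u_i, v_j$ and verifies $h_1(a_1) + \cdots + h_{k-1}(a_{k-1}) = (k-1) h_k(a_k)$ in $\mathbb{Z}/M\mathbb{Z}$.

```python
import random


def hashes(a, u, v, M):
    """Evaluate h_1(a_1), ..., h_k(a_k) modulo the prime M.

    a: list of k integer vectors of length N, u: k-1 offsets, v: N coefficients.
    """
    k = len(a)

    def dot(x):
        return sum(xj * vj for xj, vj in zip(x, v)) % M

    h = [(u[i] + dot(a[i])) % M for i in range(k - 1)]
    inv = pow(k - 1, -1, M)  # k-1 is invertible since M is a prime larger than k-1
    h.append(inv * (sum(u) - dot(a[k - 1])) % M)
    return h


random.seed(0)
k, N, M = 4, 5, 7
# Hypothetical Phi: every element has coordinates summing to zero.
phi = [(1, 1, -1, -1), (1, -1, 1, -1), (2, 0, -1, -1)]
columns = [random.choice(phi) for _ in range(N)]       # an element of Phi^N
a = [[col[i] for col in columns] for i in range(k)]     # a_i collects the i-th coordinates
u = [random.randrange(M) for _ in range(k - 1)]
v = [random.randrange(M) for _ in range(N)]

h = hashes(a, u, v, M)
lhs = sum(h[:k - 1]) % M
rhs = (k - 1) * h[k - 1] % M
print(lhs, rhs)
```

The check does not use the type-class restriction; the identity only needs the coordinate sums of the elements of $\Phi$ to vanish.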

Restrict to average-free set

Let $B \subseteq \mathbb{Z}/M\mathbb{Z}$ be a $(k-1)$-average-free set of size

\[ |B| \geq M^{1 - \kappa(M)} \quad \text{with } \kappa(M) \in o(1), \tag{5.10} \]

meaning

\[ \forall x_1, \ldots, x_k \in B: \quad x_1 + \cdots + x_{k-1} = (k-1)\, x_k \;\Rightarrow\; x_1 = \cdots = x_k \tag{5.11} \]

(Lemma 5.8). Let $\Psi' \subseteq \Psi$ be the subset

\[ \Psi' = \{a \in \Psi : \forall i \in [k] \; h_i(a_i) \in B\}. \]

Let $a \in \Psi'$. Then $a \in \Psi$, so

\[ h_1(a_1) + \cdots + h_{k-1}(a_{k-1}) = (k-1)\, h_k(a_k). \]

Since $h_i(a_i) \in B$ for every $i \in [k]$, (5.11) implies

\[ h_1(a_1) = \cdots = h_k(a_k). \]

Probabilistic method

Clearly $\mathrm{Q}(\Phi^{\otimes N}) \geq \mathrm{Q}(\Psi) \geq \mathrm{Q}(\Psi')$. Let

\[ C' = \{(a,b) \in \Psi'^2 : a \neq b, \; \exists i \in [k] \; a_i = b_i\}. \]

Let $X = |\Psi'|$ and $Y = |C'|$. By Lemma 5.11,

\[ \mathrm{Q}(\Psi') \geq X - Y. \]

Let $u_1, \ldots, u_{k-1}, v_1, \ldots, v_N$ be independent uniformly random variables over the field $\mathbb{Z}/M\mathbb{Z}$. Then $X$ and $Y$ are random variables. Then

\[ \mathrm{Q}(\Psi') \geq \mathbb{E}[X - Y] = \mathbb{E}[X] - \mathbb{E}[Y], \]

where the expectation is over $u_1, \ldots, u_{k-1}, v_1, \ldots, v_N$. We will prove

\[ \mathbb{E}[X] = |B|\,|\Psi|\, M^{-(k-1)}, \tag{5.12} \]

\[ \mathbb{E}[Y] \leq |B| \max_{R,Q} 2^{NH(Q) + f(N)} M^{-(k-1) - r(R)}, \tag{5.13} \]

with $f(N)$ as defined in (5.3), and $R \in \mathcal{R}(\Phi)$, $Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))$. Before proving (5.12) and (5.13) we derive the final bound.

Derivation of final bound

From (5.12) and (5.13) follows

\[ \mathbb{E}[X] - \mathbb{E}[Y] \geq |B|\,|\Psi|\, M^{-(k-1)} - |B| \max_{R,Q} 2^{NH(Q) + f(N)} M^{-(k-1) - r(R)}. \]

We factor out $|B|$, $|\Psi|$ and $M^{-(k-1)}$:

\[ \mathbb{E}[X] - \mathbb{E}[Y] \geq |B|\,|\Psi|\, M^{-(k-1)} \Bigl( 1 - \frac{1}{|\Psi|} \max_{R,Q} 2^{NH(Q) + f(N)} M^{-r(R)} \Bigr). \]

From our choice of $\mu(N)$ from (5.5),

\[ \mu(N) = \max_{R,Q} \frac{H(Q) - H(P) + (1 + g(N) + f(N))\frac{1}{N}}{r(R)}, \]

follows

\[ \max_{R,Q} 2^{N(H(Q) - H(P) - r(R)\mu(N)) + g(N) + f(N)} \leq \frac{1}{2}. \tag{5.14} \]

Apply $|B| \geq M^{1 - \kappa(M)}$ from (5.10) and $|\Psi| \geq 2^{NH(P) - g(N)}$ from (5.9) to get

\begin{align*}
\mathbb{E}[X] - \mathbb{E}[Y] &\geq M^{1 - \kappa(M)}\, 2^{NH(P) - g(N)}\, M^{-(k-1)} \cdot \Bigl( 1 - 2^{-NH(P) + g(N)} \max_{R,Q} 2^{NH(Q) + f(N)} M^{-r(R)} \Bigr) \\
&\geq M^{-(k-2+\kappa(M))}\, 2^{NH(P) - g(N)} \cdot \Bigl( 1 - \max_{R,Q} 2^{NH(Q) - NH(P) + g(N) + f(N)} M^{-r(R)} \Bigr).
\end{align*}

(Here we used (5.14) to see that the second factor is nonnegative.) Apply the upper bound $2^{\mu(N)N} \leq M \leq 2^{\mu(N)N + 2}$ from (5.6) to get

\begin{align*}
\mathbb{E}[X] - \mathbb{E}[Y] &\geq (2^{\mu(N)N + 2})^{-(k-2+\kappa(M))}\, 2^{NH(P) - g(N)} \cdot \Bigl( 1 - \max_{R,Q} 2^{NH(Q) - NH(P) + g(N) + f(N)} (2^{\mu(N)N})^{-r(R)} \Bigr) \\
&= 2^{N(H(P) - (k-2+\kappa(M))\mu(N)) - 2(k-2+\kappa(M)) - g(N)} \cdot \Bigl( 1 - \max_{R,Q} 2^{N(H(Q) - H(P) - r(R)\mu(N)) + g(N) + f(N)} \Bigr).
\end{align*}

Using (5.14) we get

\begin{align*}
\mathbb{E}[X] - \mathbb{E}[Y] &\geq 2^{N(H(P) - (k-2+\kappa(M))\mu(N)) - 2(k-2+\kappa(M)) - g(N)} \bigl( 1 - \tfrac{1}{2} \bigr) \\
&= 2^{N(H(P) - (k-2+\kappa(M))\mu(N)) - 2(k-2+\kappa(M)) - g(N) - 1}.
\end{align*}

Then

\begin{align*}
\frac{1}{N} \log_2 \mathrm{Q}(\Phi^{\otimes N}) &\geq \frac{1}{N} \log_2 (\mathbb{E}[X] - \mathbb{E}[Y]) \\
&\geq H(P) - (k-2+\kappa(M)) \max_{R,Q} \frac{H(Q) - H(P) + (1 + g(N) + f(N))\frac{1}{N}}{r(R)} - \frac{2(k-2+\kappa(M)) + g(N) + 1}{N}.
\end{align*}

We let $t$, and thus $N$, go to infinity and obtain

\[ \log_2 \tilde{\mathrm{Q}}(\Phi) \geq H(P) - (k-2) \max_{R,Q} \frac{H(Q) - H(P)}{r(R)}. \]

This lower bound holds for any rational probability distribution $P$ on $\Phi$, and by continuity for any real probability distribution $P$ on $\Phi$.

It remains to prove (5.12) and (5.13). We do this in the lemmas below.

Lemma 5.12. $\mathbb{E}[X] = |B|\,|\Psi|\, M^{-(k-1)}$.

Proof. Let $a \in \Psi$. Then $h_1(a_1) + \cdots + h_{k-1}(a_{k-1}) = (k-1)\, h_k(a_k)$. The following four statements are equivalent:

\[ a \in \Psi' \]
\[ \forall i \in [k] \; h_i(a_i) \in B \]
\[ \exists b \in B \; h_1(a_1) = \cdots = h_k(a_k) = b \]
\[ \exists b \in B \; h_1(a_1) = \cdots = h_{k-1}(a_{k-1}) = b. \]

Therefore

\[ \mathbb{P}[a \in \Psi'] = \sum_{b \in B} \mathbb{P}[h_1(a_1) = \cdots = h_{k-1}(a_{k-1}) = b]. \]

For $b \in B$,

\[ \mathbb{P}[h_1(a_1) = \cdots = h_{k-1}(a_{k-1}) = b] = (M^{-1})^{k-1}. \]

We conclude

\begin{align*}
\mathbb{E}[X] &= \sum_{a \in \Psi} \mathbb{P}[a \in \Psi'] \\
&= \sum_{a \in \Psi} \sum_{b \in B} \mathbb{P}[h_1(a_1) = \cdots = h_{k-1}(a_{k-1}) = b] \\
&= \sum_{a \in \Psi} \sum_{b \in B} (M^{-1})^{k-1} \\
&= |\Psi|\,|B|\, M^{-(k-1)}.
\end{align*}

This proves the lemma.

Lemma 5.13. $\mathbb{E}[Y] \leq |B| \max_{R,Q} 2^{NH(Q) + f(N)} M^{-(k-1) - r(R)}$.

Proof. Let

\[ C = \{(a,a') \in \Psi^2 : a \neq a', \; \exists i \in [k] \; a_i = a'_i\}. \]

Let $(a,a') \in C$. The following statements are equivalent:

\[ (a,a') \in C' \tag{5.15} \]
\[ a, a' \in \Psi' \tag{5.16} \]
\[ \forall i \in [k] \; h_i(a_i), h_i(a'_i) \in B \tag{5.17} \]
\[ \exists b \in B \; h_1(a_1) = \cdots = h_k(a_k) = h_1(a'_1) = \cdots = h_k(a'_k) = b. \tag{5.18} \]

Therefore

\begin{align*}
\mathbb{E}[Y] &= \sum_{(a,a') \in C} \mathbb{P}[(a,a') \in C'] \\
&= \sum_{(a,a') \in C} \sum_{b \in B} \mathbb{P}[h_1(a_1) = \cdots = h_k(a_k) = h_1(a'_1) = \cdots = h_k(a'_k) = b].
\end{align*}


Let $(a,a') \in C$. Then $h_i(a_i)$ and $h_i(a'_i)$ are $\mathbb{Z}/M\mathbb{Z}$-linear combinations of $u_1, \ldots, u_{k-1}, v_1, \ldots, v_N$. The random variable

\[ \bigl( h_1(a_1), \ldots, h_k(a_k), h_1(a'_1), \ldots, h_k(a'_k) \bigr) \]

is uniformly distributed over the image subspace $V \subseteq (\mathbb{Z}/M\mathbb{Z})^{2k}$. Let $b \in B$. Then $(b, \ldots, b) \in V$, since $u_1 = \cdots = u_{k-1} = b$, $v_1 = \cdots = v_N = 0$ is a valid assignment. Therefore

\[ \mathbb{P}[h_1(a_1) = \cdots = h_k(a_k) = h_1(a'_1) = \cdots = h_k(a'_k) = b] = |V|^{-1}. \]

And $|V|$ equals $M$ to the power the rank of the matrix

\[ \left( \begin{array}{ccccc|ccccc}
1 & 0 & \cdots & 0 & \frac{1}{k-1} & 1 & 0 & \cdots & 0 & \frac{1}{k-1} \\
0 & 1 & & 0 & \frac{1}{k-1} & 0 & 1 & & 0 & \frac{1}{k-1} \\
\vdots & & \ddots & & \vdots & \vdots & & \ddots & & \vdots \\
0 & 0 & \cdots & 1 & \frac{1}{k-1} & 0 & 0 & \cdots & 1 & \frac{1}{k-1} \\
a_1 & a_2 & \cdots & a_{k-1} & -\frac{a_k}{k-1} & a'_1 & a'_2 & \cdots & a'_{k-1} & -\frac{a'_k}{k-1}
\end{array} \right) \tag{5.19} \]

over $\mathbb{Z}/M\mathbb{Z}$, with $a_1, \ldots, a_k, a'_1, \ldots, a'_k$ thought of as column vectors in $(\mathbb{Z}/M\mathbb{Z})^N$. With column operations we transform (5.19) into

\[ \left( \begin{array}{ccccc|ccccc}
0 & 0 & \cdots & 0 & 0 & 1 & 0 & \cdots & 0 & 0 \\
0 & 0 & \cdots & 0 & 0 & 0 & 1 & & 0 & 0 \\
\vdots & & & & \vdots & \vdots & & \ddots & & \vdots \\
0 & 0 & \cdots & 0 & 0 & 0 & 0 & & 1 & 0 \\
a_1 - a'_1 & a_2 - a'_2 & \cdots & a_{k-1} - a'_{k-1} & a_k - a'_k & a'_1 & a'_2 & \cdots & a'_{k-1} & 0
\end{array} \right). \tag{5.20} \]

Matrix (5.20) has rank equal to $k-1$ plus $r_M(a,a') = \operatorname{rk}(A(a,a'))$, where

\[ A(a,a') = \begin{pmatrix} a_1 - a'_1 & a_2 - a'_2 & \cdots & a_k - a'_k \end{pmatrix}. \]

We obtain

\[ \mathbb{E}[Y] \leq \sum_{(a,a') \in C} \sum_{b \in B} M^{-(k-1+r_M(a,a'))}. \]

Since the summands are independent of $b$, we get

\[ \mathbb{E}[Y] \leq |B| \sum_{(a,a') \in C} M^{-(k-1+r_M(a,a'))}. \]

Let $(a,a') \in C$. Consider the rows of $A(a,a')$. The $N$ rows are of the form $x_i - y_i$ with $(x_i, y_i) \in \Phi^2$. Let $s = ((x_1,y_1), \ldots, (x_N,y_N))$. Let $R = \{(x_1,y_1), \ldots, (x_N,y_N)\}$. We have $r_M(a,a') = r_M(R)$, and $r_M(R) = r(R)$ by (5.8). Let $Q$ be the $N$-type with $\operatorname{supp}(Q) = R$ and $s \in T^N_Q$. From $a \neq a'$ follows $R \not\subseteq \{(x,x) : x \in \Phi\}$. From $\exists i \in [k] \; a_i = a'_i$ follows $\exists i \in [k] \; R \subseteq \{(x,y) : x_i = y_i\}$. From $a, a' \in T^N_{P_1} \times \cdots \times T^N_{P_k}$ follows $Q_i = Q_{k+i} = P_i$ for all $i \in [k]$. We thus have

\[ \mathbb{E}[Y] \leq |B| \sum_{R \in \mathcal{R}(\Phi)} \; \sum_{\substack{Q \in \mathcal{Q}(R,(P_1,\ldots,P_k)) : \\ \operatorname{supp}(Q) = R, \\ Q \text{ is } N\text{-type}}} \; \sum_{s \in T^N_Q} M^{-(k-1+r(R))}. \]

The number of $N$-types $Q$ with $\operatorname{supp}(Q) = R$ is at most the number of $N$-types on $R$, which is at most $\binom{N+|R|-1}{|R|-1}$ (Lemma 4.19). For any $Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))$, $|T^N_Q| \leq 2^{NH(Q)}$ (Lemma 4.19). Therefore

\[ \mathbb{E}[Y] \leq |B| \sum_{R \in \mathcal{R}(\Phi)} \binom{N+|R|-1}{|R|-1} \max_{Q \in \mathcal{Q}(R,(P_1,\ldots,P_k))} 2^{NH(Q)} M^{-(k-1+r(R))}. \]

Also $|\mathcal{R}(\Phi)| \leq 2^{|\Phi|^2}$. Therefore

\[ \mathbb{E}[Y] \leq |B| \, 2^{|\Phi|^2} \max_{R \in \mathcal{R}(\Phi)} \binom{N+|R|-1}{|R|-1} \max_{Q \in \mathcal{Q}(R,(P_1,\ldots,P_k))} 2^{NH(Q)} M^{-(k-1+r(R))}. \]

We conclude that

\[ \mathbb{E}[Y] \leq |B| \max_{R,Q} 2^{NH(Q) + f(N)} M^{-(k-1) - r(R)}. \]

This proves the lemma.

5.2.2 Computational remarks

The following two lemmas are helpful when applying Theorem 5.7. We leave the proof to the reader.

Lemma 5.14. Let $P \in \mathcal{P}(\Phi)$. Let $R, R' \in \mathcal{R}(\Phi)$ with $R \subseteq R'$ and $r(R) = r(R')$. Then

\[ \max_{Q \in \mathcal{Q}(R,(P_1,\ldots,P_k))} \frac{H(Q) - H(P)}{r(R)} \leq \max_{Q \in \mathcal{Q}(R',(P_1,\ldots,P_k))} \frac{H(Q) - H(P)}{r(R')}. \]

Lemma 5.15. Let $R \in \mathcal{R}(\Phi)$. There is an equivalence relation $R' \in \mathcal{R}(\Phi)$ with $R \subseteq R'$ and $r(R) = r(R')$.


5.2.3 Examples: type sets

We discuss some examples. The first example we will use to get good upper bounds on the asymptotic rank of complete graph tensors in Section 5.5. We focus on one family of examples that is parametrised by partitions. Let $\lambda \vdash k$ be an integer partition of $k$ with $d$ parts. Let

\[ \Phi_\lambda = \{a \in \{0, 1, \ldots, d-1\}^k : \operatorname{type}(a) = \lambda\}. \]

The set $\Phi_\lambda$ is tight.

Theorem 5.16. $\log_2 \tilde{\mathrm{Q}}(\Phi_{(2,2)}) = 1$.

Proof. Let $\Phi = \Phi_{(2,2)}$. Clearly $\tilde{\mathrm{Q}}(\Phi) \leq 2$. After relabelling, $\sum_{i=1}^k a_i = 0$ for all $a \in \Phi$. We may thus apply Theorem 5.7. Let $P$ be the uniform probability distribution on $\Phi$. Then $H(P) = \log_2 6$.

Let $R \in \mathcal{R}(\Phi)$. We may assume that

\[ R \subseteq \{(1,1,0,0), (1,0,1,0), (1,0,0,1)\}^2 \cup \{(0,0,1,1), (0,1,0,1), (0,1,1,0)\}^2. \]

We may assume $R$ is an equivalence relation (Lemma 5.15). Let $(x,y) \in R$. Let $R' = R \cup \{((1,1,1,1) - x, (1,1,1,1) - y)\}$. Then $R \subseteq R'$ and $R' \in \mathcal{R}(\Phi)$ and $r(R) = r(R')$. We may thus assume that if $(x,y) \in R$, then also $((1,1,1,1) - x, (1,1,1,1) - y) \in R$ (Lemma 5.14).

Let $S = \{(1,1,0,0), (1,0,1,0), (1,0,0,1)\}$. By the above observation it suffices to consider equivalence relations on $S$. There are three types of such equivalence relations.

Type (3): all three elements of $S$ are equivalent. Then $|R| = 18$ and $r(R) = 2$.

Type (2,1): two elements of $S$ are equivalent, and inequivalent to the third element (which is equivalent to itself). Then $|R| = 10$ and $r(R) = 1$.

Type (1,1,1): all elements of $S$ are inequivalent. Then $R \subseteq \{(x,x) : x \in \Phi\}$, which is a contradiction.

For type (3) and (2,1), the uniform probability distribution $Q$ on $R$ has marginals $Q_i = Q_{4+i} = P_i$ for $i \in [4]$. The uniform $Q$ is optimal. Then $H(Q) = \log_2 |R|$. Let $R_{(3)}$ and $R_{(2,1)}$ be equivalence relations of type (3) and (2,1). Then

\begin{align*}
\log_2 \tilde{\mathrm{Q}}(\Phi) &\geq \min \Bigl\{ H(P) - \frac{2}{r(R_{(3)})} \bigl( \log_2 |R_{(3)}| - H(P) \bigr), \; H(P) - \frac{2}{r(R_{(2,1)})} \bigl( \log_2 |R_{(2,1)}| - H(P) \bigr) \Bigr\} \\
&= \min \Bigl\{ \log_2 6 - \tfrac{2}{2} (\log_2 18 - \log_2 6), \; \log_2 6 - \tfrac{2}{1} (\log_2 10 - \log_2 6) \Bigr\} \\
&= \min \bigl\{ 1, \log_2 \tfrac{54}{25} \bigr\} = 1.
\end{align*}

This proves the theorem.
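The sizes, ranks and the final arithmetic in the proof of Theorem 5.16 can be reproduced mechanically. The following Python sketch builds the relations of type (3) and (2,1) on $S$, closes them under $x \mapsto (1,1,1,1) - x$, and evaluates the two candidate bounds with $k = 4$. The helper names (`rank_over_Q`, `relation`, `bound`) are ours, and the choice of which two elements of $S$ are equivalent in type (2,1) is arbitrary.

```python
from fractions import Fraction
from itertools import product
from math import log2


def rank_over_Q(rows):
    """Rank of an integer matrix over the rationals, by Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((r for r in range(rank, len(m)) if m[r][c] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][c] != 0:
                f = m[r][c] / m[rank][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank


S = [(1, 1, 0, 0), (1, 0, 1, 0), (1, 0, 0, 1)]


def comp(x):
    """The map x -> (1,1,1,1) - x."""
    return tuple(1 - xi for xi in x)


def relation(blocks):
    """Equivalence relation with the given blocks on S, closed under comp."""
    R = set()
    for block in blocks:
        for x, y in product(block, repeat=2):
            R.add((x, y))
            R.add((comp(x), comp(y)))
    return R


def r(R):
    """Rank over Q of the matrix with rows x - y for (x, y) in R."""
    return rank_over_Q([[xi - yi for xi, yi in zip(x, y)] for x, y in R])


k = 4
H_P = log2(6)  # P uniform on the six elements of Phi_(2,2)


def bound(R):
    """H(P) - (k-2) (H(Q) - H(P)) / r(R) for Q uniform on R."""
    return H_P - (k - 2) * (log2(len(R)) - H_P) / r(R)


R3 = relation([S])               # type (3)
R21 = relation([S[:2], S[2:]])   # type (2,1)
print(len(R3), r(R3), bound(R3))
print(len(R21), r(R21), bound(R21))
```

The minimum of the two printed bounds is the value that enters Theorem 5.7 for the uniform $P$.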


Theorem 517 log2 ˜Q(Φ(0kminus11)) = h(1k)

Proof We refer to [CVZ16]

With Srinivasan Arunachalam and Péter Vrana we have the following unpublished result.

Theorem 518 log2 ˜Q(Φ(0k21k2)) = 1

5.3 Combinatorial degeneration method

In this section we extend the (higher-order) Coppersmith–Winograd method via a preorder called combinatorial degeneration. Suppose $\Psi \subseteq I_1 \times \cdots \times I_k$ is not tight, but has a tight subset $\Phi \subseteq \Psi$. In the rest of this section we focus on obtaining a lower bound on $\tilde{\mathrm{Q}}(\Psi)$ via $\Phi$. This has an application in the context of tri-colored sum-free sets (Section 5.4.2), for example.

Definition 5.19 ([BCS97]). Let $\Phi \subseteq \Psi \subseteq I_1 \times \cdots \times I_k$. We say that $\Phi$ is a combinatorial degeneration of $\Psi$, and write $\Psi \trianglerighteq \Phi$, if there are maps $u_i : I_i \to \mathbb{Z}$ ($i \in [k]$) such that for all $\alpha \in I_1 \times \cdots \times I_k$: if $\alpha \in \Psi \setminus \Phi$, then $\sum_{i=1}^k u_i(\alpha_i) > 0$, and if $\alpha \in \Phi$, then $\sum_{i=1}^k u_i(\alpha_i) = 0$. Note that the maps $u_i$ need not be injective.

Combinatorial degeneration gets its name from the following standard proposition, see e.g. [BCS97, Proposition 15.30].

Proposition 5.20. Let $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$. Let $\Psi = \operatorname{supp}(t)$. Let $\Phi \subseteq \Psi$ such that $\Psi \trianglerighteq \Phi$. Then $t \trianglerighteq t|_\Phi$.

Proposition 5.20 brings us only slightly closer to our goal. Namely, given $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ with $\Psi = \operatorname{supp}(t)$, and given $\Phi \subseteq \Psi$ such that $\Psi \trianglerighteq \Phi$, it follows directly from Proposition 5.20 that $t \trianglerighteq t|_\Phi$, and thus $\tilde{\mathrm{Q}}(t) \geq \tilde{\mathrm{Q}}(t|_\Phi)$. This, however, does not give us a lower bound on the combinatorial asymptotic subrank $\tilde{\mathrm{Q}}(\Psi)$. The following theorem does. Our theorem extends a result in [KSS16].

Theorem 5.21. Let $\Phi \subseteq \Psi \subseteq I_1 \times \cdots \times I_k$. If $\Psi \trianglerighteq \Phi$, then

\[ \tilde{\mathrm{Q}}(\Psi) \geq \tilde{\mathrm{Q}}(\Phi). \]

Lemma 5.22. Let $\Phi \subseteq \Psi \subseteq I_1 \times \cdots \times I_k$. If $\Psi \trianglerighteq \Phi$, then $\tilde{\mathrm{Q}}(\Psi) \geq \mathrm{Q}(\Phi)$.

Proof. Pick maps $u_i : I_i \to \mathbb{Z}$ such that

\[ \sum_{i=1}^k u_i(\alpha_i) = 0 \quad \text{for } \alpha \in \Phi, \]
\[ \sum_{i=1}^k u_i(\alpha_i) > 0 \quad \text{for } \alpha \in \Psi \setminus \Phi. \]

Let $D$ be a free diagonal in $\Phi$ with $|D| = \mathrm{Q}(\Phi)$, and let

\[ w_i = \sum_{x \in D_i} u_i(x). \]

Let $n \in \mathbb{N}$ and define

\[ W_i = \Bigl\{ (x_1, \ldots, x_{n|D|}) \in I_i^{\times n|D|} : \sum_{j=1}^{n|D|} u_i(x_j) = n w_i \Bigr\}. \]

Then

\[ \Psi^{\times n|D|} \cap (W_1 \times \cdots \times W_k) = \Phi^{\times n|D|} \cap (W_1 \times \cdots \times W_k). \]

The inclusion $\supseteq$ is clear. To show $\subseteq$, let $(x_1, \ldots, x_k) \in \Psi^{\times n|D|} \cap (W_1 \times \cdots \times W_k)$. Write $x_i = (x_{i,1}, x_{i,2}, \ldots, x_{i,n|D|})$ and consider the $n|D| \times k$ matrix of evaluations

\[ \begin{pmatrix}
u_1(x_{1,1}) & u_2(x_{2,1}) & \cdots & u_k(x_{k,1}) \\
u_1(x_{1,2}) & u_2(x_{2,2}) & \cdots & u_k(x_{k,2}) \\
\vdots & \vdots & & \vdots \\
u_1(x_{1,n|D|}) & u_2(x_{2,n|D|}) & \cdots & u_k(x_{k,n|D|})
\end{pmatrix}. \]

The sum of the $i$th column is $n w_i$ by definition of $W_i$, and $\sum_{i=1}^k n w_i = 0$. The row sums are nonnegative by definition of the maps $u_1, \ldots, u_k$. We conclude that the row sums are zero. Therefore $(x_1, \ldots, x_k)$ is an element of $\Phi^{\times n|D|}$.

Since $D$ is a free diagonal in $\Phi$, $D^{\times n|D|}$ is a free diagonal in $\Phi^{\times n|D|}$, and also $D^{\times n|D|} \cap (W_1 \times \cdots \times W_k)$ is a free diagonal in $\Phi^{\times n|D|} \cap (W_1 \times \cdots \times W_k)$, which in turn is equal to $\Psi^{\times n|D|} \cap (W_1 \times \cdots \times W_k)$. Therefore $D^{\times n|D|} \cap (W_1 \times \cdots \times W_k)$ is also a free diagonal in $\Psi^{\times n|D|}$, i.e.

\[ \mathrm{Q}(\Psi^{\times n|D|}) \geq |D^{\times n|D|} \cap (W_1 \times \cdots \times W_k)|. \]

In the set $D^{\times n|D|}$, consider the strings with uniform type, i.e. where all $|D|$ elements of $D$ occur exactly $n$ times. These are clearly in $W_1 \times \cdots \times W_k$, and their number is $\binom{n|D|}{n, \ldots, n}$. Therefore

\[ \mathrm{Q}(\Psi^{\times n|D|}) \geq \binom{n|D|}{n, \ldots, n} = |D|^{n|D| - o(n)}, \]

which implies $\tilde{\mathrm{Q}}(\Psi) = \lim_{n \to \infty} \mathrm{Q}(\Psi^{\times n|D|})^{1/(n|D|)} \geq |D|$.

Proof of Theorem 5.21. We have $\tilde{\mathrm{Q}}(\Psi) = \lim_{n \to \infty} \tilde{\mathrm{Q}}(\Psi^{\times n})^{1/n}$. It follows from Lemma 5.22 that

\[ \lim_{n \to \infty} \tilde{\mathrm{Q}}(\Psi^{\times n})^{1/n} \geq \lim_{n \to \infty} \mathrm{Q}(\Phi^{\times n})^{1/n}. \]

The right-hand side is $\tilde{\mathrm{Q}}(\Phi)$.


5.4 Cap sets

A subset $A \subseteq (\mathbb{Z}/3\mathbb{Z})^n$ is called a cap set if any line in $A$ is a point, a line being a triple of points of the form $(u, u+v, u+2v)$. Until recently it was not known whether the maximal size of a cap set in $(\mathbb{Z}/3\mathbb{Z})^n$ grows like $3^{n-o(n)}$ or like $c^{n-o(n)}$ for some $c < 3$. Gijswijt and Ellenberg in [EG17], inspired by the work of Croot, Lev and Pach in [CLP17], settled this question, showing that $c \leq 3(207 + 33\sqrt{33})^{1/3}/8 \approx 2.755$. Tao realised in [Tao16] that the cap set question may naturally be phrased as the problem of computing the size of the largest main diagonal in powers of the "cap set tensor" $\sum_\alpha e_{\alpha_1} \otimes e_{\alpha_2} \otimes e_{\alpha_3}$, where the sum is over $\alpha_1, \alpha_2, \alpha_3 \in \mathbb{F}_3$ with $\alpha_1 + \alpha_2 + \alpha_3 = 0$. Here main diagonal refers to a subset $A$ of the basis elements such that restricting the cap set tensor to $A \times A \times A$ gives the tensor $\sum_{v \in A} v \otimes v \otimes v$. We show that the cap set tensor is in the $\mathrm{GL}_3(\mathbb{F}_3)^{\times 3}$ orbit of the "reduced polynomial multiplication tensor", which was studied in [Str91], and we show how recent results follow from this connection, using Theorem 5.21.

5.4.1 Reduced polynomial multiplication

Let $t_n$ be the tensor $\sum_\alpha e_{\alpha_1} \otimes e_{\alpha_2} \otimes e_{\alpha_3}$, where the sum is over $(\alpha_1, \alpha_2, \alpha_3)$ in $\{0, 1, \ldots, n-1\}^3$ such that $\alpha_1 + \alpha_2 = \alpha_3$. We call $t_n$ the reduced polynomial multiplication tensor, since $t_n$ is essentially the structure tensor of the algebra $\mathbb{F}[x]/(x^n)$ of univariate polynomials modulo the ideal generated by $x^n$. The support of $t_n$ equals

\[ \{(\alpha_1, \alpha_2, \alpha_3) \in \{0, \ldots, n-1\}^3 \mid \alpha_1 + \alpha_2 = \alpha_3\}, \]

which via $\alpha_3 \mapsto n - 1 - \alpha_3$ we may identify with the set

\[ \Phi_n = \{(\alpha_1, \alpha_2, \alpha_3) \in \{0, \ldots, n-1\}^3 \mid \alpha_1 + \alpha_2 + \alpha_3 = n - 1\}. \tag{5.21} \]

The support $\Phi_n$ is tight (cf. Example 5.1). Strassen proves in [Str91, Theorem 6.7], using Corollary 5.4, that $\tilde{\mathrm{Q}}(t_n) = \tilde{\mathrm{Q}}(\Phi_n) = z(n)$, where $z(n)$ is defined as

\[ z(n) = \frac{\gamma^n - 1}{\gamma - 1}\, \gamma^{-2(n-1)/3} \tag{5.22} \]

with $\gamma$ equal to the unique positive real solution of the equation $\frac{1}{\gamma - 1} - \frac{n}{\gamma^n - 1} = \frac{n-1}{3}$.

The following table contains values of $z(n)$ for small $n$. See also [Str91, Table 1].


n     z(n) rounded    z(n) exact
2     1.88988         $3/2^{2/3} = 2^{h(1/3)}$
3     2.75510         $3(207 + 33\sqrt{33})^{1/3}/8$
4     3.61072
5     4.46158
6     5.30973
7     6.15620
8     7.00155
9     7.84612
10    8.69012

In fact, [Str91, Theorem 6.7] says that the asymptotic spectrum of $t_n$ is completely determined by the support functionals, and that the possible values that the spectral points can take on $t_n$ form the closed interval $[z(n), n]$ (cf. Remark 2.21):

\[ \mathbf{X}(\mathbb{N}[t_n]) = \{\zeta^\theta|_{\mathbb{N}[t_n]} : \theta \in \mathcal{P}([3])\}, \qquad \{\phi(t_n) : \phi \in \mathbf{X}(\mathbb{N}[t_n])\} = [z(n), n]. \]
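The values of $z(n)$ in the table can be recomputed from (5.22). The following Python sketch solves the defining equation for $\gamma$ by bisection and then evaluates $z(n)$. The bracketing interval $[1.001, 10]$ is our own assumption, chosen because the left-hand side minus $(n-1)/3$ is positive near $\gamma = 1$ and negative at $\gamma = 10$ for the range $2 \leq n \leq 10$ considered here; the function names are ours.

```python
def gamma(n, iterations=200):
    """Positive solution of 1/(g-1) - n/(g^n - 1) = (n-1)/3, found by bisection."""

    def f(g):
        return 1 / (g - 1) - n / (g ** n - 1) - (n - 1) / 3

    lo, hi = 1.001, 10.0  # assumed bracket: f(lo) > 0 > f(hi) for 2 <= n <= 10
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def z(n):
    """z(n) = (g^n - 1)/(g - 1) * g^(-2(n-1)/3) with g = gamma(n), as in (5.22)."""
    g = gamma(n)
    return (g ** n - 1) / (g - 1) * g ** (-2 * (n - 1) / 3)


for n in range(2, 11):
    print(n, round(z(n), 5))
```

For $n = 2$ the equation reduces to $1/(\gamma + 1) = 1/3$, so $\gamma = 2$ and $z(2) = 3/2^{2/3}$, which gives a quick consistency check on the numerical routine.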

5.4.2 Cap sets

We turn to cap sets.

Definition 5.23. A three-term progression-free set is a set $A \subseteq (\mathbb{Z}/m\mathbb{Z})^n$ satisfying the following. For all $(x_1, x_2, x_3) \in A^{\times 3}$: there are $u, v \in (\mathbb{Z}/m\mathbb{Z})^n$ such that $(x_1, x_2, x_3) = (u, u+v, u+2v)$ if and only if $x_1 = x_2 = x_3$. Let $r_3((\mathbb{Z}/m\mathbb{Z})^n)$ be the size of the largest three-term progression-free set in $(\mathbb{Z}/m\mathbb{Z})^n$, and define the regularisation $\tilde{r}_3(\mathbb{Z}/m\mathbb{Z}) = \lim_{n \to \infty} r_3((\mathbb{Z}/m\mathbb{Z})^n)^{1/n}$.

A three-term progression-free set in $(\mathbb{Z}/3\mathbb{Z})^n$ is called a cap or cap set. We next discuss an asymmetric variation on three-term progression-free sets, called tri-colored sum-free sets, which are potentially larger. They are interesting since all known upper bound techniques for the size of three-term progression-free sets turn out to be upper bounds on the size of tri-colored sum-free sets.

Definition 5.24. Let $G$ be an abelian group. Let $\Gamma \subseteq G \times G \times G$. For $i \in [3]$ we define the marginal sets $\Gamma_i = \{x \in G : \exists \alpha \in \Gamma : \alpha_i = x\}$. We say $\Gamma$ is tri-colored sum-free if the following holds. The set $\Gamma$ is a diagonal, and for any $\alpha \in \Gamma_1 \times \Gamma_2 \times \Gamma_3$: $\alpha_1 + \alpha_2 + \alpha_3 = 0$ if and only if $\alpha \in \Gamma$. (Recall that $\Gamma \subseteq I_1 \times I_2 \times I_3$ is a diagonal when any two distinct $\alpha, \beta \in \Gamma$ are distinct in all coordinates.) Let $s_3(G)$ be the size of the largest tri-colored sum-free set in $G \times G \times G$, and define the regularisation

\[ \tilde{s}_3(G) = \lim_{n \to \infty} s_3(G^{\times n})^{1/n}. \]

Equivalently, $\Gamma \subseteq G \times G \times G$ is a tri-colored sum-free set if and only if $\Gamma$ is a free diagonal in $\{\alpha \in G \times G \times G : \alpha_1 + \alpha_2 + \alpha_3 = 0\}$.

If the set $A \subseteq G = (\mathbb{Z}/m\mathbb{Z})^n$ is three-term progression-free, then the set $\Gamma = \{(a, a, -2a) : a \in A\} \subseteq G \times G \times G$ is tri-colored sum-free. Therefore we have $\tilde{r}_3(\mathbb{Z}/m\mathbb{Z}) \leq \tilde{s}_3(\mathbb{Z}/m\mathbb{Z})$.
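To make Definitions 5.23 and 5.24 concrete, here is a brute-force Python sketch for the tiny case $(\mathbb{Z}/3\mathbb{Z})^2$. It checks that a particular four-point set is a cap set, that the associated $\Gamma = \{(a, a, -2a)\}$ is tri-colored sum-free, and it finds the largest cap set by exhaustive search. The example set `cap` and the function names are our own; the search is only feasible for very small $n$.

```python
from itertools import combinations, product


def is_progression_free(A, m):
    """(x1, x2, x3) in A^3 equals (u, u+v, u+2v) for some u, v only if x1 = x2 = x3.

    Such u, v exist exactly when x1 - 2*x2 + x3 = 0 (take u = x1, v = x2 - x1).
    """
    for x1, x2, x3 in product(A, repeat=3):
        is_progression = all((c1 - 2 * c2 + c3) % m == 0
                             for c1, c2, c3 in zip(x1, x2, x3))
        if is_progression and not (x1 == x2 == x3):
            return False
    return True


def is_tricolored_sum_free(Gamma, m):
    """Gamma is a diagonal, and alpha in G1 x G2 x G3 sums to zero iff alpha is in Gamma."""
    Gamma = set(Gamma)
    for alpha, beta in combinations(Gamma, 2):
        if any(alpha[i] == beta[i] for i in range(3)):
            return False  # not a diagonal
    marginals = [{alpha[i] for alpha in Gamma} for i in range(3)]
    for alpha in product(*marginals):
        sums_to_zero = all(sum(cs) % m == 0 for cs in zip(*alpha))
        if sums_to_zero != (alpha in Gamma):
            return False
    return True


m, n = 3, 2
points = list(product(range(m), repeat=n))
cap = [(0, 0), (0, 1), (1, 0), (1, 1)]  # hypothetical example set
Gamma = [(a, a, tuple((-2 * c) % m for c in a)) for a in cap]

largest = max(len(A)
              for size in range(len(points) + 1)
              for A in combinations(points, size)
              if is_progression_free(A, m))
print(is_progression_free(cap, m), is_tricolored_sum_free(Gamma, m), largest)
```

Adding the point $(2,2)$ to `cap` destroys the property, since $(0,0), (1,1), (2,2)$ is a line.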

We summarise the recent history of results on cap sets. For clarity we focus on $m = 3$; we refer the reader to the references for the general results. Edel in [Ede04] proved the lower bound $2.21739 \leq \tilde{r}_3(\mathbb{Z}/3\mathbb{Z})$. In [EG17], Ellenberg and Gijswijt proved the upper bound

\[ \tilde{r}_3(\mathbb{Z}/3\mathbb{Z}) \leq 3(207 + 33\sqrt{33})^{1/3}/8 \approx 2.755. \]

Blasiak et al. [BCC+17] proved that in fact

\[ \tilde{s}_3(\mathbb{Z}/3\mathbb{Z}) \leq 3(207 + 33\sqrt{33})^{1/3}/8. \]

This upper bound was shown to be an equality in [KSS16, Nor16, Peb16].

Theorem 5.25. $\tilde{s}_3(\mathbb{Z}/3\mathbb{Z}) = 3(207 + 33\sqrt{33})^{1/3}/8$.

We reprove Theorem 5.25 by proving that $\tilde{s}_3(\mathbb{Z}/m\mathbb{Z})$ equals the asymptotic subrank $z(m)$ of $t_m$ discussed in Section 5.4.1 when $m$ is a prime power. The significance of our proof lies in the explicit connection to the framework of asymptotic spectra, and not in the obtained value, which also for prime powers $m$ was already computed in [BCC+17, KSS16, Nor16, Peb16].

Proof. We will prove $\tilde{s}_3(\mathbb{Z}/m\mathbb{Z}) = z(m)$ when $m$ is a prime power. By definition, $\tilde{s}_3(\mathbb{Z}/m\mathbb{Z})$ equals the asymptotic subrank of the set

\[ \{\alpha \in \{0, \ldots, m-1\}^3 : \alpha_1 + \alpha_2 + \alpha_3 = 0 \bmod m\}, \]

which via $\alpha_3 \mapsto \alpha_3 - (m-1)$ we may identify with the set

\[ \Psi_m = \{\alpha \in \{0, \ldots, m-1\}^3 : \alpha_1 + \alpha_2 + \alpha_3 = m - 1 \bmod m\}, \]

and so $\tilde{s}_3(\mathbb{Z}/m\mathbb{Z}) = \tilde{\mathrm{Q}}(\Psi_m)$. Let

\[ \Phi_m = \{\alpha \in \{0, \ldots, m-1\}^3 : \alpha_1 + \alpha_2 + \alpha_3 = m - 1\}. \]

We know $\tilde{\mathrm{Q}}(\Phi_m) = z(m)$ (Section 5.4.1). We will show that $\tilde{\mathrm{Q}}(\Phi_m) = \tilde{\mathrm{Q}}(\Psi_m)$ when $m$ is a prime power. This proves the theorem.

We prove $\tilde{\mathrm{Q}}(\Phi_m) \leq \tilde{\mathrm{Q}}(\Psi_m)$. There is a combinatorial degeneration $\Psi_m \trianglerighteq \Phi_m$. Indeed, let $u_i : \{0, \ldots, m-1\} \to \{0, \ldots, m-1\}$ be the identity map. If $\alpha \in \Phi_m$, then $\sum_{i=1}^3 u_i(\alpha_i) = m - 1$, and if $\alpha \in \Psi_m \setminus \Phi_m$, then $\sum_{i=1}^3 u_i(\alpha_i)$ equals $m - 1$ plus a positive multiple of $m$. This means Theorem 5.21 applies, and we thus obtain $\tilde{\mathrm{Q}}(\Phi_m) \leq \tilde{\mathrm{Q}}(\Psi_m)$. This proves the claim.
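The degeneration argument for $\Phi_m \subseteq \Psi_m$ is easy to check by enumeration for small $m$. The Python sketch below lists both sets and verifies that the coordinate sum minus $(m-1)$ vanishes on $\Phi_m$ and is a positive multiple of $m$ on the rest of $\Psi_m$. Subtracting the constant $m - 1$ (for instance inside $u_1$) is our way of matching the normalisation "sum equals 0 on $\Phi$" of Definition 5.19; the function name is ours.

```python
from itertools import product


def check_degeneration(m):
    """Return (ok, |Phi_m|, |Psi_m|) for the identity maps u_i on {0, ..., m-1}."""
    cube = list(product(range(m), repeat=3))
    Psi = [a for a in cube if sum(a) % m == (m - 1) % m]
    Phi = {a for a in cube if sum(a) == m - 1}
    ok = Phi <= set(Psi)
    for a in Psi:
        excess = sum(a) - (m - 1)  # value of the shifted sum of the u_i
        if a in Phi:
            ok = ok and excess == 0
        else:
            ok = ok and excess > 0 and excess % m == 0
    return ok, len(Phi), len(Psi)


for m in range(2, 8):
    print(m, check_degeneration(m))
```

This step does not use that $m$ is a prime power; that hypothesis only enters in the reverse inequality.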

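We show $\tilde{\mathrm{Q}}(\Psi_m) \leq \tilde{\mathrm{Q}}(\Phi_m)$ when $m$ is a power of the prime $p$. Let $\mathbb{F} = \mathbb{F}_p$. Let $f_m \in \mathbb{F}^m \otimes \mathbb{F}^m \otimes \mathbb{F}^m$ have support $\Psi_m$ with all nonzero coefficients equal to 1. Obviously, $\tilde{\mathrm{Q}}(\Psi_m) \leq \tilde{\mathrm{Q}}(f_m)$. To compute $\tilde{\mathrm{Q}}(f_m)$ we show that there is a basis in which the support of $f_m$ equals the tight set $\Phi_m$. Then $\tilde{\mathrm{Q}}(f_m) = \tilde{\mathrm{Q}}(\Phi_m)$ (Corollary 5.4). This implies the claim. We prepare to give the basis (which is the same basis as used in [BCC+17]). First observe that the rule $x \mapsto \binom{x}{a}$ gives a well-defined map $\mathbb{Z}/m\mathbb{Z} \to \mathbb{Z}/p\mathbb{Z}$, since for $a \in \{0, 1, \ldots, m-1\}$, if $x = y \bmod m$, then $\binom{x}{a} = \binom{y}{a} \bmod p$ by Lucas' theorem. Let $(e_x)_x$ be the standard basis of $\mathbb{F}^m$. The elements $\bigl( \sum_{x \in \mathbb{Z}/m\mathbb{Z}} \binom{x}{a} e_x \bigr)_{a \in \mathbb{Z}/m\mathbb{Z}}$ form a basis of $\mathbb{F}^m$, since the matrix $\bigl( \binom{x}{a} \bigr)_{a,x}$ is upper triangular with ones on the diagonal. We will now rewrite $f_m$ in the basis $\bigl( (\sum_x \binom{x}{a} e_x)_a, (\sum_y \binom{y}{b} e_y)_b, (\sum_z \binom{z}{c} e_z)_c \bigr)$. Observe that $\binom{x}{m-1}$ equals 1 if and only if $x$ equals $m-1$, and hence

\[ f_m = \sum_{\substack{x,y,z \in \mathbb{Z}/m\mathbb{Z}: \\ x+y+z = m-1}} e_x \otimes e_y \otimes e_z = \sum_{x,y,z \in \mathbb{Z}/m\mathbb{Z}} \binom{x+y+z}{m-1} e_x \otimes e_y \otimes e_z. \]

The identity $\binom{x+y+z}{w} = \sum \binom{x}{a} \binom{y}{b} \binom{z}{c}$, with sum over $a, b, c \in \{0, 1, \ldots, m-1\}$ such that $a + b + c = w$, is true, and thus

\[ \sum_{x,y,z \in \mathbb{Z}/m\mathbb{Z}} \binom{x+y+z}{m-1} e_x \otimes e_y \otimes e_z = \sum_{x,y,z \in \mathbb{Z}/m\mathbb{Z}} \; \sum_{\substack{a,b,c \in \{0,1,\ldots,m-1\}: \\ a+b+c = m-1}} \binom{x}{a} \binom{y}{b} \binom{z}{c} e_x \otimes e_y \otimes e_z. \tag{5.23} \]

We may simply rewrite (5.23) as

\[ \sum_{\substack{a,b,c \in \{0,1,\ldots,m-1\}: \\ a+b+c = m-1}} \Bigl( \sum_{x \in \mathbb{Z}/m\mathbb{Z}} \binom{x}{a} e_x \Bigr) \otimes \Bigl( \sum_{y \in \mathbb{Z}/m\mathbb{Z}} \binom{y}{b} e_y \Bigr) \otimes \Bigl( \sum_{z \in \mathbb{Z}/m\mathbb{Z}} \binom{z}{c} e_z \Bigr). \]

Therefore, with respect to the basis $\bigl( (\sum_x \binom{x}{a} e_x)_a, (\sum_y \binom{y}{b} e_y)_b, (\sum_z \binom{z}{c} e_z)_c \bigr)$, the support of $f_m$ equals the tight set $\Phi_m$. (And even stronger, $f_m$ is isomorphic to the tensor $\mathbb{F}[x]/(x^m)$ of Section 5.4.1.)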
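The two facts about binomial coefficients used in this basis change can be tested directly for small prime powers. The Python sketch below checks, for representatives $x, y, z \in \{0, \ldots, m-1\}$, that $\binom{x+y+z}{m-1} \bmod p$ is the indicator of $x + y + z = m - 1 \bmod m$, and that the Vandermonde-type identity holds over the integers. The function names and the tested values of $(m, p)$ are our own choices.

```python
from itertools import product
from math import comb


def indicator_matches(m, p):
    """binom(x+y+z, m-1) mod p equals 1 if x+y+z = m-1 mod m, and 0 otherwise."""
    for x, y, z in product(range(m), repeat=3):
        lhs = comb(x + y + z, m - 1) % p
        rhs = 1 if (x + y + z) % m == m - 1 else 0
        if lhs != rhs:
            return False
    return True


def vandermonde_holds(m):
    """binom(x+y+z, w) = sum over a+b+c=w of binom(x,a) binom(y,b) binom(z,c), with w = m-1."""
    w = m - 1
    for x, y, z in product(range(m), repeat=3):
        total = sum(comb(x, a) * comb(y, b) * comb(z, w - a - b)
                    for a in range(w + 1)
                    for b in range(w + 1 - a))
        if total != comb(x + y + z, w):
            return False
    return True


for m, p in [(3, 3), (4, 2), (5, 5), (9, 3)]:
    print(m, p, indicator_matches(m, p), vandermonde_holds(m))
```

For a modulus that is not a prime power the indicator property can fail; for example $\binom{7}{5} = 21$ is odd although $7 \neq 5 \bmod 6$, which is why the argument is restricted to prime powers $m$.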

Remark 5.26. Why did we reprove the cap set result Theorem 5.25? Our motivation, being interested in the asymptotic spectrum of tensors, was to see if the techniques in the cap set papers are stronger than the Strassen support functionals, i.e. whether they give any new spectral points. Above we have seen that the cap set result itself can be proven with the support functionals. In fact, we show in Section 4.6 that for oblique tensors the asymptotic slice-rank, which was introduced in [Tao16] to give a concise proof of [EG17], equals the minimum value over the support functionals. In Section 6.11 we show that for all complex tensors, asymptotic slice-rank equals the minimum value of the quantum functionals.


5.5 Graph tensors

In this section we briefly discuss the application that motivated us to prove Theorem 5.7 in [CVZ16], namely upper bounding the asymptotic rank of so-called graph tensors. Graph tensors are defined as follows.

Let $G = (V, E)$ be a graph (or hypergraph) with vertex set $V$ and edge set $E$. Let $n \in \mathbb{N}$. Let $(b_i)_{i \in [n]}$ be the standard basis of $\mathbb{F}^n$. We define the graph tensor $T_n(G)$ as

\[ T_n(G) = \sum_{i \in [n]^E} \bigotimes_{v \in V} \Bigl( \bigotimes_{e \in E : v \in e} b_{i_e} \Bigr), \]

seen as a $|V|$-tensor. Given a vertex $v \in V$, let $d(v)$ denote the degree of $v$, that is, $d(v)$ equals the number of edges $e \in E$ that contain $v$. Then $T_n(G)$ is naturally in $\bigotimes_{v \in V} (\mathbb{F}^n)^{\otimes d(v)}$. We write $T(G)$ for $T_2(G)$. For example, for the complete graph on four vertices $K_4$, the graph tensor $T(K_4)$ is the tensor product of the six graph tensors of the single edges of $K_4$ (drawn as small graph pictograms in the original), that is,

\[ T(K_4) = \sum_{i \in \{0,1\}^6} (b_{i_1} \otimes b_{i_2} \otimes b_{i_5}) \otimes (b_{i_2} \otimes b_{i_3} \otimes b_{i_6}) \otimes (b_{i_3} \otimes b_{i_4} \otimes b_{i_5}) \otimes (b_{i_1} \otimes b_{i_4} \otimes b_{i_6}), \]

living in $(\mathbb{C}^8)^{\otimes 4}$. Let $K_k$ be the complete graph on $k$ vertices. The $2 \times 2$ matrix multiplication tensor $\langle 2, 2, 2 \rangle$ equals the tensor $T(K_3)$. Define the exponent $\omega(T(G)) = \log_2 \tilde{\mathrm{R}}(T(G))$. We study the exponent per edge $\tau(T(G)) = \omega(T(G)) / |E(G)|$.

Our result is an upper bound on $\tau(T(K_4))$ in terms of the combinatorial asymptotic subrank $\tilde{\mathrm{Q}}(\Phi_{(2,2)})$, which we studied in Theorem 5.16.

Theorem 5.27. For any $q \geq 1$, $\tau(T(K_4)) \leq \log_q \Bigl( \dfrac{q+2}{\tilde{\mathrm{Q}}(\Phi_{(2,2)})} \Bigr)$.

Proof. We apply a generalisation of the laser method. See [CVZ16].

Corollary 5.28. Let $k \geq 4$. Then $\tau(T(K_k)) \leq 0.772943$.

Proof. In the bound of Theorem 5.27 we plug in the value $\tilde{\mathrm{Q}}(\Phi_{(2,2)}) = 2$ from Theorem 5.16. Then we optimise over $q$ to obtain the value 0.772943. By a "covering argument" we can show that $\tau(T(K_k))$ is non-increasing when $k$ increases.
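The optimisation over $q$ in the proof of Corollary 5.28 is a one-line computation. The Python sketch below evaluates $\log_q((q+2)/2)$ and minimises it, under our assumption that $q$ ranges over integers $q \geq 2$ (the case $q = 1$ is skipped because $\log_1$ is undefined); this integer range is what reproduces the stated value. The function name and the search range up to 100 are ours.

```python
from math import log


def tau_bound(q, subrank=2.0):
    """Upper bound log_q((q + 2) / subrank) on tau(T(K_4)) from Theorem 5.27."""
    return log((q + 2) / subrank, q)


# Assumption: q is an integer; the bound tends to 1 for large q, so a finite range suffices.
best_q = min(range(2, 101), key=tau_bound)
best = tau_bound(best_q)
print(best_q, round(best, 6))
```

Neighbouring values such as $q = 6$ and $q = 8$ give slightly larger bounds, so the minimum over integers is attained at a single point.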

For $k \geq 4$, Corollary 5.28 improves the upper bound $\tau(T(K_k)) \leq 0.790955$ that can be derived from the well-known upper bound of Le Gall [LG14] on the exponent of matrix multiplication $\omega = \omega(T(K_3))$.


A standard "flattening argument" (i.e. using the gauge points from the asymptotic spectrum) yields the lower bound $\tau(T(K_k)) \geq \frac{1}{2} k/(k-1)$ if $k$ is even and $\tau(T(K_k)) \geq \frac{1}{2} (k+1)/k$ if $k$ is odd. As a consequence, if the exponent of matrix multiplication $\omega$ equals 2, then $\tau(T(K_4)) = \tau(T(K_3)) = \frac{2}{3}$. We raise the following question: is there a $k \geq 5$ such that $\tau(T(K_k)) < \frac{2}{3}$?

Tensor surgery; cycle graphs

For graph tensors given by sparse graphs, good upper bounds on the asymptotic rank can be obtained with an entirely different method called tensor surgery, which we introduced in [CZ18]. As an illustration, let me mention the results we obtained for cycle graphs with tensor surgery. Let $C_k$ be the cycle graph on $k$ vertices. Recall $\omega = \log_2 \tilde{\mathrm{R}}(\langle 2, 2, 2 \rangle) = \log_2 \tilde{\mathrm{R}}(T(C_3))$. Let $\omega_k = \log_2 \tilde{\mathrm{R}}(T(C_k))$. First observe that $\omega_k = k$ for even $k$. For odd $k$, trivially $k - 1 \leq \omega_k \leq k$. We prove the following.

Theorem 5.29. For $k, \ell$ odd, $\omega_{k+\ell-1} \leq \omega_k + \omega_\ell$.

Corollary 5.30. Let $k \geq 5$ odd. Then $\omega_k \leq \omega_{k-2} + \omega_3$, and thus $\omega_k \leq \frac{k-1}{2} \omega$.

Corollary 5.31. If $\omega = 2$, then $\omega_k = k - 1$ for all odd $k$.

See [CZ18] for the proofs.

5.6 Conclusion

Tight tensors are a subfamily of the oblique tensors. For tight 3-tensors, the minimum over the support functionals equals the asymptotic subrank. This is proven via the Coppersmith–Winograd method. The construction is in fact of a very combinatorial nature. In this chapter we studied the combinatorial notion of subrank. We proved that combinatorial subrank is monotone under combinatorial degeneration. We studied the cap set problem via the support functionals. We extended the Coppersmith–Winograd method to higher-order tensors and applied this method to study graph tensors.

Chapter 6

Universal points in the asymptotic spectrum of tensors; entanglement polytopes, moment polytopes

This chapter is based on joint work with Matthias Christandl and Péter Vrana [CVZ18].

6.1 Introduction

In Chapter 4, following Strassen, we introduced the asymptotic spectrum of tensors $\mathbf{X}(\mathcal{T}) = \mathbf{X}(\mathcal{T}, \leq)$ for $\mathcal{T}$ the semiring of $k$-tensors over $\mathbb{F}$ for some fixed integer $k$ and field $\mathbb{F}$, with addition given by direct sum $\oplus$, multiplication given by tensor product $\otimes$, and preorder $\leq$ given by restriction (or degeneration). The asymptotic spectrum characterises the asymptotic rank $\tilde{\mathrm{R}}$ and the asymptotic subrank $\tilde{\mathrm{Q}}$. We have seen that the asymptotic rank plays an important role in algebraic complexity theory: the asymptotic rank of the matrix multiplication tensor $\langle 2, 2, 2 \rangle = \sum_{i,j,k \in [2]} e_{ij} \otimes e_{jk} \otimes e_{ki} \in \mathbb{F}^4 \otimes \mathbb{F}^4 \otimes \mathbb{F}^4$ characterises the exponent of the arithmetic complexity of multiplying two $n \times n$ matrices over $\mathbb{F}$, that is, $\tilde{\mathrm{R}}(\langle 2, 2, 2 \rangle) = 2^\omega$. We have also seen in Chapter 5 how one may use the asymptotic subrank to upper bound the size of combinatorial objects like, for example, cap sets in $\mathbb{F}_3^n$.

New results in this chapter

So far, the only elements we have seen in $\mathbf{X}(\mathcal{T})$ (i.e. universal spectral points, cf. Section 2.13) are the gauge points (Section 4.3). Besides that, we have seen in Section 4.4 that the Strassen support functionals $\zeta^\theta$ are in $\mathbf{X}(\{\mathrm{oblique}\})$. In this chapter we introduce for the first time an explicit infinite family of universal spectral points (over the complex numbers), the quantum functionals. Our new insight is to use the moment polytope. Given a tensor $t \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$, the moment polytope $\mathbf{P}(t)$ is a convex polytope that carries representation-theoretic information about $t$. The quantum functionals are defined as maximisations over moment polytopes.

Let me immediately put a disclaimer. The quantum functionals do not give a new lower bound on the asymptotic rank of matrix multiplication $\langle 2, 2, 2 \rangle$; namely, the quantum functionals give the same lower bound as the gauge points. Also, the quantum functionals being defined for tensors over complex numbers only, we do not expect to get new upper bounds on the size of combinatorial objects that are "like cap sets".

So what have we gained? Arguably, we have found the "right" viewpoint on how to construct universal spectral points for tensors. (In fact, after writing our paper [CVZ18] we realised that Strassen had begun a study of moment polytopes in the appendix of the German survey [Str05]. Strassen did not construct new universal spectral points, however, not in that publication at least.) If there are more universal spectral points, then our viewpoint may lead the way to finding them. Moreover, whereas no efficient algorithm is known for evaluating the support functionals, the moment polytope viewpoint may open the way to having efficient algorithms for evaluating the quantum functionals.

In Sections 6.2–6.7 we work towards the construction of the quantum functionals and we give a proof that they are universal spectral points. In Sections 6.8–6.10 we compare the quantum functionals and the support functionals, and in Section 6.11 we relate asymptotic slice rank to the quantum functionals.

In this chapter we will focus on 3-tensors, but the theory naturally generalises to $k$-tensors.

62 SchurndashWeyl duality

For background on representation theory we refer to [Kra84] [Ful97] and [GW09]Let Sn be the symmetric group on n symbols Let Sn act on the tensor

space (Cd)otimesn by permuting the tensor legs

π middot v1 otimes middot middot middot otimes vn = vπminus1(1) otimes middot middot middot otimes vπminus1(n) π isin Sn

Let GLd be the general linear group of Cd Let GLd act on (Cd)otimesn via the diagonalembedding GLd rarr GLtimesnd g 7rarr (g g)

g middot v1 otimes middot middot middot otimes vn = (gv1)otimes middot middot middot otimes (gvn) g isin GLd

The actions of Sn and GLd commute so we have a well-defined action of the productgroup Sn timesGLd on (Cd)otimesn SchurndashWeyl duality describes the decomposition ofthe space (Cd)otimesn into a direct sum of irreducible Sn timesGLd representations Thisdecomposition is

(Cd)otimesn sim=oplusλ`dn

[λ]otimes Sλ(Cd) (61)

62 SchurndashWeyl duality 89

with [λ] an irreducible Sn representation of type λ and Sλ(Cd) an irreducibleGLd-representation of type λ when `(λ) le d and 0 when `(λ) gt d We use thenotation λ `d n for the partitions of n with at most d parts Let

Pλ (Cd)otimesn rarr (Cd)otimesn

be the equivariant projector onto the isotypical component of type λ ie onto thesubspace of (Cd)otimesn isomorphic to [λ]otimes Sλ(Cd) The projector Pλ is given by theaction of the group algebra element

Pλ =(dim[λ]

n

)2 sumTisinTab(λ)

cT isin C[Sn]

where Tab(λ) is the set of Young tableaux of shape λ filled with [n] and with cTthe Young symmetrizer

cT =sum

σisinC(T )

sgn(σ)σsum

πisinR(T )

π

where C(T ) R(T ) sube Sn are the subgroups of permutations inside columns andpermutations inside rows respectively The element Pλ is a minimal centralidempotent in C[Sn] and

sumλ`n Pλ = e

Back to the decomposition of $(\mathbb{C}^d)^{\otimes n}$. We need a handle on the size of the components in the direct sum decomposition (6.1). For our application it is good to think of $d$ as a constant and $n$ as a large number. The number of summands in the direct sum decomposition (6.1) is upper bounded by a polynomial in $n$,

\[ |\{\lambda \vdash_d n\}| \le (n+1)^d, \]

i.e. there are only few summands compared to the total dimension $d^n$. There are the following well-known bounds on the dimensions of the irreducible representations $[\lambda]$ and $\mathbb{S}_\lambda(\mathbb{C}^d)$ that make up the summands:

\[ \frac{n!}{\prod_{\ell=1}^{d} (\lambda_\ell + d - \ell)!} \le \dim[\lambda] \le \frac{n!}{\prod_{\ell=1}^{d} \lambda_\ell!}, \tag{6.2} \]

\[ \dim \mathbb{S}_\lambda(\mathbb{C}^d) \le (n+1)^{d(d-1)/2}. \tag{6.3} \]
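The bounds above can be checked numerically for small $n$ and $d$. The sketch below is illustrative and not part of the original argument: it computes $\dim[\lambda]$ with the hook length formula and $\dim \mathbb{S}_\lambda(\mathbb{C}^d)$ with Weyl's dimension formula (both standard, but not stated in the text), and all function names are our own.

```python
from math import factorial, prod

def partitions(n, max_parts, largest=None):
    """Yield the partitions of n with at most max_parts parts (non-increasing tuples)."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    if max_parts == 0:
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, max_parts - 1, first):
            yield (first,) + rest

def dim_symmetric(shape):
    """dim[λ] via the hook length formula."""
    hooks = 1
    for i, row in enumerate(shape):
        for j in range(row):
            leg = sum(1 for r in shape[i + 1:] if r > j)
            hooks *= (row - j - 1) + leg + 1
    return factorial(sum(shape)) // hooks

def dim_general_linear(shape, d):
    """dim S_λ(C^d) via Weyl's dimension formula (λ padded with zeros to length d)."""
    lam = list(shape) + [0] * (d - len(shape))
    num = prod(lam[i] - lam[j] + j - i for i in range(d) for j in range(i + 1, d))
    den = prod(j - i for i in range(d) for j in range(i + 1, d))
    return num // den

def check_bounds(n, d):
    """True if the count bound, (6.2) and (6.3) hold for every λ ⊢_d n."""
    shapes = list(partitions(n, d))
    ok = len(shapes) <= (n + 1) ** d
    for shape in shapes:
        lam = list(shape) + [0] * (d - len(shape))
        # In (6.2) the index ℓ runs from 1 to d, so d - ℓ equals d - 1 - k here.
        lower_den = prod(factorial(lam[k] + d - 1 - k) for k in range(d))
        upper_den = prod(factorial(part) for part in lam)
        dim = dim_symmetric(shape)
        ok &= factorial(n) <= dim * lower_den          # lower bound in (6.2)
        ok &= dim * upper_den <= factorial(n)          # upper bound in (6.2)
        ok &= dim_general_linear(shape, d) <= (n + 1) ** (d * (d - 1) // 2)  # (6.3)
    return ok

def total_dimension(n, d):
    """Sum of dim[λ] * dim S_λ(C^d) over λ ⊢_d n; by (6.1) this should equal d^n."""
    return sum(dim_symmetric(s) * dim_general_linear(s, d) for s in partitions(n, d))
```

For instance, `total_dimension(4, 2)` adds $1 \cdot 5 + 3 \cdot 3 + 2 \cdot 1$, in agreement with $2^4 = 16$.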

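Let $p \in \mathbb{R}^n$ be a probability vector, i.e. $\sum_{i=1}^{n} p_i = 1$ and $p_i \ge 0$ for $i \in [n]$. Let $H(p)$ be the Shannon entropy of the probability vector $p$,

\[ H(p) = \sum_{i=1}^{n} p_i \log_2 \frac{1}{p_i}. \]

For $\alpha \in [0, 1]$ let $h(\alpha) = H((\alpha, 1 - \alpha))$ be the binary entropy. For a partition $\lambda = (\lambda_1, \ldots, \lambda_\ell) \vdash n$ let $\bar\lambda = \lambda/n = (\lambda_1/n, \ldots, \lambda_\ell/n)$ be the probability vector obtained by normalising $\lambda$.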

90 Chapter 6 Universal points in the asymptotic spectrum of tensors

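Let $\lambda \vdash n$. For $N \in \mathbb{N}$ let $N\lambda = (N\lambda_1, N\lambda_2, \ldots, N\lambda_\ell)$ be the stretched partition. We see that, asymptotically in the stretching factor $N$, the dimension of $[N\lambda]$ behaves like a multinomial coefficient, and

\[ 2^{NnH(\bar\lambda) - o(N)} \le \dim[N\lambda] \le 2^{NnH(\bar\lambda)}. \tag{6.4} \]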
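To get a feeling for (6.4), one can compare $\frac{1}{Nn}\log_2 \dim[N\lambda]$ with $H(\bar\lambda)$ for growing $N$. The following sketch does this with the hook length formula; it is a numerical illustration under our own naming, not a proof of the asymptotics.

```python
from math import factorial, log2

def entropy(p):
    """Shannon entropy H(p) in bits, with the convention 0 * log(1/0) = 0."""
    return sum(x * log2(1 / x) for x in p if x > 0)

def dim_symmetric(shape):
    """dim[λ] via the hook length formula."""
    hooks = 1
    for i, row in enumerate(shape):
        for j in range(row):
            leg = sum(1 for r in shape[i + 1:] if r > j)
            hooks *= (row - j - 1) + leg + 1
    return factorial(sum(shape)) // hooks

def stretched_rate(shape, N):
    """log2 dim[Nλ] / (N n); by (6.4) this is at most H(λ/n) and approaches it."""
    n = sum(shape)
    stretched = tuple(N * part for part in shape)
    return log2(dim_symmetric(stretched)) / (N * n)

shape = (2, 1)
limit = entropy(tuple(part / sum(shape) for part in shape))   # H(2/3, 1/3)
rates = [stretched_rate(shape, N) for N in (1, 5, 20, 40)]
```

For the shape $(2, 1)$ the values in `rates` stay below `limit` and move towards it as $N$ grows, as (6.4) predicts.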

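6.3 Kronecker and Littlewood–Richardson coefficients $g_{\lambda\mu\nu}$, $c^{\lambda}_{\mu\nu}$

Let $\mu, \nu \vdash n$. Let $S_n \to S_n \times S_n : \pi \mapsto (\pi, \pi)$ be the diagonal embedding. Consider the decomposition of the tensor product $[\mu] \otimes [\nu]$ restricted along the diagonal embedding,

\[ [\mu] \otimes [\nu] \downarrow^{S_n \times S_n}_{S_n} \cong \bigoplus_{\lambda \vdash n} \mathrm{Hom}_{S_n}([\lambda], [\mu] \otimes [\nu]) \otimes [\lambda]. \]

Define the Kronecker coefficient

\[ g_{\lambda\mu\nu} = \dim \mathrm{Hom}_{S_n}([\lambda], [\mu] \otimes [\nu]), \]

i.e. $g_{\lambda\mu\nu}$ is the multiplicity of $[\lambda]$ in $[\mu] \otimes [\nu]$.

Let $\lambda \vdash_{a+b}$. Let $\mathrm{GL}_a \times \mathrm{GL}_b \to \mathrm{GL}_{a+b} : (A, B) \mapsto A \oplus B$ be the block-diagonal embedding. Consider the decomposition of the representation $\mathbb{S}_\lambda(\mathbb{C}^{a+b})$ restricted along the block-diagonal embedding,

\[ \mathbb{S}_\lambda(\mathbb{C}^{a+b}) \downarrow^{\mathrm{GL}_{a+b}}_{\mathrm{GL}_a \times \mathrm{GL}_b} \cong \bigoplus_{\mu \vdash_a,\, \nu \vdash_b} H^{\lambda}_{\mu\nu} \otimes \mathbb{S}_\mu(\mathbb{C}^a) \otimes \mathbb{S}_\nu(\mathbb{C}^b), \]

with

\[ H^{\lambda}_{\mu\nu} = \mathrm{Hom}_{\mathrm{GL}_a \times \mathrm{GL}_b}\bigl(\mathbb{S}_\mu(\mathbb{C}^a) \otimes \mathbb{S}_\nu(\mathbb{C}^b),\ \mathbb{S}_\lambda(\mathbb{C}^{a+b})\bigr). \]

Define the Littlewood–Richardson coefficient $c^{\lambda}_{\mu\nu} = \dim H^{\lambda}_{\mu\nu}$.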

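For partitions $\lambda$ and $\lambda'$, define $\lambda + \lambda'$ elementwise. The Kronecker and the Littlewood–Richardson coefficients have the following semigroup property (see e.g. [CHM07]).

Lemma 6.1. Let $\lambda, \mu, \nu, \alpha, \beta, \gamma$ be partitions.

(i) If $g_{\lambda\mu\nu} > 0$ and $g_{\alpha\beta\gamma} > 0$, then $g_{\lambda+\alpha,\,\mu+\beta,\,\nu+\gamma} > 0$.

(ii) If $c^{\lambda}_{\mu\nu} > 0$ and $c^{\alpha}_{\beta\gamma} > 0$, then $c^{\lambda+\alpha}_{\mu+\beta,\,\nu+\gamma} > 0$.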

6.4 Entropy inequalities

The semigroup properties imply the following lemma. Of this lemma, the first statement can be found in a paper by Christandl and Mitchison [CM06], while we do not know of any source that explicitly states the second statement. For the convenience of the reader we give the proofs of both statements.

Lemma 6.2. Let $\lambda, \mu, \nu$ be partitions.

(i) If $g_{\lambda\mu\nu} > 0$, then $H(\bar\lambda) \le H(\bar\mu) + H(\bar\nu)$.

(ii) If $c^{\lambda}_{\mu\nu} > 0$, then $H(\bar\lambda) \le \frac{|\mu|}{|\mu|+|\nu|} H(\bar\mu) + \frac{|\nu|}{|\mu|+|\nu|} H(\bar\nu) + h\bigl(\frac{|\mu|}{|\mu|+|\nu|}\bigr)$.

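Proof. (i) Let $g_{\lambda\mu\nu} > 0$. Suppose $\lambda, \mu, \nu \vdash n$. Let $N \in \mathbb{N}$. Then Lemma 6.1 implies $g_{N\lambda,\,N\mu,\,N\nu} > 0$. This means $\mathrm{Hom}_{S_{nN}}([N\lambda], [N\mu] \otimes [N\nu]) \neq 0$, which implies $\dim[N\lambda] \le \dim[N\mu] \dim[N\nu]$. From (6.4) we have the dimension bounds

\begin{align*}
2^{NnH(\bar\lambda) - o(N)} &\le \dim[N\lambda], \\
\dim[N\mu] &\le 2^{NnH(\bar\mu)}, \\
\dim[N\nu] &\le 2^{NnH(\bar\nu)}.
\end{align*}

Thus $NnH(\bar\lambda) - o(N) \le NnH(\bar\mu) + NnH(\bar\nu)$. Divide by $Nn$ and let $N$ go to infinity to get $H(\bar\lambda) \le H(\bar\mu) + H(\bar\nu)$.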

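(ii) We restrict the decomposition

\[ (\mathbb{C}^{a+b})^{\otimes n} \cong \bigoplus_{\lambda \vdash_{a+b} n} [\lambda] \otimes \mathbb{S}_\lambda(\mathbb{C}^{a+b}) \]

along the block-diagonal embedding to get

\begin{align*}
(\mathbb{C}^{a+b})^{\otimes n} \downarrow^{\mathrm{GL}_{a+b}}_{\mathrm{GL}_a \times \mathrm{GL}_b}
&\cong \bigoplus_{\lambda \vdash_{a+b} n} [\lambda] \otimes \mathbb{S}_\lambda(\mathbb{C}^{a+b}) \downarrow^{\mathrm{GL}_{a+b}}_{\mathrm{GL}_a \times \mathrm{GL}_b} \\
&\cong \bigoplus_{\lambda \vdash_{a+b} n} [\lambda] \otimes \bigoplus_{\mu \vdash_a,\, \nu \vdash_b} \mathbb{C}^{c^{\lambda}_{\mu\nu}} \otimes \mathbb{S}_\mu(\mathbb{C}^a) \otimes \mathbb{S}_\nu(\mathbb{C}^b) \\
&\cong \bigoplus_{\mu \vdash_a,\, \nu \vdash_b} \Bigl( \bigoplus_{\lambda \vdash_{a+b} n} [\lambda] \otimes \mathbb{C}^{c^{\lambda}_{\mu\nu}} \Bigr) \otimes \mathbb{S}_\mu(\mathbb{C}^a) \otimes \mathbb{S}_\nu(\mathbb{C}^b).
\end{align*}

On the other hand,

\begin{align*}
(\mathbb{C}^{a+b})^{\otimes n} \downarrow
&\cong (\mathbb{C}^a \oplus \mathbb{C}^b)^{\otimes n} \downarrow \\
&\cong (\mathbb{C}^a)^{\otimes n} \oplus \bigl((\mathbb{C}^a)^{\otimes n-1} \otimes \mathbb{C}^b\bigr) \oplus \cdots \oplus (\mathbb{C}^b)^{\otimes n} \downarrow \\
&\cong \bigoplus_{k=0}^{n} \mathbb{C}^{\binom{n}{k}} \otimes \bigoplus_{\mu \vdash_a k} \bigl([\mu] \otimes \mathbb{S}_\mu(\mathbb{C}^a)\bigr) \otimes \bigoplus_{\nu \vdash_b n-k} \bigl([\nu] \otimes \mathbb{S}_\nu(\mathbb{C}^b)\bigr) \\
&\cong \bigoplus_{k=0}^{n} \bigoplus_{\mu \vdash_a k,\, \nu \vdash_b n-k} \bigl(\mathbb{C}^{\binom{n}{k}} \otimes [\mu] \otimes [\nu]\bigr) \otimes \mathbb{S}_\mu(\mathbb{C}^a) \otimes \mathbb{S}_\nu(\mathbb{C}^b).
\end{align*}

Suppose $c^{\lambda}_{\mu\nu} > 0$. Comparing the above expressions gives the inequality $\dim[\lambda] \le \binom{n}{|\mu|} \dim[\mu] \dim[\nu]$. By the semigroup property (Lemma 6.1) we have $c^{N\lambda}_{N\mu,\,N\nu} > 0$ for all $N \in \mathbb{N}$. Thus $\dim[N\lambda] \le \binom{Nn}{N|\mu|} \dim[N\mu] \dim[N\nu]$ for all $N \in \mathbb{N}$. Then from (6.4) follows

\[ 2^{NnH(\bar\lambda) - o(N)} \le 2^{Nn\,h(|\mu|/n)}\, 2^{N|\mu|H(\bar\mu)}\, 2^{N|\nu|H(\bar\nu)}. \]

We conclude $H(\bar\lambda) \le h\bigl(\frac{|\mu|}{n}\bigr) + \frac{|\mu|}{n} H(\bar\mu) + \frac{|\nu|}{n} H(\bar\nu)$.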

Let $x = (x^{(1)}, x^{(2)}, x^{(3)})$ be a triple of probability vectors $x^{(i)} \in \mathbb{R}^{n_i}$. Let $\theta \in \Theta$ be a weighting. Let $H_\theta(x)$ be the $\theta$-weighted average of the Shannon entropies of the probability vectors $x^{(1)}$, $x^{(2)}$ and $x^{(3)}$,

\[ H_\theta(x) = \theta(1) H(x^{(1)}) + \theta(2) H(x^{(2)}) + \theta(3) H(x^{(3)}). \]

(Note that this notation is slightly different from the notation used in Chapter 4.) We will use the notation $\lambda \vdash_3 n$ to say that $\lambda$ is a triple of partitions of $n$, i.e. $\lambda$ equals $(\lambda^{(1)}, \lambda^{(2)}, \lambda^{(3)})$ where each $\lambda^{(i)}$ is a partition of $n$. We write $\bar\lambda$ for the normalised triple $(\bar\lambda^{(1)}, \bar\lambda^{(2)}, \bar\lambda^{(3)})$.

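Lemma 6.3. Let $\lambda, \mu, \nu$ be three triples of partitions.

(i) If $g_{\lambda^{(i)}\mu^{(i)}\nu^{(i)}} > 0$ for all $i$, then $2^{H_\theta(\bar\lambda)} \le 2^{H_\theta(\bar\mu)}\, 2^{H_\theta(\bar\nu)}$.

(ii) If $\mu \vdash_3 m$, $\nu \vdash_3 n - m$ and $c^{\lambda^{(i)}}_{\mu^{(i)}\nu^{(i)}} > 0$ for all $i$, then $2^{H_\theta(\bar\lambda)} \le 2^{H_\theta(\bar\mu)} + 2^{H_\theta(\bar\nu)}$.

Proof. (i) Suppose $g_{\lambda^{(i)}\mu^{(i)}\nu^{(i)}} > 0$ for all $i$. Then $H(\bar\lambda^{(i)}) \le H(\bar\mu^{(i)}) + H(\bar\nu^{(i)})$ for all $i$ by Lemma 6.2. Thus $\sum_i \theta(i) H(\bar\lambda^{(i)}) \le \sum_i \theta(i) H(\bar\mu^{(i)}) + \sum_i \theta(i) H(\bar\nu^{(i)})$. Then $H_\theta(\bar\lambda) \le H_\theta(\bar\mu) + H_\theta(\bar\nu)$. We conclude $2^{H_\theta(\bar\lambda)} \le 2^{H_\theta(\bar\mu)}\, 2^{H_\theta(\bar\nu)}$.

(ii) Suppose $c^{\lambda^{(i)}}_{\mu^{(i)}\nu^{(i)}} > 0$ for all $i$. Then $H(\bar\lambda^{(i)}) \le \frac{m}{n} H(\bar\mu^{(i)}) + \frac{n-m}{n} H(\bar\nu^{(i)}) + h\bigl(\frac{m}{n}\bigr)$ by Lemma 6.2. We take the $\theta$-weighted average to get $H_\theta(\bar\lambda) \le \frac{m}{n} H_\theta(\bar\mu) + \frac{n-m}{n} H_\theta(\bar\nu) + h\bigl(\frac{m}{n}\bigr)$. We conclude $2^{H_\theta(\bar\lambda)} \le 2^{H_\theta(\bar\mu)} + 2^{H_\theta(\bar\nu)}$ by Lemma 4.9(iv).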

6.5 Hilbert spaces and density operators

Endow the vector space $\mathbb{C}^n$ with a hermitian inner product (one may take the standard hermitian inner product $\langle u, v \rangle = \sum_{i=1}^{n} \overline{u_i}\, v_i$ for $u, v \in \mathbb{C}^n$, where $\overline{\,\cdot\,}$ denotes taking the complex conjugate), so that it is a Hilbert space.

Let $(V_1, \langle \cdot, \cdot \rangle)$ and $(V_2, \langle \cdot, \cdot \rangle)$ be Hilbert spaces. On $V_1 \oplus V_2$ we define the inner product by $\langle u_1 \oplus u_2, v_1 \oplus v_2 \rangle = \langle u_1, v_1 \rangle + \langle u_2, v_2 \rangle$. On $V_1 \otimes V_2$ we define the inner product by $\langle u_1 \otimes u_2, v_1 \otimes v_2 \rangle = \langle u_1, v_1 \rangle \langle u_2, v_2 \rangle$ and extending linearly.

Let $V$ be a Hilbert space. A positive semidefinite hermitian operator $\rho : V \to V$ with trace one is called a density operator. The sequence of eigenvalues of a density operator $\rho$ is a probability vector. Let $\mathrm{spec}(\rho) = (p_1, \ldots, p_n)$ be the sequence of eigenvalues of $\rho$, ordered non-increasingly, $p_1 \ge \cdots \ge p_n$.

Let $V_1$ and $V_2$ be Hilbert spaces. Given a density operator $\rho$ on $V_1 \otimes V_2$, the reduced density operator $\rho_1 = \mathrm{tr}_2\, \rho$ is uniquely defined by the property that $\mathrm{tr}(\rho_1 X_1) = \mathrm{tr}(\rho (X_1 \otimes \mathrm{Id}_{V_2}))$ for all operators $X_1$ on $V_1$. The operator $\rho_1$ is again a density operator. The operation $\mathrm{tr}_2$ is called the partial trace over $V_2$. Explicitly, $\rho_1$ is given by $\langle e_i, \rho_1(e_j) \rangle = \sum_{\ell} \langle e_i \otimes f_\ell, \rho(e_j \otimes f_\ell) \rangle$, where the $e_i$ are some orthonormal basis of $V_1$ and the $f_\ell$ are some orthonormal basis of $V_2$ (the statement is independent of basis choice).

Let $V_i$ be a Hilbert space and consider the tensor product $V_1 \otimes V_2 \otimes V_3$. Associate with a nonzero $t \in V_1 \otimes V_2 \otimes V_3$ the dual element $t^* = \langle t, \cdot \rangle \in (V_1 \otimes V_2 \otimes V_3)^*$. Then

\[ \rho^t = t t^* / \langle t, t \rangle = t \langle t, \cdot \rangle / \langle t, t \rangle \]

is a density operator on $V_1 \otimes V_2 \otimes V_3$. Viewing $\rho^t$ as a density operator on the regrouped space $V_1 \otimes (V_2 \otimes V_3)$, we may take the partial trace of $\rho^t$ over $V_2 \otimes V_3$ as described above. We denote the resulting density operator by $\rho^t_1 = \mathrm{tr}_{23}\, \rho^t$. We similarly define $\rho^t_2 = \mathrm{tr}_{13}\, \rho^t$ and $\rho^t_3 = \mathrm{tr}_{12}\, \rho^t$.
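The explicit formula for the partial trace translates directly into code. The sketch below computes the reduced density matrix $\rho^t_i$ of a 3-tensor given by its coefficients in the standard basis. It is a minimal illustration in pure Python: the dictionary representation, the function names and the example tensor `w` are our own choices, and eigenvalues are only read off when the matrix happens to be diagonal (in general a numerical eigenvalue routine would be needed).

```python
from itertools import product

def reduced_density_matrix(t, dims, leg):
    """Reduced density matrix of t / sqrt(<t, t>) on one tensor leg.

    t maps index triples (i1, i2, i3) to coefficients, dims = (n1, n2, n3),
    and leg is 0, 1 or 2. Entry [i][j] is sum over the other two indices of
    t[i, ...] * conj(t[j, ...]) / <t, t>.
    """
    norm = sum(abs(c) ** 2 for c in t.values())          # <t, t>
    n = dims[leg]
    rho = [[0j] * n for _ in range(n)]
    for (x, cx), (y, cy) in product(t.items(), repeat=2):
        # Only pairs of basis vectors that agree outside `leg` contribute.
        if all(x[k] == y[k] for k in range(3) if k != leg):
            rho[x[leg]][y[leg]] += cx * complex(cy).conjugate() / norm
    return rho

def spectrum_of_diagonal(rho, tol=1e-12):
    """spec(rho) for a matrix that is diagonal; raises otherwise."""
    n = len(rho)
    if any(abs(rho[i][j]) > tol for i in range(n) for j in range(n) if i != j):
        raise ValueError("matrix is not diagonal; use a numerical eigenvalue routine")
    return sorted((rho[i][i].real for i in range(n)), reverse=True)

# Example: e0⊗e0⊗e1 + e0⊗e1⊗e0 + e1⊗e0⊗e0 in C^2 ⊗ C^2 ⊗ C^2 (0-indexed).
w = {(0, 0, 1): 1, (0, 1, 0): 1, (1, 0, 0): 1}
w_spectra = [spectrum_of_diagonal(reduced_density_matrix(w, (2, 2, 2), leg))
             for leg in range(3)]
```

For this example all three reduced density matrices are diagonal with entries $2/3$ and $1/3$, so each ordered marginal spectrum is $(2/3, 1/3)$.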

6.6 Moment polytopes $\mathbf{P}(t)$

We give a brief introduction to moment polytopes. We refer to [Nes84, Bri87, Fra02, Wal14] for more information. We begin with the general setting and then specialise to orbit closures in tensor spaces.

6.6.1 General setting

Let $G$ be a connected reductive algebraic group. (We refer to Kraft [Kra84] and Humphreys [Hum75] for an introduction to algebraic groups.) Fix a maximal torus $T \subseteq G$ and a Borel subgroup $T \subseteq B \subseteq G$. We have the character group $X(T)$, the Weyl group $W$, the root system $\Phi \subseteq X(T)$, and the system of positive roots $\Phi^+ \subseteq \Phi$. For $\lambda, \mu \in X(T)$ we set $\lambda \preccurlyeq \mu$ if $\mu - \lambda$ is a sum of positive roots. Let $V$ be a rational $G$-representation. The restriction of the action of $G$ to $T$ gives a decomposition

\[ V = \bigoplus_{\lambda \in X(T)} V_\lambda, \qquad V_\lambda = \{ v \in V : \forall t \in T,\ t \cdot v = \lambda(t) v \}. \]

This decomposition is called the weight decomposition of $V$. The $\lambda \in X(T)$ with $V_\lambda \neq 0$ are called the weights of $V$ with respect to $T$. The $V_\lambda$ are the weight spaces of $V$. For $v \in V$, let $v_\lambda$ be the component of $v$ in $V_\lambda$. Let $\mathrm{supp}(v) = \{ \lambda : v_\lambda \neq 0 \}$.

Let $E$ be the real vector space $E = X(T) \otimes \mathbb{R}$. The Weyl group $W$ acts on $X(T)$ and thus on $E$. We enlarge $\preccurlyeq$ to a partial order on $E$ as follows. For $x, y \in E$ let $x \preccurlyeq y$ if $y - x$ is a nonnegative linear combination of positive roots. Let $D \subseteq E$ be the positive Weyl chamber. For every $x \in E$ the orbit $W \cdot x$ intersects the positive Weyl chamber $D$ in exactly one point, which we denote by $\mathrm{dom}(x)$.

Let $V$ be a finite-dimensional rational $G$-module. Let $\chi \in X(T) \cap D$ be a dominant character. We denote the $\chi$-isotypical component of $V$ by $V_{(\chi)}$. Let $Z \subseteq V$ be a Zariski closed set. We denote the coordinate ring of $Z$ by $\mathbb{C}[Z]$. We denote the degree-$d$ part of $\mathbb{C}[Z]$ by $\mathbb{C}[Z]_d$. If $Z$ is $G$-stable, then $\mathbb{C}[Z]_d$ is a $G$-module.

Definition 6.4. Let $V$ be a rational $G$-module and $Z \subseteq V$ a nontrivial irreducible closed $G$-stable cone. The moment polytope of $Z$, denoted by $\mathbf{P}(Z)$, is defined as the Euclidean closure in $E$ of the set

\[ R(Z) = \{ \chi/d : (\mathbb{C}[Z]_d)_{(\chi^*)} \neq 0 \} \]

of normalised characters $\chi/d$ for which the $\chi^*$-isotypical component $(\mathbb{C}[Z]_d)_{(\chi^*)}$ is not zero.

Theorem 6.5 (Mumford–Ness [Nes84], Brion [Bri87], Franz [Fra02]). The moment polytope is indeed a convex polytope, and it is equal to the image of the so-called moment map intersected with the positive Weyl chamber,

\[ \mathbf{P}(Z) = \mu(Z \setminus 0) \cap D. \]

Let $Z = \overline{G \cdot v}$ be the orbit closure (in the Zariski topology) of a vector $v \in V \setminus 0$, and suppose $\overline{G \cdot v}$ is a cone.

Lemma 6.6 (See e.g. [Str05]). Suppose $\overline{G \cdot v}$ is a cone. Then

\[ R(\overline{G \cdot v}) = \{ \chi/d : (\mathbb{C}[\overline{G \cdot v}]_d)_{(\chi^*)} \neq 0 \} = \{ \chi/d : (\mathrm{lin}(G \cdot v^{\otimes d}))_{(\chi)} \neq 0 \}. \]

6.6.2 Tensor spaces

We specialise to 3-tensors. Let $V = V_1 \otimes V_2 \otimes V_3$ with $V_i = \mathbb{C}^{n_i}$. Let

\[ G = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}, \qquad T = T_1 \times T_2 \times T_3, \]

with $T_i$ the diagonal matrices in $\mathrm{GL}_{n_i}$. The weight decomposition of $V$ is the decomposition with respect to the standard basis elements $e_{x_1} \otimes e_{x_2} \otimes e_{x_3}$, where $x \in [n_1] \times [n_2] \times [n_3]$. The support $\mathrm{supp}(v)$ is the support of $v$ with respect to the standard basis.

In the current setting there is a beautiful rephrasing of Theorem 6.5 in terms of ordered spectra of reduced density matrices. Recall from Section 6.5 that for $v \in V \setminus 0$ we have a density matrix $\rho^v$ and reduced density matrices $\rho^v_i$, of which we may take the non-increasingly ordered spectra $\mathrm{spec}(\rho^v_i)$.

Theorem 6.7 (Walter–Doran–Gross–Christandl [WDGC13]). Let $Z \subseteq V$ be a nontrivial irreducible closed $G$-stable cone. Then

\[ \mathbf{P}(Z) = \{ (\mathrm{spec}\, \rho^z_1, \mathrm{spec}\, \rho^z_2, \mathrm{spec}\, \rho^z_3) : z \in Z \setminus 0 \}. \]

Let $v \in V \setminus 0$. We consider the moment polytope of the orbit closure $Z = \overline{G \cdot v}$. In this setting Lemma 6.6 specialises to the following.

Lemma 6.8 (See e.g. [Str05]).

\begin{align*}
R(\overline{G \cdot v}) &= \{ \chi/d : (\mathbb{C}[\overline{G \cdot v}]_d)_{(\chi^*)} \neq 0 \} \\
&= \{ \chi/d : (\mathrm{lin}(G \cdot v^{\otimes d}))_{(\chi)} \neq 0 \} \\
&= \{ \chi/d : P_\chi v^{\otimes d} \neq 0 \},
\end{align*}

where $P_\chi = P_{\chi^{(1)}} \otimes P_{\chi^{(2)}} \otimes P_{\chi^{(3)}}$ with $P_{\chi^{(i)}} : V_i^{\otimes d} \to V_i^{\otimes d}$ the projector onto the isotypical component of type $\chi^{(i)}$ discussed in Section 6.2.

On the other hand, Theorem 6.7 immediately gives a description of the moment polytope $\mathbf{P}(\overline{G \cdot v})$ in terms of ordered spectra of reduced density matrices.

Theorem 6.9. Let $v \in V \setminus 0$. Then

\[ \mathbf{P}(\overline{G \cdot v}) = \{ (\mathrm{spec}\, \rho^u_1, \mathrm{spec}\, \rho^u_2, \mathrm{spec}\, \rho^u_3) : u \in \overline{G \cdot v} \setminus 0 \}. \]

Summarising, we have two descriptions of the moment polytope: a representation-theoretic or invariant-theoretic description (Lemma 6.8) and a quantum marginal spectra description (Theorem 6.9). These two descriptions are the key to proving the properties of the quantum functionals that we need.

6.7 Quantum functionals $F^\theta(t)$

We will now define the quantum functionals and prove that they are universal spectral points.

Let $p = (p_1, \ldots, p_n) \in \mathbb{R}^n$ be a probability vector, i.e. $\sum_{i=1}^{n} p_i = 1$ and $p_i \ge 0$ for all $i \in [n]$. Recall that $H(p)$ denotes the Shannon entropy of the probability vector $p$, $H(p) = \sum_{i=1}^{n} p_i \log_2 (1/p_i)$. Let $x = (x^{(1)}, x^{(2)}, x^{(3)})$ be a triple of probability vectors $x^{(i)} \in \mathbb{R}^{n_i}$. Let $\theta \in \Theta$ be a weighting. Recall that $H_\theta(x)$ denotes the $\theta$-weighted average of the Shannon entropies of the three probability vectors $x^{(1)}, x^{(2)}, x^{(3)}$,

\[ H_\theta(x) = \theta(1) H(x^{(1)}) + \theta(2) H(x^{(2)}) + \theta(3) H(x^{(3)}). \]

Let $V = \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$. Let $G = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$. Let $v \in V \setminus 0$. We use the notation $\mathbf{P}(v) = \mathbf{P}(\overline{G \cdot v})$ for the moment polytope of the orbit closure of $v$.

Definition 6.10. For $\theta \in \Theta$ and $v \in V \setminus 0$ let

\[ F^\theta(v) = \max \{ 2^{H_\theta(x)} : x \in \mathbf{P}(v) \}. \]

Let $F^\theta(0) = 0$. We call the functions $F^\theta$ the quantum functionals. The name quantum functional comes from the fact that the moment polytope $\mathbf{P}(t)$ consists of triples of quantum marginal entropies.

Theorem 6.11. Let $\mathcal{T}$ be the semiring of 3-tensors over $\mathbb{C}$. Let $\le$ be the restriction preorder. For $\theta \in \Theta$,

\[ F^\theta \in \mathbf{X}(\mathcal{T}, \le). \]

In other words, $F^\theta$ is a semiring homomorphism $\mathcal{T} \to \mathbb{R}_{\ge 0}$ which is monotone under restriction $\le$. In fact, $F^\theta$ is monotone under degeneration $\trianglelefteq$.

Remark 6.12. The results in this chapter generalise to $k$-tensors over $\mathbb{C}$. In our paper [CVZ18] we discuss this general situation in detail, and make a distinction between upper quantum functionals and lower quantum functionals.

Let $p \in \mathbb{R}^n$ and $q \in \mathbb{R}^m$ be probability vectors. The tensor product $p \otimes q \in \mathbb{R}^{nm}$, defined by

\[ p \otimes q = (p_i q_j : i \in [n],\ j \in [m]), \]

is a probability vector. The direct sum $p \oplus q \in \mathbb{R}^{n+m}$ is defined by

\[ p \oplus q = (p_1, \ldots, p_n, q_1, \ldots, q_m), \]

and for every $\alpha \in [0, 1]$ the weighted direct sum $\alpha p \oplus (1 - \alpha) q$ is a probability vector.

Let $x = (x^{(1)}, x^{(2)}, x^{(3)})$ and $y = (y^{(1)}, y^{(2)}, y^{(3)})$ be triples of probability vectors. We define the tensor product $x \otimes y$ elementwise,

\[ x \otimes y = (x^{(1)} \otimes y^{(1)},\ x^{(2)} \otimes y^{(2)},\ x^{(3)} \otimes y^{(3)}). \]

We define the direct sum $x \oplus y$ elementwise,

\[ x \oplus y = (x^{(1)} \oplus y^{(1)},\ x^{(2)} \oplus y^{(2)},\ x^{(3)} \oplus y^{(3)}). \]

For $x \otimes y$ and $x \oplus y$ to be in the moment polytope we will need to reorder the components non-increasingly. For a triple of probability vectors $x = (x^{(1)}, x^{(2)}, x^{(3)})$ let $\mathrm{dom}(x)$ be the triple of probability vectors obtained from $x$ by reordering the components of each $x^{(i)}$ such that they become non-increasing. Let $\mathrm{dom}(S) = \{ \mathrm{dom}(x) : x \in S \}$.
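The operations on probability vectors just introduced are simple enough to spell out in code. The sketch below uses exact rational arithmetic; the function names are our own and the weighted direct sum takes the weight $\alpha$ as an explicit argument.

```python
from fractions import Fraction

def tensor(p, q):
    """p ⊗ q = (p_i q_j : i, j), listed with i as the outer index."""
    return tuple(a * b for a in p for b in q)

def weighted_direct_sum(alpha, p, q):
    """alpha * p ⊕ (1 - alpha) * q, a probability vector when p and q are."""
    return tuple(alpha * a for a in p) + tuple((1 - alpha) * b for b in q)

def dom(p):
    """Reorder the entries of p so that they become non-increasing."""
    return tuple(sorted(p, reverse=True))

def tensor_triple(x, y):
    """Elementwise tensor product of two triples of probability vectors."""
    return tuple(tensor(a, b) for a, b in zip(x, y))

def dom_triple(x):
    """Apply dom to each of the three components of a triple."""
    return tuple(dom(a) for a in x)

p = (Fraction(2, 3), Fraction(1, 3))
q = (Fraction(1, 2), Fraction(1, 2))
mixed = dom(weighted_direct_sum(Fraction(1, 4), p, (Fraction(1),)))
```

Here `tensor(p, q)` is $(1/3, 1/3, 1/6, 1/6)$ and `mixed` is $(3/4, 1/6, 1/12)$; both sum to one.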

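For $v \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$ we will use the notation $G(v) = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$ to denote the group that naturally corresponds to the space that $v$ lives in. We will use the notation $\mathbf{P}(v) = \mathbf{P}(\overline{G(v) \cdot v})$ for the moment polytope of the orbit closure of $v$.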

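Theorem 6.13. Let $s \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$ and $t \in \mathbb{C}^{m_1} \otimes \mathbb{C}^{m_2} \otimes \mathbb{C}^{m_3}$.

(i) $\mathrm{dom}\bigl(\mathbf{P}(s) \otimes \mathbf{P}(t)\bigr) \subseteq \mathbf{P}(s \otimes t)$.

(ii) $\forall \alpha \in [0, 1]$: $\mathrm{dom}\bigl(\alpha \mathbf{P}(s) \oplus (1 - \alpha) \mathbf{P}(t)\bigr) \subseteq \mathbf{P}(s \oplus t)$.

(iii) If $s, t \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3} \setminus 0$ and $s \in \overline{G(t) \cdot t}$, then $\mathbf{P}(s) \subseteq \mathbf{P}(t)$.

(iv) $\mathbf{P}(s \oplus 0) = \mathbf{P}(s) \oplus 0$.

(v) $\mathbf{P}(\langle 1 \rangle) = \{ ((1), (1), (1)) \}$ with $\langle 1 \rangle = e_1 \otimes e_1 \otimes e_1 \in \mathbb{C}^1 \otimes \mathbb{C}^1 \otimes \mathbb{C}^1$.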

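Proof. To prove statements (i) and (ii), let $x \in \mathbf{P}(s)$ and $y \in \mathbf{P}(t)$. Then there are elements $a \in \overline{G(s) \cdot s}$ and $b \in \overline{G(t) \cdot t}$ with ordered marginal spectra $x$ and $y$,

\begin{align*}
x &= (\mathrm{spec}\, \rho^a_1, \mathrm{spec}\, \rho^a_2, \mathrm{spec}\, \rho^a_3), \\
y &= (\mathrm{spec}\, \rho^b_1, \mathrm{spec}\, \rho^b_2, \mathrm{spec}\, \rho^b_3).
\end{align*}

We prove statement (i). We have $a \otimes b \in \overline{G(s \otimes t) \cdot s \otimes t}$. Thus

\[ \mathrm{dom}(x \otimes y) = (\mathrm{spec}\, \rho^{a \otimes b}_1, \mathrm{spec}\, \rho^{a \otimes b}_2, \mathrm{spec}\, \rho^{a \otimes b}_3) \in \mathbf{P}(s \otimes t). \]

We conclude $\mathrm{dom}(\mathbf{P}(s) \otimes \mathbf{P}(t)) \subseteq \mathbf{P}(s \otimes t)$.

We prove statement (ii). Let $\alpha \in [0, 1]$. Define the tensor $u(\alpha) \in \mathbb{C}^{n_1 + m_1} \otimes \mathbb{C}^{n_2 + m_2} \otimes \mathbb{C}^{n_3 + m_3}$ by

\[ u(\alpha) = \frac{\sqrt{\alpha}}{\sqrt{\langle a, a \rangle}}\, a \oplus \frac{\sqrt{1 - \alpha}}{\sqrt{\langle b, b \rangle}}\, b. \]

Then $u(\alpha) \in \overline{G(s \oplus t) \cdot s \oplus t}$. We have $\rho^{u(\alpha)}_i = \alpha \rho^a_i \oplus (1 - \alpha) \rho^b_i$. From the observation

\[ \mathrm{spec}(\alpha \rho^a_i \oplus (1 - \alpha) \rho^b_i) = \mathrm{dom}(\alpha x^{(i)} \oplus (1 - \alpha) y^{(i)}) \]

follows $\mathrm{dom}(\alpha x \oplus (1 - \alpha) y) \in \mathbf{P}(\overline{G(s \oplus t) \cdot (s \oplus t)})$. We conclude

\[ \mathrm{dom}(\alpha \mathbf{P}(s) \oplus (1 - \alpha) \mathbf{P}(t)) \subseteq \mathbf{P}(s \oplus t). \]

We have thus proven statements (i) and (ii).

We prove statement (iii). Let $G = G(t) = G(s)$. Let $s \in \overline{G \cdot t}$. Then $\overline{G \cdot s} \subseteq \overline{G \cdot t}$, so we have a $G$-equivariant restriction map $\mathbb{C}[\overline{G \cdot t}] \twoheadrightarrow \mathbb{C}[\overline{G \cdot s}]$ on the coordinate rings. Let $\chi/d \in R(\overline{G \cdot s})$ with $(\mathbb{C}[\overline{G \cdot s}]_d)_{(\chi^*)} \neq 0$. Then also $(\mathbb{C}[\overline{G \cdot t}]_d)_{(\chi^*)} \neq 0$ by Schur's lemma. Thus $\chi/d \in R(\overline{G \cdot t}) \subseteq \mathbf{P}(\overline{G \cdot t})$. We conclude $\mathbf{P}(s) \subseteq \mathbf{P}(t)$.

We prove statement (iv). Let $\chi/d \in R(\overline{G(s \oplus 0) \cdot (s \oplus 0)})$ with $P_\chi (s \oplus 0)^{\otimes d} \neq 0$. Recall from Section 6.2 that $P_\chi$ is given by the action of an element in the group algebra $\mathbb{C}[S_d]$, which we also denoted by $P_\chi$. From this viewpoint we see that also $P_\chi s^{\otimes d} \neq 0$. So $\chi/d \in R(\overline{G(s) \cdot s})$.

Statement (v) is a direct observation.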

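Corollary 6.14.

(i) $F^\theta(s) F^\theta(t) \le F^\theta(s \otimes t)$.

(ii) $F^\theta(s) + F^\theta(t) \le F^\theta(s \oplus t)$.

(iii) If $s \trianglelefteq t$, then $F^\theta(s) \le F^\theta(t)$.

(iv) $F^\theta(\langle 1 \rangle) = 1$.

Proof. (i) Let $x \in \mathbf{P}(s)$ and $y \in \mathbf{P}(t)$. Then $\mathrm{dom}(x \otimes y) \in \mathbf{P}(s \otimes t)$ by Theorem 6.13, and reordering does not change the entropy. It is a basic fact that $H_\theta(x) + H_\theta(y) = H_\theta(x \otimes y)$ (Lemma 4.9), so $2^{H_\theta(x)}\, 2^{H_\theta(y)} = 2^{H_\theta(x \otimes y)}$. We conclude $F^\theta(s) F^\theta(t) \le F^\theta(s \otimes t)$.

(ii) Let $x \in \mathbf{P}(s)$ and $y \in \mathbf{P}(t)$. Then by Theorem 6.13, for all $\alpha \in [0, 1]$,

\[ \mathrm{dom}(\alpha x \oplus (1 - \alpha) y) \in \mathbf{P}(s \oplus t). \]

It is a basic fact that $\alpha H_\theta(x) + (1 - \alpha) H_\theta(y) + h(\alpha) = H_\theta(\alpha x \oplus (1 - \alpha) y)$ (Lemma 4.9). Thus for any $\alpha \in [0, 1]$ we have $2^{\alpha H_\theta(x) + (1 - \alpha) H_\theta(y) + h(\alpha)} \le F^\theta(s \oplus t)$. Using Lemma 4.9(iv) we conclude $F^\theta(s) + F^\theta(t) \le F^\theta(s \oplus t)$.

(iii) This follows from statements (iii) and (iv) of Theorem 6.13, since by definition the degeneration $s \trianglelefteq t$ means $s \oplus 0 \in \overline{G(t \oplus 0) \cdot (t \oplus 0)}$.

(iv) This follows from statement (v) of Theorem 6.13.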
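The two entropy facts used in this proof can be checked numerically. The sketch below verifies additivity of $H$ under $\otimes$, the formula for the entropy of a weighted direct sum, and the optimisation over $\alpha$ that turns the bound in (ii) into a sum. We assume here that Lemma 4.9(iv), which is not restated in this chapter, is the identity $\max_{\alpha \in [0,1]} 2^{\alpha a + (1-\alpha) b + h(\alpha)} = 2^a + 2^b$; the function names are our own.

```python
from math import log2

def entropy(p):
    """Shannon entropy in bits, with the convention 0 * log(1/0) = 0."""
    return sum(x * log2(1 / x) for x in p if x > 0)

def binary_entropy(alpha):
    """h(alpha) = H((alpha, 1 - alpha))."""
    return entropy((alpha, 1 - alpha))

def tensor(p, q):
    return tuple(a * b for a in p for b in q)

def weighted_direct_sum(alpha, p, q):
    return tuple(alpha * a for a in p) + tuple((1 - alpha) * b for b in q)

def exponent(alpha, a, b):
    """alpha * a + (1 - alpha) * b + h(alpha), the exponent appearing in part (ii)."""
    return alpha * a + (1 - alpha) * b + binary_entropy(alpha)

def best_alpha(a, b):
    """Assumed maximiser of exponent(alpha, a, b): alpha = 2^a / (2^a + 2^b)."""
    return 2 ** a / (2 ** a + 2 ** b)

p, q, alpha = (0.5, 0.3, 0.2), (0.7, 0.3), 0.35
additivity_gap = entropy(tensor(p, q)) - (entropy(p) + entropy(q))
direct_sum_gap = entropy(weighted_direct_sum(alpha, p, q)) - (
    alpha * entropy(p) + (1 - alpha) * entropy(q) + binary_entropy(alpha))

a, b = entropy(p), entropy(q)
optimum = 2 ** exponent(best_alpha(a, b), a, b)        # should equal 2^a + 2^b
grid_max = max(2 ** exponent(k / 1000, a, b) for k in range(1001))
```

Both gaps vanish up to rounding, `optimum` agrees with $2^a + 2^b$, and no grid point exceeds it.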

Theorem 6.15.

(i) $R(s \otimes t) \subseteq \{ \lambda/N : \exists\, \mu/N \in R(s),\ \nu/N \in R(t) :\ g_{\lambda^{(i)}\mu^{(i)}\nu^{(i)}} > 0 \text{ for all } i \}$.

(ii) $R(s \oplus t) \subseteq \{ \lambda/N : \exists\, \mu/m \in R(s),\ \nu/(N - m) \in R(t) :\ c^{\lambda^{(i)}}_{\mu^{(i)}\nu^{(i)}} > 0 \text{ for all } i \}$.

Proof. (i) Let $s \in V_1 \otimes V_2 \otimes V_3$ and let $t \in W_1 \otimes W_2 \otimes W_3$. Let $\lambda/N \in R(s \otimes t)$ with $P_\lambda (s \otimes t)^{\otimes N} \neq 0$. Let $\pi$ be the natural reordering map

\[ \pi : \bigl((V_1 \otimes W_1) \otimes (V_2 \otimes W_2) \otimes (V_3 \otimes W_3)\bigr)^{\otimes N} \to (V_1 \otimes V_2 \otimes V_3)^{\otimes N} \otimes (W_1 \otimes W_2 \otimes W_3)^{\otimes N}. \]

Then

\[ (s \otimes t)^{\otimes N} = \sum_{\mu, \nu} \pi^{-1} (P_\mu \otimes P_\nu) \pi (s \otimes t)^{\otimes N}. \]

Let $\mu, \nu \vdash_3 N$ with $P_\lambda \pi^{-1} (P_\mu \otimes P_\nu) \pi (s \otimes t)^{\otimes N} \neq 0$. Then $P_\mu s^{\otimes N} \neq 0$ and $P_\nu t^{\otimes N} \neq 0$, i.e. $\mu/N \in R(s)$ and $\nu/N \in R(t)$. Moreover, $P_\lambda \pi^{-1} (P_\mu \otimes P_\nu) \pi \neq 0$, which means the Kronecker coefficients $g_{\lambda^{(i)}\mu^{(i)}\nu^{(i)}}$ are nonzero.

(ii) Let $\lambda/N \in R(s \oplus t)$ with $P_\lambda (s \oplus t)^{\otimes N} \neq 0$. Let us expand $(s \oplus t)^{\otimes N}$ as

\[ (s \oplus t)^{\otimes N} = s^{\otimes N} \oplus (s^{\otimes N-1} \otimes t) \oplus \cdots \oplus t^{\otimes N}. \]

Then $P_\lambda$ does not vanish on some summand, which we may assume to be of the form $s^{\otimes m} \otimes t^{\otimes N-m}$. Let $\pi$ be the natural projection

\[ \pi : \bigl((V_1 \oplus W_1) \otimes (V_2 \oplus W_2) \otimes (V_3 \oplus W_3)\bigr)^{\otimes N} \to (V_1 \otimes V_2 \otimes V_3)^{\otimes m} \otimes (W_1 \otimes W_2 \otimes W_3)^{\otimes N-m}. \]

Let $\mu, \nu$ with $P_\lambda \pi^{-1} (P_\mu \otimes P_\nu) \pi (s \oplus t)^{\otimes N} \neq 0$. Then $P_\mu s^{\otimes m} \neq 0$ and $P_\nu t^{\otimes N-m} \neq 0$. Moreover, $P_\lambda \pi^{-1} (P_\mu \otimes P_\nu) \pi \neq 0$. Therefore the Littlewood–Richardson coefficients $c^{\lambda^{(i)}}_{\mu^{(i)}\nu^{(i)}}$ are nonzero.

Corollary 6.16.

(i) $F^\theta(s \otimes t) \le F^\theta(s) F^\theta(t)$.

(ii) $F^\theta(s \oplus t) \le F^\theta(s) + F^\theta(t)$.

Proof. (i) Let $\lambda/N \in R(s \otimes t)$. By Theorem 6.15 there is a $\mu/N \in R(s)$ and a $\nu/N \in R(t)$ such that the Kronecker coefficient $g_{\lambda^{(i)}\mu^{(i)}\nu^{(i)}}$ is nonzero for every $i$. Then $2^{H_\theta(\bar\mu)} \le F^\theta(s)$ and $2^{H_\theta(\bar\nu)} \le F^\theta(t)$ by definition of $F^\theta$. The Kronecker coefficients being nonzero implies

\[ 2^{H_\theta(\bar\lambda)} \le 2^{H_\theta(\bar\mu)}\, 2^{H_\theta(\bar\nu)} \]

by Lemma 6.3. We conclude $F^\theta(s \otimes t) \le F^\theta(s) F^\theta(t)$.

(ii) Let $\lambda/N \in R(s \oplus t)$. Then by Theorem 6.15 there are $\mu/m \in R(s)$ and $\nu/(N - m) \in R(t)$ such that the Littlewood–Richardson coefficient $c^{\lambda^{(i)}}_{\mu^{(i)}\nu^{(i)}}$ is nonzero for every $i$. This means

\[ 2^{H_\theta(\bar\lambda)} \le 2^{H_\theta(\bar\mu)} + 2^{H_\theta(\bar\nu)} \]

by Lemma 6.3. We conclude $F^\theta(s \oplus t) \le F^\theta(s) + F^\theta(t)$.

Proof of Theorem 6.11. Corollary 6.14 and Corollary 6.16 together prove Theorem 6.11.

6.8 Outer approximation

In this section we discuss an outer approximation of $\mathbf{P}(t)$. We will use this outer approximation to show that the quantum functionals are at most the support functionals.

Let $\preccurlyeq$ be the dominance order, i.e. majorization order, on triples of probability vectors. For any set $S \subseteq \mathbb{R}^{n_1} \times \mathbb{R}^{n_2} \times \mathbb{R}^{n_3}$ of triples of probability vectors, let $S^{\preccurlyeq}$ denote the upward closure with respect to $\preccurlyeq$,

\[ S^{\preccurlyeq} = \{ y \in \mathbb{R}^{n_1} \times \mathbb{R}^{n_2} \times \mathbb{R}^{n_3} : \exists x \in S,\ x \preccurlyeq y \}. \]

Let $\mathrm{conv}(S)$ denote the convex hull of $S$ in $\mathbb{R}^{n_1} \times \mathbb{R}^{n_2} \times \mathbb{R}^{n_3}$. Recall that for $x \in S$ we defined $\mathrm{dom}(x)$ as the triple of probability vectors obtained from $x = (x^{(1)}, x^{(2)}, x^{(3)})$ by reordering the components of each $x^{(i)}$ such that they become non-increasing, and $\mathrm{dom}(S) = \{ \mathrm{dom}(x) : x \in S \}$.
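For probability vectors, the dominance order can be tested by comparing partial sums of the non-increasingly ordered entries. The following sketch implements this test for single vectors and for triples; it assumes, as is standard for majorization, that $x \preccurlyeq y$ means every partial sum of $\mathrm{dom}(x)$ is at most the corresponding partial sum of $\mathrm{dom}(y)$. The function names are our own.

```python
from itertools import accumulate
from math import log2

def entropy(p):
    """Shannon entropy in bits, with the convention 0 * log(1/0) = 0."""
    return sum(v * log2(1 / v) for v in p if v > 0)

def majorized_by(x, y, tol=1e-12):
    """True if x ≼ y: partial sums of sorted x never exceed those of sorted y."""
    n = max(len(x), len(y))
    xs = sorted(x, reverse=True) + [0.0] * (n - len(x))   # pad with zeros
    ys = sorted(y, reverse=True) + [0.0] * (n - len(y))
    return all(a <= b + tol for a, b in zip(accumulate(xs), accumulate(ys)))

def triple_majorized_by(x, y):
    """Componentwise dominance order on triples of probability vectors."""
    return all(majorized_by(a, b) for a, b in zip(x, y))

flat = (0.4, 0.3, 0.3)
peaked = (0.6, 0.3, 0.1)
```

Here `majorized_by(flat, peaked)` holds while the converse does not, and the flatter vector has the larger entropy, which is the monotonicity of $H$ along $\preccurlyeq$ used later when passing from $S$ to $S^{\preccurlyeq}$.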

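Theorem 6.17 (Strassen [Str05]). Let $v \in V \setminus 0$. Then

\[ \mathbf{P}(v) \subseteq (\mathrm{dom}\, \mathrm{conv}\, \mathrm{supp}\, v)^{\preccurlyeq}. \tag{6.5} \]

Proof. We give the proof for the convenience of the reader. Let $\chi/d \in R(\overline{G \cdot v})$. Then $(\mathrm{lin}(G \cdot v^{\otimes d}))_{(\chi)} \neq 0$. Let $M_\chi \subseteq \mathrm{lin}(G \cdot v^{\otimes d})$ be a simple $G$-submodule with highest weight $\chi$. Let $N \subseteq V^{\otimes d}$ be the $G$-module complement, $N \oplus M_\chi = V^{\otimes d}$. Then $v^{\otimes d}$ is not in $N$. Let $v = \bigoplus_{\gamma \in \mathrm{supp}\, v} v_\gamma$ be the weight decomposition. Then $v^{\otimes d}$ is a sum of tensor products of the $v_\gamma$. At least one summand is not in $N$, say of weight $\eta = \sum_\gamma d_\gamma \gamma$ with $\sum_\gamma d_\gamma = d$. The projection $V^{\otimes d} \to M_\chi$ along $N$ maps this summand onto a nonzero weight vector of weight $\eta$. So $\eta$ is a weight of $M_\chi$. Then also $\mathrm{dom}(\eta)$ is a weight of $M_\chi$. Since $\chi$ is the highest weight of $M_\chi$, $\mathrm{dom}(\eta) \preccurlyeq \chi$. Then $\mathrm{dom}(\eta/d) \preccurlyeq \chi/d$. We have $\eta/d = \sum_\gamma \frac{d_\gamma}{d} \gamma \in \mathrm{conv}\, \mathrm{supp}\, v$. We conclude $R(\overline{G \cdot v}) \subseteq (\mathrm{dom}\, \mathrm{conv}\, \mathrm{supp}\, v)^{\preccurlyeq}$ and thus $\mathbf{P}(\overline{G \cdot v}) \subseteq (\mathrm{dom}\, \mathrm{conv}\, \mathrm{supp}\, v)^{\preccurlyeq}$.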

6.9 Inner approximation for free tensors

In this section we discuss an inner approximation for the moment polytope of a free tensor. We will use this inner approximation in the next section to prove that the quantum functionals coincide with the support functionals when restricted to free tensors. We will prove that not all tensors are free.

We say a set $\Phi \subseteq [n_1] \times [n_2] \times [n_3]$ is free if every two different elements of $\Phi$ differ in at least two coordinates, in other words, if the elements of $\Phi$ have pairwise Hamming distance at least two. We say $v \in V = \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$ is free if for some $g \in G(v) = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$ the support $\mathrm{supp}(g \cdot v) \subseteq [n_1] \times [n_2] \times [n_3]$ is free. (Free is called schlicht in [Str05].)

Theorem 6.18 (Strassen [Str05]). Let $v \in V \setminus 0$ with $\mathrm{supp}(v)$ free. Then

\[ \mathrm{dom}\, \mathrm{conv}\, \mathrm{supp}\, v \subseteq \mathbf{P}(v). \]

Proof. We refer to [Str05].

Corollary 6.19. Let $v \in V \setminus 0$ with $\mathrm{supp}(v)$ free. Then

\[ \mathbf{P}(v)^{\preccurlyeq} = (\mathrm{dom}\, \mathrm{conv}\, \mathrm{supp}\, v)^{\preccurlyeq}. \]

Proof. By Theorem 6.18, $\mathrm{dom}\, \mathrm{conv}\, \mathrm{supp}\, v \subseteq \mathbf{P}(v)$. We take the upward closure on both sides to get $(\mathrm{dom}\, \mathrm{conv}\, \mathrm{supp}\, v)^{\preccurlyeq} \subseteq \mathbf{P}(v)^{\preccurlyeq}$. On the other hand, from Theorem 6.17 follows $\mathbf{P}(v)^{\preccurlyeq} \subseteq (\mathrm{dom}\, \mathrm{conv}\, \mathrm{supp}\, v)^{\preccurlyeq}$.

Remark 6.20. Recall that $v \in V$ is oblique if the support $\mathrm{supp}(g \cdot v)$ is an antichain for some $g \in G(v)$ (Section 4.4). Such antichains are free, so oblique tensors are free. Thus $\{\text{tight}\} \subseteq \{\text{oblique}\} \subseteq \{\text{free}\}$. Like the tight tensors and oblique tensors, free tensors form a semigroup under $\otimes$ and $\oplus$.

Proposition 6.21. For $n \ge 5$ there exists a tensor that is not free in $\mathbb{C}^n \otimes \mathbb{C}^n \otimes \mathbb{C}^n$.

Proof. First, we upper bound the maximal size of a free support. Let $\Phi \subseteq [n] \times [n] \times [n]$ be free. Any two distinct elements in $\Phi$ are still distinct if we forget the third coordinate of each. Therefore $|\Phi| = |\{ (\alpha_1, \alpha_2) : \alpha \in \Phi \}| \le n^2$. (This is a special case of the Singleton bound [Sin64] from coding theory. This upper bound is tight, since $\Phi = \{ (a, b, c) : a, b, c \in [n],\ c = a + b \bmod n \}$ is free and has size $n^2$.) Second, we apply the following observation of Bürgisser [Bur90, page 3]. Let

\[ Z_n = \{ t \in \mathbb{C}^n \otimes \mathbb{C}^n \otimes \mathbb{C}^n : \exists g \in G(t),\ |\mathrm{supp}(g \cdot t)| < n^3 - 3n^2 \}. \]

Let $Y_n = \mathbb{C}^n \otimes \mathbb{C}^n \otimes \mathbb{C}^n \setminus Z_n$. Then the set $Y_n$ is Zariski open and nonempty. Now let $n \ge 5$ and let $t \in Y_n$. Then for all $g \in G(t)$ we have $|\mathrm{supp}(g \cdot t)| \ge n^3 - 3n^2 > n^2$. We conclude $t$ is not free.
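The combinatorial ingredients of this proof are easy to test. The sketch below checks freeness of a support via pairwise Hamming distance, builds the support $\{(a, b, a + b \bmod n)\}$ (0-indexed here, which is our own convention), and evaluates the inequality $n^3 - 3n^2 > n^2$ that makes the argument work from $n = 5$ on. Function names are illustrative.

```python
from itertools import combinations

def hamming(a, b):
    """Number of coordinates in which the tuples a and b differ."""
    return sum(1 for u, v in zip(a, b) if u != v)

def is_free(support):
    """True if any two distinct elements differ in at least two coordinates."""
    return all(hamming(a, b) >= 2 for a, b in combinations(support, 2))

def cyclic_support(n):
    """The support {(a, b, c) : c = a + b mod n}, with indices in {0, ..., n-1}."""
    return {(a, b, (a + b) % n) for a in range(n) for b in range(n)}

def generic_support_too_large(n):
    """True if n^3 - 3n^2 exceeds the maximal size n^2 of a free support."""
    return n ** 3 - 3 * n ** 2 > n ** 2

w_support = {(0, 0, 1), (0, 1, 0), (1, 0, 0)}
```

For example, `cyclic_support(5)` is free of size 25, while `generic_support_too_large(4)` is false and `generic_support_too_large(5)` is true, which matches the threshold $n \ge 5$ in Proposition 6.21.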

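6.10 Quantum functionals versus support functionals

We discussed the support functionals $\zeta^\theta \in \mathbf{X}(\{\text{oblique 3-tensors over } \mathbb{F}\})$ in Chapter 4. We recall the definition over $\mathbb{C}$. Let $V = \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$. For $\theta \in \Theta = \mathcal{P}([3])$ and $t \in V \setminus 0$ with $\mathrm{supp}(t)$ oblique,

\[ \zeta^\theta(t) = \max \{ 2^{H_\theta(P)} : P \in \mathcal{P}(\mathrm{supp}(t)) \}. \]

We also discussed an extension of $\zeta^\theta$ to all 3-tensors over $\mathbb{C}$, the upper support functional

\[ \zeta^\theta(t) = \min_{g \in G(t)} \max \{ 2^{H_\theta(P)} : P \in \mathcal{P}(\mathrm{supp}(g \cdot t)) \}. \]

We know $\zeta^\theta(s \otimes t) \le \zeta^\theta(s) \zeta^\theta(t)$, $\zeta^\theta(s \oplus t) = \zeta^\theta(s) + \zeta^\theta(t)$, $\zeta^\theta(\langle 1 \rangle) = 1$, and $s \le t \Rightarrow \zeta^\theta(s) \le \zeta^\theta(t)$ for any $s, t \in V$.

The set $\mathrm{conv}\, \mathrm{supp}(g \cdot t)$ is the set of marginals of probability distributions on $\mathrm{supp}(g \cdot t)$. Thus $\mathrm{dom}\, \mathrm{conv}\, \mathrm{supp}(g \cdot t)$ is the set of ordered marginals of probability distributions on $\mathrm{supp}(g \cdot t)$. Therefore

\[ \zeta^\theta(t) = \min_{g \in G(t)} \max_{x \in S(g \cdot t)} 2^{H_\theta(x)} \]

with $S(w) = \mathrm{dom}\, \mathrm{conv}\, \mathrm{supp}\, w$. Let $X \subseteq \mathbb{R}^{n_1} \times \mathbb{R}^{n_2} \times \mathbb{R}^{n_3}$ be a set of triples of probability vectors. From Schur-concavity of the Shannon entropy function follows $\max_{x \in X} 2^{H_\theta(x)} = \max_{x \in X^{\preccurlyeq}} 2^{H_\theta(x)}$. Also $H_\theta(x) = H_\theta(\mathrm{dom}\, x)$.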
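To make the inner maximisation concrete, the sketch below evaluates $2^{H_\theta(x)}$ for the ordered marginals $x$ of a given probability distribution on a support. It only evaluates the objective at one distribution and does not perform the maximisation or the minimisation over $g$; the data layout and function names are our own.

```python
from math import log2

def entropy(p):
    """Shannon entropy in bits, with the convention 0 * log(1/0) = 0."""
    return sum(v * log2(1 / v) for v in p if v > 0)

def ordered_marginals(dist):
    """Ordered marginals of a distribution given as {(i1, i2, i3): probability}."""
    out = []
    for leg in range(3):
        m = {}
        for triple, prob in dist.items():
            m[triple[leg]] = m.get(triple[leg], 0.0) + prob
        out.append(tuple(sorted(m.values(), reverse=True)))
    return out

def objective(dist, theta):
    """2^{H_theta(x)} for the ordered marginals x of dist and a weighting theta."""
    return 2 ** sum(w * entropy(m) for w, m in zip(theta, ordered_marginals(dist)))

# Uniform distribution on the support {(0,0,1), (0,1,0), (1,0,0)}.
w_uniform = {(0, 0, 1): 1 / 3, (0, 1, 0): 1 / 3, (1, 0, 0): 1 / 3}
w_value = objective(w_uniform, (0.2, 0.3, 0.5))

# Uniform distribution on the diagonal support of a unit tensor of size 3.
unit_uniform = {(i, i, i): 1 / 3 for i in range(3)}
unit_value = objective(unit_uniform, (1 / 3, 1 / 3, 1 / 3))
```

For `w_uniform` every marginal is $(2/3, 1/3)$, so the value is $2^{h(1/3)}$ for any weighting $\theta$; for `unit_uniform` every marginal is uniform on three points and the value is $3$.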

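Theorem 6.22. $\zeta^\theta(t) \ge F^\theta(t)$.

Proof. Let $g \in G(t)$ be such that

\[ \max_{x \in S} 2^{H_\theta(x)} = \zeta^\theta(t) \]

with $S = \mathrm{dom}\, \mathrm{conv}\, \mathrm{supp}(g \cdot t)$. We have

\[ \max_{x \in S} 2^{H_\theta(x)} = \max_{x \in S^{\preccurlyeq}} 2^{H_\theta(x)}. \]

By Theorem 6.17, applied to $g \cdot t$, which has the same orbit closure as $t$, we have $\mathbf{P}(t) \subseteq S^{\preccurlyeq}$. We conclude $F^\theta(t) \le \zeta^\theta(t)$.

Theorem 6.23. Let $t \in V$ be free. Then $\zeta^\theta(t) = F^\theta(t)$.

Proof. We know from Theorem 6.22 that $\zeta^\theta(t) \ge F^\theta(t)$. We prove $\zeta^\theta(t) \le F^\theta(t)$. Let $g \in G(t)$ be such that $\mathrm{supp}(g \cdot t)$ is free. Let $S = \mathrm{dom}\, \mathrm{conv}\, \mathrm{supp}(g \cdot t)$. Then $\zeta^\theta(t) \le \max_{x \in S} 2^{H_\theta(x)} = \max_{x \in S^{\preccurlyeq}} 2^{H_\theta(x)}$. By Corollary 6.19 we have $S^{\preccurlyeq} = \mathbf{P}(t)^{\preccurlyeq}$. We conclude $\zeta^\theta(t) \le F^\theta(t)$.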

We can show that the regularised upper support functional equals the quantum functional. As a consequence, the quantum functional is at least the lower support functional $\zeta_\theta$, which was discussed in Chapter 4.

Theorem 6.24. $\lim_{n \to \infty} \zeta^\theta(t^{\otimes n})^{1/n} = F^\theta(t)$.

Proof. We refer the reader to [CVZ18].

Corollary 6.25. $F^\theta(v) \ge \zeta_\theta(v)$.

Proof. By Theorem 6.24, $F^\theta(v) = \lim_{n \to \infty} \zeta^\theta(v^{\otimes n})^{1/n}$. We know $\zeta^\theta(v) \ge \zeta_\theta(v)$ by Theorem 4.15, and thus $\lim_{n \to \infty} \zeta^\theta(v^{\otimes n})^{1/n} \ge \lim_{n \to \infty} \zeta_\theta(v^{\otimes n})^{1/n}$. The lower support functional $\zeta_\theta$ is supermultiplicative under $\otimes$ (Theorem 4.14), so

\[ \lim_{n \to \infty} \zeta_\theta(v^{\otimes n})^{1/n} \ge \zeta_\theta(v). \]

Combining these three statements proves the corollary.

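6.11 Asymptotic slice rank

We proved in Section 4.6 that for oblique $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$ the asymptotic slice rank $\lim_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n}$ exists and equals $\min_{\theta \in \Theta} \zeta^\theta(t)$, with $\Theta = \mathcal{P}([3])$. In this section we prove the analogous statement for the quantum functionals.

Theorem 6.26. Let $t \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$. Then

\[ \lim_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n} = \min_{\theta \in \Theta} F^\theta(t). \]

We work towards the proof of Theorem 6.26. Let $t \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3} \setminus 0$. Let $E^\theta(t) = \log_2 F^\theta(t)$.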

Lemma 6.27. For any $\varepsilon > 0$ there is an $n_0 \in \mathbb{N}$ such that for all $n \ge n_0$ there is a $\lambda/n \in R(t)$ with $\min_{i \in [3]} H(\bar\lambda^{(i)}) \ge \min_{\theta \in \Theta} E^\theta(t) - \varepsilon$.

Proof. By definition,

\[ \min_{\theta \in \Theta} E^\theta(t) = \min_{\theta \in \Theta} \max_{x \in \mathbf{P}(t)} \sum_{j \in [3]} \theta(j) H(x^{(j)}). \]

By von Neumann's minimax theorem, the right-hand side equals

\[ \max_{x \in \mathbf{P}(t)} \min_{\theta \in \Theta} \sum_{j \in [3]} \theta(j) H(x^{(j)}), \]

which equals

\[ \max_{x \in \mathbf{P}(t)} \min_{j \in [3]} H(x^{(j)}). \]

Let $\varepsilon > 0$. Let $\mu/m \in R(t)$ with $\min_{j \in [3]} H(\bar\mu^{(j)}) \ge \min_{\theta \in \Theta} E^\theta(t) - \varepsilon/2$. We will use two facts. We have $(P_{(1)} \otimes P_{(1)} \otimes P_{(1)}) t = t \neq 0$. The triples of partitions $\lambda$ with $P_\lambda t^{\otimes n} \neq 0$ for some $n$ form a semigroup. Let $n \in \mathbb{N}$. We can write $n = qm + r$ with $q, r \in \mathbb{N}$, $0 \le r < m$. Let $\lambda^{(j)} = q\mu^{(j)} + (r)$. Then by the semigroup property $P_\lambda t^{\otimes n} \neq 0$, i.e. $\lambda/n \in R(t)$. We have $\frac{1}{n}(q\mu^{(j)} + (r)) = \frac{qm}{n} \bar\mu^{(j)} + \frac{r}{n} \overline{(r)}$. By concavity of Shannon entropy,

\begin{align*}
H\bigl(\tfrac{1}{n}(q\mu^{(j)} + (r))\bigr) &= H\bigl(\tfrac{qm}{n} \bar\mu^{(j)} + \tfrac{r}{n} \overline{(r)}\bigr) \\
&\ge \tfrac{qm}{n} H(\bar\mu^{(j)}) \\
&\ge \bigl(1 - \tfrac{m}{n}\bigr) H(\bar\mu^{(j)}).
\end{align*}

When $n$ is large enough, $(1 - \frac{m}{n}) H(\bar\mu^{(j)})$ is at least $H(\bar\mu^{(j)}) - \varepsilon/2$. Let $n_0 \in \mathbb{N}$ be such that this is the case for all $j \in [3]$.

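Lemma 6.28. Let $\lambda/n \in R(t)$. Then $\mathrm{SR}(t^{\otimes n}) \ge \min_{i \in [3]} \dim[\lambda^{(i)}]$.

Proof. We have the restriction $t^{\otimes n} \ge P_\lambda t^{\otimes n} \neq 0$. Choose rank-one projections $A_j$ in the vector spaces $\mathbb{S}_{\lambda^{(j)}}(\mathbb{C}^{n_j})$ with

\[ s = (\mathrm{id}_{[\lambda^{(1)}]} \otimes A_1) \otimes (\mathrm{id}_{[\lambda^{(2)}]} \otimes A_2) \otimes (\mathrm{id}_{[\lambda^{(3)}]} \otimes A_3)\, P_\lambda t^{\otimes n} \neq 0. \]

The tensor $s$ is invariant under $S_n$ acting diagonally on $(\mathbb{C}^{n_1})^{\otimes n} \otimes (\mathbb{C}^{n_2})^{\otimes n} \otimes (\mathbb{C}^{n_3})^{\otimes n}$. Thus the marginal spectra $\mathrm{spec}\, \rho^s_i$ are uniform. This implies $s$ is semistable. From [BCC+17, Theorem 4.6] follows that $\mathrm{SR}(s)$ equals $\min_{i \in [3]} \dim[\lambda^{(i)}]$. Since $t^{\otimes n} \ge s$ and slice rank is monotone under restriction, the lemma follows.

Lemma 6.29. $\liminf_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n} \ge \min_{\theta \in \Theta} F^\theta(t)$.

Proof. Let $\varepsilon > 0$. For $n$ large enough, choose $\lambda/n \in R(t)$ as in Lemma 6.27. By Lemma 6.28, $\mathrm{SR}(t^{\otimes n}) \ge \min_{i \in [3]} \dim[\lambda^{(i)}]$. The right-hand side we lower bound by

\[ \min_{i \in [3]} \dim[\lambda^{(i)}] \ge \min_{i \in [3]} 2^{nH(\bar\lambda^{(i)})}\, 2^{-o(n)} \ge 2^{n(\min_{\theta \in \Theta} E^\theta(t) - \varepsilon)}\, 2^{-o(n)}. \]

Then $\liminf_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n} \ge 2^{\min_{\theta \in \Theta} E^\theta(t) - \varepsilon}$. Since $\varepsilon > 0$ was arbitrary, the lemma follows.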

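Lemma 6.30. For every $\theta \in \Theta$, $\limsup_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n} \le F^\theta(t)$.

Proof. Let $n \in \mathbb{N}$. Define $s_1, s_2, s_3 \in (\mathbb{C}^{n_1})^{\otimes n} \otimes (\mathbb{C}^{n_2})^{\otimes n} \otimes (\mathbb{C}^{n_3})^{\otimes n}$ by

\begin{align*}
s_1 &= \Bigl( \sum_{\substack{\lambda^{(1)} \vdash n \\ H(\bar\lambda^{(1)}) \le E^\theta(t)}} P_{\lambda^{(1)}} \otimes \mathrm{Id} \otimes \mathrm{Id} \Bigr)\, t^{\otimes n}, \\
s_2 &= \Bigl( \sum_{\substack{\lambda^{(2)} \vdash n \\ H(\bar\lambda^{(2)}) \le E^\theta(t)}} \mathrm{Id} \otimes P_{\lambda^{(2)}} \otimes \mathrm{Id} \Bigr)\, (t^{\otimes n} - s_1), \\
s_3 &= \Bigl( \sum_{\substack{\lambda^{(3)} \vdash n \\ H(\bar\lambda^{(3)}) \le E^\theta(t)}} \mathrm{Id} \otimes \mathrm{Id} \otimes P_{\lambda^{(3)}} \Bigr)\, (t^{\otimes n} - s_1 - s_2).
\end{align*}

Then $t^{\otimes n} = s_1 + s_2 + s_3$. The slice rank of an element in the image of $P_{\lambda^{(1)}} \otimes \mathrm{Id} \otimes \mathrm{Id}$ is at most $\dim\bigl([\lambda^{(1)}] \otimes \mathbb{S}_{\lambda^{(1)}}(\mathbb{C}^{n_1})\bigr)$, which is at most $2^{nH(\bar\lambda^{(1)}) + o(n)}$ (Section 6.2). Similarly for $\mathrm{Id} \otimes P_{\lambda^{(2)}} \otimes \mathrm{Id}$ and $\mathrm{Id} \otimes \mathrm{Id} \otimes P_{\lambda^{(3)}}$. The tensor $s_1$ is in the image of the sum $\sum_{\lambda^{(1)}} P_{\lambda^{(1)}} \otimes \mathrm{Id} \otimes \mathrm{Id}$ over $\lambda^{(1)} \vdash n$ with at most $n_1$ parts. There are at most $(n + 1)^{n_1}$ such partitions. Thus $\mathrm{SR}(s_1) \le (n + 1)^{n_1}\, 2^{nE^\theta(t) + o(n)}$. Similarly for $s_2$ and $s_3$. Therefore

\[ \limsup_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n} \le \limsup_{n \to \infty} \bigl( 3 (n + 1)^{\max_{i \in [3]} n_i}\, 2^{nE^\theta(t) + o(n)} \bigr)^{1/n}. \tag{6.6} \]

The right-hand side of (6.6) equals $F^\theta(t)$.

Proof of Theorem 6.26. Lemma 6.29 and Lemma 6.30 together prove Theorem 6.26.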

6.12 Conclusion

In this chapter we constructed the first infinite family of spectral points for 3-tensors over $\mathbb{C}$, the quantum functionals. For 30 years the only explicit spectral points known were the gauge points. The constructions in this chapter naturally generalise to higher-order tensors, for which we refer to our paper [CVZ18]. We do not know whether the quantum functionals are all spectral points for 3-tensors over $\mathbb{C}$. Finally, we showed that for complex tensors the asymptotic slice rank exists and equals the minimum value over the quantum functionals.

Chapter 7

Algebraic branching programs: approximation and nondeterminism

This chapter is based on joint work with Karl Bringmann and Christian Ikenmeyer [BIZ17].

71 Introduction

The study of asymptotic tensor rank in previous chapters was originally motivatedby the study of the complexity of matrix multiplication in the algebraic circuitmodel an algebraic model of computation In this chapter we will study severalother algebraic models of computation and algebraic complexity classes

Formulas, the class $\mathrm{VP}_e$ and the determinant

An (arithmetic) formula is a rooted binary tree whose leaves are each labeled with a variable or a field constant, and whose root and intermediate vertices are labeled with either $+$ (addition) or $\times$ (multiplication). In the natural way, via recursion over the tree structure, a formula computes a multivariate polynomial $f$. The formula size of a multivariate polynomial $f$ is the smallest number of vertices required for any formula to compute $f$. Here is an example of a formula of size 7 computing the polynomial $(3 + x)(3 + y)$.

[Figure: formula tree with root $\times$ whose two children are $+$ vertices; the left $+$ has leaves $3$ and $x$, the right $+$ has leaves $3$ and $y$.]
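The definition of a formula and of its size can be made concrete with a small sketch. The tuple encoding of the tree and the function names `evaluate` and `size` below are illustrative assumptions, not notation from this chapter; the example tree is the size-7 formula for $(3 + x)(3 + y)$.

```python
# A formula is a rooted binary tree. In this sketch (an assumed encoding):
#   - a leaf is a variable name (str) or a constant (int),
#   - an inner vertex is a tuple (op, left, right) with op "+" or "*".

def evaluate(formula, point):
    """Evaluate the polynomial computed by `formula` at `point` (dict: name -> value)."""
    if isinstance(formula, str):
        return point[formula]
    if isinstance(formula, int):
        return formula
    op, left, right = formula
    a, b = evaluate(left, point), evaluate(right, point)
    return a + b if op == "+" else a * b


def size(formula):
    """Number of vertices of the formula tree."""
    if isinstance(formula, (str, int)):
        return 1
    _, left, right = formula
    return 1 + size(left) + size(right)


# The example formula for (3 + x)(3 + y).
EXAMPLE = ("*", ("+", 3, "x"), ("+", 3, "y"))

print(size(EXAMPLE))                        # 7
print(evaluate(EXAMPLE, {"x": 2, "y": 4}))  # (3 + 2) * (3 + 4) = 35
```

The size counts four leaves, two addition vertices and one multiplication vertex.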

A sequence of multivariate polynomials $(f_n)_{n \in \mathbb{N}}$ is called a family. Valiant, in his seminal paper [Val79], introduced the complexity class $\mathrm{VP}_e$, which is defined as the set of all families whose formula size is polynomially bounded. (We say a sequence $(a_n)_n \in \mathbb{N}^{\mathbb{N}}$ of natural numbers is polynomially bounded if there exists a univariate polynomial $q$ such that $a_n \le q(n)$ for all $n$.) For example, the family $((x_1)^n + (x_2)^n + \cdots + (x_n)^n)_n$ is in $\mathrm{VP}_e$, because the formula size of this family grows quadratically.

The smallest known formulas for the determinant family $\det_n$ have size $n^{O(\log n)}$. This follows from Berkowitz' algorithm [Ber84], which gives an algebraic circuit of depth $O(\log^2 n)$, and thus by expanding we get an algebraic formula of depth $O(\log^2 n)$, whose size is then trivially bounded by $2^{O(\log^2 n)} = n^{O(\log n)}$. It is a major open question in algebraic complexity theory whether formulas of polynomially bounded size exist for $\det_n$. This question can be phrased in terms of complexity classes as asking whether or not the inclusion $\mathrm{VP}_e \subseteq \mathrm{VP}_s$ is strict. (We will define $\mathrm{VP}_s$ shortly.)

Motivated by this question, we study the closure class $\overline{\mathrm{VP}_e}$ of families of polynomials that can be approximated arbitrarily closely by families in $\mathrm{VP}_e$ (see Section 7.2.4 for the formal definition). Over the field $\mathbb{R}$ or $\mathbb{C}$ one can think of $\overline{\mathrm{VP}_e}$ as the set of families whose border formula size is polynomially bounded. The border formula size of a polynomial $f$ is the smallest number $c$ such that there exists a sequence $g_i$ of polynomials with formula size at most $c$ and $\lim_{i \to \infty} g_i = f$.

Continuous lower bounds

In algebraic complexity theory, problem instances correspond to vectors $v \in \mathbb{F}^n$. A complexity lower bound often takes the form of a function $f : \mathbb{F}^n \to \mathbb{F}$ that is zero on the vectors of "low complexity" and nonzero on $v$. We refer to Grochow [Gro13] for a discussion of settings where complexity lower bounds are obtained in this way (e.g. [NW97, Raz09, LO15, GKKS13, LMR13, BI13]). Over the complex numbers we can in fact assume that these functions $f$ are continuous [Gro13] (and even so-called highest-weight vector polynomials). If $C$ and $D$ are algebraic complexity classes with $C \subseteq D$ (for example $C = \mathrm{VP}_e$ and $D = \mathrm{VP}_s$), then a proof of separation $D \not\subseteq C$ in this continuous manner implies the stronger separation $D \not\subseteq \overline{C}$. In our case it is thus natural to aim for the separation $\mathrm{VP}_s \not\subseteq \overline{\mathrm{VP}_e}$ instead of the slightly weaker $\mathrm{VP}_s \not\subseteq \mathrm{VP}_e$, which provides further motivation for studying $\overline{\mathrm{VP}_e}$. This is exactly analogous to the geometric complexity theory approach of Mulmuley and Sohoni (see e.g. [MS01, MS08] and the exposition [BLMW11, Sec. 9]), which aims to prove the separation $\mathrm{VNP} \not\subseteq \overline{\mathrm{VP}_s}$ to attack Valiant's famous conjecture $\mathrm{VP}_s \ne \mathrm{VNP}$ [Val79]. (Here VNP is the class of p-definable families, see Section 7.2.4.)

New results in this chapter

We prove two new results in this chapter.

Algebraic branching programs of width 2. An algebraic branching program (abp) is a directed acyclic graph with a source vertex $s$ and a sink vertex $t$ that has affine linear forms over the base field $\mathbb{F}$ as edge labels. Moreover, we require that each vertex is labeled with an integer (its layer) and that edges in the abp only point from vertices in layer $i$ to vertices in layer $i+1$. The width of an abp is the cardinality of its largest layer. The size of an abp is the number of its vertices. The value of an abp is the sum of the values of all $s$–$t$-paths, where the value of an $s$–$t$-path is the product of its edge labels. We say that an abp computes its value. The class $\mathrm{VP}_s$ coincides with the class of families of polynomials that can be computed by abps of polynomially bounded size, see e.g. [Sap16].

For $k \in \mathbb{N}$ we introduce the class $\mathrm{VP}_k$ as the class of families of polynomials computable by width-$k$ abps of polynomially bounded size. It is well-known (see Lemma 7.2) that $\mathrm{VP}_k \subseteq \mathrm{VP}_e$ for all $k \ge 1$. In 1992 Ben-Or and Cleve [BOC92] showed that $\mathrm{VP}_k = \mathrm{VP}_e$ for all $k \ge 3$. In 2011 Allender and Wang [AW16] showed that width-2 abps cannot compute every polynomial, so in particular we have a strict inclusion $\mathrm{VP}_2 \subsetneq \mathrm{VP}_3$.

We prove that the closure of $\mathrm{VP}_2$ and the closure of $\mathrm{VP}_e$ are equal,

\[
\overline{\mathrm{VP}_2} = \overline{\mathrm{VP}_e}, \tag{7.1}
\]

when $\mathrm{char}(\mathbb{F}) \ne 2$. From (7.1) and the result of Allender and Wang it follows directly that the inclusion $\mathrm{VP}_2 \subsetneq \overline{\mathrm{VP}_2}$ is strict. We have thus separated a complexity class from its approximation closure.

VNP via affine linear forms. Every algebraic complexity class has a nondeterministic closure (see Section 7.2.5 for the definition). The nondeterministic closure of VP is called VNP, and the nondeterministic closure of $\mathrm{VP}_e$ is called $\mathrm{VNP}_e$. In 1980 Valiant [Val80] proved $\mathrm{VNP}_e = \mathrm{VNP}$. The nondeterministic closure of $\mathrm{VP}_1$ and $\mathrm{VP}_2$ we call $\mathrm{VNP}_1$ and $\mathrm{VNP}_2$. Using interpolation techniques we can deduce $\mathrm{VNP}_2 = \mathrm{VNP}$ from (7.1), provided the field is infinite. Using more sophisticated techniques we prove

\[
\mathrm{VNP}_1 = \mathrm{VNP}. \tag{7.2}
\]

From (7.2) easily follows $\mathrm{VP}_1 \subsetneq \mathrm{VNP}_1$. Also, from [AW16] we get $\mathrm{VP}_2 \subsetneq \mathrm{VNP}_2$. We have thus separated complexity classes from their nondeterministic closures.

Further related work

An excellent exposition on the history of small-width computation can be found in [AW16], along with an explicit polynomial that cannot be computed by width-2 abps, namely $x_1x_2 + x_3x_4 + \cdots + x_{15}x_{16}$. Saha, Saptharishi and Saxena in [SSS09, Cor. 14] showed that $x_1x_2 + x_3x_4 + x_5x_6$ cannot be computed by width-2 abps that correspond to the iterated matrix multiplication of upper triangular matrices. Bürgisser in [Bur04] studied approximations in the model of general algebraic circuits, finding general upper bounds on the error degree. For most algebraic complexity classes $C$ the relation between $C$ and $\overline{C}$ has not been an active object of study. As pointed out recently by Forbes [For16], Nisan's result [Nis91] implies that $\overline{C} = C$ for $C$ being the class of size-$k$ algebraic branching programs on noncommuting variables. A structured study of $\overline{\mathrm{VP}}$ and $\overline{\mathrm{VP}_s}$ was started in [GMQ16]. Much work in lower bounds for algebraic approximation algorithms has been done in the area of bilinear complexity, dating back to [BCRL79, Str83, Lic84] and more recently e.g. [Lan06, LO15, HIL13, Zui17, LM16a].

This chapter is organised as follows. In Section 7.2 we discuss definitions and basic results. In Section 7.3 we prove that the approximation closure of $\mathrm{VP}_2$ equals the approximation closure of $\mathrm{VP}_e$, i.e. $\overline{\mathrm{VP}_2} = \overline{\mathrm{VP}_e}$. In Section 7.4 we prove that the nondeterminism closure of $\mathrm{VP}_1$ equals VNP.

7.2 Definitions and basic results

We briefly recall the definition of circuits, formulas and branching programs, and we recall the definition of the corresponding complexity classes. Then we discuss some straightforward relationships among these classes and review the proof of a theorem by Ben-Or and Cleve which inspired our work. Finally, we discuss the approximation closure and the nondeterminism closure for algebraic complexity classes.

7.2.1 Computational models

Let $x_1, x_2, \ldots$ be formal variables. By $\mathbb{F}[x]$ we mean the ring of polynomials over $\mathbb{F}$ with variables $x_1, x_2, \ldots, x_k$ with $k$ large enough.

A circuit is a directed acyclic graph $G$ with one or more source vertices and one sink vertex. Each source vertex is labelled by a variable $x_i$ or a constant $c \in \mathbb{F}$. The other vertices are labelled by either $+$ or $\times$ and have in-degree 2 (that is, fan-in 2). Each vertex computes an element in $\mathbb{F}[x]$ by recursion over the graph. The element computed by the sink is the element computed by the circuit. The size of a circuit is the number of vertices.

A formula is a circuit whose graph is a tree.

An algebraic branching program (abp) is a directed acyclic graph with a source vertex $s$ and a sink vertex $t$ that has affine linear forms $\alpha x_i + \beta$, $\alpha, \beta \in \mathbb{F}$, as edge labels. Moreover, we require that each vertex is labeled with an integer (its layer) and that edges in the abp only point from vertices in layer $i$ to vertices in layer $i+1$. The width of an abp is the cardinality of its largest layer. The size of an abp is the number of its vertices. The value of an abp is the sum of the values of all $s$–$t$-paths, where the value of an $s$–$t$-path is the product of its edge labels. We say that an abp computes its value.

For example, the following abp has depth 5, width 3 and computes the polynomial $x_1x_2 + x_2 + 2x_1 - 1$.

[Figure: the example abp, with edge labels $x_1$, $2$, $x_1$, $x_2$ and $-1$.]

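An abp $G$ corresponds naturally to an iterated product of matrices: for any two consecutive layers $L_i$, $L_{i+1}$ in $G$, let $M_i$ be the matrix $(e_{v,w})_{v \in L_i,\, w \in L_{i+1}}$ with $e_{v,w}$ the label of the edge from $v$ to $w$ (or 0 if there is no edge from $v$ to $w$). Then the value of $G$ equals the product $M_k \cdots M_2 M_1$.

For example, the above abp corresponds to the following iterated matrix product:

\[
\begin{pmatrix} 1 & 1 & 1 \end{pmatrix}
\begin{pmatrix} -1 & 0 & 0 \\ 0 & x_2 & 0 \\ 0 & 0 & x_1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ x_1 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix}
\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}.
\]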
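As a sanity check on the correspondence between abps and iterated matrix products, the following sketch evaluates the example product at integer points and compares it with $x_1x_2 + x_2 + 2x_1 - 1$. The helper names `matmul`, `abp_value` and `target` are hypothetical, and the check is a spot check on a grid of points rather than a symbolic proof.

```python
from itertools import product


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def abp_value(x1, x2):
    """Value of the example abp at the point (x1, x2), via the matrix product."""
    factors = [
        [[1, 1, 1]],
        [[-1, 0, 0], [0, x2, 0], [0, 0, x1]],
        [[1, 0, 0], [x1, 1, 0], [0, 0, 2]],
        [[1], [1], [1]],
    ]
    result = factors[0]
    for m in factors[1:]:
        result = matmul(result, m)
    return result[0][0]  # the product is a 1x1 matrix


def target(x1, x2):
    """The polynomial the abp is claimed to compute."""
    return x1 * x2 + x2 + 2 * x1 - 1


agree = all(abp_value(a, b) == target(a, b)
            for a, b in product(range(-3, 4), repeat=2))
print(agree)  # True
```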

7.2.2 Complexity classes VP, $\mathrm{VP}_e$, $\mathrm{VP}_k$

The circuit size of a polynomial $f$ is the size of the smallest circuit computing $f$. The formula size of a polynomial $f$ is the size of the smallest formula computing $f$.

A family is a sequence $(f_n)_{n \in \mathbb{N}}$ of multivariate polynomials over $\mathbb{F}$. A class is a set of families. The class VP consists of all families $(f_n)$ with circuit size, degree and number of variables in $\mathrm{poly}(n)$. The class $\mathrm{VP}_e$ consists of all families $(f_n)$ with formula size in $\mathrm{poly}(n)$. (The origin of the subscript $e$ in $\mathrm{VP}_e$ is the term "arithmetic expression".) Clearly $\mathrm{VP}_e \subseteq \mathrm{VP}$.

We introduce classes defined by abps. Let $k \ge 1$. The class $\mathrm{VP}_k$ consists of all families computed by polynomial-size width-$k$ abps with edges labelled by affine linear forms $\sum_i \alpha_i x_i + \beta$ with coefficients $\alpha_i, \beta \in \mathbb{F}$.

We note that the above classes depend on the choice of the ground field $\mathbb{F}$.

In our paper [BIZ17] we make a distinction between three different types of edge labels for abps. The class $\mathrm{VP}_k$ in this chapter corresponds to the class $\mathrm{VP}_k^g$ in [BIZ17].

7.2.3 The theorem of Ben-Or and Cleve

This subsection is about the relations among $\mathrm{VP}_k$ and $\mathrm{VP}_e$.

Lemma 7.1. $\mathrm{VP}_k \subseteq \mathrm{VP}_\ell$ when $k \le \ell$.

Proof. This is clearly true.

Lemma 7.2. $\mathrm{VP}_k \subseteq \mathrm{VP}_e$ for any $k$.

Proof. For the simple proof we refer to [BIZ17].

Ben-Or and Cleve [BOC92] showed that for $k \ge 3$ the classes $\mathrm{VP}_k$ and $\mathrm{VP}_e$ are in fact equal.

Theorem 7.3 (Ben-Or and Cleve [BOC92]). For $k \ge 3$, $\mathrm{VP}_k = \mathrm{VP}_e$.

We will review the construction of Ben-Or and Cleve here, because we will use it to prove Theorem 7.8 and Theorem 7.15. The following depth-reduction lemma for formulas by Brent is a crucial ingredient.

Lemma 7.4 (Brent [Bre74]). Let $f$ be an $n$-variate degree-$d$ polynomial computed by a formula of size $s$. Then $f$ can also be computed by a formula of size $\mathrm{poly}(s, n, d)$ and depth $O(\log s)$.

Proof. See the survey of Saptharishi [Sap16, Lemma 5.5] for a modern proof.

Proof of Theorem 7.3. Lemma 7.2 says $\mathrm{VP}_k \subseteq \mathrm{VP}_e$. We will prove the inclusion $\mathrm{VP}_e \subseteq \mathrm{VP}_3$, from which follows $\mathrm{VP}_e \subseteq \mathrm{VP}_k$ by Lemma 7.1, and thus $\mathrm{VP}_k = \mathrm{VP}_e$. For a polynomial $h$ define the matrix

\[
M(h) = \begin{pmatrix} 1 & 0 & 0 \\ h & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix},
\]

which as part of an abp looks like

[Figure: the corresponding partial abp, with an edge labeled $h$.]

We call the following matrices primitive:

• $M(h)$ with $h$ any variable or any constant in $\mathbb{F}$,

• the $3 \times 3$ permutation matrices, denoted by $M_\pi$ with $\pi \in S_3$,

• the diagonal matrices $M_{a,b,c} = \mathrm{diag}(a, b, c)$ with $a, b, c \in \mathbb{F}$.

The entries of the primitives are variables or constants in $\mathbb{F}$, making them suitable to use in the construction of a width-3 abp.

Let $(f_n) \in \mathrm{VP}_e$. Then $f_n$ can be computed by a formula of size $s(n) \in \mathrm{poly}(n)$. By Brent's depth-reduction theorem for formulas (Lemma 7.4), $f_n$ can be computed by a formula of size $\mathrm{poly}(n)$ and depth $d(n) \in O(\log s(n))$.

We will construct a sequence of primitives $A_1, \ldots, A_{m(n)}$ such that

\[
A_1 \cdots A_{m(n)} = \begin{pmatrix} 1 & 0 & 0 \\ f_n & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\]

with $m(n) \in O(4^{d(n)}) = \mathrm{poly}(n)$. Then

\[
f_n(x) = \begin{pmatrix} 1 & 1 & 1 \end{pmatrix} M_{-1,1,0}\, A_1 \cdots A_{m(n)} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix},
\]

so $f_n(x)$ can be computed by a width-3 abp of length $\mathrm{poly}(n)$, proving the theorem.

To explain the construction, let $h$ be a polynomial and consider a formula computing $h$ of depth $d$. The goal is to construct (recursively on the formula structure) primitives $A_1, \ldots, A_m$ such that

\[
A_1 \cdots A_m = \begin{pmatrix} 1 & 0 & 0 \\ h & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\]

with $m \in O(4^d)$.

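Suppose $h$ is a variable or a constant. Then $M(h)$ is itself a primitive matrix.

Suppose $h = f + g$ is a sum of two polynomials $f, g$ and suppose $M(f)$ and $M(g)$ can be written as a product of primitives. Then $M(f + g)$ equals a product of primitives, because $M(f + g) = M(f)M(g)$. This can easily be verified directly, or by noting that in the corresponding partial abps the top-bottom paths ($u_i$-$v_j$ paths) have the same value.

[Figure: two partial abps on top vertices $u_1, u_2, u_3$ and bottom vertices $v_1, v_2, v_3$. The first composes an edge labeled $f$ with an edge labeled $g$; it is equivalent ($\sim$) to the second, which has a single edge labeled $f+g$.]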

Suppose $h = fg$ is a product of two polynomials $f, g$ and suppose $M(f)$ and $M(g)$ can be written as a product of primitives. Then $M(fg)$ equals a product of primitives, because

\[
M(f \cdot g) = M_{(23)} \bigl( M_{1,-1,1}\, M_{(123)}\, M(g)\, M_{(132)}\, M(f) \bigr)^2 M_{(23)}
\]

(here $(23) \in S_3$ denotes the transposition $1 \mapsto 1$, $2 \mapsto 3$, $3 \mapsto 2$, and $(123) \in S_3$ denotes the cyclic shift $1 \mapsto 2$, $2 \mapsto 3$, $3 \mapsto 1$), as can be verified either directly or by checking that in the corresponding partial abps the top-bottom paths ($u_i$-$v_j$ paths) have the same value.

[Figure: two partial abps on top vertices $u_1, u_2, u_3$ and bottom vertices $v_1, v_2, v_3$. The first has layers with edge labels $f$, $-1$, $g$, $f$, $g$, $-1$; it is equivalent ($\sim$) to the second, which has a single edge labeled $f \cdot g$.]

This completes the construction.

The length $m$ of the construction is $m(h) = 1$ for $h$ a variable or constant, and recursively $m(f + g) = m(f) + m(g)$, $m(f \cdot g) = 2(m(f) + m(g))$, so $m \in O(4^d)$, where $d$ is the depth of the formula computing $h$.
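The two identities behind the Ben-Or and Cleve construction can be checked numerically. The sketch below substitutes integers for $f$ and $g$ and verifies $M(f+g) = M(f)M(g)$ and the product identity for $M(f \cdot g)$. It assumes that $M_\pi$ is the matrix sending the basis vector $e_i$ to $e_{\pi(i)}$; the text does not spell out this convention, so that choice, like the helper names, is an assumption.

```python
def matmul(a, b):
    """Multiply two 3x3 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def chain(*factors):
    """Left-to-right product of a sequence of matrices."""
    result = factors[0]
    for m in factors[1:]:
        result = matmul(result, m)
    return result


def M(h):
    return [[1, 0, 0], [h, 1, 0], [0, 0, 1]]


def perm(pi):
    """Permutation matrix with e_i -> e_{pi(i)} (assumed convention, 1-based dict)."""
    return [[1 if pi[j + 1] == i + 1 else 0 for j in range(3)] for i in range(3)]


def diag(a, b, c):
    return [[a, 0, 0], [0, b, 0], [0, 0, c]]


T23 = perm({1: 1, 2: 3, 3: 2})   # transposition (23)
C123 = perm({1: 2, 2: 3, 3: 1})  # cyclic shift (123)
C132 = perm({1: 3, 2: 1, 3: 2})  # its inverse (132)


def product_gadget(f, g):
    """M_(23) (M_{1,-1,1} M_(123) M(g) M_(132) M(f))^2 M_(23)."""
    inner = chain(diag(1, -1, 1), C123, M(g), C132, M(f))
    return chain(T23, inner, inner, T23)


for f in range(-3, 4):
    for g in range(-3, 4):
        assert matmul(M(f), M(g)) == M(f + g)   # sum step
        assert product_gadget(f, g) == M(f * g)  # product step
print("identities hold on all sampled points")
```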

The above result of Ben-Or and Cleve (Theorem 7.3) raises the intriguing question whether the inclusion $\mathrm{VP}_2 \subseteq \mathrm{VP}_e$ is strict. Allender and Wang [AW16] show that the inclusion is indeed strict; in fact they show that some polynomials cannot be computed by any width-2 abp.

Theorem 7.5 (Allender and Wang [AW16]). The polynomial

\[
x_1x_2 + x_3x_4 + \cdots + x_{15}x_{16}
\]

cannot be computed by any width-2 abp. Therefore, we have the separation of classes $\mathrm{VP}_2 \subsetneq \mathrm{VP}_3 = \mathrm{VP}_e$.

7.2.4 Approximation closure $\overline{C}$

We define the norm of a complex multivariate polynomial as the sum of the absolute values of its coefficients. This defines a topology on the polynomial ring $\mathbb{C}[x_1, \ldots, x_m]$. Given a complexity measure $L$, say abp size or formula size, there is a natural notion of approximate complexity that is called border complexity. Namely, a polynomial $f \in \mathbb{C}[x]$ has border complexity $L^{\mathrm{top}}$ at most $c$ if there is a sequence of polynomials $g_1, g_2, \ldots$ in $\mathbb{C}[x]$ converging to $f$ such that each $g_i$ satisfies $L(g_i) \le c$. It turns out that for reasonable classes over the field of complex numbers $\mathbb{C}$ this topological notion of approximation is equivalent to what we call algebraic approximation (see e.g. [Bur04]). Namely, a polynomial $f \in \mathbb{C}[x]$ satisfies $L(f)^{\mathrm{alg}} \le c$ iff there are polynomials $f_1, \ldots, f_e \in \mathbb{C}[x]$ such that the polynomial

\[
h = f + \varepsilon f_1 + \varepsilon^2 f_2 + \cdots + \varepsilon^e f_e \in \mathbb{C}[\varepsilon, x]
\]

has complexity $L_{\mathbb{C}(\varepsilon)}(h) \le c$, where $\varepsilon$ is a formal variable and $L_{\mathbb{C}(\varepsilon)}(h)$ denotes the complexity of $h$ over the field extension $\mathbb{C}(\varepsilon)$. This algebraic notion of approximation makes sense over any base field, and we will use it in the statements and proofs of this chapter.

Definition 7.6. Let $C(\mathbb{F})$ be a class over the field $\mathbb{F}$. We define the approximation closure $\overline{C}(\mathbb{F})$ as follows: a family $(f_n)$ over $\mathbb{F}$ is in $\overline{C}(\mathbb{F})$ if there are polynomials $f_{n;i}(x) \in \mathbb{F}[x]$ and a function $e : \mathbb{N} \to \mathbb{N}$ such that the family $(g_n)$ defined by

\[
g_n(x) = f_n(x) + \varepsilon f_{n;1}(x) + \varepsilon^2 f_{n;2}(x) + \cdots + \varepsilon^{e(n)} f_{n;e(n)}(x)
\]

is in $C(\mathbb{F}(\varepsilon))$. We define the poly-approximation closure $\overline{C}^{\mathrm{poly}}(\mathbb{F})$ similarly, but with the additional requirement that $e(n) \in \mathrm{poly}(n)$. We call $e(n)$ the error degree.
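To make the shape of an algebraic approximation concrete, here is a toy sketch that is our own illustration and not an example from this chapter. The expression $h = \varepsilon^{-1}\bigl((x + \varepsilon y)^2 - x^2\bigr)$ over $\mathbb{F}(\varepsilon)$ expands to $2xy + \varepsilon y^2$, so it has the form $f + \varepsilon f_1$ with $f = 2xy$, $f_1 = y^2$ and error degree 1. The code checks this with exact rational arithmetic at sample values; all function names are hypothetical.

```python
from fractions import Fraction


def h(x, y, eps):
    """Evaluate ((x + eps*y)^2 - x^2) / eps; only defined for eps != 0."""
    return ((x + eps * y) ** 2 - x ** 2) / eps


def f(x, y):
    """The approximated polynomial 2xy."""
    return 2 * x * y


def f1(x, y):
    """The coefficient of eps in the expansion of h."""
    return y ** 2


x, y = Fraction(3), Fraction(5)
for eps in (Fraction(1, 10), Fraction(1, 1000), Fraction(-7, 3)):
    # h agrees exactly with f + eps * f1 for every nonzero eps
    assert h(x, y, eps) == f(x, y) + eps * f1(x, y)

print(h(x, y, Fraction(1, 1000)) - f(x, y))  # 1/40
```

Note that $h$ cannot be evaluated at $\varepsilon = 0$ directly, which is why the definition works over the field extension $\mathbb{F}(\varepsilon)$ rather than by substitution.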

7.2.5 Nondeterminism closure $N(C)$

We introduce the nondeterminism closure for algebraic complexity classes.

Definition 7.7. Let $C$ be a class. The class $N(C)$ consists of families $(f_n)$ with the following property: there is a family $(g_n) \in C$ and $p(n), q(n) \in \mathrm{poly}(n)$ such that

\[
f_n(x) = \sum_{b \in \{0,1\}^{p(n)}} g_{q(n)}(b, x),
\]

where $x$ and $b$ denote sequences of variables $x_1, x_2, \ldots$ and $b_1, b_2, \ldots, b_{p(n)}$. We say that $f(x)$ is a hypercube sum over $g$ and that $b_1, b_2, \ldots, b_{p(n)}$ are the hypercube variables. For any subscript $x$, we will use the notation $\mathrm{VNP}_x$ to denote $N(\mathrm{VP}_x)$. We remark that the map $C \mapsto N(C)$ trivially satisfies all properties of being a Kuratowski closure operator, i.e. $N(\emptyset) = \emptyset$, $C \subseteq N(C)$, $N(C \cup D) = N(C) \cup N(D)$ and $N(N(C)) = N(C)$.

7.3 Approximation closure of $\mathrm{VP}_2$

We show that every polynomial can be approximated by a width-2 abp. Even better, we show that every polynomial can be approximated by a width-2 abp of size polynomial in the formula size and with error degree polynomial in the formula size. This is the main result of the current chapter.

Theorem 7.8. $\mathrm{VP}_e \subseteq \overline{\mathrm{VP}_2}^{\mathrm{poly}}$ when $\mathrm{char}(\mathbb{F}) \ne 2$.

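Proof. For a polynomial $h$ define the matrix $M(h) = \begin{pmatrix} 1 & 0 \\ h & 1 \end{pmatrix}$. We call the following matrices primitives:

• $M(h)$ with $h$ any variable or constant in $\mathbb{F}$,

• $\begin{pmatrix} \frac{1}{2\varepsilon} & 0 \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} -2\varepsilon & 0 \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} -1 & \varepsilon \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} -1 & -\varepsilon \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$.

The entries of the primitives are variables or constants in the base field $\mathbb{F}(\varepsilon)$, making them suitable to use in a width-2 abp over the base field $\mathbb{F}(\varepsilon)$.

Let $(f_n) \in \mathrm{VP}_e$, so $f_n(x)$ can be computed by a formula of size $s(n) \in \mathrm{poly}(n)$. By Brent's depth reduction theorem for formulas (Lemma 7.4), $f_n$ can be computed by a formula of size $\mathrm{poly}(n)$ and depth $d(n) \in O(\log s(n))$.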

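We will construct a sequence of primitives $A_1, \ldots, A_{m(n)}$ such that

\[
A_1 \cdots A_{m(n)} =
\begin{pmatrix} 1 & 0 \\ f_n & 1 \end{pmatrix}
+ \varepsilon \begin{pmatrix} f_{n,1,1,1} & f_{n,1,1,2} \\ f_{n,1,2,1} & f_{n,1,2,2} \end{pmatrix}
+ \varepsilon^2 \begin{pmatrix} f_{n,2,1,1} & f_{n,2,1,2} \\ f_{n,2,2,1} & f_{n,2,2,2} \end{pmatrix}
+ \cdots
+ \varepsilon^{e} \begin{pmatrix} f_{n,e,1,1} & f_{n,e,1,2} \\ f_{n,e,2,1} & f_{n,e,2,2} \end{pmatrix}
\]

for some $f_{n,i,j,k} \in \mathbb{F}[x]$, where $e = e(n)$, with $m(n), e(n) \in O(8^{d(n)}) = \mathrm{poly}(n)$. Then

\[
\begin{pmatrix} 1 & 1 \end{pmatrix}
\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}
A_1 \cdots A_{m(n)}
\begin{pmatrix} 1 \\ 1 \end{pmatrix}
= f_n(x) + O(\varepsilon),
\]

so $f_n(x)$ can be approximated by a width-2 abp of length $\mathrm{poly}(n)$ and with error degree $\mathrm{poly}(n)$, proving the theorem.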

We begin with the construction. Let $h$ be a polynomial and consider a formula computing $h$ of depth $d$. The goal is to construct, recursively on the tree structure of the formula, a sequence of primitives $A_1, \ldots, A_m$ such that for some $h_{i,j,k} \in \mathbb{F}[x]$

\[
A_1 \cdots A_m =
\begin{pmatrix} 1 & 0 \\ h & 1 \end{pmatrix}
+ \varepsilon \begin{pmatrix} 0 & 0 \\ h_{1,2,1} & 0 \end{pmatrix}
+ \varepsilon^2 \begin{pmatrix} h_{2,1,1} & h_{2,1,2} \\ h_{2,2,1} & h_{2,2,2} \end{pmatrix}
+ \cdots
+ \varepsilon^e \begin{pmatrix} h_{e,1,1} & h_{e,1,2} \\ h_{e,2,1} & h_{e,2,2} \end{pmatrix}
\tag{7.3}
\]

with $m, e \in O(8^d)$. Notice the particular first-degree error pattern in (7.3), which our recursion will rely on.

Suppose $h$ is a variable or a constant. Then $M(h)$ is itself a primitive satisfying (7.3).

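Suppose $h = f + g$ is a sum of two polynomials $f, g$ and suppose that

\[
F = \begin{pmatrix} 1 & 0 \\ f & 1 \end{pmatrix} + \varepsilon \begin{pmatrix} 0 & 0 \\ f' & 0 \end{pmatrix} + O(\varepsilon^2), \tag{7.4}
\]

\[
G = \begin{pmatrix} 1 & 0 \\ g & 1 \end{pmatrix} + \varepsilon \begin{pmatrix} 0 & 0 \\ g' & 0 \end{pmatrix} + O(\varepsilon^2) \tag{7.5}
\]

are products of primitives for some $f', g' \in \mathbb{F}[x]$. Then

\[
G \cdot F = \begin{pmatrix} 1 & 0 \\ f + g & 1 \end{pmatrix} + \varepsilon \begin{pmatrix} 0 & 0 \\ f' + g' & 0 \end{pmatrix} + O(\varepsilon^2)
\]

is a product of primitives satisfying (7.3).

Suppose $h = fg$ is a product of two polynomials, and suppose that $F$ and $G$ are of the form (7.4) and (7.5) and are products of primitives. We will construct $M((f+g)^2)$, $M(-f^2)$, $M(-g^2)$ approximately, in such a way that, when we use the identity $(f+g)^2 - f^2 - g^2 = 2fg$, the error terms cancel properly. Define the expressions $\mathrm{sq}_+(A)$ and $\mathrm{sq}_-(A)$ by

\[
\mathrm{sq}_\pm(A) =
\begin{pmatrix} -\varepsilon & 0 \\ 0 & 1 \end{pmatrix}
\cdot A \cdot
\begin{pmatrix} -1 & \pm\varepsilon \\ 0 & 1 \end{pmatrix}
\cdot A \cdot
\begin{pmatrix} \frac{1}{\varepsilon} & 0 \\ 0 & 1 \end{pmatrix}.
\]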

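Then

\[
\mathrm{sq}_\pm(F) = \begin{pmatrix} 1 \mp \varepsilon f & 0 \\ \pm f^2 + O(\varepsilon) & 1 \pm \varepsilon f \end{pmatrix} + O(\varepsilon^2).
\]

We have

\[
\mathrm{sq}_-(G) \cdot \mathrm{sq}_-(F) \cdot \mathrm{sq}_+(G \cdot F)
=
\begin{pmatrix} 1 + \varepsilon g & 0 \\ -g^2 + O(\varepsilon) & 1 - \varepsilon g \end{pmatrix}
\cdot
\begin{pmatrix} 1 + \varepsilon f & 0 \\ -f^2 + O(\varepsilon) & 1 - \varepsilon f \end{pmatrix}
\cdot
\begin{pmatrix} 1 - \varepsilon(f+g) & 0 \\ (f+g)^2 + O(\varepsilon) & 1 + \varepsilon(f+g) \end{pmatrix}
+ O(\varepsilon^2),
\]

which simplifies to

\[
\mathrm{sq}_-(G) \cdot \mathrm{sq}_-(F) \cdot \mathrm{sq}_+(G \cdot F) = \begin{pmatrix} 1 & 0 \\ 2fg + O(\varepsilon) & 1 \end{pmatrix} + O(\varepsilon^2).
\]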

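We conclude

\[
\begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}
\cdot \mathrm{sq}_-(G) \cdot \mathrm{sq}_-(F) \cdot \mathrm{sq}_+(G \cdot F) \cdot
\begin{pmatrix} \frac{1}{2} & 0 \\ 0 & 1 \end{pmatrix}
\]
\[
= \begin{pmatrix} -2\varepsilon & 0 \\ 0 & 1 \end{pmatrix}
\cdot G \cdot
\begin{pmatrix} -1 & -\varepsilon \\ 0 & 1 \end{pmatrix}
\cdot G \cdot
\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}
\cdot F \cdot
\begin{pmatrix} -1 & -\varepsilon \\ 0 & 1 \end{pmatrix}
\cdot F \cdot
\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}
\cdot G \cdot F \cdot
\begin{pmatrix} -1 & \varepsilon \\ 0 & 1 \end{pmatrix}
\cdot G \cdot F \cdot
\begin{pmatrix} \frac{1}{2\varepsilon} & 0 \\ 0 & 1 \end{pmatrix}
\]
\[
= \begin{pmatrix} 1 & 0 \\ fg + O(\varepsilon) & 1 \end{pmatrix} + O(\varepsilon^2).
\]

This completes the construction.

The length $m$ of the construction is $m(h) = 1$ for $h$ a variable or constant, and recursively $m(f + g) = m(f) + m(g)$, $m(f \cdot g) = 4(m(f) + m(g)) + 7$. We conclude $m \in O(8^d)$. The error degree $e$ of the construction satisfies the same recursion, so $e \in O(8^d)$.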
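The product step of the width-2 construction can be checked numerically under simplifying assumptions. The sketch below takes $F = M(f)$ and $G = M(g)$ exactly (no higher-order error terms), substitutes integers for $f$ and $g$ and a small rational number for $\varepsilon$, and multiplies out the chain of primitives from the proof. The helper names are hypothetical, and the tolerances in the checks are loose bounds chosen for this one sample, not constants from the text.

```python
from fractions import Fraction


def matmul(a, b):
    """Multiply two 2x2 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def chain(*factors):
    """Left-to-right product of a sequence of matrices."""
    result = factors[0]
    for m in factors[1:]:
        result = matmul(result, m)
    return result


def M(h):
    return [[Fraction(1), Fraction(0)], [Fraction(h), Fraction(1)]]


def product_gadget(F, G, eps):
    """Chain of primitives approximating M(f*g), following the proof."""
    neg = [[-1, 0], [0, 1]]
    shear_minus = [[-1, -eps], [0, 1]]
    shear_plus = [[-1, eps], [0, 1]]
    GF = matmul(G, F)  # stands for the two consecutive factors G, F
    return chain(
        [[-2 * eps, 0], [0, 1]],
        G, shear_minus, G, neg,
        F, shear_minus, F, neg,
        GF, shear_plus, GF,
        [[1 / (2 * eps), 0], [0, 1]],
    )


f, g = 3, 5
eps = Fraction(1, 10**6)
P = product_gadget(M(f), M(g), eps)

# The lower-left entry is f*g up to an O(eps) error; the diagonal is 1 up to O(eps^2).
print(float(P[1][0]))  # approximately 15
```

Shrinking `eps` further brings the lower-left entry closer to $fg$, while no entry can be evaluated at $\varepsilon = 0$ because of the factor $\frac{1}{2\varepsilon}$.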

Remark 7.9. The construction in the above proof of Theorem 7.8 is different from the construction in our paper [BIZ17]. The recursion in the above proof is simpler, while the construction in [BIZ17] has a better error degree and has a special form which relates it to a family of polynomials called continuants.

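Corollary 7.10. $\overline{\mathrm{VP}_2} = \overline{\mathrm{VP}_e}$ and $\overline{\mathrm{VP}_2}^{\mathrm{poly}} = \overline{\mathrm{VP}_e}^{\mathrm{poly}}$ when $\mathrm{char}(\mathbb{F}) \ne 2$.

Proof. We have $\mathrm{VP}_2 \subseteq \mathrm{VP}_e$ by Lemma 7.2. Taking closures on both sides, we obtain $\overline{\mathrm{VP}_2} \subseteq \overline{\mathrm{VP}_e}$ and $\overline{\mathrm{VP}_2}^{\mathrm{poly}} \subseteq \overline{\mathrm{VP}_e}^{\mathrm{poly}}$. When $\mathrm{char}(\mathbb{F}) \ne 2$, $\mathrm{VP}_e \subseteq \overline{\mathrm{VP}_2}^{\mathrm{poly}}$ (Theorem 7.8). By taking closures follows $\overline{\mathrm{VP}_e} \subseteq \overline{\mathrm{VP}_2}$ and $\overline{\mathrm{VP}_e}^{\mathrm{poly}} \subseteq \overline{\mathrm{VP}_2}^{\mathrm{poly}}$.

Corollary 7.11. $\overline{\mathrm{VP}_2}^{\mathrm{poly}} = \mathrm{VP}_e$ when $\mathrm{char}(\mathbb{F}) \ne 2$ and $\mathbb{F}$ is infinite.

Proof. By Corollary 7.10, $\overline{\mathrm{VP}_2}^{\mathrm{poly}} = \overline{\mathrm{VP}_e}^{\mathrm{poly}}$. We prove $\overline{\mathrm{VP}_e}^{\mathrm{poly}} = \mathrm{VP}_e$ in Lemma 7.12 below.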

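Lemma 7.12. $\overline{\mathrm{VP}_e}^{\mathrm{poly}} = \mathrm{VP}_e$ when $\mathrm{char}(\mathbb{F}) \ne 2$ and $\mathbb{F}$ is infinite.

Proof. The inclusion $\mathrm{VP}_e \subseteq \overline{\mathrm{VP}_e}^{\mathrm{poly}}$ is trivially true. We prove the other direction. Let $(f_n) \in \overline{\mathrm{VP}_e}^{\mathrm{poly}}$. Then there are polynomials $f_{n;i}(x) \in \mathbb{F}[x]$ and $e(n) \in \mathrm{poly}(n)$ such that

\[
f_n(x) + \varepsilon f_{n;1}(x) + \varepsilon^2 f_{n;2}(x) + \cdots + \varepsilon^{e(n)} f_{n;e(n)}(x)
\]

is computed by a poly-size formula $\Gamma$ over $\mathbb{F}(\varepsilon)$. Let $\alpha_0, \alpha_1, \ldots, \alpha_{e(n)}$ be distinct elements in $\mathbb{F}$ such that replacing $\varepsilon$ by $\alpha_j$ in $\Gamma$ is a valid substitution, i.e. not causing division by zero. These $\alpha_j$ exist since our field is infinite by assumption. View

\[
g_n(\varepsilon) = f_n(x) + \varepsilon f_{n;1}(x) + \varepsilon^2 f_{n;2}(x) + \cdots + \varepsilon^{e(n)} f_{n;e(n)}(x)
\]

as a polynomial in $\varepsilon$. The polynomial $g_n(\varepsilon)$ has degree at most $e(n)$, so we can write $g_n(\varepsilon)$ as follows (Lagrange interpolation on $e(n) + 1$ points):

\[
g_n(\varepsilon) = \sum_{j=0}^{e(n)} g_n(\alpha_j) \prod_{\substack{0 \le m \le e(n) \\ m \ne j}} \frac{\varepsilon - \alpha_m}{\alpha_j - \alpha_m}. \tag{7.6}
\]

Clearly $f_n(x) = g_n(0)$. However, replacing $\varepsilon$ by 0 in $\Gamma$ is not a valid substitution in general. From (7.6) we see directly how to write $g_n(0)$ as a linear combination of the values $g_n(\alpha_j)$, namely

\[
g_n(0) = \sum_{j=0}^{e(n)} g_n(\alpha_j) \prod_{\substack{0 \le m \le e(n) \\ m \ne j}} \frac{-\alpha_m}{\alpha_j - \alpha_m},
\]

that is,

\[
g_n(0) = \sum_{j=0}^{e(n)} \beta_j\, g_n(\alpha_j) \quad \text{with} \quad \beta_j = \prod_{\substack{0 \le m \le e(n) \\ m \ne j}} \frac{\alpha_m}{\alpha_m - \alpha_j}.
\]

The value $g_n(\alpha_j)$ is computed by the formula $\Gamma$ with $\varepsilon$ replaced by $\alpha_j$, which we denote by $\Gamma|_{\varepsilon = \alpha_j}$. Thus $f_n(x)$ is computed by the poly-size formula $\sum_{j=0}^{e(n)} \beta_j\, \Gamma|_{\varepsilon = \alpha_j}$. We conclude $(f_n) \in \mathrm{VP}_e$.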
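The interpolation step in the proof can be illustrated with a small sketch over the rationals. A polynomial $g$ of degree at most $e$ is treated as a black box that may only be evaluated at nonzero points, and $g(0)$ is recovered as $\sum_j \beta_j\, g(\alpha_j)$ with the coefficients $\beta_j$ from the proof. The sample polynomial and the function names are illustrative assumptions.

```python
from fractions import Fraction


def beta(alphas, j):
    """Coefficient beta_j = prod_{m != j} alpha_m / (alpha_m - alpha_j)."""
    result = Fraction(1)
    for m, a in enumerate(alphas):
        if m != j:
            result *= Fraction(a) / (Fraction(a) - Fraction(alphas[j]))
    return result


def value_at_zero(g, alphas):
    """Recover g(0) from the values of g at the distinct points `alphas`."""
    return sum(beta(alphas, j) * g(Fraction(a)) for j, a in enumerate(alphas))


# Hypothetical example: g has degree e = 3 in eps, so e + 1 = 4 distinct
# points suffice. The constant term 7 plays the role of f_n(x).
def g(eps):
    return 7 + 2 * eps - 5 * eps**2 + eps**3


print(value_at_zero(g, [1, 2, 3, 4]))  # 7
```

The same coefficients $\beta_j$ work for every polynomial of degree at most $e$, which is why the resulting formula is a fixed linear combination of $e(n) + 1$ copies of $\Gamma$.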

Remark 7.13. The statement of Lemma 7.12 also holds with $\mathrm{VP}_e$ replaced with $\mathrm{VP}_s$ or with VP, by a similar proof.

7.4 Nondeterminism closure of $\mathrm{VP}_1$

Recall the definition of $\mathrm{VNP}_x = N(\mathrm{VP}_x)$ from Definition 7.7. Valiant proved the following characterisation of VNP in his seminal work [Val80]. See also [BCS97, Thm. 21.26], [Bur00, Thm. 2.13] and [MP08, Thm. 2].

Theorem 7.14 (Valiant [Val80]). $\mathrm{VNP}_e = \mathrm{VNP}$.

We strengthen Valiant's characterisation of VNP from $\mathrm{VNP}_e$ to $\mathrm{VNP}_1$.

Theorem 7.15. $\mathrm{VNP}_1 = \mathrm{VNP}$ when $\mathrm{char}(\mathbb{F}) \ne 2$.

The idea of the proof is "to simulate in $\mathrm{VNP}_1$" the primitives that we used in the proof of $\mathrm{VP}_e \subseteq \mathrm{VP}_3$ (Theorem 7.3).

Proof of Theorem 7.15. Clearly $\mathrm{VNP}_1 \subseteq \mathrm{VNP}$ by Lemma 7.2 and taking the nondeterminism closure $N$. We will prove that $\mathrm{VNP} \subseteq \mathrm{VNP}_1$. Recall that in the proof of $\mathrm{VP}_e \subseteq \mathrm{VP}_3$ (Theorem 7.3) we defined for any polynomial $h$ the matrix

\[
M(h) = \begin{pmatrix} 1 & 0 & 0 \\ h & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\]

and we called the following matrices primitives:

• $M(h)$ with $h$ any variable or any constant in $\mathbb{F}$,

• the $3 \times 3$ permutation matrices, denoted by $M_\pi$ for $\pi \in S_3$,

• the diagonal matrices $M_{a,b,c} = \mathrm{diag}(a, b, c)$ with $a, b, c \in \mathbb{F}$.

In the proof of $\mathrm{VP}_e \subseteq \mathrm{VP}_3$ we constructed, for any family $(f_n) \in \mathrm{VP}_e$, a sequence of primitive matrices $A_{n,1}, \ldots, A_{n,t(n)}$ with $t(n) \in \mathrm{poly}(n)$ such that

\[
f_n(x) = \begin{pmatrix} 1 & 1 & 1 \end{pmatrix} M_{-1,1,0}\, A_{n,1} \cdots A_{n,t(n)} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}. \tag{7.7}
\]

We will show $\mathrm{VP}_e \subseteq \mathrm{VNP}_1$ by constructing a hypercube sum over a width-1 abp that evaluates the right-hand side of (7.7). This implies $\mathrm{VNP}_e \subseteq \mathrm{VNP}_1$ by taking the $N$-closure. Then, by Valiant's Theorem 7.14, $\mathrm{VNP} \subseteq \mathrm{VNP}_1$.

Let $f(x)$ be a polynomial and let $A_1, \ldots, A_k$ be primitive matrices such that $f(x)$ is computed as

\[
f(x) = \begin{pmatrix} 1 & 1 & 1 \end{pmatrix} A_k \cdots A_1 \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}.
\]

View this expression as a width-3 abp $G$, with vertex layers labeled as shown in the left-hand diagram in Fig. 7.1. Assume for simplicity that all edges between layers are present, possibly with label 0. The sum of the values of all $s$–$t$ paths in $G$ equals $f(x)$:

\[
f(x) = \sum_{j \in [3]^k} A_k[j_k, j_{k-1}] \cdots A_1[j_2, j_1]. \tag{7.8}
\]

We introduce some hypercube variables. To every vertex of $G$, except $s$ and $t$, we associate a bit; the bits in the $i$th layer we call $b_1[i], b_2[i], b_3[i]$. To an $s$–$t$ path in $G$ we associate an assignment of the $b_j[i]$ by setting the bits of vertices visited by the path to 1 and the others to 0. For example, in the right-hand diagram in Fig. 7.1 we show an $s$–$t$ path with the corresponding assignment of the bits $b_j[i]$. The assignments of the $b_j[i]$ corresponding to $s$–$t$ paths are precisely the assignments such that for every $i \in [k]$ exactly one of $b_1[i], b_2[i], b_3[i]$ equals 1. Let

\[
V(b_1, b_2, b_3) = \prod_{i \in [k]} \bigl( b_1[i] + b_2[i] + b_3[i] \bigr) \prod_{\substack{s, t \in [3] \\ s \ne t}} \bigl( 1 - b_s[i]\, b_t[i] \bigr). \tag{7.9}
\]

Then the assignments of the $b_j[i]$ corresponding to $s$–$t$ paths are precisely the assignments such that $V(b_1, b_2, b_3) = 1$. Otherwise, $V(b_1, b_2, b_3) = 0$.

[Figure 7.1: Illustration of the layer labelling and the path labelling used in the proof of Theorem 7.15. Left: the width-3 abp with layers $s, 0, 1, 2, \ldots, k-1, k, t$ and the matrices $A_1, A_2, \ldots, A_k$ between consecutive layers. Right: an $s$–$t$ path whose layers carry the bit assignments $(1,0,0)$, $(0,1,0)$, $(0,1,0)$, $(0,0,1)$, $(0,1,0)$.]

We will write $f(x)$ as a hypercube sum by replacing each $A_i[j_i, j_{i-1}]$ in (7.8) by a product of affine linear forms $S_i(A_i)$ with variables $b$ and $x$:

\[
\sum_{b} V(b_1, b_2, b_3)\, S_k(A_k) \cdots S_1(A_1).
\]

Define the expression $\mathrm{Eq}(\alpha, \beta) = (1 - \alpha - \beta)(1 - \alpha - \beta)$ for $\alpha, \beta \in \{0, 1\}$. The expression $\mathrm{Eq}(\alpha, \beta)$ evaluates to 1 if $\alpha$ equals $\beta$ and evaluates to 0 otherwise.

• For any variable or constant $x$ define

\[
S_i(M(x)) = \bigl( 1 + (x - 1)(b_1[i] - b_1[i-1]) \bigr) \cdot \bigl( 1 - (1 - b_2[i])\, b_2[i-1] \bigr) \cdot \mathrm{Eq}\bigl( b_3[i-1], b_3[i] \bigr).
\]

• For any permutation $\pi \in S_3$ define

\[
S_i(M_\pi) = \mathrm{Eq}\bigl( b_1[i-1], b_{\pi(1)}[i] \bigr) \cdot \mathrm{Eq}\bigl( b_2[i-1], b_{\pi(2)}[i] \bigr) \cdot \mathrm{Eq}\bigl( b_3[i-1], b_{\pi(3)}[i] \bigr).
\]

• For any constants $a, b, c \in \mathbb{F}$ define

\[
S_i(M_{a,b,c}) = \bigl( a \cdot b_1[i-1] + b \cdot b_2[i-1] + c \cdot b_3[i-1] \bigr) \cdot \mathrm{Eq}\bigl( b_1[i-1], b_1[i] \bigr) \cdot \mathrm{Eq}\bigl( b_2[i-1], b_2[i] \bigr) \cdot \mathrm{Eq}\bigl( b_3[i-1], b_3[i] \bigr).
\]

One verifies that

\[
f(x) = \sum_{b} V(b_1, b_2, b_3)\, S_k(A_k) \cdots S_1(A_1).
\]

Some of the factors in the expressions for the $S_i(A_i)$ are not affine linear. As a final step, we apply the equality $1 + xy = \frac{1}{2} \sum_{c \in \{0,1\}} (x + 1 - 2c)(y + 1 - 2c)$ to write these factors as products of affine linear forms, introducing new hypercube variables.
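Three small ingredients of this proof can be checked by brute force: the behaviour of $\mathrm{Eq}$ on bits, the fact that a single layer factor of $V$ in (7.9) is 1 exactly when one of the three bits is set, and the identity for $1 + xy$. The sketch below does this; the function names are hypothetical, and the last identity is only checked on a range of integers.

```python
from itertools import product


def eq(a, b):
    """Eq(a, b) = (1 - a - b)^2; on bits this is 1 if a == b and 0 otherwise."""
    return (1 - a - b) * (1 - a - b)


def layer_indicator(b1, b2, b3):
    """One layer's factor of V in (7.9): 1 iff exactly one of the bits is 1."""
    bits = (b1, b2, b3)
    value = b1 + b2 + b3
    for s in range(3):
        for t in range(3):
            if s != t:
                value *= 1 - bits[s] * bits[t]
    return value


def one_plus_xy(x, y):
    """Right-hand side of 1 + xy = (1/2) * sum_{c in {0,1}} (x+1-2c)(y+1-2c)."""
    total = sum((x + 1 - 2 * c) * (y + 1 - 2 * c) for c in (0, 1))
    return total // 2  # total equals 2xy + 2, so it is even for integer inputs


for a, b in product((0, 1), repeat=2):
    assert eq(a, b) == (1 if a == b else 0)

for bits in product((0, 1), repeat=3):
    assert layer_indicator(*bits) == (1 if sum(bits) == 1 else 0)

for x, y in product(range(-4, 5), repeat=2):
    assert one_plus_xy(x, y) == 1 + x * y

print("all checks passed")
```

The last identity is what turns a non-affine factor of the form $1 + xy$ into a hypercube sum over a product of two affine linear forms, at the cost of one new hypercube variable $c$.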

7.5 Conclusion

We finish with an overview of inclusions, equalities and separations among the classes $\mathrm{VP}_k$, $\mathrm{VP}_e$, VP and their approximation and nondeterminism closures (when $\mathrm{char}(\mathbb{F}) \ne 2$), see Fig. 7.2. The figure relies on the following two simple lemmas, of which proofs can be found in our paper [BIZ17].

Lemma 7.16 ([BIZ17, Prop. 5.10]). $\mathrm{VP}_1 = \overline{\mathrm{VP}_1}$.

Lemma 7.17 ([BIZ17, Prop. 5.11]). $\mathrm{VP}_1 \subsetneq \mathrm{VNP}_1$ when $\mathrm{char}(\mathbb{F}) \ne 2$.

[Figure 7.2: Overview of relations among the algebraic complexity classes $\mathrm{VP}_k$, $\mathrm{VP}_e$, VP and their approximation and nondeterminism closures (when $\mathrm{char}(\mathbb{F})$ is not 2). The relations without reference are either by definition or follow logically from the other relations. The diagram arranges the classes in three rows of four and labels the relations $=$, $\subseteq$, $\subsetneq$ between them with the references [AW16], 7.17, 7.16, 7.10, 7.15, [Val80] and [Val79].]

Bibliography

[AJRS13] Elizabeth S. Allman, Peter D. Jarvis, John A. Rhodes, and Jeremy G. Sumner. Tensor rank, invariants, inequalities, and applications. SIAM J. Matrix Anal. Appl., 34(3):1014–1045, 2013. doi:10.1137/120899066. p. 14

[Alo98] Noga Alon. The Shannon capacity of a union. Combinatorica, 18(3):301–310, 1998. doi:10.1007/PL00009824. p. 37

[ASU13] Noga Alon, Amir Shpilka, and Christopher Umans. On sunflowers and matrix multiplication. Comput. Complexity, 22(2):219–243, Jun 2013. doi:10.1007/s00037-013-0060-1. p. 48

[AW16] Eric Allender and Fengming Wang. On the power of algebraic branching programs of width two. Comput. Complexity, 25(1):217–253, 2016. doi:10.1007/s00037-015-0114-7. p. 17, 109, 114, 123

[AZ14] Martin Aigner and Günter M. Ziegler. Proofs from The Book. Springer-Verlag, Berlin, fifth edition, 2014. doi:10.1007/978-3-662-44205-0. p. 71

[BC18] Boris Bukh and Christopher Cox. On a fractional version of Haemers' bound. arXiv, 2018. arXiv:1802.00476. p. 41, 42

[BCC+17] Jonah Blasiak, Thomas Church, Henry Cohn, Joshua A. Grochow, Eric Naslund, William F. Sawin, and Chris Umans. On cap sets and the group-theoretic approach to matrix multiplication. Discrete Anal., 2017. arXiv:1605.06702, doi:10.19086/da.1245. p. 48, 83, 84, 104


[BCPZ16] Harry Buhrman, Matthias Christandl, Christopher Perry, and Jeroen Zuiddam. Clean quantum and classical communication protocols. Phys. Rev. Lett., 117:230503, Dec 2016. doi:10.1103/PhysRevLett.117.230503. p. 1

[BCRL79] Dario Bini, Milvio Capovani, Francesco Romani, and Grazia Lotti. $O(n^{2.7799})$ complexity for $n \times n$ approximate matrix multiplication. Inf. Process. Lett., 8(5):234–235, 1979. doi:10.1016/0020-0190(79)90113-3. p. 3, 110

[BCS97] Peter Bürgisser, Michael Clausen, and M. Amin Shokrollahi. Algebraic complexity theory, volume 315 of Grundlehren Math. Wiss. Springer-Verlag, Berlin, 1997. doi:10.1007/978-3-662-03338-8. p. 4, 6, 48, 50, 66, 79, 119

[BCSX10] Arnab Bhattacharyya, Victor Chen, Madhu Sudan, and Ning Xie. Testing Linear-Invariant Non-linear Properties: A Short Report, pages 260–268. Springer Berlin Heidelberg, Berlin, Heidelberg, 2010. doi:10.1007/978-3-642-16367-8_18. p. 48

[BCZ17a] Markus Bläser, Matthias Christandl, and Jeroen Zuiddam. The border support rank of two-by-two matrix multiplication is seven. arXiv, 2017. arXiv:1705.09652. p. 1, 15

[BCZ17b] Harry Buhrman, Matthias Christandl, and Jeroen Zuiddam. Nondeterministic Quantum Communication Complexity: the Cyclic Equality Game and Iterated Matrix Multiplication. In Christos H. Papadimitriou, editor, 8th Innovations in Theoretical Computer Science Conference (ITCS 2017), pages 24:1–24:18, 2017. arXiv:1603.03757, doi:10.4230/LIPIcs.ITCS.2017.24. p. 1, 15

[Ber84] Stuart J. Berkowitz. On computing the determinant in small parallel time using a small number of processors. Inform. Process. Lett., 18(3):147–150, 1984. doi:10.1016/0020-0190(84)90018-8. p. 108

[BI13] Peter Bürgisser and Christian Ikenmeyer. Explicit lower bounds via geometric complexity theory. Proceedings 45th Annual ACM Symposium on Theory of Computing 2013, pages 141–150, 2013. doi:10.1145/2488608.2488627. p. 108

[Bin80] Dario Bini. Relations between exact and approximate bilinear algorithms. Applications. Calcolo, 17(1):87–97, 1980. doi:10.1007/BF02575865. p. 3


[BIZ17] Karl Bringmann Christian Ikenmeyer and Jeroen Zuiddam OnAlgebraic Branching Programs of Small Width In Ryan OrsquoDonnelleditor 32nd Computational Complexity Conference (CCC 2017)pages 201ndash2031 2017 doi104230LIPIcsCCC201720 p 1107 111 112 118 122

[Bla13] Anna Blasiak A graph-theoretic approach to network coding PhDthesis Cornell University 2013 URL httpsecommonscornelledubitstreamhandle181334147ab675pdf p 42

[BLMW11] Peter Bürgisser, Joseph M. Landsberg, Laurent Manivel, and Jerzy Weyman. An overview of mathematical issues arising in the geometric complexity theory approach to VP ≠ VNP. SIAM J. Comput., 40(4):1179–1209, 2011. doi:10.1137/090765328. p. 108.

[BOC92] Michael Ben-Or and Richard Cleve. Computing algebraic formulas using a constant number of registers. SIAM J. Comput., 21(1):54–58, 1992. doi:10.1137/0221006. p. 17, 109, 112.

[BPR+00] Charles H. Bennett, Sandu Popescu, Daniel Rohrlich, John A. Smolin, and Ashish V. Thapliyal. Exact and asymptotic measures of multipartite pure-state entanglement. Phys. Rev. A, 63(1):012307, 2000. doi:10.1103/PhysRevA.63.012307. p. 48.

[Bre74] Richard P. Brent. The parallel evaluation of general arithmetic expressions. J. ACM, 21(2):201–206, April 1974. doi:10.1145/321812.321815. p. 112.

[Bri87] Michel Brion. Sur l'image de l'application moment. In Séminaire d'algèbre Paul Dubreil et Marie-Paule Malliavin (Paris, 1986), volume 1296 of Lecture Notes in Math., pages 177–192. Springer, Berlin, 1987. doi:10.1007/BFb0078526. p. 9, 93, 94.

[BS83] Eberhard Becker and Niels Schwartz. Zum Darstellungssatz von Kadison-Dubois. Arch. Math. (Basel), 40(5):421–428, 1983. doi:10.1007/BF01192806. p. 7, 12, 33.

[Bur90] Peter Bürgisser. Degenerationsordnung und Trägerfunktional bilinearer Abbildungen. PhD thesis, Universität Konstanz, 1990. http://nbn-resolving.de/urn:nbn:de:bsz:352-opus-20311. p. 57, 101.

[Bur00] Peter Bürgisser. Completeness and reduction in algebraic complexity theory, volume 7 of Algorithms and Computation in Mathematics. Springer-Verlag, Berlin, 2000. doi:10.1007/978-3-662-04179-6. p. 119.

[Bur04] Peter Bürgisser. The complexity of factors of multivariate polynomials. Found. Comput. Math., 4(4):369–396, 2004. doi:10.1007/s10208-002-0059-5. p. 110, 115.

[BX15] Arnab Bhattacharyya and Ning Xie. Lower bounds for testing triangle-freeness in boolean functions. Comput. Complexity, 24(1):65–101, 2015. doi:10.1007/s00037-014-0092-1. p. 48.

[BZ17] Jop Briët and Jeroen Zuiddam. On the orthogonal rank of Cayley graphs and impossibility of quantum round elimination. Quantum Inf. Comput., 17(1&2), 2017. URL: http://www.rintonpress.com/xxqic17/qic-17-12/0106-0116.pdf, arXiv:1608.06113. p. 2.

[CHM07] Matthias Christandl, Aram W. Harrow, and Graeme Mitchison. Nonzero Kronecker coefficients and what they tell us about spectra. Comm. Math. Phys., 270(3):575–585, 2007. doi:10.1007/s00220-006-0157-3. p. 90.

[CJZ18] Matthias Christandl, Asger Kjærulff Jensen, and Jeroen Zuiddam. Tensor rank is not multiplicative under the tensor product. Linear Algebra Appl., 543:125–139, 2018. doi:10.1016/j.laa.2017.12.020. p. 2, 15.

[CKSV16] Suryajith Chillara, Mrinal Kumar, Ramprasad Saptharishi, and V. Vinay. The chasm at depth four, and tensor rank: Old results, new insights. arXiv, 2016. arXiv:1606.04200. p. 15.

[CLP17] Ernie Croot, Vsevolod F. Lev, and Péter Pál Pach. Progression-free sets in $\mathbb{Z}_4^n$ are exponentially small. Ann. of Math. (2), 185(1):331–337, 2017. doi:10.4007/annals.2017.185.1.7. p. 48, 81.

[CM06] Matthias Christandl and Graeme Mitchison. The spectra of quantum states and the Kronecker coefficients of the symmetric group. Comm. Math. Phys., 261(3):789–797, 2006. doi:10.1007/s00220-005-1435-1. p. 91.

[CMR+14] Toby Cubitt, Laura Mančinska, David E. Roberson, Simone Severini, Dan Stahlke, and Andreas Winter. Bounds on entanglement-assisted source-channel coding via the Lovász theta number and its variants. IEEE Trans. Inform. Theory, 60(11):7330–7344, 2014. arXiv:1310.7120, doi:10.1109/TIT.2014.2349502. p. 42.

[CT12] Thomas M. Cover and Joy A. Thomas. Elements of information theory. John Wiley & Sons, 2012. p. 60.

[CU13] Henry Cohn and Christopher Umans. Fast matrix multiplication using coherent configurations. In Proceedings of the Twenty-Fourth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1074–1086. SIAM, 2013. p. 15.

[CVZ16] Matthias Christandl, Péter Vrana, and Jeroen Zuiddam. Asymptotic tensor rank of graph tensors: beyond matrix multiplication. arXiv, 2016. arXiv:1609.07476. p. 2, 65, 67, 79, 85.

[CVZ18] Matthias Christandl, Péter Vrana, and Jeroen Zuiddam. Universal points in the asymptotic spectrum of tensors. In Proceedings of 50th Annual ACM SIGACT Symposium on the Theory of Computing (STOC'18). ACM, New York, 2018. arXiv:1709.07851, doi:10.1145/3188745.3188766. p. 2, 47, 65, 87, 88, 96, 103, 105.

[CW82] Don Coppersmith and Shmuel Winograd. On the asymptotic complexity of matrix multiplication. SIAM J. Comput., 11(3):472–492, 1982. doi:10.1137/0211038. p. 3.

[CW87] Don Coppersmith and Shmuel Winograd. Matrix multiplication via arithmetic progressions. In Proceedings of the nineteenth annual ACM symposium on Theory of computing, pages 1–6. ACM, 1987. p. 3.

[CW90] Don Coppersmith and Shmuel Winograd. Matrix multiplication via arithmetic progressions. J. Symbolic Comput., 9(3):251–280, 1990. doi:10.1016/S0747-7171(08)80013-2. p. 4, 6, 8, 10, 48, 67.

[CZ18] Matthias Christandl and Jeroen Zuiddam. Tensor surgery and tensor rank. Comput. Complexity, Mar 2018. doi:10.1007/s00037-018-0164-8. p. 2, 86.

[Dra15] Jan Draisma. Multilinear Algebra and Applications (lecture notes), 2015. URL: https://mathsites.unibe.ch/jdraisma/publications/mlappl.pdf. p. 15.

[DVC00] Wolfgang Dür, Guifré Vidal, and Juan Ignacio Cirac. Three qubits can be entangled in two inequivalent ways. Phys. Rev. A (3), 62(6):062314, 12, 2000. doi:10.1103/PhysRevA.62.062314. p. 48.

[Ede04] Yves Edel. Extensions of generalized product caps. Des. Codes Cryptogr., 31(1):5–14, 2004. doi:10.1023/A:1027365901231. p. 48, 83.

[EG17] Jordan S. Ellenberg and Dion Gijswijt. On large subsets of $\mathbb{F}_q^n$ with no three-term arithmetic progression. Ann. of Math. (2), 185(1):339–343, 2017. doi:10.4007/annals.2017.185.1.8. p. 10, 48, 81, 83, 84.

[FK14] Hu Fu and Robert Kleinberg. Improved lower bounds for testing triangle-freeness in boolean functions via fast matrix multiplication. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2014), pages 669–676, 2014. doi:10.4230/LIPIcs.APPROX-RANDOM.2014.669. p. 48.

[For16] Michael Forbes. Some concrete questions on the border complexity of polynomials. Presentation given at the Workshop on Algebraic Complexity Theory WACT 2016 in Tel Aviv. https://www.youtube.com/watch?v=1HMogQIHT6Q, 2016. p. 110.

[Fra02] Matthias Franz. Moment polytopes of projective G-varieties and tensor products of symmetric group representations. J. Lie Theory, 12(2):539–549, 2002. URL: http://emis.ams.org/journals/JLT/vol.12_no.2/16.html. p. 93, 94.

[Fri17] Tobias Fritz. Resource convertibility and ordered commutative monoids. Math. Structures Comput. Sci., 27(6):850–938, 2017. doi:10.1017/S0960129515000444. p. 37.

[Ful97] William Fulton. Young tableaux, volume 35 of London Mathematical Society Student Texts. Cambridge University Press, Cambridge, 1997. With applications to representation theory and geometry. p. 88.

[GKKS13] Ankit Gupta, Pritish Kamath, Neeraj Kayal, and Ramprasad Saptharishi. Approaching the chasm at depth four. In 2013 IEEE Conference on Computational Complexity (CCC 2013), pages 65–73. IEEE Computer Soc., Los Alamitos, CA, 2013. doi:10.1109/CCC.2013.16. p. 108.

[GMQ16] Joshua A. Grochow, Ketan D. Mulmuley, and Youming Qiao. Boundaries of VP and VNP. In Ioannis Chatzigiannakis, Michael Mitzenmacher, Yuval Rabani, and Davide Sangiorgi, editors, 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016), volume 55, pages 34:1–34:14, 2016. arXiv:1605.02815, doi:10.4230/LIPIcs.ICALP.2016.34. p. 110.

[Gro13] Joshua A. Grochow. Unifying and generalizing known lower bounds via geometric complexity theory. arXiv, 2013. arXiv:1304.6333. p. 108.

[GW09] Roe Goodman and Nolan R. Wallach. Symmetry, representations, and invariants, volume 255 of Graduate Texts in Mathematics. Springer, Dordrecht, 2009. doi:10.1007/978-0-387-79852-3. p. 88.

[Hae79] Willem Haemers. On some problems of Lovász concerning the Shannon capacity of a graph. IEEE Trans. Inform. Theory, 25(2):231–232, 1979. doi:10.1109/TIT.1979.1056027. p. 37, 40, 42.

[Has90] Johan Håstad. Tensor rank is NP-complete. J. Algorithms, 11(4):644–654, 1990. doi:10.1016/0196-6774(90)90014-6. p. 47.

[HHHH09] Ryszard Horodecki, Paweł Horodecki, Michał Horodecki, and Karol Horodecki. Quantum entanglement. Rev. Modern Phys., 81(2):865–942, 2009. doi:10.1103/RevModPhys.81.865. p. 48.

[HIL13] Jonathan D. Hauenstein, Christian Ikenmeyer, and Joseph M. Landsberg. Equations for lower bounds on border rank. Exp. Math., 22(4):372–383, 2013. doi:10.1080/10586458.2013.825892. p. 15, 110.

[Hum75] James E. Humphreys. Linear algebraic groups. Springer-Verlag, New York-Heidelberg, 1975. Graduate Texts in Mathematics, No. 21. p. 93.

[HX17] Ishay Haviv and Ning Xie. Sunflowers and testing triangle-freeness of functions. Comput. Complexity, 26(2):497–530, Jun 2017. doi:10.1007/s00037-016-0138-7. p. 48.

[Ike13] Christian Ikenmeyer. Geometric complexity theory, tensor rank, and Littlewood–Richardson coefficients. PhD thesis, Universität Paderborn, 2013. p. 14.

[Kar72] Richard M. Karp. Reducibility among combinatorial problems. In Complexity of computer computations (Proc. Sympos., IBM Thomas J. Watson Res. Center, Yorktown Heights, N.Y., 1972), pages 85–103. Plenum, New York, 1972. p. 36.

[Knu94] Donald E. Knuth. The sandwich theorem. Electron. J. Combin., 1, 1994. URL: http://www.combinatorics.org/Volume_1/Abstracts/v1i1a1.html. p. 41.

[Kra84] Hanspeter Kraft. Geometrische Methoden in der Invariantentheorie. Springer, 1984. doi:10.1007/978-3-663-10143-7. p. 50, 88, 93.

[KS08] Tali Kaufman and Madhu Sudan. Algebraic property testing: The role of invariance. In Proceedings of the Fortieth Annual ACM Symposium on Theory of Computing, STOC '08, pages 403–412, New York, NY, USA, 2008. ACM. doi:10.1145/1374376.1374434. p. 48.

[KSS16] Robert Kleinberg, William F. Sawin, and David E. Speyer. The growth rate of tri-colored sum-free sets. arXiv, 2016. arXiv:1607.00047. p. 48, 79, 83.

[Lan06] Joseph M. Landsberg. The border rank of the multiplication of 2 × 2 matrices is seven. J. Amer. Math. Soc., 19(2):447–459, 2006. doi:10.1090/S0894-0347-05-00506-0. p. 110.

[LG14] François Le Gall. Powers of tensors and fast matrix multiplication. In ISSAC 2014: Proceedings of the 39th International Symposium on Symbolic and Algebraic Computation, pages 296–303. ACM, New York, 2014. doi:10.1145/2608628.2608664. p. 4, 6, 8, 48, 85.

[Lic84] Thomas Lickteig. A note on border rank. Inf. Process. Lett., 18(3):173–178, 1984. doi:10.1016/0020-0190(84)90023-1. p. 110.

[LM16a] Joseph M. Landsberg and Mateusz Michałek. A $2n^2 - \log(n) - 1$ lower bound for the border rank of matrix multiplication. arXiv, 2016. arXiv:1608.07486. p. 110.

[LM16b] Joseph M. Landsberg and Mateusz Michałek. Abelian tensors. J. Math. Pures Appl., 2016. doi:10.1016/j.matpur.2016.11.004. p. 14.

[LMR13] Joseph M. Landsberg, Laurent Manivel, and Nicolas Ressayre. Hypersurfaces with degenerate duals and the geometric complexity theory program. Comment. Math. Helv., 88(2):469–484, 2013. doi:10.4171/CMH/292. p. 108.

[LO15] Joseph M. Landsberg and Giorgio Ottaviani. New lower bounds for the border rank of matrix multiplication. Theory Comput., 11:285–298, 2015. arXiv:1112.6007, doi:10.4086/toc.2015.v011a011. p. 108, 110.

[Lov79] László Lovász. On the Shannon capacity of a graph. IEEE Trans. Inform. Theory, 25(1):1–7, 1979. doi:10.1109/TIT.1979.1055985. p. 13, 35, 41.

[Mar08] Murray Marshall. Positive polynomials and sums of squares, volume 146 of Mathematical Surveys and Monographs. American Mathematical Society, Providence, RI, 2008. doi:10.1090/surv/146. p. 34.

[MP71] Robert J. McEliece and Edward C. Posner. Hide and seek, data storage, and entropy. The Annals of Mathematical Statistics, 42(5):1706–1716, 1971. doi:10.1214/aoms/1177693169. p. 41.

[MP08] Guillaume Malod and Natacha Portier. Characterizing Valiant's algebraic complexity classes. J. Complexity, 24(1):16–38, 2008. doi:10.1016/j.jco.2006.09.006. p. 119.

[MS01] Ketan D. Mulmuley and Milind Sohoni. Geometric complexity theory. I. An approach to the P vs. NP and related problems. SIAM J. Comput., 31(2):496–526, 2001. doi:10.1137/S009753970038715X. p. 14, 108.

[MS08] Ketan D. Mulmuley and Milind Sohoni. Geometric complexity theory. II. Towards explicit obstructions for embeddings among class varieties. SIAM J. Comput., 38(3):1175–1206, 2008. doi:10.1137/080718115. p. 108.

[Nes84] Linda Ness. A stratification of the null cone via the moment map. Amer. J. Math., 106(6):1281–1329, 1984. With an appendix by David Mumford. doi:10.2307/2374395. p. 9, 93, 94.

[Nis91] Noam Nisan. Lower bounds for non-commutative computation. In Proceedings of the twenty-third annual ACM symposium on Theory of computing, pages 410–418. ACM, 1991. doi:10.1145/103418.103462. p. 110.

[Nor16] Sergey Norin. A distribution on triples with maximum entropy marginal. arXiv, 2016. arXiv:1608.00243. p. 83.

[NW97] Noam Nisan and Avi Wigderson. Lower bounds on arithmetic circuits via partial derivatives. Comput. Complexity, 6(3):217–234, 1996/97. doi:10.1007/BF01294256. p. 108.

[Pan78] Victor Ya. Pan. Strassen's algorithm is not optimal: Trilinear technique of aggregating, uniting and canceling for constructing fast algorithms for matrix operations. In 19th Annual Symposium on Foundations of Computer Science (Ann Arbor, Mich., 1978), pages 166–176. IEEE, Long Beach, Calif., 1978. p. 3.

[Pan80] Victor Ya. Pan. New fast algorithms for matrix operations. SIAM J. Comput., 9(2):321–342, 1980. doi:10.1137/0209027. p. 3.

[Pan81] Victor Ya. Pan. New combinations of methods for the acceleration of matrix multiplication. Comput. Math. Appl., 7(1):73–125, 1981. doi:10.1016/0898-1221(81)90009-2. p. 3.

[Pan84] Victor Ya. Pan. How to multiply matrices faster, volume 179 of Lecture Notes in Computer Science. Springer-Verlag, Berlin, 1984. doi:10.1007/3-540-13866-8. p. 3.

[Pan18] Victor Ya. Pan. Fast feasible and unfeasible matrix multiplication. arXiv, 2018. arXiv:1804.04102. p. 6.

[PD01] Alexander Prestel and Charles N. Delzell. Positive polynomials. Springer Monographs in Mathematics. Springer-Verlag, Berlin, 2001. From Hilbert's 17th problem to real algebra. doi:10.1007/978-3-662-04648-7. p. 34.

[Peb16] Luke Pebody. Proof of a conjecture of Kleinberg-Sawin-Speyer. arXiv, 2016. arXiv:1608.05740. p. 83.

[PS98] George Pólya and Gabor Szegő. Problems and theorems in analysis. I. Classics in Mathematics. Springer-Verlag, Berlin, 1998. Series, integral calculus, theory of functions. Translated from the German by Dorothee Aeppli. Reprint of the 1978 English translation. doi:10.1007/978-3-642-61905-2. p. 21.

[Raz09] Ran Raz. Multi-linear formulas for permanent and determinant are of super-polynomial size. J. ACM, 56(2):Art. 8, 17, 2009. doi:10.1145/1502793.1502797. p. 108.

[Raz13] Ran Raz. Tensor-rank and lower bounds for arithmetic formulas. J. ACM, 60(6):Art. 40, 15, 2013. doi:10.1145/2535928. p. 14.

[Rom82] Francesco Romani. Some properties of disjoint sums of tensors related to matrix multiplication. SIAM J. Comput., 11(2):263–267, 1982. doi:10.1137/0211020. p. 3.

[Sap16] Ramprasad Saptharishi. A survey of lower bounds in arithmetic circuit complexity 3.0.2, 2016. Online survey. URL: https://github.com/dasarpmar/lowerbounds-survey. p. 6, 17, 109, 112.

[Sch81] Arnold Schönhage. Partial and total matrix multiplication. SIAM J. Comput., 10(3):434–455, 1981. p. 3.

[Sch03] Alexander Schrijver. Combinatorial optimization: polyhedra and efficiency, volume 24. Springer Science & Business Media, 2003. p. 37, 41.

[Sha56] Claude E. Shannon. The zero error capacity of a noisy channel. Institute of Radio Engineers, Transactions on Information Theory, IT-2(September):8–19, 1956. doi:10.1109/TIT.1956.1056798. p. 13, 35.

[Sha09] Asaf Shapira. Green's conjecture and testing linear-invariant properties. In Proceedings of the Forty-first Annual ACM Symposium on Theory of Computing, STOC '09, pages 159–166, New York, NY, USA, 2009. ACM. doi:10.1145/1536414.1536438. p. 48.

[Shi16] Yaroslav Shitov. How hard is the tensor rank? arXiv, 2016. arXiv:1611.01559. p. 47.

[Sin64] Richard C. Singleton. Maximum distance q-nary codes. IEEE Trans. Information Theory, IT-10:116–118, 1964. doi:10.1109/TIT.1964.1053661. p. 101.

[SOK14] Adam Sawicki, Michał Oszmaniec, and Marek Kuś. Convexity of momentum map, Morse index, and quantum entanglement. Rev. Math. Phys., 26(3):1450004, 39, 2014. doi:10.1142/S0129055X14500044. p. 9.

[SSS09] Chandan Saha, Ramprasad Saptharishi, and Nitin Saxena. The power of depth 2 circuits over algebras. In IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science, volume 4, pages 371–382, 2009. arXiv:0904.2058, doi:10.4230/LIPIcs.FSTTCS.2009.2333. p. 109.

[Sto10] Andrew James Stothers. On the complexity of matrix multiplication. PhD thesis, University of Edinburgh, 2010. http://hdl.handle.net/1842/4734. p. 4, 6, 8, 48.

[Str69] Volker Strassen. Gaussian elimination is not optimal. Numer. Math., 13(4):354–356, 1969. doi:10.1007/BF02165411. p. 3, 5.

[Str83] Volker Strassen. Rank and optimal computation of generic tensors. Linear Algebra Appl., 52/53:645–685, 1983. doi:10.1016/0024-3795(83)80041-X. p. 110.

[Str86] Volker Strassen. The asymptotic spectrum of tensors and the exponent of matrix multiplication. In Proceedings of the 27th Annual Symposium on Foundations of Computer Science, SFCS '86, pages 49–54, Washington, DC, USA, 1986. IEEE Computer Society. doi:10.1109/SFCS.1986.52. p. 4, 7.

[Str87] Volker Strassen. Relative bilinear complexity and matrix multiplication. J. Reine Angew. Math., 375/376:406–443, 1987. doi:10.1515/crll.1987.375-376.406. p. 3, 4, 49, 67.

[Str88] Volker Strassen. The asymptotic spectrum of tensors. J. Reine Angew. Math., 384:102–152, 1988. doi:10.1515/crll.1988.384.102. p. 4, 7, 12, 19, 26, 27, 28, 29, 30, 32, 33, 49, 50, 51.

[Str91] Volker Strassen. Degeneration and complexity of bilinear maps: some asymptotic spectra. J. Reine Angew. Math., 413:127–180, 1991. doi:10.1515/crll.1991.413.127. p. 3, 4, 10, 48, 49, 52, 55, 56, 57, 66, 67, 81, 82.

[Str94] Volker Strassen. Algebra and complexity. In First European Congress of Mathematics, Vol. II (Paris, 1992), volume 120 of Progr. Math., pages 429–446. Birkhäuser, Basel, 1994. doi:10.1007/s10107-008-0221-1. p. 67.

[Str05] Volker Strassen. Komplexität und Geometrie bilinearer Abbildungen. Jahresber. Deutsch. Math.-Verein., 107(1):3–31, 2005. p. 4, 88, 94, 95, 100, 101.

[Tao08] Terence Tao. Structure and randomness: pages from year one of a mathematical blog. American Mathematical Soc., 2008. p. 48.

[Tao16] Terence Tao. A symmetric formulation of the Croot–Lev–Pach–Ellenberg–Gijswijt capset bound. https://terrytao.wordpress.com, 2016. p. 48, 58, 81, 84.

[Tob91] Verena Tobler. Spezialisierung und Degeneration von Tensoren. PhD thesis, Universität Konstanz, 1991. http://nbn-resolving.de/urn:nbn:de:bsz:352-opus-20324. p. 57.

[TS16] Terence Tao and Will Sawin. Notes on the "slice rank" of tensors. https://terrytao.wordpress.com, 2016. p. 48, 58.

[Val79] Leslie G. Valiant. Completeness classes in algebra. In Conference Record of the Eleventh Annual ACM Symposium on Theory of Computing (Atlanta, Ga., 1979), pages 249–261. ACM, New York, 1979. doi:10.1145/800135.804419. p. 107, 108, 123.

[Val80] Leslie G. Valiant. Reducibility by algebraic projections. University of Edinburgh, Department of Computer Science, 1980. Internal Report. p. 109, 119, 123.

[VC15] Péter Vrana and Matthias Christandl. Asymptotic entanglement transformation between W and GHZ states. J. Math. Phys., 56(2):022204, 12, 2015. arXiv:1310.3244, doi:10.1063/1.4908106. p. 69.

[VDDMV02] F. Verstraete, J. Dehaene, B. De Moor, and H. Verschelde. Four qubits can be entangled in nine different ways. Phys. Rev. A (3), 65(5, part A):052112, 5, 2002. doi:10.1103/PhysRevA.65.052112. p. 48.

[Wal14] Michael Walter. Multipartite quantum states and their marginals. PhD thesis, ETH Zürich, 2014. arXiv:1410.6820. p. 93.

[WDGC13] Michael Walter, Brent Doran, David Gross, and Matthias Christandl. Entanglement polytopes: multiparticle entanglement from single-particle information. Science, 340(6137):1205–1208, 2013. arXiv:1208.0365, doi:10.1126/science.1232957. p. 8, 9, 95.

[Wil12] Virginia Vassilevska Williams. Multiplying matrices faster than Coppersmith-Winograd [extended abstract]. In STOC'12: Proceedings of the 2012 ACM Symposium on Theory of Computing, pages 887–898. ACM, New York, 2012. doi:10.1145/2213977.2214056. p. 4, 6, 8, 48.

[Zui17] Jeroen Zuiddam. A note on the gap between rank and border rank. Linear Algebra Appl., 525:33–44, 2017. doi:10.1016/j.laa.2017.03.015. p. 2, 14, 110.

[Zui18] Jeroen Zuiddam. The asymptotic spectrum of graphs and the Shannon capacity. arXiv, 2018. arXiv:1807.00169. p. 35.

Glossary

⟨n⟩  n × ⋯ × n diagonal tensor, p. 47

⟨a, b, c⟩  matrix multiplication tensor, p. 48

G ∗ H  or-product, p. 42

G ⊠ H  strong graph product, and-product, p. 35

α(G)  stability number, p. 35

χ̄(G)  clique cover number, p. 40

K_k  complete graph on k vertices, p. 36

F^θ(t)  quantum functional, p. 96

G(t)  GL_{n_1} × ⋯ × GL_{n_k} for t ∈ F^{n_1} ⊗ ⋯ ⊗ F^{n_k}, p. 52

H(P)  Shannon entropy of probability distribution P, p. 52

h(p)  binary entropy of probability p ∈ [0, 1], p. 53

τ(Φ)  hitting set number, p. 59

τ̰(Φ)  asymptotic hitting set number, p. 60

ω  matrix multiplication exponent, p. 47

P  moment polytope, p. 94

P(X)  the set of probability distributions on X, p. 52

R  rank, p. 27

R̰  asymptotic rank, p. 27

R̲(t)  border rank, p. 50

R(G)  rank of a graph, clique cover number, p. 40

R(t)  tensor rank, p. 47

SR(t)  slice rank, p. 58

Q  subrank, p. 27

Q̰  asymptotic subrank, p. 27

Q̲(t)  border subrank, p. 50

Q(Φ)  combinatorial subrank, p. 10

Q(G)  subrank of a graph, stability number, p. 40

supp(t)  support, p. 52

Θ(G)  Shannon capacity, p. 35

ϑ(G)  Lovász theta number, p. 41

G ⊔ H  disjoint union, p. 36

W(t)  S_{n_1} × ⋯ × S_{n_k} for t ∈ F^{n_1} ⊗ ⋯ ⊗ F^{n_k}, p. 53

X(S, ≤)  asymptotic spectrum of semiring S with Strassen preorder ≤, p. 25

ζ_{(S)}(t)  gauge point, p. 51

ζ^θ(t)  support functional, p. 52

Samenvatting

Algebraic complexity, asymptotic spectra and entanglement polytopes

It is well known that the rank of a matrix is multiplicative under the Kronecker product, additive under the direct sum, normalised on identity matrices, and non-increasing under multiplication from the left and from the right by matrices. Matrix rank is in fact the only real parameter with these four properties. In 1986, Strassen initiated the study of the extension to tensors: find all maps from k-tensors to the real numbers that are multiplicative under the tensor Kronecker product, additive under the direct sum, normalised on "identity tensors", and non-increasing under applying linear maps to the k tensor factors. Strassen called the set of these maps the "asymptotic spectrum of k-tensors". He proved that if we understand the asymptotic spectrum, then we understand the asymptotic relations between tensors, including the asymptotic subrank and the asymptotic rank. In particular, if we know the asymptotic spectrum, then we know the arithmetic complexity of matrix multiplication, a central problem in algebraic complexity theory.

One of the main results of this dissertation is the first explicit construction of an infinite family of elements in the asymptotic spectrum of complex k-tensors, called the quantum functionals. Our construction is based on information theory and moment polytopes, also called entanglement polytopes. In addition, we study, among other things, the relation between the recently introduced slice rank and the quantum functionals, and we prove that the "asymptotic" slice rank equals the minimum over the quantum functionals. Besides studying the tensor parameters mentioned above, we give an extension of the Coppersmith–Winograd method (for obtaining lower bounds on the asymptotic combinatorial subrank) to higher-order tensors, i.e. tensors of order at least 4. We apply this extension to obtain new upper bounds on the asymptotic tensor rank of complete graph tensors via the laser method. (Joint work with Christandl and Vrana; QIP 2018, STOC 2018.)

As a new application of the abstract theory of asymptotic spectra, we introduce the asymptotic spectrum of graphs in graph theory. Analogous to the situation for tensors, if we understand the asymptotic spectrum of graphs, then we understand the Shannon capacity, a graph parameter that characterises the zero-error communication complexity of communication channels. In other words, we prove a new duality theorem for the Shannon capacity. Examples of elements in the asymptotic spectrum of graphs are the Lovász theta number and the fractional Haemers bounds.

Finally, we study an algebraic model of computation called algebraic branching programs. An algebraic branching program (abp) is the trace of a product of matrices whose entries are polynomials of degree at most 1. The maximum size of the matrices is called the width of the abp. In 1992, Ben-Or and Cleve proved that every polynomial can be computed by a width-3 abp with a number of matrices that is polynomial in the formula size of the polynomial. In contrast, Allender and Wang proved in 2011 that some polynomials cannot be computed by width-2 abps. We prove that every polynomial can be approximated by a width-2 abp with a number of matrices that is polynomial in the formula size of the polynomial, where approximation is meant in the sense of degeneration. (Joint work with Ikenmeyer and Bringmann; CCC 2017, JACM 2018.)

Summary

Algebraic complexity, asymptotic spectra and entanglement polytopes

Matrix rank is well known to be multiplicative under the Kronecker product, additive under the direct sum, normalised on identity matrices, and non-increasing under multiplying from the left and from the right by any matrices. In fact, matrix rank is the only real matrix parameter with these four properties. In 1986, Strassen proposed to study the extension to tensors: find all maps from k-tensors to the reals that are multiplicative under the tensor Kronecker product, additive under the direct sum, normalised on "identity tensors", and non-increasing under acting with linear maps on the k tensor factors. Strassen called the collection of these maps the "asymptotic spectrum of k-tensors". He proved that understanding the asymptotic spectrum implies understanding the asymptotic relations among tensors, including the asymptotic subrank and the asymptotic rank. In particular, knowing the asymptotic spectrum means knowing the arithmetic complexity of matrix multiplication, a central problem in algebraic complexity theory.
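The four properties of matrix rank named above can be checked on small examples. The following is a minimal sketch, not taken from the dissertation: the helper names (rank, kron, direct_sum, matmul, identity) and the example matrices are illustrative assumptions, and the rank is computed exactly over the rationals by Gaussian elimination.

```python
from fractions import Fraction


def rank(matrix):
    """Exact rank over the rationals via Gaussian elimination (illustrative helper)."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    found = 0  # number of pivots found so far
    for col in range(n_cols):
        # Look for a nonzero entry in this column at or below the pivot position.
        pivot = next((r for r in range(found, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        pivot_row = rows[found]
        for r in range(len(rows)):
            if r != found and rows[r][col] != 0:
                factor = rows[r][col] / pivot_row[col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], pivot_row)]
        found += 1
    return found


def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def direct_sum(a, b):
    """Block-diagonal matrix with blocks a and b."""
    cols_a, cols_b = len(a[0]), len(b[0])
    return [row + [0] * cols_b for row in a] + [[0] * cols_a + row for row in b]


def matmul(a, b):
    """Ordinary matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]


# Example matrices chosen for illustration only.
A = [[1, 2], [2, 4]]        # rank 1
B = [[1, 0, 1], [0, 1, 1]]  # rank 2
left = [[1, 1], [0, 0]]
right = [[1, 0], [1, 1]]

multiplicative = rank(kron(A, B)) == rank(A) * rank(B)
additive = rank(direct_sum(A, B)) == rank(A) + rank(B)
normalised = rank(identity(4)) == 4
monotone = rank(matmul(matmul(left, A), right)) <= rank(A)
```

All four flags evaluate to True on these inputs. Checking examples of course proves nothing in general; the sketch only makes the four conditions concrete.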

One of the main results in this dissertation is the first explicit construction of an infinite family of elements in the asymptotic spectrum of complex k-tensors, called the quantum functionals. Our construction is based on information theory and moment polytopes, i.e. entanglement polytopes. Moreover, among other things, we study the relation of the recently introduced slice rank to the quantum functionals and find that "asymptotic" slice rank equals the minimum over the quantum functionals. Besides studying the above tensor parameters, we extend the Coppersmith–Winograd method (for obtaining asymptotic combinatorial subrank lower bounds) to higher-order tensors, i.e. order at least 4. We apply this generalisation to obtain new upper bounds on the asymptotic tensor rank of complete graph tensors via the laser method. (Joint work with Christandl and Vrana; QIP 2018, STOC 2018.)

In graph theory, as a new instantiation of the abstract theory of asymptotic spectra, we introduce the asymptotic spectrum of graphs. Analogous to the situation for tensors, understanding the asymptotic spectrum of graphs means understanding the Shannon capacity, a graph parameter capturing the zero-error communication complexity of communication channels. In different words, we prove a new duality theorem for Shannon capacity. Some known elements in the asymptotic spectrum of graphs are the Lovász theta number and the fractional Haemers bounds.
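To make the Shannon capacity concrete, here is a small sketch of the classical 5-cycle example. It is an illustration under stated assumptions, not code from the dissertation: the function names are hypothetical, and the brute-force stability number is only feasible for tiny graphs. The sketch checks that the stability number of the 5-cycle C5 is 2, while the strong product of C5 with itself contains an independent set of size 5, which is larger than 2 · 2 = 4. This gives the lower bound Θ(C5) ≥ √5.

```python
from itertools import combinations


def cycle_adjacent(u, v, n=5):
    """Adjacency in the cycle graph C_n on vertices 0..n-1."""
    return (u - v) % n in (1, n - 1)


def strong_adjacent(p, q, n=5):
    """Adjacency in the strong product of C_n with itself:
    distinct vertices whose coordinates are pairwise equal or adjacent."""
    if p == q:
        return False
    return all(a == b or cycle_adjacent(a, b, n) for a, b in zip(p, q))


def is_independent(vertices, adjacent):
    """True if no two of the given vertices are adjacent."""
    return not any(adjacent(u, v) for u, v in combinations(vertices, 2))


def stability_number(vertices, adjacent):
    """Brute-force stability number; exponential time, for very small graphs only."""
    vertices = list(vertices)
    for size in range(len(vertices), 0, -1):
        if any(is_independent(s, adjacent) for s in combinations(vertices, size)):
            return size
    return 0


alpha_c5 = stability_number(range(5), cycle_adjacent)

# Candidate independent set of size 5 in the strong product of C5 with itself.
witness = [(i, (2 * i) % 5) for i in range(5)]
witness_ok = is_independent(witness, strong_adjacent)
```

Here alpha_c5 is 2 and witness_ok is True, so the stability number of the squared graph is at least 5. That √5 is also an upper bound is the theorem of Lovász [Lov79] via the theta number, which this sketch does not verify.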

Finally, we study an algebraic model of computation called algebraic branching programs. An algebraic branching program (abp) is the trace of a product of matrices with affine linear forms as matrix entries. The maximum size of the matrices is called the width of the abp. In 1992, Ben-Or and Cleve proved that width-3 abps can compute any polynomial efficiently in the formula size. On the other hand, in 2011 Allender and Wang proved that some polynomials cannot be computed by any width-2 abp. We prove that any polynomial can be efficiently approximated by a width-2 abp, where approximation is defined in the sense of degeneration. (Joint work with Ikenmeyer and Bringmann; CCC 2017, JACM 2018.)
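The definition of an abp as a trace of a matrix product can be sketched in a few lines. The code below is an illustration only and makes several assumptions that are not in the text: all matrices are square of the same width, an affine linear form is encoded as a pair (constant, {variable: coefficient}), and the names eval_affine and eval_abp are hypothetical. The example is a width-2 abp whose trace is the polynomial xy + 1.

```python
def eval_affine(form, point):
    """Evaluate an affine linear form c + sum_i a_i * x_i at a point.

    A form is a pair (c, {variable_name: coefficient}); this encoding is an assumption.
    """
    constant, coeffs = form
    return constant + sum(a * point[name] for name, a in coeffs.items())


def eval_abp(matrices, point):
    """Evaluate trace(M_1 * ... * M_k), where each M_i has affine forms as entries.

    Assumes every matrix is square with the same width.
    """
    width = len(matrices[0])
    # Start from the identity matrix and multiply the evaluated matrices in order.
    acc = [[int(i == j) for j in range(width)] for i in range(width)]
    for m in matrices:
        values = [[eval_affine(entry, point) for entry in row] for row in m]
        acc = [
            [sum(acc[i][k] * values[k][j] for k in range(width)) for j in range(width)]
            for i in range(width)
        ]
    return sum(acc[i][i] for i in range(width))


X = (0, {"x": 1})
Y = (0, {"y": 1})
ZERO = (0, {})
ONE = (1, {})

# trace(diag(x, 1) * diag(y, 1)) = x*y + 1, a width-2 abp with two matrices.
abp = [
    [[X, ZERO], [ZERO, ONE]],
    [[Y, ZERO], [ZERO, ONE]],
]

value = eval_abp(abp, {"x": 3, "y": 4})  # 3 * 4 + 1 = 13
```

The sketch only evaluates a given abp at a point; it says nothing about which polynomials admit small-width abps or about approximation in the sense of degeneration.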

Titles in the ILLC Dissertation Series

ILLC DS-2009-01: Jakub Szymanik. Quantifiers in TIME and SPACE: Computational Complexity of Generalized Quantifiers in Natural Language

ILLC DS-2009-02: Hartmut Fitz. Neural Syntax

ILLC DS-2009-03: Brian Thomas Semmes. A Game for the Borel Functions

ILLC DS-2009-04: Sara L. Uckelman. Modalities in Medieval Logic

ILLC DS-2009-05: Andreas Witzel. Knowledge and Games: Theory and Implementation

ILLC DS-2009-06: Chantal Bax. Subjectivity after Wittgenstein: Wittgenstein's embodied and embedded subject and the debate about the death of man

ILLC DS-2009-07: Kata Balogh. Theme with Variations: A Context-based Analysis of Focus

ILLC DS-2009-08: Tomohiro Hoshi. Epistemic Dynamics and Protocol Information

ILLC DS-2009-09: Olivia Ladinig. Temporal expectations and their violations

ILLC DS-2009-10: Tikitu de Jager. "Now that you mention it, I wonder...": Awareness, Attention, Assumption

ILLC DS-2009-11: Michael Franke. Signal to Act: Game Theory in Pragmatics

ILLC DS-2009-12: Joel Uckelman. More Than the Sum of Its Parts: Compact Preference Representation Over Combinatorial Domains

ILLC DS-2009-13: Stefan Bold. Cardinals as Ultrapowers: A Canonical Measure Analysis under the Axiom of Determinacy

ILLC DS-2010-01: Reut Tsarfaty. Relational-Realizational Parsing

ILLC DS-2010-02: Jonathan Zvesper. Playing with Information

ILLC DS-2010-03: Cédric Dégremont. The Temporal Mind: Observations on the logic of belief change in interactive systems

ILLC DS-2010-04: Daisuke Ikegami. Games in Set Theory and Logic

ILLC DS-2010-05: Jarmo Kontinen. Coherence and Complexity in Fragments of Dependence Logic

ILLC DS-2010-06: Yanjing Wang. Epistemic Modelling and Protocol Dynamics

ILLC DS-2010-07: Marc Staudacher. Use theories of meaning between conventions and social norms

ILLC DS-2010-08: Amélie Gheerbrant. Fixed-Point Logics on Trees

ILLC DS-2010-09: Gaëlle Fontaine. Modal Fixpoint Logic: Some Model Theoretic Questions

ILLC DS-2010-10: Jacob Vosmaer. Logic, Algebra and Topology: Investigations into canonical extensions, duality theory and point-free topology

ILLC DS-2010-11: Nina Gierasimczuk. Knowing One's Limits: Logical Analysis of Inductive Inference

ILLC DS-2010-12: Martin Mose Bentzen. Stit, Iit, and Deontic Logic for Action Types

ILLC DS-2011-01: Wouter M. Koolen. Combining Strategies Efficiently: High-Quality Decisions from Conflicting Advice

ILLC DS-2011-02: Fernando Raymundo Velázquez-Quesada. Small steps in dynamics of information

ILLC DS-2011-03: Marijn Koolen. The Meaning of Structure: the Value of Link Evidence for Information Retrieval

ILLC DS-2011-04: Junte Zhang. System Evaluation of Archival Description and Access

ILLC DS-2011-05: Lauri Keskinen. Characterizing All Models in Infinite Cardinalities

ILLC DS-2011-06: Rianne Kaptein. Effective Focused Retrieval by Exploiting Query Context and Document Structure

ILLC DS-2011-07: Jop Briët. Grothendieck Inequalities, Nonlocal Games and Optimization

ILLC DS-2011-08: Stefan Minica. Dynamic Logic of Questions

ILLC DS-2011-09: Raul Andres Leal. Modalities Through the Looking Glass: A study on coalgebraic modal logic and their applications

ILLC DS-2011-10: Lena Kurzen. Complexity in Interaction

ILLC DS-2011-11: Gideon Borensztajn. The neural basis of structure in language

ILLC DS-2012-01: Federico Sangati. Decomposing and Regenerating Syntactic Trees

ILLC DS-2012-02: Markos Mylonakis. Learning the Latent Structure of Translation

ILLC DS-2012-03: Edgar José Andrade Lotero. Models of Language: Towards a practice-based account of information in natural language

ILLC DS-2012-04: Yurii Khomskii. Regularity Properties and Definability in the Real Number Continuum: idealized forcing, polarized partitions, Hausdorff gaps and mad families in the projective hierarchy

ILLC DS-2012-05: David García Soriano. Query-Efficient Computation in Property Testing and Learning Theory

ILLC DS-2012-06: Dimitris Gakis. Contextual Metaphilosophy - The Case of Wittgenstein

ILLC DS-2012-07: Pietro Galliani. The Dynamics of Imperfect Information

ILLC DS-2012-08: Umberto Grandi. Binary Aggregation with Integrity Constraints

ILLC DS-2012-09: Wesley Halcrow Holliday. Knowing What Follows: Epistemic Closure and Epistemic Logic

ILLC DS-2012-10: Jeremy Meyers. Locations, Bodies, and Sets: A model theoretic investigation into nominalistic mereologies

ILLC DS-2012-11: Floor Sietsma. Logics of Communication and Knowledge

ILLC DS-2012-12: Joris Dormans. Engineering emergence: applied theory for game design

ILLC DS-2013-01: Simon Pauw. Size Matters: Grounding Quantifiers in Spatial Perception

ILLC DS-2013-02: Virginie Fiutek. Playing with Knowledge and Belief

ILLC DS-2013-03: Giannicola Scarpa. Quantum entanglement in non-local games, graph parameters and zero-error information theory

ILLC DS-2014-01: Machiel Keestra. Sculpting the Space of Actions: Explaining Human Action by Integrating Intentions and Mechanisms

ILLC DS-2014-02: Thomas Icard. The Algorithmic Mind: A Study of Inference in Action

ILLC DS-2014-03: Harald A. Bastiaanse. Very, Many, Small, Penguins

ILLC DS-2014-04: Ben Rodenhäuser. A Matter of Trust: Dynamic Attitudes in Epistemic Logic

ILLC DS-2015-01: María Inés Crespo. Affecting Meaning: Subjectivity and evaluativity in gradable adjectives

ILLC DS-2015-02: Mathias Winther Madsen. The Kid, the Clerk, and the Gambler - Critical Studies in Statistics and Cognitive Science

ILLC DS-2015-03: Shengyang Zhong. Orthogonality and Quantum Geometry: Towards a Relational Reconstruction of Quantum Theory

ILLC DS-2015-04: Sumit Sourabh. Correspondence and Canonicity in Non-Classical Logic

ILLC DS-2015-05: Facundo Carreiro. Fragments of Fixpoint Logics: Automata and Expressiveness

ILLC DS-2016-01: Ivano A. Ciardelli. Questions in Logic

ILLC DS-2016-02: Zoe Christoff. Dynamic Logics of Networks: Information Flow and the Spread of Opinion

ILLC DS-2016-03: Fleur Leonie Bouwer. What do we need to hear a beat? The influence of attention, musical abilities, and accents on the perception of metrical rhythm

ILLC DS-2016-04: Johannes Marti. Interpreting Linguistic Behavior with Possible World Models

ILLC DS-2016-05: Phong Le. Learning Vector Representations for Sentences - The Recursive Deep Learning Approach

ILLC DS-2016-06: Gideon Maillette de Buy Wenniger. Aligning the Foundations of Hierarchical Statistical Machine Translation

ILLC DS-2016-07: Andreas van Cranenburgh. Rich Statistical Parsing and Literary Language

ILLC DS-2016-08: Florian Speelman. Position-based Quantum Cryptography and Catalytic Computation

ILLC DS-2016-09: Teresa Piovesan. Quantum entanglement: insights via graph parameters and conic optimization

ILLC DS-2016-10: Paula Henk. Nonstandard Provability for Peano Arithmetic: A Modal Perspective

ILLC DS-2017-01: Paolo Galeazzi. Play Without Regret

ILLC DS-2017-02: Riccardo Pinosio. The Logic of Kant's Temporal Continuum

ILLC DS-2017-03: Matthijs Westera. Exhaustivity and intonation: a unified theory

ILLC DS-2017-04: Giovanni Cinà. Categories for the working modal logician

ILLC DS-2017-05: Shane Noah Steinert-Threlkeld. Communication and Computation: New Questions About Compositionality

ILLC DS-2017-06: Peter Hawke. The Problem of Epistemic Relevance

ILLC DS-2017-07: Aybüke Özgün. Evidence in Epistemic Logic: A Topological Perspective

ILLC DS-2017-08: Raquel Garrido Alhama. Computational Modelling of Artificial Language Learning: Retention, Recognition & Recurrence

ILLC DS-2017-09: Miloš Stanojević. Permutation Forests for Modeling Word Order in Machine Translation

ILLC DS-2018-01: Berit Janssen. Retained or Lost in Transmission? Analyzing and Predicting Stability in Dutch Folk Songs

ILLC DS-2018-02: Hugo Huurdeman. Supporting the Complex Dynamics of the Information Seeking Process

ILLC DS-2018-03: Corina Koolen. Reading beyond the female: The relationship between perception of author gender and literary quality

ILLC DS-2018-04: Jelle Bruineberg. Anticipating Affordances: Intentionality in self-organizing brain-body-environment systems

ILLC DS-2018-05: Joachim Daiber. Typologically Robust Statistical Machine Translation: Understanding and Exploiting Differences and Similarities Between Languages in Machine Translation

ILLC DS-2018-06: Thomas Brochhagen. Signaling under Uncertainty

ILLC DS-2018-07: Julian Schlöder. Assertion and Rejection

ILLC DS-2018-08: Srinivasan Arunachalam. Quantum Algorithms and Learning Theory

ILLC DS-2018-09: Hugo de Holanda Cunha Nobrega. Games for functions: Baire classes, Weihrauch degrees, transfinite computations, and ranks

ILLC DS-2018-10: Chenwei Shi. Reason to Believe

ILLC DS-2018-11: Malvin Gattinger. New Directions in Model Checking Dynamic Epistemic Logic

ILLC DS-2018-12: Julia Ilin. Filtration Revisited: Lattices of Stable Non-Classical Logics


Algebraic complexity, asymptotic spectra and entanglement polytopes

Academic Dissertation

to obtain the degree of doctor at the Universiteit van Amsterdam, on the authority of the Rector Magnificus, prof. dr. ir. K.I.J. Maex, to be defended in public before a committee appointed by the College voor Promoties, in the Agnietenkapel on Tuesday 23 October 2018 at 12:00

by

Jeroen Zuiddam

born in Leiderdorp

Doctoral committee

Supervisors:
prof. dr. H.M. Buhrman, Universiteit van Amsterdam
prof. dr. M. Christandl, Københavns Universitet

Other members:
prof. dr. M. Laurent, Tilburg University
prof. dr. E.M. Opdam, Universiteit van Amsterdam
prof. dr. R.M. de Wolf, Universiteit van Amsterdam
dr. J. Briët, CWI Amsterdam
dr. M. Walter, Universiteit van Amsterdam

Faculteit der Natuurwetenschappen, Wiskunde en Informatica

Contents

Acknowledgements

1 Introduction
  1.1 Matrix multiplication
  1.2 The asymptotic spectrum of tensors
  1.3 Higher-order CW method
  1.4 Abstract asymptotic spectra
  1.5 The asymptotic spectrum of graphs
  1.6 Tensor degeneration
  1.7 Combinatorial degeneration
  1.8 Algebraic branching program degeneration
  1.9 Organisation

2 The theory of asymptotic spectra
  2.1 Introduction
  2.2 Semirings and preorders
  2.3 Strassen preorders
  2.4 Asymptotic preorders $\lesssim$
  2.5 Maximal Strassen preorders
  2.6 The asymptotic spectrum $X(S, \leq)$
  2.7 The representation theorem
  2.8 Abstract rank and subrank $R$, $Q$
  2.9 Topological aspects
  2.10 Uniqueness
  2.11 Subsemirings
  2.12 Subsemirings generated by one element
  2.13 Universal spectral points
  2.14 Conclusion

3 The asymptotic spectrum of graphs; Shannon capacity
  3.1 Introduction
  3.2 The asymptotic spectrum of graphs
    3.2.1 The semiring of graph isomorphism classes $\mathcal{G}$
    3.2.2 Strassen preorder via graph homomorphisms
    3.2.3 The asymptotic spectrum of graphs $X(\mathcal{G})$
    3.2.4 Shannon capacity $\Theta$
  3.3 Universal spectral points
    3.3.1 Lovász theta number $\vartheta$
    3.3.2 Fractional graph parameters
  3.4 Conclusion

4 The asymptotic spectrum of tensors; matrix multiplication
  4.1 Introduction
  4.2 The asymptotic spectrum of tensors
    4.2.1 The semiring of tensor equivalence classes $\mathcal{T}$
    4.2.2 Strassen preorder via restriction
    4.2.3 The asymptotic spectrum of tensors $X(\mathcal{T})$
    4.2.4 Asymptotic rank and asymptotic subrank
  4.3 Gauge points $\zeta_{(i)}$
  4.4 Support functionals $\zeta^{\theta}$
  4.5 Upper and lower support functionals $\zeta^{\theta}$, $\zeta_{\theta}$
  4.6 Asymptotic slice rank
  4.7 Conclusion

5 Tight tensors and combinatorial subrank; cap sets
  5.1 Introduction
  5.2 Higher-order Coppersmith–Winograd method
    5.2.1 Construction
    5.2.2 Computational remarks
    5.2.3 Examples: type sets
  5.3 Combinatorial degeneration method
  5.4 Cap sets
    5.4.1 Reduced polynomial multiplication
    5.4.2 Cap sets
  5.5 Graph tensors
  5.6 Conclusion

6 Universal points in the asymptotic spectrum of tensors; entanglement polytopes, moment polytopes
  6.1 Introduction
  6.2 Schur–Weyl duality
  6.3 Kronecker and Littlewood–Richardson coefficients $g_{\lambda\mu\nu}$, $c^{\lambda}_{\mu\nu}$
  6.4 Entropy inequalities
  6.5 Hilbert spaces and density operators
  6.6 Moment polytopes $P(t)$
    6.6.1 General setting
    6.6.2 Tensor spaces
  6.7 Quantum functionals $F^{\theta}(t)$
  6.8 Outer approximation
  6.9 Inner approximation for free tensors
  6.10 Quantum functionals versus support functionals
  6.11 Asymptotic slice rank
  6.12 Conclusion

7 Algebraic branching programs; approximation and nondeterminism
  7.1 Introduction
  7.2 Definitions and basic results
    7.2.1 Computational models
    7.2.2 Complexity classes $\mathrm{VP}$, $\mathrm{VP}_e$, $\mathrm{VP}_k$
    7.2.3 The theorem of Ben-Or and Cleve
    7.2.4 Approximation closure $\overline{C}$
    7.2.5 Nondeterminism closure $N(C)$
  7.3 Approximation closure of $\mathrm{VP}_2$
  7.4 Nondeterminism closure of $\mathrm{VP}_1$
  7.5 Conclusion

Bibliography

Glossary

Samenvatting

Summary

Acknowledgements

First of all I thank all my coauthors for very fruitful collaboration: Harry Buhrman, Matthias Christandl, Peter Vrana, Jop Briët, Chris Perry, Asger Jensen, Markus Bläser, Christian Ikenmeyer and Karl Bringmann.

Chris Zaal, Leen Torenvliet and Robert Belleman I thank for all their efforts to set up for me the "double bachelor programme" in Mathematics and Computer science at the University of Amsterdam (UvA) in 2009. This programme, as well as the "webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch national research institute for mathematics and computer science (CWI), made me decide to come to Amsterdam. My enjoyable master thesis project in mathematics with Eric Opdam made me follow the academic path, for which I thank Eric.

Of course, most importantly, I thank my PhD supervisor Harry Buhrman, for introducing me to research as a bachelor student, for absorbing me into the Algorithms and Complexity group at CWI, for having enough faith in me to hire me as his PhD student in 2014, and for his general guidance throughout. I feel very lucky for the opportunities and scientific freedom that this has brought me.

Matthias Christandl has been my closest collaborator and mentor since we met in Berkeley in 2014. In practice this meant countless nights of fun Skype sessions between Amsterdam and Copenhagen, countless enjoyable visits to the University of Copenhagen, and countless kitchen table sessions at the Hallinsgade. Thanks, Matthias, for the energy, inspiration and optimism. And thanks, Matthias and Henriette, for the hospitality.

Jop Briët I thank for his general guidance and for lots of inspiration. The polynomial method reading group, which he mainly organised, inspired part of my paper with Matthias Christandl and Peter Vrana on universal points in the asymptotic spectrum of tensors. (This reading group also resulted in Dion Gijswijt's paper on cap sets.) My paper with Jop on round elimination later inspired me to write the paper on the asymptotic spectrum of graphs.


Christian Ikenmeyer I thank for numerous inspiring discussions on algebraic complexity theory and tensors, which greatly influenced my papers on tensor rank and our joint paper with Karl Bringmann on algebraic branching programs.

Peter Vrana I thank for our many enjoyable research collaborations, the results of which form a central part of this dissertation, for his clever insights, and for finding several mathematical mistakes while reading the draft of this dissertation.

Ronald de Wolf I thank for his general advice throughout my PhD, and for many suggestions regarding the current version of this dissertation, which will be incorporated in the next version (but not in the printed version, because of the regulations of the University of Amsterdam).

Jop Briët, Monique Laurent, Lex Schrijver, Peter Vrana, Matthias Christandl, Maris Ozols, Michael Walter and Bart Sevenster I thank for helpful discussions regarding the results in Chapter 2 and Chapter 3 of this dissertation.

Srinivasan Arunachalam I thank for sharing the ups and downs during our four years as PhD students at CWI. Florian Speelman, Farrokh Labib, Sven Polak, Bart Litjens and Bart Sevenster I thank for numerous valuable research discussions.

Bikkie Aldeias and Rob van Rooijen I thank for their excellent library services. Martijn Zuiddam and Maris Ozols I thank for proofreading the draft of this dissertation.

Finally, I thank my parents and my brothers and my friends for their support.

Amsterdam, August 31, 2018
Jeroen Zuiddam

Publications

[BCPZ16] Harry Buhrman, Matthias Christandl, Christopher Perry, and Jeroen Zuiddam. Clean quantum and classical communication protocols. Physical Review Letters, 117:230503, 2016.
https://link.aps.org/doi/10.1103/PhysRevLett.117.230503
http://arxiv.org/abs/1605.07948

[BCZ17a] Markus Bläser, Matthias Christandl, and Jeroen Zuiddam. The border support rank of two-by-two matrix multiplication is seven. Manuscript, 2017.
https://arxiv.org/abs/1705.09652

[BCZ17b] Harry Buhrman, Matthias Christandl, and Jeroen Zuiddam. Nondeterministic Quantum Communication Complexity: the Cyclic Equality Game and Iterated Matrix Multiplication. In Christos H. Papadimitriou, editor, 8th Innovations in Theoretical Computer Science Conference (ITCS), 2017.
http://drops.dagstuhl.de/opus/volltexte/2017/8181

[BIZ17] Karl Bringmann, Christian Ikenmeyer, and Jeroen Zuiddam. On algebraic branching programs of small width. In Ryan O'Donnell, editor, 32nd Computational Complexity Conference (CCC), 2017.
https://doi.org/10.4230/LIPIcs.CCC.2017.20
https://arxiv.org/abs/1702.05328
Journal of the ACM, Vol. 65, No. 5, Article 32, 2018.
https://doi.org/10.1145/3209663

[BZ17] Jop Briët and Jeroen Zuiddam. On the orthogonal rank of Cayley graphs and impossibility of quantum round elimination. Quantum Information and Computation, 2017.
http://www.rintonpress.com/xxqic17/qic-17-12/0106-0116.pdf
https://arxiv.org/abs/1608.06113

[CJZ18] Matthias Christandl, Asger Kjærulff Jensen, and Jeroen Zuiddam. Tensor rank is not multiplicative under the tensor product. Linear Algebra and its Applications, 543:125–139, 2018.
https://doi.org/10.1016/j.laa.2017.12.020
https://arxiv.org/abs/1705.09379

[CVZ16] Matthias Christandl, Peter Vrana, and Jeroen Zuiddam. Asymptotic tensor rank of graph tensors: beyond matrix multiplication. Manuscript, 2016.
https://arxiv.org/abs/1609.07476

[CVZ18] Matthias Christandl, Peter Vrana, and Jeroen Zuiddam. Universal Points in the Asymptotic Spectrum of Tensors: Extended Abstract. In Proceedings of 50th Annual ACM SIGACT Symposium on the Theory of Computing (STOC), 2018.
https://doi.org/10.1145/3188745.3188766
https://arxiv.org/abs/1709.07851

[CZ18] Matthias Christandl and Jeroen Zuiddam. Tensor surgery and tensor rank. (Journal of) computational complexity, 2018.
https://doi.org/10.1007/s00037-018-0164-8
https://arxiv.org/abs/1606.04085

[Zui17] Jeroen Zuiddam. A note on the gap between rank and border rank. Linear Algebra and its Applications, 525:33–44, 2017.
https://doi.org/10.1016/j.laa.2017.03.015
http://arxiv.org/abs/1504.05597

[Zui18] Jeroen Zuiddam. The asymptotic spectrum of graphs and the Shannon capacity. Manuscript, 2018.
http://arxiv.org/abs/1807.00169

This dissertation is based on the above papers, with primary focus on the four highlighted papers.

Note on the relative contribution of the co-authors: for each paper, the contribution of the co-authors is approximately equal.

Chapter 1

Introduction

Volker Strassen published in 1969 his famous algorithm for multiplying any two $n \times n$ matrices using only $O(n^{2.81})$ rather than $O(n^3)$ arithmetical operations [Str69]. His discovery marked the beginning of a still ongoing line of research in the field of algebraic complexity theory, a line of research that by now touches several fields of mathematics, including algebraic geometry, representation theory, (quantum) information theory and combinatorics. This dissertation is inspired by and contributes to this line of research.

No further progress followed for almost 10 years after Strassen's discovery, despite the fact that "many scientists understood that discovery as a signal to attack the problem and to push the exponent further down" [Pan84]. Then in 1978, Pan improved the exponent from 2.81 to 2.79 [Pan78, Pan80]. One year later, Bini, Capovani, Lotti and Romani improved the exponent to 2.78 by constructing fast "approximative" algorithms for matrix multiplication and making these algorithms exact via the method of interpolation [BCRL79, Bin80]. Cast in the language of tensors, the result of Bini et al. corresponds to what we now call a "border rank" upper bound. The idea of studying approximative complexity or border complexity for algebraic problems has nowadays become an important theme in algebraic complexity theory.

Schönhage then obtained the exponent 2.55 by constructing a fast algorithm for computing many "disjoint" small matrix multiplications and transforming this into an algorithm for one large matrix multiplication [Sch81]. The upper bound was improved shortly after by works of Pan [Pan81], Romani [Rom82], and Coppersmith and Winograd [CW82], resulting in the exponent 2.50. Then in 1987, Strassen published the laser method, with which he obtained the exponent 2.48 [Str87]. The laser method was used in the same year by Coppersmith and Winograd to obtain the exponent 2.38 [CW87]. To do this they invented a method for constructing certain large combinatorial structures. This method, or actually the extended version that Strassen published in [Str91], we now call the Coppersmith–Winograd method. All further improvements on upper bounding the exponent essentially follow the framework of Coppersmith and Winograd, and the improvements do not affect the first two digits after the decimal point [CW90, Sto10, Wil12, LG14].

Define $\omega$ to be the optimal exponent in the complexity of matrix multiplication. We call $\omega$ the exponent of matrix multiplication. To summarise the above historical account on upper bounds: $\omega < 2.38$. On the other hand, the only lower bound we currently have is the trivial lower bound $2 \leq \omega$.

The history of upper bounds on the matrix multiplication exponent $\omega$, which began with Strassen's algorithm and ended with the Strassen laser method and Coppersmith–Winograd method, is well-known and well-documented, see e.g. [BCS97, Section 15.13]. However, there is remarkable work of Strassen on a theory of lower bounds for $\omega$ and similar types of exponents, and this work has received almost no attention. This theory of lower bounds is the theory of asymptotic spectra of tensors and is the topic of a series of papers by Strassen [Str86, Str87, Str88, Str91, Str05].

In the foregoing the word tensor has popped up twice (namely when we mentioned border rank, and just now when we mentioned asymptotic spectra of tensors), but we have not discussed at all why tensors should be relevant for understanding the complexity of matrix multiplication. First we give a mini course on tensors. A $k$-tensor $t = (t_{i_1, \ldots, i_k})_{i_1, \ldots, i_k}$ is a $k$-dimensional array of numbers from some field, say the complex numbers $\mathbb{C}$. Thus a 2-tensor is simply a matrix. A $k$-tensor is called simple if there exist $k$ vectors $v_1, \ldots, v_k$ such that the entries of $t$ are given by the products $t_{i_1, \ldots, i_k} = (v_1)_{i_1} \cdots (v_k)_{i_k}$ for all indices $i_j$. The tensor rank of $t$ is the smallest number $n$ such that $t$ can be written as a sum of $n$ simple tensors. Thus the tensor rank of a 2-tensor is simply its matrix rank. Returning to the problem of finding the complexity of matrix multiplication, there is a special 3-tensor, called the matrix multiplication tensor, that encodes the computational problem of multiplying two $2 \times 2$ matrices. This 3-tensor is commonly denoted by $\langle 2, 2, 2 \rangle$. It turns out that the matrix multiplication exponent $\omega$ is exactly the asymptotic rate of growth of the tensor rank of the "Kronecker powers" of the tensor $\langle 2, 2, 2 \rangle$. This important observation follows from the fundamental fact that the computational problem of multiplying matrices is "self-reducible". Namely, we can multiply two matrices by viewing them as block matrices and then perform matrix multiplication at the level of the blocks.
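As an illustration of the definitions of simple tensor and tensor rank (a minimal sketch, not taken from the dissertation), the following Python snippet stores a $k$-tensor as a dictionary from index tuples to entries. The helper names `simple_tensor`, `add_tensors` and `matrix_rank` are assumptions made for this example, indices are 0-based, and exact rational arithmetic stands in for the field.

```python
from fractions import Fraction
from itertools import product


def simple_tensor(*vectors):
    """Return the simple k-tensor with entries v1[i1] * ... * vk[ik]."""
    tensor = {}
    for index in product(*(range(len(v)) for v in vectors)):
        entry = 1
        for v, i in zip(vectors, index):
            entry *= v[i]
        tensor[index] = entry
    return tensor


def add_tensors(s, t):
    """Entrywise sum of two tensors of the same shape."""
    return {index: s[index] + t[index] for index in s}


def matrix_rank(rows):
    """Rank of a matrix over the rationals, by Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


# A 2-tensor built as a sum of two simple tensors (outer products).
t = add_tensors(simple_tensor([1, 0, 2], [1, 1, 0]),
                simple_tensor([0, 1, 1], [0, 1, 3]))
as_matrix = [[t[(i, j)] for j in range(3)] for i in range(3)]
print(matrix_rank(as_matrix))  # prints 2
```

Here the 2-tensor is a sum of two simple tensors and its matrix rank is 2, in line with the remark that tensor rank and matrix rank coincide for 2-tensors. For $k \geq 3$ no comparably simple procedure is given in the text, so the sketch stops at matrices.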

We wrap up this introductory story. To understand the computational complexity of matrix multiplication, one should understand the asymptotic rate of growth of the tensor rank of a certain family of tensors, a family that is obtained by taking powers of a fixed tensor. The theory of asymptotic spectra is the theory of bounds on such asymptotic parameters of tensors.

The main story line of this dissertation concerns the theory of asymptotic spectra. In Section 1.1 of this introduction we discuss in more detail the computational problem of multiplying matrices. In Section 1.2 we discuss the asymptotic spectrum of tensors and discuss a new result: an explicit description of infinitely many elements in the asymptotic spectrum of tensors. In Section 1.3 we consider a new higher-order Coppersmith–Winograd method.

The theory of asymptotic spectra of tensors is a special case of an abstract theory of asymptotic spectra of preordered semirings, which we discuss in Section 1.4. In Section 1.5 we apply this abstract theory to a new setting, namely to graphs. By doing this we obtain a new dual characterisation of the Shannon capacity of graphs.

The second story line of this dissertation is about degeneration, an algebraic kind of approximation related to the concept of border rank of Bini et al. We discuss degeneration in the context of tensors in Section 1.6. There is a combinatorial version of tensor degeneration which we call combinatorial degeneration. We discuss a new result regarding combinatorial degeneration in Section 1.7. Finally, Section 1.8 is about a new result concerning degeneration for algebraic branching programs, an algebraic model of computation.

We finish in Section 1.9 with a discussion of the organisation of this dissertation into chapters.

1.1 Matrix multiplication

In this section we discuss in more detail the computational problem of multiplying two matrices.

Algebraic complexity theory studies algebraic algorithms for algebraic problems. Roughly speaking, algebraic algorithms are algorithms that use only the basic arithmetical operations $+$ and $\times$ over some field, say $\mathbb{R}$ or $\mathbb{C}$. A fundamental example of an algebraic problem is matrix multiplication.

If we multiply two $n \times n$ matrices by computing the inner products between any row of the first matrix and any column of the second matrix, one by one, we need roughly $2 \cdot n^3$ arithmetical operations ($+$ and $\times$). For example, we can multiply two $2 \times 2$ matrices with 12 arithmetical operations, namely 8 multiplications and 4 additions:

\[
\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
\begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix}
=
\begin{pmatrix}
a_{11} b_{11} + a_{12} b_{21} & a_{11} b_{12} + a_{12} b_{22} \\
a_{21} b_{11} + a_{22} b_{21} & a_{21} b_{12} + a_{22} b_{22}
\end{pmatrix}.
\]

Since matrix multiplication is a basic operation in linear algebra, it is worthwhile to see if we can do better than $2 \cdot n^3$. In 1969, Strassen [Str69] published a better algorithm. The base routine of Strassen's algorithm is an algorithm for multiplying two $2 \times 2$ matrices with 7 multiplications, 18 additions and certain sign changes:

\[
\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
\begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix}
=
\begin{pmatrix}
x_1 + x_4 - x_5 + x_7 & x_3 + x_5 \\
x_2 + x_4 & x_1 + x_3 - x_2 + x_6
\end{pmatrix}
\]

with

\begin{align*}
x_1 &= (a_{11} + a_{22})(b_{11} + b_{22}) \\
x_2 &= (a_{21} + a_{22}) b_{11} \\
x_3 &= a_{11} (b_{12} - b_{22}) \\
x_4 &= a_{22} (-b_{11} + b_{21}) \\
x_5 &= (a_{11} + a_{12}) b_{22} \\
x_6 &= (-a_{11} + a_{21})(b_{11} + b_{12}) \\
x_7 &= (a_{12} - a_{22})(b_{21} + b_{22}).
\end{align*}

The general routine of Strassen's algorithm multiplies two $n \times n$ matrices by recursively dividing the matrices into four blocks and applying the base routine to multiply the blocks (this is the self-reducibility of matrix multiplication that we mentioned earlier). The base routine does not assume commutativity of the variables for correctness, so indeed we can take the variables to be matrices. After expanding the recurrence we see that Strassen's algorithm uses $4.7 \cdot n^{\log_2 7} \approx 4.7 \cdot n^{2.81}$ arithmetical operations. Over the years, Strassen's algorithm was improved by many researchers. The best algorithm known today uses $C \cdot n^{2.38}$ arithmetical operations, where $C$ is some constant [CW90, Sto10, Wil12, LG14]. The exponent of matrix multiplication $\omega$ is the infimum over all real numbers $\beta$ such that for some constant $C_\beta$ we can multiply, for any $n \in \mathbb{N}$, any two $n \times n$ matrices with at most $C_\beta \cdot n^\beta$ arithmetical operations. From the above it follows that $\omega \leq 2.38$. From a simple flattening argument it follows that $2 \leq \omega$. We are left with the following well-known open problem: what is the value of the matrix multiplication exponent $\omega$?
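The base routine and its recursive use can be sketched in Python as follows. This is an illustrative sketch under simplifying assumptions, not an optimised implementation: matrices are lists of lists, $n$ is assumed to be a power of two, and the names `strassen`, `naive`, `add` and `sub` are chosen for this example. The counter records scalar multiplications only.

```python
import random


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def naive(a, b):
    """Row-by-column multiplication, using n^3 scalar multiplications."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def strassen(a, b, counter):
    """Multiply n x n matrices (n a power of two) with Strassen's base routine.

    counter[0] is incremented once per scalar multiplication.
    """
    n = len(a)
    if n == 1:
        counter[0] += 1
        return [[a[0][0] * b[0][0]]]
    h = n // 2
    # Split both matrices into four h x h blocks.
    a11, a12 = [r[:h] for r in a[:h]], [r[h:] for r in a[:h]]
    a21, a22 = [r[:h] for r in a[h:]], [r[h:] for r in a[h:]]
    b11, b12 = [r[:h] for r in b[:h]], [r[h:] for r in b[:h]]
    b21, b22 = [r[:h] for r in b[h:]], [r[h:] for r in b[h:]]
    # The seven block products x1..x7; the left factor always comes from a,
    # so commutativity of the entries is never used.
    x1 = strassen(add(a11, a22), add(b11, b22), counter)
    x2 = strassen(add(a21, a22), b11, counter)
    x3 = strassen(a11, sub(b12, b22), counter)
    x4 = strassen(a22, sub(b21, b11), counter)
    x5 = strassen(add(a11, a12), b22, counter)
    x6 = strassen(sub(a21, a11), add(b11, b12), counter)
    x7 = strassen(sub(a12, a22), add(b21, b22), counter)
    c11 = add(sub(add(x1, x4), x5), x7)
    c12 = add(x3, x5)
    c21 = add(x2, x4)
    c22 = add(sub(add(x1, x3), x2), x6)
    top = [r1 + r2 for r1, r2 in zip(c11, c12)]
    bottom = [r1 + r2 for r1, r2 in zip(c21, c22)]
    return top + bottom


random.seed(0)
n = 8
a = [[random.randint(-9, 9) for _ in range(n)] for _ in range(n)]
b = [[random.randint(-9, 9) for _ in range(n)] for _ in range(n)]
counter = [0]
c = strassen(a, b, counter)
print(c == naive(a, b), counter[0], n ** 3)
```

For $n = 8$ the recursion has depth 3, so the counter ends at $7^3 = 343$ scalar multiplications, compared with $8^3 = 512$ for the row-by-column method. The sketch does not count additions, which is why it says nothing about the constant 4.7.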

The constant $C$ for the currently best algorithm is impractically large (for a discussion of this issue see e.g. [Pan18]). For a practical fast algorithm, one should either improve $C$ or find a balance between $C$ and the exponent of $n$. We will ignore the size of $C$ in this dissertation and focus on the exponent $\omega$. For an overview of the field of algebraic complexity theory, the reader should consult [BCS97] and [Sap16].

1.2 The asymptotic spectrum of tensors

We now discuss the theory of asymptotic spectra for tensors.

Let $s$ and $t$ be $k$-tensors over a field $\mathbb{F}$, $s \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$, $t \in \mathbb{F}^{m_1} \otimes \cdots \otimes \mathbb{F}^{m_k}$. We say $s$ restricts to $t$ and write $s \geq t$ if there are linear maps $A_i : \mathbb{F}^{n_i} \to \mathbb{F}^{m_i}$ such that $(A_1 \otimes \cdots \otimes A_k)(s) = t$. Let $[n] := \{1, \ldots, n\}$ for $n \in \mathbb{N}$. We define the product $s \otimes t \in \mathbb{F}^{n_1 m_1} \otimes \cdots \otimes \mathbb{F}^{n_k m_k}$ by $(s \otimes t)_{(i_1, j_1), \ldots, (i_k, j_k)} = s_{i_1, \ldots, i_k} t_{j_1, \ldots, j_k}$ for $i \in [n_1] \times \cdots \times [n_k]$ and $j \in [m_1] \times \cdots \times [m_k]$. This product generalizes the well-known Kronecker product of matrices. We refer to this product as the tensor (Kronecker) product. We define the direct sum $s \oplus t \in \mathbb{F}^{n_1 + m_1} \otimes \cdots \otimes \mathbb{F}^{n_k + m_k}$ by $(s \oplus t)_{\ell_1, \ldots, \ell_k} = s_{\ell_1, \ldots, \ell_k}$ if $\ell \in [n_1] \times \cdots \times [n_k]$, $(s \oplus t)_{n_1 + \ell_1, \ldots, n_k + \ell_k} = t_{\ell_1, \ldots, \ell_k}$ if $\ell \in [m_1] \times \cdots \times [m_k]$, and $(s \oplus t)_{\ell_1, \ldots, \ell_k} = 0$ for the remaining indices.
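The two operations can be sketched directly from these index formulas. In the following Python sketch (an illustration, with the names `kron`, `direct_sum`, `from_matrix` and `to_matrix` chosen here), a tensor is a dictionary from 0-based index tuples to entries together with its shape, and the index pair $(i_a, j_a)$ of the Kronecker product is flattened to the single index $i_a \cdot m_a + j_a$; this flattening convention is an assumption of the example.

```python
from itertools import product


def kron(s, s_shape, t, t_shape):
    """Tensor (Kronecker) product: entry at ((i1,j1),...,(ik,jk)) is s[i] * t[j]."""
    shape = tuple(n * m for n, m in zip(s_shape, t_shape))
    out = {}
    for i in product(*(range(n) for n in s_shape)):
        for j in product(*(range(m) for m in t_shape)):
            index = tuple(ia * m + ja for ia, ja, m in zip(i, j, t_shape))
            out[index] = s[i] * t[j]
    return out, shape


def direct_sum(s, s_shape, t, t_shape):
    """Direct sum: s in the first corner, t shifted by s_shape, zero elsewhere."""
    shape = tuple(n + m for n, m in zip(s_shape, t_shape))
    out = {index: 0 for index in product(*(range(d) for d in shape))}
    for i in product(*(range(n) for n in s_shape)):
        out[i] = s[i]
    for j in product(*(range(m) for m in t_shape)):
        out[tuple(n + ja for n, ja in zip(s_shape, j))] = t[j]
    return out, shape


def from_matrix(rows):
    """View a matrix (list of rows) as a 2-tensor with its shape."""
    tensor = {(i, j): x for i, row in enumerate(rows) for j, x in enumerate(row)}
    return tensor, (len(rows), len(rows[0]))


def to_matrix(tensor, shape):
    return [[tensor[(i, j)] for j in range(shape[1])] for i in range(shape[0])]


s, s_shape = from_matrix([[1, 2], [3, 4]])
t, t_shape = from_matrix([[0, 1], [1, 0]])
kron_matrix = to_matrix(*kron(s, s_shape, t, t_shape))
sum_matrix = to_matrix(*direct_sum(s, s_shape, t, t_shape))
print(kron_matrix)
print(sum_matrix)
```

For 2-tensors the first result is the usual Kronecker product of the two matrices and the second is the block-diagonal matrix with blocks $s$ and $t$. The same functions apply unchanged to $k$-tensors with $k \geq 3$, since they only iterate over index tuples.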


The asymptotic restriction problem asks to compute the infimum of all real numbers $\beta \geq 0$ such that for all $n \in \mathbb{N}$

\[ s^{\otimes \beta n + o(n)} \geq t^{\otimes n}. \]

We may think of the asymptotic restriction problem as having two directions, namely to find

1. obstructions, "certificates" that prohibit $s^{\otimes \beta n + o(n)} \geq t^{\otimes n}$, or

2. constructions, linear maps that carry out $s^{\otimes \beta n + o(n)} \geq t^{\otimes n}$.

Ideally, we would like to find matching obstructions and constructions, so that we indeed learn the value of $\beta$.

What do obstructions look like? We set $\beta$ equal to one; it turns out that it is sufficient to understand this case. We say $s$ restricts asymptotically to $t$ and write $s \gtrsim t$ if

\[ s^{\otimes n + o(n)} \geq t^{\otimes n}. \]

What do obstructions look like for asymptotic restriction $\gtrsim$? More precisely, what do obstructions look like for $\gtrsim$ restricted to a subset $S \subseteq \{k\text{-tensors over } \mathbb{F}\}$? Let us assume $S$ is closed under direct sum and tensor product, and contains the diagonal tensors $\langle n \rangle = \sum_{i=1}^{n} e_i \otimes \cdots \otimes e_i$ for $n \in \mathbb{N}$, where $e_1, \ldots, e_n$ is the standard basis of $\mathbb{F}^n$. Let $X(S)$ be the set of all maps $\phi : S \to \mathbb{R}_{\geq 0}$ that are

(a) monotone under restriction $\geq$

(b) multiplicative under the tensor Kronecker product $\otimes$

(c) additive under the direct sum $\oplus$

(d) normalised to $\phi(\langle n \rangle) = n$ at the diagonal tensor $\langle n \rangle$.

The elements $\phi \in X(S)$ are called spectral points of $S$. The set $X(S)$ is called the asymptotic spectrum of $S$.

Spectral points $\phi \in X(S)$ are obstructions. Let $s, t \in S$. If $s \gtrsim t$, then by definition we have a restriction $s^{\otimes n + o(n)} \geq t^{\otimes n}$. Then (a) and (b) imply the inequality $\phi(s)^{n + o(n)} = \phi(s^{\otimes n + o(n)}) \geq \phi(t^{\otimes n}) = \phi(t)^n$. Taking $n$-th roots and letting $n \to \infty$, this implies $\phi(s) \geq \phi(t)$. We negate that statement: if $\phi(s) < \phi(t)$, then not $s \gtrsim t$. In that case $\phi$ is an obstruction to $s \gtrsim t$.

The remarkable fact is that $X(S)$ is a complete set of obstructions for $\gtrsim$. Namely, for $s, t \in S$ the asymptotic restriction $s \gtrsim t$ holds if and only if we have $\phi(s) \geq \phi(t)$ for all spectral points $\phi \in X(S)$. This was proven by Volker Strassen in [Str86, Str88]. His proof uses a theorem of Becker and Schwarz [BS83] which is commonly referred to as the Kadison–Dubois theorem (for historical reasons) or the real representation theorem. (We will say more about this completeness result in Section 1.4.)

Let us introduce tensor rank and subrank, and their asymptotic versions. The tensor rank of $t$ is the size of the smallest diagonal tensor that restricts to $t$, $R(t) = \min\{r \in \mathbb{N} : t \leqslant \langle r \rangle\}$, and the subrank of $t$ is the size of the largest diagonal tensor to which $t$ restricts, $Q(t) = \max\{r \in \mathbb{N} : \langle r \rangle \leqslant t\}$. Asymptotic rank is defined as $\tilde{R}(t) = \lim_{n\to\infty} R(t^{\otimes n})^{1/n}$ and asymptotic subrank is defined as $\tilde{Q}(t) = \lim_{n\to\infty} Q(t^{\otimes n})^{1/n}$. From Fekete's lemma it follows that $\tilde{Q}(t) = \sup_n Q(t^{\otimes n})^{1/n}$ and $\tilde{R}(t) = \inf_n R(t^{\otimes n})^{1/n}$. One easily verifies that every spectral point $\phi \in X(S)$ is an upper bound on asymptotic subrank and a lower bound on asymptotic rank: for any tensor $t \in S$,

$\tilde{Q}(t) \leq \phi(t) \leq \tilde{R}(t)$.

Strassen used the completeness of $X(S)$ for $\lesssim$ to prove $\tilde{Q}(t) = \min_{\phi \in X(S)} \phi(t)$ and $\tilde{R}(t) = \max_{\phi \in X(S)} \phi(t)$. One should think of these expressions as being dual to the defining expressions for $\tilde{Q}$ and $\tilde{R}$.

We mentioned that Strassen was motivated to study the asymptotic spectrum of tensors by the study of the complexity of matrix multiplication. The precise connection with matrix multiplication is as follows. The matrix multiplication exponent $\omega$ is characterised by the asymptotic rank $\tilde{R}(\langle 2, 2, 2 \rangle)$ of the matrix multiplication tensor

$\langle 2, 2, 2 \rangle = \sum_{i,j,k \in [2]} e_{ij} \otimes e_{jk} \otimes e_{ki} \in \mathbb{F}^4 \otimes \mathbb{F}^4 \otimes \mathbb{F}^4$

via $\tilde{R}(\langle 2, 2, 2 \rangle) = 2^{\omega}$. We know the trivial lower bound $2 \leq \omega$, see Section 4.3. We know the (nontrivial) upper bound $\omega \leq 2.3728639$, which is by Coppersmith and Winograd [CW90] and improvements by Stothers [Sto10], Williams [Wil12] and Le Gall [LG14]. It may seem that for the study of matrix multiplication only the asymptotic rank $\tilde{R}$ is of interest and that the asymptotic subrank $\tilde{Q}$ is just a toy parameter. Asymptotic subrank, however, plays an important role in the currently best matrix multiplication algorithms. We will discuss this idea in the context of the asymptotic subrank of so-called complete graph tensors in Section 5.5.
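To see how a rank bound on $\langle 2, 2, 2 \rangle$ turns into a bound on $\omega$, one can look at Strassen's classical scheme, which multiplies two $2 \times 2$ matrices using 7 multiplications. This shows $R(\langle 2, 2, 2 \rangle) \leq 7$ and therefore $\omega \leq \log_2 7$. The scheme is a standard illustration that is not taken from the text above, and the function names are illustrative.

```python
import math
from itertools import product

def strassen_2x2(A, B):
    """Multiply two 2x2 matrices using 7 multiplications (Strassen's scheme)."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]

def naive_2x2(A, B):
    """Multiply two 2x2 matrices using the usual 8 multiplications."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def agrees_on_small_inputs():
    """Compare both schemes on all matrices with entries in {-1, 0, 1}."""
    for e in product((-1, 0, 1), repeat=8):
        A = [[e[0], e[1]], [e[2], e[3]]]
        B = [[e[4], e[5]], [e[6], e[7]]]
        if strassen_2x2(A, B) != naive_2x2(A, B):
            return False
    return True

print(agrees_on_small_inputs())
print(math.log2(7))  # upper bound on omega implied by rank at most 7
```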

The important message is: understanding the asymptotic spectrum of tensors $X(S)$ means understanding asymptotic restriction $\lesssim$, the asymptotic subrank $\tilde{Q}$ and the asymptotic rank $\tilde{R}$ of tensors. Of course, we should now find an explicit description of $X(S)$.

Our main result regarding the asymptotic spectrum of tensors is the explicit description of an infinite family of elements in the asymptotic spectrum of all complex tensors $X(\{\text{complex } k\text{-tensors}\})$, which we call the quantum functionals (Chapter 6). Finding such an infinite family has been an open problem since the work of Strassen. Moment polytopes (studied under the name entanglement polytopes in quantum information theory [WDGC13]) play a key role here. To each tensor $t$ is associated a convex polytope $P(t)$ collecting representation-theoretic information about $t$, called the moment polytope of $t$. (See e.g. [Nes84, Bri87, WDGC13, SOK14].) The moment polytope has two important equivalent descriptions.

Quantum marginal spectra description. We begin with the description of $P(t)$ in terms of quantum marginal spectra.

Let $V$ be a (finite-dimensional) Hilbert space. In quantum information theory, a positive semidefinite hermitian operator $\rho : V \to V$ with trace one is called a density operator. The sequence of eigenvalues of a density operator $\rho$ is a probability vector. We let $\mathrm{spec}(\rho) = (p_1, \ldots, p_n)$ be the sequence of eigenvalues of $\rho$, ordered non-increasingly, $p_1 \geq \cdots \geq p_n$. Let $V_1$ and $V_2$ be Hilbert spaces. Given a density operator $\rho$ on $V_1 \otimes V_2$, the reduced density operator $\rho_1 = \mathrm{tr}_2\, \rho$ is uniquely defined by the property that $\mathrm{tr}(\rho_1 X_1) = \mathrm{tr}(\rho (X_1 \otimes \mathrm{Id}_{V_2}))$ for all operators $X_1$ on $V_1$. The operator $\rho_1$ is again a density operator. The operation $\mathrm{tr}_2$ is called the partial trace over $V_2$. In an explicit form, $\rho_1$ is given by $\langle e_i, \rho_1(e_j) \rangle = \sum_{\ell} \langle e_i \otimes f_\ell, \rho(e_j \otimes f_\ell) \rangle$, where the $e_i$ form a basis of $V_1$ and the $f_\ell$ form an orthonormal basis of $V_2$ (the statement is independent of basis choice).
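The explicit formula for the partial trace can be tried out on small examples. The sketch below assumes real integer coordinate vectors in the standard basis of $V_1 \otimes V_2$, with the basis vector $e_i \otimes f_\ell$ stored at position `i * d2 + l`; the function names are illustrative. It computes $\rho = t t^* / \langle t, t \rangle$ exactly and then sums over the second factor.

```python
from fractions import Fraction

def density_operator(t):
    """rho = t t^* / <t, t> for a nonzero real integer vector t."""
    norm = sum(x * x for x in t)
    return [[Fraction(x * y, norm) for y in t] for x in t]

def partial_trace_2(rho, d1, d2):
    """Reduced operator rho_1: entry (i, j) is the sum over l of rho[(i,l)][(j,l)]."""
    return [[sum(rho[i * d2 + l][j * d2 + l] for l in range(d2))
             for j in range(d1)] for i in range(d1)]

# Entangled vector e_0 (x) f_0 + e_1 (x) f_1 in a 2 x 2 dimensional space.
entangled = [1, 0, 0, 1]
# Product vector e_0 (x) f_0.
product_vector = [1, 0, 0, 0]

rho1_entangled = partial_trace_2(density_operator(entangled), 2, 2)
rho1_product = partial_trace_2(density_operator(product_vector), 2, 2)

print(rho1_entangled)  # maximally mixed: both eigenvalues equal 1/2
print(rho1_product)    # rank one: eigenvalues 1 and 0
```

In both cases the reduced operator has trace one, in line with the statement that $\rho_1$ is again a density operator.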

Let $V_i$ be a Hilbert space and consider the tensor product $V_1 \otimes V_2 \otimes V_3$. Associate with $t \in V_1 \otimes V_2 \otimes V_3$ the dual element $t^* = \langle t, \cdot \rangle \in (V_1 \otimes V_2 \otimes V_3)^*$. Then $\rho^t = t t^* / \langle t, t \rangle = t \langle t, \cdot \rangle / \langle t, t \rangle$ is a density operator on $V_1 \otimes V_2 \otimes V_3$. Viewing $\rho^t$ as a density operator on the regrouped space $V_1 \otimes (V_2 \otimes V_3)$, we may take the partial trace of $\rho^t$ over $V_2 \otimes V_3$ as described above. We denote the resulting density operator by $\rho^t_1 = \mathrm{tr}_{23}\, \rho^t$. We similarly define $\rho^t_2$ and $\rho^t_3$.

Let $V = V_1 \otimes V_2 \otimes V_3$. Let $G = \mathrm{GL}(V_1) \times \mathrm{GL}(V_2) \times \mathrm{GL}(V_3)$ act naturally on $V$. Let $t \in V \setminus \{0\}$. The moment polytope of $t$ is

$P(t) = P(\overline{G \cdot t}) = \{(\mathrm{spec}(\rho^u_1), \mathrm{spec}(\rho^u_2), \mathrm{spec}(\rho^u_3)) : u \in \overline{G \cdot t} \setminus \{0\}\}$.

Here $\overline{G \cdot t}$ denotes the Zariski closure, or equivalently the Euclidean closure, in $V$ of the orbit $G \cdot t = \{g \cdot t : g \in G\}$.

Representation-theoretic description. On the other hand, there is a description of $P(t)$ in terms of non-vanishing of representation-theoretic multiplicities. We do not state this description here, but stress that it is crucial for our proofs.

Quantum functionals. For any probability vector $\theta \in \mathbb{R}^k$ (i.e. $\sum_{i=1}^{k} \theta(i) = 1$ and $\theta(i) \geq 0$ for all $i \in [k]$) we define the quantum functional $F^\theta$ as an optimisation over the moment polytope:

$F^\theta(t) = \max\{2^{\sum_{i=1}^{k} \theta(i) H(x^{(i)})} : (x^{(1)}, \ldots, x^{(k)}) \in P(t)\}$.

Here $H(y)$ denotes Shannon entropy of the probability vector $y$. We prove that $F^\theta$ satisfies properties (a), (b), (c) and (d) for all complex $k$-tensors.

Theorem (Theorem 6.11). $F^\theta \in X(\{\text{complex } k\text{-tensors}\})$.

To put our result into context: Strassen in [Str91] constructed elements in the asymptotic spectrum of $S = \{\text{oblique } k\text{-tensors over } \mathbb{F}\}$ with the preorder $\leqslant|_S$. The set $S$ is a strict and non-generic subset of all $k$-tensors over $\mathbb{F}$. These elements we call the (Strassen) support functionals. On oblique tensors over $\mathbb{C}$, the quantum functionals and the support functionals coincide. An advantage of the support functionals over the quantum functionals is that they are defined over any field. In fact, the support functionals are "powerful enough" to reprove the result of Ellenberg–Gijswijt on cap sets [EG17]. We discuss the support functionals in Section 4.4.

1.3 Higher-order CW method

Recall that in the asymptotic restriction problem we have an obstruction direction and a construction direction. The quantum functionals and the support functionals provide obstructions. Now we look at the construction direction. Constructions are asymptotic transformations $s^{\otimes(\beta n + o(n))} \geqslant t^{\otimes n}$. We restrict attention to the case that $t$ is a diagonal tensor $\langle r \rangle$. Constructions in this case essentially correspond to lower bounds on the asymptotic subrank $\tilde{Q}(s)$. The goal is now to construct good lower bounds on $\tilde{Q}(s)$.

Strassen solved the problem of computing the asymptotic subrank for so-called tight 3-tensors with the Coppersmith–Winograd (CW) method and the support functionals [CW90, Str91]. The CW method is combinatorial. Let us introduce the combinatorial viewpoint. Let $I_1, \ldots, I_k$ be finite sets. We call a set $D \subseteq I_1 \times \cdots \times I_k$ a diagonal if any two distinct elements $a, b \in D$ differ in all $k$ coordinates. Let $\Phi \subseteq I_1 \times \cdots \times I_k$. We call a diagonal $D \subseteq \Phi$ free if $D = \Phi \cap (D_1 \times \cdots \times D_k)$. Here $D_i = \{a_i : a \in D\}$ is the projection of $D$ onto the $i$th coordinate. The subrank $Q(\Phi)$ of $\Phi$ is the size of the largest free diagonal $D \subseteq \Phi$. For two sets $\Phi \subseteq I_1 \times \cdots \times I_k$ and $\Psi \subseteq J_1 \times \cdots \times J_k$ we define the product $\Phi \times \Psi \subseteq (I_1 \times J_1) \times \cdots \times (I_k \times J_k)$ by $\Phi \times \Psi = \{((a_1, b_1), \ldots, (a_k, b_k)) : a \in \Phi, b \in \Psi\}$. The asymptotic subrank is defined as $\tilde{Q}(\Phi) = \lim_{n\to\infty} Q(\Phi^{\times n})^{1/n}$. One may think of $\Phi$ as a $k$-partite hypergraph and of a free diagonal in $\Phi$ as an induced $k$-partite matching.
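The combinatorial definitions above translate directly into a brute-force search. The following sketch represents $\Phi$ as a set of $k$-tuples and tries all subsets, so it is only meant for very small examples. The function names and the two example sets are illustrative assumptions, not part of the text.

```python
from itertools import combinations

def is_diagonal(D):
    """Any two distinct elements differ in all coordinates."""
    return all(all(x != y for x, y in zip(a, b)) for a, b in combinations(D, 2))

def is_free_diagonal(D, Phi):
    """D is a diagonal inside Phi with D = Phi intersected with D_1 x ... x D_k."""
    D, Phi = set(D), set(Phi)
    if not D or not D <= Phi or not is_diagonal(D):
        return False
    k = len(next(iter(D)))
    projections = [{a[i] for a in D} for i in range(k)]
    box = {a for a in Phi if all(a[i] in projections[i] for i in range(k))}
    return box == D

def subrank(Phi):
    """Size of the largest free diagonal in Phi, by exhaustive search."""
    elements = sorted(set(Phi))
    for size in range(len(elements), 0, -1):
        if any(is_free_diagonal(D, elements) for D in combinations(elements, size)):
            return size
    return 0

def product(Phi, Psi):
    """The product set, pairing coordinates: ((a_1, b_1), ..., (a_k, b_k))."""
    return {tuple(zip(a, b)) for a in Phi for b in Psi}

diagonal_support = {(0, 0, 0), (1, 1, 1)}
w_support = {(0, 1, 1), (1, 0, 1), (1, 1, 0)}

print(subrank(diagonal_support))
print(subrank(w_support))
print(subrank(product(w_support, w_support)))
```

In the second example any two elements agree in some coordinate, so no diagonal of size two exists, while the product of that set with itself does contain a free diagonal of size two. This is the kind of gain that the asymptotic subrank measures.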

How does this combinatorial version of subrank relate to the tensor version of subrank that we defined earlier? Let $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$. Expand $t$ in the standard basis, $t = \sum_{i \in [n_1] \times \cdots \times [n_k]} t_i\, e_{i_1} \otimes \cdots \otimes e_{i_k}$. Let $\mathrm{supp}(t)$ be the support of $t$ in the standard basis, $\mathrm{supp}(t) = \{i \in [n_1] \times \cdots \times [n_k] : t_i \neq 0\}$. Then $Q(\mathrm{supp}(t)) \leq Q(t)$.

We want to construct large free diagonals. Let $\Phi \subseteq I_1 \times \cdots \times I_k$. We call $\Phi$ tight if there are injective maps $\alpha_i : I_i \to \mathbb{Z}$ such that: if $a \in \Phi$, then $\sum_{i=1}^{k} \alpha_i(a_i) = 0$. For a set $X$ let $\mathcal{P}(X)$ be the set of probability distributions on $X$. For $\theta \in \mathcal{P}([k])$ let $H_\theta(\Phi) = \max_{P \in \mathcal{P}(\Phi)} \sum_{i=1}^{k} \theta(i) H(P_i)$, where $H(P_i)$ denotes the Shannon entropy of the $i$th marginal distribution of $P$. In [Str91] Strassen used the CW method and the support functionals to characterise the asymptotic subrank $\tilde{Q}(\Phi)$ for tight $\Phi \subseteq I_1 \times I_2 \times I_3$. He proved the following. Let $\Phi \subseteq I_1 \times I_2 \times I_3$ be tight. Then

$\tilde{Q}(\Phi) = \min_{\theta \in \mathcal{P}([3])} 2^{H_\theta(\Phi)} = \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} 2^{H(P_i)}$. (1.1)
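As a small numerical illustration of the objects in (1.1), the sketch below checks a tightness witness for one example set and evaluates $\min_i 2^{H(P_i)}$ at the uniform distribution. The example set, the maps `alphas` and all function names are assumptions made for illustration; entropy is taken in base 2. Evaluating at a single distribution only gives a lower bound on the maximum over all $P$, not necessarily its value.

```python
import math
from collections import Counter

def is_tightness_witness(Phi, alphas):
    """alphas[i] is a dict I_i -> int; check injectivity and the zero-sum condition."""
    injective = all(len(set(al.values())) == len(al) for al in alphas)
    zero_sum = all(sum(al[a_i] for al, a_i in zip(alphas, a)) == 0 for a in Phi)
    return injective and zero_sum

def shannon_entropy(p):
    """Shannon entropy (base 2) of a probability vector."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def marginal(P, i):
    """The i-th marginal of a distribution P given as a dict element -> probability."""
    m = Counter()
    for a, pa in P.items():
        m[a[i]] += pa
    return list(m.values())

def min_marginal_value(P, k):
    """min over i of 2^{H(P_i)}, the inner expression on the right of (1.1)."""
    return min(2 ** shannon_entropy(marginal(P, i)) for i in range(k))

Phi = [(0, 1, 1), (1, 0, 1), (1, 1, 0)]
alphas = [{0: 0, 1: 1}, {0: 0, 1: 1}, {0: -2, 1: -1}]
uniform = {a: 1 / len(Phi) for a in Phi}

print(is_tightness_witness(Phi, alphas))
print(min_marginal_value(uniform, 3))  # a lower bound on the maximum in (1.1)
```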

We study the higher-order regime $\Phi \subseteq I_1 \times \cdots \times I_k$, $k \geq 4$.

Theorem (Theorem 5.7). Let $\Phi \subseteq I_1 \times \cdots \times I_k$ be tight. Then $\tilde{Q}(\Phi)$ is lower bounded by an expression that generalizes the right-hand side of (1.1).

Stating the lower bound requires a few definitions, so we do not state it here. It is not known whether our new lower bound matches the upper bound given by quantum or support functionals.

Using Theorem 5.7 we managed to exactly determine the asymptotic subranks of several new examples. These results we in turn used to obtain upper bounds on the asymptotic rank of so-called complete graph tensors, via a higher-order Strassen laser method.

1.4 Abstract asymptotic spectra

Strassen mainly studied tensors, but he developed an abstract theory of asymptotic spectra in a general setting. In the next section we apply this abstract theory to graphs. We now introduce the abstract theory. One has a semiring $S$ (think of a semiring as a ring without additive inverses) that contains $\mathbb{N}$ and a preorder $\leqslant$ on $S$ that (1) behaves well with respect to the semiring operations, (2) induces the natural order on $\mathbb{N}$, and (3) for any $a, b \in S$, $b \neq 0$, there is an $r \in \mathbb{N} \subseteq S$ with $a \leqslant r \cdot b$. We call such a preorder a Strassen preorder. The main theorem is that the asymptotic version $\lesssim$ of the Strassen preorder is characterised by the monotone semiring homomorphisms $S \to \mathbb{R}_{\geq 0}$. For $a, b \in S$, let $a \lesssim b$ if there is a sequence $(x_n) \in \mathbb{N}^{\mathbb{N}}$ with $x_n^{1/n} \to 1$ when $n \to \infty$ and $a^n \leqslant b^n x_n$ for all $n \in \mathbb{N}$. Let

$X = X(S, \leqslant) = \{\phi \in \mathrm{Hom}(S, \mathbb{R}_{\geq 0}) : \forall a, b \in S,\ a \leqslant b \Rightarrow \phi(a) \leq \phi(b)\}$.

The set $X$ is called the asymptotic spectrum of $(S, \leqslant)$.

Theorem (Strassen). $a \lesssim b$ iff $\forall \phi \in X$, $\phi(a) \leq \phi(b)$.

Strassen applies this theorem to study rank and subrank of tensors. We define an abstract notion of rank $R(a) = \min\{n \in \mathbb{N} : a \leqslant n\}$ and an abstract notion of subrank $Q(a) = \max\{m \in \mathbb{N} : m \leqslant a\}$. We then naturally have an asymptotic rank $\tilde{R}(a) = \lim_{n\to\infty} R(a^n)^{1/n}$ and (under certain mild conditions) an asymptotic subrank $\tilde{Q}(a) = \lim_{n\to\infty} Q(a^n)^{1/n}$. In fact $\tilde{R}(a) = \inf_n R(a^n)^{1/n}$ and $\tilde{Q}(a) = \sup_n Q(a^n)^{1/n}$ by Fekete's lemma. The theorem implies the following dual characterisations.

Corollary (Section 2.8). If $a \in S$ with $a^k \geqslant 2$ for some $k \in \mathbb{N}$, then

$\tilde{Q}(a) = \min_{\phi \in X} \phi(a)$.

If $a \in S$ with $\phi(a) \geq 1$ for some $\phi \in X$, then

$\tilde{R}(a) = \max_{\phi \in X} \phi(a)$.

In Chapter 2 we will discuss the abstract theory of asymptotic spectra. We will discuss a proof of the above theorem that is obtained by integrating the proofs of Strassen in [Str88] and the proof of the Kadison–Dubois theorem of Becker and Schwarz in [BS83]. We will also discuss some basic properties of general asymptotic spectra.

1.5 The asymptotic spectrum of graphs

In the previous section we have seen the abstract theory of asymptotic spectra. We now discuss a problem in graph theory where we can apply this abstract theory. Consider a communication channel with input alphabet {a, b, c, d, e} and output alphabet {1, 2, 3, 4, 5}. When the sender gives an input to the channel, the receiver gets an output according to the following diagram, where an outgoing arrow is picked randomly (say uniformly randomly).

[Figure: channel diagram with the inputs a, b, c, d, e on the left, the outputs 1, 2, 3, 4, 5 on the right, and arrows from inputs to outputs.]

Output 2 has an incoming arrow from a and an incoming arrow from b. We say a and b are confusable, because the receiver cannot know whether a or b was given as an input to the channel. In this channel the pairs of inputs {a, b}, {b, c}, {c, d}, {d, e}, {e, a} are confusable. If we restrict the input set to a subset of pairwise non-confusable letters, say {a, c}, then we can use the channel to communicate two messages with zero error. It is clear that for this channel any non-confusable set of inputs has size at most two. Can we make better use of the channel if we use the channel twice? Yes: now the input set is the set of two-letter words {aa, ab, ac, ad, ae, ba, bb, ...} and we have a set of pairwise non-confusable words {aa, bc, ce, db, ed}, which has size 5. Thus "per channel use" we can send at least $\sqrt{5}$ letters. What happens if we use the channel $n$ times?
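The claims about this channel are small enough to check mechanically. The sketch below encodes the five confusable pairs as adjacency on a cycle and treats two distinct words as confusable when, in every position, the letters are equal or confusable (the convention that matches the strong graph product used later in this section). The function names are illustrative.

```python
from itertools import combinations

LETTERS = "abcde"

def confusable(x, y):
    """Distinct letters are confusable iff they are neighbours on the cycle a-b-c-d-e-a."""
    return (LETTERS.index(x) - LETTERS.index(y)) % 5 in (1, 4)

def words_confusable(u, v):
    """Distinct words are confusable iff every position is equal or confusable."""
    return u != v and all(p == q or confusable(p, q) for p, q in zip(u, v))

def pairwise_non_confusable(words):
    return all(not words_confusable(u, v) for u, v in combinations(words, 2))

code = ["aa", "bc", "ce", "db", "ed"]
print(pairwise_non_confusable(code))

# Largest set of pairwise non-confusable single letters, by exhaustive search.
best_single = max(
    size
    for size in range(1, 6)
    for subset in combinations(LETTERS, size)
    if pairwise_non_confusable(subset)
)
print(best_single)
```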

The situation is concisely described by drawing the confusability graph of the channel, which has the input letters as vertices and the confusable pairs of input letters as edges. For the above channel the confusability graph is the 5-cycle $C_5$.

[Figure: the 5-cycle $C_5$ with vertices a, b, c, d, e.]

A subset of inputs that are pairwise non-confusable corresponds to a subset of the vertices in the confusability graph that contains no edges, an independent set. The independence number of any graph $G$ is the size of the largest independent set in $G$ and is denoted by $\alpha(G)$. If $G$ is the confusability graph of some channel, then the confusability graph for using the channel $n$ times is denoted by $G^{\boxtimes n}$ (the graph product $\boxtimes$ is called the strong graph product). The question of how many letters we can send asymptotically translates to computing the limit

$\Theta(G) = \lim_{n\to\infty} \alpha(G^{\boxtimes n})^{1/n}$,

which exists because $\alpha$ is supermultiplicative under $\boxtimes$. The parameter $\Theta(G)$ was introduced by Shannon [Sha56] and is called the Shannon capacity of the graph $G$. Computing the Shannon capacity is a nontrivial problem already for small graphs. Lovász in 1979 [Lov79] computed the value $\Theta(C_5) = \sqrt{5}$ by introducing and evaluating a new graph parameter $\vartheta$, which is now known as the Lovász theta number. Already for the 7-cycle $C_7$ the Shannon capacity is not known.
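For very small graphs the quantities $\alpha(G)$ and $\alpha(G^{\boxtimes 2})$ can be computed by exhaustive search. The following sketch stores a graph as a dictionary from vertices to neighbour sets, builds the strong product, and finds the independence number with a simple branch-and-bound. The representation and function names are illustrative choices, and the search is exponential in general.

```python
from itertools import product

def cycle(n):
    """The n-cycle as a dict vertex -> set of neighbours."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}

def strong_product(G, H):
    """Distinct pairs are adjacent iff each coordinate is equal or adjacent."""
    P = {}
    for u, x in product(G, H):
        P[(u, x)] = {(v, y) for v in G[u] | {u} for y in H[x] | {x}} - {(u, x)}
    return P

def independence_number(G):
    """Size of a largest independent set, by branch-and-bound."""
    best = 0

    def extend(candidates, size):
        nonlocal best
        if size + len(candidates) <= best:
            return  # cannot beat the best set found so far
        if not candidates:
            best = size
            return
        v, rest = candidates[0], candidates[1:]
        extend([w for w in rest if w not in G[v]], size + 1)  # take v
        extend(rest, size)                                    # skip v

    extend(sorted(G), 0)
    return best

C5 = cycle(5)
alpha_1 = independence_number(C5)
alpha_2 = independence_number(strong_product(C5, C5))
print(alpha_1, alpha_2, alpha_2 ** 0.5)
```

The second value gives the lower bound $\alpha(C_5^{\boxtimes 2})^{1/2}$ on $\Theta(C_5)$; higher powers quickly become too large for this naive search.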

Duality theorem. We propose a new application of the abstract theory of asymptotic spectra to graph theory. The main theorem that results from this is a dual characterisation of the Shannon capacity of graphs. For graphs $G$ and $H$ we say $G \leqslant H$ if there is a graph homomorphism $\overline{G} \to \overline{H}$, i.e. from the complement of $G$ to the complement of $H$. We show graphs are a semiring under the strong graph product $\boxtimes$ and the disjoint union $\sqcup$, and $\leqslant$ is a Strassen preorder on this semiring. The rank in this setting is the clique cover number $\overline{\chi}(\cdot) = \chi(\overline{\,\cdot\,})$, i.e. the chromatic number of the complement. The subrank in this setting is the independence number $\alpha(\cdot)$. Let $X(\mathcal{G})$ be the set of semiring homomorphisms from graphs to $\mathbb{R}_{\geq 0}$ that are monotone under $\leqslant$. From the abstract theory of asymptotic spectra we derive the following duality theorem.

Theorem (Theorem 3.1). $\Theta(G) = \min_{\phi \in X(\mathcal{G})} \phi(G)$.

In Chapter 3 we will prove Theorem 3.1 and we will discuss the known elements in $X(\mathcal{G})$, which are the Lovász theta number and a family of parameters obtained by "fractionalising".

1.6 Tensor degeneration

We move to the second story line that we mentioned earlier: degeneration. Degeneration is a prominent theme in algebraic complexity theory. Roughly speaking, degeneration is an algebraic notion of approximation defined via orbit closures.

For tensors, for example, degeneration is defined as follows. Let $V_1, V_2, V_3$ be finite-dimensional complex vector spaces and let $V = V_1 \otimes V_2 \otimes V_3$ be the tensor product space. Let $G = \mathrm{GL}(V_1) \times \mathrm{GL}(V_2) \times \mathrm{GL}(V_3)$ act naturally on $V$. Let $s, t \in V$. Let $G \cdot t = \{g \cdot t : g \in G\}$ be the orbit of $t$ under $G$. We say $t$ degenerates to $s$, and write $t \trianglerighteq s$, if $s$ is an element in the orbit closure $\overline{G \cdot t}$. Here the closure is taken with respect to the Zariski topology, or equivalently with respect to the Euclidean topology. One should think of this degeneration as a topologically closed version of the restriction preorder $\leq$ for tensors that we defined earlier. Degeneration is a "larger" preorder than restriction in the sense that $s \leq t$ implies $s \trianglelefteq t$.

In several algebraic models of computation, approximative computations correspond to certain degenerations. In some models such an approximative computation can be turned into an exact computation at a small cost, for example using the method of interpolation. The currently fastest matrix multiplication algorithms are constructed in this way, for example.

On the other hand, it turns out that if a lower bound technique for an algebraic measure of complexity is "continuous", then the lower bounds obtained with this technique are already lower bounds on the approximative version of the complexity measure. This observation turns approximative complexity and degeneration into an interesting topic itself. A research program in this direction is the geometric complexity theory program of Mulmuley and Sohoni towards separating the algebraic complexity class VP (and related classes) from VNP [MS01] (see also [Ike13]).

In this section we briefly discuss three results related to degeneration of tensors that are not discussed further in this dissertation. Then we will discuss results on combinatorial degeneration in Section 1.7 and algebraic branching program degeneration in Section 1.8.

Ratio of tensor rank and border rank. The approximative or degeneration version of tensor rank is called border rank and is denoted by $\underline{R}$. It has been known since the work of Bini and Strassen that tensor rank $R$ and border rank $\underline{R}$ are different. How much can they be different? In [Zui17] we showed the following lower bound. Let $k \geq 3$. There is a sequence of $k$-tensors $t_n$ in $(\mathbb{C}^{2^n})^{\otimes k}$ such that $R(t_n)/\underline{R}(t_n) \geq k - o(1)$ when $n \to \infty$. This answers a question of Landsberg and Michałek [LM16b] and disproves a conjecture of Rhodes [AJRS13]. Further progress will most likely require the construction of explicit tensors with high tensor rank, which has implications in formula complexity [Raz13].

Border support rank. Support rank is a variation on tensor rank, which has its own approximative version called border support rank. A border support rank upper bound for the matrix multiplication tensor yields an upper bound on the asymptotic complexity. This was shown by Cohn and Umans in the context of the group theoretic approach towards fast matrix multiplication [CU13]. They asked: what is the border support rank of the smallest matrix multiplication tensor $\langle 2, 2, 2 \rangle$? In [BCZ17a] we showed that it equals seven. Our proof uses the highest-weight vector technique (see also [HIL13]). Our original motivation to study support rank is a connection that we found between support rank and nondeterministic multiparty quantum communication complexity [BCZ17b].

Tensor rank under outer tensor product. We applied degeneration as a tool to study an outer tensor product on tensors. For $s \in \mathbb{C}^{n_1} \otimes \cdots \otimes \mathbb{C}^{n_k}$ and $t \in \mathbb{C}^{m_1} \otimes \cdots \otimes \mathbb{C}^{m_\ell}$, let the outer tensor product of $s$ and $t$ be the natural $(k + \ell)$-tensor in $\mathbb{C}^{n_1} \otimes \cdots \otimes \mathbb{C}^{n_k} \otimes \mathbb{C}^{m_1} \otimes \cdots \otimes \mathbb{C}^{m_\ell}$. The outer tensor product and the Kronecker product $\otimes$ differ by a regrouping of the tensor indices. It is well known that tensor rank is not multiplicative under the Kronecker product. In [CJZ18] we showed that tensor rank is already not multiplicative under the outer tensor product, a stronger result. Nonmultiplicativity occurs when taking a power of a tensor whose border rank is strictly smaller than its tensor rank. This answers a question of Draisma [Dra15] and Saptharishi et al. [CKSV16].

1.7 Combinatorial degeneration

In the previous section we introduced the general idea of degeneration and discussed degeneration of tensors. Combinatorial degeneration is the combinatorial analogue of tensor degeneration. Consider sets $\Phi \subseteq \Psi \subseteq I_1 \times \cdots \times I_k$ of $k$-tuples. We say $\Phi$ is a combinatorial degeneration of $\Psi$, and write $\Psi \trianglerighteq \Phi$, if there are maps $u_i : I_i \to \mathbb{Z}$ such that for all $\alpha \in I_1 \times \cdots \times I_k$: if $\alpha \in \Psi \setminus \Phi$, then $\sum_{i=1}^{k} u_i(\alpha_i) > 0$, and if $\alpha \in \Phi$, then $\sum_{i=1}^{k} u_i(\alpha_i) = 0$. We prove that combinatorial asymptotic subrank is nonincreasing under combinatorial degeneration.

Theorem (Theorem 5.21). If $\Psi \trianglerighteq \Phi$, then $\tilde{Q}(\Psi) \geq \tilde{Q}(\Phi)$.

The analogous statement for subrank of tensors is trivially true. The crucial point is that Theorem 5.21 is about combinatorial subrank. As an example, Theorem 5.21 combined with the CW method yields an elegant optimal construction of tri-colored sum-free sets, which are combinatorial objects related to cap sets.
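The definition of combinatorial degeneration is a finite check once candidate maps $u_i$ are given. The sketch below verifies the two conditions for a small example with $k = 2$; the sets, the maps and the function name are illustrative assumptions. It checks a given witness only and does not search for one.

```python
def is_degeneration_witness(Psi, Phi, u):
    """Check that the maps u = (u_1, ..., u_k), given as dicts, witness that
    Phi is a combinatorial degeneration of Psi."""
    Psi, Phi = set(Psi), set(Phi)
    if not Phi <= Psi:
        return False

    def weight(a):
        return sum(u_i[a_i] for u_i, a_i in zip(u, a))

    vanishes_on_phi = all(weight(a) == 0 for a in Phi)
    positive_elsewhere = all(weight(a) > 0 for a in Psi - Phi)
    return vanishes_on_phi and positive_elsewhere

Psi = {(0, 0), (1, 1), (0, 1)}
Phi = {(0, 0), (1, 1)}
u = [{0: 0, 1: -1}, {0: 0, 1: 1}]

print(is_degeneration_witness(Psi, Phi, u))
```

Here the weights are 0 on both elements of `Phi` and 1 on the remaining element `(0, 1)` of `Psi`.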

1.8 Algebraic branching program degeneration

We now consider degeneration in the context of algebraic branching programs. A central theme in algebraic complexity theory is the study of the power of different algebraic models of computation and the study of the corresponding complexity classes. We have already (implicitly) used an algebraic model of computation when we discussed matrix multiplication: circuits.

• A circuit is a directed acyclic graph $G$ with one or more source vertices and one sink vertex. Each source vertex is labelled by a variable $x_i$ or a constant $\alpha \in \mathbb{F}$. The other vertices are labelled by either $+$ or $\times$ and have in-degree 2 (that is, fan-in 2). Each vertex of $G$ naturally computes a polynomial. The value of $G$ is the element computed at the sink vertex. The size of $G$ is the number of vertices. (One may also allow multiple sink vertices in order to compute multiple polynomials, e.g. to compute matrix multiplication.) Here is an example of a circuit computing $xy + 2x + y - 1$.

[Figure: a circuit with source vertices $-1$, $2$, $x$, $y$, two $\times$ vertices, and three $+$ vertices, the last of which is the sink vertex.]
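A circuit in this sense can be evaluated by visiting its vertices in topological order. The sketch below uses a hypothetical wiring with the same sources and the same numbers of $\times$ and $+$ vertices as in the example (the exact wiring of the original figure is not recoverable here) and checks that it computes $xy + 2x + y - 1$ on a grid of inputs. All names are illustrative.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}

def evaluate_circuit(sources, gates, sink):
    """Evaluate gates listed in topological order; each gate has fan-in 2."""
    values = dict(sources)
    for name, op, left, right in gates:
        values[name] = OPS[op](values[left], values[right])
    return values[sink]

# Hypothetical wiring: two multiplication gates and three addition gates.
GATES = [
    ("g1", "*", "x", "y"),          # x * y
    ("g2", "*", "two", "x"),        # 2 * x
    ("g3", "+", "g1", "g2"),        # xy + 2x
    ("g4", "+", "y", "minus_one"),  # y - 1
    ("g5", "+", "g3", "g4"),        # sink: xy + 2x + y - 1
]

def example_circuit(x, y):
    sources = {"minus_one": -1, "two": 2, "x": x, "y": y}
    return evaluate_circuit(sources, GATES, "g5")

# Size = number of vertices: 4 sources plus 5 gates.
circuit_size = 4 + len(GATES)
print(example_circuit(3, 5), circuit_size)
```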

Consider the following two models.

• A formula is a circuit whose graph is a tree.

• An algebraic branching program (abp) is a directed acyclic graph $G$ with one source vertex $s$, one sink vertex $t$ and affine linear forms over the base field $\mathbb{F}$ as edge labels. Moreover, each vertex is labeled with an integer (its layer) and the arrows in the abp point from vertices in layer $i$ to vertices in layer $i + 1$. The cardinality of the largest layer we call the width of the abp. The number of vertices we call the size of the abp. The value of an abp is the sum of the values of all $s$–$t$-paths, where the value of an $s$–$t$-path is the product of its edge labels. We say that an abp computes its value. Here is an example of a width-3 abp computing $xy + 2x + y - 1$.

[Figure: a width-3 abp with source $s$, sink $t$ and edge labels built from $x$, $y$, $2$ and $-1$.]
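The value of an abp can be computed layer by layer, accumulating at each vertex the sum over all paths from $s$ of the product of edge labels. The sketch below uses a hypothetical abp with a single middle layer of two vertices (so it is not the width-3 example of the figure) that also computes $xy + 2x + y - 1$. Edge labels are affine linear forms represented as Python functions; all names are illustrative.

```python
def affine(const=0, **coeffs):
    """An affine linear form const + sum of coeff * variable, as a function of env."""
    return lambda env: const + sum(c * env[v] for v, c in coeffs.items())

def evaluate_abp(edges, env, source="s", sink="t"):
    """Sum over all source-sink paths of the product of edge labels.
    Edges must be listed layer by layer (by the layer of their tail)."""
    value = {source: 1}
    for tail, head, label in edges:
        value[head] = value.get(head, 0) + value.get(tail, 0) * label(env)
    return value.get(sink, 0)

# Hypothetical abp: s -> {v1, v2} -> t.
EDGES = [
    ("s", "v1", affine(x=1)),            # label x
    ("s", "v2", affine(const=1)),        # label 1
    ("v1", "t", affine(const=2, y=1)),   # label y + 2
    ("v2", "t", affine(const=-1, y=1)),  # label y - 1
]

def example_abp(x, y):
    return evaluate_abp(EDGES, {"x": x, "y": y})

print(example_abp(3, 5))
```

The two paths contribute $x(y + 2)$ and $1 \cdot (y - 1)$, whose sum is $xy + 2x + y - 1$.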

The above models of computation give rise to complexity classes. A complexity class consists of families of multivariate polynomials $(f_n)_n = (f_n(x_1, \ldots, x_{q_n}))_{n \in \mathbb{N}}$ over some fixed field $\mathbb{F}$. We say a family of polynomials $(f_n)_n$ is a p-family if the degree of $f_n$ and the number of variables of $f_n$ grow polynomially in $n$. Let $\mathrm{VP}$ be the class of p-families with polynomially bounded circuit size. Let $\mathrm{VP}_e$ be the class of p-families with polynomially bounded formula size. For $k \in \mathbb{N}$ let $\mathrm{VP}_k$ be the class of families of polynomials computable by width-$k$ abps of polynomially bounded size. Let $\mathrm{VP}_s$ be the class of p-families computable by skew circuits of polynomial size. Skew circuits are a type of circuits between formulas and general circuits. The class $\mathrm{VP}_s$ coincides with the class of families of polynomials computable by abps of polynomially bounded size (see e.g. [Sap16]). Ben-Or and Cleve proved that $\mathrm{VP}_3 = \mathrm{VP}_4 = \cdots = \mathrm{VP}_e$ [BOC92]. Allender and Wang proved $\mathrm{VP}_2 \subsetneq \mathrm{VP}_3$ [AW16]. Thus $\mathrm{VP}_2 \subsetneq \mathrm{VP}_3 = \mathrm{VP}_4 = \cdots = \mathrm{VP}_e \subseteq \mathrm{VP}_s$.

The following separation problem is one of the many open problems regarding algebraic complexity classes: Is the inclusion $\mathrm{VP}_e \subseteq \mathrm{VP}_s$ strict? Motivated by this separation problem we study the approximation closure of $\mathrm{VP}_e$. We mentioned that Ben-Or and Cleve proved that formula size is polynomially equivalent to width-3 abp size [BOC92]. Regarding width-2, there are explicit polynomials that cannot be computed by any width-2 abp of any size [AW16]. The abp model has a natural notion of approximation. When we allow approximation in our abps, the situation changes completely.

Theorem (Theorem 7.8). Any polynomial can be approximated by a width-2 abp of size polynomial in the formula size.

In terms of complexity classes this means $\overline{\mathrm{VP}_2} = \overline{\mathrm{VP}_e}$, where $\overline{\,\cdot\,}$ denotes the "approximation closure" of the complexity class. The theorem suggests an approach regarding the separation of $\mathrm{VP}_e$ and $\mathrm{VP}_s$. Namely, superpolynomial lower bounds on formula size may be obtained from superpolynomial lower bounds on approximate width-2 abp size. We moreover study the nondeterminism closure of complexity classes and prove a new characterisation of the complexity class VNP.

1.9 Organisation

This dissertation is divided into chapters as follows. We will begin with the abstract theory of asymptotic spectra in Chapter 2. Then we introduce the asymptotic spectra of graphs and a new characterisation of the Shannon capacity in Chapter 3. In Chapter 4 we introduce the asymptotic spectrum of tensors, discuss the support functionals of Strassen for oblique tensors and a characterisation of asymptotic slice rank of oblique tensors as the minimum over the support functionals. In Chapter 5 we discuss tight tensors, the higher-order Coppersmith–Winograd method, the combinatorial degeneration method, and applications to the cap set problem, type sets and graph tensors. In Chapter 6 we introduce an infinite family of elements in the asymptotic spectrum of complex $k$-tensors and characterise the asymptotic slice rank as the minimum over the quantum functionals. Finally, in Chapter 7 we study algebraic branching programs, and approximation closure and nondeterminism closure of algebraic complexity classes.

Chapter 2

The theory of asymptotic spectra

2.1 Introduction

This is an expository chapter about the abstract theory of asymptotic spectra of Volker Strassen [Str88]. The theory studies semirings $S$ that are endowed with a preorder $\leqslant$. The main result, Theorem 2.12, is that under certain conditions the asymptotic version $\lesssim$ of this preorder is characterised by the semiring homomorphisms $S \to \mathbb{R}_{\geq 0}$ that are monotone under $\leqslant$. These monotone homomorphisms make up the "asymptotic spectrum" of $(S, \leqslant)$. For the elements of $S$ we have natural notions of rank and subrank, generalising rank and subrank of tensors. The asymptotic spectrum gives a dual characterisation of the asymptotic versions of rank and subrank. This dual description may be thought of as a "lower bound" method in the sense of computational complexity theory. In Chapter 3 and Chapter 4 we will study two specific pairs $(S, \leqslant)$.

2.2 Semirings and preorders

A (commutative) semiring is a set $S$ with a binary addition operation $+$, a binary multiplication operation $\cdot$, and elements $0, 1 \in S$, such that for all $a, b, c \in S$:

(1) $+$ is associative: $(a + b) + c = a + (b + c)$

(2) $+$ is commutative: $a + b = b + a$

(3) $0 + a = a$

(4) $\cdot$ is associative: $(a \cdot b) \cdot c = a \cdot (b \cdot c)$

(5) $\cdot$ is commutative: $a \cdot b = b \cdot a$

(6) $1 \cdot a = a$

(7) $\cdot$ distributes over $+$: $a \cdot (b + c) = (a \cdot b) + (a \cdot c)$

(8) $0 \cdot a = 0$.

As usual we abbreviate $a \cdot b$ as $ab$. A preorder is a relation $\preccurlyeq$ on a set $X$ such that for all $a, b, c \in X$:

(1) $\preccurlyeq$ is reflexive: $a \preccurlyeq a$

(2) $\preccurlyeq$ is transitive: $a \preccurlyeq b$ and $b \preccurlyeq c$ implies $a \preccurlyeq c$.

As usual $a \preccurlyeq b$ is the same as $b \succcurlyeq a$. Let $\mathbb{N} = \{0, 1, 2, \ldots\}$ be the set of natural numbers and let $\mathbb{N}_{>0} = \{1, 2, \ldots\}$ be the set of strictly-positive natural numbers. We write $\leq$ for the natural order $0 \leq 1 \leq 2 \leq 3 \leq \cdots$ on $\mathbb{N}$.

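2.3 Strassen preorders

Let $S$ be a semiring with $\mathbb{N} \subseteq S$. A preorder $\preccurlyeq$ on $S$ is a Strassen preorder if

(1) $\forall n, m \in \mathbb{N}$: $n \leq m$ iff $n \preccurlyeq m$

(2) $\forall a, b, c, d \in S$: if $a \preccurlyeq b$ and $c \preccurlyeq d$, then $a + c \preccurlyeq b + d$ and $ac \preccurlyeq bd$

(3) $\forall a, b \in S$, $b \neq 0$, $\exists r \in \mathbb{N}$: $a \preccurlyeq rb$.

Note that condition (2) is equivalent to the condition: $\forall a, b, s \in S$, if $a \preccurlyeq b$, then $a + s \preccurlyeq b + s$ and $as \preccurlyeq bs$.

Let $\preccurlyeq$ be a Strassen preorder on $S$. Then $0 \preccurlyeq 1$ by condition (1). For $a \in S$ we have $a \preccurlyeq a$ by reflexivity, and thus $0 = 0 \cdot a \preccurlyeq 1 \cdot a = a$ by condition (2).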

Examples

We give two examples of a semiring with a Strassen preorder. Proofs and formal definitions are given later.

Graphs. Let $S$ be the set of all (isomorphism classes of) finite simple graphs. Let $G, H \in S$. Let $G \sqcup H$ be the disjoint union of $G$ and $H$. Let $G \boxtimes H$ be the strong graph product of $G$ and $H$ (see Chapter 3). With addition $\sqcup$ and multiplication $\boxtimes$ the set $S$ becomes a semiring. The 0 in $S$ is the graph with no vertices and the 1 in $S$ is the graph with a single vertex. Let $\overline{G}$ be the complement of $G$. Define a preorder $\leqslant$ on $S$ by $G \leqslant H$ if there is a graph homomorphism $\overline{G} \to \overline{H}$. Then $\leqslant$ is a Strassen preorder. We will investigate this semiring further in Chapter 3.

Tensors. Let $\mathbb{F}$ be a field. Let $k \in \mathbb{N}$. Let $S$ be the set of all $k$-tensors over $\mathbb{F}$ with arbitrary format, that is, $S = \bigcup \{\mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k} : n_1, \ldots, n_k \in \mathbb{N}\}$. For $s \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ and $t \in \mathbb{F}^{m_1} \otimes \cdots \otimes \mathbb{F}^{m_k}$, let $s \leqslant t$ if there are linear maps $A_i : \mathbb{F}^{m_i} \to \mathbb{F}^{n_i}$ with $(A_1 \otimes \cdots \otimes A_k) t = s$. We identify any $s, t \in S$ for which $s \leqslant t$ and $t \leqslant s$. Let $\oplus$ be the direct sum of $k$-tensors and let $\otimes$ be the tensor product of $k$-tensors (see Chapter 4). With addition $\oplus$ and multiplication $\otimes$ the set $S$ becomes a semiring. The 0 in $S$ is the zero tensor and the 1 in $S$ is the standard basis element $e_1 \otimes \cdots \otimes e_1 \in \mathbb{F}^1 \otimes \cdots \otimes \mathbb{F}^1$. The preorder $\leqslant$ is a Strassen preorder. We will investigate this semiring further in Chapter 4, Chapter 5 and Chapter 6.

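2.4 Asymptotic preorders $\precsim$

Definition 2.1. Let $\preccurlyeq$ be a relation on $S$. Define the relation $\precsim$ on $S$ by

$a_2 \precsim a_1$ if $\exists (x_N) \in \mathbb{N}^{\mathbb{N}}$ with $\inf_N x_N^{1/N} = 1$ and $\forall N \in \mathbb{N}$: $a_2^N \preccurlyeq a_1^N x_N$. (2.1)

If $\preccurlyeq$ is a Strassen preorder, then we may in (2.1) replace the infimum $\inf_N x_N^{1/N}$ by the limit $\lim_{N\to\infty} x_N^{1/N}$, since we may assume $x_{N+M} \leq x_N x_M$ (if $a_2^N \preccurlyeq a_1^N x_N$ and $a_2^M \preccurlyeq a_1^M x_M$, then $a_2^{N+M} \preccurlyeq a_1^{N+M} x_N x_M$) and then apply Fekete's lemma (Lemma 2.2) to the sequence $\log x_N$.

Lemma 2.2 (Fekete's lemma, see [PS98, No. 98]). Let $x_1, x_2, x_3, \ldots \in \mathbb{R}_{\geq 0}$ satisfy $x_{n+m} \leq x_n + x_m$. Then $\lim_{n\to\infty} x_n/n = \inf_n x_n/n$.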

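Proof. Let $y = \inf_n x_n/n$. Let $\varepsilon > 0$. Let $m \in \mathbb{N}_{>0}$ with $x_m/m < y + \varepsilon$. Any $n \in \mathbb{N}$ can be written in the form $n = qm + r$, where $r$ is an integer $0 \leq r \leq m - 1$. Set $x_0 = 0$. Then $x_n = x_{qm+r} \leq x_m + x_m + \cdots + x_m + x_r = q x_m + x_r$. Therefore

$\frac{x_n}{n} = \frac{x_{qm+r}}{qm + r} \leq \frac{q x_m + x_r}{qm + r} = \frac{x_m}{m} \cdot \frac{qm}{qm + r} + \frac{x_r}{n}$.

Thus

$y \leq \frac{x_n}{n} < (y + \varepsilon) \frac{qm}{n} + \frac{x_r}{n}$.

The claim follows because $x_r/n \to 0$ and $qm/n \to 1$ when $n \to \infty$.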
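Fekete's lemma can be watched numerically on a concrete subadditive sequence. The sketch below uses $x_n = 2n + \sqrt{n}$, an example chosen here for illustration: it is subadditive because the square root is, and $x_n/n = 2 + 1/\sqrt{n}$ decreases to the infimum 2 without attaining it.

```python
import math

def x(n):
    """An illustrative subadditive sequence: x_n = 2n + sqrt(n)."""
    return 2 * n + math.sqrt(n)

# Check x_{n+m} <= x_n + x_m on a finite range (with a small float tolerance).
subadditive = all(
    x(n + m) <= x(n) + x(m) + 1e-12
    for n in range(1, 60)
    for m in range(1, 60)
)

# The ratios x_n / n decrease towards the infimum 2.
ratios = [x(n) / n for n in (1, 10, 100, 10_000, 1_000_000)]

print(subadditive)
print(ratios)
```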

For $a_1, a_2 \in S$, if $a_1 \preccurlyeq a_2$, then clearly $a_1 \precsim a_2$.

Lemma 2.3. Let $\preccurlyeq$ be a Strassen preorder on $S$. Then $\precsim$ is a Strassen preorder on $S$, the "asymptotic preorder" corresponding to $\preccurlyeq$.

Proof. Let $a, b, c, d \in S$. We verify that $\precsim$ is a preorder.

First, reflexivity. We have $a \preccurlyeq a$, so $a^N \preccurlyeq a^N \cdot 1$, so $a \precsim a$.

Second, transitivity. Let $a \precsim b$ and $b \precsim c$. This means $a^N \preccurlyeq b^N x_N$ and $b^N \preccurlyeq c^N y_N$ with $x_N^{1/N} \to 1$ and $y_N^{1/N} \to 1$. Then $a^N \preccurlyeq b^N x_N \preccurlyeq c^N x_N y_N$. Since $(x_N y_N)^{1/N} \to 1$ we conclude $a \precsim c$.

We verify condition (1). Let $n, m \in \mathbb{N}$. If $n \leq m$, then $n \preccurlyeq m$, so $n \precsim m$. If $n \precsim m$, then $n^N \preccurlyeq m^N x_N$, so $n^N \leq m^N x_N$, which implies $n \leq m$.

We verify condition (2). Let $a \precsim b$ and $c \precsim d$. This means $a^N \preccurlyeq b^N x_N$ and $c^N \preccurlyeq d^N y_N$. Thus $a^N c^N \preccurlyeq b^N d^N x_N y_N$ and so $ac \precsim bd$. Assume $x_N$ and $y_N$ are nondecreasing (otherwise set $x_N = \max_{n \leq N} x_n$). Then

$(a + c)^N = \sum_{m=0}^{N} \binom{N}{m} a^m c^{N-m} \preccurlyeq \sum_{m=0}^{N} \binom{N}{m} b^m d^{N-m} x_m y_{N-m} \preccurlyeq \sum_{m=0}^{N} \binom{N}{m} b^m d^{N-m} x_N y_N = (b + d)^N x_N y_N$.

Thus $a + c \precsim b + d$.

We verify (3). Let $a, b \in S$, $b \neq 0$. Then there is an $r \in \mathbb{N}$ with $a \preccurlyeq rb$ and thus $a \precsim rb$.

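Lemma 2.4. Let $\preccurlyeq$ be a Strassen preorder on $S$. Let $a_1, a_2, b \in S$.

(i) If $a_2 + b \preccurlyeq a_1 + b$, then $a_2 \precsim a_1$.

(ii) If $a_2 b \preccurlyeq a_1 b$ with $b \neq 0$, then $a_2 \precsim a_1$.

(iii) If $a_2 \precapprox a_1$, where $\precapprox$ denotes the asymptotic preorder corresponding to $\precsim$, then $a_2 \precsim a_1$.

(iv) If $\exists s \in S\ \forall n \in \mathbb{N}\colon\ n a_2 \preccurlyeq n a_1 + s$, then $a_2 \precsim a_1$.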

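Proof. (ii) Let $a_2 b \preccurlyeq a_1 b$. By an inductive argument similar to the argument we used to prove (2.4),

$$\forall N \in \mathbb{N}\colon\ a_2^N b \preccurlyeq a_1^N b. \tag{2.2}$$

Let $m, r \in \mathbb{N}$ with $1 \preccurlyeq mb \preccurlyeq r$. (We use $b \neq 0$.) From (2.2) follows

$$\forall N \in \mathbb{N}\colon\ a_2^N \preccurlyeq a_2^N mb \preccurlyeq a_1^N mb \preccurlyeq a_1^N r.$$

Thus we conclude $a_2 \precsim a_1$.

(iii) Let $a_2 \precapprox a_1$, where $\precapprox$ is the asymptotic preorder corresponding to $\precsim$. This means $a_2^N \precsim a_1^N x_N$ with $x_N^{1/N} \to 1$. This in turn means that $(a_2^N)^M \preccurlyeq (a_1^N x_N)^M y_{N,M}$ with $\forall N \in \mathbb{N}\colon\ y_{N,M}^{1/M} \to 1$, that is,

$$a_2^{NM} \preccurlyeq a_1^{NM} x_N^M y_{N,M}.$$

Choose a sequence $N \mapsto M_N$ such that $(y_{N,M_N})^{1/M_N} \le 2$; e.g., given $N$, let $M_N$ be the smallest $M$ for which $(y_{N,M})^{1/M} \le 2$. Then $a_2^{N M_N} \preccurlyeq a_1^{N M_N} x_N^{M_N} y_{N,M_N}$ and

$$\bigl(x_N^{M_N} y_{N,M_N}\bigr)^{1/(N M_N)} = x_N^{1/N} (y_{N,M_N})^{1/(N M_N)} \le x_N^{1/N}\, 2^{1/N} \to 1.$$

We conclude $a_2 \precsim a_1$.

(iv) Let $s \in S$ with $\forall n \in \mathbb{N}\colon\ n a_2 \preccurlyeq n a_1 + s$. We may assume $a_1 \neq 0$. Let $k \in \mathbb{N}$ with $s \preccurlyeq k a_1$. Then

$$\forall n \in \mathbb{N}\colon\ kn a_2 \preccurlyeq kn a_1 + k a_1 = k a_1 (n + 1). \tag{2.3}$$

Apply (ii) to (2.3) to get

$$\forall n \in \mathbb{N}\colon\ a_2 n \precsim a_1 (n + 1).$$

By an inductive argument,

$$\forall N \in \mathbb{N}\colon\ a_2^N \precsim a_2^{N-1} a_1 \cdot 2 \precsim a_2^{N-2} a_1^2 \cdot 3 \precsim \cdots \precsim a_1^N (N + 1).$$

Since $(N + 1)^{1/N} \to 1$, we have $a_2 \precapprox a_1$. From (iii) follows $a_2 \precsim a_1$.

(i) Let $a_2 + b \preccurlyeq a_1 + b$. We first prove

$$\forall q \in \mathbb{N}\colon\ q a_2 + b \preccurlyeq q a_1 + b. \tag{2.4}$$

By assumption the statement is true for $q = 1$. Suppose the statement is true for $q - 1$. Then

$$q a_2 + b = (q - 1) a_2 + (a_2 + b) \preccurlyeq (q - 1) a_2 + (a_1 + b) = ((q - 1) a_2 + b) + a_1 \preccurlyeq ((q - 1) a_1 + b) + a_1 = q a_1 + b,$$

which proves the statement by induction. Then $\forall n \in \mathbb{N}\colon\ n a_2 \preccurlyeq n a_1 + b$. From (iv) follows $a_2 \precsim a_1$.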

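2.5 Maximal Strassen preorders

Let $\mathcal{P}$ be the set of Strassen preorders on $S$. For $\preccurlyeq_1, \preccurlyeq_2 \in \mathcal{P}$ we write $\preccurlyeq_2 \subseteq \preccurlyeq_1$ if for all $a, b \in S$, $a \preccurlyeq_2 b$ implies $a \preccurlyeq_1 b$. (The notation $\preccurlyeq_2 \subseteq \preccurlyeq_1$ is natural if we think of the relations $\preccurlyeq_i$ as sets of pairs $(a, b)$ with $a \preccurlyeq_i b$.)

Lemma 2.5. Let $\preccurlyeq \in \mathcal{P}$ with $\preccurlyeq = \precsim$ and $a_2 \not\preccurlyeq a_1$. Then there is an element $\preccurlyeq_{a_1 a_2} \in \mathcal{P}$ with $\preccurlyeq \subseteq \preccurlyeq_{a_1 a_2}$ and $a_1 \preccurlyeq_{a_1 a_2} a_2$.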

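Proof. For $x_1, x_2 \in S$, let

$$x_1 \preccurlyeq_{a_1 a_2} x_2 \quad\text{if}\quad \exists s \in S\colon\ x_1 + s a_2 \preccurlyeq x_2 + s a_1.$$

The relation $\preccurlyeq_{a_1 a_2}$ is reflexive, since $x + 0 \cdot a_2 \preccurlyeq x + 0 \cdot a_1$. The relation $\preccurlyeq_{a_1 a_2}$ is transitive: if $x_1 \preccurlyeq_{a_1 a_2} x_2$ and $x_2 \preccurlyeq_{a_1 a_2} x_3$, then $x_1 + s a_2 \preccurlyeq x_2 + s a_1$ and $x_2 + t a_2 \preccurlyeq x_3 + t a_1$ for some $s, t \in S$, and so $x_1 + (t + s) a_2 \preccurlyeq x_2 + t a_2 + s a_1 \preccurlyeq x_3 + t a_1 + s a_1 = x_3 + (t + s) a_1$. Thus $x_1 \preccurlyeq_{a_1 a_2} x_3$. We conclude that $\preccurlyeq_{a_1 a_2}$ is a preorder on $S$.

We prove that $\preccurlyeq_{a_1 a_2}$ is a Strassen preorder. If $x_1 \preccurlyeq_{a_1 a_2} x_2$ and $y_1 \preccurlyeq_{a_1 a_2} y_2$, then clearly $x_1 + y_1 \preccurlyeq_{a_1 a_2} x_2 + y_2$. If $x_1 \preccurlyeq_{a_1 a_2} x_2$ and $y \in S$, then $x_1 y \preccurlyeq_{a_1 a_2} x_2 y$. From this follows: if $x_1 \preccurlyeq_{a_1 a_2} x_2$ and $y_1 \preccurlyeq_{a_1 a_2} y_2$, then $x_1 y_1 \preccurlyeq_{a_1 a_2} x_2 y_1 \preccurlyeq_{a_1 a_2} x_2 y_2$.

Let $n, m \in \mathbb{N}$. If $n \le m$, then $n \preccurlyeq m$, so $n \preccurlyeq_{a_1 a_2} m$. If $n \not\le m$, then $n \ge m + 1$. Suppose $n \preccurlyeq_{a_1 a_2} m$. Let $s \in S$ with $n + s a_2 \preccurlyeq m + s a_1$. Adding $m + 1 \preccurlyeq n$ gives

$$m + 1 + n + s a_2 \preccurlyeq n + m + s a_1.$$

Since $\preccurlyeq = \precsim$, we may apply Lemma 2.4 (i) to obtain

$$1 + s a_2 \preccurlyeq s a_1. \tag{2.5}$$

From (2.5) follows $s \neq 0$. From (2.5) also follows

$$s a_2 \preccurlyeq s a_1. \tag{2.6}$$

Since $\preccurlyeq = \precsim$, we may apply Lemma 2.4 (ii) to (2.6) to obtain the contradiction

$$a_2 \preccurlyeq a_1.$$

Therefore $n \not\preccurlyeq_{a_1 a_2} m$. We conclude that $\preccurlyeq_{a_1 a_2}$ is a Strassen preorder, that is, $\preccurlyeq_{a_1 a_2} \in \mathcal{P}$.

Finally, we have $a_1 \preccurlyeq_{a_1 a_2} a_2$, since $a_1 + 1 \cdot a_2 \preccurlyeq a_2 + 1 \cdot a_1$. Also, if $x_1 \preccurlyeq x_2$, then $x_1 + 0 \cdot a_2 \preccurlyeq x_2 + 0 \cdot a_1$, that is, $\preccurlyeq \subseteq \preccurlyeq_{a_1 a_2}$.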

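Let $\preccurlyeq$ be a Strassen preorder. Let $\mathcal{P}_{\preccurlyeq}$ be the set of Strassen preorders containing $\preccurlyeq$, ordered by inclusion $\subseteq$. Let $C \subseteq \mathcal{P}_{\preccurlyeq}$ be any chain. Then the union of all preorders in $C$ is an element of $\mathcal{P}_{\preccurlyeq}$ and contains all elements of $C$. Therefore, by Zorn's lemma, $\mathcal{P}_{\preccurlyeq}$ contains a maximal element (maximal with respect to inclusion $\subseteq$).

Lemma 2.6. Let $\preccurlyeq$ be maximal in $\mathcal{P}$. Then $\preccurlyeq = \precsim$.

Proof. Trivially, $\preccurlyeq \subseteq \precsim$. From Lemma 2.3 we know $\precsim \in \mathcal{P}$. From maximality of $\preccurlyeq$ follows $\preccurlyeq = \precsim$.

A relation $\preccurlyeq$ on $S$ is total if for all $a, b \in S$, $a \preccurlyeq b$ or $b \preccurlyeq a$.

Lemma 2.7. Let $\preccurlyeq$ be maximal in $\mathcal{P}$. Then $\preccurlyeq$ is total.

Proof. Suppose $\preccurlyeq$ is not total, say $a_1 \not\preccurlyeq a_2$ and $a_2 \not\preccurlyeq a_1$. By Lemma 2.5 there is an element $\preccurlyeq_{a_1 a_2} \in \mathcal{P}$ with $\preccurlyeq \subseteq \preccurlyeq_{a_1 a_2}$ and $a_1 \preccurlyeq_{a_1 a_2} a_2$. Then $\preccurlyeq$ is strictly contained in $\preccurlyeq_{a_1 a_2}$, which contradicts the maximality of $\preccurlyeq$. We conclude that $\preccurlyeq$ is total.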


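2.6 The asymptotic spectrum $\mathbf{X}(S, \leqslant)$

Definition 2.8. Let $S$ be a semiring with $\mathbb{N} \subseteq S$, and let $\leqslant$ be a Strassen preorder on $S$. Let

$$\mathbf{X}(S, \leqslant) = \{\phi \in \mathrm{Hom}(S, \mathbb{R}_{\ge 0}) : a \leqslant b \Rightarrow \phi(a) \le \phi(b)\}.$$

We call $\mathbf{X}(S, \leqslant)$ the asymptotic spectrum of $(S, \leqslant)$. We call the elements of $\mathbf{X}(S, \leqslant)$ spectral points.

Lemma 2.9. Let $\preccurlyeq \in \mathcal{P}$ be total. There is exactly one semiring homomorphism $\phi : S \to \mathbb{R}_{\ge 0}$ with

$$a \preccurlyeq b \Rightarrow \phi(a) \le \phi(b).$$

Moreover, if $\preccurlyeq$ is maximal in $\mathcal{P}$, then

$$a \preccurlyeq b \Leftrightarrow \phi(a) \le \phi(b).$$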

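Proof. Let $\preccurlyeq \in \mathcal{P}$ be total. For $a \in S$ define

$$\phi(a) = \inf\{ r/s : r, s \in \mathbb{N},\ sa \preccurlyeq r \}, \qquad \psi(a) = \sup\{ u/v : u, v \in \mathbb{N},\ u \preccurlyeq va \}.$$

We prove $\psi(a) \le \phi(a)$. Let $r, s, u, v \in \mathbb{N}$. Suppose $u \preccurlyeq va$ and $sa \preccurlyeq r$. Then follows $su \preccurlyeq vsa \preccurlyeq vr$. Thus $u/v \le r/s$. We prove $\psi(a) \ge \phi(a)$. Suppose $\psi(a) < \phi(a)$. Let $r, s \in \mathbb{N}$ with $\psi(a) < r/s < \phi(a)$. Then $sa \not\preccurlyeq r$. From totality follows $sa \succcurlyeq r$. Thus $\psi(a) \ge r/s$, which is a contradiction. We conclude $\psi(a) = \phi(a)$.

Let $a, b \in S$. We prove $\phi(a + b) \le \phi(a) + \phi(b)$. Let $s_a, s_b, r_a, r_b \in \mathbb{N}$. Suppose $s_a a \preccurlyeq r_a$ and $s_b b \preccurlyeq r_b$. Then $s_a s_b a \preccurlyeq s_b r_a$ and $s_a s_b b \preccurlyeq s_a r_b$. By addition, $s_a s_b (a + b) \preccurlyeq s_b r_a + s_a r_b$. Thus $\phi(a + b) \le \frac{r_a}{s_a} + \frac{r_b}{s_b}$. We prove $\psi(a + b) \ge \psi(a) + \psi(b)$. Suppose $u_a \preccurlyeq v_a a$ and $u_b \preccurlyeq v_b b$. Then $v_b u_a \preccurlyeq v_a v_b a$ and $v_a u_b \preccurlyeq v_a v_b b$. By addition, $v_b u_a + v_a u_b \preccurlyeq v_a v_b (a + b)$. Thus $\psi(a + b) \ge \frac{u_a}{v_a} + \frac{u_b}{v_b}$. We thus have additivity.

We prove $\phi(ab) \le \phi(a)\phi(b)$. Suppose $s_a a \preccurlyeq r_a$ and $s_b b \preccurlyeq r_b$. Then $s_a s_b ab \preccurlyeq r_a r_b$. Thus $\phi(ab) \le \frac{r_a}{s_a} \frac{r_b}{s_b}$. We prove $\psi(ab) \ge \psi(a)\psi(b)$. Suppose $u_a \preccurlyeq v_a a$ and $u_b \preccurlyeq v_b b$. Then $u_a u_b \preccurlyeq v_a v_b ab$. Thus $\frac{u_a}{v_a} \frac{u_b}{v_b} \le \psi(ab)$. We thus have multiplicativity.

We prove monotonicity, $a \preccurlyeq b \Rightarrow \phi(a) \le \phi(b)$. Suppose $s_b b \preccurlyeq r_b$. From $a \preccurlyeq b$ follows $s_b a \preccurlyeq s_b b \preccurlyeq r_b$. Thus $\phi(a) \le \frac{r_b}{s_b}$.

We prove $\phi(1) = 1$. Trivially, $1 \preccurlyeq 1$. Therefore $\phi(1) \le \frac{1}{1} = 1$ and $\psi(1) \ge \frac{1}{1} = 1$.

We prove $\phi(0) = 0$. Trivially, $s_a 0 \preccurlyeq 0$, so $\phi(0) \le \frac{0}{s_a} = 0$. Trivially, $0 \preccurlyeq v_a 0$, so $\phi(0) \ge \frac{0}{v_a} = 0$.

We prove the uniqueness of $\phi$. Let $\phi_1, \phi_2$ be semiring homomorphisms $S \to \mathbb{R}_{\ge 0}$ with $a \preccurlyeq b \Rightarrow \phi_i(a) \le \phi_i(b)$. Suppose $\phi_1(a) < \phi_2(a)$. Let $u, v \in \mathbb{N}$ with $\phi_1(a) < \frac{u}{v} < \phi_2(a)$. Then $va \not\preccurlyeq u$, so by totality $va \succcurlyeq u$. Thus $\phi_1(a) \ge \frac{u}{v}$, which is a contradiction. This proves uniqueness.

Finally, suppose $\preccurlyeq$ is maximal in $\mathcal{P}$. Lemma 2.6 gives $\preccurlyeq = \precsim$. Let $a \not\preccurlyeq b$. From Lemma 2.4 (iv) follows $\exists n\colon\ na \not\preccurlyeq nb + 1$. By totality, $na \succcurlyeq nb + 1$. Apply $\phi$ to get $\phi(a) \ge \phi(b) + \frac{1}{n}$. In particular, $\phi(a) > \phi(b)$.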
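The two quantities $\phi(a) = \inf\{r/s : sa \preccurlyeq r\}$ and $\psi(a) = \sup\{u/v : u \preccurlyeq va\}$ from the proof can be made concrete in a toy setting. The sketch below assumes, purely for illustration, that $S \subseteq \mathbb{R}_{\ge 0}$ carries the usual order and that $a = \sqrt{2}$; the function names and the search bounds are hypothetical. The comparison $s\sqrt{2} \le r$ is done exactly as $2s^2 \le r^2$, so no floating-point arithmetic is involved.

```python
from fractions import Fraction
from math import isqrt


def phi_upper(max_s: int) -> Fraction:
    """min of r/s over 1 <= s <= max_s with s*sqrt(2) <= r (i.e. 2*s*s <= r*r)."""
    best = None
    for s in range(1, max_s + 1):
        r = isqrt(2 * s * s)      # floor(s * sqrt(2))
        if r * r < 2 * s * s:
            r += 1                # smallest integer r with s*sqrt(2) <= r
        cand = Fraction(r, s)
        if best is None or cand < best:
            best = cand
    return best


def psi_lower(max_v: int) -> Fraction:
    """max of u/v over 1 <= v <= max_v with u <= v*sqrt(2) (i.e. u*u <= 2*v*v)."""
    return max(Fraction(isqrt(2 * v * v), v) for v in range(1, max_v + 1))


lower = psi_lower(200)
upper = phi_upper(200)
print(lower, upper, float(upper - lower))
```

With larger bounds the two rationals squeeze $\sqrt{2}$ from below and above, which is the Dedekind-cut behaviour that the proof exploits for a general total Strassen preorder.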

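Lemma 2.10. The map

$$\mathbf{X}(S, \leqslant) \to \{\text{maximal elements in } \mathcal{P}_{\leqslant}\} : \phi \mapsto \preccurlyeq_\phi$$

with $a \preccurlyeq_\phi b$ iff $\phi(a) \le \phi(b)$, is a bijection.

Proof. Let $\phi \in \mathbf{X}(S, \leqslant)$. One verifies that $\preccurlyeq_\phi$ is a Strassen preorder and $\leqslant\ \subseteq\ \lesssim\ \subseteq\ \preccurlyeq_\phi$. Let $\preccurlyeq$ be maximal in $\mathcal{P}_{\preccurlyeq_\phi}$. Lemma 2.7 says that $\preccurlyeq$ is total. By Lemma 2.9 there is a $\psi \in \mathbf{X}(S, \leqslant)$ with $\preccurlyeq \subseteq \preccurlyeq_\psi$. Clearly, $\preccurlyeq_\phi \subseteq \preccurlyeq_\psi$. The uniqueness statement of Lemma 2.9 implies $\phi = \psi$. This means $\preccurlyeq_\phi = \preccurlyeq$, that is, $\preccurlyeq_\phi$ is maximal. We conclude that the map is well defined.

Let $\preccurlyeq$ be maximal in $\mathcal{P}_{\leqslant}$. Then $\preccurlyeq$ is total. By Lemma 2.9 there is a $\phi \in \mathbf{X}(S, \leqslant)$ with $\preccurlyeq \subseteq \preccurlyeq_\phi$. We conclude that the map is surjective.

Let $\phi, \psi \in \mathbf{X}(S, \leqslant)$ with $\preccurlyeq_\phi = \preccurlyeq_\psi$. From Lemma 2.9 follows $\phi = \psi$. We conclude that the map is injective.

Lemma 2.11. Let $a, b \in S$. Then $a \lesssim b$ iff $a \preccurlyeq b$ for all maximal $\preccurlyeq \in \mathcal{P}_{\leqslant}$.

Proof. Let $\preccurlyeq \in \mathcal{P}_{\leqslant}$ be maximal. Then $\lesssim\ \subseteq\ \precsim\ =\ \preccurlyeq$ by Lemma 2.6, so $a \lesssim b$ implies $a \preccurlyeq b$.

Suppose $a \not\lesssim b$. Let $n \in \mathbb{N}_{\ge 1}$ with $na \not\lesssim nb + 1$ (Lemma 2.4 (iv)). By Lemma 2.5 there is an element $\preccurlyeq_{nb+1,\,na} \in \mathcal{P}$ with $\lesssim\ \subseteq\ \preccurlyeq_{nb+1,\,na}$, and we may assume $\preccurlyeq_{nb+1,\,na}$ is maximal. Then $nb + 1 \preccurlyeq_{nb+1,\,na} na$, and so $a \not\preccurlyeq_{nb+1,\,na} b$.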

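2.7 The representation theorem

The following theorem is the main theorem.

Theorem 2.12 ([Str88, Th. 2.4]). Let $S$ be a commutative semiring with $\mathbb{N} \subseteq S$, and let $\leqslant$ be a Strassen preorder on $S$. Let $\mathbf{X} = \mathbf{X}(S, \leqslant)$ be the set of $\leqslant$-monotone semiring homomorphisms from $S$ to $\mathbb{R}_{\ge 0}$,

$$\mathbf{X} = \mathbf{X}(S, \leqslant) = \{\phi \in \mathrm{Hom}(S, \mathbb{R}_{\ge 0}) : \forall a, b \in S\ \ a \leqslant b \Rightarrow \phi(a) \le \phi(b)\}.$$

For $a, b \in S$, let $a \lesssim b$ if there is a sequence $(x_N) \in \mathbb{N}^{\mathbb{N}}$ with $x_N^{1/N} \to 1$ when $N \to \infty$ such that $\forall N \in \mathbb{N}\colon\ a^N \leqslant b^N x_N$. Then

$$\forall a, b \in S\colon\quad a \lesssim b \ \text{ iff } \ \forall \phi \in \mathbf{X}\ \ \phi(a) \le \phi(b).$$

Proof. Let $a, b \in S$. Suppose $a \lesssim b$. Then clearly for all $\phi \in \mathbf{X}$ we have $\phi(a) \le \phi(b)$. Suppose $a \not\lesssim b$. By Lemma 2.11 there is a maximal $\preccurlyeq \in \mathcal{P}_{\leqslant}$ with $a \not\preccurlyeq b$. By Lemma 2.10 there is a $\phi \in \mathbf{X}$ with $\phi(a) > \phi(b)$.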


2.8 Abstract rank and subrank R, Q

We generalise the notions of rank and subrank for tensors to arbitrary semirings with a Strassen preorder. Let $a \in S$. Define the rank

$$R(a) = \min\{r \in \mathbb{N} : a \leqslant r\}$$

and the subrank

$$Q(a) = \max\{r \in \mathbb{N} : r \leqslant a\}.$$

Then $Q(a) \le R(a)$. Define the asymptotic rank

$$\underset{\sim}{R}(a) = \lim_{N\to\infty} R(a^N)^{1/N}.$$

Define the asymptotic subrank

$$\underset{\sim}{Q}(a) = \lim_{N\to\infty} Q(a^N)^{1/N}.$$

By Fekete's lemma (Lemma 2.2), asymptotic rank is an infimum and asymptotic subrank is a supremum, as follows:

$$\underset{\sim}{R}(a) = \inf_N R(a^N)^{1/N},$$

$$\underset{\sim}{Q}(a) = \sup_N Q(a^N)^{1/N} \quad\text{when } a = 0 \text{ or } a \ge 1.$$

Theorem 2.12 implies that the asymptotic rank and asymptotic subrank have the following dual characterisation in terms of the asymptotic spectrum. (This is a straightforward generalisation of [Str88, Th. 3.8].)

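Corollary 2.13 (cf. [Str88, Th. 3.8]). For $a \in S$ with $\exists \phi \in \mathbf{X}\colon\ \phi(a) \ge 1$,

$$\underset{\sim}{R}(a) = \max_{\phi \in \mathbf{X}} \phi(a).$$

Proof. Let $\phi \in \mathbf{X}$. For $N \in \mathbb{N}$, $R(a^N) \ge \phi(a)^N$. Therefore $\underset{\sim}{R}(a) \ge \phi(a)$, and so $\underset{\sim}{R}(a) \ge \max_{\phi \in \mathbf{X}} \phi(a)$. It remains to prove $\underset{\sim}{R}(a) \le \max_{\phi \in \mathbf{X}} \phi(a)$. We let $x = \max_{\phi \in \mathbf{X}} \phi(a)$. By assumption, $x \ge 1$. By definition of $x$ we have

$$\forall \phi \in \mathbf{X}\colon\ \phi(a) \le x.$$

Take the $m$th power on both sides:

$$\forall \phi \in \mathbf{X},\ m \in \mathbb{N}\colon\ \phi(a^m) \le x^m.$$

Take the ceiling on the right-hand side:

$$\forall \phi \in \mathbf{X},\ m \in \mathbb{N}\colon\ \phi(a^m) \le \lceil x^m \rceil.$$

Apply Theorem 2.12 to get asymptotic preorders

$$\forall m \in \mathbb{N}\colon\ a^m \lesssim \lceil x^m \rceil.$$

Then, by definition of asymptotic preorder,

$$\forall m, N \in \mathbb{N}\colon\ a^{mN} \leqslant \lceil x^m \rceil^N\, 2^{\varepsilon_{m,N}} \quad\text{for some } \varepsilon_{m,N} \in o(N).$$

Then

$$\forall m, N \in \mathbb{N}\colon\ R(a^{mN})^{1/(mN)} \le \lceil x^m \rceil^{1/m}\, 2^{\varepsilon_{m,N}/(mN)}.$$

From $x \ge 1$ follows $\lceil x^m \rceil^{1/m} \to x$ when $m \to \infty$. Choose $m = m(N)$ with $m(N) \to \infty$ as $N \to \infty$ and $\varepsilon_{m(N),N} \in o(N)$ to get $\underset{\sim}{R}(a) = \inf_N R(a^N)^{1/N} \le x$.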
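The step "$\lceil x^m \rceil^{1/m} \to x$ when $m \to \infty$" uses the assumption $x \ge 1$. A small numerical sketch, with the sample values $x = 1.5$ and $x = 0.5$ chosen here only for illustration, shows both the convergence and what goes wrong below 1:

```python
import math


def ceil_root(x: float, m: int) -> float:
    """Return ceil(x**m) ** (1/m), the quantity appearing in the proof."""
    return math.ceil(x ** m) ** (1.0 / m)


# For x >= 1 the values decrease towards x as m grows.
values = [ceil_root(1.5, m) for m in (1, 2, 5, 10, 20)]

# For 0 < x < 1 the ceiling of x**m is 1, so the m-th root is stuck at 1.
stuck = ceil_root(0.5, 20)
print(values, stuck)
```

For $x \ge 1$ the rounding error is at most 1 on a quantity $x^m \ge 1$, so it contributes a factor of at most $2^{1/m}$, which tends to 1.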

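Corollary 2.14 (cf. [Str88, Th. 3.8]). For $a \in S$ with $\exists k \in \mathbb{N}\colon\ a^k \geqslant 2$,

$$\underset{\sim}{Q}(a) = \min_{\phi \in \mathbf{X}} \phi(a).$$

Proof. Let $\phi \in \mathbf{X}$. For $N \in \mathbb{N}$, $Q(a^N) \le \phi(a)^N$. Therefore $\underset{\sim}{Q}(a) \le \phi(a)$, so $\underset{\sim}{Q}(a) \le \min_{\phi \in \mathbf{X}} \phi(a)$. It remains to prove $\underset{\sim}{Q}(a) \ge \min_{\phi \in \mathbf{X}} \phi(a)$. Let $y = \min_{\phi \in \mathbf{X}} \phi(a)$. From the assumption $a^k \geqslant 2$ follows $y > 1$. By definition of $y$ we have

$$\forall \phi \in \mathbf{X}\colon\ \phi(a) \ge y.$$

Take the $m$th power on both sides:

$$\forall \phi \in \mathbf{X},\ m \in \mathbb{N}\colon\ \phi(a^m) \ge y^m.$$

Take the floor on the right-hand side:

$$\forall \phi \in \mathbf{X},\ m \in \mathbb{N}\colon\ \phi(a^m) \ge \lfloor y^m \rfloor.$$

Apply Theorem 2.12 to get asymptotic preorders

$$\forall m \in \mathbb{N}\colon\ a^m \gtrsim \lfloor y^m \rfloor.$$

Then, by definition of asymptotic preorder,

$$\forall m, N \in \mathbb{N}\colon\ a^{mN}\, 2^{\varepsilon_{m,N}} \geqslant \lfloor y^m \rfloor^N \quad\text{for some } \varepsilon_{m,N} \in o(N).$$

Now we use $a^k \geqslant 2$ to get

$$\forall m, N \in \mathbb{N}\colon\ a^{mN + k\varepsilon_{m,N}} \geqslant \lfloor y^m \rfloor^N.$$

Then

$$\forall m, N \in \mathbb{N}\colon\ Q\bigl(a^{mN + k\varepsilon_{m,N}}\bigr)^{\frac{1}{mN + k\varepsilon_{m,N}}} \ge \lfloor y^m \rfloor^{\frac{N}{mN + k\varepsilon_{m,N}}}.$$

Choose $m = m(N)$ with $m(N) \to \infty$ as $N \to \infty$ and $\varepsilon_{m(N),N} \in o(N)$ to obtain $\underset{\sim}{Q}(a) = \sup_N Q(a^N)^{1/N} \ge y$.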


2.9 Topological aspects

Theorem 2.12 does not tell the full story. Namely, there is also a topological component, which we will now discuss. Let $S$ be a semiring with $\mathbb{N} \subseteq S$. Let $\leqslant$ be a Strassen preorder on $S$. Let $\mathbf{X} = \mathbf{X}(S, \leqslant)$ be the asymptotic spectrum of $(S, \leqslant)$. For $a \in S$, let

$$\hat{a} : \mathbf{X} \to \mathbb{R}_{\ge 0} : \phi \mapsto \phi(a). \tag{2.7}$$

The map $\hat{a}$ simply evaluates a given homomorphism $\phi$ at $a$. One may think of $\hat{a}$ as the collection $(\phi(a))_{\phi \in \mathbf{X}}$ of all evaluations of the elements of $\mathbf{X}$ at $a$. Let $\mathbb{R}_{\ge 0}$ have the Euclidean topology. Endow $\mathbf{X}$ with the weak topology with respect to the family of functions $\hat{a}$, $a \in S$. That is, endow $\mathbf{X}$ with the coarsest topology such that each $\hat{a}$ becomes continuous.

Let $C(\mathbf{X}, \mathbb{R}_{\ge 0})$ be the semiring of continuous functions $\mathbf{X} \to \mathbb{R}_{\ge 0}$ with addition and multiplication defined pointwise on $\mathbf{X}$, that is, $(f + g)(x) = f(x) + g(x)$ and $(f \cdot g)(x) = f(x) g(x)$ for $f, g \in C(\mathbf{X}, \mathbb{R}_{\ge 0})$ and $x \in \mathbf{X}$. Define the semiring homomorphism

$$\Phi : S \to C(\mathbf{X}, \mathbb{R}_{\ge 0}) : a \mapsto \hat{a},$$

which maps $a$ to the evaluator $\hat{a}$ defined in (2.7).

Theorem 2.15 ([Str88, Th. 2.4]).

(i) $\mathbf{X}$ is a nonempty compact Hausdorff space.

(ii) $\forall a, b \in S\colon\ a \lesssim b$ iff $\Phi(a) \le \Phi(b)$ pointwise on $\mathbf{X}$.

(iii) $\Phi(S)$ separates the points of $\mathbf{X}$.

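Proof. Statement (ii) follows from Theorem 2.12. Statement (iii) is clear.

We prove statement (i). We have $2 \not\lesssim 1$, so from Theorem 2.12 follows that $\mathbf{X}$ cannot be empty.

For $a \in S$, let $n_a \in \mathbb{N}$ with $a \le n_a$. Then for $\phi \in \mathbf{X}$, $\phi(a) \le n_a$, and so $\phi(a) \in [0, n_a]$. Embed $\mathbf{X} \subseteq \prod_{a \in S} [0, n_a]$ as a set via $\phi \mapsto (\phi(a))_{a \in S}$. The set $\prod_{a \in S} [0, n_a]$ with the product topology is compact by the theorem of Tychonoff. To see that $\mathbf{X}$ is closed in $\prod_{a \in S} [0, n_a]$, we write $\mathbf{X}$ as an intersection of sets,

$$\begin{aligned}
\mathbf{X} ={}& \Bigl\{\phi \in \prod_{a \in S} [0, n_a] : \phi(0) = 0\Bigr\} \cap \Bigl\{\phi \in \prod_{a \in S} [0, n_a] : \phi(1) = 1\Bigr\} \\
&\cap \bigcap_{b, c \in S} \Bigl\{\phi \in \prod_{a \in S} [0, n_a] : \phi(b + c) - \phi(b) - \phi(c) = 0\Bigr\} \\
&\cap \bigcap_{b, c \in S} \Bigl\{\phi \in \prod_{a \in S} [0, n_a] : \phi(bc) - \phi(b)\phi(c) = 0\Bigr\} \\
&\cap \bigcap_{b, c \in S :\, b \le c} \Bigl\{\phi \in \prod_{a \in S} [0, n_a] : \phi(b) \le \phi(c)\Bigr\},
\end{aligned}$$

and we observe that the intersected sets are closed:

$$\begin{aligned}
\mathbf{X} ={}& \hat{0}^{-1}(0) \cap \hat{1}^{-1}(1) \\
&\cap \bigcap_{b, c \in S} \bigl(\widehat{b + c} - \hat{b} - \hat{c}\bigr)^{-1}(0) \\
&\cap \bigcap_{b, c \in S} \bigl(\widehat{bc} - \hat{b}\hat{c}\bigr)^{-1}(0) \\
&\cap \bigcap_{b, c \in S :\, b \le c} \bigl(\hat{c} - \hat{b}\bigr)^{-1}([0, \infty)).
\end{aligned}$$

This implies that $\mathbf{X}$ is also compact.

Let $\phi, \psi \in \mathbf{X}$ be distinct. Let $a \in S$ with $\phi(a) \neq \psi(a)$. Then $\hat{a}(\phi) \neq \hat{a}(\psi)$. Let $U \ni \hat{a}(\phi)$ and $V \ni \hat{a}(\psi)$ be open and disjoint subsets of $\mathbb{R}_{\ge 0}$. Then $\hat{a}^{-1}(U)$ and $\hat{a}^{-1}(V)$ are open and disjoint subsets of $\mathbf{X}$. We conclude that $\mathbf{X}$ is Hausdorff.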

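2.10 Uniqueness

Let $S$ be a semiring with $\mathbb{N} \subseteq S$. Let $\leqslant$ be a Strassen preorder on $S$. Let $\mathbf{X} = \mathbf{X}(S, \leqslant)$ be the asymptotic spectrum of $(S, \leqslant)$. The object $\mathbf{X}$ is unique in the following sense.

Theorem 2.16 ([Str88, Cor. 2.7]). Let $\mathbf{Y}$ be a compact Hausdorff space. Let $\Psi : S \to C(\mathbf{Y}, \mathbb{R}_{\ge 0})$ be a homomorphism of semirings such that

$$\Psi(S) \text{ separates the points of } \mathbf{Y} \tag{2.8}$$

and

$$\forall a, b \in S\colon\ a \lesssim b \Leftrightarrow \Psi(a) \le \Psi(b) \text{ pointwise on } \mathbf{Y}. \tag{2.9}$$

Then there is a unique homeomorphism (continuous bijection with continuous inverse) $h : \mathbf{Y} \to \mathbf{X}$ such that the diagram

$$\begin{array}{ccc}
 & S & \\
{\scriptstyle \Phi} \swarrow & & \searrow {\scriptstyle \Psi} \\
C(\mathbf{X}, \mathbb{R}_{\ge 0}) & \xrightarrow{\ h^*\ } & C(\mathbf{Y}, \mathbb{R}_{\ge 0})
\end{array} \tag{2.10}$$

commutes, where $h^* : \phi \mapsto \phi \circ h$. Namely, let $h : y \mapsto \bigl(a \mapsto \Psi(a)(y)\bigr)$.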


Proof. We prove uniqueness. Suppose there are two such homeomorphisms

$$h_1, h_2 : \mathbf{Y} \to \mathbf{X}.$$

Suppose $x \neq h_2(h_1^{-1}(x))$ for some $x \in \mathbf{X}$. Since $\Phi(S)$ separates the points of $\mathbf{X}$, there is an $a \in S$ with $\Phi(a)(x) \neq \Phi(a)(h_2(h_1^{-1}(x)))$. Let $y = h_1^{-1}(x) \in \mathbf{Y}$. Then $\Phi(a)(h_1(y)) \neq \Phi(a)(h_2(y))$. Since (2.10) commutes, $\Phi(a)(h_1(y)) = \Psi(a)(y)$ and $\Phi(a)(h_2(y)) = \Psi(a)(y)$, a contradiction.

We prove existence. Let $h : \mathbf{Y} \to \mathbf{X} : y \mapsto (a \mapsto \Psi(a)(y))$. One verifies that $h$ is well defined, continuous and injective, and that the diagram in (2.10) commutes. It remains to show that $h$ is surjective. We know that $\mathbb{Q} \cdot \Phi(S)$ is a $\mathbb{Q}$-subalgebra of $C(\mathbf{X}, \mathbb{R})$ which separates points and which contains the nonzero constant function $\Phi(1)$, so by the Stone–Weierstrass theorem, $\mathbb{Q} \cdot \Phi(S)$ is dense in $C(\mathbf{X}, \mathbb{R})$ under the sup-norm. Suppose $h$ is not surjective. Then $h(\mathbf{Y}) \subsetneq \mathbf{X}$ is a proper closed subset. Let $x_0 \in \mathbf{X} \setminus h(\mathbf{Y})$ be in the complement. Since $\mathbf{X}$ is a compact Hausdorff space, there is a continuous function $f : \mathbf{X} \to [-1, 1]$ with

$$f(h(\mathbf{Y})) = 1, \qquad f(x_0) = -1.$$

We know that $f$ can be approximated by elements from $\mathbb{Q} \cdot \Phi(S)$, i.e. let $\varepsilon > 0$, then there are $a_1, a_2 \in S$, $N \in \mathbb{N}$ such that

$$\tfrac{1}{N}\bigl(\Phi(a_1)(x) - \Phi(a_2)(x)\bigr) > 1 - \varepsilon \quad\text{for all } x \in h(\mathbf{Y}),$$

$$\tfrac{1}{N}\bigl(\Phi(a_1)(x_0) - \Phi(a_2)(x_0)\bigr) < -1 + \varepsilon.$$

This means $\Psi(a_1) \ge \Psi(a_2)$ pointwise on $\mathbf{Y}$, so $a_1 \gtrsim a_2$, but also $\Phi(a_1) \not\ge \Phi(a_2)$ pointwise on $\mathbf{X}$, so $a_1 \not\gtrsim a_2$. This is a contradiction.

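2.11 Subsemirings

Let $S$ be a subsemiring of a semiring $T$, and let $\leqslant$ be a Strassen preorder on $T$. Then the restriction $\leqslant|_S$ is a Strassen preorder on $S$. How are the asymptotic spectra $\mathbf{X}(S, \leqslant|_S)$ and $\mathbf{X}(T, \leqslant)$ related? Obviously, for $\phi \in \mathbf{X}(T, \leqslant)$ we have $\phi|_S \in \mathbf{X}(S, \leqslant|_S)$. In fact, the uniqueness theorem of Section 2.10 implies that all elements of $\mathbf{X}(S, \leqslant|_S)$ are restrictions of elements of $\mathbf{X}(T, \leqslant)$.

Corollary 2.17. Let $S$ be a subsemiring of a semiring $T$. Let $\leqslant$ be a Strassen preorder on $T$. Then

$$\mathbf{X}(S, \leqslant|_S) = \mathbf{X}(T, \leqslant)|_S.$$

Proof. Let

$$\mathbf{X} = \mathbf{X}(S, \leqslant|_S), \qquad \Phi : S \to C(\mathbf{X}, \mathbb{R}_{\ge 0}) : a \mapsto \hat{a},$$

and let

$$\mathbf{Y} = \mathbf{X}(T, \leqslant)|_S = \{\phi|_S : \phi \in \mathbf{X}(T, \leqslant)\}, \qquad \Psi : S \to C(\mathbf{Y}, \mathbb{R}_{\ge 0}) : a \mapsto \bigl(\phi|_S \mapsto \phi|_S(a)\bigr).$$

Then $\mathbf{Y}$ is a compact Hausdorff space. Let $\phi|_S, \psi|_S \in \mathbf{Y}$ be distinct. Then there is an $a \in S$ with $\phi|_S(a) \neq \psi|_S(a)$, so (2.8) holds. For $a, b \in S$, $a \lesssim b$ iff $\Phi(a) \le \Phi(b)$ iff $\Psi(a) \le \Psi(b)$, so (2.9) holds. Therefore

$$h : \mathbf{X}(T, \leqslant)|_S \to \mathbf{X}(S, \leqslant|_S) : \phi|_S \mapsto \bigl(a \mapsto \Psi(a)(\phi|_S)\bigr) = \phi|_S$$

is a homeomorphism.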

2.12 Subsemirings generated by one element

Let $S$ be a semiring, and let $\leqslant$ be a Strassen preorder on $S$. We specialise to the simplest type of subsemiring of $S$. Namely, let $a \in S$ and let

$$\mathbb{N}[a] = \Bigl\{\sum_{i=0}^{k} n_i a^i : k \in \mathbb{N},\ n_i \in \mathbb{N}\Bigr\} \subseteq S$$

be the subsemiring of $S$ generated by $a$. We call $\mathbf{X}(\mathbb{N}[a]) = \mathbf{X}(\mathbb{N}[a], \leqslant|_{\mathbb{N}[a]})$ the asymptotic spectrum of $a$.

Corollary 2.18 (cf. [Str88]). If $a^k \geqslant 2$ for some $k \in \mathbb{N}$, then

$$\underset{\sim}{Q} \in \mathbf{X}(\mathbb{N}[a]).$$

If $\phi(a) \ge 1$ for some $\phi \in \mathbf{X}$, then

$$\underset{\sim}{R} \in \mathbf{X}(\mathbb{N}[a]).$$

Proof. Let $\mathbf{X} = \mathbf{X}(\mathbb{N}[a])$. Let $n_1, \ldots, n_q \in \mathbb{N}$. By Corollary 2.14,

$$\underset{\sim}{Q}(a^{n_1} + \cdots + a^{n_q}) = \min_{\phi \in \mathbf{X}} \phi(a^{n_1} + \cdots + a^{n_q}).$$

Since $\phi$ is a homomorphism, $\phi(a^{n_1} + \cdots + a^{n_q}) = \phi(a)^{n_1} + \cdots + \phi(a)^{n_q}$. Now we observe that $x^{n_1} + \cdots + x^{n_q}$ is minimised by taking $x$ minimal in the domain. We conclude

$$\underset{\sim}{Q}(a^{n_1} + \cdots + a^{n_q}) = \sum_{i=1}^{q} \Bigl(\min_{\phi \in \mathbf{X}} \phi(a)\Bigr)^{n_i} = \underset{\sim}{Q}(a)^{n_1} + \cdots + \underset{\sim}{Q}(a)^{n_q}.$$

The claim for asymptotic rank $\underset{\sim}{R}$ similarly follows from Corollary 2.13.


Remark 2.19. In general, asymptotic subrank $\underset{\sim}{Q}$ and asymptotic rank $\underset{\sim}{R}$ are not elements of the asymptotic spectrum. We will see an example in Chapter 4, related to the matrix multiplication tensor.

Remark 2.20. Corollary 2.18 is closely related to Schönhage's $\tau$-theorem for tensors, also called Schönhage's asymptotic sum inequality. The $\tau$-theorem features in every recent fast matrix multiplication algorithm (i.e. every algorithm based on the laser method).

Remark 2.21. An element $\phi \in \mathbf{X}(\mathbb{N}[a])$ is uniquely determined by the value of $\phi(a) \in \mathbb{R}_{\ge 0}$. We may thus identify the asymptotic spectrum $\mathbf{X}(\mathbb{N}[a])$ with a compact (i.e. closed and bounded) subset of the nonnegative reals $\mathbb{R}_{\ge 0}$ via $\phi \mapsto \phi(a)$.

2.13 Universal spectral points

Having discussed the simplest type of subsemiring in the previous section, let us discuss the most difficult type of supersemiring. When applying the theory of asymptotic spectra to some setting, there is a natural largest semiring $S$ in which the objects of study live. For example, we may study the semiring $S$ of all (equivalence classes of) 3-tensors of arbitrary format over $\mathbb{F}$. Or we may study the semiring $S$ of all (isomorphism classes of) finite simple graphs. We refer to the elements of the asymptotic spectrum $\mathbf{X}(S)$ of the "ambient" semiring $S$ by the term universal spectral points (cf. [Str88, page 119]). The universal spectral points are the most useful monotone homomorphisms.

2.14 Conclusion

To a semiring $S$ with a Strassen preorder $\leqslant$ we associated an asymptotic preorder $\lesssim$. We proved that this asymptotic preorder is characterised by the $\leqslant$-monotone semiring homomorphisms $S \to \mathbb{R}_{\ge 0}$, which make up the asymptotic spectrum $\mathbf{X}(S, \leqslant)$ of $(S, \leqslant)$. For $(S, \leqslant)$ we naturally have a rank function $R : S \to \mathbb{N}$ and a subrank function $Q : S \to \mathbb{N}$. Their asymptotic versions $\underset{\sim}{R}(a) = \inf_n R(a^n)^{1/n}$ and $\underset{\sim}{Q}(a) = \sup_n Q(a^n)^{1/n}$ coincide with $\max_{\phi \in \mathbf{X}(S, \leqslant)} \phi(a)$ and $\min_{\phi \in \mathbf{X}(S, \leqslant)} \phi(a)$ respectively, assuming $\exists \phi \in \mathbf{X}\colon \phi(a) \ge 1$ and $\exists k \in \mathbb{N}\colon a^k \geqslant 2$ respectively. Unfortunately, we have proved the existence of the asymptotic spectrum by nonconstructive means. Explicitly constructing spectral points for a given pair $(S, \leqslant)$ will be a challenging task.

Some remarks about our proof in this chapter. The proof in [Str88] uses the Kadison–Dubois theorem from the paper of Becker and Schwartz [BS83] as a black box. Our presentation basically integrates the proof of Strassen with the proof of Becker and Schwartz. The notions of rank and subrank were in [Str88] only discussed for tensors. We considered the straightforward generalisation to arbitrary semirings with a Strassen preorder. An evident feature of our presentation is that we do not pass from the semiring to its Grothendieck ring, but instead stay in the semiring. In this way we stay close to the "real world" objects. I thank Jop Briët and Lex Schrijver for this idea. There is a large body of literature on the Kadison–Dubois theorem, for which we refer to the modern books by Prestel and Delzell [PD01, Theorem 5.2.6] and Marshall [Mar08, Theorem 5.4.4].

Chapter 3

The asymptotic spectrum of graphs; Shannon capacity

This chapter is based on the manuscript [Zui18].

3.1 Introduction

This chapter is about the Shannon capacity of graphs, which was introduced by Claude Shannon in the context of coding theory [Sha56]. More precisely, we will apply the theory of asymptotic spectra of Chapter 2 to gain a better understanding of Shannon capacity (and other asymptotic properties of graphs).

We first recall the definition of the Shannon capacity of a graph. Let $G$ be a (finite simple) graph with vertex set $V(G)$ and edge set $E(G)$. An independent set or stable set in $G$ is a subset of $V(G)$ that contains no edges. The independence number or stability number $\alpha(G)$ is the cardinality of the largest independent set in $G$. For graphs $G$ and $H$, the and-product $G \boxtimes H$, also called strong graph product, is defined by

$$V(G \boxtimes H) = V(G) \times V(H),$$

$$E(G \boxtimes H) = \bigl\{\{(g, h), (g', h')\} : \bigl(\{g, g'\} \in E(G) \text{ or } g = g'\bigr) \text{ and } \bigl(\{h, h'\} \in E(H) \text{ or } h = h'\bigr) \text{ and } (g, h) \neq (g', h')\bigr\}.$$

The Shannon capacity $\Theta(G)$ is defined as the limit

$$\Theta(G) = \lim_{N\to\infty} \alpha(G^{\boxtimes N})^{1/N}. \tag{3.1}$$

This limit exists and equals the supremum $\sup_N \alpha(G^{\boxtimes N})^{1/N}$ by Fekete's lemma (Lemma 2.2).
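The definitions above can be tried out on the 5-cycle $C_5$. The following sketch (the graph encoding as a pair of a vertex list and a set of two-element frozensets, and all function names, are choices made here for illustration) builds the strong product and computes independence numbers by exhaustive branching, which is only feasible for small graphs.

```python
from itertools import product


def cycle(n):
    """Cycle graph C_n as (vertex list, set of frozenset edges)."""
    return list(range(n)), {frozenset((i, (i + 1) % n)) for i in range(n)}


def strong_product(G, H):
    """Strong graph product, following the definition of E(G ⊠ H)."""
    (VG, EG), (VH, EH) = G, H
    V = list(product(VG, VH))
    E = set()
    for (g, h), (g2, h2) in product(V, V):
        if (g, h) == (g2, h2):
            continue
        g_ok = g == g2 or frozenset((g, g2)) in EG
        h_ok = h == h2 or frozenset((h, h2)) in EH
        if g_ok and h_ok:
            E.add(frozenset(((g, h), (g2, h2))))
    return V, E


def independence_number(G):
    """Exact alpha(G) by branching: a vertex is either excluded or taken."""
    V, E = G
    adj = {v: set() for v in V}
    for e in E:
        u, v = tuple(e)
        adj[u].add(v)
        adj[v].add(u)

    def solve(cands):
        if not cands:
            return 0
        v = max(cands, key=lambda w: len(adj[w] & cands))
        if not adj[v] & cands:  # no remaining neighbours: taking v is optimal
            return 1 + solve(cands - {v})
        return max(solve(cands - {v}), 1 + solve(cands - {v} - adj[v]))

    return solve(frozenset(V))


C5 = cycle(5)
alpha_1 = independence_number(C5)
alpha_2 = independence_number(strong_product(C5, C5))
print(alpha_1, alpha_2, alpha_2 ** 0.5)
```

Since $\Theta(G) = \sup_N \alpha(G^{\boxtimes N})^{1/N}$, the value for $N = 2$ already gives a lower bound on $\Theta(C_5)$ that is larger than $\alpha(C_5)$.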

Computing the Shannon capacity is nontrivial already for small graphs. Lovász in [Lov79] computed the value $\Theta(C_5) = \sqrt{5}$, where $C_k$ denotes the $k$-cycle graph, by introducing and evaluating a new graph parameter $\vartheta$, which is now known as the Lovász theta number. For example, the value of $\Theta(C_7)$ is currently not known. The Shannon capacity $\Theta$ is not known to be hard to compute in the sense of computational complexity. On the other hand, deciding whether $\alpha(G) \ge k$, given a graph $G$ and $k \in \mathbb{N}$, is NP-complete [Kar72].

New result: dual description of Shannon capacity

The new result of this chapter is a dual characterisation of the Shannon capacity of graphs. This characterisation is obtained by applying Strassen's theory of asymptotic spectra of Chapter 2. Thus this chapter also serves as an illustration of the theory of asymptotic spectra.

To state the theorem we need the standard notions graph homomorphism, graph complement and graph disjoint union. Let $G$ and $H$ be graphs. A graph homomorphism $f : G \to H$ is a map $f : V(G) \to V(H)$ such that for all $u, v \in V(G)$, if $\{u, v\} \in E(G)$, then $\{f(u), f(v)\} \in E(H)$. In other words, a graph homomorphism maps edges to edges. The complement $\overline{G}$ of $G$ is defined by

$$V(\overline{G}) = V(G), \qquad E(\overline{G}) = \bigl\{\{u, v\} : \{u, v\} \notin E(G),\ u \neq v\bigr\}.$$

We define a relation $\leqslant$ on graphs: let $G \leqslant H$ if there is a graph homomorphism $\overline{G} \to \overline{H}$ from the complement of $G$ to the complement of $H$. The disjoint union $G \sqcup H$ is defined by

$$V(G \sqcup H) = V(G) \sqcup V(H), \qquad E(G \sqcup H) = E(G) \sqcup E(H).$$

For $n \in \mathbb{N}$, the complete graph $K_n$ is the graph with $V(K_n) = [n] = \{1, 2, \ldots, n\}$ and $E(K_n) = \{\{i, j\} : i, j \in [n],\ i \neq j\}$. Thus $\overline{K_0} = K_0$ is the empty graph and $\overline{K_1} = K_1$ is the graph consisting of a single vertex and no edges.
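The relation $\leqslant$ can be tested by brute force on small graphs. The sketch below assumes the same ad hoc encoding of a graph as a pair (vertex list, set of two-element frozensets); the helper names are hypothetical, and the search over all vertex maps is exponential, so this is only meant for toy examples.

```python
from itertools import combinations, product


def complement(G):
    """Graph complement: same vertices, exactly the non-edges as edges."""
    V, E = G
    return V, {frozenset(p) for p in combinations(V, 2)} - E


def empty_graph(n):
    """The graph with n vertices and no edges, i.e. the complement of K_n."""
    return list(range(n)), set()


def has_homomorphism(G, H):
    """Search all maps V(G) -> V(H) for one that sends edges to edges."""
    (VG, EG), (VH, EH) = G, H
    for image in product(VH, repeat=len(VG)):
        f = dict(zip(VG, image))
        if all(frozenset((f[u], f[v])) in EH for u, v in map(tuple, EG)):
            return True
    return False


def leq(G, H):
    """G <= H: there is a homomorphism complement(G) -> complement(H)."""
    return has_homomorphism(complement(G), complement(H))


C5 = (list(range(5)), {frozenset((i, (i + 1) % 5)) for i in range(5)})

print(leq(empty_graph(2), empty_graph(3)), leq(empty_graph(3), empty_graph(2)))
print(leq(empty_graph(2), C5), leq(empty_graph(3), C5))
print(leq(C5, empty_graph(3)), leq(C5, empty_graph(2)))
```

In this example the largest $m$ with $\overline{K_m} \leqslant C_5$ is 2 and the smallest $n$ with $C_5 \leqslant \overline{K_n}$ is 3, matching the independence number and the clique cover number of $C_5$.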

Theorem 3.1. Let $S \subseteq \{\text{graphs}\}$ be a collection of graphs which is closed under the disjoint union $\sqcup$ and the strong graph product $\boxtimes$, and which contains the graph with a single vertex, $K_1$. Define the asymptotic spectrum $\mathbf{X}(S)$ as the set of all maps $\phi : S \to \mathbb{R}_{\ge 0}$ such that, for all $G, H \in S$:

(1) if $G \leqslant H$, then $\phi(G) \le \phi(H)$

(2) $\phi(G \sqcup H) = \phi(G) + \phi(H)$

(3) $\phi(G \boxtimes H) = \phi(G)\phi(H)$

(4) $\phi(K_1) = 1$.

Let $G \lesssim H$ if there is a sequence $(x_N) \in \mathbb{N}^{\mathbb{N}}$ with $x_N^{1/N} \to 1$ when $N \to \infty$ such that for every $N \in \mathbb{N}$

$$G^{\boxtimes N} \leqslant (H^{\boxtimes N})^{\sqcup x_N} = \underbrace{H^{\boxtimes N} \sqcup \cdots \sqcup H^{\boxtimes N}}_{x_N}.$$

Then

(i) $G \lesssim H$ iff $\forall \phi \in \mathbf{X}(S)\colon\ \phi(G) \le \phi(H)$

(ii) $\Theta(G) = \min_{\phi \in \mathbf{X}(S)} \phi(G)$.

Statement (ii) of Theorem 3.1 is nontrivial in the sense that $\Theta$ is not an element of $\mathbf{X}(\{\text{graphs}\})$. Namely, $\Theta$ is not additive under $\sqcup$ by a result of Alon [Alo98], and $\Theta$ is not multiplicative under $\boxtimes$ by a result of Haemers [Hae79]. It turns out that the graph parameter $G \mapsto \max_{\phi \in \mathbf{X}(\{\text{graphs}\})} \phi(G)$ is itself an element of $\mathbf{X}(\{\text{graphs}\})$, and is equal to the fractional clique cover number $\overline{\chi}_f$ (see Section 3.3.2 and e.g. [Sch03, Eq. (67.112)]). Fritz in [Fri17] proves (independently of Strassen's line of work) a statement that is weaker than Theorem 3.1. Namely, he proves the statement of Theorem 3.1 without the additivity condition (2).

In Section 3.2 we will prove Theorem 3.1 by applying the theory of asymptotic spectra of Chapter 2 to the appropriate semiring and preorder. In Section 3.3 we will discuss the elements in the asymptotic spectrum of graphs $\mathbf{X}(\{\text{graphs}\})$ that are currently known to me: the Lovász theta number, the fractional clique cover number, the fractional orthogonal rank of the complement, and the fractional Haemers bounds. We moreover prove a sufficient condition for the "fractionalisation" of a graph parameter to be in the asymptotic spectrum of graphs.

3.2 The asymptotic spectrum of graphs

In this section we prove Theorem 3.1 by applying the theory of asymptotic spectra to the appropriate semiring.

3.2.1 The semiring of graph isomorphism classes $\mathcal{G}$

A graph homomorphism $f : G \to H$ is a graph isomorphism if $f$ is bijective as a map $V(G) \to V(H)$ and bijective as a map $E(G) \to E(H)$. We write $G \cong H$ if there is a graph isomorphism $f : G \to H$. The relation $\cong$ is an equivalence relation on $\{\text{graphs}\}$, which we call isomorphism. For example, the graphs $G$ and $H$ given by

$$V(G) = \{a, b, c, d\}, \qquad E(G) = \bigl\{\{a, b\}, \{b, c\}, \{c, d\}, \{a, d\}\bigr\},$$
$$V(H) = \{1, 2, 3, 4\}, \qquad E(H) = \bigl\{\{1, 3\}, \{2, 3\}, \{2, 4\}, \{1, 4\}\bigr\}$$

are isomorphic. Let $\mathcal{G} = \{\text{graphs}\}/{\cong}$ be the set of equivalence classes in $\{\text{graphs}\}$ under $\cong$, i.e. the isomorphism classes. The relation $\leqslant$ is a preorder on $\mathcal{G}$. Recall that $K_n$ is the complete graph on $n$ vertices, and thus $\overline{K_n}$ is the graph with $n$ vertices and no edges.
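The isomorphism claimed for the example graphs $G$ and $H$ can be checked mechanically. The sketch below enumerates all bijections $V(G) \to V(H)$ and keeps those that map $E(G)$ onto $E(H)$; the function name and the encoding of edges as frozensets are choices made here for illustration.

```python
from itertools import permutations

VG = ["a", "b", "c", "d"]
EG = {frozenset(e) for e in [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")]}
VH = [1, 2, 3, 4]
EH = {frozenset(e) for e in [(1, 3), (2, 3), (2, 4), (1, 4)]}


def isomorphisms(VG, EG, VH, EH):
    """Yield every bijection V(G) -> V(H) that maps E(G) onto E(H)."""
    if len(VG) != len(VH):
        return
    for image in permutations(VH):
        f = dict(zip(VG, image))
        mapped = {frozenset((f[u], f[v])) for u, v in map(tuple, EG)}
        if mapped == EH:
            yield f


found = list(isomorphisms(VG, EG, VH, EH))
print(len(found), found[0])
```

Both graphs are 4-cycles, so several bijections qualify; one of them is $a \mapsto 1$, $b \mapsto 3$, $c \mapsto 2$, $d \mapsto 4$.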

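Lemma 3.2. Let $A, B, C \in \{\text{graphs}\}$.

(i) $\sqcup$ and $\boxtimes$ are commutative and associative operations on $\mathcal{G}$.

(ii) $\boxtimes$ distributes over $\sqcup$ on $\mathcal{G}$, i.e. $A \boxtimes (B \sqcup C) = (A \boxtimes B) \sqcup (A \boxtimes C)$.

(iii) $\overline{K_1} \boxtimes A = A$

(iv) $\overline{K_0} \boxtimes A = \overline{K_0}$

(v) $\overline{K_0} \sqcup A = A$

(vi) $\overline{K_n} \sqcup \overline{K_m} = \overline{K_{n+m}}$.

Proof. We leave the proof to the reader.

In other words, Lemma 3.2 says that $(\mathcal{G}, \sqcup, \boxtimes, \overline{K_0}, \overline{K_1})$ is a (commutative) semiring in which the elements $\overline{K_0}, \overline{K_1}, \overline{K_2}, \ldots$ behave like the natural numbers $\mathbb{N}$. We will denote this semiring simply by $\mathcal{G}$.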

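3.2.2 Strassen preorder via graph homomorphisms

Let $\mathcal{G}$ be the semiring of graphs. Recall that $G \leqslant H$ if there is a graph homomorphism $f : \overline{G} \to \overline{H}$.

Lemma 3.3. The preorder $\leqslant$ is a Strassen preorder on $\mathcal{G}$. That is, for graphs $A, B, C, D \in \mathcal{G}$ we have the following.

(i) For $n, m \in \mathbb{N}$, $\overline{K_n} \leqslant \overline{K_m}$ iff $n \le m$.

(ii) If $A \leqslant B$ and $C \leqslant D$, then $A \sqcup C \leqslant B \sqcup D$ and $A \boxtimes C \leqslant B \boxtimes D$.

(iii) For $A, B \in \mathcal{G}$, if $B \neq \overline{K_0}$, then there is an $r \in \mathbb{N}$ with $A \leqslant \overline{K_r} \boxtimes B$.

Proof. Statement (i) is easy to verify. We prove (ii). Let $f : \overline{A} \to \overline{B}$ and $g : \overline{C} \to \overline{D}$ be graph homomorphisms. Let the map $f \sqcup g : V(A) \sqcup V(C) \to V(B) \sqcup V(D)$ be defined by

$$(f \sqcup g)(a) = f(a) \quad\text{for } a \in V(A),$$
$$(f \sqcup g)(c) = g(c) \quad\text{for } c \in V(C).$$

One verifies directly that $f \sqcup g$ is a graph homomorphism $\overline{A \sqcup C} \to \overline{B \sqcup D}$. Let the map $f \boxtimes g : V(A) \times V(C) \to V(B) \times V(D)$ be defined by

$$(f \boxtimes g)(a, c) = (f(a), g(c)).$$

One verifies directly that $f \boxtimes g$ is a graph homomorphism $\overline{A \boxtimes C} \to \overline{B \boxtimes D}$. This proves (ii). We prove (iii). Let $r = |V(A)|$. Then $A \leqslant \overline{K_r}$. By assumption $B \neq \overline{K_0}$, so $\overline{K_1} \leqslant B$. Therefore $A \leqslant \overline{K_r} \cong \overline{K_r} \boxtimes \overline{K_1} \leqslant \overline{K_r} \boxtimes B$. This proves (iii).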

3.2.3 The asymptotic spectrum of graphs $\mathbf{X}(\mathcal{G})$

We thus have a semiring $\mathcal{G}$ with a Strassen preorder $\leqslant$. We are therefore in the position to apply the theory of asymptotic spectra (Chapter 2). Let us translate the abstract terminology to this setting.

Let $G \lesssim H$ if there is a sequence $(x_N) \in \mathbb{N}^{\mathbb{N}}$ with $(x_N)^{1/N} \to 1$ such that for every $N \in \mathbb{N}$ we have $G^{\boxtimes N} \leqslant H^{\boxtimes N} \boxtimes \overline{K_{x_N}}$, i.e. $G^{\boxtimes N} \leqslant (H^{\boxtimes N})^{\sqcup x_N}$.

Let $S \subseteq \mathcal{G}$ be a subsemiring. For example, one may take $S = \mathcal{G}$, or one may choose any set $X \subseteq \mathcal{G}$ and let $S = \mathbb{N}[X]$ be the subsemiring of $\mathcal{G}$ generated by $X$ under $\sqcup$ and $\boxtimes$.

The asymptotic spectrum of $S$ is the set $\mathbf{X}(S)$ of $\leqslant$-monotone semiring homomorphisms $S \to \mathbb{R}_{\ge 0}$, i.e. all maps $\phi : S \to \mathbb{R}_{\ge 0}$ such that, for all $G, H \in S$:

(1) if $G \leqslant H$, then $\phi(G) \le \phi(H)$

(2) $\phi(G \sqcup H) = \phi(G) + \phi(H)$

(3) $\phi(G \boxtimes H) = \phi(G)\phi(H)$

(4) $\phi(K_1) = 1$.

We call $\mathbf{X}(\mathcal{G})$ the asymptotic spectrum of graphs.

Theorem 3.4. Let $G, H \in S$. Then $G \lesssim H$ iff $\forall \phi \in \mathbf{X}(S)\colon\ \phi(G) \le \phi(H)$.

Proof. By Lemma 3.2 we have a semiring $S$, and by Lemma 3.3 we have a Strassen preorder $\leqslant$, so we may apply Theorem 2.12.

We refer to Chapter 2 for a discussion of the topological properties of $\mathbf{X}(S)$.

3.2.4 Shannon capacity $\Theta$

Let us discuss the (asymptotic) rank and (asymptotic) subrank for $(\mathcal{G}, \leqslant)$. Recall that an independent set in $G$ is a subset of $V(G)$ that contains no edges, and the independence number $\alpha(G)$ is the cardinality of the largest independent set in $G$. A colouring of $G$ is an assignment of colours to the elements of $V(G)$ such that connected vertices get distinct colours. The chromatic number $\chi(G)$ is the smallest number of colours in any colouring of $G$. The clique cover number $\overline{\chi}(G)$ is defined as the chromatic number of the complement, $\overline{\chi}(G) = \chi(\overline{G})$.

For the semiring $\mathcal{G}$ with preorder $\leqslant$, the abstract definition of subrank of Section 2.8 becomes $Q(G) = \max\{m \in \mathbb{N} : \overline{K_m} \leqslant G\}$, and the abstract definition of rank becomes $R(G) = \min\{n \in \mathbb{N} : G \leqslant \overline{K_n}\}$.

Lemma 3.5.

(i) $\alpha(G) = Q(G)$

(ii) $\overline{\chi}(G) = R(G)$.

Proof. We leave the proof to the reader.

We see directly that the asymptotic subrank is the Shannon capacity,

$$\underset{\sim}{Q}(G) = \lim_{N\to\infty} Q(G^{\boxtimes N})^{1/N} = \lim_{N\to\infty} \alpha(G^{\boxtimes N})^{1/N} = \Theta(G),$$

and that the asymptotic rank is the asymptotic clique cover number,

$$\underset{\sim}{R}(G) = \lim_{N\to\infty} R(G^{\boxtimes N})^{1/N} = \lim_{N\to\infty} \overline{\chi}(G^{\boxtimes N})^{1/N} = \underset{\sim}{\overline{\chi}}(G).$$

Let $S \subseteq \mathcal{G}$ be a subsemiring. Let $G \in S$.

Corollary 3.6. $\Theta(G) = \min_{\phi \in \mathbf{X}(S)} \phi(G)$.

Proof. Let $G$ be a graph. Either $G = \overline{K_0}$, or $\overline{K_1} \leqslant G \leqslant \overline{K_1}$, or $\overline{G}$ contains at least one edge. In the first two cases the claim is clearly true. In the third case $G \geqslant \overline{K_2}$, and we may thus apply Corollary 2.14.

Corollary 3.7. $\underset{\sim}{\overline{\chi}}(G) = \max_{\phi \in \mathbf{X}(S)} \phi(G)$.

Proof. This is Corollary 2.13.

Remark 3.8. As mentioned earlier, it turns out that $\underset{\sim}{\overline{\chi}}$ is in fact itself an element of $\mathbf{X}(\mathcal{G})$. See Section 3.3.2. (This is a striking difference with the situation for tensors, which we will discuss in Chapter 4; there both asymptotic rank and asymptotic subrank are not in the asymptotic spectrum, see Remark 4.4.)

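Shannon capacity is not in the asymptotic spectrum

Lemma 3.9. $G \boxtimes \overline{G} \geqslant \overline{K_{|V(G)|}}$.

Proof. Let $D = \{(u, u) : u \in V(G)\}$. Let $(u, u), (v, v) \in D$ be distinct. Then either $\{u, v\} \in E(G)$ or $\{u, v\} \in E(\overline{G})$ (exclusive or), and so $\{(u, u), (v, v)\} \notin E(G \boxtimes \overline{G})$. Therefore the subgraph of $G \boxtimes \overline{G}$ induced by $D$ is isomorphic to $\overline{K_{|V(G)|}}$.

Example 3.10. Let $G$ be the Schläfli graph. This is a graph with 27 vertices. Thus $\Theta(G \boxtimes \overline{G}) \ge |V(G)| = 27$. On the other hand, Haemers in [Hae79] showed that $\Theta(G)\Theta(\overline{G}) \le 21$. This implies that the map $\Theta$ is not in $\mathbf{X}(\mathcal{G})$, since it is not multiplicative under $\boxtimes$.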
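The diagonal argument of Lemma 3.9 can be checked on a small graph. The sketch below uses $C_5$ instead of the Schläfli graph, purely as a small stand-in chosen here, and tests adjacency in the strong product directly from the definition without building the product graph; the helper names are hypothetical.

```python
from itertools import combinations


def adjacent_or_equal(E, u, v):
    """True if u = v or {u, v} is an edge in the edge set E."""
    return u == v or frozenset((u, v)) in E


def strong_adjacent(EG, EH, p, q):
    """Adjacency of vertices p = (g, h) and q = (g2, h2) in G ⊠ H."""
    (g, h), (g2, h2) = p, q
    return p != q and adjacent_or_equal(EG, g, g2) and adjacent_or_equal(EH, h, h2)


V = list(range(5))
E = {frozenset((i, (i + 1) % 5)) for i in V}             # edges of C_5
E_bar = {frozenset(p) for p in combinations(V, 2)} - E   # edges of its complement

# The diagonal D = {(u, u)} should contain no edge of C_5 ⊠ complement(C_5).
diagonal = [(u, u) for u in V]
is_independent = not any(
    strong_adjacent(E, E_bar, p, q) for p, q in combinations(diagonal, 2)
)
print(len(diagonal), is_independent)
```

For any two distinct vertices $u, v$, exactly one of $E$ and `E_bar` contains $\{u, v\}$, so at least one of the two coordinate conditions fails, which is the exclusive-or step in the proof.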


33 Universal spectral points

The abstract theory of asymptotic spectra of Chapter 2 does not explicitly describethe elements of X(G) ie the universal spectral points (cf Section 213) Howeverseveral graph parameters from the literature can be shown to be universal spectralpoints In fact recently in [BC18] the first infinite family of universal spectralpoints was found the fractional Haemers bounds We give a brief (and probablyincomplete) overview of currently known elements in X(G)

331 Lovasz theta number ϑ

For any real symmetric matrix A let Λ(A) be the largest eigenvalue The Lovasztheta number ϑ(G) is defined as

ϑ(G) = minΛ(A) A isin RV (G)timesV (G) symmetric u v 6isin E(G)rArr Auv = 1

The parameter ϑ(G) was introduced by Lovasz in [Lov79] We refer to [Knu94]and [Sch03] for a survey It follows from well-known properties that ϑ isin X(G)

3.3.2 Fractional graph parameters

Besides the Lovász theta number, there are several elements in $X(\mathcal{G})$ that are naturally obtained as fractional versions of $\boxtimes$-submultiplicative, $\sqcup$-subadditive, $\le$-monotone maps $\mathcal{G} \to \mathbb{R}_{\ge 0}$. For any map $\phi : \mathcal{G} \to \mathbb{R}_{\ge 0}$ we define a fractional version $\phi_f$ by

$$\phi_f(G) = \inf_d \frac{\phi(G \ltimes \overline{K_d})}{d}.$$

We will discuss several fractional parameters from the literature and prove a general theorem about fractional parameters.

Fractional clique cover number

We consider the fractional version of the clique cover number $\overline{\chi}(G) = \chi(\overline{G})$. It is well-known that $\overline{\chi}_f \in X(\mathcal{G})$; see e.g. [Sch03]. The fractional clique cover number $\overline{\chi}_f$ in fact equals the asymptotic clique cover number $\tilde{\overline{\chi}}(G) = \lim_{N \to \infty} \overline{\chi}(G^{\boxtimes N})^{1/N}$, which we introduced in the previous section; see [MP71] and also [Sch03, Th. 67.17].

Fractional Haemers bound

Let $\operatorname{rank}(A)$ denote the matrix rank of any matrix $A$. For any set $C$ of matrices define $\operatorname{rank}(C) = \min\{\operatorname{rank}(A) : A \in C\}$. For a field $\mathbb{F}$ and a graph $G$ define the set of matrices

$$M^{\mathbb{F}}(G) = \{A \in \mathbb{F}^{V(G) \times V(G)} : \forall u \neq v,\ A_{vv} \neq 0,\ \{u,v\} \notin E(G) \Rightarrow A_{uv} = 0\}.$$

Let $R^{\mathbb{F}}(G) = \operatorname{rank}(M^{\mathbb{F}}(G))$. The parameter $R^{\mathbb{F}}(G)$ was introduced by Haemers in [Hae79] and is known as the Haemers bound. The fractional Haemers bound $R^{\mathbb{F}}_f$ was studied by Anna Blasiak in [Bla13] and was recently shown to be $\boxtimes$-multiplicative by Bukh and Cox in [BC18]. From this it is not hard to prove that $R^{\mathbb{F}}_f \in X(\mathcal{G})$. Bukh and Cox in [BC18] furthermore prove a separation result: for any field $\mathbb{F}$ of nonzero characteristic and any $\varepsilon > 0$, there is a graph $G$ such that for any field $\mathbb{F}'$ with $\operatorname{char}(\mathbb{F}) \neq \operatorname{char}(\mathbb{F}')$ the inequality $R^{\mathbb{F}}_f(G) < \varepsilon\, R^{\mathbb{F}'}_f(G)$ holds. This separation result implies that there are infinitely many elements in $X(\mathcal{G})$.

Fractional orthogonal rank

In [CMR+14] the orthogonal rank $\xi(G)$ and its fractional version, the projective rank $\xi_f(G)$, are studied. It easily follows from results in [CMR+14] that $G \mapsto \xi_f(G)$ is in $X(\mathcal{G})$.

General fractional parameters

We will prove something general about fractional parameters. Define the lexicographic product $G \ltimes H$ by

$$V(G \ltimes H) = V(G) \times V(H),$$
$$E(G \ltimes H) = \{\{(g,h),(g',h')\} : \{g,g'\} \in E(G) \text{ or } (g = g' \text{ and } \{h,h'\} \in E(H))\}.$$

The lexicographic product satisfies $\overline{G \ltimes H} = \overline{G} \ltimes \overline{H}$. Also define the or-product $G * H$ by

$$V(G * H) = V(G) \times V(H),$$
$$E(G * H) = \{\{(g,h),(g',h')\} : \{g,g'\} \in E(G) \text{ or } \{h,h'\} \in E(H)\}.$$

The or-product and the strong graph product are related by $\overline{G * H} = \overline{G} \boxtimes \overline{H}$. The strong graph product gives a subgraph of the lexicographic product, which gives a subgraph of the or-product:

$$G \boxtimes H \subseteq G \ltimes H \subseteq G * H.$$

Therefore $G * H \le G \ltimes H \le G \boxtimes H$. Finally, $G \ltimes \overline{K_d} = G * \overline{K_d}$ and of course $G \boxtimes \overline{K_d} = G^{\sqcup d}$.
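The inclusions and complement identities for the three products are easy to confirm by brute force on small graphs. The sketch below is one way to do so, assuming the path on three vertices for both factors (an illustrative choice); all function names are hypothetical.

```python
from itertools import combinations, product


def complement(n, edges):
    """Complement of a simple graph on vertex set range(n)."""
    return {frozenset(p) for p in combinations(range(n), 2)} - edges


def adj(a, b, edges):
    """Adjacency test; a vertex is never adjacent to itself."""
    return frozenset((a, b)) in edges


def strong(x, y, eg, eh):
    return (x[0] == y[0] or adj(x[0], y[0], eg)) and (x[1] == y[1] or adj(x[1], y[1], eh))


def lexicographic(x, y, eg, eh):
    return adj(x[0], y[0], eg) or (x[0] == y[0] and adj(x[1], y[1], eh))


def or_product(x, y, eg, eh):
    return adj(x[0], y[0], eg) or adj(x[1], y[1], eh)


def product_edges(n, eg, m, eh, rule):
    """Edge set on V(G) x V(H) selected by the adjacency rule."""
    verts = list(product(range(n), range(m)))
    return {frozenset((x, y)) for x, y in combinations(verts, 2) if rule(x, y, eg, eh)}


# Illustrative choice (an assumption): G = H = path on three vertices.
n = m = 3
EG = EH = {frozenset((0, 1)), frozenset((1, 2))}
EG_bar, EH_bar = complement(n, EG), complement(m, EH)

S = product_edges(n, EG, m, EH, strong)
L = product_edges(n, EG, m, EH, lexicographic)
O = product_edges(n, EG, m, EH, or_product)
all_pairs = {frozenset(p) for p in combinations(list(product(range(n), range(m))), 2)}

chain_holds = S <= L <= O
lex_complement_ok = (all_pairs - L) == product_edges(n, EG_bar, m, EH_bar, lexicographic)
or_complement_ok = (all_pairs - O) == product_edges(n, EG_bar, m, EH_bar, strong)
print(chain_holds, lex_complement_ok, or_complement_ok)
```

For this pair of graphs both inclusions are strict, which shows that the three products genuinely differ.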

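We will prove: if $\phi : \mathcal{G} \to \mathbb{R}_{\ge 0}$ is $\boxtimes$-submultiplicative, $\sqcup$-subadditive and $\le$-monotone, then $\phi_f$ is again $\boxtimes$-submultiplicative, $\sqcup$-subadditive and $\le$-monotone. Moreover, if $\phi : \mathcal{G} \to \mathbb{N}$ is $\le$-monotone and satisfies

$$\forall G, H \in \mathcal{G}:\ \phi(G \ltimes H) \ge \phi(G \ltimes \overline{K_{\phi(H)}}),$$

then $\phi_f$ is $\ltimes$-supermultiplicative and, more importantly, $\phi_f$ is $\boxtimes$-supermultiplicative.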

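Lemma 3.11.

(i) If $\phi$ is $\sqcup$-superadditive, then $\phi_f$ is $\sqcup$-superadditive.

(ii) If $\phi$ is $\le$-monotone, then $\phi_f$ is $\le$-monotone.

(iii) If $\phi$ is $\sqcup$-subadditive and $\le$-monotone, then $\phi_f$ is $\sqcup$-subadditive.

(iv) If $\forall n \in \mathbb{N}:\ \phi(\overline{K_n}) = n$, then $\forall n \in \mathbb{N}:\ \phi_f(\overline{K_n}) = n$.

(v) If $\phi$ is $\boxtimes$-submultiplicative and $\le$-monotone, then $\phi_f$ is $\boxtimes$-submultiplicative.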

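Proof. Let $G, H \in \mathcal{G}$. Let $d \in \mathbb{N}$.

(i) The lexicographic product distributes over the disjoint union:

$$(G \sqcup H) \ltimes \overline{K_d} = (G \ltimes \overline{K_d}) \sqcup (H \ltimes \overline{K_d}).$$

By superadditivity,

$$\phi((G \ltimes \overline{K_d}) \sqcup (H \ltimes \overline{K_d})) \ge \phi(G \ltimes \overline{K_d}) + \phi(H \ltimes \overline{K_d}).$$

Therefore

\begin{align*}
\phi_f(G \sqcup H) &= \inf_d \frac{\phi((G \sqcup H) \ltimes \overline{K_d})}{d} \\
&= \inf_d \frac{\phi((G \ltimes \overline{K_d}) \sqcup (H \ltimes \overline{K_d}))}{d} \\
&\ge \inf_d \left( \frac{\phi(G \ltimes \overline{K_d})}{d} + \frac{\phi(H \ltimes \overline{K_d})}{d} \right) \\
&\ge \inf_{d_1} \frac{\phi(G \ltimes \overline{K_{d_1}})}{d_1} + \inf_{d_2} \frac{\phi(H \ltimes \overline{K_{d_2}})}{d_2} \\
&= \phi_f(G) + \phi_f(H).
\end{align*}

(ii) Let $G \le H$. Then $G \ltimes \overline{K_d} \le H \ltimes \overline{K_d}$. Thus $\phi(G \ltimes \overline{K_d}) \le \phi(H \ltimes \overline{K_d})$. Therefore $\phi_f(G) \le \phi_f(H)$.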

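(iii) We have $G \ltimes \overline{K_d} \le G \boxtimes \overline{K_d} = G^{\sqcup d}$. Thus by monotonicity and subadditivity

$$\phi(G \ltimes \overline{K_d}) \le d\,\phi(G),$$

and for $d, e \in \mathbb{N}$,

$$\phi(G \ltimes \overline{K_{de}}) = \phi((G \ltimes \overline{K_d}) \ltimes \overline{K_e}) \le e\,\phi(G \ltimes \overline{K_d}).$$

We use this inequality to get, for $d_1, d_2 \in \mathbb{N}$,

$$\frac{\phi(G \ltimes \overline{K_{d_1}})}{d_1} + \frac{\phi(H \ltimes \overline{K_{d_2}})}{d_2} \ge \frac{\phi(G \ltimes \overline{K_{d_1 d_2}}) + \phi(H \ltimes \overline{K_{d_1 d_2}})}{d_1 d_2}.$$

From subadditivity follows

\begin{align*}
\frac{\phi(G \ltimes \overline{K_{d_1 d_2}}) + \phi(H \ltimes \overline{K_{d_1 d_2}})}{d_1 d_2}
&\ge \frac{\phi((G \ltimes \overline{K_{d_1 d_2}}) \sqcup (H \ltimes \overline{K_{d_1 d_2}}))}{d_1 d_2} \\
&= \frac{\phi((G \sqcup H) \ltimes \overline{K_{d_1 d_2}})}{d_1 d_2} \\
&\ge \phi_f(G \sqcup H).
\end{align*}

We conclude $\phi_f(G) + \phi_f(H) \ge \phi_f(G \sqcup H)$.

(iv) Let $n \in \mathbb{N}$. Then $\phi_f(\overline{K_n}) = \inf_d \phi(\overline{K_n} \ltimes \overline{K_d})/d = \inf_d \phi(\overline{K_{nd}})/d = n$.

(v) Let $d_1, d_2 \in \mathbb{N}$. We claim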

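$$(G \boxtimes H) \ltimes \overline{K_{d_1 d_2}} \le (G \ltimes \overline{K_{d_1}}) \boxtimes (H \ltimes \overline{K_{d_2}}).$$

This is the same as saying there is a graph homomorphism

$$\overline{(G \boxtimes H) \ltimes \overline{K_{d_1 d_2}}} \to \overline{(G \ltimes \overline{K_{d_1}}) \boxtimes (H \ltimes \overline{K_{d_2}})},$$

which is the same as saying there is a graph homomorphism

$$(\overline{G} * \overline{H}) \ltimes K_{d_1 d_2} \to (\overline{G} \ltimes K_{d_1}) * (\overline{H} \ltimes K_{d_2}),$$

where $*$ denotes the or-product of graphs. One verifies that $(g, h, (i,j)) \mapsto ((g,i),(h,j))$ is such a graph homomorphism, proving the claim. The claim together with monotonicity and submultiplicativity gives

$$\phi((G \boxtimes H) \ltimes \overline{K_{d_1 d_2}}) \le \phi((G \ltimes \overline{K_{d_1}}) \boxtimes (H \ltimes \overline{K_{d_2}})) \le \phi(G \ltimes \overline{K_{d_1}})\,\phi(H \ltimes \overline{K_{d_2}}).$$

Therefore

\begin{align*}
\phi_f(G \boxtimes H) &= \inf_d \frac{\phi((G \boxtimes H) \ltimes \overline{K_d})}{d} \\
&= \inf_{d_1, d_2} \frac{\phi((G \boxtimes H) \ltimes \overline{K_{d_1 d_2}})}{d_1 d_2} \\
&\le \inf_{d_1, d_2} \frac{\phi(G \ltimes \overline{K_{d_1}})}{d_1} \cdot \frac{\phi(H \ltimes \overline{K_{d_2}})}{d_2} \\
&= \phi_f(G)\,\phi_f(H).
\end{align*}

This concludes the proof of the lemma.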

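Lemma 3.12. Let $\phi : \mathcal{G} \to \mathbb{N}$ satisfy

$$\forall G, H \in \mathcal{G}:\ \phi(G \ltimes H) \ge \phi(G \ltimes \overline{K_{\phi(H)}}). \tag{3.2}$$

Then

$$\inf_H \frac{\phi(G \ltimes H)}{\phi(H)} = \inf_d \frac{\phi(G \ltimes \overline{K_d})}{d}.$$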

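Proof. From (3.2) follows

$$\frac{\phi(G \ltimes H)}{\phi(H)} \ge \frac{\phi(G \ltimes \overline{K_{\phi(H)}})}{\phi(H)},$$

and so

$$\frac{\phi(G \ltimes H)}{\phi(H)} \ge \inf_d \frac{\phi(G \ltimes \overline{K_d})}{d}.$$

We take the infimum over $H$ to get

$$\inf_H \frac{\phi(G \ltimes H)}{\phi(H)} \ge \inf_d \frac{\phi(G \ltimes \overline{K_d})}{d}.$$

The inequality in the other direction,

$$\inf_H \frac{\phi(G \ltimes H)}{\phi(H)} \le \inf_d \frac{\phi(G \ltimes \overline{K_d})}{d},$$

is trivially true.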

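Lemma 3.13. Let $\phi : \mathcal{G} \to \mathbb{N}$ be $\le$-monotone and satisfy

$$\forall G, H \in \mathcal{G}:\ \phi(G \ltimes H) \ge \phi(G \ltimes \overline{K_{\phi(H)}}).$$

Then $\phi_f$ is $\ltimes$- and $\boxtimes$-supermultiplicative.

Proof. Let $A, B \in \mathcal{G}$. We have $A \boxtimes B \ge A \ltimes B$, so

$$\phi_f(A \boxtimes B) \ge \phi_f(A \ltimes B).$$

It remains to show $\phi_f(A \ltimes B) \ge \phi_f(A)\,\phi_f(B)$. We have

$$\frac{\phi(A \ltimes B \ltimes H)}{\phi(H)} = \frac{\phi(A \ltimes (B \ltimes H))}{\phi(B \ltimes H)} \cdot \frac{\phi(B \ltimes H)}{\phi(H)},$$

which implies

$$\frac{\phi(A \ltimes B \ltimes H)}{\phi(H)} \ge \inf_{H'} \frac{\phi(A \ltimes H')}{\phi(H')} \cdot \inf_{H''} \frac{\phi(B \ltimes H'')}{\phi(H'')} = \phi_f(A)\,\phi_f(B),$$

where the last equality is Lemma 3.12. Take the infimum over $H$ to obtain $\phi_f(A \ltimes B) \ge \phi_f(A)\,\phi_f(B)$.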

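Theorem 3.14. Let $\phi : \mathcal{G} \to \mathbb{N}$ be $\sqcup$-additive, $\boxtimes$-submultiplicative, $\le$-monotone and $\overline{K_n}$-normalised, and satisfy

$$\forall G, H \in \mathcal{G}:\ \phi(G \ltimes H) \ge \phi(G \ltimes \overline{K_{\phi(H)}}).$$

Then $\phi_f$ is in $X(\mathcal{G})$.

Proof. This follows from Lemma 3.11, Lemma 3.12 and Lemma 3.13.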

3.4 Conclusion

In this chapter we introduced a new connection between Strassen's theory of asymptotic spectra and the Shannon capacity of graphs. In particular, we characterised the Shannon capacity (which is defined as a supremum) as a minimisation over elements in the asymptotic spectrum of graphs. Known elements in the asymptotic spectrum of graphs include the fractional clique cover number, the Lovász theta number, the projective rank and the fractional Haemers bound. We are left with a clear goal for future work: find all elements in the asymptotic spectrum of graphs.

Chapter 4

The asymptotic spectrum of tensors; exponent of matrix multiplication

This chapter is based on joint work with Matthias Christandl and Péter Vrana [CVZ18].

4.1 Introduction

This chapter is about tensors $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ and their asymptotic properties. The theory of asymptotic spectra of Chapter 2 was developed by Strassen exactly for the purpose of understanding the asymptotic properties of tensors. This chapter is expository and provides the necessary background for understanding Chapter 5 and Chapter 6.

Let us first define the asymptotic properties of interest and discuss some of their applications. We need the concepts restriction, tensor product and diagonal tensor. Let $s \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ and $t \in \mathbb{F}^{m_1} \otimes \cdots \otimes \mathbb{F}^{m_k}$ be tensors. We say $s$ restricts to $t$, and write $s \ge t$, if there are linear maps $A_i : \mathbb{F}^{n_i} \to \mathbb{F}^{m_i}$ such that $t = (A_1 \otimes \cdots \otimes A_k) \cdot s$. The tensor product of $s$ and $t$ is the element $s \otimes t \in \mathbb{F}^{n_1 m_1} \otimes \cdots \otimes \mathbb{F}^{n_k m_k}$ with coordinates $(s \otimes t)_{ij} = s_i t_j$. We naturally define the direct sum $s \oplus t \in \mathbb{F}^{n_1 + m_1} \otimes \cdots \otimes \mathbb{F}^{n_k + m_k}$. We define the diagonal tensors $\langle n \rangle = \sum_{i=1}^{n} e_i \otimes \cdots \otimes e_i$ for $n \in \mathbb{N}$, where $e_1, \ldots, e_n$ is the standard basis of $\mathbb{F}^n$. The tensor rank $R(t)$ is the smallest number $n \in \mathbb{N}$ such that $t$ can be written as a sum of $n$ simple tensors, a simple tensor being a tensor of the form $v_1 \otimes \cdots \otimes v_k$. Equivalently, $R(t) = \min\{n \in \mathbb{N} : t \le \langle n \rangle\}$. The asymptotic rank is the regularisation $\tilde{R}(t) = \lim_{n \to \infty} R(t^{\otimes n})^{1/n}$. While tensor rank is known to be hard to compute [Has90, Shi16], we do not know whether asymptotic rank is hard to compute.

The exponent of matrix multiplication

The motivating example for studying asymptotic rank is the problem of finding the exponent of matrix multiplication $\omega$. Recall from the introduction that $\omega$ is the infimum over $a \in \mathbb{R}$ such that two $n \times n$ matrices can be multiplied using $O(n^a)$ arithmetic operations (in the algebraic circuit model). It turns out (see [BCS97]) that $\omega$ is characterised by the asymptotic rank $\tilde{R}(\langle 2,2,2 \rangle)$ of the matrix multiplication tensor

$$\langle 2,2,2 \rangle = \sum_{i,j,k \in [2]} e_{ij} \otimes e_{jk} \otimes e_{ki} \in \mathbb{F}^4 \otimes \mathbb{F}^4 \otimes \mathbb{F}^4.$$

Namely, $\tilde{R}(\langle 2,2,2 \rangle) = 2^{\omega}$. We know the trivial lower bound $2 \le \omega$; see Section 4.3. We know the (non-trivial) upper bound $\omega \le 2.3728639$, which is by Coppersmith and Winograd [CW90] and improvements by Stothers, Williams and Le Gall [Sto10, Wil12, LG14].
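Because $\tilde{R}(\langle 2,2,2 \rangle) = 2^{\omega}$, any bound on the asymptotic rank translates into a bound on $\omega$ by taking a base-2 logarithm, and conversely. The following arithmetic sketch makes the conversion explicit; the function name is hypothetical, and the inputs 7 (the tensor rank of $\langle 2,2,2 \rangle$) and 4 (its flattening rank) are the values quoted elsewhere in this chapter.

```python
import math


def omega_bound_from_rank_bound(rank_bound):
    """Bound on omega implied by a bound on the asymptotic rank of <2,2,2>,
    using asymptotic rank = 2 ** omega."""
    return math.log2(rank_bound)


# Asymptotic rank is at most tensor rank, and the tensor rank of <2,2,2> is 7.
strassen_upper = omega_bound_from_rank_bound(7)

# Asymptotic rank is at least the flattening rank 4, giving the trivial bound.
trivial_lower = omega_bound_from_rank_bound(4)

# Conversely, omega <= 2.3728639 bounds the asymptotic rank of <2,2,2>.
asymptotic_rank_upper = 2 ** 2.3728639

print(trivial_lower, strassen_upper, asymptotic_rank_upper)
```

So the quoted bounds place $\tilde{R}(\langle 2,2,2 \rangle)$ between 4 and roughly 5.18, well below the tensor rank 7.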

Asymptotic subrank and asymptotic restriction

Besides (asymptotic) rank, we naturally define subrank $Q(t) = \max\{m \in \mathbb{N} : \langle m \rangle \le t\}$ and the asymptotic subrank $\tilde{Q}(t) = \lim_{n \to \infty} Q(t^{\otimes n})^{1/n}$. Moreover, we say $s$ restricts asymptotically to $t$, written $s \gtrsim t$, if there is a sequence of natural numbers $a(n) \in o(n)$ such that for all $n \in \mathbb{N}$

$$s^{\otimes n} \otimes \langle 2 \rangle^{\otimes a(n)} \ge t^{\otimes n}.$$

One can prove (see [Str91]) that

$$s^{\otimes n} \otimes \langle 2 \rangle^{\otimes o(n)} \ge t^{\otimes n} \quad\text{iff}\quad s^{\otimes (n + o(n))} \ge t^{\otimes n}.$$

Our goal is to understand asymptotic restriction, asymptotic rank and asymptotic subrank.

More connections: quantum information, combinatorics, algebraic property testing

Besides matrix multiplication, other applications of asymptotic restriction of tensors, asymptotic rank of tensors and asymptotic subrank of tensors include: deciding the feasibility of an asymptotic transformation between pure quantum states via stochastic local operations and classical communication (SLOCC) in quantum information theory [BPR+00, DVC00, VDDMV02, HHHH09]; bounding the size of combinatorial structures like cap sets and tri-colored sum-free sets in additive combinatorics [Ede04, Tao08, ASU13, CLP17, EG17, Tao16, BCC+17, KSS16, TS16], see Chapter 5; and bounding the query complexity of certain properties in algebraic property testing [KS08, BCSX10, Sha09, BX15, HX17, FK14].

This chapter is organised as follows. In Section 4.2 we briefly discuss the semiring of tensors, the asymptotic spectrum of tensors, and asymptotic rank and subrank. In Section 4.3 we discuss the gauge points, a simple construction of finitely many elements in the asymptotic spectrum of tensors. In Section 4.4 we discuss the Strassen support functionals, a family of elements in the asymptotic spectrum of "oblique" tensors. This family is parametrised by probability distributions on $[k]$. In Section 4.5 we discuss an extension of the support functionals called the Strassen upper support functionals, which have the potential to be universal. Finally, in Section 4.6 we prove a new result: we show how asymptotic slice rank is related to the support functionals.

4.2 The asymptotic spectrum of tensors

Let us properly set up the semiring of tensors and the asymptotic spectrum. For the proofs we refer to [Str87, Str88, Str91].

4.2.1 The semiring of tensor equivalence classes $\mathcal{T}$

We begin by putting an equivalence relation on tensors. For example, we want to identify isomorphic tensors, and also, for any tensor $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$, we want to identify $t$ with $t \oplus 0$, where $0 \in \mathbb{F}^{m_1} \otimes \cdots \otimes \mathbb{F}^{m_k}$ is a zero tensor of any format.

We say $s$ is isomorphic to $t$, and write $s \cong t$, if there are bijective linear maps $A_i : \mathbb{F}^{m_i} \to \mathbb{F}^{n_i}$ such that $t = (A_1, \ldots, A_k) \cdot s$.

We say $s$ and $t$ are equivalent, and write $s \sim t$, if there are zero tensors $s_0 = 0 \in \mathbb{F}^{a_1} \otimes \cdots \otimes \mathbb{F}^{a_k}$ and $t_0 = 0 \in \mathbb{F}^{b_1} \otimes \cdots \otimes \mathbb{F}^{b_k}$ such that $s \oplus s_0 \cong t \oplus t_0$. The equivalence relation $\sim$ is in fact the equivalence relation generated by the restriction preorder $\le$.

Let $\mathcal{T}$ be the set of $\sim$-equivalence classes of $k$-tensors over $\mathbb{F}$, for some fixed $k$ and field $\mathbb{F}$. The direct sum and the tensor product naturally carry over to $\mathcal{T}$, and $\mathcal{T}$ becomes a semiring with additive unit $\langle 0 \rangle$ and multiplicative unit $\langle 1 \rangle$ (more precisely, the equivalence classes of those tensors, but we will not make this distinction).

4.2.2 Strassen preorder via restriction

Restriction $\le$ induces a partial order on $\mathcal{T}$ which behaves well with respect to the semiring operations, and naturally $n \le m$ if and only if $\langle n \rangle \le \langle m \rangle$. Therefore restriction $\le$ is a Strassen preorder on $\mathcal{T}$.

4.2.3 The asymptotic spectrum of tensors $X(\mathcal{T})$

Let $S \subseteq \mathcal{T}$ be a subsemiring. Let

$$X(S) = X(S, \le) = \{\phi \in \operatorname{Hom}(S, \mathbb{R}_{\ge 0}) : \forall a, b \in S,\ a \le b \Rightarrow \phi(a) \le \phi(b)\}.$$

We call $X(S)$ the asymptotic spectrum of $S$, and we call $X(\mathcal{T})$ the asymptotic spectrum of $k$-tensors over $\mathbb{F}$.

Theorem 4.1 ([Str88]). Let $s, t \in S$. Then $s \lesssim t$ iff $\forall \phi \in X(S):\ \phi(s) \le \phi(t)$.

Proof. This follows from Theorem 2.12.

We refer to Chapter 2 for a discussion of the topological properties of $X(S)$.

Remark 4.2. We mention that $X(S)$ may equivalently be defined with degeneration $\trianglerighteq$ instead of restriction $\ge$. Over $\mathbb{C}$, we say $f$ degenerates to $g$, written $f \trianglerighteq g$, if $f \cong f'$ and $g \cong g'$ and $g'$ is in the Euclidean closure (or equivalently Zariski closure) of the orbit $\mathrm{GL}_{n_1} \times \cdots \times \mathrm{GL}_{n_k} \cdot f'$. It is a nontrivial fact from algebraic geometry (see [Kra84, Lemma III.2.3.1] or [BCS97]) that there is a degeneration $f \trianglerighteq g$ if and only if there are matrices $A_i$ with entries polynomial in $\varepsilon$ such that $(A_1, \ldots, A_k) \cdot f = \varepsilon^d g + \varepsilon^{d+1} g_1 + \cdots + \varepsilon^{d+e} g_e$ for some elements $g_1, \ldots, g_e$. The latter definition of degeneration is valid when $\mathbb{C}$ is replaced by an arbitrary field $\mathbb{F}$, and that is how degeneration is defined for an arbitrary field. Degeneration is weaker than restriction: $f \ge g$ implies $f \trianglerighteq g$. Asymptotically, however, the notions coincide: $f \gtrsim g$ if and only if $f^{\otimes n} \otimes \langle 2 \rangle^{\otimes o(n)} \trianglerighteq g^{\otimes n}$. We mention that, analogous to restriction, degeneration gives rise to border rank and border subrank, $\underline{R}(f) = \min\{r \in \mathbb{N} : f \trianglelefteq \langle r \rangle\}$ and $\underline{Q}(f) = \max\{s \in \mathbb{N} : \langle s \rangle \trianglelefteq f\}$, respectively.

4.2.4 Asymptotic rank and asymptotic subrank

The abstract theory of asymptotic spectra characterises asymptotic subrank and asymptotic rank as follows.

Corollary 4.3. Let $S \subseteq \mathcal{T}$ be a subsemiring. Let $a \in S$. Then

$$\tilde{Q}(a) = \min_{\phi \in X(S)} \phi(a), \tag{4.1}$$

$$\tilde{R}(a) = \max_{\phi \in X(S)} \phi(a). \tag{4.2}$$

Proof. Statement (4.2) follows from Corollary 2.13, since either $a = 0$ or $a \ge 1$. For statement (4.1), if $a^{\otimes k} \ge 2$ for some $k \in \mathbb{N}$, then we apply Corollary 2.14. Otherwise, one can show that $\tilde{Q}(a)$ equals 0 or 1 using the gauge points of the next section (see [Str88, Lemma 3.7]).

Remark 4.4. One verifies that $\tilde{R}$ and $\tilde{Q}$ are $\le$-monotones and have value $n$ on $\langle n \rangle$. They are not universal spectral points, however. Namely, the asymptotic rank of each of the three tensors

$$\langle 2,1,1 \rangle = e_1 \otimes e_1 \otimes 1 + e_2 \otimes e_2 \otimes 1 \in \mathbb{F}^2 \otimes \mathbb{F}^2 \otimes \mathbb{F}^1,$$

$$\langle 1,1,2 \rangle = e_1 \otimes 1 \otimes e_1 + e_2 \otimes 1 \otimes e_2 \in \mathbb{F}^2 \otimes \mathbb{F}^1 \otimes \mathbb{F}^2,$$

$$\langle 1,2,2 \rangle = 1 \otimes e_1 \otimes e_1 + 1 \otimes e_2 \otimes e_2 \in \mathbb{F}^1 \otimes \mathbb{F}^2 \otimes \mathbb{F}^2$$

equals 2, whereas their tensor product equals the matrix multiplication tensor $\langle 2,2,2 \rangle$, whose tensor rank equals 7 and whose asymptotic rank is thus at most 7, i.e. strictly smaller than $2^3$. Therefore asymptotic rank is not multiplicative. On the other hand, the asymptotic subrank of each of the above three tensors equals 1, whereas the asymptotic subrank of $\langle 2,2,2 \rangle$ equals 4; see Chapter 5. Therefore asymptotic subrank is not multiplicative.

Goal 4.5. Our goal is now to explicitly describe elements in $X(\mathcal{T})$ (universal spectral points), or more modestly, to describe elements in $X(S)$ for interesting subsemirings $S \subseteq \mathcal{T}$.

Strassen constructed a finite family of elements in $X(\mathcal{T})$, the gauge points, and an infinite family of elements in $X(\{\text{oblique tensors}\})$, the support functionals. The support functionals are powerful enough to determine the asymptotic subrank of any "tight tensor". Tight tensors are discussed in Chapter 5. In Chapter 6 we construct an infinite family in $X(\{k\text{-tensors over } \mathbb{C}\})$, the quantum functionals. In the rest of this chapter we discuss the gauge points and the support functionals. We will focus on the case $k = 3$ for clarity of exposition.

4.3 Gauge points $\zeta_{(i)}$

Strassen in [Str88] introduced a finite family of elements in $X(\mathcal{T})$ called the gauge points. We focus on 3-tensors, but the construction generalises immediately to $k$-tensors. Let $V_i = \mathbb{F}^{n_i}$. Let $t \in V_1 \otimes V_2 \otimes V_3$. Let $i \in [3]$. Let $\mathrm{flatten}_i(t)$ be the image of $t$ under the grouping $V_1 \otimes V_2 \otimes V_3 \to V_i \otimes (\bigotimes_{j \neq i} V_j)$. We think of $\mathrm{flatten}_i(t)$ as a matrix. Let $\zeta_{(i)} : \mathcal{T} \to \mathbb{N} : t \mapsto \operatorname{rank}(\mathrm{flatten}_i(t))$, with rank denoting matrix rank. We call $\zeta_{(1)}, \zeta_{(2)}, \zeta_{(3)}$ the gauge points. From the properties of matrix rank follows directly that $\zeta_{(i)}$ is multiplicative under $\otimes$, additive under $\oplus$, monotone under restriction $\le$ (and under degeneration $\trianglelefteq$), and normalised to 1 on $\langle 1 \rangle = e_1 \otimes e_1 \otimes e_1$.
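The gauge points are straightforward to compute, since they are just matrix ranks of flattenings. The sketch below does this for the matrix multiplication tensor $\langle 2,2,2 \rangle$, with exact rational Gaussian elimination from the standard library. The sparse dictionary representation, the encoding of a basis vector $e_{ab}$ as the index $2a + b$, and the function names are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import product


def matrix_rank(rows):
    """Rank over the rationals by Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def flattening_rank(tensor, dims, i):
    """Rank of the i-th flattening of a 3-tensor {(x1, x2, x3): coefficient}."""
    others = [j for j in range(3) if j != i]
    cols = list(product(*(range(dims[j]) for j in others)))
    rows = [[0] * len(cols) for _ in range(dims[i])]
    for idx, coeff in tensor.items():
        rows[idx[i]][cols.index(tuple(idx[j] for j in others))] = coeff
    return matrix_rank(rows)


# <2,2,2> = sum over i, j, k of e_ij (x) e_jk (x) e_ki, with e_ab stored at index 2a + b.
mamu = {(2 * i + j, 2 * j + k, 2 * k + i): 1 for i, j, k in product(range(2), repeat=3)}
gauge_points = [flattening_rank(mamu, (4, 4, 4), i) for i in range(3)]
print(gauge_points)
```

All three flattenings of $\langle 2,2,2 \rangle$ have full rank 4 in this computation.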

Theorem 4.6. $\zeta_{(1)}, \zeta_{(2)}, \zeta_{(3)} \in X(\mathcal{T})$.

Recall $\tilde{Q}(t) \le \phi(t) \le \tilde{R}(t)$ for $\phi \in X(\mathcal{T})$. In particular, $\max_i \zeta_{(i)}(t) \le \tilde{R}(t)$. We do not know whether $\max_{i \in [3]} \zeta_{(i)}$ equals $\tilde{R}$. To be precise, we do not know any $t$ for which $\max_i \zeta_{(i)}(t) < \tilde{R}(t)$, and we do not know a proof that $\max_i \zeta_{(i)}(t) = \tilde{R}(t)$ for all $t$. There are various families of tensors $t$ for which $\max_i \zeta_{(i)}(t) = \tilde{R}(t)$ is proven. We will see such a family in Section 5.4.2. For the matrix multiplication tensor $\langle 2,2,2 \rangle$ we have $4 = \max_i \zeta_{(i)}(\langle 2,2,2 \rangle) \le 2^{\omega}$, so $\max_i \zeta_{(i)}(t) = \tilde{R}(t)$ would imply that the matrix multiplication exponent $\omega$ equals 2.

On the other hand, $\tilde{Q}(t) \le \min_i \zeta_{(i)}(t)$. There exist $t$ for which $\tilde{Q}(t)$ is strictly smaller than $\min_{i \in [3]} \zeta_{(i)}(t)$. To show this strict inequality we need another technique of Strassen: the support functionals. The support functionals are the topic of the next section.

4.4 Support functionals $\zeta^{\theta}$

Strassen in [Str91] constructed an infinite family of elements in the asymptotic spectrum of oblique $k$-tensors called the support functionals. In this section we explain the construction of the support functionals. The support functionals provide the benchmark for our new quantum functionals (Chapter 6) and are relevant in the context of combinatorial problems like the cap set problem (Section 5.4.2). For clarity of exposition we focus on 3-tensors. The ideas extend directly to $k$-tensors.

Oblique tensors are tensors for which in some basis the support has the following special structure. Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$. Let $e_1, \ldots, e_{n_i}$ be the standard basis of $\mathbb{F}^{n_i}$. Write $t = \sum_{ijk} t_{ijk}\, e_i \otimes e_j \otimes e_k$. Let $[n_i] = \{1, 2, \ldots, n_i\}$. Let $\operatorname{supp}(t) = \{(i,j,k) : t_{ijk} \neq 0\} \subseteq [n_1] \times [n_2] \times [n_3]$ be the support of $t$ with respect to the standard basis. Let $[n_i]$ have the natural ordering $1 < 2 < \cdots < n_i$, and let $[n_1] \times [n_2] \times [n_3]$ have the product order, denoted by $\le$. That is, $x \le y$ if $x_i \le y_i$ holds for all $i \in [3]$. We call $\operatorname{supp}(t)$ oblique if $\operatorname{supp}(t)$ is an antichain with respect to $\le$, i.e. if any two distinct elements in $\operatorname{supp}(t)$ are incomparable with respect to $\le$. We call a tensor $t$ oblique if $\operatorname{supp}(g \cdot t)$ is oblique for some group element $g \in G(t) = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$. The family of oblique tensors is a semiring under $\oplus$ and $\otimes$.
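Whether a given support is oblique is a finite check. The sketch below tests the antichain condition for a support given as a set of index triples. The three example supports are illustrative assumptions; the third one shows that the condition depends on the chosen basis ordering, which is why obliqueness of a tensor is defined up to the action of $G(t)$.

```python
from itertools import combinations


def leq(x, y):
    """Product order on index triples."""
    return all(a <= b for a, b in zip(x, y))


def is_antichain(support):
    """True if no two distinct elements of the support are comparable."""
    return not any(leq(x, y) or leq(y, x) for x, y in combinations(support, 2))


# Support of e1(x)e1(x)e2 + e1(x)e2(x)e1 + e2(x)e1(x)e1.
w_support = {(1, 1, 2), (1, 2, 1), (2, 1, 1)}

# Support of the diagonal tensor <2> in the standard basis: a chain.
diag_support = {(1, 1, 1), (2, 2, 2)}

# The same tensor after swapping the two basis vectors of the third factor.
diag_relabelled = {(1, 1, 2), (2, 2, 1)}

print(is_antichain(w_support), is_antichain(diag_support), is_antichain(diag_relabelled))
```

This only tests one fixed basis; it does not search over $G(t)$, so a negative answer does not by itself show that a tensor fails to be oblique.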

Not all tensors are oblique. Obliqueness is not a generic property (see Proposition 6.21). However, many tensors that are of interest in algebraic complexity theory are oblique, notably the matrix multiplication tensors

$$\langle a,b,c \rangle = \sum_{i \in [a]} \sum_{j \in [b]} \sum_{k \in [c]} e_{ij} \otimes e_{jk} \otimes e_{ki} \in \mathbb{F}^{ab} \otimes \mathbb{F}^{bc} \otimes \mathbb{F}^{ca}.$$

For any finite set $X$, let $\mathcal{P}(X)$ be the set of all probability distributions on $X$. For any probability distribution $P \in \mathcal{P}(X)$, the Shannon entropy of $P$ is defined as $H(P) = -\sum_{x \in X} P(x) \log_2 P(x)$, with $0 \log_2 0$ understood as 0. Given finite sets $X_1, \ldots, X_k$ and a probability distribution $P \in \mathcal{P}(X_1 \times \cdots \times X_k)$ on the product set $X_1 \times \cdots \times X_k$, we denote the marginal distribution of $P$ on $X_i$ by $P_i$, that is, $P_i(a) = \sum_{x : x_i = a} P(x)$ for any $a \in X_i$.

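Definition 4.7. Let $\theta \in \Theta = \mathcal{P}([3])$. For $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3} \setminus \{0\}$ with $\operatorname{supp}(t)$ oblique, define

$$\zeta^{\theta}(t) = \max\{2^{\sum_{i=1}^{3} \theta(i) H(P_i)} : P \in \mathcal{P}(\operatorname{supp}(t))\}.$$

We call the $\zeta^{\theta}$ for $\theta \in \Theta$ the support functionals.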
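Evaluating the objective of Definition 4.7 at any single distribution $P$ on the support gives a lower bound on $\zeta^{\theta}(t)$. The sketch below does this for the support $\{(1,1,2),(1,2,1),(2,1,1)\}$ with the uniform $P$ and the uniform $\theta$. Both the example and the helper names are assumptions of this illustration, and no maximisation over $P$ is performed.

```python
import math


def entropy(dist):
    """Shannon entropy in bits, with 0 log 0 read as 0."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


def marginal(P, i):
    """Marginal of a distribution {(x1, x2, x3): probability} on coordinate i."""
    out = {}
    for x, p in P.items():
        out[x[i]] = out.get(x[i], 0.0) + p
    return list(out.values())


def h_theta(P, theta):
    """Weighted sum of marginal entropies, sum_i theta(i) H(P_i)."""
    return sum(theta[i] * entropy(marginal(P, i)) for i in range(3))


support = [(1, 1, 2), (1, 2, 1), (2, 1, 1)]
uniform = {x: 1 / 3 for x in support}
theta = (1 / 3, 1 / 3, 1 / 3)

# Value of 2 ** H_theta at one feasible P: a lower bound on the maximum.
lower_bound = 2 ** h_theta(uniform, theta)
print(lower_bound)
```

Each marginal of the uniform distribution here is $(2/3, 1/3)$, so the computed value is $2^{h(1/3)} = 3 / 2^{2/3}$, where $h$ is the binary entropy function.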

Theorem 4.8. $\zeta^{\theta} \in X(\{\text{oblique}\})$ for $\theta \in \Theta$.

We work towards the proof of Theorem 4.8. For $p \in [0,1]$ let $h(p)$ be the binary entropy function $h(p) = -p \log_2 p - (1-p) \log_2 (1-p)$, i.e. $h(p)$ is the Shannon entropy of the probability vector $(p, 1-p)$. The following properties of the Shannon entropy are well-known.

Lemma 4.9.

(i) $H(P \otimes Q) = H(P) + H(Q)$ for $P \in \mathcal{P}(X_1)$, $Q \in \mathcal{P}(X_2)$.

(ii) $H(P) \le H(P_1) + H(P_2)$ for $P \in \mathcal{P}(X_1 \times X_2)$.

(iii) $H(pP \oplus (1-p)Q) = pH(P) + (1-p)H(Q) + h(p)$ for $P, Q \in \mathcal{P}(X)$, $p \in [0,1]$.

(iv) $2^a + 2^b = \max_{0 \le p \le 1} 2^{pa + (1-p)b + h(p)}$ for $a, b \in \mathbb{R}$.
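Identity (iv) of Lemma 4.9 is the one that turns mixtures into sums later on, and it can be checked numerically. Setting the derivative of the exponent to zero gives the maximiser $p^{*} = 2^a / (2^a + 2^b)$; the sketch below compares the left-hand side with the value at $p^{*}$ and with a grid search. The sample values of $a$ and $b$ are arbitrary assumptions.

```python
import math


def h(p):
    """Binary entropy in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def exponent_value(a, b, p):
    return 2 ** (p * a + (1 - p) * b + h(p))


def grid_max(a, b, steps=10000):
    """Maximum of the right-hand side over a uniform grid of p in [0, 1]."""
    return max(exponent_value(a, b, i / steps) for i in range(steps + 1))


a, b = 1.5, -0.5
lhs = 2 ** a + 2 ** b
p_star = 2 ** a / (2 ** a + 2 ** b)
value_at_p_star = exponent_value(a, b, p_star)
print(lhs, value_at_p_star, grid_max(a, b))
```

The grid search can only approach the maximum from below, so it serves as a sanity check rather than a proof.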

For $X \subseteq [n_1] \times [n_2] \times [n_3]$, let $X_{\le} = \{y \in [n_1] \times [n_2] \times [n_3] : \exists x \in X,\ y \le x\}$ be the downward closure of $X$. Let $\max(X) = \{y \in X : \forall x \in X,\ y \le x \Rightarrow y = x\}$ be the maximal points of $X$ with respect to $\le$. Let $S_n$ be the symmetric group of permutations of $[n]$. Then the product group $S_{n_1} \times S_{n_2} \times S_{n_3}$ acts naturally on $[n_1] \times [n_2] \times [n_3]$.

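Lemma 4.10. Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$. For every $g \in G(t)$ there is a triple of permutations $w \in W(t) = S_{n_1} \times S_{n_2} \times S_{n_3}$ with $w \cdot \max(\operatorname{supp}(g \cdot t)) \subseteq \operatorname{supp}(t)_{\le}$.

Proof. We prepare for the construction of $w$. Let $n \in \mathbb{N}$. Let $e_1, \ldots, e_n$ be the standard basis of $\mathbb{F}^n$. Let $g \in \mathrm{GL}_n$. Let $f_1, \ldots, f_n$ with $f_j = g \cdot e_j$ be the transformed basis of $\mathbb{F}^n$. Let $(E_i)_{i \in [n]}$ and $(F_j)_{j \in [n]}$ be the complete flags of $\mathbb{F}^n$ with

$$E_i = \operatorname{Span}\{e_i, e_{i+1}, \ldots, e_n\}, \qquad F_j = \operatorname{Span}\{f_j, f_{j+1}, \ldots, f_n\}.$$

Define the map

$$\pi : [n] \to [n] : j \mapsto \max\{i \in [n] : E_i \cap (f_j + F_{j+1}) \neq \emptyset\}. \tag{4.3}$$

We prove $\pi$ is injective. Let $j, k \in [n]$ with $j \le k$ and suppose $i = \pi(j) = \pi(k)$. Let $\mathbb{F}^{\times} = \mathbb{F} \setminus \{0\}$. From (4.3) follows

$$(\mathbb{F}^{\times} e_i + E_{i+1}) \cap (f_j + F_{j+1}) \neq \emptyset, \tag{4.4}$$

$$E_{i+1} \cap (f_j + F_{j+1}) = \emptyset, \tag{4.5}$$

$$(\mathbb{F}^{\times} e_i + E_{i+1}) \cap (f_k + F_{k+1}) \neq \emptyset. \tag{4.6}$$

Suppose $j < k$. Then from (4.4) and (4.6) we obtain a contradiction to (4.5). We conclude that $j = k$. Thus $\pi$ is injective.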

For each $\mathbb{F}^{n_i}$ define as above the standard complete flag $(E^i_j)_{j \in [n_i]}$ of $\mathbb{F}^{n_i}$, the complete flag $(F^i_j)_{j \in [n_i]}$ corresponding to the basis given by $g_i$, and the permutation $\pi_i : [n_i] \to [n_i]$. Let $w = (\pi_1, \pi_2, \pi_3) \in W(t)$.

We will prove $w \cdot \max(\operatorname{supp}(g \cdot t)) \subseteq \operatorname{supp}(t)_{\le}$. Let $y \in \max(\operatorname{supp}(g \cdot t))$. Let $x = w \cdot y$. By construction of $\pi_i$, the intersection $E^i_{x_i} \cap (f^i_{y_i} + F^i_{y_i + 1})$ is not empty. Choose

$$\tilde{f}^i_{y_i} \in E^i_{x_i} \cap (f^i_{y_i} + F^i_{y_i + 1}).$$

Let $t^*$ be the multilinear map $\mathbb{F}^{n_1} \times \mathbb{F}^{n_2} \times \mathbb{F}^{n_3} \to \mathbb{F}$ with $t^*(e_i, e_j, e_k) = t_{ijk}$ for all $i \in [n_1]$, $j \in [n_2]$, $k \in [n_3]$. Then

$$t^*(\tilde{f}^1_{y_1}, \tilde{f}^2_{y_2}, \tilde{f}^3_{y_3}) = t^*(f^1_{y_1}, f^2_{y_2}, f^3_{y_3}) + \sum_{\substack{z \in [n_1] \times [n_2] \times [n_3] \\ z > y}} c_z\, t^*(f^1_{z_1}, f^2_{z_2}, f^3_{z_3}) \tag{4.7}$$

for some $c_z \in \mathbb{F}$. Since $y$ is maximal in $\operatorname{supp}(g \cdot t)$, the sum over $z > y$ in (4.7) equals zero. We conclude $t^*(\tilde{f}^1_{y_1}, \tilde{f}^2_{y_2}, \tilde{f}^3_{y_3}) = t^*(f^1_{y_1}, f^2_{y_2}, f^3_{y_3}) \neq 0$. Thus $t^*(E^1_{x_1} \times E^2_{x_2} \times E^3_{x_3})$ is not zero, and thus $x \in \operatorname{supp}(t)_{\le}$.

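Proof of Theorem 4.8. We prove $\zeta^{\theta}$ on oblique tensors is $\otimes$-multiplicative, $\oplus$-additive, $\le$-monotone and normalised to 1 on $\langle 1 \rangle = e_1 \otimes e_1 \otimes e_1$. The normalisation $\zeta^{\theta}(\langle 1 \rangle) = 1$ is clear.

We prove $\zeta^{\theta}$ is $\otimes$-supermultiplicative. Let $s \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$ and let $t \in \mathbb{F}^{m_1} \otimes \mathbb{F}^{m_2} \otimes \mathbb{F}^{m_3}$. Let $P \in \mathcal{P}(\operatorname{supp}(s))$ and $Q \in \mathcal{P}(\operatorname{supp}(t))$. Then the product $P \otimes Q \in \mathcal{P}(\operatorname{supp}(s \otimes t))$ has marginals $P_i \otimes Q_i$. Since $H(P_i \otimes Q_i) = H(P_i) + H(Q_i)$ (Lemma 4.9(i)), we conclude $\zeta^{\theta}(s)\,\zeta^{\theta}(t) \le \zeta^{\theta}(s \otimes t)$.

We prove $\zeta^{\theta}$ is $\otimes$-submultiplicative. For a distribution $P$ and $\theta \in \Theta$ we use the notation $H_{\theta}(P) = \sum_{i=1}^{3} \theta(i) H(P_i)$. We naturally identify $\operatorname{supp}(s \otimes t)$ with a subset of $[n_1] \times [n_2] \times [n_3] \times [m_1] \times [m_2] \times [m_3]$. Let $P \in \mathcal{P}(\operatorname{supp}(s \otimes t))$. Let $P_{[3]}$ be the marginal distribution of $P$ on $[n_1] \times [n_2] \times [n_3]$ and let $P_{3+[3]}$ be the marginal distribution of $P$ on $[m_1] \times [m_2] \times [m_3]$. Then $H_{\theta}(P) \le H_{\theta}(P_{[3]}) + H_{\theta}(P_{3+[3]})$ by Lemma 4.9(ii). We conclude $\zeta^{\theta}(s \otimes t) \le \zeta^{\theta}(s)\,\zeta^{\theta}(t)$.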

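We prove $\zeta^{\theta}$ is $\oplus$-additive. By definition,

\begin{align*}
\zeta^{\theta}(s \oplus t) &= \max\{2^{H_{\theta}(P)} : P \in \mathcal{P}(\operatorname{supp}(s \oplus t))\} \\
&= \max\Bigl\{\max_{0 \le p \le 1} 2^{H_{\theta}(pP \oplus (1-p)Q)} : P \in \mathcal{P}(\operatorname{supp}(s)),\ Q \in \mathcal{P}(\operatorname{supp}(t))\Bigr\}.
\end{align*}

From Lemma 4.9(iii) and (iv) follows

\begin{align*}
&\max\Bigl\{\max_{0 \le p \le 1} 2^{H_{\theta}(pP \oplus (1-p)Q)} : P \in \mathcal{P}(\operatorname{supp}(s)),\ Q \in \mathcal{P}(\operatorname{supp}(t))\Bigr\} \\
&\quad= \max\Bigl\{\max_{0 \le p \le 1} 2^{pH_{\theta}(P) + (1-p)H_{\theta}(Q) + h(p)} : P \in \mathcal{P}(\operatorname{supp}(s)),\ Q \in \mathcal{P}(\operatorname{supp}(t))\Bigr\} \\
&\quad= \max\bigl\{2^{H_{\theta}(P)} + 2^{H_{\theta}(Q)} : P \in \mathcal{P}(\operatorname{supp}(s)),\ Q \in \mathcal{P}(\operatorname{supp}(t))\bigr\} \\
&\quad= \zeta^{\theta}(s) + \zeta^{\theta}(t).
\end{align*}

We conclude $\zeta^{\theta}(s \oplus t) = \zeta^{\theta}(s) + \zeta^{\theta}(t)$.

We prove $\zeta^{\theta}$ is $\le$-monotone. Let $s \le t$ with $\operatorname{supp}(s)$ and $\operatorname{supp}(t)$ oblique. Then there are linear maps $A_i$ with $s = (A_1 \otimes A_2 \otimes A_3) \cdot t$. If $A_1, A_2, A_3$ are of the form $\operatorname{diag}(1, \ldots, 1, 0, \ldots, 0)$, then $\zeta^{\theta}(s) \le \zeta^{\theta}(t)$. Suppose $g = (A_1, A_2, A_3) \in G(t)$. Let $P \in \mathcal{P}(\operatorname{supp}(t))$ maximise $H_{\theta}$ on $\mathcal{P}(\operatorname{supp}(t))$. Let $\sigma \in W$ be such that $\sigma \cdot P$ has non-increasing marginals. Then $H_{\theta}(\sigma \cdot P) = H_{\theta}(P)$ and $\sigma \cdot P$ maximises $H_{\theta}$ on $\mathcal{P}(\operatorname{supp}(\sigma \cdot t))$. Then $\sigma \cdot P$ maximises $H_{\theta}$ on $\mathcal{P}(\operatorname{supp}(\sigma \cdot t)_{\le})$ by Lemma 4.12 below. Let $Q \in \mathcal{P}(\operatorname{supp}(g \cdot t))$ maximise $H_{\theta}$ on $\mathcal{P}(\operatorname{supp}(g \cdot t))$. By Lemma 4.10 there is a $w \in W$ with $w \cdot \operatorname{supp}(g \cdot t) \subseteq \operatorname{supp}(\sigma \cdot t)_{\le}$. Then $H_{\theta}(w \cdot Q) = H_{\theta}(Q) \le H_{\theta}(\sigma \cdot P) = H_{\theta}(P)$. Thus $\max_{P \in \mathcal{P}(\operatorname{supp}(g \cdot t))} H_{\theta}(P) \le \max_{P \in \mathcal{P}(\operatorname{supp}(t))} H_{\theta}(P)$. We conclude $\zeta^{\theta}(g \cdot t) \le \zeta^{\theta}(t)$.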

The following two lemmas finish the above proof of Theorem 4.8. Recall that in the proof we defined $H_{\theta}(P) = \sum_{i=1}^{3} \theta(i) H(P_i)$ for $\theta \in \Theta$.

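Lemma 4.11 ([Str91, Prop. 2.1]). Let $\Phi \subseteq [n_1] \times [n_2] \times [n_3]$. Let $P \in \mathcal{P}(\Phi)$. Let $\operatorname{supp}(P)$ be the support $\{x \in \Phi : P(x) \neq 0\}$. For $x \in \Phi$ define $h_P(x) = -\sum_{i=1}^{3} \theta(i) \log_2 P_i(x_i)$. Then $P$ maximises $H_{\theta}$ on $\mathcal{P}(\Phi)$ if and only if

$$\forall x \in \operatorname{supp}(P):\ h_P(x) = \max_{y \in \Phi} h_P(y). \tag{4.8}$$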

Proof. We write $H_{\theta}(P)$ in terms of $h_P$:

$$H_{\theta}(P) = \sum_{i=1}^{3} \theta(i) H(P_i) = \sum_{x \in \operatorname{supp}(P)} P(x)\, h_P(x). \tag{4.9}$$

For $Q \in \mathcal{P}(\Phi)$,

\begin{align*}
\lim_{\varepsilon \to 0^+} \frac{d}{d\varepsilon} H_{\theta}\bigl((1-\varepsilon)P + \varepsilon Q\bigr)
&= \lim_{\varepsilon \to 0^+} \frac{d}{d\varepsilon} \sum_x \bigl((1-\varepsilon)P(x) + \varepsilon Q(x)\bigr)\, h_{(1-\varepsilon)P + \varepsilon Q}(x) \\
&= \sum_x P(x) \Bigl( \sum_{i=1}^{3} \theta(i) \frac{P_i(x_i) - Q_i(x_i)}{P_i(x_i) \ln(2)} \Bigr) + \sum_x \bigl(-P(x) + Q(x)\bigr)\, h_P(x) \\
&= \sum_x Q(x)\, h_P(x) - \sum_x P(x)\, h_P(x).
\end{align*}

Therefore, since $H_{\theta}$ is continuous and concave, $P$ maximises $H_{\theta}$ if and only if

$$\forall Q \in \mathcal{P}(\Phi):\ \sum_x Q(x)\, h_P(x) - \sum_x P(x)\, h_P(x) \le 0. \tag{4.10}$$

We will prove (4.10) is equivalent to (4.8). Suppose $\sum_x Q(x)\, h_P(x) \le \sum_x P(x)\, h_P(x)$ for every $Q \in \mathcal{P}(\Phi)$. In particular, $h_P(y) \le \sum_x P(x)\, h_P(x)$ for every $y \in \Phi$, so $\max_{y \in \Phi} h_P(y) \le \sum_x P(x)\, h_P(x)$. Then $\max_{y \in \Phi} h_P(y) = \sum_x P(x)\, h_P(x)$. We conclude $\max_{y \in \Phi} h_P(y) = h_P(x)$ for every $x \in \operatorname{supp}(P)$.

Suppose $\max_{y \in \Phi} h_P(y) = h_P(x)$ for every $x \in \operatorname{supp}(P)$. Then $h_P(y) \le h_P(x)$ for every $Q \in \mathcal{P}(\Phi)$, $y \in \operatorname{supp}(Q)$, $x \in \operatorname{supp}(P)$. We conclude $\sum_x Q(x)\, h_P(x) \le \sum_x P(x)\, h_P(x)$.

Lemma 4.12 ([Str91, Cor. 2.2]). Let $\Phi \subseteq [n_1] \times [n_2] \times [n_3]$. Let $P$ maximise $H_{\theta}$ on $\mathcal{P}(\Phi)$. Suppose $P_i$ is nonincreasing on $[n_i]$ for each $i \in [3]$. Then $P$ maximises $H_{\theta}$ on $\mathcal{P}(\Phi_{\le})$, where $\Phi_{\le}$ is the downward closure of $\Phi$ with respect to $\le$.

Proof. We know $P$ satisfies (4.8). We will prove $P$ satisfies (4.8) with $\Phi$ replaced by $\Phi_{\le}$. Then we are done by Lemma 4.11. Let $x \in \Phi_{\le}$. Then $x \le y$ for some $y \in \Phi$. Then $(P_1(x_1), P_2(x_2), P_3(x_3)) \ge (P_1(y_1), P_2(y_2), P_3(y_3))$, since each $P_i$ is nonincreasing. Then $h_P(x) \le h_P(y)$. We conclude $\max_{\Phi_{\le}} h_P \le \max_{\Phi} h_P$. On the other hand, $\Phi \subseteq \Phi_{\le}$. Therefore $\max_{\Phi} h_P \le \max_{\Phi_{\le}} h_P$.

Using the support functionals, Strassen managed to fully compute the asymptotic spectrum of several semirings generated by oblique tensors. We will see an example in Section 5.4.2.

4.5 Upper and lower support functionals $\zeta^{\theta}$, $\zeta_{\theta}$

In Section 4.4 we defined the support functionals $\zeta^{\theta} : \{\text{oblique}\} \to \mathbb{R}_{\ge 0}$ and proved that $\zeta^{\theta} \in X(\{\text{oblique}\})$. From the general theory of asymptotic spectra (Chapter 2) we know $\zeta^{\theta}$ is the restriction of some map $\phi : \{\text{tensors}\} \to \mathbb{R}_{\ge 0}$ in $X(\mathcal{T})$. However, the proof of that fact was non-constructive. In other words, we know that $\zeta^{\theta}$ can be extended to an element of $X(\mathcal{T})$. In this short section we discuss a candidate extension proposed by Strassen, called the upper support functional. We also discuss a companion called the lower support functional.

For arbitrary $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$, the upper support functional and the lower support functional are defined as

$$\zeta^{\theta}(t) = \min_{g \in G(t)} \max\{2^{H_{\theta}(P)} : P \in \mathcal{P}(\operatorname{supp}(g \cdot t))\},$$

$$\zeta_{\theta}(t) = \max_{g \in G(t)} \max\{2^{H_{\theta}(P)} : P \in \mathcal{P}(\max(\operatorname{supp}(g \cdot t)))\},$$

with $G(t) = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$ and $H_{\theta}(P) = \sum_{i=1}^{3} \theta(i) H(P_i)$. We summarise the known properties of the upper and lower support functional.

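Theorem 4.13 ([Str91]). Let $s \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$ and $t \in \mathbb{F}^{m_1} \otimes \mathbb{F}^{m_2} \otimes \mathbb{F}^{m_3}$. Let $\theta \in \Theta = \mathcal{P}([3])$.

(i) $\zeta^{\theta}(\langle n \rangle) = n$ for $n \in \mathbb{N}$.

(ii) $\zeta^{\theta}(s \oplus t) = \zeta^{\theta}(s) + \zeta^{\theta}(t)$.

(iii) $\zeta^{\theta}(s \otimes t) \le \zeta^{\theta}(s)\,\zeta^{\theta}(t)$.

(iv) If $s \ge t$, then $\zeta^{\theta}(s) \ge \zeta^{\theta}(t)$.

Theorem 4.14 ([Str91]). Let $s \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$ and $t \in \mathbb{F}^{m_1} \otimes \mathbb{F}^{m_2} \otimes \mathbb{F}^{m_3}$. Let $\theta \in \Theta$.

(i) $\zeta_{\theta}(\langle n \rangle) = n$ for $n \in \mathbb{N}$.

(ii) $\zeta_{\theta}(s \oplus t) \ge \zeta_{\theta}(s) + \zeta_{\theta}(t)$.

(iii) $\zeta_{\theta}(s \otimes t) \ge \zeta_{\theta}(s)\,\zeta_{\theta}(t)$.

(iv) If $s \ge t$, then $\zeta_{\theta}(s) \ge \zeta_{\theta}(t)$.

Theorem 4.15 ([Str91]). $\zeta^{\theta}(s \otimes t) \ge \zeta^{\theta}(s)\,\zeta_{\theta}(t)$ and $\zeta^{\theta}(t) \ge \zeta_{\theta}(t)$ for $\theta \in \Theta$.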

Regarding statement (ii) in Theorem 4.14, Bürgisser [Bur90] shows that the lower support functional $\zeta_{\theta}$ is not in general additive under the direct sum when $\theta_i > 0$ for all $i$. See also [Str91, Comment (iii)]. In particular, this implies that the upper support functional $\zeta^{\theta}(t)$ and the lower support functional $\zeta_{\theta}(t)$ are not equal in general, the upper support functional being additive. In fact, to show that the lower support functional is not additive, Bürgisser first shows that when $\mathbb{F}$ is algebraically closed, the generic value of $\zeta_{\theta}$ on $\mathbb{F}^n \otimes \mathbb{F}^n \otimes \mathbb{F}^n$ equals $(1 - \min_i \theta_i) \log_2 n + o(n)$. On the other hand, Tobler [Tob91] shows that the generic value of $\zeta^{\theta}$ on $\mathbb{F}^n \otimes \mathbb{F}^n \otimes \mathbb{F}^n$ equals $\log_2 n$. So even generically $\zeta^{\theta}$ and $\zeta_{\theta}$ are different on $\mathbb{F}^n \otimes \mathbb{F}^n \otimes \mathbb{F}^n$.

For $\theta \in \Theta$ we say $t$ is $\theta$-robust if $\zeta^{\theta}(t) = \zeta_{\theta}(t)$. We say $t$ is robust if $t$ is $\theta$-robust for all $\theta \in \Theta$. Let us try to understand what robust tensors look like. A tensor $t$ is $\theta$-robust if and only if

$$\zeta^{\theta}(t) \le \zeta_{\theta}(t). \tag{4.11}$$

The set of $\theta$-robust tensors is closed under $\oplus$ and $\otimes$, since

$$\zeta^{\theta}(s \oplus t) = \zeta^{\theta}(s) + \zeta^{\theta}(t) = \zeta_{\theta}(s) + \zeta_{\theta}(t) \le \zeta_{\theta}(s \oplus t)$$

and

$$\zeta^{\theta}(s \otimes t) \le \zeta^{\theta}(s)\,\zeta^{\theta}(t) = \zeta_{\theta}(s)\,\zeta_{\theta}(t) \le \zeta_{\theta}(s \otimes t).$$

For $X \subseteq [n_1] \times [n_2] \times [n_3]$ we use the notation $H_{\theta}(X) = \max_{P \in \mathcal{P}(X)} H_{\theta}(P)$. Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3} \setminus \{0\}$. Equation (4.11) means that there are $g, h \in G(t)$ and $P \in \mathcal{P}(\max \operatorname{supp}(h \cdot t))$ such that $H_{\theta}(\operatorname{supp}(g \cdot t)) \le H_{\theta}(P)$. In this case we have $\zeta^{\theta}(t) = \zeta_{\theta}(t) = 2^{H_{\theta}(P)}$. In particular, $t$ is $\theta$-robust if there is a $g \in G(t)$ such that the maximisation $H_{\theta}(\operatorname{supp}(g \cdot t))$ is attained by a $P \in \mathcal{P}(\max(\operatorname{supp}(g \cdot t)))$. This criterion is automatically satisfied for all $\theta$ when $\operatorname{supp}(g \cdot t) = \max(\operatorname{supp}(g \cdot t))$ for some $g \in G(t)$. Suppose $t$ is oblique. Then $\operatorname{supp}(g \cdot t)$ is an antichain for some $g \in G(t)$, and thus $\operatorname{supp}(g \cdot t) = \max \operatorname{supp}(g \cdot t)$. Then $t$ is robust and $\zeta^{\theta}(t) = \zeta_{\theta}(t) = 2^{H_{\theta}(\operatorname{supp}(g \cdot t))}$.

4.6 Asymptotic slice rank

Slice rank is a variation on tensor rank that was introduced by Terence Tao in [Tao16] to study cap sets. We will look at cap sets in Section 5.4. Here we study the relationship between asymptotic slice rank and the support functionals.

Consider the following characterisation of tensor rank. Let a simple tensor be any tensor of the form $v_1 \otimes v_2 \otimes v_3 \in V_1 \otimes V_2 \otimes V_3$ with $v_i \in V_i$ for $i \in [k]$. Then the rank $\mathrm{R}(t)$ of $t \in V_1 \otimes V_2 \otimes V_3$ is the smallest number $r$ such that $t$ can be written as a sum of $r$ simple tensors.

Slice rank is defined similarly, but with simple tensors replaced by slices. For $S \subseteq [k]$ let $V_S = \bigotimes_{i \in S} V_i$. For $j \in [k]$ let $\bar{j} = [k] \setminus \{j\}$ denote the complement of $\{j\}$. A tensor in $V_1 \otimes V_2 \otimes V_3$ is called a slice if it is of the form $v \otimes w$ with $v \in V_j$ and $w \in V_{\bar{j}}$ for some $j \in [k]$ (under the natural reordering of the tensor legs). Let $t \in V_1 \otimes V_2 \otimes V_3$. The slice rank of $t$, denoted by $\mathrm{SR}(t)$, is the smallest number $r$ such that $t$ can be written as a sum of $r$ slices. For example, the tensor

\[ W = e_1 \otimes e_1 \otimes e_2 + e_1 \otimes e_2 \otimes e_1 + e_2 \otimes e_1 \otimes e_1 \in \mathbb{F}^2 \otimes \mathbb{F}^2 \otimes \mathbb{F}^2 \tag{4.12} \]

has slice rank 2, since we can write $W = e_1 \otimes (e_1 \otimes e_2 + e_2 \otimes e_1) + e_2 \otimes e_1 \otimes e_1$. In fact, the slice rank of any element in $V_1 \otimes V_2 \otimes V_3$ is at most $\min_i \dim V_i$. The tensor rank of $W$, on the other hand, is known to be 3.
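The two-slice decomposition of $W$ can be checked mechanically. The following is a minimal sketch, assuming tensors are stored as dictionaries from index triples to coefficients; the helper names `simple`, `add` and `slice_first_leg` are illustrative and not taken from the text.

```python
from itertools import product

def simple(vectors):
    """Tensor product of coordinate vectors, as a dict: index tuple -> coefficient."""
    t = {}
    for idx in product(*(range(len(v)) for v in vectors)):
        c = 1
        for v, i in zip(vectors, idx):
            c *= v[i]
        if c != 0:
            t[idx] = c
    return t

def add(*tensors):
    """Sum of tensors in dict form, dropping zero coefficients."""
    out = {}
    for t in tensors:
        for idx, c in t.items():
            out[idx] = out.get(idx, 0) + c
    return {idx: c for idx, c in out.items() if c != 0}

def slice_first_leg(v, w):
    """Slice v (x) w with v on the first leg and w a 2-tensor on legs 2 and 3."""
    return {(i, j, k): v[i] * c
            for i in range(len(v)) if v[i] != 0
            for (j, k), c in w.items()}

e1, e2 = (1, 0), (0, 1)

# W as in (4.12), written as a sum of three simple tensors.
W = add(simple([e1, e1, e2]), simple([e1, e2, e1]), simple([e2, e1, e1]))

# The same tensor written as a sum of two slices, both cut along the first leg.
decomposition = add(
    slice_first_leg(e1, add(simple([e1, e2]), simple([e2, e1]))),
    slice_first_leg(e2, simple([e1, e1])),
)
```

Comparing `decomposition` with `W` confirms the upper bound $\mathrm{SR}(W) \leq 2$; it says nothing about the matching lower bound.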

Slice rank is clearly monotone under restriction. The slice rank of the diagonal tensor $\langle r \rangle$ equals $r$ [Tao16]. It follows that subrank is at most slice rank,

\[ \mathrm{Q}(t) \leq \mathrm{SR}(t). \]

The motivation for the introduction of slice rank in [Tao16] was finding upper bounds on subrank $\mathrm{Q}(t)$ and asymptotic subrank $\tilde{\mathrm{Q}}(t)$.

The main result of this section is the following theorem. Recall that a tensor $t$ is oblique if the support $\operatorname{supp}(g \cdot t)$ is an antichain for some $g \in G(t)$.

Theorem 4.16. Let $t$ be oblique. Then

\[ \lim_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n} = \min_{\theta \in \mathcal{P}([3])} \zeta^\theta(t). \]

Our proof of Theorem 4.16 is based on a proof of Tao and Sawin in [TS16] and discussions of the author with Dion Gijswijt. The explicit connection between asymptotic slice rank and the support functionals is new.


We use Theorem 4.16, before giving its proof, to see that SR is not submultiplicative and not supermultiplicative under the tensor product $\otimes$. In particular, we cannot use Fekete's lemma (Lemma 2.2) to prove that the limit $\lim_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n}$ exists. Thus the existence of the limit is a non-trivial consequence of Theorem 4.16.

Let $W$ be as in (4.12). Then $\mathrm{SR}(W) = 2$. We have $\zeta^{(1/3,1/3,1/3)}(W) = 2^{h(1/3)} < 2$. From Theorem 4.16 follows $\mathrm{SR}(W^{\otimes n}) \leq 2^{n h(1/3) + o(n)}$. We conclude $\mathrm{SR}(W^{\otimes n}) < 2^n$ for $n$ large enough, so SR is not supermultiplicative. Now it is also clear that slice rank is not the same as (border) subrank, since (border) subrank is supermultiplicative.
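To make the gap between $2^{h(1/3)}$ and $2$ concrete, here is a small numerical sketch. It assumes $h$ denotes the binary entropy function in bits; the variable name `rate` is illustrative.

```python
from math import log2

def h(p):
    """Binary entropy in bits, with the convention h(0) = h(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

# Exponential growth rate of the upper bound on SR(W^{(x) n}); roughly 1.89.
rate = 2 ** h(1 / 3)
```

Since `rate` is strictly below 2, the bound $2^{n h(1/3) + o(n)}$ eventually drops below $\mathrm{SR}(W)^n = 2^n$.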

Next, the tensors $\sum_{i=1}^n e_i \otimes e_i \otimes 1$, $\sum_{i=1}^n e_i \otimes 1 \otimes e_i$, $\sum_{i=1}^n 1 \otimes e_i \otimes e_i$ have slice rank one, while their tensor product equals the matrix multiplication tensor $\langle n, n, n \rangle$, which has slice rank $n^2$ by Theorem 4.16 and Theorem 5.3 in the next chapter applied to the tight tensor $\langle n, n, n \rangle$. We conclude SR is not submultiplicative.

Slice rank and hitting set number

We study the hitting set number of the support of a tensor. Let $\Phi \subseteq [n_1] \times [n_2] \times [n_3]$. A hitting set for $\Phi$ is a 3-tuple of sets $A_1 \subseteq [n_1]$, $A_2 \subseteq [n_2]$, $A_3 \subseteq [n_3]$ such that for every $a \in \Phi$ there is an $i \in [3]$ with $a_i \in A_i$. We may think of $\Phi$ as a 3-partite 3-uniform hypergraph. Then the definition of hitting set says every edge $a \in \Phi$ is hit by an element of some $A_i$. A hitting set is also called a vertex cover (every edge being covered by some vertex) or a transversal. The size of the hitting set $(A_1, A_2, A_3)$ is $|A_1| + |A_2| + |A_3|$. The hitting set number $\tau(\Phi)$ is the size of the smallest hitting set for $\Phi$. Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$.
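For small supports the hitting set number can be found by exhaustive search. The sketch below assumes 0-based indices and represents a candidate hitting set as a set of (leg, value) pairs; the function name `hitting_set_number` is illustrative, and the search is exponential, so it is only meant for toy examples.

```python
from itertools import combinations

def hitting_set_number(phi, dims):
    """Size of a smallest hitting set for phi inside [n1] x [n2] x [n3] (0-based)."""
    legs = range(len(dims))
    vertices = [(leg, v) for leg in legs for v in range(dims[leg])]
    for size in range(len(vertices) + 1):
        for chosen in combinations(vertices, size):
            chosen = set(chosen)
            # Every edge a must be hit in at least one leg.
            if all(any((leg, a[leg]) in chosen for leg in legs) for a in phi):
                return size
    return len(vertices)

# Support of the tensor W = e1(x)e1(x)e2 + e1(x)e2(x)e1 + e2(x)e1(x)e1.
support_W = {(0, 0, 1), (0, 1, 0), (1, 0, 0)}
tau_W = hitting_set_number(support_W, (2, 2, 2))
```

For this support the search finds a hitting set of size 2 and none of size 1, since no single vertex lies in all three edges.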

Lemma 4.17. Let $g \in G(t) = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$. Then $\mathrm{SR}(t) \leq \tau(\operatorname{supp}(g \cdot t))$.

Proof. This is clear.

Lemma 4.18. Let $g \in G(t)$. Then $\mathrm{SR}(t) \geq \tau(\max(\operatorname{supp}(g \cdot t)))$.

Proof. It is sufficient to consider $g = e$. Let

\[ t = \sum_{i=1}^{r_1} v^1_i \otimes u^1_i + \sum_{i=1}^{r_2} v^2_i \otimes u^2_i + \sum_{i=1}^{r_3} v^3_i \otimes u^3_i \]

be a slice decomposition. We may assume $v^j_1, \ldots, v^j_{r_j}$ are linearly independent. Let $V_j = \operatorname{Span}\{v^j_1, \ldots, v^j_{r_j}\} \subseteq \mathbb{F}^{n_j}$. Let $W_j \subseteq (\mathbb{F}^{n_j})^*$ be the elements in the dual space that vanish on $V_j$. Let $B_j \subseteq W_j$ be a basis with the following property: with respect to the standard basis, the matrix with the elements of $B_j$ as columns is in reduced row echelon form, i.e. each column is of the form $(* \cdots * \; 1 \; 0 \cdots 0)^T$ and the pivot elements (the 1's) are all in different rows. Let $S_j \subseteq [n_j]$ be the indices of the pivot elements. Let $\bar{S}_j = [n_j] \setminus S_j$ be the complement. Then $|\bar{S}_j| = r_j$. We claim $(\bar{S}_1, \bar{S}_2, \bar{S}_3)$ is a hitting set for $\max(\operatorname{supp}(t))$. Then $r_1 + r_2 + r_3 = |\bar{S}_1| + |\bar{S}_2| + |\bar{S}_3| \geq \tau(\max(\operatorname{supp}(t)))$. Let $x \in \max(\operatorname{supp}(t))$. Suppose $x \in S_1 \times S_2 \times S_3$. For every $j \in [3]$ let $\varphi_j \in B_j$ have its pivot element at index $x_j$. Let $\varphi = \varphi_1 \otimes \varphi_2 \otimes \varphi_3$. Then $\varphi \in W_1 \otimes W_2 \otimes W_3$, so $\varphi(t) = 0$. Since $x$ is maximal and each $B_j$ is in reduced row echelon form,

\begin{align*}
\varphi(t) &= \sum_{y \leq x} t_y \, \varphi(e_{y_1} \otimes e_{y_2} \otimes e_{y_3}) \\
&= \sum_{y < x} t_y \, \varphi(e_{y_1} \otimes e_{y_2} \otimes e_{y_3}) + t_x \, e_{x_1} \otimes e_{x_2} \otimes e_{x_3} \\
&= \sum_{y < x} s_y \, e_{y_1} \otimes e_{y_2} \otimes e_{y_3} + t_x \, e_{x_1} \otimes e_{x_2} \otimes e_{x_3}
\end{align*}

for some $s_y \in \mathbb{F}$. From $\varphi(t) = 0$ follows $t_x = 0$. This contradicts $x \in \operatorname{supp}(t)$, so $x \notin S_1 \times S_2 \times S_3$, i.e. there is a $j \in [3]$ with $x_j \in \bar{S}_j$.

Asymptotic hitting set number

We now study the asymptotic hitting set number $\tilde{\tau}(\Phi) = \lim_{n \to \infty} \tau(\Phi^{\times n})^{1/n}$. We will use some basic facts of types and type classes. Let $X$ be a finite set. Let $N \in \mathbb{N}$. An $N$-type on $X$ is a probability distribution $P$ on $X$ with $N \cdot P(x) \in \mathbb{N}$ for all $x \in X$. Let $P$ be an $N$-type on $X$. The type class $T^N_P \subseteq X^N$ is the set of sequences $s = (s_1, \ldots, s_N)$ with $x$ occurring $N \cdot P(x)$ times in $s$ for every $x \in X$, i.e. $|\{i \in [N] : s_i = x\}| = N \cdot P(x)$.

Lemma 4.19. The number of $N$-types on $X$ equals $\binom{N + |X| - 1}{|X| - 1}$. Let $P$ be an $N$-type. The size of the type class $T^N_P$ equals the multinomial coefficient $\binom{N}{NP}$.

Proof. We leave the proof to the reader.

Lemma 4.20. Let $P$ be an $N$-type on $X$. Then

\[ \frac{1}{(N+1)^{|X|}} \, 2^{N H(P)} \leq \binom{N}{NP} \leq 2^{N H(P)}. \]

Proof. See e.g. [CT12, Theorem 11.1.3].
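The bounds on the size of a type class are easy to check numerically for a small type. The sketch below assumes the type is given by its vector of counts $N \cdot P(x)$ and that entropy is measured in bits; the function names are illustrative.

```python
from math import factorial, log2

def entropy(counts):
    """Shannon entropy (in bits) of the type with the given counts N * P(x)."""
    n = sum(counts)
    return -sum(c / n * log2(c / n) for c in counts if c > 0)

def type_class_size(counts):
    """Multinomial coefficient N! / prod_x (N * P(x))!."""
    size = factorial(sum(counts))
    for c in counts:
        size //= factorial(c)
    return size

counts = (3, 2, 1)                   # a 6-type on a 3-element set
N = sum(counts)
size = type_class_size(counts)       # 6! / (3! 2! 1!)
upper = 2 ** (N * entropy(counts))   # 2^{N H(P)}
lower = upper / (N + 1) ** len(counts)
```

For this type the class has 60 elements, which sits between `lower` and `upper` as the lemma requires.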

Lemma 4.21. $\log_2 \tilde{\tau}(\Phi) \leq \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} H(P_i)$.

Proof. Let $P$ attain the maximum $\max_{P \in \mathcal{P}(\Phi)} \min_i H(P_i)$. Let $n \in \mathbb{N}$. We construct a hitting set $(A_1, A_2, A_3)$ for $\Phi^n$ as follows. Let $x \in \Phi^n$. Viewing $x$ as an $n$-tuple of elements in $\Phi$, let $Q \in \mathcal{P}_n(\Phi)$ be the type of $x$ (i.e. the empirical distribution). Let $j \in [3]$ with $H(Q_j) = \min_{i \in [3]} H(Q_i)$. By our choice of $P$ we have

\[ H(Q_j) = \min_{i \in [3]} H(Q_i) \leq \min_{i \in [3]} H(P_i). \]

Viewing $x$ as a 3-tuple $(x_1, x_2, x_3)$, add $x_j$ to $A_j$. We repeat this for all $x \in \Phi^n$. The final $(A_1, A_2, A_3)$ is a hitting set for $\Phi^n$ by construction. For each $j \in [3]$,

\[ |A_j| \leq \sum_{Q_j} |T^n_{Q_j}| \leq \sum_{Q_j} 2^{n H(Q_j)}, \]

where the sum is over $Q_j \in \mathcal{P}_n(\Phi_j)$ with $H(Q_j) \leq \min_{i \in [3]} H(P_i)$. Then

\[ |A_j| \leq |\mathcal{P}_n(\Phi_j)| \, 2^{n \min_i H(P_i)} = \operatorname{poly}(n) \, 2^{n \min_i H(P_i)}. \]

We conclude $|A_1| + |A_2| + |A_3| \leq \operatorname{poly}(n) \, 2^{n \min_i H(P_i)}$.

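Lemma 4.22. $\log_2 \tilde{\tau}(\Phi) \geq \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} H(P_i)$.

Proof. Let $P$ attain the maximum $\max_{P \in \mathcal{P}(\Phi)} \min_i H(P_i)$. Let $n \in \mathbb{N}$. Let $(A_1, A_2, A_3)$ be a hitting set for $\Phi^n$. Let $Q \in \mathcal{P}_n(\Phi)$ be an $n$-type with $\min_i H(Q_i) = \min_i H(P_i) - o(n)$. Let $\Psi = T^n_Q \subseteq \Phi^n$ be the set of strings with type $Q$. Then $(A_1, A_2, A_3)$ is a hitting set for $\Psi$. Let $\pi_i : \Psi \to \Phi_i^n : (x_1, x_2, x_3) \mapsto x_i$. Then

\[ \Psi = \pi_1^{-1}(A_1) \cup \pi_2^{-1}(A_2) \cup \pi_3^{-1}(A_3). \]

Let $j \in [3]$ with $|\pi_j^{-1}(A_j)| \geq \tfrac{1}{3}|\Psi|$. The fiber $\pi_j^{-1}(a)$ has constant size over $a \in \Psi_j$. Let $c_j = |\pi_j^{-1}(a)|$ be this size. Then

\[ |\Psi| = \sum_{a \in \Psi_j} |\pi_j^{-1}(a)| = \sum_{a \in \Psi_j} c_j = |\Psi_j| \, c_j. \]

And

\[ |\pi_j^{-1}(A_j)| = \sum_{a \in A_j \cap \Psi_j} |\pi_j^{-1}(a)| = |A_j \cap \Psi_j| \, c_j \leq |A_j| \, c_j. \]

Therefore

\[ |A_j| \geq \frac{|\pi_j^{-1}(A_j)|}{c_j} \geq \frac{\tfrac{1}{3}|\Psi|}{c_j} = \tfrac{1}{3}|\Psi_j|. \]

We have $|\Psi_j| \geq 2^{n H(Q_j) - o(n)} \geq 2^{n \min_i H(Q_i) - o(n)} \geq 2^{n \min_i H(P_i) - o(n)}$. We conclude $|A_1| + |A_2| + |A_3| \geq |A_j| \geq \tfrac{1}{3}|\Psi_j| \geq \tfrac{1}{3} 2^{n \min_i H(P_i) - o(n)}$.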

Lemma 4.23. $\log_2 \tilde{\tau}(\Phi) = \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} H(P_i)$.

Proof. This follows directly from Lemma 4.21 and Lemma 4.22.


Asymptotic slice rank

We now combine the above lemmas about slice rank and the asymptotic hitting set number to prove Theorem 4.16. First we have the following basic lemma.

Lemma 4.24. $\min_{\theta \in \Theta} \max_{P \in \mathcal{P}(\Phi)} H_\theta(P) = \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} H(P_i)$.

Proof. Since $H_\theta(P)$ is convex in $\theta$ and concave in $P$, von Neumann's minimax theorem gives $\min_\theta \max_{P \in \mathcal{P}(\Phi)} H_\theta(P) = \max_{P \in \mathcal{P}(\Phi)} \min_\theta H_\theta(P)$. Finally, we use that $\min_\theta H_\theta(P) = \min_i H(P_i)$.

Define $f^{\sim}(t) = \limsup_{n \to \infty} f(t^{\otimes n})^{1/n}$ and $f_{\sim}(t) = \liminf_{n \to \infty} f(t^{\otimes n})^{1/n}$.

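Lemma 4.25. Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$. Then

\[ \max_{g \in G(t)} \max_{P \in \mathcal{P}(\max \operatorname{supp}(g \cdot t))} \min_i 2^{H(P_i)} \leq \mathrm{SR}_{\sim}(t) \leq \mathrm{SR}^{\sim}(t) \leq \min_\theta \zeta^\theta(t). \]

Proof. By definition $\mathrm{SR}_{\sim}(t) \leq \mathrm{SR}^{\sim}(t)$. From Lemma 4.17 follows

\[ \mathrm{SR}^{\sim}(t) \leq \tilde{\tau}(\operatorname{supp}(g \cdot t)) \]

for any $g \in G(t)$. Lemma 4.23 gives $\tilde{\tau}(\operatorname{supp}(g \cdot t)) = \max_{P \in \mathcal{P}(\operatorname{supp}(g \cdot t))} \min_i 2^{H(P_i)}$. Thus, with the help of Lemma 4.24,

\[ \mathrm{SR}^{\sim}(t) \leq \min_{g \in G(t)} \max_{P \in \mathcal{P}(\operatorname{supp}(g \cdot t))} \min_i 2^{H(P_i)} = \min_\theta \zeta^\theta(t). \]

From Lemma 4.18 follows

\[ \tilde{\tau}(\max(\operatorname{supp}(g \cdot t))) \leq \mathrm{SR}_{\sim}(t) \]

for any $g \in G(t)$. Lemma 4.23 gives

\[ \max_{g \in G(t)} \max_{P \in \mathcal{P}(\max(\operatorname{supp}(g \cdot t)))} \min_i 2^{H(P_i)} \leq \mathrm{SR}_{\sim}(t). \]

This proves the lemma.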

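Proof of Theorem 4.16. We may assume $\Phi = \operatorname{supp}(t)$ is oblique. Then, with the help of Lemma 4.24 and Lemma 4.25,

\begin{align*}
\min_{\theta \in \Theta} \zeta^\theta(t) &= \min_{\theta \in \Theta} \zeta_\theta(t) \\
&= \min_{\theta \in \Theta} \max_{P \in \mathcal{P}(\max(\Phi))} 2^{H_\theta(P)} \\
&= \max_{P \in \mathcal{P}(\max(\Phi))} \min_{i \in [3]} 2^{H(P_i)} \\
&\leq \max_{g \in G(t)} \max_{P \in \mathcal{P}(\max(\operatorname{supp}(g \cdot t)))} \min_{i \in [3]} 2^{H(P_i)} \\
&\leq \mathrm{SR}_{\sim}(t) \\
&\leq \mathrm{SR}^{\sim}(t) \\
&\leq \min_{\theta \in \Theta} \zeta^\theta(t).
\end{align*}

This proves the claim.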


4.7 Conclusion

The study of asymptotic rank of tensors is motivated by the open problem of finding the exponent of matrix multiplication. Asymptotic subrank has applications in, for example, combinatorics and algebraic property testing. Via the theory of asymptotic spectra, Strassen characterised asymptotic rank and asymptotic subrank in terms of the asymptotic spectrum of tensors. Strassen introduced the gauge points in $X(\mathcal{T})$ and the support functionals in $X(\{\text{oblique}\})$. More precisely, there are the lower support functionals and the upper support functionals. The lower support functionals are not additive and can thus not be universal spectral points. The upper support functionals may be universal spectral points, but this cannot be shown with the help of the lower support functionals. Finally, we showed that for oblique tensors the asymptotic slice rank exists and equals the minimum value over the support functionals. In the next chapter we will see a subfamily of the oblique 3-tensors for which the support functionals are powerful enough to compute the asymptotic subrank.

Chapter 5

Tight tensors and combinatorial subrank; cap sets

This chapter is based on joint work with Matthias Christandl and Peter Vrana [CVZ16, CVZ18].

5.1 Introduction

In the previous chapter we discussed the gauge points and the support functionals $\zeta^\theta$. The gauge points are in the asymptotic spectrum of all tensors, while the support functionals are in the asymptotic spectrum of oblique tensors.

How "powerful" are the support functionals? We know $\tilde{\mathrm{Q}}(t) \leq \zeta^\theta(t) \leq \tilde{\mathrm{R}}(t)$ for oblique $t$. Thus $\max_\theta \zeta^\theta(t) \leq \tilde{\mathrm{R}}(t)$. In fact, $\max_\theta \zeta^\theta(t)$ is at most the maximum over the gauge points $\max_S \zeta_{(S)}$, and in turn $\max_S \zeta_{(S)}$ is at most $\tilde{\mathrm{R}}(t)$. As remarked earlier, it is not known whether $\max_S \zeta_{(S)}$ equals $\tilde{\mathrm{R}}(t)$ in general.

On the other hand, we have $\tilde{\mathrm{Q}}(t) \leq \min_\theta \zeta^\theta(t)$. Do we attain equality here in general, $\tilde{\mathrm{Q}}(t) = \min_\theta \zeta^\theta(t)$? The answer is "yes" for the subsemiring of tight 3-tensors. In this chapter we study tight $k$-tensors.

Tight tensors

Let $I_1, \ldots, I_k$ be finite sets. Let $\Phi \subseteq I_1 \times \cdots \times I_k$. We say $\Phi$ is tight if there are injective maps $u_i : I_i \to \mathbb{Z}$ for $i \in [k]$ such that

\[ \forall \alpha \in \Phi \quad u_1(\alpha_1) + \cdots + u_k(\alpha_k) = 0. \]

We say $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ is tight if there is a $g \in G(t) = \mathrm{GL}_{n_1} \times \cdots \times \mathrm{GL}_{n_k}$ such that the support $\operatorname{supp}(g \cdot t)$ is tight.

Recall that a tensor is oblique if the support is an antichain in some basis. Clearly, tight tensors are oblique. To summarise the families of tensors that we have defined up to now, we have

\[ \{\text{tight}\} \subseteq \{\text{oblique}\} \subseteq \{\text{robust}\} \subseteq \{\theta\text{-robust}\}. \]

Recall that the families of oblique, robust and $\theta$-robust tensors each form a semiring under $\otimes$ and $\oplus$. Tight tensors have the same property [Str91, Section 5]. Another property is that any subset of a tight set is tight.

Example 5.1. Let $k \geq 3$ be fixed. For any integer $n \geq 1$ and $c \in [n]$ the set

\[ \Phi_n(c) = \{\alpha \in \{0, \ldots, n-1\}^k : \alpha_1 + \cdots + \alpha_k = c\} \]

is tight. For any integer $n \geq 2$ and any $c \in [n]$ the set

\[ \Psi_n(c) = \{\alpha \in \{0, \ldots, n-1\}^k : \alpha_1 + \cdots + \alpha_k = c \bmod n\} \]

is not tight (cf. Exercise 15.20 in [BCS97]).
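A tightness witness for $\Phi_n(c)$ can be written down and checked directly. The sketch below assumes the maps $u_i(x) = x$ for $i < k$ and $u_k(x) = x - c$, which is one natural choice rather than one prescribed by the text; `phi` and `is_tight_witness` are illustrative names.

```python
from itertools import product

def phi(n, k, c):
    """The set Phi_n(c): k-tuples over {0, ..., n-1} whose entries sum to c."""
    return [a for a in product(range(n), repeat=k) if sum(a) == c]

def is_tight_witness(support, maps):
    """Check that the maps (dicts I_i -> Z) are injective and sum to zero on the support."""
    injective = all(len(set(m.values())) == len(m) for m in maps)
    zero_sum = all(sum(m[x] for m, x in zip(maps, a)) == 0 for a in support)
    return injective and zero_sum

n, k, c = 4, 3, 3
# Assumed witness: identity on the first k-1 legs, shift by -c on the last leg.
maps = [{x: x for x in range(n)} for _ in range(k - 1)]
maps.append({x: x - c for x in range(n)})

witness_ok = is_tight_witness(phi(n, k, c), maps)
```

The same maps fail on elements of $\Psi_n(c)$ whose coordinate sum wraps around modulo $n$, such as $(3, 3, 1)$ for $n = 4$, $c = 3$. That only rules out this particular witness; it does not by itself prove that $\Psi_n(c)$ is not tight.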

Example 5.2. When $\mathbb{F}$ contains a primitive $n$th root of unity $\zeta$, the tensor

\[ t_n = \sum_{\alpha \in \Psi_n(n-1)} e_{\alpha_1} \otimes \cdots \otimes e_{\alpha_k} \in (\mathbb{F}^n)^{\otimes k}, \]

which has support $\Psi_n(n-1)$, is tight. Namely, the elements $v_j = \sum_{i=1}^n \zeta^{ij} e_i$ for $j \in [n]$ form a basis of $\mathbb{F}^n$. Let $g \in G(t_n)$ be the corresponding basis transformation. Then we have $t_n = \sum_{j=1}^n v_j \otimes \cdots \otimes v_j$ and we see that the support $\operatorname{supp}(g \cdot t_n) = \{\alpha \in [n]^k : \alpha_1 = \cdots = \alpha_k\}$ is tight. (See also [BCS97, Exercise 15.25].) When the characteristic of $\mathbb{F}$ equals $n$, the tensor $t_n$ is also tight, as we will see in Section 5.4.2.

Combinatorial subrank and the Coppersmith–Winograd method

We care about tight tensors because of a remarkable theorem for tight 3-tensors of Strassen (Theorem 5.3 below). To understand the theorem we need the concept of combinatorial asymptotic subrank (cf. [Str91, Section 5]). We say $D \subseteq I_1 \times \cdots \times I_k$ is a diagonal when any two distinct $\alpha, \beta \in D$ are distinct in all $k$ coordinates. In other words, for elements in $D$ the value at one coordinate uniquely determines the value at the other $k - 1$ coordinates. Let $\Phi \subseteq I_1 \times \cdots \times I_k$. We say a diagonal $D \subseteq I_1 \times \cdots \times I_k$ is free for $\Phi$, or simply $D \subseteq \Phi$ is a free diagonal, if $D = \Phi \cap (D_1 \times \cdots \times D_k)$ where $D_i = \{x_i : (x_1, \ldots, x_k) \in D\}$. Define the (combinatorial) subrank $\mathrm{Q}(\Phi)$ as the size of the largest free diagonal $D \subseteq \Phi$. For $\Phi \subseteq I_1 \times \cdots \times I_k$ and $\Psi \subseteq J_1 \times \cdots \times J_k$ we naturally define the product $\Phi \times \Psi \subseteq (I_1 \times J_1) \times \cdots \times (I_k \times J_k)$ by

\[ \Phi \times \Psi = \{((\alpha_1, \beta_1), \ldots, (\alpha_k, \beta_k)) : \alpha \in \Phi, \beta \in \Psi\}. \]

Define the (combinatorial) asymptotic subrank $\tilde{\mathrm{Q}}(\Phi) = \lim_{n \to \infty} \mathrm{Q}(\Phi^{\times n})^{1/n}$. Let $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ and let $\Phi$ be the support of $t$ in the standard basis. Then $\mathrm{Q}(\Phi) \leq \mathrm{Q}(t)$ and $\tilde{\mathrm{Q}}(\Phi) \leq \tilde{\mathrm{Q}}(t)$. The number $\mathrm{Q}(\Phi)$ may be interpreted as the largest number $n$ such that $\langle n \rangle$ can be obtained from $t$ using a restriction that consists of matrices that have at most one nonzero entry in each row and in each column. (This is called M-restriction in [Str87, Section 6], which stands for monomial restriction.) We may also interpret $\Phi$ as a $k$-partite hypergraph. Then $\mathrm{Q}(\Phi)$ is the size of the largest induced $k$-partite matching in $\Phi$.
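For very small supports, the combinatorial subrank can be computed by trying all subsets. The following sketch is a brute-force illustration of the definitions of diagonal and free diagonal, not an efficient algorithm; the function names are illustrative.

```python
from itertools import combinations

def is_free_diagonal(D, phi):
    """Check that D is a diagonal and that D = phi ∩ (D_1 x ... x D_k)."""
    k = len(D[0])
    # Diagonal: any two distinct elements differ in every coordinate.
    for a, b in combinations(D, 2):
        if any(a[i] == b[i] for i in range(k)):
            return False
    proj = [{a[i] for a in D} for i in range(k)]
    induced = {a for a in phi if all(a[i] in proj[i] for i in range(k))}
    return induced == set(D)

def combinatorial_subrank(phi):
    """Size of a largest free diagonal D inside phi (exhaustive search)."""
    elements = list(phi)
    for size in range(len(elements), 0, -1):
        for D in combinations(elements, size):
            if is_free_diagonal(D, set(elements)):
                return size
    return 0

q_diag = combinatorial_subrank({(0, 0, 0), (1, 1, 1)})
q_W = combinatorial_subrank({(0, 0, 1), (0, 1, 0), (1, 0, 0)})
```

The diagonal support gives subrank 2. The support of the tensor $e_1 \otimes e_1 \otimes e_2 + e_1 \otimes e_2 \otimes e_1 + e_2 \otimes e_1 \otimes e_1$ gives subrank 1, because any two of its elements agree in some coordinate.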

Let $\Phi \subseteq [n_1] \times \cdots \times [n_k]$ and let $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ be any tensor with support equal to $\Phi$. Then the (asymptotic) subranks of $\Phi$ and $t$ are related as follows:

\[ \mathrm{Q}(\Phi) \leq \mathrm{Q}(t) \quad \text{and} \quad \tilde{\mathrm{Q}}(\Phi) \leq \tilde{\mathrm{Q}}(t). \]

Strassen proved the following theorem using the method of Coppersmith and Winograd [CW90]. Recall that for $\Phi \subseteq I_1 \times I_2 \times I_3$ we let $\mathcal{P}(\Phi)$ be the set of probability distributions on $\Phi$. For $P \in \mathcal{P}(\Phi)$ let $P_1, P_2, P_3$ be the marginal distributions of $P$ on the 3 components of $I_1 \times I_2 \times I_3$.

Theorem 5.3 ([Str91, Lemma 5.1]). Let $\Phi \subseteq I_1 \times I_2 \times I_3$ be tight. Then

\[ \tilde{\mathrm{Q}}(\Phi) = \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} 2^{H(P_i)}. \tag{5.1} \]

The consequence of Theorem 5.3 is that the support functionals are sufficiently powerful to compute the asymptotic subrank of tight 3-tensors.

Corollary 5.4 ([Str91, Proposition 5.4]). Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$ be tight. Then

\[ \tilde{\mathrm{Q}}(t) = \min_{\theta \in \mathcal{P}([3])} \zeta^\theta(t). \]

Moreover, if $\Phi = \operatorname{supp}(g \cdot t)$ is tight for some $g \in G(t)$, then $\tilde{\mathrm{Q}}(t) = \tilde{\mathrm{Q}}(\Phi)$.

Remark 5.5. Strassen conjectured in [Str94, Conjecture 5.3] that for the family of tight 3-tensors the support functionals give all spectral points in the asymptotic spectrum $X(\{\text{tight 3-tensors}\})$. In [Str91] numerous examples are given of subfamilies of tight 3-tensors for which this is the case.

Remark 5.6. Equation (5.1) becomes false when we let $\Phi \subseteq I_1 \times \cdots \times I_k$ with $k \geq 4$ and we let the right-hand side of the equation be $\max_{P \in \mathcal{P}(\Phi)} \min_i 2^{H(P_i)}$, see [CVZ16, Example 1138].

New results in this chapter

This chapter is an investigation of tight tensors, combinatorial asymptotic subrank, and applications. More precisely, this chapter contains the following new results.

Higher-order Coppersmith–Winograd method. In Section 5.2 we extend Theorem 5.3 to obtain a lower bound for $\tilde{\mathrm{Q}}(\Phi)$ for tight sets $\Phi \subseteq I_1 \times \cdots \times I_k$ with $k \geq 4$. Our lower bound is not known to be optimal in general. We compute examples for which the lower bound is optimal.

Combinatorial degeneration method. In Section 5.3 we further extend the range of application of the Coppersmith–Winograd method via a partial order $\trianglelefteq$ on supports of tensors called combinatorial degeneration. We prove that if $\Phi \trianglelefteq \Psi$, then $\tilde{\mathrm{Q}}(\Phi) \leq \tilde{\mathrm{Q}}(\Psi)$. Suppose $\Psi$ is not tight, but $\Phi$ is tight. Then we may apply the (higher-order) Coppersmith–Winograd method to obtain a lower bound on $\tilde{\mathrm{Q}}(\Phi)$ and thus on $\tilde{\mathrm{Q}}(\Psi)$.

Cap sets. In Section 5.4 we relate the theory of asymptotic spectra, the Coppersmith–Winograd method and the combinatorial degeneration method to the problem of upper bounding the maximum size of cap sets in $\mathbb{F}_p^n$.

Graph tensors. Graph tensors are generalisations of the matrix multiplication tensor $\langle 2, 2, 2 \rangle$ parametrised by graphs. In Section 5.5 we discuss how one can apply the higher-order Coppersmith–Winograd method to obtain upper bounds on the asymptotic rank of complete graph tensors. We also briefly discuss the surgery method, which gives good upper bounds on the asymptotic rank of graph tensors for sparse graphs like cycle graphs.

5.2 Higher-order CW method

In this section we extend Theorem 5.3 to tight $\Phi \subseteq I_1 \times \cdots \times I_k$ with $k \geq 4$. We introduce some notation. Let $\mathcal{P}(\Phi)$ be the set of probability distributions on $\Phi$. For $P \in \mathcal{P}(\Phi)$ let $P_1, \ldots, P_k$ be the marginal distributions of $P$ on the $k$ components of $I_1 \times \cdots \times I_k$. Let $\mathcal{R}(\Phi)$ be the set of all subsets $R \subseteq \Phi^2$ such that $R \not\subseteq \{(x, x) : x \in \Phi\}$ and $R \subseteq \{(x, y) \in \Phi^2 : x_i = y_i\}$ for some $i \in [k]$. For $P \in \mathcal{P}(\Phi)$ and $R \in \mathcal{R}(\Phi)$ let $\mathcal{Q}(R, (P_1, \ldots, P_k))$ be the set of probability distributions $Q$ on $R$ whose marginal distributions on the $2k$ components of $R$ satisfy $Q_i = Q_{k+i} = P_i$ for $i \in [k]$.

Let $I_1, \ldots, I_k$ be finite subsets of $\mathbb{Z}$. The result of this section is a lower bound on the asymptotic subrank of any $\Phi \subseteq I_1 \times \cdots \times I_k$ satisfying $\forall a \in \Phi \; \sum_{i=1}^k a_i = 0$. For $R \in \mathcal{R}(\Phi)$ let $r(R)$ be the rank over $\mathbb{Q}$ of the matrix with rows $\{x - y : (x, y) \in R\}$.

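Theorem 5.7. Let $\Phi \subseteq \mathbb{Z}^k$ be a finite set with $\forall a \in \Phi \; \sum_{i=1}^k a_i = 0$. Then

\[ \log_2 \tilde{\mathrm{Q}}(\Phi) \geq \max_{P} \min_{R, Q} \; H(P) - (k - 2) \frac{H(Q) - H(P)}{r(R)} \]

with $P \in \mathcal{P}(\Phi)$, $R \in \mathcal{R}(\Phi)$ and $Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))$.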


5.2.1 Construction

We prepare for the proof of Theorem 5.7 by discussing some basic facts.

Average-free sets

Lemma 5.8. Let $k \in \mathbb{N}$. Let $M \in \mathbb{N}$. We say a subset $B \subseteq \mathbb{Z}/M\mathbb{Z}$ is $(k-1)$-average-free if

\[ \forall x_1, \ldots, x_k \in B \quad x_1 + \cdots + x_{k-1} = (k-1) x_k \Rightarrow x_1 = \cdots = x_k. \]

There is a $(k-1)$-average-free set $B \subseteq \mathbb{Z}/M\mathbb{Z}$ of size $|B| = M^{1 - o(1)}$.

Proof. There is a set $A \subseteq \{1, \ldots, \lfloor \frac{M-1}{k-1} \rfloor\}$ of size $|A| = M^{1 - o(1)}$ with

\[ \forall x_1, \ldots, x_k \in A \quad x_1 + \cdots + x_{k-1} = (k-1) x_k \Rightarrow x_1 = \cdots = x_k, \tag{5.2} \]

see [VC15, Lemma 10]. Let $B = \{a \bmod M : a \in A\} \subseteq \mathbb{Z}/M\mathbb{Z}$. Then $|B| = |A|$. Let $x_1, \ldots, x_k \in B$ with $x_1 + \cdots + x_{k-1} = (k-1) x_k$. View $x_1, \ldots, x_k$ as elements in $\{1, \ldots, \lfloor \frac{M-1}{k-1} \rfloor\}$. Then $x_1 + \cdots + x_{k-1} = (k-1) x_k$ still holds. From (5.2) follows $x_1 = \cdots = x_k$ in $\mathbb{Z}$ and hence also in $\mathbb{Z}/M\mathbb{Z}$.
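The average-free condition in $\mathbb{Z}/M\mathbb{Z}$ can be tested by brute force for small parameters. The sketch below uses an illustrative checker `is_average_free` and two hand-picked example sets to show why the proof keeps the elements inside $\{1, \ldots, \lfloor (M-1)/(k-1) \rfloor\}$.

```python
from itertools import product

def is_average_free(B, M, k):
    """Check the (k-1)-average-free condition for B as a subset of Z/MZ."""
    residues = sorted({b % M for b in B})
    for xs in product(residues, repeat=k):
        lhs = sum(xs[:-1]) % M
        rhs = ((k - 1) * xs[-1]) % M
        if lhs == rhs and len(set(xs)) > 1:
            return False
    return True

M, k = 11, 3
# All elements are at most floor((M - 1) / (k - 1)) = 5, so no wrap-around occurs.
small = {1, 2, 4, 5}
# 1 + 2 = 3 and 2 * 7 = 14 = 3 (mod 11): average-free over Z, but not modulo 11.
wrapped = {1, 2, 7}
```

Here `small` passes the check, while `wrapped` fails because of the reduction modulo $M$, even though it contains no such configuration over the integers.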

Linear combinations of uniform variables

Lemma 5.9. Let $M$ be a prime. Let $u_1, \ldots, u_n$ be independently uniformly distributed over $\mathbb{Z}/M\mathbb{Z}$. Let $v_1, \ldots, v_m$ be $(\mathbb{Z}/M\mathbb{Z})$-linear combinations of $u_1, \ldots, u_n$. Then the vector $v = (v_1, \ldots, v_m)$ is uniformly distributed over the range of $v$ in $(\mathbb{Z}/M\mathbb{Z})^m$.

Proof. Let $v_i = \sum_j c_{ij} u_j$ with $c_{ij} \in \mathbb{Z}/M\mathbb{Z}$. Then $v = Cu$ with $u = (u_1, \ldots, u_n)$ and $C$ the matrix with entries $C_{ij} = c_{ij}$. Let $y$ be in the image of $C$. Then the cardinality of the preimage $C^{-1}(y)$ equals the cardinality of the kernel of $C$. Indeed, if $Cx = y$, then $C^{-1}(y) = x + \ker(C)$. Since $u$ is uniform, we conclude that $v$ is uniform on the image of $C$.
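The uniformity statement can be verified by enumeration for a small prime. The sketch below counts how often each value of $Cu$ occurs as $u$ ranges over all of $(\mathbb{Z}/M\mathbb{Z})^n$; the matrix `C` is an arbitrary example of rank 2 chosen for illustration.

```python
from collections import Counter
from itertools import product

def image_distribution(C, M):
    """Count how often each v = C u (mod M) occurs over all u in (Z/MZ)^n."""
    n = len(C[0])
    counts = Counter()
    for u in product(range(M), repeat=n):
        v = tuple(sum(c * x for c, x in zip(row, u)) % M for row in C)
        counts[v] += 1
    return counts

M = 5
# The third row is the sum of the first two modulo 5, so the rank is 2.
C = [[1, 0, 2], [0, 1, 3], [1, 1, 0]]
counts = image_distribution(C, M)
```

Every point of the image is hit the same number of times, namely $|\ker C| = 5$, and the image has $5^2 = 25$ points.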

Free diagonals

Lemma 5.10. Let $G$ be a graph with $n$ vertices and $m$ edges. Then $G$ has at least $n - m$ connected components.

Proof. A graph without edges has $n$ connected components. For every edge that we add to the graph, we lose at most one connected component.

Lemma 5.11. Let $I_1, \ldots, I_k$ be finite sets. Let $\Psi \subseteq I_1 \times \cdots \times I_k$. Let

\[ C = \{\{a, b\} \subseteq \Psi : a \neq b, \; \exists i \in [k] \; a_i = b_i\}. \]

Then $\mathrm{Q}(\Psi) \geq |\Psi| - |C|$. Obviously, the statement remains true if we replace $C$ by the larger set $\{(a, b) \in \Psi^2 : a \neq b, \; \exists i \in [k] \; a_i = b_i\}$.


Proof. Let $G = (\Psi, C)$ be the graph with vertex set $\Psi$ and edge set $C$. Let $\Gamma \subseteq \Psi$ contain exactly one vertex per connected component of $G$. The vertices in $\Gamma$ are pairwise not adjacent. So $\Gamma$ is a diagonal. Of course $\Gamma \subseteq \Psi \cap (\Gamma_1 \times \cdots \times \Gamma_k)$. Let $a \in \Psi \cap (\Gamma_1 \times \cdots \times \Gamma_k)$. Let $x_1, \ldots, x_k \in \Gamma$ with

\[ (x_1)_1 = a_1, \; (x_2)_2 = a_2, \; \ldots, \; (x_k)_k = a_k. \]

Then $x_1, \ldots, x_k$ are all adjacent to $a$ in $G$, i.e. they are all in the same connected component. Then $x_1 = \cdots = x_k$, since $\Gamma$ contains precisely one vertex per connected component. So $a = x_1 = \cdots = x_k$. So $a \in \Gamma$. We conclude that $\Gamma \supseteq \Psi \cap (\Gamma_1 \times \cdots \times \Gamma_k)$. Finally, $|\Gamma| \geq |\Psi| - |C|$ by Lemma 5.10.
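The construction in this proof is short enough to run on a toy example. The sketch below builds the graph implicitly with a simple union-find structure, picks one representative per connected component, and checks that the result is a free diagonal of size at least $|\Psi| - |C|$. The function names and the example set `psi` are illustrative.

```python
def free_diagonal_from_components(psi):
    """Pick one vertex per connected component of the graph G = (psi, C)."""
    psi = list(psi)
    k = len(psi[0])
    parent = {a: a for a in psi}

    def find(a):
        while parent[a] != a:
            a = parent[a]
        return a

    num_edges = 0
    for idx, a in enumerate(psi):
        for b in psi[idx + 1:]:
            if any(a[i] == b[i] for i in range(k)):  # a and b share a coordinate
                num_edges += 1
                parent[find(a)] = find(b)
    gamma = {find(a) for a in psi}
    return gamma, num_edges

def is_free_diagonal(gamma, psi):
    """Check that gamma is a diagonal and equals psi ∩ (Gamma_1 x ... x Gamma_k)."""
    k = len(next(iter(psi)))
    proj = [{a[i] for a in gamma} for i in range(k)]
    is_diagonal = all(len(proj[i]) == len(gamma) for i in range(k))
    induced = {a for a in psi if all(a[i] in proj[i] for i in range(k))}
    return is_diagonal and induced == set(gamma)

psi = [(0, 0, 1), (0, 1, 0), (1, 0, 0), (2, 2, 2), (3, 2, 3)]
gamma, num_edges = free_diagonal_from_components(psi)
```

In this example the graph has 4 edges and 2 connected components, so the construction returns a free diagonal of size 2, which is larger than the guaranteed $|\Psi| - |C| = 1$.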

We now give the proof of Theorem 5.7. We repeat some notation from above. Let $k \geq 3$. Let $\Phi \subseteq \mathbb{Z}^k$ be a finite set. Let $\mathcal{P}(\Phi)$ be the set of probability distributions on $\Phi$. For $P \in \mathcal{P}(\Phi)$ let $P_1, \ldots, P_k$ be the marginal distributions of $P$ on the $k$ components of $\mathbb{Z}^k$. Let $\mathcal{R}(\Phi)$ be the set of all subsets $R \subseteq \Phi^2$ such that $R \not\subseteq \{(x, x) : x \in \Phi\}$ and $R \subseteq \{(x, y) \in \Phi^2 : x_i = y_i\}$ for some $i \in [k]$. For $P \in \mathcal{P}(\Phi)$ and $R \in \mathcal{R}(\Phi)$ let $\mathcal{Q}(R, (P_1, \ldots, P_k))$ be the set of probability distributions $Q$ on $R$ whose marginal distributions on the $2k$ components of $R$ satisfy $Q_i = Q_{k+i} = P_i$ for $i \in [k]$. For $R \in \mathcal{R}(\Phi)$ let $r(R)$ be the rank over $\mathbb{Q}$ of the matrix with rows

\[ \{x - y : (x, y) \in R\}. \]

For any prime $M$ let $r_M(R)$ be the rank over $\mathbb{Z}/M\mathbb{Z}$ of the same matrix.

Theorem (Theorem 5.7). Let $\Phi \subseteq \mathbb{Z}^k$ be a finite set with $\forall a \in \Phi \; \sum_{i=1}^k a_i = 0$. Then

\[ \log_2 \tilde{\mathrm{Q}}(\Phi) \geq \max_{P} \min_{R, Q} \; H(P) - (k - 2) \frac{H(Q) - H(P)}{r(R)} \]

with $P \in \mathcal{P}(\Phi)$, $R \in \mathcal{R}(\Phi)$ and $Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))$.

Proof. Let $P$ be a rational probability distribution on $\Phi$, i.e. $\forall a \in \Phi \; P(a) \in \mathbb{Q}$.

Choice of parameters

This proof involves a variable $N$ that we will let go to infinity and a prime number $M$ that depends on $N$. For the sake of rigor, we first set the dependence of $M$ on $N$ and make sure that $N$ is large enough for $M$ to have good properties.

Let $n \in \mathbb{N}$ such that $P$ is an $n$-type, i.e. $\forall a \in \Phi \; nP(a) \in \mathbb{N}$. Let $N = tn$ be a multiple of $n$. Let

\[ f(N) = \log_2 \left( 2^{|\Phi|^2} \max_{R \in \mathcal{R}(\Phi)} \binom{N + |R| - 1}{|R| - 1} \right) \in o(N). \tag{5.3} \]

Let

\[ g(N) = |\Phi| \log_2(N + 1) \in o(N). \]

By Lemma 4.20,

\[ 2^{N H(P) - g(N)} \leq \binom{N}{NP}. \tag{5.4} \]

Let

\[ \mu(N) = \max_{R, Q} \frac{H(Q) - H(P) + (1 + g(N) + f(N)) \frac{1}{N}}{r(R)} \tag{5.5} \]

with $R \in \mathcal{R}(\Phi)$ and $Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))$. Let $M$ be a prime with

\[ \lceil 2^{\mu(N) N} \rceil \leq M \leq 2 \lceil 2^{\mu(N) N} \rceil. \tag{5.6} \]

Such a prime exists by Bertrand's postulate, see e.g. [AZ14]. We can make $M$ arbitrarily large by choosing $N$ large enough. Choose $N = tn$ large enough such that

\[ M > k - 1, \tag{5.7} \]

\[ \forall R \in \mathcal{R}(\Phi) \quad r_M(R) = r(R). \tag{5.8} \]

We will later let $t$, and thus $N$, go to infinity.
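Bertrand's postulate guarantees a prime in every interval $[m, 2m]$ with $m \geq 1$, which is how $M$ is chosen in (5.6). The sketch below finds such a prime by trial division; it illustrates the existence statement only, and the function names are illustrative.

```python
def is_prime(m):
    """Trial-division primality test; adequate for small illustrative inputs."""
    if m < 2:
        return False
    i = 2
    while i * i <= m:
        if m % i == 0:
            return False
        i += 1
    return True

def prime_in_bertrand_range(m):
    """Return the smallest prime p with m <= p <= 2m, for an integer m >= 1."""
    for p in range(m, 2 * m + 1):
        if is_prime(p):
            return p
    raise ValueError("no prime found; m must be at least 1")

example_prime = prime_in_bertrand_range(14)
```

In the proof, $m$ plays the role of $\lceil 2^{\mu(N) N} \rceil$, which grows with $N$, so conditions such as $M > k - 1$ hold once $N$ is large enough.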

Restrict to marginal type classes

The set $\Phi^{\otimes N}$ is a finite subset of $(\mathbb{Z}^N)^k$. Let $a \in \Phi^{\otimes N}$. Then we have that $a_i = ((a_i)_1, \ldots, (a_i)_N) \in \mathbb{Z}^N$ for $i \in [k]$. We restrict to those $a$ for which $a_i$ is in the type class $T^N_{P_i}$ for all $i \in [k]$. Thus let

\[ \Psi = \Phi^{\otimes N} \cap (T^N_{P_1} \times \cdots \times T^N_{P_k}). \]

We prove a lower bound on the size of $\Psi$. Let $(s_1, \ldots, s_N) \in T^N_P$. Then $s_j \in \Phi$ for $j \in [N]$ and $((s_1)_i, \ldots, (s_N)_i) \in T^N_{P_i}$ for $i \in [k]$. So

\[ \bigl( ((s_1)_1, \ldots, (s_N)_1), \ldots, ((s_1)_k, \ldots, (s_N)_k) \bigr) \in \Phi^{\otimes N} \cap (T^N_{P_1} \times \cdots \times T^N_{P_k}) = \Psi. \]

Thus $|\Psi| \geq |T^N_P|$. By Lemma 4.19, $|T^N_P| = \binom{N}{NP}$. By Lemma 4.20, $\binom{N}{NP} \geq 2^{N H(P) - g(N)}$. Therefore

\[ |\Psi| \geq 2^{N H(P) - g(N)}. \tag{5.9} \]


Hashing

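Let $u_1, \ldots, u_{k-1}, v_1, \ldots, v_N \in \mathbb{Z}/M\mathbb{Z}$. For $i \in [k]$ let

\[ h_i : \mathbb{Z}^N \to \mathbb{Z}/M\mathbb{Z}, \qquad x \mapsto \begin{cases} u_i + \sum_{j=1}^N x_j v_j & \text{for } 1 \leq i \leq k-1, \\[4pt] \frac{1}{k-1} \bigl( u_1 + \cdots + u_{k-1} - \sum_{j=1}^N x_j v_j \bigr) & \text{for } i = k. \end{cases} \]

Note that $k - 1$ is invertible in $\mathbb{Z}/M\mathbb{Z}$ by (5.7). Let $a \in \Psi$. Then $((a_1)_j, \ldots, (a_k)_j) \in \Phi$ for $j \in [N]$. So $\sum_{i=1}^k (a_i)_j = 0$ for every $j \in [N]$. Thus

\[ \sum_{i=1}^k \sum_{j=1}^N (a_i)_j v_j = \sum_{j=1}^N v_j \sum_{i=1}^k (a_i)_j = 0. \]

Therefore

\[ h_1(a_1) + \cdots + h_{k-1}(a_{k-1}) = (k - 1) h_k(a_k). \]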
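The hash identity can be checked on random inputs. The sketch below implements the maps $h_i$ for a small prime $M$ and a hand-picked zero-sum set; the set `phi`, the parameters and the helper `make_hashes` are assumptions made for illustration. It relies on `pow(x, -1, M)` for the modular inverse, which needs Python 3.8 or later.

```python
import random

def make_hashes(k, N, M, rng):
    """Return h(i, x) implementing the hash maps h_1, ..., h_k for random u, v."""
    u = [rng.randrange(M) for _ in range(k - 1)]
    v = [rng.randrange(M) for _ in range(N)]
    inv = pow(k - 1, -1, M)  # exists when gcd(k - 1, M) = 1, e.g. M prime, M > k - 1

    def dot(x):
        return sum(xj * vj for xj, vj in zip(x, v)) % M

    def h(i, x):  # i ranges over 1, ..., k
        if i < k:
            return (u[i - 1] + dot(x)) % M
        return (inv * (sum(u) - dot(x))) % M

    return h

rng = random.Random(0)
k, N, M = 3, 4, 7
# A small set of integer triples whose coordinates sum to zero.
phi = [(0, 1, -1), (1, 0, -1), (1, -2, 1), (0, 0, 0)]

h = make_hashes(k, N, M, rng)
word = [rng.choice(phi) for _ in range(N)]           # an element of phi^N
a = [tuple(s[i] for s in word) for i in range(k)]    # a_i collects the i-th coordinates
lhs = sum(h(i, a[i - 1]) for i in range(1, k)) % M
rhs = ((k - 1) * h(k, a[k - 1])) % M
```

The two sides agree for every choice of $u$ and $v$, because the $v$-dependent terms cancel exactly when each column of the word sums to zero.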

Restrict to average-free set

Let $B \subseteq \mathbb{Z}/M\mathbb{Z}$ be a $(k-1)$-average-free set of size

\[ |B| \geq M^{1 - \kappa(M)} \quad \text{with } \kappa(M) \in o(1), \tag{5.10} \]

meaning

\[ \forall x_1, \ldots, x_k \in B \quad x_1 + \cdots + x_{k-1} = (k-1) x_k \Rightarrow x_1 = \cdots = x_k \tag{5.11} \]

(Lemma 5.8). Let $\Psi' \subseteq \Psi$ be the subset

\[ \Psi' = \{a \in \Psi : \forall i \in [k] \; h_i(a_i) \in B\}. \]

Let $a \in \Psi'$. Then $a \in \Psi$, so

\[ h_1(a_1) + \cdots + h_{k-1}(a_{k-1}) = (k - 1) h_k(a_k). \]

Since $h_i(a_i) \in B$ for every $i \in [k]$, (5.11) implies

\[ h_1(a_1) = \cdots = h_k(a_k). \]

Probabilistic method

Clearly $\mathrm{Q}(\Phi^{\otimes N}) \geq \mathrm{Q}(\Psi) \geq \mathrm{Q}(\Psi')$. Let

\[ C' = \{(a, b) \in \Psi'^2 : a \neq b, \; \exists i \in [k] \; a_i = b_i\}. \]

Let $X = |\Psi'|$ and $Y = |C'|$. By Lemma 5.11,

\[ \mathrm{Q}(\Psi') \geq X - Y. \]

Let $u_1, \ldots, u_{k-1}, v_1, \ldots, v_N$ be independent uniformly random variables over the field $\mathbb{Z}/M\mathbb{Z}$. Then $X$ and $Y$ are random variables. Then

\[ \mathrm{Q}(\Psi') \geq \mathbb{E}[X - Y] = \mathbb{E}[X] - \mathbb{E}[Y], \]

where the expectation is over $u_1, \ldots, u_{k-1}, v_1, \ldots, v_N$. We will prove

\[ \mathbb{E}[X] = |B| \, |\Psi| \, M^{-(k-1)}, \tag{5.12} \]

\[ \mathbb{E}[Y] \leq |B| \max_{R, Q} 2^{N H(Q) + f(N)} M^{-(k-1) - r(R)} \tag{5.13} \]

with $f(N)$ as defined in (5.3) and $R \in \mathcal{R}(\Phi)$, $Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))$. Before proving (5.12) and (5.13) we derive the final bound.

Derivation of final bound

From (5.12) and (5.13) follows

\[ \mathbb{E}[X] - \mathbb{E}[Y] \geq |B| \, |\Psi| \, M^{-(k-1)} - |B| \max_{R, Q} 2^{N H(Q) + f(N)} M^{-(k-1) - r(R)}. \]

We factor out $|B|$, $|\Psi|$ and $M^{-(k-1)}$:

\[ \mathbb{E}[X] - \mathbb{E}[Y] \geq |B| \, |\Psi| \, M^{-(k-1)} \Bigl( 1 - \frac{1}{|\Psi|} \max_{R, Q} 2^{N H(Q) + f(N)} M^{-r(R)} \Bigr). \]

From our choice of $\mu(N)$ from (5.5),

\[ \mu(N) = \max_{R, Q} \frac{H(Q) - H(P) + (1 + g(N) + f(N)) \frac{1}{N}}{r(R)}, \]

follows

\[ \max_{R, Q} 2^{N (H(Q) - H(P) - r(R) \mu(N)) + g(N) + f(N)} \leq \tfrac{1}{2}. \tag{5.14} \]

Apply $|B| \geq M^{1 - \kappa(M)}$ from (5.10) and $|\Psi| \geq 2^{N H(P) - g(N)}$ from (5.9) to get

\begin{align*}
\mathbb{E}[X] - \mathbb{E}[Y] &\geq M^{1 - \kappa(M)} \, 2^{N H(P) - g(N)} \, M^{-(k-1)} \cdot \Bigl( 1 - 2^{-N H(P) + g(N)} \max_{R, Q} 2^{N H(Q) + f(N)} M^{-r(R)} \Bigr) \\
&\geq M^{-(k - 2 + \kappa(M))} \, 2^{N H(P) - g(N)} \cdot \Bigl( 1 - \max_{R, Q} 2^{N H(Q) - N H(P) + g(N) + f(N)} M^{-r(R)} \Bigr).
\end{align*}

(Here we used (5.14) to see that the second factor is nonnegative.) Apply the bounds $2^{\mu(N) N} \leq M \leq 2^{\mu(N) N + 2}$ from (5.6) to get

\begin{align*}
\mathbb{E}[X] - \mathbb{E}[Y] &\geq (2^{\mu(N) N + 2})^{-(k - 2 + \kappa(M))} \, 2^{N H(P) - g(N)} \cdot \Bigl( 1 - \max_{R, Q} 2^{N H(Q) - N H(P) + g(N) + f(N)} (2^{\mu(N) N})^{-r(R)} \Bigr) \\
&= 2^{N (H(P) - (k - 2 + \kappa(M)) \mu(N)) - 2(k - 2 + \kappa(M)) - g(N)} \cdot \Bigl( 1 - \max_{R, Q} 2^{N (H(Q) - H(P) - r(R) \mu(N)) + g(N) + f(N)} \Bigr).
\end{align*}

Using (5.14) we get

\begin{align*}
\mathbb{E}[X] - \mathbb{E}[Y] &\geq 2^{N (H(P) - (k - 2 + \kappa(M)) \mu(N)) - 2(k - 2 + \kappa(M)) - g(N)} \bigl( 1 - \tfrac{1}{2} \bigr) \\
&= 2^{N (H(P) - (k - 2 + \kappa(M)) \mu(N)) - 2(k - 2 + \kappa(M)) - g(N) - 1}.
\end{align*}

Then

\begin{align*}
\frac{1}{N} \log_2 \mathrm{Q}(\Phi^{\otimes N}) &\geq \frac{1}{N} \log_2 (\mathbb{E}[X] - \mathbb{E}[Y]) \\
&\geq H(P) - (k - 2 + \kappa(M)) \max_{R, Q} \frac{H(Q) - H(P) + (1 + g(N) + f(N)) \frac{1}{N}}{r(R)} - \frac{2(k - 2 + \kappa(M)) + g(N) + 1}{N}.
\end{align*}

We let $t$, and thus $N$, go to infinity and obtain

\[ \log_2 \tilde{\mathrm{Q}}(\Phi) \geq H(P) - (k - 2) \max_{R, Q} \frac{H(Q) - H(P)}{r(R)}. \]

This lower bound holds for any rational probability distribution $P$ on $\Phi$ and, by continuity, for any real probability distribution $P$ on $\Phi$.

It remains to prove (5.12) and (5.13). We do this in the lemmas below.

Lemma 5.12. $\mathbb{E}[X] = |B| \, |\Psi| \, M^{-(k-1)}$.

Proof. Let $a \in \Psi$. Then $h_1(a_1) + \cdots + h_{k-1}(a_{k-1}) = (k - 1) h_k(a_k)$. The following four statements are equivalent:

$a \in \Psi'$

$\forall i \in [k] \; h_i(a_i) \in B$

$\exists b \in B \; h_1(a_1) = \cdots = h_k(a_k) = b$

$\exists b \in B \; h_1(a_1) = \cdots = h_{k-1}(a_{k-1}) = b$

Therefore

\[ \mathbb{P}[a \in \Psi'] = \sum_{b \in B} \mathbb{P}[h_1(a_1) = \cdots = h_{k-1}(a_{k-1}) = b]. \]

For $b \in B$,

\[ \mathbb{P}[h_1(a_1) = \cdots = h_{k-1}(a_{k-1}) = b] = (M^{-1})^{k-1}. \]

We conclude

\begin{align*}
\mathbb{E}[X] &= \sum_{a \in \Psi} \mathbb{P}[a \in \Psi'] \\
&= \sum_{a \in \Psi} \sum_{b \in B} \mathbb{P}[h_1(a_1) = \cdots = h_{k-1}(a_{k-1}) = b] \\
&= \sum_{a \in \Psi} \sum_{b \in B} (M^{-1})^{k-1} \\
&= |\Psi| \, |B| \, M^{-(k-1)}.
\end{align*}

This proves the lemma.

Lemma 5.13. $\mathbb{E}[Y] \leq |B| \max_{R, Q} 2^{N H(Q) + f(N)} M^{-(k-1) - r(R)}$.

Proof. Let

\[ C = \{(a, a') \in \Psi^2 : a \neq a', \; \exists i \in [k] \; a_i = a'_i\}. \]

Let $(a, a') \in C$. The following statements are equivalent:

\[ (a, a') \in C' \tag{5.15} \]

\[ a, a' \in \Psi' \tag{5.16} \]

\[ \forall i \in [k] \quad h_i(a_i), h_i(a'_i) \in B \tag{5.17} \]

\[ \exists b \in B \quad h_1(a_1) = \cdots = h_k(a_k) = h_1(a'_1) = \cdots = h_k(a'_k) = b \tag{5.18} \]

Therefore

\begin{align*}
\mathbb{E}[Y] &= \sum_{(a, a') \in C} \mathbb{P}[(a, a') \in C'] \\
&= \sum_{(a, a') \in C} \sum_{b \in B} \mathbb{P}[h_1(a_1) = \cdots = h_k(a_k) = h_1(a'_1) = \cdots = h_k(a'_k) = b].
\end{align*}

Let $(a, a') \in C$. Then $h_i(a_i)$ and $h_i(a'_i)$ are $\mathbb{Z}/M\mathbb{Z}$-linear combinations of $u_1, \ldots, u_{k-1}, v_1, \ldots, v_N$. The random variable

\[ \bigl( h_1(a_1), \ldots, h_k(a_k), h_1(a'_1), \ldots, h_k(a'_k) \bigr) \]

is uniformly distributed over the image subspace $V \subseteq (\mathbb{Z}/M\mathbb{Z})^{2k}$. Let $b \in B$. Then $(b, \ldots, b) \in V$, since $u_1 = \cdots = u_{k-1} = b$, $v_1, \ldots, v_N = 0$ is a valid assignment. Therefore

\[ \mathbb{P}[h_1(a_1) = \cdots = h_k(a_k) = h_1(a'_1) = \cdots = h_k(a'_k) = b] = |V|^{-1}. \]

And $|V|$ equals $M$ to the power the rank of the matrix

\[ \begin{pmatrix}
1 & 0 & \cdots & 0 & \frac{1}{k-1} & 1 & 0 & \cdots & 0 & \frac{1}{k-1} \\
0 & 1 & & 0 & \frac{1}{k-1} & 0 & 1 & & 0 & \frac{1}{k-1} \\
\vdots & & \ddots & & \vdots & \vdots & & \ddots & & \vdots \\
0 & 0 & \cdots & 1 & \frac{1}{k-1} & 0 & 0 & \cdots & 1 & \frac{1}{k-1} \\
a_1 & a_2 & \cdots & a_{k-1} & -\frac{a_k}{k-1} & a'_1 & a'_2 & \cdots & a'_{k-1} & -\frac{a'_k}{k-1}
\end{pmatrix} \tag{5.19} \]

over $\mathbb{Z}/M\mathbb{Z}$, with $a_1, \ldots, a_k, a'_1, \ldots, a'_k$ thought of as column vectors in $(\mathbb{Z}/M\mathbb{Z})^N$. With column operations we transform (5.19) into

\[ \begin{pmatrix}
0 & 0 & \cdots & 0 & 0 & 1 & 0 & \cdots & 0 & 0 \\
0 & 0 & \cdots & 0 & 0 & 0 & 1 & & 0 & 0 \\
\vdots & & & & \vdots & \vdots & & \ddots & & \vdots \\
0 & 0 & \cdots & 0 & 0 & 0 & 0 & \cdots & 1 & 0 \\
a_1 - a'_1 & a_2 - a'_2 & \cdots & a_{k-1} - a'_{k-1} & a_k - a'_k & a'_1 & a'_2 & \cdots & a'_{k-1} & 0
\end{pmatrix}. \tag{5.20} \]

Matrix (5.20) has rank equal to $k - 1$ plus $r_M(a, a') = \operatorname{rk}(A(a, a'))$, where

\[ A(a, a') = \begin{pmatrix} a_1 - a'_1 & a_2 - a'_2 & \cdots & a_k - a'_k \end{pmatrix}. \]

We obtain

\[ \mathbb{E}[Y] \le \sum_{(a,a') \in C} \sum_{b \in B} M^{-(k-1+r_M(a,a'))}. \]

Since the summands are independent of $b$ we get

\[ \mathbb{E}[Y] \le |B| \sum_{(a,a') \in C} M^{-(k-1+r_M(a,a'))}. \]

Let $(a, a') \in C$. Consider the rows of $A(a, a')$. The $N$ rows are of the form $x_i - y_i$ with $(x_i, y_i) \in \Phi^2$. Let $s = ((x_1, y_1), \ldots, (x_N, y_N))$. Let $R = \{(x_1, y_1), \ldots, (x_N, y_N)\}$. We have $r_M(a, a') = r_M(R)$ and $r_M(R) = r(R)$ by (5.8). Let $Q$ be the $N$-type with $\mathrm{supp}(Q) = R$ and $s \in T^N_Q$. From $a \neq a'$ follows $R \not\subseteq \{(x, x) : x \in \Phi\}$. From $\exists i \in [k] : a_i = a'_i$ follows $\exists i \in [k] : R \subseteq \{(x, y) : x_i = y_i\}$. From $a, a' \in T^N_{P_1} \times \cdots \times T^N_{P_k}$ follows $Q_i = Q_{k+i} = P_i$ for all $i \in [k]$. We thus have

\[ \mathbb{E}[Y] \le |B| \sum_{R \in \mathcal{R}(\Phi)} \ \sum_{\substack{Q \in \mathcal{Q}(R,(P_1,\ldots,P_k)) \\ \mathrm{supp}(Q) = R,\ Q \text{ is } N\text{-type}}} \ \sum_{s \in T^N_Q} M^{-(k-1+r(R))}. \]

The number of $N$-types $Q$ with $\mathrm{supp}(Q) = R$ is at most the number of $N$-types on $R$, which is at most $\binom{N+|R|-1}{|R|-1}$ (Lemma 4.19). For any $Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))$, $|T^N_Q| \le 2^{NH(Q)}$ (Lemma 4.19). Therefore

\[ \mathbb{E}[Y] \le |B| \sum_{R \in \mathcal{R}(\Phi)} \binom{N+|R|-1}{|R|-1} \max_{Q \in \mathcal{Q}(R,(P_1,\ldots,P_k))} 2^{NH(Q)} M^{-(k-1+r(R))}. \]

Also $|\mathcal{R}(\Phi)| \le 2^{|\Phi|^2}$. Therefore

\[ \mathbb{E}[Y] \le |B|\, 2^{|\Phi|^2} \max_{R \in \mathcal{R}(\Phi)} \binom{N+|R|-1}{|R|-1} \max_{Q \in \mathcal{Q}(R,(P_1,\ldots,P_k))} 2^{NH(Q)} M^{-(k-1+r(R))}. \]

We conclude that

\[ \mathbb{E}[Y] \le |B| \max_{R,Q} 2^{NH(Q)+f(N)} M^{-(k-1)-r(R)}. \]

This proves the lemma.

5.2.2 Computational remarks

The following two lemmas are helpful when applying Theorem 5.7. We leave the proof to the reader.

Lemma 5.14. Let $P \in \mathcal{P}(\Phi)$. Let $R, R' \in \mathcal{R}(\Phi)$ with $R \subseteq R'$ and $r(R) = r(R')$. Then

\[ \max_{Q \in \mathcal{Q}(R,(P_1,\ldots,P_k))} \frac{H(Q) - H(P)}{r(R)} \le \max_{Q \in \mathcal{Q}(R',(P_1,\ldots,P_k))} \frac{H(Q) - H(P)}{r(R')}. \]

Lemma 5.15. Let $R \in \mathcal{R}(\Phi)$. There is an equivalence relation $R' \in \mathcal{R}(\Phi)$ with $R \subseteq R'$ and $r(R) = r(R')$.


5.2.3 Examples: type sets

We discuss some examples. The first example we will use to get good upper bounds on the asymptotic rank of complete graph tensors in Section 5.5. We focus on one family of examples that is parametrised by partitions. Let $\lambda \vdash k$ be an integer partition of $k$ with $d$ parts. Let

\[ \Phi_\lambda = \{a \in \{0, 1, \ldots, d-1\}^k : \mathrm{type}(a) = \lambda\}. \]

The set $\Phi_\lambda$ is tight.

Theorem 5.16. $\log_2 \tilde{Q}(\Phi_{(2,2)}) = 1$.

Proof. Let $\Phi = \Phi_{(2,2)}$. Clearly $\tilde{Q}(\Phi) \le 2$. After relabelling, $\forall a \in \Phi : \sum_{i=1}^k a_i = 0$. We may thus apply Theorem 5.7. Let $P$ be the uniform probability distribution on $\Phi$. Then $H(P) = \log_2 6$.

Let $R \in \mathcal{R}(\Phi)$. We may assume that

\[ R \subseteq \{(1,1,0,0), (1,0,1,0), (1,0,0,1)\}^2 \cup \{(0,0,1,1), (0,1,0,1), (0,1,1,0)\}^2. \]

We may assume $R$ is an equivalence relation (Lemma 5.15). Let $(x, y) \in R$. Let $R' = R \cup \{((1,1,1,1) - x, (1,1,1,1) - y)\}$. Then $R \subseteq R'$ and $R' \in \mathcal{R}(\Phi)$ and $r(R) = r(R')$. We may thus assume that if $(x, y) \in R$, then also $((1,1,1,1) - x, (1,1,1,1) - y) \in R$ (Lemma 5.14).

Let $S = \{(1,1,0,0), (1,0,1,0), (1,0,0,1)\}$. By the above observation it suffices to consider equivalence relations on $S$. There are three types of such equivalence relations.

Type (3): all three elements of $S$ are equivalent. Then $|R| = 18$ and $r(R) = 2$.

Type (2,1): two elements of $S$ are equivalent and inequivalent to the third element (which is equivalent to itself). Then $|R| = 10$ and $r(R) = 1$.

Type (1,1,1): all elements of $S$ are inequivalent. Then $R \subseteq \{(x, x) : x \in \Phi\}$, which is a contradiction.

For type (3) and (2,1) the uniform probability distribution $Q$ on $R$ has marginals $Q_i = Q_{4+i} = P_i$ for $i \in [4]$. The uniform $Q$ is optimal. Then $H(Q) = \log_2 |R|$. Let $R_{(3)}$ and $R_{(2,1)}$ be equivalence relations of type (3) and (2,1). Then

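\begin{align*}
\log_2 \tilde{Q}(\Phi) &\ge \min\Bigl\{ H(P) - \frac{2}{r(R_{(3)})}\bigl(\log_2 |R_{(3)}| - H(P)\bigr),\ H(P) - \frac{2}{r(R_{(2,1)})}\bigl(\log_2 |R_{(2,1)}| - H(P)\bigr) \Bigr\} \\
&= \min\Bigl\{ \log_2 6 - \tfrac{2}{2}(\log_2 18 - \log_2 6),\ \log_2 6 - \tfrac{2}{1}(\log_2 10 - \log_2 6) \Bigr\} \\
&= \min\bigl\{ 1,\ \log_2 \tfrac{54}{25} \bigr\} = 1.
\end{align*}

This proves the theorem.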
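The counts and the final minimum in this proof can be checked mechanically. The following is a minimal sketch, not part of the original argument: it assumes that $r(R)$ is the rank over the rationals of the difference vectors $x - y$ for $(x, y) \in R$ (the definition around (5.8) lies outside this section), and the helper names `rank`, `relation` and `bound` are ours.

```python
from fractions import Fraction
from itertools import product
from math import log2

def rank(rows):
    """Rank over the rationals, by Gaussian elimination."""
    rows = [[Fraction(v) for v in r] for r in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][col] != 0:
                factor = rows[i][col] / rows[rk][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[rk])]
        rk += 1
    return rk

S = [(1, 1, 0, 0), (1, 0, 1, 0), (1, 0, 0, 1)]

def complement(x):
    return tuple(1 - v for v in x)

def relation(blocks):
    """Equivalence relation with the given blocks on S, closed under complementation."""
    R = set()
    for block in blocks:
        for x, y in product(block, repeat=2):
            R.add((x, y))
            R.add((complement(x), complement(y)))
    return R

def bound(R, k=4, entropy_P=log2(6)):
    """H(P) - (k - 2)/r(R) * (log2|R| - H(P)), with r(R) ASSUMED to be the
    rational rank of the differences x - y, (x, y) in R."""
    r = rank([[a - b for a, b in zip(x, y)] for x, y in R])
    return entropy_P - (k - 2) / r * (log2(len(R)) - entropy_P)

R3 = relation([S])               # type (3)
R21 = relation([S[:2], S[2:]])   # type (2,1)
print(len(R3), len(R21), bound(R3), bound(R21))
```

Under these assumptions the two relations have 18 and 10 elements, and the smaller of the two bounds is the one coming from type (3).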


Theorem 5.17. $\log_2 \tilde{Q}(\Phi_{(0^{k-1}1)}) = h(1/k)$.

Proof. We refer to [CVZ16].

With Srinivasan Arunachalam and Peter Vrana we have the following unpublished result.

Theorem 5.18. $\log_2 \tilde{Q}(\Phi_{(0^{k/2}1^{k/2})}) = 1$.

5.3 Combinatorial degeneration method

In this section we extend the (higher-order) Coppersmith–Winograd method via a preorder called combinatorial degeneration. Suppose $\Psi \subseteq I_1 \times \cdots \times I_k$ is not tight but has a tight subset $\Phi \subseteq \Psi$. In the rest of this section we focus on obtaining a lower bound on $\tilde{Q}(\Psi)$ via $\Phi$. This has an application in the context of tri-colored sum-free sets (Section 5.4.2), for example.

Definition 5.19 ([BCS97]). Let $\Phi \subseteq \Psi \subseteq I_1 \times \cdots \times I_k$. We say that $\Phi$ is a combinatorial degeneration of $\Psi$, and write $\Psi \trianglerighteq \Phi$, if there are maps $u_i : I_i \to \mathbb{Z}$ ($i \in [k]$) such that for all $\alpha \in I_1 \times \cdots \times I_k$, if $\alpha \in \Psi \setminus \Phi$, then $\sum_{i=1}^k u_i(\alpha_i) > 0$, and if $\alpha \in \Phi$, then $\sum_{i=1}^k u_i(\alpha_i) = 0$. Note that the maps $u_i$ need not be injective.

Combinatorial degeneration gets its name from the following standard proposition, see e.g. [BCS97, Proposition 15.30].

Proposition 5.20. Let $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$. Let $\Psi = \mathrm{supp}(t)$. Let $\Phi \subseteq \Psi$ such that $\Psi \trianglerighteq \Phi$. Then $t \trianglerighteq t|_\Phi$.

Proposition 5.20 brings us only slightly closer to our goal. Namely, given $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ with $\Psi = \mathrm{supp}(t)$ and given $\Phi \subseteq \Psi$ such that $\Psi \trianglerighteq \Phi$, it follows directly from Proposition 5.20 that $t \trianglerighteq t|_\Phi$ and thus $\tilde{Q}(t) \ge \tilde{Q}(t|_\Phi)$. This, however, does not give us a lower bound on the combinatorial asymptotic subrank $\tilde{Q}(\Psi)$. The following theorem does. Our theorem extends a result in [KSS16].

Theorem 5.21. Let $\Phi \subseteq \Psi \subseteq I_1 \times \cdots \times I_k$. If $\Psi \trianglerighteq \Phi$, then

\[ \tilde{Q}(\Psi) \ge \tilde{Q}(\Phi). \]

Lemma 5.22. Let $\Phi \subseteq \Psi \subseteq I_1 \times \cdots \times I_k$. If $\Psi \trianglerighteq \Phi$, then $\tilde{Q}(\Psi) \ge Q(\Phi)$.

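Proof. Pick maps $u_i : I_i \to \mathbb{Z}$ such that

\[ \sum_{i=1}^k u_i(\alpha_i) = 0 \ \text{ for } \alpha \in \Phi, \qquad \sum_{i=1}^k u_i(\alpha_i) > 0 \ \text{ for } \alpha \in \Psi \setminus \Phi. \]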


Let $D$ be a free diagonal in $\Phi$ with $|D| = Q(\Phi)$ and let

\[ w_i = \sum_{x \in D_i} u_i(x). \]

Let $n \in \mathbb{N}$ and define

\[ W_i = \Bigl\{ (x_1, \ldots, x_{n|D|}) \in I_i^{\times n|D|} : \sum_{j=1}^{n|D|} u_i(x_j) = n w_i \Bigr\}. \]

Then

\[ \Psi^{\times n|D|} \cap (W_1 \times \cdots \times W_k) = \Phi^{\times n|D|} \cap (W_1 \times \cdots \times W_k). \]

The inclusion $\supseteq$ is clear. To show $\subseteq$, let $(x_1, \ldots, x_k) \in \Psi^{\times n|D|} \cap (W_1 \times \cdots \times W_k)$. Write $x_i = (x_{i,1}, x_{i,2}, \ldots, x_{i,n|D|})$ and consider the $n|D| \times k$ matrix of evaluations

\[
\begin{pmatrix}
u_1(x_{1,1}) & u_2(x_{2,1}) & \cdots & u_k(x_{k,1}) \\
u_1(x_{1,2}) & u_2(x_{2,2}) & \cdots & u_k(x_{k,2}) \\
\vdots & \vdots & & \vdots \\
u_1(x_{1,n|D|}) & u_2(x_{2,n|D|}) & \cdots & u_k(x_{k,n|D|})
\end{pmatrix}.
\]

The sum of the $i$th column is $n w_i$ by definition of $W_i$, and $\sum_{i=1}^k n w_i = 0$. The row sums are nonnegative by definition of the maps $u_1, \ldots, u_k$. We conclude that the row sums are zero. Therefore $(x_1, \ldots, x_k)$ is an element of $\Phi^{\times n|D|}$.

Since $D$ is a free diagonal in $\Phi$, $D^{\times n|D|}$ is a free diagonal in $\Phi^{\times n|D|}$, and also $D^{\times n|D|} \cap (W_1 \times \cdots \times W_k)$ is a free diagonal in $\Phi^{\times n|D|} \cap (W_1 \times \cdots \times W_k)$, which in turn is equal to $\Psi^{\times n|D|} \cap (W_1 \times \cdots \times W_k)$. Therefore $D^{\times n|D|} \cap (W_1 \times \cdots \times W_k)$ is also a free diagonal in $\Psi^{\times n|D|}$, i.e.

\[ Q(\Psi^{\times n|D|}) \ge |D^{\times n|D|} \cap (W_1 \times \cdots \times W_k)|. \]

In the set $D^{\times n|D|}$ consider the strings with uniform type, i.e. where all $|D|$ elements of $D$ occur exactly $n$ times. These are clearly in $W_1 \times \cdots \times W_k$ and their number is the multinomial coefficient $\binom{n|D|}{n, \ldots, n}$. Therefore

\[ Q(\Psi^{\times n|D|}) \ge \binom{n|D|}{n, \ldots, n} = |D|^{n|D| - o(n)}, \]

which implies $\tilde{Q}(\Psi) = \lim_{n \to \infty} Q(\Psi^{\times n|D|})^{1/(n|D|)} \ge |D|$.
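The last step uses that the multinomial coefficient counting uniform-type strings has exponential growth rate $|D|$. A small numerical sketch of this fact follows; the function name `uniform_type_count_root` is ours, and `d` plays the role of $|D|$.

```python
from math import exp, lgamma

def uniform_type_count_root(d, n):
    """(n*d)-th root of the multinomial coefficient (n*d)! / (n!)^d,
    computed via log-gamma to avoid huge integers."""
    log_count = lgamma(n * d + 1) - d * lgamma(n + 1)
    return exp(log_count / (n * d))

# The normalised count increases towards d = |D| as n grows.
for n in (1, 10, 100, 1000):
    print(n, uniform_type_count_root(3, n))
```

For $|D| = 3$ the printed values increase with $n$ and stay below 3, in line with the estimate $|D|^{n|D| - o(n)}$.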

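Proof of Theorem 5.21. We have $\tilde{Q}(\Psi) = \lim_{n \to \infty} \tilde{Q}(\Psi^{\times n})^{1/n}$. It follows from Lemma 5.22 that

\[ \lim_{n \to \infty} \tilde{Q}(\Psi^{\times n})^{1/n} \ge \lim_{n \to \infty} Q(\Phi^{\times n})^{1/n}. \]

The right-hand side is $\tilde{Q}(\Phi)$.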


5.4 Cap sets

A subset $A \subseteq (\mathbb{Z}/3\mathbb{Z})^n$ is called a cap set if any line in $A$ is a point, a line being a triple of points of the form $(u, u+v, u+2v)$. Until recently it was not known whether the maximal size of a cap set in $(\mathbb{Z}/3\mathbb{Z})^n$ grows like $3^{n-o(n)}$ or like $c^{n-o(n)}$ for some $c < 3$. Gijswijt and Ellenberg in [EG17], inspired by the work of Croot, Lev and Pach in [CLP17], settled this question, showing that $c \le 3(207 + 33\sqrt{33})^{1/3}/8 \approx 2.755$. Tao realised in [Tao16] that the cap set question may naturally be phrased as the problem of computing the size of the largest main diagonal in powers of the “cap set tensor” $\sum_\alpha e_{\alpha_1} \otimes e_{\alpha_2} \otimes e_{\alpha_3}$, where the sum is over $\alpha_1, \alpha_2, \alpha_3 \in \mathbb{F}_3$ with $\alpha_1 + \alpha_2 + \alpha_3 = 0$. Here main diagonal refers to a subset $A$ of the basis elements such that restricting the cap set tensor to $A \times A \times A$ gives the tensor $\sum_{v \in A} v \otimes v \otimes v$. We show that the cap set tensor is in the $\mathrm{GL}_3(\mathbb{F}_3)^{\times 3}$ orbit of the “reduced polynomial multiplication tensor”, which was studied in [Str91], and we show how recent results follow from this connection using Theorem 5.21.

5.4.1 Reduced polynomial multiplication

Let $t_n$ be the tensor $\sum_\alpha e_{\alpha_1} \otimes e_{\alpha_2} \otimes e_{\alpha_3}$, where the sum is over $(\alpha_1, \alpha_2, \alpha_3)$ in $\{0, 1, \ldots, n-1\}^3$ such that $\alpha_1 + \alpha_2 = \alpha_3$. We call $t_n$ the reduced polynomial multiplication tensor, since $t_n$ is essentially the structure tensor of the algebra $\mathbb{F}[x]/(x^n)$ of univariate polynomials modulo the ideal generated by $x^n$. The support of $t_n$ equals

\[ \{(\alpha_1, \alpha_2, \alpha_3) \in \{0, \ldots, n-1\}^3 \mid \alpha_1 + \alpha_2 = \alpha_3\}, \]

which via $\alpha_3 \mapsto n - 1 - \alpha_3$ we may identify with the set

\[ \Phi_n = \{(\alpha_1, \alpha_2, \alpha_3) \in \{0, \ldots, n-1\}^3 \mid \alpha_1 + \alpha_2 + \alpha_3 = n - 1\}. \tag{5.21} \]

The support $\Phi_n$ is tight (cf. Example 5.1). Strassen proves in [Str91, Theorem 6.7], using Corollary 5.4, that $\tilde{Q}(t_n) = \tilde{Q}(\Phi_n) = z(n)$, where $z(n)$ is defined as

\[ z(n) = \frac{\gamma^n - 1}{\gamma - 1}\, \gamma^{-2(n-1)/3} \tag{5.22} \]

with $\gamma$ equal to the unique positive real solution of the equation $\frac{1}{\gamma - 1} - \frac{n}{\gamma^n - 1} = \frac{n-1}{3}$. The following table contains values of $z(n)$ for small $n$. See also [Str91, Table 1].


n     z(n) rounded    z(n) exact
2     1.88988         $3/2^{2/3} = 2^{h(1/3)}$
3     2.75510         $3(207 + 33\sqrt{33})^{1/3}/8$
4     3.61072
5     4.46158
6     5.30973
7     6.15620
8     7.00155
9     7.84612
10    8.69012
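The rounded values in the table can be reproduced numerically from (5.22). The sketch below is one possible way to do so and is not taken from [Str91]: it locates $\gamma$ by bisection, relying on the fact that the left-hand side of the defining equation is decreasing for $\gamma > 1$ and on $n \ge 2$. The function name `z` mirrors the notation of the text.

```python
def z(n, iterations=200):
    """Evaluate z(n) from (5.22) for n >= 2, locating gamma > 1 by bisection."""
    target = (n - 1) / 3

    def lhs(gamma):
        # Left-hand side of the equation defining gamma; decreasing on (1, inf).
        return 1 / (gamma - 1) - n / (gamma**n - 1)

    lo, hi = 1.0, 2.0
    while lhs(hi) > target:      # widen the bracket until the root is enclosed
        hi *= 2
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if lhs(mid) > target:
            lo = mid
        else:
            hi = mid
    gamma = (lo + hi) / 2
    return (gamma**n - 1) / (gamma - 1) * gamma ** (-2 * (n - 1) / 3)

for n in range(2, 11):
    print(n, round(z(n), 5))
```

For $n = 2$ the equation reduces to $1/(\gamma + 1) = 1/3$, so $\gamma = 2$ and $z(2) = 3/2^{2/3}$, which gives a quick sanity check on the routine.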

In fact, [Str91, Theorem 6.7] says that the asymptotic spectrum of $t_n$ is completely determined by the support functionals, and that the possible values that the spectral points can take on $t_n$ form the closed interval $[z(n), n]$ (cf. Remark 2.21):

\[ X(\mathbb{N}[t_n]) = \{\zeta^\theta|_{\mathbb{N}[t_n]} : \theta \in \mathcal{P}([3])\}, \qquad \{\phi(t_n) : \phi \in X(\mathbb{N}[t_n])\} = [z(n), n]. \]

5.4.2 Cap sets

We turn to cap sets.

Definition 5.23. A three-term progression-free set is a set $A \subseteq (\mathbb{Z}/m\mathbb{Z})^n$ satisfying the following: for all $(x_1, x_2, x_3) \in A^{\times 3}$, there are $u, v \in (\mathbb{Z}/m\mathbb{Z})^n$ such that $(x_1, x_2, x_3) = (u, u+v, u+2v)$ if and only if $x_1 = x_2 = x_3$. Let $r_3((\mathbb{Z}/m\mathbb{Z})^n)$ be the size of the largest three-term progression-free set in $(\mathbb{Z}/m\mathbb{Z})^n$ and define the regularisation $\tilde{r}_3(\mathbb{Z}/m\mathbb{Z}) = \lim_{n \to \infty} r_3((\mathbb{Z}/m\mathbb{Z})^n)^{1/n}$.

A three-term progression-free set in $(\mathbb{Z}/3\mathbb{Z})^n$ is called a cap or cap set. We next discuss an asymmetric variation on three-term progression-free sets, called tri-colored sum-free sets, which are potentially larger. They are interesting since all known upper bound techniques for the size of three-term progression-free sets turn out to be upper bounds on the size of tri-colored sum-free sets.

Definition 5.24. Let $G$ be an abelian group. Let $\Gamma \subseteq G \times G \times G$. For $i \in [3]$ we define the marginal sets $\Gamma_i = \{x \in G : \exists \alpha \in \Gamma : \alpha_i = x\}$. We say $\Gamma$ is tri-colored sum-free if the following holds: the set $\Gamma$ is a diagonal, and for any $\alpha \in \Gamma_1 \times \Gamma_2 \times \Gamma_3$, $\alpha_1 + \alpha_2 + \alpha_3 = 0$ if and only if $\alpha \in \Gamma$. (Recall that $\Gamma \subseteq I_1 \times I_2 \times I_3$ is a diagonal when any two distinct $\alpha, \beta \in \Gamma$ are distinct in all coordinates.) Let $s_3(G)$ be the size of the largest tri-colored sum-free set in $G \times G \times G$ and define the regularisation

\[ \tilde{s}_3(G) = \lim_{n \to \infty} s_3(G^{\times n})^{1/n}. \]

Equivalently, $\Gamma \subseteq G \times G \times G$ is a tri-colored sum-free set if and only if $\Gamma$ is a free diagonal in $\{\alpha \in G \times G \times G : \alpha_1 + \alpha_2 + \alpha_3 = 0\}$.


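If the set $A \subseteq G = (\mathbb{Z}/m\mathbb{Z})^n$ is three-term progression-free, then the set $\Gamma = \{(a, a, -2a) : a \in A\} \subseteq G \times G \times G$ is tri-colored sum-free. Therefore we have $\tilde{r}_3(\mathbb{Z}/m\mathbb{Z}) \le \tilde{s}_3(\mathbb{Z}/m\mathbb{Z})$.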
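Definitions 5.23 and 5.24 and the passage from $A$ to $\Gamma$ can be tested by brute force on small instances. The following sketch does this for the cap set $\{0,1\}^n \subseteq (\mathbb{Z}/3\mathbb{Z})^n$; the function names and the choice of example are ours, and elements of $G$ are represented as tuples of residues.

```python
from itertools import product

def is_progression_free(A, m):
    """Definition 5.23: the only progressions (u, u+v, u+2v) in A^3 are constant.
    A triple is a progression exactly when x1 - 2*x2 + x3 = 0 coordinatewise."""
    for x1, x2, x3 in product(A, repeat=3):
        is_progression = all((a - 2 * b + c) % m == 0 for a, b, c in zip(x1, x2, x3))
        if is_progression and not (x1 == x2 == x3):
            return False
    return True

def is_tricolored_sum_free(Gamma, m):
    """Definition 5.24: Gamma is a diagonal, and a triple from the product of the
    marginal sets sums to zero exactly when it lies in Gamma."""
    Gamma = set(Gamma)
    marginals = [{alpha[i] for alpha in Gamma} for i in range(3)]
    if any(len(marginal) != len(Gamma) for marginal in marginals):
        return False             # two distinct elements share a coordinate
    for alpha in product(*marginals):
        sums_to_zero = all(sum(coords) % m == 0 for coords in zip(*alpha))
        if sums_to_zero != (alpha in Gamma):
            return False
    return True

m, n = 3, 2
A = list(product((0, 1), repeat=n))                       # {0,1}^n, of size 2^n
Gamma = [(a, a, tuple((-2 * x) % m for x in a)) for a in A]
print(is_progression_free(A, m), is_tricolored_sum_free(Gamma, m))
```

Both checks succeed for this example, while the full group $(\mathbb{Z}/3\mathbb{Z})^1$ fails the first check, as expected.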

We summarise the recent history of results on cap sets. For clarity we focus on $m = 3$; we refer the reader to the references for the general results. Edel in [Ede04] proved the lower bound $2.21739 \le \tilde{r}_3(\mathbb{Z}/3\mathbb{Z})$. In [EG17] Ellenberg and Gijswijt proved the upper bound

\[ \tilde{r}_3(\mathbb{Z}/3\mathbb{Z}) \le 3(207 + 33\sqrt{33})^{1/3}/8 \approx 2.755. \]

Blasiak et al. [BCC+17] proved that in fact

\[ \tilde{s}_3(\mathbb{Z}/3\mathbb{Z}) \le 3(207 + 33\sqrt{33})^{1/3}/8. \]

This upper bound was shown to be an equality in [KSS16, Nor16, Peb16].

Theorem 5.25. $\tilde{s}_3(\mathbb{Z}/3\mathbb{Z}) = 3(207 + 33\sqrt{33})^{1/3}/8$.

We reprove Theorem 5.25 by proving that $\tilde{s}_3(\mathbb{Z}/m\mathbb{Z})$ equals the asymptotic subrank $z(m)$ of $t_m$ discussed in Section 5.4.1 when $m$ is a prime power. The significance of our proof lies in the explicit connection to the framework of asymptotic spectra, and not in the obtained value, which also for prime powers $m$ was already computed in [BCC+17, KSS16, Nor16, Peb16].

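Proof. We will prove $\tilde{s}_3(\mathbb{Z}/m\mathbb{Z}) = z(m)$ when $m$ is a prime power. By definition, $\tilde{s}_3(\mathbb{Z}/m\mathbb{Z})$ equals the asymptotic subrank of the set

\[ \{\alpha \in \{0, \ldots, m-1\}^3 : \alpha_1 + \alpha_2 + \alpha_3 = 0 \bmod m\}, \]

which via $\alpha_3 \mapsto \alpha_3 - (m-1)$ we may identify with the set

\[ \Psi_m = \{\alpha \in \{0, \ldots, m-1\}^3 : \alpha_1 + \alpha_2 + \alpha_3 = m - 1 \bmod m\}, \]

and so $\tilde{s}_3(\mathbb{Z}/m\mathbb{Z}) = \tilde{Q}(\Psi_m)$. Let

\[ \Phi_m = \{\alpha \in \{0, \ldots, m-1\}^3 : \alpha_1 + \alpha_2 + \alpha_3 = m - 1\}. \]

We know $\tilde{Q}(\Phi_m) = z(m)$ (Section 5.4.1). We will show that $\tilde{Q}(\Phi_m) = \tilde{Q}(\Psi_m)$ when $m$ is a prime power. This proves the theorem.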

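We prove $\tilde{Q}(\Phi_m) \le \tilde{Q}(\Psi_m)$. There is a combinatorial degeneration $\Phi_m \trianglelefteq \Psi_m$. Indeed, let $u_i : \{0, \ldots, m-1\} \to \{0, \ldots, m-1\}$ be the identity map. If $\alpha \in \Phi_m$, then $\sum_{i=1}^3 u_i(\alpha_i) = m - 1$, and if $\alpha \in \Psi_m \setminus \Phi_m$, then $\sum_{i=1}^3 u_i(\alpha_i)$ equals $m - 1$ plus a positive multiple of $m$. Subtracting the constant $m - 1$ from $u_1$ brings these maps into the normalised form of Definition 5.19. This means Theorem 5.21 applies, and we thus obtain $\tilde{Q}(\Phi_m) \le \tilde{Q}(\Psi_m)$. This proves the claim.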
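For small $m$ the degeneration can be confirmed by enumeration. The sketch below is only an illustration of the claim; the function name `check_degeneration` is ours, and the shift of $u_1$ by $m - 1$ is our normalisation so that the sums vanish on $\Phi_m$ as Definition 5.19 requires.

```python
from itertools import product

def check_degeneration(m):
    """Check that the shifted identity maps witness the degeneration for this m."""
    cube = list(product(range(m), repeat=3))
    Psi = [a for a in cube if sum(a) % m == (m - 1) % m]
    Phi = [a for a in cube if sum(a) == m - 1]

    def total(a):
        # u_1 is the identity shifted by -(m - 1); u_2 and u_3 are the identity.
        return (a[0] - (m - 1)) + a[1] + a[2]

    zero_on_phi = all(total(a) == 0 for a in Phi)
    positive_elsewhere = all(total(a) > 0 and total(a) % m == 0
                             for a in Psi if a not in Phi)
    return set(Phi) <= set(Psi) and zero_on_phi and positive_elsewhere

print(all(check_degeneration(m) for m in range(2, 8)))
```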

We show $\tilde{Q}(\Psi_m) \le \tilde{Q}(\Phi_m)$ when $m$ is a power of the prime $p$. Let $\mathbb{F} = \mathbb{F}_p$. Let $f_m \in \mathbb{F}^m \otimes \mathbb{F}^m \otimes \mathbb{F}^m$ have support $\Psi_m$ with all nonzero coefficients equal to 1. Obviously $\tilde{Q}(\Psi_m) \le \tilde{Q}(f_m)$. To compute $\tilde{Q}(f_m)$ we show that there is a basis in which the support of $f_m$ equals the tight set $\Phi_m$. Then $\tilde{Q}(f_m) = \tilde{Q}(\Phi_m)$ (Corollary 5.4). This implies the claim. We prepare to give the basis (which is the same basis as used in [BCC+17]). First observe that the rule $x \mapsto \binom{x}{a}$ gives a well-defined map $\mathbb{Z}/m\mathbb{Z} \to \mathbb{Z}/p\mathbb{Z}$, since for $a \in \{0, 1, \ldots, m-1\}$, if $x = y \bmod m$, then $\binom{x}{a} = \binom{y}{a} \bmod p$ by Lucas' theorem. Let $(e_x)_x$ be the standard basis of $\mathbb{F}^m$. The elements $\bigl(\sum_{x \in \mathbb{Z}/m\mathbb{Z}} \binom{x}{a} e_x\bigr)_{a \in \mathbb{Z}/m\mathbb{Z}}$ form a basis of $\mathbb{F}^m$, since the matrix $\bigl(\binom{x}{a}\bigr)_{a,x}$ is upper triangular with ones on the diagonal. We will now rewrite $f_m$ in the basis $\bigl((\sum_x \binom{x}{a} e_x)_a, (\sum_y \binom{y}{b} e_y)_b, (\sum_z \binom{z}{c} e_z)_c\bigr)$. Observe that $\binom{x}{m-1}$ equals 1 if and only if $x$ equals $m - 1$ (and equals 0 otherwise), and hence

\[ f_m = \sum_{\substack{x,y,z \in \mathbb{Z}/m\mathbb{Z} \\ x+y+z = m-1}} e_x \otimes e_y \otimes e_z = \sum_{x,y,z \in \mathbb{Z}/m\mathbb{Z}} \binom{x+y+z}{m-1} e_x \otimes e_y \otimes e_z. \]

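The identity $\binom{x+y+z}{w} = \sum \binom{x}{a}\binom{y}{b}\binom{z}{c}$, with sum over $a, b, c \in \{0, 1, \ldots, m-1\}$ such that $a + b + c = w$, is true, and thus

\[ \sum_{x,y,z \in \mathbb{Z}/m\mathbb{Z}} \binom{x+y+z}{m-1} e_x \otimes e_y \otimes e_z = \sum_{x,y,z \in \mathbb{Z}/m\mathbb{Z}} \ \sum_{\substack{a,b,c \in \{0,1,\ldots,m-1\} \\ a+b+c = m-1}} \binom{x}{a}\binom{y}{b}\binom{z}{c}\, e_x \otimes e_y \otimes e_z. \tag{5.23} \]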

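We may simply rewrite (5.23) as

\[ \sum_{\substack{a,b,c \in \{0,1,\ldots,m-1\} \\ a+b+c = m-1}} \Bigl(\sum_{x \in \mathbb{Z}/m\mathbb{Z}} \binom{x}{a} e_x\Bigr) \otimes \Bigl(\sum_{y \in \mathbb{Z}/m\mathbb{Z}} \binom{y}{b} e_y\Bigr) \otimes \Bigl(\sum_{z \in \mathbb{Z}/m\mathbb{Z}} \binom{z}{c} e_z\Bigr). \]

Therefore, with respect to the basis $\bigl((\sum_x \binom{x}{a} e_x)_a, (\sum_y \binom{y}{b} e_y)_b, (\sum_z \binom{z}{c} e_z)_c\bigr)$, the support of $f_m$ equals the tight set $\Phi_m$. (And even stronger, $f_m$ is isomorphic to the tensor of $\mathbb{F}[x]/(x^m)$ of Section 5.4.1.)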
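The three facts about binomial coefficients used in this basis change can be confirmed by direct computation for small prime powers $m = p^e$. The sketch below is a sanity check rather than a proof; the function name `check_basis_change` is ours, and representatives of $\mathbb{Z}/m\mathbb{Z}$ are taken in $\{0, \ldots, m-1\}$.

```python
from itertools import product
from math import comb

def check_basis_change(p, e):
    m = p ** e
    # (1) x -> C(x, a) mod p depends only on x mod m, for 0 <= a < m.
    well_defined = all(comb(x, a) % p == comb(x + m, a) % p
                       for x in range(2 * m) for a in range(m))
    # (2) C(s, m-1) mod p is the indicator of s = m - 1 mod m.
    indicator = all(comb(s, m - 1) % p == (1 if s % m == m - 1 else 0)
                    for s in range(3 * m))
    # (3) C(x+y+z, m-1) = sum over a + b + c = m - 1 of C(x,a) C(y,b) C(z,c).
    identity = all(
        comb(x + y + z, m - 1) == sum(
            comb(x, a) * comb(y, b) * comb(z, m - 1 - a - b)
            for a in range(m) for b in range(m - a))
        for x, y, z in product(range(m), repeat=3))
    return well_defined and indicator and identity

print(all(check_basis_change(p, e) for p, e in [(2, 1), (3, 1), (2, 2), (2, 3), (3, 2)]))
```

Property (3) holds over the integers; properties (1) and (2) are where the prime power assumption enters.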

Remark 5.26. Why did we reprove the cap set result Theorem 5.25? Our motivation, being interested in the asymptotic spectrum of tensors, was to see if the techniques in the cap set papers are stronger than the Strassen support functionals, i.e. whether they give any new spectral points. Above we have seen that the cap set result itself can be proven with the support functionals. In fact, we show in Section 4.6 that for oblique tensors the asymptotic slice rank, which was introduced in [Tao16] to give a concise proof of [EG17], equals the minimum value over the support functionals. In Section 6.11 we show that for all complex tensors, asymptotic slice rank equals the minimum value of the quantum functionals.


5.5 Graph tensors

In this section we briefly discuss the application that motivated us to prove Theorem 5.7 in [CVZ16], namely upper bounding the asymptotic rank of so-called graph tensors. Graph tensors are defined as follows.

Let $G = (V, E)$ be a graph (or hypergraph) with vertex set $V$ and edge set $E$. Let $n \in \mathbb{N}$. Let $(b_i)_{i \in [n]}$ be the standard basis of $\mathbb{F}^n$. We define the graph tensor $T_n(G)$ as

\[ T_n(G) = \sum_{i \in [n]^E} \bigotimes_{v \in V} \Bigl( \bigotimes_{e \in E :\, v \in e} b_{i_e} \Bigr), \]

seen as a $|V|$-tensor. Given a vertex $v \in V$, let $d(v)$ denote the degree of $v$, that is, $d(v)$ equals the number of edges $e \in E$ that contain $v$. Then $T_n(G)$ is naturally in $\bigotimes_{v \in V} \mathbb{F}^{n^{d(v)}}$. We write $T(G)$ for $T_2(G)$. For example, for the complete graph on four vertices $K_4$, the graph tensor is the tensor product of the graph tensors of its six edges,

\[ T(K_4) = \sum_{i \in \{0,1\}^6} (b_{i_1} \otimes b_{i_2} \otimes b_{i_5}) \otimes (b_{i_2} \otimes b_{i_3} \otimes b_{i_6}) \otimes (b_{i_3} \otimes b_{i_4} \otimes b_{i_5}) \otimes (b_{i_1} \otimes b_{i_4} \otimes b_{i_6}), \]

living in $(\mathbb{C}^8)^{\otimes 4}$. Let $K_k$ be the complete graph on $k$ vertices. The $2 \times 2$ matrix multiplication tensor $\langle 2, 2, 2 \rangle$ equals the tensor $T(K_3)$. Define the exponent $\omega(T(G)) = \log_2 \tilde{R}(T(G))$. We study the exponent per edge $\tau(T(G)) = \omega(T(G))/|E(G)|$.

Our result is an upper bound on $\tau(T(K_4))$ in terms of the combinatorial asymptotic subrank $\tilde{Q}(\Phi_{(2,2)})$, which we studied in Theorem 5.16.

Theorem 5.27. For any $q \ge 1$, $\tau(T(K_4)) \le \log_q \Bigl( \dfrac{q + 2}{\tilde{Q}(\Phi_{(2,2)})} \Bigr)$.

Proof. We apply a generalisation of the laser method. See [CVZ16].

Corollary 5.28. Let $k \ge 4$. Then $\tau(T(K_k)) \le 0.772943$.

Proof. In the bound of Theorem 5.27 we plug in the value $\tilde{Q}(\Phi_{(2,2)}) = 2$ from Theorem 5.16. Then we optimise over $q$ to obtain the value $0.772943$. By a “covering argument” we can show that $\tau(T(K_k))$ is non-increasing when $k$ increases.
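The optimisation over $q$ is a one-line computation. The sketch below reads the bound of Theorem 5.27 as $\log_q\bigl((q+2)/\tilde{Q}(\Phi_{(2,2)})\bigr)$ and assumes that $q$ ranges over integers $q \ge 2$; both the function name `tau_bound` and the search range are our choices.

```python
from math import log

def tau_bound(q, subrank=2.0):
    """Right-hand side of Theorem 5.27, read as log_q((q + 2) / subrank)."""
    return log((q + 2) / subrank) / log(q)

best_q = min(range(2, 50), key=tau_bound)
print(best_q, round(tau_bound(best_q), 6))   # prints 7 0.772943
```

With these assumptions the minimum is attained at $q = 7$, where the bound equals $\log_7(9/2)$.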

For $k \ge 4$, Corollary 5.28 improves the upper bound $\tau(T(K_k)) \le 0.790955$ that can be derived from the well-known upper bound of Le Gall [LG14] on the exponent of matrix multiplication $\omega = \omega(T(K_3))$.


A standard “flattening argument” (i.e. using the gauge points from the asymptotic spectrum) yields the lower bound $\tau(T(K_k)) \ge \frac{1}{2} \frac{k}{k-1}$ if $k$ is even and $\tau(T(K_k)) \ge \frac{1}{2} \frac{k+1}{k}$ if $k$ is odd. As a consequence, if the exponent of matrix multiplication $\omega$ equals 2, then $\tau(T(K_4)) = \tau(T(K_3)) = \frac{2}{3}$. We raise the following question: is there a $k \ge 5$ such that $\tau(T(K_k)) < \frac{2}{3}$?

Tensor surgery; cycle graphs

For graph tensors given by sparse graphs, good upper bounds on the asymptotic rank can be obtained with an entirely different method called tensor surgery, which we introduced in [CZ18]. As an illustration, let me mention the results we obtained for cycle graphs with tensor surgery. Recall $\omega = \log_2 \tilde{R}(\langle 2, 2, 2 \rangle) = \log_2 \tilde{R}(T(C_3))$. Let $\omega_k = \log_2 \tilde{R}(T(C_k))$. First observe that $\omega_k = k$ for even $k$. For odd $k$, trivially $k - 1 \le \omega_k \le k$. We prove the following.

Theorem 5.29. For $k, \ell$ odd, $\omega_{k+\ell-1} \le \omega_k + \omega_\ell$.

Corollary 5.30. Let $k \ge 5$ be odd. Then $\omega_k \le \omega_{k-2} + \omega_3$ and thus $\omega_k \le \frac{k-1}{2}\omega$.

Corollary 5.31. If $\omega = 2$, then $\omega_k = k - 1$ for all odd $k$.

See [CZ18] for the proofs.

5.6 Conclusion

Tight tensors are a subfamily of the oblique tensors. For tight 3-tensors, the minimum over the support functionals equals the asymptotic subrank. This is proven via the Coppersmith–Winograd method. The construction is in fact of a very combinatorial nature. In this chapter we studied the combinatorial notion of subrank. We proved that combinatorial subrank is monotone under combinatorial degeneration. We studied the cap set problem via the support functionals. We extended the Coppersmith–Winograd method to higher-order tensors and applied this method to study graph tensors.

Chapter 6

Universal points in the asymptotic spectrum of tensors; entanglement polytopes, moment polytopes

This chapter is based on joint work with Matthias Christandl and Peter Vrana [CVZ18].

6.1 Introduction

In Chapter 4, following Strassen, we introduced the asymptotic spectrum of tensors $X(\mathcal{T}) = X(\mathcal{T}, \leqslant)$ for $\mathcal{T}$ the semiring of $k$-tensors over $\mathbb{F}$, for some fixed integer $k$ and field $\mathbb{F}$, with addition given by direct sum $\oplus$, multiplication given by tensor product $\otimes$, and preorder $\leqslant$ given by restriction (or degeneration). The asymptotic spectrum characterises the asymptotic rank $\tilde{R}$ and the asymptotic subrank $\tilde{Q}$. We have seen that the asymptotic rank plays an important role in algebraic complexity theory: the asymptotic rank of the matrix multiplication tensor $\langle 2, 2, 2 \rangle = \sum_{i,j,k \in [2]} e_{ij} \otimes e_{jk} \otimes e_{ki} \in \mathbb{F}^4 \otimes \mathbb{F}^4 \otimes \mathbb{F}^4$ characterises the exponent of the arithmetic complexity of multiplying two $n \times n$ matrices over $\mathbb{F}$, that is, $\tilde{R}(\langle 2, 2, 2 \rangle) = 2^\omega$. We have also seen in Chapter 5 how one may use the asymptotic subrank to upper bound the size of combinatorial objects like, for example, cap sets in $\mathbb{F}_3^n$.

New results in this chapter

So far, the only elements we have seen in $X(\mathcal{T})$ (i.e. universal spectral points, cf. Section 2.13) are the gauge points (Section 4.3). Besides that, we have seen in Section 4.4 that the Strassen support functionals $\zeta^\theta$ are in $X(\{\text{oblique}\})$. In this chapter we introduce for the first time an explicit infinite family of universal spectral points (over the complex numbers), the quantum functionals. Our new insight is to use the moment polytope. Given a tensor $t \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$, the moment polytope $P(t)$ is a convex polytope that carries representation-theoretic information about $t$. The quantum functionals are defined as maximisations over moment polytopes.

Let me immediately put a disclaimer. The quantum functionals do not give a new lower bound on the asymptotic rank of matrix multiplication $\langle 2, 2, 2 \rangle$; namely, the quantum functionals give the same lower bound as the gauge points. Also, the quantum functionals being defined for tensors over the complex numbers only, we do not expect to get new upper bounds on the size of combinatorial objects that are “like cap sets”.

So what have we gained? Arguably, we have found the “right” viewpoint on how to construct universal spectral points for tensors. (In fact, after writing our paper [CVZ18] we realised that Strassen had begun a study of moment polytopes in the appendix of the German survey [Str05]. Strassen did not construct new universal spectral points, however, not in that publication at least.) If there are more universal spectral points, then our viewpoint may lead the way to finding them. Moreover, whereas no efficient algorithm is known for evaluating the support functionals, the moment polytope viewpoint may open the way to having efficient algorithms for evaluating the quantum functionals.

In Sections 6.2–6.7 we work towards the construction of the quantum functionals and we give a proof that they are universal spectral points. In Sections 6.8–6.10 we compare the quantum functionals and the support functionals, and in Section 6.11 we relate asymptotic slice rank to the quantum functionals.

In this chapter we will focus on 3-tensors, but the theory naturally generalises to $k$-tensors.

6.2 Schur–Weyl duality

For background on representation theory we refer to [Kra84], [Ful97] and [GW09]. Let $S_n$ be the symmetric group on $n$ symbols. Let $S_n$ act on the tensor space $(\mathbb{C}^d)^{\otimes n}$ by permuting the tensor legs,

\[ \pi \cdot v_1 \otimes \cdots \otimes v_n = v_{\pi^{-1}(1)} \otimes \cdots \otimes v_{\pi^{-1}(n)}, \quad \pi \in S_n. \]

Let $\mathrm{GL}_d$ be the general linear group of $\mathbb{C}^d$. Let $\mathrm{GL}_d$ act on $(\mathbb{C}^d)^{\otimes n}$ via the diagonal embedding $\mathrm{GL}_d \to \mathrm{GL}_d^{\times n} : g \mapsto (g, \ldots, g)$,

\[ g \cdot v_1 \otimes \cdots \otimes v_n = (g v_1) \otimes \cdots \otimes (g v_n), \quad g \in \mathrm{GL}_d. \]

The actions of $S_n$ and $\mathrm{GL}_d$ commute, so we have a well-defined action of the product group $S_n \times \mathrm{GL}_d$ on $(\mathbb{C}^d)^{\otimes n}$. Schur–Weyl duality describes the decomposition of the space $(\mathbb{C}^d)^{\otimes n}$ into a direct sum of irreducible $S_n \times \mathrm{GL}_d$-representations. This decomposition is

\[ (\mathbb{C}^d)^{\otimes n} \cong \bigoplus_{\lambda \vdash_d n} [\lambda] \otimes S_\lambda(\mathbb{C}^d) \tag{6.1} \]

with $[\lambda]$ an irreducible $S_n$-representation of type $\lambda$ and $S_\lambda(\mathbb{C}^d)$ an irreducible $\mathrm{GL}_d$-representation of type $\lambda$ when $\ell(\lambda) \le d$, and 0 when $\ell(\lambda) > d$. We use the notation $\lambda \vdash_d n$ for the partitions of $n$ with at most $d$ parts. Let

\[ P_\lambda : (\mathbb{C}^d)^{\otimes n} \to (\mathbb{C}^d)^{\otimes n} \]

be the equivariant projector onto the isotypical component of type $\lambda$, i.e. onto the subspace of $(\mathbb{C}^d)^{\otimes n}$ isomorphic to $[\lambda] \otimes S_\lambda(\mathbb{C}^d)$. The projector $P_\lambda$ is given by the action of the group algebra element

\[ P_\lambda = \Bigl( \frac{\dim[\lambda]}{n!} \Bigr)^2 \sum_{T \in \mathrm{Tab}(\lambda)} c_T \in \mathbb{C}[S_n], \]

where $\mathrm{Tab}(\lambda)$ is the set of Young tableaux of shape $\lambda$ filled with $[n]$, and with $c_T$ the Young symmetrizer

\[ c_T = \sum_{\sigma \in C(T)} \mathrm{sgn}(\sigma)\, \sigma \sum_{\pi \in R(T)} \pi, \]

where $C(T), R(T) \subseteq S_n$ are the subgroups of permutations inside columns and permutations inside rows, respectively. The element $P_\lambda$ is a minimal central idempotent in $\mathbb{C}[S_n]$, and $\sum_{\lambda \vdash n} P_\lambda = e$.

Back to the decomposition of $(\mathbb{C}^d)^{\otimes n}$. We need a handle on the size of the components in the direct sum decomposition (6.1). For our application it is good to think of $d$ as a constant and $n$ as a large number. The number of summands in the direct sum decomposition (6.1) is upper bounded by a polynomial in $n$,

\[ |\{\lambda \vdash_d n\}| \le (n + 1)^d, \]

i.e. there are only few summands compared to the total dimension $d^n$. There are the following well-known bounds on the dimensions of the irreducible representations $[\lambda]$ and $S_\lambda(\mathbb{C}^d)$ that make up the summands:

\[ \frac{n!}{\prod_{\ell=1}^d (\lambda_\ell + d - \ell)!} \le \dim[\lambda] \le \frac{n!}{\prod_{\ell=1}^d \lambda_\ell!}, \tag{6.2} \]

\[ \dim S_\lambda(\mathbb{C}^d) \le (n + 1)^{d(d-1)/2}. \tag{6.3} \]
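These bounds are easy to test for small parameters. The sketch below reads (6.2) with factorials, computes $\dim[\lambda]$ with the determinantal form of the hook length formula and $\dim S_\lambda(\mathbb{C}^d)$ with the Weyl dimension formula (both standard, though neither is stated in the text), and also checks the dimension count $\sum_\lambda \dim[\lambda] \dim S_\lambda(\mathbb{C}^d) = d^n$ implied by (6.1). All function names are ours.

```python
from fractions import Fraction
from math import factorial, prod

def partitions(n, d, largest=None):
    """Yield partitions of n with at most d parts, zero-padded to length d."""
    largest = n if largest is None else largest
    if d == 0:
        if n == 0:
            yield ()
        return
    for first in range(min(n, largest), -1, -1):
        for rest in partitions(n - first, d - 1, first):
            yield (first,) + rest

def dim_specht(lam):
    """dim[lambda]: n! * prod_{i<j}(l_i - l_j) / prod_i l_i!, l_i = lambda_i + d - i."""
    d, n = len(lam), sum(lam)
    l = [lam[i] + d - 1 - i for i in range(d)]
    numerator = factorial(n) * prod(l[i] - l[j] for i in range(d) for j in range(i + 1, d))
    return numerator // prod(factorial(x) for x in l)

def dim_schur(lam):
    """dim S_lambda(C^d) by the Weyl dimension formula, with d = len(lam)."""
    d = len(lam)
    pairs = [(i, j) for i in range(d) for j in range(i + 1, d)]
    return prod(lam[i] - lam[j] + j - i for i, j in pairs) // prod(j - i for i, j in pairs)

def check_bounds(n, d):
    parts = list(partitions(n, d))
    ok = len(parts) <= (n + 1) ** d
    total = 0
    for lam in parts:
        shifted = [lam[i] + d - 1 - i for i in range(d)]
        lower = Fraction(factorial(n), prod(factorial(x) for x in shifted))
        upper = Fraction(factorial(n), prod(factorial(x) for x in lam))
        ok = ok and lower <= dim_specht(lam) <= upper                   # (6.2)
        ok = ok and dim_schur(lam) <= (n + 1) ** (d * (d - 1) // 2)     # (6.3)
        total += dim_specht(lam) * dim_schur(lam)
    return ok and total == d ** n                                       # (6.1)

print(check_bounds(8, 3))
```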

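Let $p \in \mathbb{R}^n$ be a probability vector, i.e. $\sum_{i=1}^n p_i = 1$ and $p_i \ge 0$ for $i \in [n]$. Let $H(p)$ be the Shannon entropy of the probability vector $p$,

\[ H(p) = \sum_{i=1}^n p_i \log_2 \frac{1}{p_i}. \]

For $\alpha \in [0, 1]$ let $h(\alpha) = H((\alpha, 1 - \alpha))$ be the binary entropy. For a partition $\lambda = (\lambda_1, \ldots, \lambda_\ell) \vdash n$ let $\bar{\lambda} = \lambda/n = (\lambda_1/n, \ldots, \lambda_\ell/n)$ be the probability vector obtained by normalising $\lambda$.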


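Let $\lambda \vdash n$. For $N \in \mathbb{N}$ let $N\lambda = (N\lambda_1, N\lambda_2, \ldots, N\lambda_\ell)$ be the stretched partition. We see that, asymptotically in the stretching factor $N$, the dimension of $[N\lambda]$ behaves like a multinomial coefficient, and

\[ 2^{NnH(\bar{\lambda}) - o(N)} \le \dim[N\lambda] \le 2^{NnH(\bar{\lambda})}. \tag{6.4} \]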
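The asymptotic statement can be illustrated numerically: the normalised logarithm of $\dim[N\lambda]$ approaches the entropy of the normalised partition from below. The sketch below again uses the determinantal form of the hook length formula, now in logarithmic form to handle large $N$; the function names are ours.

```python
from math import lgamma, log, log2

def log2_dim_specht(lam):
    """log2 of dim[lambda] = n! * prod_{i<j}(l_i - l_j) / prod_i l_i!,
    with l_i = lambda_i + d - i, evaluated with log-gamma."""
    d, n = len(lam), sum(lam)
    l = [lam[i] + d - 1 - i for i in range(d)]
    value = lgamma(n + 1) - sum(lgamma(x + 1) for x in l)
    value += sum(log(l[i] - l[j]) for i in range(d) for j in range(i + 1, d))
    return value / log(2)

def entropy(p):
    """Shannon entropy in bits of a probability vector."""
    return sum(x * log2(1 / x) for x in p if x > 0)

lam = (2, 1)
n = sum(lam)
H = entropy([x / n for x in lam])
for N in (1, 10, 100, 1000):
    rate = log2_dim_specht(tuple(N * x for x in lam)) / (N * n)
    print(N, rate, H)
```

For $\lambda = (2, 1)$ the rate increases towards $h(1/3) \approx 0.918$ and never exceeds it, matching the two sides of (6.4).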

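6.3 Kronecker and Littlewood–Richardson coefficients $g_{\lambda\mu\nu}$, $c^\lambda_{\mu\nu}$

Let $\mu, \nu \vdash n$. Let $S_n \to S_n \times S_n : \pi \mapsto (\pi, \pi)$ be the diagonal embedding. Consider the decomposition of the tensor product $[\mu] \otimes [\nu]$ restricted along the diagonal embedding,

\[ [\mu] \otimes [\nu] \downarrow^{S_n \times S_n}_{S_n} \cong \bigoplus_{\lambda \vdash n} \mathrm{Hom}_{S_n}([\lambda], [\mu] \otimes [\nu]) \otimes [\lambda]. \]

Define the Kronecker coefficient

\[ g_{\lambda\mu\nu} = \dim \mathrm{Hom}_{S_n}([\lambda], [\mu] \otimes [\nu]), \]

i.e. $g_{\lambda\mu\nu}$ is the multiplicity of $[\lambda]$ in $[\mu] \otimes [\nu]$.

Let $\lambda$ be a partition with at most $a + b$ parts. Let $\mathrm{GL}_a \times \mathrm{GL}_b \to \mathrm{GL}_{a+b} : (A, B) \mapsto A \oplus B$ be the block-diagonal embedding. Consider the decomposition of the representation $S_\lambda(\mathbb{C}^{a+b})$ restricted along the block-diagonal embedding,

\[ S_\lambda(\mathbb{C}^{a+b}) \downarrow^{\mathrm{GL}_{a+b}}_{\mathrm{GL}_a \times \mathrm{GL}_b} \cong \bigoplus_{\mu \vdash_a,\ \nu \vdash_b} H^\lambda_{\mu\nu} \otimes S_\mu(\mathbb{C}^a) \otimes S_\nu(\mathbb{C}^b) \]

with

\[ H^\lambda_{\mu\nu} = \mathrm{Hom}_{\mathrm{GL}_a \times \mathrm{GL}_b}(S_\mu(\mathbb{C}^a) \otimes S_\nu(\mathbb{C}^b), S_\lambda(\mathbb{C}^{a+b})). \]

Define the Littlewood–Richardson coefficient $c^\lambda_{\mu\nu} = \dim H^\lambda_{\mu\nu}$.

For partitions $\lambda, \lambda'$ define $\lambda + \lambda'$ elementwise. The Kronecker and the Littlewood–Richardson coefficients have the following semigroup property (see e.g. [CHM07]).

Lemma 6.1. Let $\lambda, \mu, \nu, \alpha, \beta, \gamma$ be partitions.

(i) If $g_{\lambda\mu\nu} > 0$ and $g_{\alpha\beta\gamma} > 0$, then $g_{\lambda+\alpha,\, \mu+\beta,\, \nu+\gamma} > 0$.

(ii) If $c^\lambda_{\mu\nu} > 0$ and $c^\alpha_{\beta\gamma} > 0$, then $c^{\lambda+\alpha}_{\mu+\beta,\, \nu+\gamma} > 0$.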


6.4 Entropy inequalities

The semigroup properties imply the following lemma. Of this lemma, the first statement can be found in a paper by Christandl and Mitchison [CM06], while we do not know of any source that explicitly states the second statement. For the convenience of the reader we give the proofs of both statements.

Lemma 6.2. Let $\lambda, \mu, \nu$ be partitions.

(i) If $g_{\lambda\mu\nu} > 0$, then $H(\bar{\lambda}) \le H(\bar{\mu}) + H(\bar{\nu})$.

(ii) If $c^\lambda_{\mu\nu} > 0$, then $H(\bar{\lambda}) \le \frac{|\mu|}{|\mu|+|\nu|} H(\bar{\mu}) + \frac{|\nu|}{|\mu|+|\nu|} H(\bar{\nu}) + h\bigl(\frac{|\mu|}{|\mu|+|\nu|}\bigr)$.

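Proof. (i) Let $g_{\lambda\mu\nu} > 0$. Suppose $\lambda, \mu, \nu \vdash n$. Let $N \in \mathbb{N}$. Then Lemma 6.1 implies $g_{N\lambda, N\mu, N\nu} > 0$. This means $\mathrm{Hom}_{S_{Nn}}([N\lambda], [N\mu] \otimes [N\nu]) \neq 0$, which implies $\dim[N\lambda] \le \dim[N\mu] \dim[N\nu]$. From (6.4) we have the dimension bounds

\[ 2^{NnH(\bar{\lambda}) - o(N)} \le \dim[N\lambda], \qquad \dim[N\mu] \le 2^{NnH(\bar{\mu})}, \qquad \dim[N\nu] \le 2^{NnH(\bar{\nu})}. \]

Thus $NnH(\bar{\lambda}) - o(N) \le NnH(\bar{\mu}) + NnH(\bar{\nu})$. Divide by $Nn$ and let $N$ go to infinity to get $H(\bar{\lambda}) \le H(\bar{\mu}) + H(\bar{\nu})$.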

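(ii) We restrict the decomposition

\[ (\mathbb{C}^{a+b})^{\otimes n} \cong \bigoplus_{\lambda \vdash_{a+b} n} [\lambda] \otimes S_\lambda(\mathbb{C}^{a+b}) \]

along the block-diagonal embedding to get

\begin{align*}
(\mathbb{C}^{a+b})^{\otimes n} \downarrow^{\mathrm{GL}_{a+b}}_{\mathrm{GL}_a \times \mathrm{GL}_b}
&\cong \bigoplus_{\lambda \vdash_{a+b} n} [\lambda] \otimes S_\lambda(\mathbb{C}^{a+b}) \downarrow^{\mathrm{GL}_{a+b}}_{\mathrm{GL}_a \times \mathrm{GL}_b} \\
&\cong \bigoplus_{\lambda \vdash_{a+b} n} [\lambda] \otimes \bigoplus_{\mu \vdash_a,\ \nu \vdash_b} \mathbb{C}^{c^\lambda_{\mu\nu}} \otimes S_\mu(\mathbb{C}^a) \otimes S_\nu(\mathbb{C}^b) \\
&\cong \bigoplus_{\mu \vdash_a,\ \nu \vdash_b} \Bigl( \bigoplus_{\lambda \vdash_{a+b} n} [\lambda] \otimes \mathbb{C}^{c^\lambda_{\mu\nu}} \Bigr) \otimes S_\mu(\mathbb{C}^a) \otimes S_\nu(\mathbb{C}^b).
\end{align*}

On the other hand,

\begin{align*}
(\mathbb{C}^{a+b})^{\otimes n} \downarrow
&\cong (\mathbb{C}^a \oplus \mathbb{C}^b)^{\otimes n} \downarrow \\
&\cong (\mathbb{C}^a)^{\otimes n} \oplus \bigl((\mathbb{C}^a)^{\otimes n-1} \otimes \mathbb{C}^b\bigr) \oplus \cdots \oplus (\mathbb{C}^b)^{\otimes n} \downarrow \\
&\cong \bigoplus_{k=0}^n \mathbb{C}^{\binom{n}{k}} \otimes \bigoplus_{\mu \vdash_a k} \bigl([\mu] \otimes S_\mu(\mathbb{C}^a)\bigr) \otimes \bigoplus_{\nu \vdash_b n-k} \bigl([\nu] \otimes S_\nu(\mathbb{C}^b)\bigr) \\
&\cong \bigoplus_{k=0}^n \ \bigoplus_{\mu \vdash_a k,\ \nu \vdash_b n-k} \bigl( \mathbb{C}^{\binom{n}{k}} \otimes [\mu] \otimes [\nu] \bigr) \otimes S_\mu(\mathbb{C}^a) \otimes S_\nu(\mathbb{C}^b).
\end{align*}

Suppose $c^\lambda_{\mu\nu} > 0$. Comparing the above expressions gives the inequality $\dim[\lambda] \le \binom{n}{|\mu|} \dim[\mu] \dim[\nu]$. By the semigroup property (Lemma 6.1) we have $c^{N\lambda}_{N\mu, N\nu} > 0$ for all $N \in \mathbb{N}$. Thus $\dim[N\lambda] \le \binom{Nn}{N|\mu|} \dim[N\mu] \dim[N\nu]$ for all $N \in \mathbb{N}$. Then from (6.4) follows

\[ 2^{NnH(\bar{\lambda}) - o(N)} \le 2^{Nn\, h(|\mu|/n)}\, 2^{N|\mu| H(\bar{\mu})}\, 2^{N|\nu| H(\bar{\nu})}. \]

We conclude $H(\bar{\lambda}) \le h\bigl(\frac{|\mu|}{n}\bigr) + \frac{|\mu|}{n} H(\bar{\mu}) + \frac{|\nu|}{n} H(\bar{\nu})$.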

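Let $x = (x^{(1)}, x^{(2)}, x^{(3)})$ be a triple of probability vectors $x^{(i)} \in \mathbb{R}^{n_i}$. Let $\theta \in \Theta$ be a weighting. Let $H_\theta(x)$ be the $\theta$-weighted average of the Shannon entropies of the probability vectors $x^{(1)}$, $x^{(2)}$ and $x^{(3)}$,

\[ H_\theta(x) = \theta(1) H(x^{(1)}) + \theta(2) H(x^{(2)}) + \theta(3) H(x^{(3)}). \]

(Note that this notation is slightly different from the notation used in Chapter 4.) We will use the notation $\lambda \vdash_3 n$ to say that $\lambda$ is a triple of partitions of $n$, i.e. $\lambda$ equals $(\lambda^{(1)}, \lambda^{(2)}, \lambda^{(3)})$ where each $\lambda^{(i)}$ is a partition of $n$. We write $\bar{\lambda}$ for the normalised triple $(\bar{\lambda}^{(1)}, \bar{\lambda}^{(2)}, \bar{\lambda}^{(3)})$.

Lemma 6.3. Let $\lambda, \mu, \nu$ be three triples of partitions.

(i) If $g_{\lambda^{(i)}\mu^{(i)}\nu^{(i)}} > 0$ for all $i$, then $2^{H_\theta(\bar{\lambda})} \le 2^{H_\theta(\bar{\mu})}\, 2^{H_\theta(\bar{\nu})}$.

(ii) If $\mu \vdash_3 m$, $\nu \vdash_3 n - m$ and $c^{\lambda^{(i)}}_{\mu^{(i)}\nu^{(i)}} > 0$ for all $i$, then $2^{H_\theta(\bar{\lambda})} \le 2^{H_\theta(\bar{\mu})} + 2^{H_\theta(\bar{\nu})}$.

Proof. (i) Suppose $g_{\lambda^{(i)}\mu^{(i)}\nu^{(i)}} > 0$ for all $i$. Then $H(\bar{\lambda}^{(i)}) \le H(\bar{\mu}^{(i)}) + H(\bar{\nu}^{(i)})$ for all $i$ by Lemma 6.2. Thus $\sum_i \theta(i) H(\bar{\lambda}^{(i)}) \le \sum_i \theta(i) H(\bar{\mu}^{(i)}) + \sum_i \theta(i) H(\bar{\nu}^{(i)})$. Then $H_\theta(\bar{\lambda}) \le H_\theta(\bar{\mu}) + H_\theta(\bar{\nu})$. We conclude $2^{H_\theta(\bar{\lambda})} \le 2^{H_\theta(\bar{\mu})}\, 2^{H_\theta(\bar{\nu})}$.

(ii) Suppose $c^{\lambda^{(i)}}_{\mu^{(i)}\nu^{(i)}} > 0$ for all $i$. Then $H(\bar{\lambda}^{(i)}) \le \frac{m}{n} H(\bar{\mu}^{(i)}) + \frac{n-m}{n} H(\bar{\nu}^{(i)}) + h(\frac{m}{n})$ by Lemma 6.2. We take the $\theta$-weighted average to get $H_\theta(\bar{\lambda}) \le \frac{m}{n} H_\theta(\bar{\mu}) + \frac{n-m}{n} H_\theta(\bar{\nu}) + h(\frac{m}{n})$. We conclude $2^{H_\theta(\bar{\lambda})} \le 2^{H_\theta(\bar{\mu})} + 2^{H_\theta(\bar{\nu})}$ by Lemma 4.9(iv).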

6.5 Hilbert spaces and density operators

Endow the vector space $\mathbb{C}^n$ with a hermitian inner product (one may take the standard hermitian inner product $\langle u, v \rangle = \sum_{i=1}^n \overline{u_i} v_i$ for $u, v \in \mathbb{C}^n$, where $\overline{\,\cdot\,}$ denotes taking the complex conjugate), so that it is a Hilbert space.


Let (V1 〈middot middot〉) and (V2 〈middot middot〉) be Hilbert spaces On V1 oplus V2 we define the innerproduct by 〈u1 oplus u2 v1 oplus v2〉 = 〈u1 v1〉+ 〈u2 v3〉 On V1 otimes V2 we define the innerproduct by 〈u1 otimes u2 v1 otimes v2〉 = 〈u1 v1〉〈u2 v2〉 and extending linearly

Let V be a Hilbert space A positive semidefinite hermitian operator ρ V rarr Vwith trace one is called a density operator The sequence of eigenvalues of a densityoperator ρ is a probability vector Let spec(ρ) = (p1 pn) be the sequence ofeigenvalues of ρ ordered non-increasingly p1 ge middot middot middot ge pn

Let $V_1$ and $V_2$ be Hilbert spaces. Given a density operator $\rho$ on $V_1 \otimes V_2$, the reduced density operator $\rho_1 = \operatorname{tr}_2 \rho$ is uniquely defined by the property that $\operatorname{tr}(\rho_1 X_1) = \operatorname{tr}(\rho (X_1 \otimes \mathrm{Id}_{V_2}))$ for all operators $X_1$ on $V_1$. The operator $\rho_1$ is again a density operator. The operation $\operatorname{tr}_2$ is called the partial trace over $V_2$. Explicitly, $\rho_1$ is given by

\[ \langle e_i, \rho_1(e_j)\rangle = \sum_{\ell} \langle e_i \otimes f_\ell, \rho(e_j \otimes f_\ell)\rangle, \]

where the $e_i$ are some orthonormal basis of $V_1$ and the $f_\ell$ are some orthonormal basis of $V_2$ (the statement is independent of basis choice).

Let $V_i$ be a Hilbert space and consider the tensor product $V_1 \otimes V_2 \otimes V_3$. Associate with a nonzero $t \in V_1 \otimes V_2 \otimes V_3$ the dual element $t^* = \langle t, \cdot\rangle \in (V_1 \otimes V_2 \otimes V_3)^*$. Then

\[ \rho^t = t t^* / \langle t, t\rangle = t \langle t, \cdot\rangle / \langle t, t\rangle \]

is a density operator on $V_1 \otimes V_2 \otimes V_3$. Viewing $\rho^t$ as a density operator on the regrouped space $V_1 \otimes (V_2 \otimes V_3)$, we may take the partial trace of $\rho^t$ over $V_2 \otimes V_3$ as described above. We denote the resulting density operator by $\rho^t_1 = \operatorname{tr}_{23} \rho^t$. We similarly define $\rho^t_2 = \operatorname{tr}_{13} \rho^t$ and $\rho^t_3 = \operatorname{tr}_{12} \rho^t$.
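As an illustration of the partial traces above, the following minimal sketch computes the three reduced density matrices of a 3-tensor given in the standard basis. The function name `reduced_density_matrices` and the dictionary representation of a tensor are assumptions made for this example only; indices are 0-based.

```python
from itertools import product

def reduced_density_matrices(t, dims):
    """Return [rho_1, rho_2, rho_3] for a nonzero 3-tensor t.

    t maps index triples (i, j, k) to coefficients; dims = (n1, n2, n3).
    Entry (p, q) of rho_a sums t[x] * conj(t[y]) / <t, t> over all pairs
    x, y that have x[a] = p, y[a] = q and agree in the other two indices.
    """
    norm = sum(abs(c) ** 2 for c in t.values())  # <t, t>
    if norm == 0:
        raise ValueError("the zero tensor has no density operator")
    rhos = [[[0j] * n for _ in range(n)] for n in dims]
    for x, y in product(t, repeat=2):
        for a in range(3):
            if all(x[b] == y[b] for b in range(3) if b != a):
                rhos[a][x[a]][y[a]] += t[x] * t[y].conjugate() / norm
    return rhos

# Hypothetical example: the tensor e0⊗e0⊗e1 + e0⊗e1⊗e0 + e1⊗e0⊗e0.
w = {(0, 0, 1): 1, (0, 1, 0): 1, (1, 0, 0): 1}
rho1, rho2, rho3 = reduced_density_matrices(w, (2, 2, 2))
```

For this example each reduced density matrix is diagonal, so its ordered spectrum can be read off the diagonal; in general one would still have to compute eigenvalues.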

6.6 Moment polytopes $\mathbf{P}(t)$

We give a brief introduction to moment polytopes. We refer to [Nes84, Bri87, Fra02, Wal14] for more information. We begin with the general setting and then specialise to orbit closures in tensor spaces.

6.6.1 General setting

Let $G$ be a connected reductive algebraic group. (We refer to Kraft [Kra84] and Humphreys [Hum75] for an introduction to algebraic groups.) Fix a maximal torus $T \subseteq G$ and a Borel subgroup $T \subseteq B \subseteq G$. We have the character group $X(T)$, the Weyl group $W$, the root system $\Phi \subseteq X(T)$ and the system of positive roots $\Phi^+ \subseteq \Phi$. For $\lambda, \mu \in X(T)$ we set $\lambda \preccurlyeq \mu$ if $\mu - \lambda$ is a sum of positive roots. Let $V$ be a rational $G$-representation. The restriction of the action of $G$ to $T$ gives a decomposition

\[ V = \bigoplus_{\lambda \in X(T)} V_\lambda, \qquad V_\lambda = \{ v \in V : \forall t \in T,\ t \cdot v = \lambda(t) v \}. \]

This decomposition is called the weight decomposition of $V$. The $\lambda \in X(T)$ with $V_\lambda \neq 0$ are called the weights of $V$ with respect to $T$. The $V_\lambda$ are the weight spaces of $V$. For $v \in V$ let $v_\lambda$ be the component of $v$ in $V_\lambda$. Let $\operatorname{supp}(v) = \{\lambda : v_\lambda \neq 0\}$.

Let $E$ be the real vector space $E = X(T) \otimes \mathbb{R}$. The Weyl group $W$ acts on $X(T)$ and thus on $E$. We enlarge $\preccurlyeq$ to a partial order on $E$ as follows. For $x, y \in E$ let $x \preccurlyeq y$ if $y - x$ is a nonnegative linear combination of positive roots. Let $D \subseteq E$ be the positive Weyl chamber. For every $x \in E$ the orbit $W \cdot x$ intersects the positive Weyl chamber $D$ in exactly one point, which we denote by $\operatorname{dom}(x)$.

Let $V$ be a finite-dimensional rational $G$-module. Let $\chi \in X(T) \cap D$ be a dominant character. We denote the $\chi$-isotypical component of $V$ with $V_{(\chi)}$. Let $Z \subseteq V$ be a Zariski closed set. We denote the coordinate ring of $Z$ with $\mathbb{C}[Z]$. We denote the degree $d$ part of $\mathbb{C}[Z]$ with $\mathbb{C}[Z]_d$. If $Z$ is $G$-stable then $\mathbb{C}[Z]_d$ is a $G$-module.

Definition 6.4. Let $V$ be a rational $G$-module and $Z \subseteq V$ a nontrivial irreducible closed $G$-stable cone. The moment polytope of $Z$, denoted by $\mathbf{P}(Z)$, is defined as the Euclidean closure in $E$ of the set

\[ R(Z) = \{ \chi/d : (\mathbb{C}[Z]_d)_{(\chi^*)} \neq 0 \} \]

of normalised characters $\chi/d$ for which the $\chi^*$-isotypical component $(\mathbb{C}[Z]_d)_{(\chi^*)}$ is not zero.

Theorem 6.5 (Mumford–Ness [Nes84], Brion [Bri87], Franz [Fra02]). The moment polytope is indeed a convex polytope, and it is equal to the image of the so-called moment map intersected with the positive Weyl chamber,

\[ \mathbf{P}(Z) = \mu(Z \setminus 0) \cap D. \]

Let $Z = \overline{G \cdot v}$ be the orbit closure (in the Zariski topology) of a vector $v \in V \setminus 0$ and suppose $\overline{G \cdot v}$ is a cone.

Lemma 6.6 (See e.g. [Str05]). Suppose $\overline{G \cdot v}$ is a cone. Then

\[ R(\overline{G \cdot v}) = \{ \chi/d : (\mathbb{C}[\overline{G \cdot v}]_d)_{(\chi^*)} \neq 0 \} = \{ \chi/d : (\operatorname{lin}(G \cdot v^{\otimes d}))_{(\chi)} \neq 0 \}. \]

6.6.2 Tensor spaces

We specialise to 3-tensors. Let $V = V_1 \otimes V_2 \otimes V_3$ with $V_i = \mathbb{C}^{n_i}$. Let

\[ G = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}, \qquad T = T_1 \times T_2 \times T_3, \]

with $T_i$ the diagonal matrices in $\mathrm{GL}_{n_i}$. The weight decomposition of $V$ is the decomposition with respect to the standard basis elements $e_{x_1} \otimes e_{x_2} \otimes e_{x_3}$, where $x \in [n_1] \times [n_2] \times [n_3]$. The support $\operatorname{supp}(v)$ is the support of $v$ with respect to the standard basis.

In the current setting there is a beautiful rephrasing of Theorem 6.5 in terms of ordered spectra of reduced density matrices. Recall from Section 6.5 that for $v \in V \setminus 0$ we have a density matrix $\rho^v$ and reduced density matrices $\rho^v_i$, of which we may take the non-increasingly ordered spectra $\operatorname{spec}(\rho^v_i)$.

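Theorem 6.7 (Walter–Doran–Gross–Christandl [WDGC13]). Let $Z \subseteq V$ be a nontrivial irreducible closed $G$-stable cone. Then

\[ \mathbf{P}(Z) = \{ (\operatorname{spec} \rho^z_1, \operatorname{spec} \rho^z_2, \operatorname{spec} \rho^z_3) : z \in Z \setminus 0 \}. \]

Let $v \in V \setminus 0$. We consider the moment polytope of the orbit closure $Z = \overline{G \cdot v}$. In this setting Lemma 6.6 specialises to the following.

Lemma 6.8 (See e.g. [Str05]).

\[ R(\overline{G \cdot v}) = \{ \chi/d : (\mathbb{C}[\overline{G \cdot v}]_d)_{(\chi^*)} \neq 0 \} = \{ \chi/d : (\operatorname{lin}(G \cdot v^{\otimes d}))_{(\chi)} \neq 0 \} = \{ \chi/d : P_\chi v^{\otimes d} \neq 0 \}, \]

where $P_\chi = P_{\chi^{(1)}} \otimes P_{\chi^{(2)}} \otimes P_{\chi^{(3)}}$ with $P_{\chi^{(i)}} : V_i^{\otimes d} \to V_i^{\otimes d}$ the projector onto the isotypical component of type $\chi^{(i)}$ discussed in Section 6.2.

On the other hand, Theorem 6.7 immediately gives a description of the moment polytope $\mathbf{P}(\overline{G \cdot v})$ in terms of ordered spectra of reduced density matrices.

Theorem 6.9. Let $v \in V \setminus 0$. Then

\[ \mathbf{P}(\overline{G \cdot v}) = \{ (\operatorname{spec} \rho^u_1, \operatorname{spec} \rho^u_2, \operatorname{spec} \rho^u_3) : u \in \overline{G \cdot v} \setminus 0 \}. \]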

Summarising, we have two descriptions of the moment polytope: a representation-theoretic or invariant-theoretic description (Lemma 6.8) and a quantum marginal spectra description (Theorem 6.9). These two descriptions are the key to proving the properties of the quantum functionals that we need.

6.7 Quantum functionals $F^\theta(t)$

We will now define the quantum functionals and prove that they are universal spectral points.

Let $p = (p_1, \ldots, p_n) \in \mathbb{R}^n$ be a probability vector, i.e. $\sum_{i=1}^n p_i = 1$ and $p_i \ge 0$ for all $i \in [n]$. Recall that $H(p)$ denotes the Shannon entropy of the probability vector $p$, $H(p) = \sum_{i=1}^n p_i \log_2 (1/p_i)$. Let $x = (x^{(1)}, x^{(2)}, x^{(3)})$ be a triple of probability vectors $x^{(i)} \in \mathbb{R}^{n_i}$. Let $\theta \in \Theta$ be a weighting. Recall that $H_\theta(x)$ denotes the $\theta$-weighted average of the Shannon entropies of the three probability vectors $x^{(1)}, x^{(2)}, x^{(3)}$,

\[ H_\theta(x) = \theta(1) H(x^{(1)}) + \theta(2) H(x^{(2)}) + \theta(3) H(x^{(3)}). \]
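The two entropy quantities just recalled can be sketched in a few lines. The function names `shannon_entropy` and `weighted_entropy` are chosen for this illustration and do not come from the text; zero entries are skipped, following the usual convention $0 \log_2(1/0) = 0$.

```python
from math import log2

def shannon_entropy(p):
    """H(p) = sum_i p_i * log2(1 / p_i), skipping zero entries."""
    return sum(pi * log2(1 / pi) for pi in p if pi > 0)

def weighted_entropy(theta, x):
    """H_theta(x) = theta(1) H(x1) + theta(2) H(x2) + theta(3) H(x3)."""
    return sum(th * shannon_entropy(xi) for th, xi in zip(theta, x))

# Hypothetical example: uniform weighting and three marginals (2/3, 1/3).
theta = (1 / 3, 1 / 3, 1 / 3)
x = [(2 / 3, 1 / 3)] * 3
value = 2 ** weighted_entropy(theta, x)  # the quantity 2^{H_theta(x)}
```

The quantum functional $F^\theta$ maximises exactly this kind of value $2^{H_\theta(x)}$ over a moment polytope; the sketch only evaluates it at a single point.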

Let $V = \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$. Let $G = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$. Let $v \in V \setminus 0$. We use the notation $\mathbf{P}(v) = \mathbf{P}(\overline{G \cdot v})$ for the moment polytope of the orbit closure of $v$.

Definition 6.10. For $\theta \in \Theta$ and $v \in V \setminus 0$ let

\[ F^\theta(v) = \max \{ 2^{H_\theta(x)} : x \in \mathbf{P}(v) \}. \]

Let $F^\theta(0) = 0$. We call the functions $F^\theta$ the quantum functionals. The name quantum functional comes from the fact that the moment polytope $\mathbf{P}(t)$ consists of triples of quantum marginal entropies.

Theorem 6.11. Let $\mathcal{T}$ be the semiring of 3-tensors over $\mathbb{C}$. Let $\le$ be the restriction preorder. For $\theta \in \Theta$,

\[ F^\theta \in \mathbf{X}(\mathcal{T}, \le). \]

In other words, $F^\theta$ is a semiring homomorphism $\mathcal{T} \to \mathbb{R}_{\ge 0}$ which is monotone under restriction $\le$. In fact, $F^\theta$ is monotone under degeneration $\trianglelefteq$.

Remark 6.12. The results in this chapter generalise to $k$-tensors over $\mathbb{C}$. In our paper [CVZ18] we discuss this general situation in detail and make a distinction between upper quantum functionals and lower quantum functionals.

Let $p \in \mathbb{R}^n$ and $q \in \mathbb{R}^m$ be probability vectors. The tensor product $p \otimes q \in \mathbb{R}^{nm}$ defined by

\[ p \otimes q = (p_i q_j : i \in [n],\ j \in [m]) \]

is a probability vector. The direct sum $p \oplus q \in \mathbb{R}^{n+m}$ is defined by

\[ p \oplus q = (p_1, \ldots, p_n, q_1, \ldots, q_m); \]

for $\alpha \in [0, 1]$ the weighted direct sum $\alpha p \oplus (1-\alpha) q$ is a probability vector.

Let $x = (x^{(1)}, x^{(2)}, x^{(3)})$ and $y = (y^{(1)}, y^{(2)}, y^{(3)})$ be triples of probability vectors. We define the tensor product $x \otimes y$ elementwise,

\[ x \otimes y = (x^{(1)} \otimes y^{(1)},\ x^{(2)} \otimes y^{(2)},\ x^{(3)} \otimes y^{(3)}). \]

We define the direct sum $x \oplus y$ elementwise,

\[ x \oplus y = (x^{(1)} \oplus y^{(1)},\ x^{(2)} \oplus y^{(2)},\ x^{(3)} \oplus y^{(3)}). \]

For $x \otimes y$ and $x \oplus y$ to be in the moment polytope we will need to reorder the components non-increasingly. For a triple of probability vectors $x = (x^{(1)}, x^{(2)}, x^{(3)})$ let $\operatorname{dom}(x)$ be the triple of probability vectors obtained from $x$ by reordering the components of each $x^{(i)}$ such that they become non-increasing. Let $\operatorname{dom}(S) = \{ \operatorname{dom}(x) : x \in S \}$.

For $v \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$ we will use the notation $G(v) = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$ to denote the group that naturally corresponds to the space that $v$ lives in. We will use the notation $\mathbf{P}(v) = \mathbf{P}(\overline{G(v) \cdot v})$ for the moment polytope of the orbit closure of $v$.

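Theorem 6.13. Let $s \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$ and $t \in \mathbb{C}^{m_1} \otimes \mathbb{C}^{m_2} \otimes \mathbb{C}^{m_3}$.

(i) $\operatorname{dom}(\mathbf{P}(s) \otimes \mathbf{P}(t)) \subseteq \mathbf{P}(s \otimes t)$.

(ii) $\forall \alpha \in [0, 1]$: $\operatorname{dom}(\alpha \mathbf{P}(s) \oplus (1-\alpha) \mathbf{P}(t)) \subseteq \mathbf{P}(s \oplus t)$.

(iii) If $s, t \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3} \setminus 0$ and $s \in \overline{G(t) \cdot t}$, then $\mathbf{P}(s) \subseteq \mathbf{P}(t)$.

(iv) $\mathbf{P}(s \oplus 0) = \mathbf{P}(s) \oplus 0$.

(v) $\mathbf{P}(\langle 1 \rangle) = \{((1), (1), (1))\}$ with $\langle 1 \rangle = e_1 \otimes e_1 \otimes e_1 \in \mathbb{C}^1 \otimes \mathbb{C}^1 \otimes \mathbb{C}^1$.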

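Proof. To prove statements (i) and (ii), let $x \in \mathbf{P}(s)$ and $y \in \mathbf{P}(t)$. Then there are elements $a \in \overline{G(s) \cdot s}$ and $b \in \overline{G(t) \cdot t}$ with ordered marginal spectra $x$ and $y$,

\[ x = (\operatorname{spec} \rho^a_1, \operatorname{spec} \rho^a_2, \operatorname{spec} \rho^a_3), \qquad y = (\operatorname{spec} \rho^b_1, \operatorname{spec} \rho^b_2, \operatorname{spec} \rho^b_3). \]

We prove statement (i). We have $a \otimes b \in \overline{G(s \otimes t) \cdot s \otimes t}$. Thus

\[ \operatorname{dom}(x \otimes y) = (\operatorname{spec} \rho^{a \otimes b}_1, \operatorname{spec} \rho^{a \otimes b}_2, \operatorname{spec} \rho^{a \otimes b}_3) \in \mathbf{P}(s \otimes t). \]

We conclude $\operatorname{dom}(\mathbf{P}(s) \otimes \mathbf{P}(t)) \subseteq \mathbf{P}(s \otimes t)$.

We prove statement (ii). Let $\alpha \in [0, 1]$. Define the tensor $u(\alpha) \in \mathbb{C}^{n_1 + m_1} \otimes \mathbb{C}^{n_2 + m_2} \otimes \mathbb{C}^{n_3 + m_3}$ by

\[ u(\alpha) = \frac{\sqrt{\alpha}}{\sqrt{\langle a, a \rangle}}\, a \oplus \frac{\sqrt{1 - \alpha}}{\sqrt{\langle b, b \rangle}}\, b. \]

Then $u(\alpha) \in \overline{G(s \oplus t) \cdot s \oplus t}$. We have $\rho^{u(\alpha)}_i = \alpha \rho^a_i \oplus (1 - \alpha) \rho^b_i$. From the observation

\[ \operatorname{spec}(\alpha \rho^a_i \oplus (1 - \alpha) \rho^b_i) = \operatorname{dom}(\alpha x^{(i)} \oplus (1 - \alpha) y^{(i)}) \]

follows $\operatorname{dom}(\alpha x \oplus (1 - \alpha) y) \in \mathbf{P}(\overline{G(s \oplus t) \cdot s \oplus t})$. We conclude

\[ \operatorname{dom}(\alpha \mathbf{P}(s) \oplus (1 - \alpha) \mathbf{P}(t)) \subseteq \mathbf{P}(s \oplus t). \]

We have thus proven statements (i) and (ii).

We prove statement (iii). Let $G = G(t) = G(s)$. Let $s \in \overline{G \cdot t}$. Then $\overline{G \cdot s} \subseteq \overline{G \cdot t}$, so we have a $G$-equivariant restriction map $\mathbb{C}[\overline{G \cdot t}] \twoheadrightarrow \mathbb{C}[\overline{G \cdot s}]$ on the coordinate rings. Let $\chi/d \in R(\overline{G \cdot s})$ with $(\mathbb{C}[\overline{G \cdot s}]_d)_{(\chi^*)} \neq 0$. Then also $(\mathbb{C}[\overline{G \cdot t}]_d)_{(\chi^*)} \neq 0$ by Schur's lemma. Thus $\chi/d \in R(\overline{G \cdot t}) \subseteq \mathbf{P}(\overline{G \cdot t})$. We conclude $\mathbf{P}(s) \subseteq \mathbf{P}(t)$.

We prove statement (iv). Let $\chi/d \in R(\overline{G(s \oplus 0) \cdot (s \oplus 0)})$ with $P_\chi (s \oplus 0)^{\otimes d} \neq 0$. Recall from Section 6.2 that $P_\chi$ is given by the action of an element in the group algebra $\mathbb{C}[S_d]$, which we also denoted by $P_\chi$. From this viewpoint we see that also $P_\chi s^{\otimes d} \neq 0$. So $\chi/d \in R(\overline{G(s) \cdot s})$.

Statement (v) is a direct observation.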

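Corollary 6.14.

(i) $F^\theta(s) F^\theta(t) \le F^\theta(s \otimes t)$.

(ii) $F^\theta(s) + F^\theta(t) \le F^\theta(s \oplus t)$.

(iii) If $s \trianglelefteq t$, then $F^\theta(s) \le F^\theta(t)$.

(iv) $F^\theta(\langle 1 \rangle) = 1$.

Proof. (i) Let $x \in \mathbf{P}(s)$ and $y \in \mathbf{P}(t)$. Then $\operatorname{dom}(x \otimes y) \in \mathbf{P}(s \otimes t)$ by Theorem 6.13. It is a basic fact that $H_\theta(x) + H_\theta(y) = H_\theta(x \otimes y)$ (Lemma 4.9), so $2^{H_\theta(x)}\, 2^{H_\theta(y)} = 2^{H_\theta(x \otimes y)} = 2^{H_\theta(\operatorname{dom}(x \otimes y))}$, since the entropy does not depend on the order of the components. We conclude $F^\theta(s) F^\theta(t) \le F^\theta(s \otimes t)$.

(ii) Let $x \in \mathbf{P}(s)$ and $y \in \mathbf{P}(t)$. Then by Theorem 6.13, for all $\alpha \in [0, 1]$,

\[ \operatorname{dom}(\alpha x \oplus (1 - \alpha) y) \in \mathbf{P}(s \oplus t). \]

It is a basic fact that $\alpha H_\theta(x) + (1 - \alpha) H_\theta(y) + h(\alpha) = H_\theta(\alpha x \oplus (1 - \alpha) y)$ (Lemma 4.9). Thus for any $\alpha \in [0, 1]$ we have $2^{\alpha H_\theta(x) + (1 - \alpha) H_\theta(y) + h(\alpha)} \le F^\theta(s \oplus t)$. Using Lemma 4.9(iv) we conclude $F^\theta(s) + F^\theta(t) \le F^\theta(s \oplus t)$.

(iii) This follows from statements (iii) and (iv) of Theorem 6.13, since by definition degeneration $s \trianglelefteq t$ means $s \oplus 0 \in \overline{G(t \oplus 0) \cdot (t \oplus 0)}$.

(iv) This follows from statement (v) of Theorem 6.13.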
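The last step of part (ii) rests on optimising over $\alpha$. Lemma 4.9(iv) is not restated in this section; the sketch below assumes it is the standard identity $\max_{\alpha \in [0,1]} 2^{\alpha a + (1-\alpha) b + h(\alpha)} = 2^a + 2^b$, attained at $\alpha = 2^a / (2^a + 2^b)$, and checks that identity numerically. All function names are illustrative.

```python
from math import log2

def binary_entropy(alpha):
    """h(alpha) = -alpha log2(alpha) - (1 - alpha) log2(1 - alpha)."""
    if alpha in (0.0, 1.0):
        return 0.0
    return -alpha * log2(alpha) - (1 - alpha) * log2(1 - alpha)

def exponent_bound(a, b, alpha):
    """The quantity 2^(alpha*a + (1 - alpha)*b + h(alpha))."""
    return 2 ** (alpha * a + (1 - alpha) * b + binary_entropy(alpha))

def best_alpha(a, b):
    """Assumed maximiser alpha = 2^a / (2^a + 2^b)."""
    return 2 ** a / (2 ** a + 2 ** b)

# Hypothetical values standing in for H_theta(x) and H_theta(y).
a, b = 1.3, 0.4
at_optimum = exponent_bound(a, b, best_alpha(a, b))
target = 2 ** a + 2 ** b  # the sum the corollary needs
```

With $a = H_\theta(x)$ and $b = H_\theta(y)$ taken at maximisers of $F^\theta(s)$ and $F^\theta(t)$, the optimal $\alpha$ turns the family of bounds into $F^\theta(s) + F^\theta(t) \le F^\theta(s \oplus t)$.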


Theorem 6.15.

(i) $R(s \otimes t) \subseteq \{ \lambda/N : \exists\, \mu/N \in R(s),\ \nu/N \in R(t),\ g_{\lambda^{(i)}\mu^{(i)}\nu^{(i)}} > 0 \text{ for all } i \}$.

(ii) $R(s \oplus t) \subseteq \{ \lambda/N : \exists\, \mu/m \in R(s),\ \nu/(N - m) \in R(t),\ c^{\lambda^{(i)}}_{\mu^{(i)}\nu^{(i)}} > 0 \text{ for all } i \}$.

Proof. (i) Let $s \in V_1 \otimes V_2 \otimes V_3$ and let $t \in W_1 \otimes W_2 \otimes W_3$. Let $\lambda/N \in R(s \otimes t)$ with $P_\lambda (s \otimes t)^{\otimes N} \neq 0$. Let $\pi$ be the natural reordering map

\[ \pi : ((V_1 \otimes W_1) \otimes (V_2 \otimes W_2) \otimes (V_3 \otimes W_3))^{\otimes N} \to (V_1 \otimes V_2 \otimes V_3)^{\otimes N} \otimes (W_1 \otimes W_2 \otimes W_3)^{\otimes N}. \]

Then

\[ (s \otimes t)^{\otimes N} = \sum_{\mu, \nu} \pi^{-1} (P_\mu \otimes P_\nu) \pi (s \otimes t)^{\otimes N}. \]

Let $\mu, \nu \vdash_3 N$ with $P_\lambda \pi^{-1} (P_\mu \otimes P_\nu) \pi (s \otimes t)^{\otimes N} \neq 0$. Then $P_\mu s^{\otimes N} \neq 0$ and $P_\nu t^{\otimes N} \neq 0$, i.e. $\mu/N \in R(s)$ and $\nu/N \in R(t)$. Moreover, $P_\lambda \pi^{-1} (P_\mu \otimes P_\nu) \pi \neq 0$, which means the Kronecker coefficients $g_{\lambda^{(i)}\mu^{(i)}\nu^{(i)}}$ are nonzero.

(ii) Let $\lambda/N \in R(s \oplus t)$ with $P_\lambda (s \oplus t)^{\otimes N} \neq 0$. Let us expand $(s \oplus t)^{\otimes N}$ as

\[ (s \oplus t)^{\otimes N} = s^{\otimes N} \oplus (s^{\otimes N-1} \otimes t) \oplus \cdots \oplus t^{\otimes N}. \]

Then $P_\lambda$ does not vanish on some summand, which we may assume to be of the form $s^{\otimes m} \otimes t^{\otimes N-m}$. Let $\pi$ be the natural projection

\[ \pi : ((V_1 \oplus W_1) \otimes (V_2 \oplus W_2) \otimes (V_3 \oplus W_3))^{\otimes N} \to (V_1 \otimes V_2 \otimes V_3)^{\otimes m} \otimes (W_1 \otimes W_2 \otimes W_3)^{\otimes N-m}. \]

Let $\mu, \nu$ with $P_\lambda \pi^{-1} (P_\mu \otimes P_\nu) \pi (s \oplus t)^{\otimes N} \neq 0$. Then $P_\mu s^{\otimes m} \neq 0$ and $P_\nu t^{\otimes N-m} \neq 0$. Moreover, $P_\lambda \pi^{-1} (P_\mu \otimes P_\nu) \pi \neq 0$. Therefore the Littlewood–Richardson coefficients $c^{\lambda^{(i)}}_{\mu^{(i)}\nu^{(i)}}$ are nonzero.

Corollary 6.16.

(i) $F^\theta(s \otimes t) \le F^\theta(s) F^\theta(t)$.

(ii) $F^\theta(s \oplus t) \le F^\theta(s) + F^\theta(t)$.

Proof. (i) Let $\lambda/N \in R(s \otimes t)$. By Theorem 6.15 there is a $\mu/N \in R(s)$ and a $\nu/N \in R(t)$ such that the Kronecker coefficient $g_{\lambda^{(i)}\mu^{(i)}\nu^{(i)}}$ is nonzero for every $i$. Then $2^{H_\theta(\overline{\mu})} \le F^\theta(s)$ and $2^{H_\theta(\overline{\nu})} \le F^\theta(t)$ by definition of $F^\theta$. The Kronecker coefficients being nonzero implies

\[ 2^{H_\theta(\overline{\lambda})} \le 2^{H_\theta(\overline{\mu})}\, 2^{H_\theta(\overline{\nu})} \]

by Lemma 6.3. We conclude $F^\theta(s \otimes t) \le F^\theta(s) F^\theta(t)$.

(ii) Let $\lambda/N \in R(s \oplus t)$. Then by Theorem 6.15 there are $\mu/m \in R(s)$ and $\nu/(N - m) \in R(t)$ such that the Littlewood–Richardson coefficient $c^{\lambda^{(i)}}_{\mu^{(i)}\nu^{(i)}}$ is nonzero for every $i$. This means

\[ 2^{H_\theta(\overline{\lambda})} \le 2^{H_\theta(\overline{\mu})} + 2^{H_\theta(\overline{\nu})} \]

by Lemma 6.3. We conclude $F^\theta(s \oplus t) \le F^\theta(s) + F^\theta(t)$.

Proof of Theorem 6.11. Corollary 6.14 and Corollary 6.16 together prove Theorem 6.11.

6.8 Outer approximation

In this section we discuss an outer approximation of $\mathbf{P}(t)$. We will use this outer approximation to show that the quantum functionals are at most the support functionals.

Let $\preccurlyeq$ be the dominance order, i.e. majorization order, on triples of probability vectors. For any set $S \subseteq \mathbb{R}^{n_1} \times \mathbb{R}^{n_2} \times \mathbb{R}^{n_3}$ of triples of probability vectors let $S^{\preccurlyeq}$ denote the upward closure with respect to $\preccurlyeq$,

\[ S^{\preccurlyeq} = \{ y \in \mathbb{R}^{n_1} \times \mathbb{R}^{n_2} \times \mathbb{R}^{n_3} : \exists x \in S,\ x \preccurlyeq y \}. \]

Let $\operatorname{conv}(S)$ denote the convex hull of $S$ in $\mathbb{R}^{n_1} \times \mathbb{R}^{n_2} \times \mathbb{R}^{n_3}$. Recall that for $x \in S$ we defined $\operatorname{dom}(x)$ as the triple of probability vectors obtained from $x = (x^{(1)}, x^{(2)}, x^{(3)})$ by reordering the components of each $x^{(i)}$ such that they become non-increasing, and $\operatorname{dom}(S) = \{ \operatorname{dom}(x) : x \in S \}$.

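Theorem 6.17 (Strassen [Str05]). Let $v \in V \setminus 0$. Then

\[ \mathbf{P}(v) \subseteq (\operatorname{dom} \operatorname{conv} \operatorname{supp} v)^{\preccurlyeq}. \tag{6.5} \]

Proof. We give the proof for the convenience of the reader. Let $\chi/d \in R(\overline{G \cdot v})$. Then $(\operatorname{lin}(G \cdot v^{\otimes d}))_{(\chi)} \neq 0$. Let $M_\chi \subseteq \operatorname{lin}(G \cdot v^{\otimes d})$ be a simple $G$-submodule with highest weight $\chi$. Let $N \subseteq V^{\otimes d}$ be the $G$-module complement, $N \oplus M_\chi = V^{\otimes d}$. Then $v^{\otimes d}$ is not in $N$. Let $v = \bigoplus_{\gamma \in \operatorname{supp} v} v_\gamma$ be the weight decomposition. Then $v^{\otimes d}$ is a sum of tensor products of the $v_\gamma$. At least one summand is not in $N$, say of weight $\eta = \sum_\gamma d_\gamma \gamma$ with $\sum_\gamma d_\gamma = d$. The projection $V^{\otimes d} \to M_\chi$ along $N$ maps this summand onto a nonzero weight vector of weight $\eta$. So $\eta$ is a weight of $M_\chi$. Then also $\operatorname{dom}(\eta)$ is a weight of $M_\chi$. Since $\chi$ is the highest weight of $M_\chi$, $\operatorname{dom}(\eta) \preccurlyeq \chi$. Then $\operatorname{dom}(\eta/d) \preccurlyeq \chi/d$. We have $\eta/d = \sum_\gamma \frac{d_\gamma}{d} \gamma \in \operatorname{conv} \operatorname{supp} v$. We conclude $R(\overline{G \cdot v}) \subseteq (\operatorname{dom} \operatorname{conv} \operatorname{supp} v)^{\preccurlyeq}$ and thus $\mathbf{P}(\overline{G \cdot v}) \subseteq (\operatorname{dom} \operatorname{conv} \operatorname{supp} v)^{\preccurlyeq}$.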


6.9 Inner approximation for free tensors

In this section we discuss an inner approximation for the moment polytope of a free tensor. We will use this inner approximation in the next section to prove that the quantum functionals coincide with the support functionals when restricted to free tensors. We will prove that not all tensors are free.

We say a set $\Phi \subseteq [n_1] \times [n_2] \times [n_3]$ is free if every two different elements of $\Phi$ differ in at least two coordinates, in other words, if the elements of $\Phi$ have Hamming distance at least two. We say $v \in V = \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$ is free if for some $g \in G(v) = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$ the support $\operatorname{supp}(g \cdot v) \subseteq [n_1] \times [n_2] \times [n_3]$ is free. (Free is called schlicht in [Str05].)

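Theorem 6.18 (Strassen [Str05]). Let $v \in V \setminus 0$ with $\operatorname{supp}(v)$ free. Then

\[ \operatorname{dom} \operatorname{conv} \operatorname{supp} v \subseteq \mathbf{P}(v). \]

Proof. We refer to [Str05].

Corollary 6.19. Let $v \in V \setminus 0$ with $\operatorname{supp}(v)$ free. Then

\[ \mathbf{P}(v)^{\preccurlyeq} = (\operatorname{dom} \operatorname{conv} \operatorname{supp} v)^{\preccurlyeq}. \]

Proof. By Theorem 6.18, $\operatorname{dom} \operatorname{conv} \operatorname{supp} v \subseteq \mathbf{P}(v)$. We take the upward closure on both sides to get $(\operatorname{dom} \operatorname{conv} \operatorname{supp} v)^{\preccurlyeq} \subseteq \mathbf{P}(v)^{\preccurlyeq}$. On the other hand, from Theorem 6.17 follows $\mathbf{P}(v)^{\preccurlyeq} \subseteq (\operatorname{dom} \operatorname{conv} \operatorname{supp} v)^{\preccurlyeq}$.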

Remark 6.20. Recall that $v \in V$ is oblique if the support $\operatorname{supp}(g \cdot v)$ is an antichain for some $g \in G(v)$ (Section 4.4). Such antichains are free, so oblique tensors are free. Thus $\{\text{tight}\} \subseteq \{\text{oblique}\} \subseteq \{\text{free}\}$. Like the tight tensors and oblique tensors, free tensors form a semigroup under $\otimes$ and $\oplus$.

Proposition 6.21. For $n \ge 5$ there exists a tensor that is not free in $\mathbb{C}^n \otimes \mathbb{C}^n \otimes \mathbb{C}^n$.

Proof. First, we upper bound the maximal size of a free support. Let $\Phi \subseteq [n] \times [n] \times [n]$ be free. Any two distinct elements in $\Phi$ are still distinct if we forget the third coordinate of each. Therefore $|\Phi| = |\{ (\alpha_1, \alpha_2) : \alpha \in \Phi \}| \le n^2$. (This is a special case of the Singleton bound [Sin64] from coding theory. This upper bound is tight, since $\Phi = \{ (a, b, c) : a, b, c \in [n],\ c = a + b \bmod n \}$ is free and has size $n^2$.) Second, we apply the following observation of Bürgisser [Bur90, page 3]. Let

\[ Z_n = \{ t \in \mathbb{C}^n \otimes \mathbb{C}^n \otimes \mathbb{C}^n : \exists g \in G(t),\ |\operatorname{supp}(g \cdot t)| < n^3 - 3n^2 \}. \]

Let $Y_n = \mathbb{C}^n \otimes \mathbb{C}^n \otimes \mathbb{C}^n \setminus Z_n$. Then the set $Y_n$ is Zariski open and nonempty. Now let $n \ge 5$ and let $t \in Y_n$. Then $\forall g \in G(t)$: $|\operatorname{supp}(g \cdot t)| \ge n^3 - 3n^2 > n^2$. We conclude $t$ is not free.
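The combinatorial part of this proof is easy to check by machine. The sketch below, with illustrative function names and 0-based coordinates, tests freeness via pairwise Hamming distance and builds the support $\{(a, b, a + b \bmod n)\}$ that attains the bound $n^2$.

```python
from itertools import combinations

def hamming_distance(x, y):
    """Number of coordinates in which the tuples x and y differ."""
    return sum(a != b for a, b in zip(x, y))

def is_free(support):
    """A support is free if distinct elements differ in >= 2 coordinates."""
    return all(hamming_distance(x, y) >= 2 for x, y in combinations(support, 2))

def cyclic_support(n):
    """The support {(a, b, a + b mod n)} with coordinates in 0..n-1."""
    return {(a, b, (a + b) % n) for a in range(n) for b in range(n)}

def generic_bound_exceeds_free_bound(n):
    """Check the inequality n^3 - 3n^2 > n^2 used at the end of the proof."""
    return n ** 3 - 3 * n ** 2 > n ** 2
```

The inequality holds exactly when $n > 4$, which is where the hypothesis $n \ge 5$ of the proposition comes from.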


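6.10 Quantum functionals versus support functionals

We discussed the support functionals $\zeta^\theta \in \mathbf{X}(\{\text{oblique 3-tensors over } \mathbb{F}\})$ in Chapter 4. We recall their definition over $\mathbb{C}$. Let $V = \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$. For $\theta \in \Theta = \mathcal{P}([3])$ and $t \in V \setminus 0$ with $\operatorname{supp}(t)$ oblique,

\[ \zeta^\theta(t) = \max \{ 2^{H_\theta(P)} : P \in \mathcal{P}(\operatorname{supp}(t)) \}. \]

We also discussed an extension of $\zeta^\theta$ to all 3-tensors over $\mathbb{C}$, the upper support functional,

\[ \zeta^\theta(t) = \min_{g \in G(t)} \max \{ 2^{H_\theta(P)} : P \in \mathcal{P}(\operatorname{supp}(g \cdot t)) \}. \]

We know $\zeta^\theta(s \otimes t) \le \zeta^\theta(s) \zeta^\theta(t)$, $\zeta^\theta(s \oplus t) = \zeta^\theta(s) + \zeta^\theta(t)$, $\zeta^\theta(\langle 1 \rangle) = 1$ and $s \le t \Rightarrow \zeta^\theta(s) \le \zeta^\theta(t)$ for any $s, t \in V$.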

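The set $\operatorname{conv} \operatorname{supp}(g \cdot t)$ is the set of marginals of probability distributions on $\operatorname{supp}(g \cdot t)$. Thus $\operatorname{dom} \operatorname{conv} \operatorname{supp}(g \cdot t)$ is the set of ordered marginals of probability distributions on $\operatorname{supp}(g \cdot t)$. Therefore

\[ \zeta^\theta(t) = \min_{g \in G(t)} \max_{x \in S(g \cdot t)} 2^{H_\theta(x)} \]

with $S(w) = \operatorname{dom} \operatorname{conv} \operatorname{supp} w$. Let $X \subseteq \mathbb{R}^{n_1} \times \mathbb{R}^{n_2} \times \mathbb{R}^{n_3}$ be a set of triples of probability vectors. From Schur-concavity of the Shannon entropy function follows $\max_{x \in X} 2^{H_\theta(x)} = \max_{x \in X^{\preccurlyeq}} 2^{H_\theta(x)}$: passing to a vector that dominates $x$ can only decrease the entropy. Also $H_\theta(x) = H_\theta(\operatorname{dom} x)$.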
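The reason the upward closure does not change the maximum can be seen on a small example. The following sketch, with illustrative names and vectors of equal length, tests the dominance (majorization) order on single probability vectors and compares entropies.

```python
from math import log2

def shannon_entropy(p):
    """H(p) = sum_i p_i * log2(1 / p_i), skipping zero entries."""
    return sum(pi * log2(1 / pi) for pi in p if pi > 0)

def dominated_by(x, y, tol=1e-12):
    """Return True if x ≼ y, i.e. every partial sum of the sorted y is
    at least the corresponding partial sum of the sorted x."""
    xs, ys = sorted(x, reverse=True), sorted(y, reverse=True)
    partial_x = partial_y = 0.0
    for a, b in zip(xs, ys):
        partial_x += a
        partial_y += b
        if partial_x > partial_y + tol:
            return False
    return True

# Hypothetical example: y dominates x, so y has the smaller entropy.
x = (0.4, 0.3, 0.3)
y = (0.6, 0.3, 0.1)
```

Every element added by the upward closure dominates some element already present, so it never has larger entropy than that element.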

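Theorem 6.22. $\zeta^\theta(t) \ge F^\theta(t)$.

Proof. Let $g \in G(t)$ such that

\[ \max_{x \in S} 2^{H_\theta(x)} = \zeta^\theta(t) \]

with $S = \operatorname{dom} \operatorname{conv} \operatorname{supp}(g \cdot t)$. We have

\[ \max_{x \in S} 2^{H_\theta(x)} = \max_{x \in S^{\preccurlyeq}} 2^{H_\theta(x)}. \]

By Theorem 6.17, $\mathbf{P}(t) \subseteq S^{\preccurlyeq}$. We conclude $F^\theta(t) \le \zeta^\theta(t)$.

Theorem 6.23. Let $t \in V$ be free. Then $\zeta^\theta(t) = F^\theta(t)$.

Proof. We know from Theorem 6.22 that $\zeta^\theta(t) \ge F^\theta(t)$. We prove $\zeta^\theta(t) \le F^\theta(t)$. Let $g \in G(t)$ such that $\operatorname{supp}(g \cdot t)$ is free. Let $S = \operatorname{dom} \operatorname{conv} \operatorname{supp}(g \cdot t)$. Then $\zeta^\theta(t) \le \max_{x \in S} 2^{H_\theta(x)} = \max_{x \in S^{\preccurlyeq}} 2^{H_\theta(x)}$. By Theorem 6.18 we have $S^{\preccurlyeq} = \mathbf{P}(t)^{\preccurlyeq}$. We conclude $\zeta^\theta(t) \le F^\theta(t)$.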


We can show that the regularised upper support functional equals the quantum functional. As a consequence, the quantum functional is at least the lower support functional $\zeta_\theta$, which was discussed in Chapter 4.

Theorem 6.24. $\lim_{n \to \infty} \zeta^\theta(t^{\otimes n})^{1/n} = F^\theta(t)$.

Proof. We refer the reader to [CVZ18].

Corollary 6.25. $F^\theta(v) \ge \zeta_\theta(v)$.

Proof. By Theorem 6.24, $F^\theta(v) = \lim_{n \to \infty} \zeta^\theta(v^{\otimes n})^{1/n}$. We know $\zeta^\theta(v) \ge \zeta_\theta(v)$ by Theorem 4.15, and thus $\lim_{n \to \infty} \zeta^\theta(v^{\otimes n})^{1/n} \ge \lim_{n \to \infty} \zeta_\theta(v^{\otimes n})^{1/n}$. The lower support functional $\zeta_\theta$ is supermultiplicative under $\otimes$ (Theorem 4.14), so

\[ \lim_{n \to \infty} \zeta_\theta(v^{\otimes n})^{1/n} \ge \zeta_\theta(v). \]

Combining these three statements proves the corollary.

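6.11 Asymptotic slice rank

We proved in Section 4.6 that for oblique $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$ the asymptotic slice rank $\lim_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n}$ exists and equals $\min_{\theta \in \Theta} \zeta^\theta(t)$ with $\Theta = \mathcal{P}([3])$. In this section we prove the analogous statement for the quantum functionals.

Theorem 6.26. Let $t \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$. Then

\[ \lim_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n} = \min_{\theta \in \Theta} F^\theta(t). \]

We work towards the proof of Theorem 6.26. Let $t \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3} \setminus 0$. Let $E^\theta(t) = \log_2 F^\theta(t)$.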

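Lemma 6.27. For any $\varepsilon > 0$ there is an $n_0 \in \mathbb{N}$ such that for all $n \ge n_0$ there is a $\lambda/n \in R(t)$ with $\min_{i \in [3]} H(\overline{\lambda^{(i)}}) \ge \min_{\theta \in \Theta} E^\theta(t) - \varepsilon$.

Proof. By definition,

\[ \min_{\theta \in \Theta} E^\theta(t) = \min_{\theta \in \Theta} \max_{x \in \mathbf{P}(t)} \sum_{j \in [3]} \theta(j) H(x^{(j)}). \]

By von Neumann's minimax theorem, the right-hand side equals

\[ \max_{x \in \mathbf{P}(t)} \min_{\theta \in \Theta} \sum_{j \in [3]} \theta(j) H(x^{(j)}), \]

which equals

\[ \max_{x \in \mathbf{P}(t)} \min_{j \in [3]} H(x^{(j)}). \]

Let $\varepsilon > 0$. Let $\mu/m \in R(t)$ with $\min_{j \in [3]} H(\overline{\mu^{(j)}}) \ge \min_{\theta \in \Theta} E^\theta(t) - \varepsilon/2$. We will use two facts. We have $(P_{(1)} \otimes P_{(1)} \otimes P_{(1)}) t = t \neq 0$. The triples of partitions $\lambda$ with $P_\lambda t^{\otimes n} \neq 0$ for some $n$ form a semigroup. Let $n \in \mathbb{N}$. We can write $n = qm + r$ with $q, r \in \mathbb{N}$, $0 \le r < m$. Let $\lambda^{(j)} = q \mu^{(j)} + (r)$. Then by the semigroup property $P_\lambda t^{\otimes n} \neq 0$, i.e. $\lambda/n \in R(t)$. We have $\frac{1}{n}(q \mu^{(j)} + (r)) = \frac{qm}{n} \overline{\mu^{(j)}} + \frac{r}{n} \overline{(r)}$. By concavity of Shannon entropy,

\[ H\bigl(\tfrac{1}{n}(q \mu^{(j)} + (r))\bigr) = H\bigl(\tfrac{qm}{n} \overline{\mu^{(j)}} + \tfrac{r}{n} \overline{(r)}\bigr) \ge \tfrac{qm}{n} H(\overline{\mu^{(j)}}) \ge (1 - \tfrac{m}{n}) H(\overline{\mu^{(j)}}). \]

When $n$ is large enough, $(1 - \frac{m}{n}) H(\overline{\mu^{(j)}})$ is at least $H(\overline{\mu^{(j)}}) - \varepsilon/2$. Let $n_0 \in \mathbb{N}$ such that this is the case for all $j \in [3]$.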

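Lemma 6.28. Let $\lambda/n \in R(t)$. Then $\mathrm{SR}(t^{\otimes n}) \ge \min_{i \in [3]} \dim [\lambda^{(i)}]$.

Proof. We have the restriction $t^{\otimes n} \ge P_\lambda t^{\otimes n} \neq 0$. Choose rank-one projections $A_j$ in the vector spaces $S_{\lambda^{(j)}}(\mathbb{C}^{n_j})$ with

\[ s = \bigl( (\mathrm{id}_{[\lambda^{(1)}]} \otimes A_1) \otimes (\mathrm{id}_{[\lambda^{(2)}]} \otimes A_2) \otimes (\mathrm{id}_{[\lambda^{(3)}]} \otimes A_3) \bigr) P_\lambda t^{\otimes n} \neq 0. \]

The tensor $s$ is invariant under $S_n$ acting diagonally on $(\mathbb{C}^{n_1})^{\otimes n} \otimes (\mathbb{C}^{n_2})^{\otimes n} \otimes (\mathbb{C}^{n_3})^{\otimes n}$. Thus the marginal spectra $\operatorname{spec} \rho^s_i$ are uniform. This implies $s$ is semistable. From [BCC+17, Theorem 4.6] follows that $\mathrm{SR}(s)$ equals $\min_{i \in [3]} \dim [\lambda^{(i)}]$.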

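Lemma 6.29. $\liminf_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n} \ge \min_{\theta \in \Theta} F^\theta(t)$.

Proof. Let $\varepsilon > 0$. For $n$ large enough choose $\lambda/n \in R(t)$ as in Lemma 6.27. By Lemma 6.28, $\mathrm{SR}(t^{\otimes n}) \ge \min_{i \in [3]} \dim [\lambda^{(i)}]$. The right-hand side we lower bound by

\[ \min_{i \in [3]} \dim [\lambda^{(i)}] \ge \min_{i \in [3]} 2^{n H(\overline{\lambda^{(i)}})}\, 2^{-o(n)} \ge 2^{n (\min_{\theta \in \Theta} E^\theta(t) - \varepsilon)}\, 2^{-o(n)}. \]

Then $\liminf_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n} \ge 2^{\min_{\theta \in \Theta} E^\theta(t) - \varepsilon}$.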

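Lemma 6.30. For every $\theta \in \Theta$, $\limsup_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n} \le F^\theta(t)$.

Proof. Let $n \in \mathbb{N}$. Define $s_1, s_2, s_3 \in (\mathbb{C}^{n_1})^{\otimes n} \otimes (\mathbb{C}^{n_2})^{\otimes n} \otimes (\mathbb{C}^{n_3})^{\otimes n}$ by

\[ s_1 = \Bigl( \sum_{\substack{\lambda^{(1)} \vdash n \\ H(\overline{\lambda^{(1)}}) \le E^\theta(t)}} P_{\lambda^{(1)}} \otimes \mathrm{Id} \otimes \mathrm{Id} \Bigr) t^{\otimes n}, \]

\[ s_2 = \Bigl( \sum_{\substack{\lambda^{(2)} \vdash n \\ H(\overline{\lambda^{(2)}}) \le E^\theta(t)}} \mathrm{Id} \otimes P_{\lambda^{(2)}} \otimes \mathrm{Id} \Bigr) (t^{\otimes n} - s_1), \]

\[ s_3 = \Bigl( \sum_{\substack{\lambda^{(3)} \vdash n \\ H(\overline{\lambda^{(3)}}) \le E^\theta(t)}} \mathrm{Id} \otimes \mathrm{Id} \otimes P_{\lambda^{(3)}} \Bigr) (t^{\otimes n} - s_1 - s_2). \]

Then $t^{\otimes n} = s_1 + s_2 + s_3$. The slice rank of an element in the image of $P_{\lambda^{(1)}} \otimes \mathrm{Id} \otimes \mathrm{Id}$ is at most $\dim([\lambda^{(1)}] \otimes S_{\lambda^{(1)}}(\mathbb{C}^{n_1}))$, which is at most $2^{n H(\overline{\lambda^{(1)}}) + o(n)}$ (Section 6.2). Similarly for $\mathrm{Id} \otimes P_{\lambda^{(2)}} \otimes \mathrm{Id}$ and $\mathrm{Id} \otimes \mathrm{Id} \otimes P_{\lambda^{(3)}}$. The tensor $s_1$ is in the image of the sum $\sum_{\lambda^{(1)}} P_{\lambda^{(1)}} \otimes \mathrm{Id} \otimes \mathrm{Id}$ over $\lambda^{(1)} \vdash n$ with at most $n_1$ parts. There are at most $(n + 1)^{n_1}$ such partitions. Thus $\mathrm{SR}(s_1) \le (n + 1)^{n_1}\, 2^{n E^\theta(t) + o(n)}$. Similarly for $s_2$ and $s_3$. Therefore

\[ \limsup_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n} \le \limsup_{n \to \infty} \bigl( 3 (n + 1)^{\max_{i \in [3]} n_i}\, 2^{n E^\theta(t) + o(n)} \bigr)^{1/n}. \tag{6.6} \]

The right-hand side of (6.6) equals $F^\theta(t)$.

Proof of Theorem 6.26. Lemma 6.29 and Lemma 6.30 together prove Theorem 6.26.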

6.12 Conclusion

In this chapter we constructed the first infinite family of spectral points for 3-tensors over $\mathbb{C}$, the quantum functionals. For 30 years the only explicit spectral points known were the gauge points. The constructions in this chapter naturally generalise to higher-order tensors, for which we refer to our paper [CVZ18]. We do not know whether the quantum functionals are all spectral points for 3-tensors over $\mathbb{C}$. Finally, we showed that for complex tensors the asymptotic slice rank exists and equals the minimum value over the quantum functionals.

Chapter 7

Algebraic branching programs: approximation and nondeterminism

This chapter is based on joint work with Karl Bringmann and Christian Ikenmeyer [BIZ17].

7.1 Introduction

The study of asymptotic tensor rank in previous chapters was originally motivated by the study of the complexity of matrix multiplication in the algebraic circuit model, an algebraic model of computation. In this chapter we will study several other algebraic models of computation and algebraic complexity classes.

Formulas, the class $\mathrm{VP}_e$ and the determinant

An (arithmetic) formula is a rooted binary tree whose leaves are each labeled with a variable or a field constant, and whose root and intermediate vertices are labeled with either $+$ (addition) or $\times$ (multiplication). In the natural way, via recursion over the tree structure, a formula computes a multivariate polynomial $f$. The formula size of a multivariate polynomial $f$ is the smallest number of vertices required for any formula to compute $f$. Here is an example of a formula of size 7 computing the polynomial $(3 + x)(3 + y)$:

[Figure: formula tree with root $\times$ whose two children are $+$ vertices; the left $+$ has leaves $3$ and $x$, the right $+$ has leaves $3$ and $y$.]
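The example formula can be written down and evaluated directly. In this minimal sketch a formula is a nested tuple `(op, left, right)` with leaves that are either variable names or constants; this representation and the function names are assumptions made for illustration.

```python
def size(formula):
    """Number of vertices of a formula given as a nested tuple."""
    if isinstance(formula, tuple):
        _, left, right = formula
        return 1 + size(left) + size(right)
    return 1  # a leaf: variable name or constant

def evaluate(formula, env):
    """Evaluate a formula by recursion over the tree structure."""
    if isinstance(formula, tuple):
        op, left, right = formula
        l, r = evaluate(left, env), evaluate(right, env)
        return l + r if op == "+" else l * r
    return env[formula] if isinstance(formula, str) else formula

# The formula from the text: (3 + x) * (3 + y), with 7 vertices.
example = ("*", ("+", 3, "x"), ("+", 3, "y"))
```

Counting vertices this way gives the size of one particular formula; the formula size of a polynomial is the minimum of this count over all formulas computing it.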

A sequence of multivariate polynomials $(f_n)_{n \in \mathbb{N}}$ is called a family. Valiant in his seminal paper [Val79] introduced the complexity class $\mathrm{VP}_e$ that is defined as the set of all families whose formula size is polynomially bounded. (We say a sequence $(a_n)_n \in \mathbb{N}^{\mathbb{N}}$ of natural numbers is polynomially bounded if there exists a univariate polynomial $q$ such that $a_n \le q(n)$ for all $n$.) For example, the family $((x_1)^n + (x_2)^n + \cdots + (x_n)^n)_n$ is in $\mathrm{VP}_e$, because the formula size of this family grows quadratically.
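The quadratic growth can be made concrete by building the obvious formula for the power sum and counting its vertices. The nested-tuple representation and the function names below are illustrative assumptions; the count is an upper bound on the formula size, since it uses one particular formula.

```python
from functools import reduce

def size(formula):
    """Number of vertices of a formula given as a nested tuple."""
    if isinstance(formula, tuple):
        _, left, right = formula
        return 1 + size(left) + size(right)
    return 1

def power(var, n):
    """Formula for var^n using n leaves and n - 1 multiplications."""
    return reduce(lambda acc, _: ("*", acc, var), range(n - 1), var)

def power_sum(n):
    """Formula for x1^n + ... + xn^n using n - 1 additions."""
    terms = [power(f"x{i}", n) for i in range(1, n + 1)]
    return reduce(lambda acc, term: ("+", acc, term), terms)

def obvious_size(n):
    """n powers of 2n - 1 vertices each, plus n - 1 additions."""
    return n * (2 * n - 1) + (n - 1)  # equals 2 n^2 - 1
```

So this formula has $2n^2 - 1$ vertices, which is polynomially bounded in $n$.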

The smallest known formulas for the determinant family $\det_n$ have size $n^{O(\log n)}$. This follows from Berkowitz' algorithm [Ber84], which gives an algebraic circuit of depth $O(\log^2 n)$, and thus by expanding we get an algebraic formula of depth $O(\log^2 n)$, whose size is then trivially bounded by $2^{O(\log^2 n)} = n^{O(\log n)}$. It is a major open question in algebraic complexity theory whether formulas of polynomially bounded size exist for $\det_n$. This question can be phrased in terms of complexity classes as asking whether or not the inclusion $\mathrm{VP}_e \subseteq \mathrm{VP}_s$ is strict. (We will define $\mathrm{VP}_s$ shortly.)

Motivated by this question we study the closure class $\overline{\mathrm{VP}_e}$ of families of polynomials that can be approximated arbitrarily closely by families in $\mathrm{VP}_e$ (see Section 7.2.4 for the formal definition). Over the field $\mathbb{R}$ or $\mathbb{C}$ one can think of $\overline{\mathrm{VP}_e}$ as the set of families whose border formula size is polynomially bounded. The border formula size of a polynomial $f$ is the smallest number $c$ such that there exists a sequence $g_i$ of polynomials with formula size at most $c$ and $\lim_{i \to \infty} g_i = f$.

Continuous lower bounds

In algebraic complexity theory, problem instances correspond to vectors $v \in \mathbb{F}^n$. A complexity lower bound often takes the form of a function $f : \mathbb{F}^n \to \mathbb{F}$ that is zero on the vectors of "low complexity" and nonzero on $v$. We refer to Grochow [Gro13] for a discussion of settings where complexity lower bounds are obtained in this way (e.g. [NW97, Raz09, LO15, GKKS13, LMR13, BI13]). Over the complex numbers we can in fact assume that these functions $f$ are continuous [Gro13] (and even so-called highest-weight vector polynomials). If $C$ and $D$ are algebraic complexity classes with $C \subseteq D$ (for example $C = \mathrm{VP}_e$ and $D = \mathrm{VP}_s$), then a proof of separation $D \not\subseteq C$ in this continuous manner implies the stronger separation $D \not\subseteq \overline{C}$. In our case it is thus natural to aim for the separation $\mathrm{VP}_s \not\subseteq \overline{\mathrm{VP}_e}$ instead of the slightly weaker $\mathrm{VP}_s \not\subseteq \mathrm{VP}_e$, which provides further motivation for studying $\overline{\mathrm{VP}_e}$. This is exactly analogous to the geometric complexity theory approach of Mulmuley and Sohoni (see e.g. [MS01, MS08] and the exposition [BLMW11, Sec. 9]), which aims to prove the separation $\mathrm{VNP} \not\subseteq \overline{\mathrm{VP}_s}$ to attack Valiant's famous conjecture $\mathrm{VP}_s \neq \mathrm{VNP}$ [Val79]. (Here $\mathrm{VNP}$ is the class of p-definable families, see Section 7.2.4.)

New results in this chapter

We prove two new results in this chapter.

Algebraic branching programs of width 2. An algebraic branching program (abp) is a directed acyclic graph with a source vertex $s$ and a sink vertex $t$ that has affine linear forms over the base field $\mathbb{F}$ as edge labels. Moreover, we require that each vertex is labeled with an integer (its layer) and that edges in the abp only point from vertices in layer $i$ to vertices in layer $i + 1$. The width of an abp is the cardinality of its largest layer. The size of an abp is the number of its vertices. The value of an abp is the sum of the values of all $s$–$t$-paths, where the value of an $s$–$t$-path is the product of its edge labels. We say that an abp computes its value. The class $\mathrm{VP}_s$ coincides with the class of families of polynomials that can be computed by abps of polynomially bounded size, see e.g. [Sap16].

For $k \in \mathbb{N}$ we introduce the class $\mathrm{VP}_k$ as the class of families of polynomials computable by width-$k$ abps of polynomially bounded size. It is well-known (see Lemma 7.2) that $\mathrm{VP}_k \subseteq \mathrm{VP}_e$ for all $k \ge 1$. In 1992 Ben-Or and Cleve [BOC92] showed that $\mathrm{VP}_k = \mathrm{VP}_e$ for all $k \ge 3$. In 2011 Allender and Wang [AW16] showed that width-2 abps cannot compute every polynomial, so in particular we have a strict inclusion $\mathrm{VP}_2 \subsetneq \mathrm{VP}_3$.

We prove that the closure of $\mathrm{VP}_2$ and the closure of $\mathrm{VP}_e$ are equal,

\[ \overline{\mathrm{VP}_2} = \overline{\mathrm{VP}_e}, \tag{7.1} \]

when $\operatorname{char}(\mathbb{F}) \neq 2$. From (7.1) and the result of Allender and Wang follows directly that the inclusion $\mathrm{VP}_2 \subsetneq \overline{\mathrm{VP}_2}$ is strict. We have thus separated a complexity class from its approximation closure.

$\mathrm{VNP}$ via affine linear forms. Every algebraic complexity class has a nondeterministic closure (see Section 7.2.5 for the definition). The nondeterministic closure of $\mathrm{VP}$ is called $\mathrm{VNP}$ and the nondeterministic closure of $\mathrm{VP}_e$ is called $\mathrm{VNP}_e$. In 1980 Valiant [Val80] proved $\mathrm{VNP}_e = \mathrm{VNP}$. The nondeterministic closure of $\mathrm{VP}_1$ and $\mathrm{VP}_2$ we call $\mathrm{VNP}_1$ and $\mathrm{VNP}_2$. Using interpolation techniques we can deduce $\mathrm{VNP}_2 = \mathrm{VNP}$ from (7.1), provided the field is infinite. Using more sophisticated techniques we prove

\[ \mathrm{VNP}_1 = \mathrm{VNP}. \tag{7.2} \]

From (7.2) easily follows $\mathrm{VP}_1 \subsetneq \mathrm{VNP}_1$. Also, from [AW16] we get $\mathrm{VP}_2 \subsetneq \mathrm{VNP}_2$. We have thus separated complexity classes from their nondeterministic closures.

Further related work

An excellent exposition on the history of small-width computation can be found in [AW16], along with an explicit polynomial that cannot be computed by width-2 abps, namely $x_1 x_2 + x_3 x_4 + \cdots + x_{15} x_{16}$. Saha, Saptharishi and Saxena in [SSS09, Cor. 14] showed that $x_1 x_2 + x_3 x_4 + x_5 x_6$ cannot be computed by width-2 abps that correspond to the iterated matrix multiplication of upper triangular matrices. Bürgisser in [Bur04] studied approximations in the model of general algebraic circuits, finding general upper bounds on the error degree. For most algebraic complexity classes $C$ the relation between $C$ and $\overline{C}$ has not been an active object of study. As pointed out recently by Forbes [For16], Nisan's result [Nis91] implies that $C = \overline{C}$ for $C$ being the class of size-$k$ algebraic branching programs on noncommuting variables. A structured study of $\overline{\mathrm{VP}}$ and $\overline{\mathrm{VP}_s}$ was started in [GMQ16]. Much work in lower bounds for algebraic approximation algorithms has been done in the area of bilinear complexity, dating back to [BCRL79, Str83, Lic84] and more recently e.g. [Lan06, LO15, HIL13, Zui17, LM16a].

This chapter is organised as follows. In Section 7.2 we discuss definitions and basic results. In Section 7.3 we prove that the approximation closure of $\mathrm{VP}_2$ equals the approximation closure of $\mathrm{VP}_e$, i.e. $\overline{\mathrm{VP}_2} = \overline{\mathrm{VP}_e}$. In Section 7.4 we prove that the nondeterminism closure of $\mathrm{VP}_1$ equals $\mathrm{VNP}$.

7.2 Definitions and basic results

We briefly recall the definitions of circuits, formulas and branching programs, and we recall the definitions of the corresponding complexity classes. Then we discuss some straightforward relationships among these classes and review the proof of a theorem by Ben-Or and Cleve, which inspired our work. Finally, we discuss the approximation closure and the nondeterminism closure for algebraic complexity classes.

7.2.1 Computational models

Let $x_1, x_2, \ldots$ be formal variables. By $\mathbb{F}[x]$ we mean the ring of polynomials over $\mathbb{F}$ with variables $x_1, x_2, \ldots, x_k$ with $k$ large enough.

A circuit is a directed acyclic graph $G$ with one or more source vertices and one sink vertex. Each source vertex is labelled by a variable $x_i$ or a constant $c \in \mathbb{F}$. The other vertices are labelled by either $+$ or $\times$ and have in-degree 2 (that is, fan-in 2). Each vertex computes an element in $\mathbb{F}[x]$ by recursion over the graph. The element computed by the sink is the element computed by the circuit. The size of a circuit is the number of vertices.

A formula is a circuit whose graph is a tree.

An algebraic branching program (abp) is a directed acyclic graph with a source vertex $s$ and a sink vertex $t$ that has affine linear forms $\alpha x_i + \beta$, $\alpha, \beta \in \mathbb{F}$, as edge labels. Moreover, we require that each vertex is labelled with an integer (its layer) and that edges in the abp only point from vertices in layer $i$ to vertices in layer $i+1$. The width of an abp is the cardinality of its largest layer. The size of an abp is the number of its vertices. The value of an abp is the sum of the values of all $s$–$t$ paths, where the value of an $s$–$t$ path is the product of its edge labels. We say that an abp computes its value.

For example, the following abp has depth 5, width 3 and computes the polynomial $x_1x_2 + x_2 + 2x_1 - 1$.

[Figure: a width-3 abp from $s$ to $t$ whose edges carry the labels $x_1$, $2$, $x_1$, $x_2$ and $-1$.]

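An abp $G$ corresponds naturally to an iterated product of matrices: for any two consecutive layers $L_i, L_{i+1}$ in $G$, let $M_i$ be the matrix $(e_{v,w})_{v \in L_i,\, w \in L_{i+1}}$ with $e_{v,w}$ the label of the edge from $v$ to $w$ (or 0 if there is no edge from $v$ to $w$). Then the value of $G$ equals the product $M_k \cdots M_2 M_1$.

For example, the above abp corresponds to the following iterated matrix product:

\[
\begin{pmatrix} 1 & 1 & 1 \end{pmatrix}
\begin{pmatrix} -1 & 0 & 0 \\ 0 & x_2 & 0 \\ 0 & 0 & x_1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ x_1 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix}
\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}.
\]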
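The correspondence between an abp and an iterated matrix product can be checked numerically on the example above. The following is a minimal sketch in plain Python; the helper names `mat_mul`, `abp_value` and `target` are illustrative and not part of the text. It evaluates the four-factor product at integer points and compares the result with $x_1x_2 + x_2 + 2x_1 - 1$.

```python
def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def abp_value(x1, x2):
    """Value of the example abp, computed as the iterated matrix product."""
    row = [[1, 1, 1]]
    m2 = [[-1, 0, 0], [0, x2, 0], [0, 0, x1]]
    m1 = [[1, 0, 0], [x1, 1, 0], [0, 0, 2]]
    col = [[1], [1], [1]]
    return mat_mul(mat_mul(mat_mul(row, m2), m1), col)[0][0]


def target(x1, x2):
    """The polynomial the example abp is claimed to compute."""
    return x1 * x2 + x2 + 2 * x1 - 1


# Both sides have degree at most 1 in each variable, so agreement on a
# 7 x 7 grid of points already implies equality as polynomials.
for a in range(-3, 4):
    for b in range(-3, 4):
        assert abp_value(a, b) == target(a, b)
```

Working with evaluations at integer points keeps the sketch free of any computer-algebra dependency.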

7.2.2 Complexity classes VP, VP_e, VP_k

The circuit size of a polynomial $f$ is the size of the smallest circuit computing $f$. The formula size of a polynomial $f$ is the size of the smallest formula computing $f$.

A family is a sequence $(f_n)_{n \in \mathbb{N}}$ of multivariate polynomials over $\mathbb{F}$. A class is a set of families. The class $\mathrm{VP}$ consists of all families $(f_n)$ with circuit size, degree and number of variables in $\mathrm{poly}(n)$. The class $\mathrm{VP}_e$ consists of all families $(f_n)$ with formula size in $\mathrm{poly}(n)$. (The origin of the subscript $e$ in $\mathrm{VP}_e$ is the term “arithmetic expression”.) Clearly, $\mathrm{VP}_e \subseteq \mathrm{VP}$.

We introduce classes defined by abps. Let $k \geq 1$. The class $\mathrm{VP}_k$ consists of all families computed by polynomial-size width-$k$ abps with edges labelled by affine linear forms $\sum_i \alpha_i x_i + \beta$ with coefficients $\alpha_i, \beta \in \mathbb{F}$.

We note that the above classes depend on the choice of the ground field $\mathbb{F}$.

In our paper [BIZ17] we make a distinction between three different types of edge labels for abps. The class $\mathrm{VP}_k$ in this chapter corresponds to the class $\mathrm{VP}_k^{g}$ in [BIZ17].


7.2.3 The theorem of Ben-Or and Cleve

This subsection is about the relations among $\mathrm{VP}_k$ and $\mathrm{VP}_e$.

Lemma 7.1. $\mathrm{VP}_k \subseteq \mathrm{VP}_\ell$ when $k \leq \ell$.

Proof. This is clearly true.

Lemma 7.2. $\mathrm{VP}_k \subseteq \mathrm{VP}_e$ for any $k$.

Proof. For the simple proof we refer to [BIZ17].

Ben-Or and Cleve [BOC92] showed that for $k \geq 3$ the classes $\mathrm{VP}_k$ and $\mathrm{VP}_e$ are in fact equal.

Theorem 7.3 (Ben-Or and Cleve [BOC92]). For $k \geq 3$, $\mathrm{VP}_k = \mathrm{VP}_e$.

We will review the construction of Ben-Or and Cleve here, because we will use it to prove Theorem 7.8 and Theorem 7.15. The following depth-reduction lemma for formulas by Brent is a crucial ingredient.

Lemma 7.4 (Brent [Bre74]). Let $f$ be an $n$-variate degree-$d$ polynomial computed by a formula of size $s$. Then $f$ can also be computed by a formula of size $\mathrm{poly}(s, n, d)$ and depth $O(\log s)$.

Proof. See the survey of Saptharishi [Sap16, Lemma 5.5] for a modern proof.

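Proof of Theorem 7.3. Lemma 7.2 says $\mathrm{VP}_k \subseteq \mathrm{VP}_e$. We will prove the inclusion $\mathrm{VP}_e \subseteq \mathrm{VP}_3$, from which $\mathrm{VP}_e \subseteq \mathrm{VP}_k$ follows by Lemma 7.1, and thus $\mathrm{VP}_k = \mathrm{VP}_e$. For a polynomial $h$ define the matrix

\[
M(h) = \begin{pmatrix} 1 & 0 & 0 \\ h & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix},
\]

which as part of an abp looks like

[Figure: the partial abp corresponding to $M(h)$, with one edge labelled $h$.]

We call the following matrices primitive:

• $M(h)$ with $h$ any variable or any constant in $\mathbb{F}$,

• the $3 \times 3$ permutation matrices, denoted by $M_\pi$ with $\pi \in S_3$,

• the diagonal matrices $M_{a,b,c} = \mathrm{diag}(a, b, c)$ with $a, b, c \in \mathbb{F}$.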


The entries of the primitives are variables or constants in $\mathbb{F}$, making them suitable to use in the construction of a width-3 abp.

Let $(f_n) \in \mathrm{VP}_e$. Then $f_n$ can be computed by a formula of size $s(n) \in \mathrm{poly}(n)$. By Brent's depth-reduction theorem for formulas (Lemma 7.4), $f_n$ can be computed by a formula of size $\mathrm{poly}(n)$ and depth $d(n) \in O(\log s(n))$.

We will construct a sequence of primitives $A_1, \ldots, A_{m(n)}$ such that

\[
A_1 \cdots A_{m(n)} = \begin{pmatrix} 1 & 0 & 0 \\ f_n & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\]

with $m(n) \in O(4^{d(n)}) = \mathrm{poly}(n)$. Then

\[
f_n(x) = \begin{pmatrix} 1 & 1 & 1 \end{pmatrix} M_{-1,1,0}\, A_1 \cdots A_{m(n)} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix},
\]

so $f_n(x)$ can be computed by a width-3 abp of length $\mathrm{poly}(n)$, proving the theorem.

To explain the construction, let $h$ be a polynomial and consider a formula computing $h$ of depth $d$. The goal is to construct (recursively on the formula structure) primitives $A_1, \ldots, A_m$ such that

\[
A_1 \cdots A_m = \begin{pmatrix} 1 & 0 & 0 \\ h & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\]

with $m \in O(4^d)$.

Suppose $h$ is a variable or a constant. Then $M(h)$ is itself a primitive matrix.

Suppose $h = f + g$ is a sum of two polynomials $f, g$ and suppose $M(f)$ and $M(g)$ can be written as a product of primitives. Then $M(f + g)$ equals a product of primitives, because $M(f + g) = M(f)M(g)$. This can easily be verified directly, or by noting that in the corresponding partial abps the top-bottom paths ($u_i$-$v_j$ paths) have the same value.

[Figure: two partial abps between top vertices $u_1, u_2, u_3$ and bottom vertices $v_1, v_2, v_3$; the gadget for $f$ followed by the gadget for $g$ is equivalent to the single gadget for $f + g$.]

Suppose $h = fg$ is a product of two polynomials $f, g$ and suppose $M(f)$ and $M(g)$ can be written as a product of primitives. Then $M(fg)$ equals a product of primitives, because

\[
M(f \cdot g) = M_{(23)} \bigl( M_{1,-1,1}\, M_{(123)}\, M(g)\, M_{(132)}\, M(f) \bigr)^2 M_{(23)}
\]


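(here $(23) \in S_3$ denotes the transposition $1 \mapsto 1$, $2 \mapsto 3$, $3 \mapsto 2$ and $(123) \in S_3$ denotes the cyclic shift $1 \mapsto 2$, $2 \mapsto 3$, $3 \mapsto 1$), as can be verified either directly or by checking that in the corresponding partial abps the top-bottom paths ($u_i$-$v_j$ paths) have the same value.

[Figure: two partial abps between top vertices $u_1, u_2, u_3$ and bottom vertices $v_1, v_2, v_3$; a gadget built from the edge labels $f$, $-1$, $g$, $f$, $g$, $-1$ is equivalent to the single gadget for $f \cdot g$.]

This completes the construction.

The length $m$ of the construction is $m(h) = 1$ for $h$ a variable or constant, and recursively $m(f + g) = m(f) + m(g)$ and $m(f \cdot g) = 2(m(f) + m(g)) + 8$, where the additive constant counts the permutation and diagonal primitives in the product formula. So $m \in O(4^d)$, where $d$ is the depth of the formula computing $h$.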
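The two identities behind the recursion, $M(f+g) = M(f)M(g)$ and the product formula for $M(f \cdot g)$, can be checked by evaluating $f$ and $g$ at numbers. The sketch below is illustrative only. It assumes a convention the text does not spell out, namely that the permutation matrix $M_\pi$ has a 1 in row $\pi(i)$ of column $i$ (so that $M_\pi e_i = e_{\pi(i)}$); under the transposed convention the product formula would not hold as written. All function names are hypothetical.

```python
from functools import reduce


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def product(mats):
    """Left-to-right product of a list of matrices."""
    return reduce(mat_mul, mats)


def M(h):
    """The 3 x 3 matrix M(h) from the proof of Theorem 7.3."""
    return [[1, 0, 0], [h, 1, 0], [0, 0, 1]]


def diag(a, b, c):
    return [[a, 0, 0], [0, b, 0], [0, 0, c]]


def perm_matrix(pi):
    """Assumed convention: column i has its 1 in row pi(i) (1-based)."""
    m = [[0] * 3 for _ in range(3)]
    for i, image in pi.items():
        m[image - 1][i - 1] = 1
    return m


T23 = perm_matrix({1: 1, 2: 3, 3: 2})    # the transposition (23)
C123 = perm_matrix({1: 2, 2: 3, 3: 1})   # the cyclic shift (123)
C132 = perm_matrix({1: 3, 2: 1, 3: 2})   # its inverse (132)


def M_sum(f, g):
    """M(f + g) as the product M(f) M(g)."""
    return mat_mul(M(f), M(g))


def M_product(f, g):
    """M(f * g) via the product formula of Ben-Or and Cleve."""
    block = product([diag(1, -1, 1), C123, M(g), C132, M(f)])
    return product([T23, block, block, T23])


def extract(a):
    """(1 1 1) * diag(-1, 1, 0) * a * (1 1 1)^T, which reads off h from M(h)."""
    b = mat_mul(diag(-1, 1, 0), a)
    return sum(sum(row) for row in b)


for f in range(-3, 4):
    for g in range(-3, 4):
        assert M_sum(f, g) == M(f + g)
        assert M_product(f, g) == M(f * g)
        assert extract(M(f)) == f
```

The product gadget uses each of $M(f)$ and $M(g)$ twice and adds eight further primitives, which is what drives the factor 4 per level of formula depth.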

The above result of Ben-Or and Cleve (Theorem 7.3) raises the intriguing question whether the inclusion $\mathrm{VP}_2 \subseteq \mathrm{VP}_e$ is strict. Allender and Wang [AW16] show that the inclusion is indeed strict; in fact, they show that some polynomials cannot be computed by any width-2 abp.

Theorem 7.5 (Allender and Wang [AW16]). The polynomial

\[ x_1x_2 + x_3x_4 + \cdots + x_{15}x_{16} \]

cannot be computed by any width-2 abp. Therefore, we have the separation of classes $\mathrm{VP}_2 \subsetneq \mathrm{VP}_3 = \mathrm{VP}_e$.


7.2.4 Approximation closure $\overline{C}$

We define the norm of a complex multivariate polynomial as the sum of the absolute values of its coefficients. This defines a topology on the polynomial ring $\mathbb{C}[x_1, \ldots, x_m]$. Given a complexity measure $L$, say abp size or formula size, there is a natural notion of approximate complexity that is called border complexity. Namely, a polynomial $f \in \mathbb{C}[x]$ has border complexity $L^{\mathrm{top}}$ at most $c$ if there is a sequence of polynomials $g_1, g_2, \ldots$ in $\mathbb{C}[x]$ converging to $f$ such that each $g_i$ satisfies $L(g_i) \leq c$. It turns out that for reasonable classes over the field of complex numbers $\mathbb{C}$, this topological notion of approximation is equivalent to what we call algebraic approximation (see e.g. [Bur04]). Namely, a polynomial $f \in \mathbb{C}[x]$ satisfies $L^{\mathrm{alg}}(f) \leq c$ iff there are polynomials $f_1, \ldots, f_e \in \mathbb{C}[x]$ such that the polynomial

\[ h = f + \varepsilon f_1 + \varepsilon^2 f_2 + \cdots + \varepsilon^e f_e \in \mathbb{C}[\varepsilon, x] \]

has complexity $L_{\mathbb{C}(\varepsilon)}(h) \leq c$, where $\varepsilon$ is a formal variable and $L_{\mathbb{C}(\varepsilon)}(h)$ denotes the complexity of $h$ over the field extension $\mathbb{C}(\varepsilon)$. This algebraic notion of approximation makes sense over any base field and we will use it in the statements and proofs of this chapter.

Definition 7.6. Let $C(\mathbb{F})$ be a class over the field $\mathbb{F}$. We define the approximation closure $\overline{C}(\mathbb{F})$ as follows: a family $(f_n)$ over $\mathbb{F}$ is in $\overline{C}(\mathbb{F})$ if there are polynomials $f_{n,i}(x) \in \mathbb{F}[x]$ and a function $e \colon \mathbb{N} \to \mathbb{N}$ such that the family $(g_n)$ defined by

\[ g_n(x) = f_n(x) + \varepsilon f_{n,1}(x) + \varepsilon^2 f_{n,2}(x) + \cdots + \varepsilon^{e(n)} f_{n,e(n)}(x) \]

is in $C(\mathbb{F}(\varepsilon))$. We define the poly-approximation closure $\overline{C}^{\mathrm{poly}}(\mathbb{F})$ similarly, but with the additional requirement that $e(n) \in \mathrm{poly}(n)$. We call $e(n)$ the error degree.

7.2.5 Nondeterminism closure N(C)

We introduce the nondeterminism closure for algebraic complexity classes.

Definition 7.7. Let $C$ be a class. The class $N(C)$ consists of families $(f_n)$ with the following property: there is a family $(g_n) \in C$ and $p(n), q(n) \in \mathrm{poly}(n)$ such that

\[ f_n(x) = \sum_{b \in \{0,1\}^{p(n)}} g_{q(n)}(b, x), \]

where $x$ and $b$ denote sequences of variables $x_1, x_2, \ldots$ and $b_1, b_2, \ldots, b_{p(n)}$. We say that $f(x)$ is a hypercube sum over $g$ and that $b_1, b_2, \ldots, b_{p(n)}$ are the hypercube variables. For any subscript $x$, we will use the notation $\mathrm{VNP}_x$ to denote $N(\mathrm{VP}_x)$. We remark that the map $C \mapsto N(C)$ trivially satisfies all properties of being a Kuratowski closure operator, i.e. $N(\emptyset) = \emptyset$, $C \subseteq N(C)$, $N(C \cup D) = N(C) \cup N(D)$ and $N(N(C)) = N(C)$.


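7.3 Approximation closure of VP_2

We show that every polynomial can be approximated by a width-2 abp. Even better, we show that every polynomial can be approximated by a width-2 abp of size polynomial in the formula size and with error degree polynomial in the formula size. This is the main result of the current chapter.

Theorem 7.8. $\mathrm{VP}_e \subseteq \overline{\mathrm{VP}_2}^{\mathrm{poly}}$ when $\mathrm{char}(\mathbb{F}) \neq 2$.

Proof. For a polynomial $h$ define the matrix $M(h) = \begin{pmatrix} 1 & 0 \\ h & 1 \end{pmatrix}$. We call the following matrices primitives:

• $M(h)$ with $h$ any variable or constant in $\mathbb{F}$,

• $\begin{pmatrix} \frac{1}{2\varepsilon} & 0 \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} -2\varepsilon & 0 \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} -1 & \varepsilon \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} -1 & -\varepsilon \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$.

The entries of the primitives are variables or constants in the base field $\mathbb{F}(\varepsilon)$, making them suitable to use in a width-2 abp over the base field $\mathbb{F}(\varepsilon)$.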

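Let $(f_n) \in \mathrm{VP}_e$, so $f_n(x)$ can be computed by a formula of size $s(n) \in \mathrm{poly}(n)$. By Brent's depth-reduction theorem for formulas (Lemma 7.4), $f_n$ can be computed by a formula of size $\mathrm{poly}(n)$ and depth $d(n) \in O(\log s(n))$.

We will construct a sequence of primitives $A_1, \ldots, A_{m(n)}$ such that

\[
A_1 \cdots A_{m(n)} = \begin{pmatrix} 1 & 0 \\ f_n & 1 \end{pmatrix}
+ \varepsilon \begin{pmatrix} f_{n,1,1,1} & f_{n,1,1,2} \\ f_{n,1,2,1} & f_{n,1,2,2} \end{pmatrix}
+ \varepsilon^2 \begin{pmatrix} f_{n,2,1,1} & f_{n,2,1,2} \\ f_{n,2,2,1} & f_{n,2,2,2} \end{pmatrix}
+ \cdots
+ \varepsilon^{e} \begin{pmatrix} f_{n,e,1,1} & f_{n,e,1,2} \\ f_{n,e,2,1} & f_{n,e,2,2} \end{pmatrix}
\]

for some $f_{n,i,j,k} \in \mathbb{F}[x]$, with $m(n), e(n) \in O(8^{d(n)}) = \mathrm{poly}(n)$. Then

\[
\begin{pmatrix} 1 & 1 \end{pmatrix} \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} A_1 \cdots A_{m(n)} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = f_n(x) + O(\varepsilon),
\]

so $f_n(x)$ can be approximated by a width-2 abp of length $\mathrm{poly}(n)$ and with error degree $\mathrm{poly}(n)$, proving the theorem.

We begin with the construction. Let $h$ be a polynomial and consider a formula computing $h$ of depth $d$. The goal is to construct, recursively on the tree structure of the formula, a sequence of primitives $A_1, \ldots, A_m$ such that for some $h_{i,j,k} \in \mathbb{F}[x]$

\[
A_1 \cdots A_m = \begin{pmatrix} 1 & 0 \\ h & 1 \end{pmatrix}
+ \varepsilon \begin{pmatrix} 0 & 0 \\ h_{1,2,1} & 0 \end{pmatrix}
+ \varepsilon^2 \begin{pmatrix} h_{2,1,1} & h_{2,1,2} \\ h_{2,2,1} & h_{2,2,2} \end{pmatrix}
+ \cdots
+ \varepsilon^{e} \begin{pmatrix} h_{e,1,1} & h_{e,1,2} \\ h_{e,2,1} & h_{e,2,2} \end{pmatrix}
\tag{7.3}
\]

with $m, e \in O(8^d)$. Notice the particular first-degree error pattern in (7.3), which our recursion will rely on.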


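Suppose $h$ is a variable or a constant. Then $M(h)$ is itself a primitive satisfying (7.3).

Suppose $h = f + g$ is a sum of two polynomials $f, g$ and suppose that

\[
F = \begin{pmatrix} 1 & 0 \\ f & 1 \end{pmatrix} + \varepsilon \begin{pmatrix} 0 & 0 \\ f' & 0 \end{pmatrix} + O(\varepsilon^2) \tag{7.4}
\]

\[
G = \begin{pmatrix} 1 & 0 \\ g & 1 \end{pmatrix} + \varepsilon \begin{pmatrix} 0 & 0 \\ g' & 0 \end{pmatrix} + O(\varepsilon^2) \tag{7.5}
\]

are products of primitives for some $f', g' \in \mathbb{F}[x]$. Then

\[
G \cdot F = \begin{pmatrix} 1 & 0 \\ f + g & 1 \end{pmatrix} + \varepsilon \begin{pmatrix} 0 & 0 \\ f' + g' & 0 \end{pmatrix} + O(\varepsilon^2)
\]

is a product of primitives satisfying (7.3).

Suppose $h = fg$ is a product of two polynomials and suppose that $F$ and $G$ are of the form (7.4) and (7.5) and are products of primitives. We will construct $M((f+g)^2)$, $M(-f^2)$, $M(-g^2)$ approximately, in such a way that, when we use the identity $(f+g)^2 - f^2 - g^2 = 2fg$, the error terms cancel properly. Define the expressions $\mathrm{sq}_+(A)$ and $\mathrm{sq}_-(A)$ by

\[
\mathrm{sq}_\pm(A) = \begin{pmatrix} -\varepsilon & 0 \\ 0 & 1 \end{pmatrix} \cdot A \cdot \begin{pmatrix} -1 & \pm\varepsilon \\ 0 & 1 \end{pmatrix} \cdot A \cdot \begin{pmatrix} \frac{1}{\varepsilon} & 0 \\ 0 & 1 \end{pmatrix}.
\]

Then

\[
\mathrm{sq}_\pm(F) = \begin{pmatrix} 1 \mp \varepsilon f & 0 \\ \pm f^2 + O(\varepsilon) & 1 \pm \varepsilon f \end{pmatrix} + O(\varepsilon^2).
\]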

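We have

\[
\mathrm{sq}_-(G) \cdot \mathrm{sq}_-(F) \cdot \mathrm{sq}_+(G \cdot F)
= \begin{pmatrix} 1 + \varepsilon g & 0 \\ -g^2 + O(\varepsilon) & 1 - \varepsilon g \end{pmatrix}
\cdot \begin{pmatrix} 1 + \varepsilon f & 0 \\ -f^2 + O(\varepsilon) & 1 - \varepsilon f \end{pmatrix}
\cdot \begin{pmatrix} 1 - \varepsilon (f+g) & 0 \\ (f+g)^2 + O(\varepsilon) & 1 + \varepsilon (f+g) \end{pmatrix}
+ O(\varepsilon^2),
\]

which simplifies to

\[
\mathrm{sq}_-(G) \cdot \mathrm{sq}_-(F) \cdot \mathrm{sq}_+(G \cdot F) = \begin{pmatrix} 1 & 0 \\ 2fg + O(\varepsilon) & 1 \end{pmatrix} + O(\varepsilon^2).
\]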


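We conclude

\[
\begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix} \cdot \mathrm{sq}_-(G) \cdot \mathrm{sq}_-(F) \cdot \mathrm{sq}_+(G \cdot F) \cdot \begin{pmatrix} \frac{1}{2} & 0 \\ 0 & 1 \end{pmatrix}
\]
\[
= \begin{pmatrix} -2\varepsilon & 0 \\ 0 & 1 \end{pmatrix} \cdot G \cdot \begin{pmatrix} -1 & -\varepsilon \\ 0 & 1 \end{pmatrix} \cdot G \cdot \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \cdot F \cdot \begin{pmatrix} -1 & -\varepsilon \\ 0 & 1 \end{pmatrix} \cdot F \cdot \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \cdot G \cdot F \cdot \begin{pmatrix} -1 & \varepsilon \\ 0 & 1 \end{pmatrix} \cdot G \cdot F \cdot \begin{pmatrix} \frac{1}{2\varepsilon} & 0 \\ 0 & 1 \end{pmatrix}
\]
\[
= \begin{pmatrix} 1 & 0 \\ fg + O(\varepsilon) & 1 \end{pmatrix} + O(\varepsilon^2).
\]

This completes the construction.

The length $m$ of the construction is $m(h) = 1$ for $h$ a variable or constant, and recursively $m(f + g) = m(f) + m(g)$ and $m(f \cdot g) = 4(m(f) + m(g)) + 7$. We conclude $m \in O(8^d)$. The error degree $e$ of the construction satisfies the same recursion, so $e \in O(8^d)$.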
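The multiplication step can be sanity-checked with exact rational arithmetic by substituting a small rational number for $\varepsilon$. The sketch below is only an illustration under simplifying assumptions: it takes $F = M(f)$ and $G = M(g)$ exactly (no error terms), uses numbers in place of the polynomials $f$ and $g$, and picks its own helper names (`approx_product`, `shear`, `diag`). It multiplies out the thirteen factors from the displayed identity and checks that the result is $M(fg)$ up to an $O(\varepsilon)$ error in the lower-left entry and $O(\varepsilon^2)$ errors elsewhere.

```python
from fractions import Fraction
from functools import reduce


def mat_mul(a, b):
    """Multiply two 2 x 2 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def product(mats):
    """Left-to-right product of a list of matrices."""
    return reduce(mat_mul, mats)


def M(h):
    """The 2 x 2 matrix M(h) from the proof of Theorem 7.8."""
    return [[Fraction(1), Fraction(0)], [Fraction(h), Fraction(1)]]


def diag(a):
    """The diagonal primitive diag(a, 1)."""
    return [[Fraction(a), Fraction(0)], [Fraction(0), Fraction(1)]]


def shear(sign, eps):
    """The primitive with rows (-1, sign * eps) and (0, 1)."""
    return [[Fraction(-1), sign * eps], [Fraction(0), Fraction(1)]]


def approx_product(F, G, eps):
    """The 13-factor product that approximates M(f * g)."""
    GF = mat_mul(G, F)
    return product([
        diag(-2 * eps), G, shear(-1, eps), G,
        diag(-1), F, shear(-1, eps), F,
        diag(-1), GF, shear(+1, eps), GF,
        diag(1 / (2 * eps)),
    ])


eps = Fraction(1, 10**6)
R = approx_product(M(3), M(5), eps)

# Lower-left entry is f*g up to an error of order eps ...
assert abs(R[1][0] - 15) < 1000 * eps
# ... and all other entries match the identity pattern up to order eps^2.
assert abs(R[0][0] - 1) < 1000 * eps**2
assert abs(R[1][1] - 1) < 1000 * eps**2
assert abs(R[0][1]) < 1000 * eps**2
```

The thirteen factors consist of four copies each of $F$ and $G$ plus seven further primitives, matching the recursion $m(f \cdot g) = 4(m(f) + m(g)) + 7$.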

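Remark 7.9. The construction in the above proof of Theorem 7.8 is different from the construction in our paper [BIZ17]. The recursion in the above proof is simpler, while the construction in [BIZ17] has a better error degree and has a special form which relates it to a family of polynomials called continuants.

Corollary 7.10. $\overline{\mathrm{VP}_2} = \overline{\mathrm{VP}_e}$ and $\overline{\mathrm{VP}_2}^{\mathrm{poly}} = \overline{\mathrm{VP}_e}^{\mathrm{poly}}$ when $\mathrm{char}(\mathbb{F}) \neq 2$.

Proof. We have $\mathrm{VP}_2 \subseteq \mathrm{VP}_e$ by Lemma 7.2. Taking closures on both sides, we obtain $\overline{\mathrm{VP}_2} \subseteq \overline{\mathrm{VP}_e}$ and $\overline{\mathrm{VP}_2}^{\mathrm{poly}} \subseteq \overline{\mathrm{VP}_e}^{\mathrm{poly}}$. When $\mathrm{char}(\mathbb{F}) \neq 2$, $\mathrm{VP}_e \subseteq \overline{\mathrm{VP}_2}^{\mathrm{poly}}$ (Theorem 7.8). By taking closures, it follows that $\overline{\mathrm{VP}_e} \subseteq \overline{\mathrm{VP}_2}$ and $\overline{\mathrm{VP}_e}^{\mathrm{poly}} \subseteq \overline{\mathrm{VP}_2}^{\mathrm{poly}}$.

Corollary 7.11. $\overline{\mathrm{VP}_2}^{\mathrm{poly}} = \mathrm{VP}_e$ when $\mathrm{char}(\mathbb{F}) \neq 2$ and $\mathbb{F}$ is infinite.

Proof. By Corollary 7.10, $\overline{\mathrm{VP}_2}^{\mathrm{poly}} = \overline{\mathrm{VP}_e}^{\mathrm{poly}}$. We prove $\overline{\mathrm{VP}_e}^{\mathrm{poly}} = \mathrm{VP}_e$ in Lemma 7.12 below.

Lemma 7.12. $\overline{\mathrm{VP}_e}^{\mathrm{poly}} = \mathrm{VP}_e$ when $\mathrm{char}(\mathbb{F}) \neq 2$ and $\mathbb{F}$ is infinite.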

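Proof. The inclusion $\mathrm{VP}_e \subseteq \overline{\mathrm{VP}_e}^{\mathrm{poly}}$ is trivially true. We prove the other direction. Let $(f_n) \in \overline{\mathrm{VP}_e}^{\mathrm{poly}}$. Then there are polynomials $f_{n,i}(x) \in \mathbb{F}[x]$ and $e(n) \in \mathrm{poly}(n)$ such that

\[ f_n(x) + \varepsilon f_{n,1}(x) + \varepsilon^2 f_{n,2}(x) + \cdots + \varepsilon^{e(n)} f_{n,e(n)}(x) \]

is computed by a poly-size formula $\Gamma$ over $\mathbb{F}(\varepsilon)$. Let $\alpha_0, \alpha_1, \ldots, \alpha_{e(n)}$ be distinct elements in $\mathbb{F}$ such that replacing $\varepsilon$ by $\alpha_j$ in $\Gamma$ is a valid substitution, i.e. not causing division by zero. These $\alpha_j$ exist since our field is infinite by assumption. View

\[ g_n(\varepsilon) = f_n(x) + \varepsilon f_{n,1}(x) + \varepsilon^2 f_{n,2}(x) + \cdots + \varepsilon^{e(n)} f_{n,e(n)}(x) \]

as a polynomial in $\varepsilon$. The polynomial $g_n(\varepsilon)$ has degree at most $e(n)$, so we can write $g_n(\varepsilon)$ as follows (Lagrange interpolation on $e(n) + 1$ points):

\[
g_n(\varepsilon) = \sum_{j=0}^{e(n)} g_n(\alpha_j) \prod_{\substack{0 \leq m \leq e(n) \\ m \neq j}} \frac{\varepsilon - \alpha_m}{\alpha_j - \alpha_m}. \tag{7.6}
\]

Clearly, $f_n(x) = g_n(0)$. However, replacing $\varepsilon$ by 0 in $\Gamma$ is not a valid substitution in general. From (7.6) we see directly how to write $g_n(0)$ as a linear combination of the values $g_n(\alpha_j)$, namely

\[
g_n(0) = \sum_{j=0}^{e(n)} g_n(\alpha_j) \prod_{\substack{0 \leq m \leq e(n) \\ m \neq j}} \frac{-\alpha_m}{\alpha_j - \alpha_m},
\]

that is,

\[
g_n(0) = \sum_{j=0}^{e(n)} \beta_j\, g_n(\alpha_j) \quad \text{with} \quad \beta_j = \prod_{\substack{0 \leq m \leq e(n) \\ m \neq j}} \frac{\alpha_m}{\alpha_m - \alpha_j}.
\]

The value $g_n(\alpha_j)$ is computed by the formula $\Gamma$ with $\varepsilon$ replaced by $\alpha_j$, which we denote by $\Gamma|_{\varepsilon = \alpha_j}$. Thus $f_n(x)$ is computed by the poly-size formula $\sum_{j=0}^{e(n)} \beta_j\, \Gamma|_{\varepsilon = \alpha_j}$. We conclude $(f_n) \in \mathrm{VP}_e$.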
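The interpolation step can be illustrated on a toy case. The sketch below assumes, purely for illustration, the expression $\bigl((1 + \varepsilon x)^2 - 1\bigr)/(2\varepsilon) = x + \tfrac{1}{2}\varepsilon x^2$, which is a polynomial in $\varepsilon$ of degree 1 with constant term $x$ but cannot be evaluated at $\varepsilon = 0$ as written. The function names are hypothetical, and rational numbers stand in for the infinite field $\mathbb{F}$.

```python
from fractions import Fraction


def interpolation_weights(alphas):
    """beta_j = prod over m != j of alpha_m / (alpha_m - alpha_j)."""
    betas = []
    for j, a_j in enumerate(alphas):
        beta = Fraction(1)
        for m, a_m in enumerate(alphas):
            if m != j:
                beta *= Fraction(a_m) / (Fraction(a_m) - Fraction(a_j))
        betas.append(beta)
    return betas


def value_at_zero(g, alphas):
    """Recover g(0) from the values g(alpha_j) at distinct nonzero points."""
    betas = interpolation_weights(alphas)
    return sum(beta * g(Fraction(a)) for beta, a in zip(betas, alphas))


def toy_formula(x):
    """Return eps -> ((1 + eps*x)^2 - 1) / (2*eps), undefined at eps = 0."""
    return lambda eps: ((1 + eps * x) ** 2 - 1) / (2 * eps)


g = toy_formula(Fraction(4))
# Degree 1 in eps, so two interpolation points suffice; more also work.
assert value_at_zero(g, [1, 2]) == 4
assert value_at_zero(g, [1, 2, 3]) == 4
```

With the points $\alpha = (1, 2)$ the weights are $\beta = (2, -1)$, so the recovered value is $2g(1) - g(2)$.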

Remark 7.13. The statement of Lemma 7.12 also holds with $\mathrm{VP}_e$ replaced with $\mathrm{VP}_s$ or with $\mathrm{VP}$, by a similar proof.

7.4 Nondeterminism closure of VP_1

Recall the definition of $\mathrm{VNP}_x = N(\mathrm{VP}_x)$ from Definition 7.7. Valiant proved the following characterisation of $\mathrm{VNP}$ in his seminal work [Val80]. See also [BCS97, Thm. 21.26], [Bur00, Thm. 2.13] and [MP08, Thm. 2].

Theorem 7.14 (Valiant [Val80]). $\mathrm{VNP}_e = \mathrm{VNP}$.

We strengthen Valiant's characterisation of $\mathrm{VNP}$ from $\mathrm{VNP}_e$ to $\mathrm{VNP}_1$.

Theorem 7.15. $\mathrm{VNP}_1 = \mathrm{VNP}$ when $\mathrm{char}(\mathbb{F}) \neq 2$.


The idea of the proof is “to simulate in $\mathrm{VNP}_1$” the primitives that we used in the proof of $\mathrm{VP}_e \subseteq \mathrm{VP}_3$ (Theorem 7.3).

Proof of Theorem 7.15. Clearly, $\mathrm{VNP}_1 \subseteq \mathrm{VNP}$ by Lemma 7.2 and taking the nondeterminism closure $N$. We will prove that $\mathrm{VNP} \subseteq \mathrm{VNP}_1$. Recall that in the proof of $\mathrm{VP}_e \subseteq \mathrm{VP}_3$ (Theorem 7.3) we defined for any polynomial $h$ the matrix

\[
M(h) = \begin{pmatrix} 1 & 0 & 0 \\ h & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\]

and we called the following matrices primitives:

• $M(h)$ with $h$ any variable or any constant in $\mathbb{F}$,

• the $3 \times 3$ permutation matrices, denoted by $M_\pi$ for $\pi \in S_3$,

• the diagonal matrices $M_{a,b,c} = \mathrm{diag}(a, b, c)$ with $a, b, c \in \mathbb{F}$.

In the proof of $\mathrm{VP}_e \subseteq \mathrm{VP}_3$ we constructed, for any family $(f_n) \in \mathrm{VP}_e$, a sequence of primitive matrices $A_{n,1}, \ldots, A_{n,t(n)}$ with $t(n) \in \mathrm{poly}(n)$ such that

\[
f_n(x) = \begin{pmatrix} 1 & 1 & 1 \end{pmatrix} M_{-1,1,0}\, A_{n,1} \cdots A_{n,t(n)} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}. \tag{7.7}
\]

We will show $\mathrm{VP}_e \subseteq \mathrm{VNP}_1$ by constructing a hypercube sum over a width-1 abp that evaluates the right-hand side of (7.7). This implies $\mathrm{VNP}_e \subseteq \mathrm{VNP}_1$ by taking the $N$-closure. Then, by Valiant's Theorem 7.14, $\mathrm{VNP} \subseteq \mathrm{VNP}_1$.

Let $f(x)$ be a polynomial and let $A_1, \ldots, A_k$ be primitive matrices such that $f(x)$ is computed as

\[
f(x) = \begin{pmatrix} 1 & 1 & 1 \end{pmatrix} A_k \cdots A_1 \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}.
\]

View this expression as a width-3 abp $G$, with vertex layers labelled as shown in the left-hand diagram in Fig. 7.1. Assume for simplicity that all edges between layers are present, possibly with label 0. The sum of the values of every $s$–$t$ path in $G$ equals $f(x)$:

\[
f(x) = \sum_{j \in [3]^k} A_k[j_k, j_{k-1}] \cdots A_1[j_2, j_1]. \tag{7.8}
\]

We introduce some hypercube variables. To every vertex of $G$, except $s$ and $t$, we associate a bit; the bits in the $i$th layer we call $b_1[i], b_2[i], b_3[i]$. To an $s$–$t$ path in $G$ we associate an assignment of the $b_j[i]$ by setting the bits of vertices visited by the path to 1 and the others to 0. For example, in the right-hand diagram in Fig. 7.1 we show an $s$–$t$ path with the corresponding assignment of the bits $b_j[i]$. The assignments of the $b_j[i]$ corresponding to $s$–$t$ paths are precisely the assignments such that for every $i \in [k]$ exactly one of $b_1[i], b_2[i], b_3[i]$ equals 1. Let

[Figure 7.1: Illustration of the layer labelling and the path labelling used in the proof of Theorem 7.15. Left: the layers $0, 1, 2, \ldots, k-1, k$ between $s$ and $t$, with $A_1, A_2, \ldots, A_k$ between consecutive layers. Right: an $s$–$t$ path with the bit assignments $(1,0,0)$, $(0,1,0)$, $(0,1,0)$, $(0,0,1)$, $(0,1,0)$ in successive layers.]

\[
V(b_1, b_2, b_3) = \prod_{i \in [k]} \bigl( b_1[i] + b_2[i] + b_3[i] \bigr) \prod_{\substack{s, t \in [3] \\ s \neq t}} \bigl( 1 - b_s[i]\, b_t[i] \bigr). \tag{7.9}
\]

Then the assignments of the $b_j[i]$ corresponding to $s$–$t$ paths are precisely the assignments such that $V(b_1, b_2, b_3) = 1$. Otherwise, $V(b_1, b_2, b_3) = 0$.

We will write $f(x)$ as a hypercube sum by replacing each $A_i[j_i, j_{i-1}]$ in (7.8) by a product of affine linear forms $S_i(A_i)$ with variables $b$ and $x$:

\[
\sum_{b} V(b_1, b_2, b_3)\, S_k(A_k) \cdots S_1(A_1).
\]

Define the expression $\mathrm{Eq}(\alpha, \beta) = (1 - \alpha - \beta)(1 - \alpha - \beta)$ for $\alpha, \beta \in \{0, 1\}$. The expression $\mathrm{Eq}(\alpha, \beta)$ evaluates to 1 if $\alpha$ equals $\beta$ and evaluates to 0 otherwise.

• For any variable or constant $x$, define

\[
S_i(M(x)) = \bigl( 1 + (x - 1)(b_1[i] - b_1[i-1]) \bigr) \cdot \bigl( 1 - (1 - b_2[i])\, b_2[i-1] \bigr) \cdot \mathrm{Eq}\bigl( b_3[i-1], b_3[i] \bigr).
\]

• For any permutation $\pi \in S_3$, define

\[
S_i(M_\pi) = \mathrm{Eq}\bigl( b_1[i-1], b_{\pi(1)}[i] \bigr) \cdot \mathrm{Eq}\bigl( b_2[i-1], b_{\pi(2)}[i] \bigr) \cdot \mathrm{Eq}\bigl( b_3[i-1], b_{\pi(3)}[i] \bigr).
\]

• For any constants $a, b, c \in \mathbb{F}$, define

\[
S_i(M_{a,b,c}) = \bigl( a \cdot b_1[i-1] + b \cdot b_2[i-1] + c \cdot b_3[i-1] \bigr) \cdot \mathrm{Eq}\bigl( b_1[i-1], b_1[i] \bigr) \cdot \mathrm{Eq}\bigl( b_2[i-1], b_2[i] \bigr) \cdot \mathrm{Eq}\bigl( b_3[i-1], b_3[i] \bigr).
\]

One verifies that

\[
f(x) = \sum_{b} V(b_1, b_2, b_3)\, S_k(A_k) \cdots S_1(A_1).
\]

Some of the factors in the expressions for the $S_i(A_i)$ are not affine linear. As a final step, we apply the equality $1 + xy = \frac{1}{2} \sum_{c \in \{0,1\}} (x + 1 - 2c)(y + 1 - 2c)$ to write these factors as products of affine linear forms, introducing new hypercube variables.
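The elementary building blocks of this hypercube sum are easy to check by brute force. The sketch below uses illustrative function names and works over the rationals (where division by 2 is available, in line with the assumption $\mathrm{char}(\mathbb{F}) \neq 2$). It verifies that $\mathrm{Eq}$ is the equality indicator on bits, that the single-layer factor of $V$ in (7.9) equals 1 exactly when one bit of the layer is set, and that the identity for $1 + xy$ holds at sample points.

```python
from fractions import Fraction
from itertools import product


def eq(a, b):
    """Eq(a, b) = (1 - a - b)(1 - a - b); equals 1 iff the bits a, b agree."""
    return (1 - a - b) * (1 - a - b)


def layer_factor(b1, b2, b3):
    """The factor of V in (7.9) belonging to a single layer."""
    bits = (b1, b2, b3)
    value = b1 + b2 + b3
    for s in range(3):
        for t in range(3):
            if s != t:
                value *= 1 - bits[s] * bits[t]
    return value


def one_plus_xy(x, y):
    """Right-hand side of 1 + xy = (1/2) * sum over c of (x+1-2c)(y+1-2c)."""
    return Fraction(1, 2) * sum((x + 1 - 2 * c) * (y + 1 - 2 * c)
                                for c in (0, 1))


for a, b in product((0, 1), repeat=2):
    assert eq(a, b) == (1 if a == b else 0)

for bits in product((0, 1), repeat=3):
    assert layer_factor(*bits) == (1 if sum(bits) == 1 else 0)

for x, y in product(range(-3, 4), repeat=2):
    assert one_plus_xy(x, y) == 1 + x * y
```

Each application of the last identity trades one non-linear factor $1 + xy$ for a product of two affine linear forms at the cost of one extra hypercube variable $c$.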

7.5 Conclusion

We finish with an overview of inclusions, equalities and separations among the classes $\mathrm{VP}_k$, $\mathrm{VP}_e$, $\mathrm{VP}$ and their approximation and nondeterminism closures (when $\mathrm{char}(\mathbb{F}) \neq 2$); see Fig. 7.2. The figure relies on the following two simple lemmas, proofs of which can be found in our paper [BIZ17].

Lemma 7.16 ([BIZ17, Prop. 5.10]). $\overline{\mathrm{VP}_1} = \mathrm{VP}_1$.

Lemma 7.17 ([BIZ17, Prop. 5.11]). $\mathrm{VP}_1 \subsetneq \mathrm{VNP}_1$ when $\mathrm{char}(\mathbb{F}) \neq 2$.

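[Figure 7.2: Overview of relations among the algebraic complexity classes $\mathrm{VP}_k$, $\mathrm{VP}_e$, $\mathrm{VP}$ and their approximation and nondeterminism closures (when $\mathrm{char}(\mathbb{F})$ is not 2). The diagram has three rows: the classes $\mathrm{VP}_1$, $\mathrm{VP}_2$, $\mathrm{VP}_e$, $\mathrm{VP}$; their approximation closures; and $\mathrm{VNP}_1$, $\mathrm{VNP}_2$, $\mathrm{VNP}_e$, $\mathrm{VNP}$. Relations are labelled with the references [AW16], [Val79], [Val80], Corollary 7.10, Theorem 7.15, Lemma 7.16 and Lemma 7.17. The relations without reference are either by definition or follow logically from the other relations.]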

Bibliography

[AJRS13] Elizabeth S Allman Peter D Jarvis John A Rhodes andJeremy G Sumner Tensor rank invariants inequalities andapplications SIAM J Matrix Anal Appl 34(3)1014ndash1045 2013doi101137120899066 p 14

[Alo98] Noga Alon The Shannon capacity of a union Combinatorica18(3)301ndash310 1998 doi101007PL00009824 p 37

[ASU13] Noga Alon Amir Shpilka and Christopher Umans On sunflowersand matrix multiplication Comput Complexity 22(2)219ndash243Jun 2013 doi101007s00037-013-0060-1 p 48

[AW16] Eric Allender and Fengming Wang. On the power of algebraic branching programs of width two. Comput. Complexity, 25(1):217–253, 2016. doi:10.1007/s00037-015-0114-7. p. 17, 109, 114, 123.

[AZ14] Martin Aigner and Gunter M Ziegler Proofs from The BookSpringer-Verlag Berlin fifth edition 2014doi101007978-3-662-44205-0 p 71

[BC18] Boris Bukh and Christopher Cox On a fractional version ofHaemersrsquo bound arXiv 2018 arXiv180200476 p 41 42

[BCC+17] Jonah Blasiak Thomas Church Henry Cohn Joshua A GrochowEric Naslund William F Sawin and Chris Umans On cap setsand the group-theoretic approach to matrix multiplication DiscreteAnal 2017 arXiv160506702 doi1019086da1245 p 4883 84 104


[BCPZ16] Harry Buhrman Matthias Christandl Christopher Perry andJeroen Zuiddam Clean quantum and classical communicationprotocols Phys Rev Lett 117230503 Dec 2016doi101103PhysRevLett117230503 p 1

[BCRL79] Dario Bini, Milvio Capovani, Francesco Romani, and Grazia Lotti. $O(n^{2.7799})$ complexity for $n \times n$ approximate matrix multiplication. Inf. Process. Lett., 8(5):234–235, 1979. doi:10.1016/0020-0190(79)90113-3. p. 3, 110.

[BCS97] Peter Bürgisser, Michael Clausen, and M. Amin Shokrollahi. Algebraic complexity theory, volume 315 of Grundlehren Math. Wiss. Springer-Verlag, Berlin, 1997. doi:10.1007/978-3-662-03338-8. p. 4, 6, 48, 50, 66, 79, 119.

[BCSX10] Arnab Bhattacharyya Victor Chen Madhu Sudan and Ning XieTesting Linear-Invariant Non-linear Properties A Short Reportpages 260ndash268 Springer Berlin Heidelberg Berlin Heidelberg2010 doi101007978-3-642-16367-8_18 p 48

[BCZ17a] Markus Blaser Matthias Christandl and Jeroen Zuiddam Theborder support rank of two-by-two matrix multiplication is sevenarXiv 2017 arXiv170509652 p 1 15

[BCZ17b] Harry Buhrman Matthias Christandl and Jeroen ZuiddamNondeterministic Quantum Communication Complexity the CyclicEquality Game and Iterated Matrix Multiplication In Christos HPapadimitriou editor 8th Innovations in Theoretical ComputerScience Conference (ITCS 2017) pages 241ndash2418 2017arXiv160303757 doi104230LIPIcsITCS201724 p 115

[Ber84] Stuart J Berkowitz On computing the determinant in smallparallel time using a small number of processors Inform ProcessLett 18(3)147ndash150 1984 doi1010160020-0190(84)90018-8p 108

[BI13] Peter Burgisser and Christian Ikenmeyer Explicit lower bounds viageometric complexity theory Proceedings 45th Annual ACMSymposium on Theory of Computing 2013 pages 141ndash150 2013doi10114524886082488627 p 108

[Bin80] Dario Bini Relations between exact and approximate bilinearalgorithms Applications Calcolo 17(1)87ndash97 1980doi101007BF02575865 p 3

[BIZ17] Karl Bringmann, Christian Ikenmeyer, and Jeroen Zuiddam. On Algebraic Branching Programs of Small Width. In Ryan O'Donnell, editor, 32nd Computational Complexity Conference (CCC 2017), pages 20:1–20:31, 2017. doi:10.4230/LIPIcs.CCC.2017.20. p. 1, 107, 111, 112, 118, 122.

[Bla13] Anna Blasiak A graph-theoretic approach to network coding PhDthesis Cornell University 2013 URL httpsecommonscornelledubitstreamhandle181334147ab675pdf p 42

[BLMW11] Peter Burgisser Joseph M Landsberg Laurent Manivel and JerzyWeyman An overview of mathematical issues arising in thegeometric complexity theory approach to VP 6= VNP SIAM JComput 40(4)1179ndash1209 2011 doi101137090765328 p 108

[BOC92] Michael Ben-Or and Richard Cleve. Computing algebraic formulas using a constant number of registers. SIAM J. Comput., 21(1):54–58, 1992. doi:10.1137/0221006. p. 17, 109, 112.

[BPR+00] Charles H Bennett Sandu Popescu Daniel Rohrlich John ASmolin and Ashish V Thapliyal Exact and asymptotic measuresof multipartite pure-state entanglement Phys Rev A63(1)012307 2000 doi101103PhysRevA63012307 p 48

[Bre74] Richard P. Brent. The parallel evaluation of general arithmetic expressions. J. ACM, 21(2):201–206, April 1974. doi:10.1145/321812.321815. p. 112.

[Bri87] Michel Brion Sur lrsquoimage de lrsquoapplication moment In Seminairedrsquoalgebre Paul Dubreil et Marie-Paule Malliavin (Paris 1986)volume 1296 of Lecture Notes in Math pages 177ndash192 SpringerBerlin 1987 doi101007BFb0078526 p 9 93 94

[BS83] Eberhard Becker and Niels and Schwartz Zum Darstellungssatzvon Kadison-Dubois Arch Math (Basel) 40(5)421ndash428 1983doi101007BF01192806 p 7 12 33

[Bur90] Peter Burgisser Degenerationsordnung und Tragerfunktionalbilinearer Abbildungen PhD thesis Universitat Konstanz 1990httpnbn-resolvingdeurnnbndebsz352-opus-20311p 57 101

[Bur00] Peter Bürgisser. Completeness and reduction in algebraic complexity theory, volume 7 of Algorithms and Computation in Mathematics. Springer-Verlag, Berlin, 2000. doi:10.1007/978-3-662-04179-6. p. 119.

[Bur04] Peter Bürgisser. The complexity of factors of multivariate polynomials. Found. Comput. Math., 4(4):369–396, 2004. doi:10.1007/s10208-002-0059-5. p. 110, 115.

[BX15] Arnab Bhattacharyya and Ning Xie Lower bounds for testingtriangle-freeness in boolean functions Comput Complexity24(1)65ndash101 2015 doi101007s00037-014-0092-1 p 48

[BZ17] Jop Briet and Jeroen Zuiddam On the orthogonal rank of Cayleygraphs and impossibility of quantum round elimination QuantumInf Comput 17(1amp2) 2017 URL httpwwwrintonpresscomxxqic17qic-17-120106-0116pdfarXiv160806113 p 2

[CHM07] Matthias Christandl Aram W Harrow and Graeme MitchisonNonzero Kronecker coefficients and what they tell us about spectraComm Math Phys 270(3)575ndash585 2007doi101007s00220-006-0157-3 p 90

[CJZ18] Matthias Christandl Asger Kjaeligrulff Jensen and Jeroen ZuiddamTensor rank is not multiplicative under the tensor product LinearAlgebra Appl 543125ndash139 2018doi101016jlaa201712020 p 2 15

[CKSV16] Suryajith Chillara Mrinal Kumar Ramprasad Saptharishi andV Vinay The chasm at depth four and tensor rank Old resultsnew insights arXiv 2016 arXiv160604200 p 15

[CLP17] Ernie Croot Vsevolod F Lev and Peter Pal Pach Progression-freesets in Zn

4 are exponentially small Ann of Math (2)185(1)331ndash337 2017 doi104007annals201718517 p 4881

[CM06] Matthias Christandl and Graeme Mitchison The spectra ofquantum states and the Kronecker coefficients of the symmetricgroup Comm Math Phys 261(3)789ndash797 2006doi101007s00220-005-1435-1 p 91

[CMR+14] Toby Cubitt Laura Mancinska David E Roberson SimoneSeverini Dan Stahlke and Andreas Winter Bounds onentanglement-assisted source-channel coding via the Lovasz thetanumber and its variants IEEE Trans Inform Theory60(11)7330ndash7344 2014 arXiv13107120doi101109TIT20142349502 p 42


[CT12] Thomas M Cover and Joy A Thomas Elements of informationtheory John Wiley amp Sons 2012 p 60

[CU13] Henry Cohn and Christopher Umans Fast matrix multiplicationusing coherent configurations In Proceedings of the Twenty-FourthAnnual ACM-SIAM Symposium on Discrete Algorithms pages1074ndash1086 SIAM 2013 p 15

[CVZ16] Matthias Christandl Peter Vrana and Jeroen ZuiddamAsymptotic tensor rank of graph tensors beyond matrixmultiplication arXiv 2016 arXiv160907476 p 2 65 67 7985

[CVZ18] Matthias Christandl Peter Vrana and Jeroen Zuiddam Universalpoints in the asymptotic spectrum of tensors In Proceedings of 50thAnnual ACM SIGACT Symposium on the Theory of Computing(STOCrsquo18) ACM New York 2018 arXiv170907851doi10114531887453188766 p 2 47 65 87 88 96 103 105

[CW82] Don Coppersmith and Shmuel Winograd On the asymptoticcomplexity of matrix multiplication SIAM J Comput11(3)472ndash492 1982 doi1011370211038 p 3

[CW87] Don Coppersmith and Shmuel Winograd Matrix multiplication viaarithmetic progressions In Proceedings of the nineteenth annualACM symposium on Theory of computing pages 1ndash6 ACM 1987p 3

[CW90] Don Coppersmith and Shmuel Winograd Matrix multiplication viaarithmetic progressions J Symbolic Comput 9(3)251ndash280 1990doi101016S0747-7171(08)80013-2 p 4 6 8 10 48 67

[CZ18] Matthias Christandl and Jeroen Zuiddam Tensor surgery andtensor rank Comput Complexity Mar 2018doi101007s00037-018-0164-8 p 2 86

[Dra15] Jan Draisma Multilinear Algebra and Applications (lecture notes)2015 URL httpsmathsitesunibechjdraismapublicationsmlapplpdfp 15

[DVC00] Wolfgang Dur Guivre Vidal and Juan Ignacio Cirac Three qubitscan be entangled in two inequivalent ways Phys Rev A (3)62(6)062314 12 2000 doi101103PhysRevA62062314 p 48


[Ede04] Yves Edel Extensions of generalized product caps Des CodesCryptogr 31(1)5ndash14 2004 doi101023A1027365901231p 48 83

[EG17] Jordan S Ellenberg and Dion Gijswijt On large subsets of Fnq with

no three-term arithmetic progression Ann of Math (2)185(1)339ndash343 2017 doi104007annals201718518 p 1048 81 83 84

[FK14] Hu Fu and Robert Kleinberg Improved lower bounds for testingtriangle-freeness in boolean functions via fast matrix multiplicationIn Approximation Randomization and CombinatorialOptimization Algorithms and Techniques (APPROXRANDOM2014) pages 669ndash676 2014doi104230LIPIcsAPPROX-RANDOM2014669 p 48

[For16] Michael Forbes. Some concrete questions on the border complexity of polynomials. Presentation given at the Workshop on Algebraic Complexity Theory (WACT 2016) in Tel Aviv, https://www.youtube.com/watch?v=1HMogQIHT6Q, 2016. p. 110.

[Fra02] Matthias Franz Moment polytopes of projective G-varieties andtensor products of symmetric group representations J Lie Theory12(2)539ndash549 2002 URLhttpemisamsorgjournalsJLTvol12_no216htmlp 93 94

[Fri17] Tobias Fritz Resource convertibility and ordered commutativemonoids Math Structures Comput Sci 27(6)850ndash938 2017doi101017S0960129515000444 p 37

[Ful97] William Fulton Young tableaux volume 35 of LondonMathematical Society Student Texts Cambridge University PressCambridge 1997 With applications to representation theory andgeometry p 88

[GKKS13] Ankit Gupta Pritish Kamath Neeraj Kayal and RamprasadSaptharishi Approaching the chasm at depth four In 2013 IEEEConference on Computational ComplexitymdashCCC 2013 pages 65ndash73IEEE Computer Soc Los Alamitos CA 2013doi101109CCC201316 p 108

[GMQ16] Joshua A. Grochow, Ketan D. Mulmuley, and Youming Qiao. Boundaries of VP and VNP. In Ioannis Chatzigiannakis, Michael Mitzenmacher, Yuval Rabani, and Davide Sangiorgi, editors, 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016), volume 55, pages 34:1–34:14, 2016. arXiv:1605.02815. doi:10.4230/LIPIcs.ICALP.2016.34. p. 110.

[Gro13] Joshua A Grochow Unifying and generalizing known lower boundsvia geometric complexity theory arXiv 2013 arXiv13046333p 108

[GW09] Roe Goodman and Nolan R Wallach Symmetry representationsand invariants volume 255 of Graduate Texts in MathematicsSpringer Dordrecht 2009 doi101007978-0-387-79852-3p 88

[Hae79] Willem Haemers On some problems of Lovasz concerning theShannon capacity of a graph IEEE Trans Inform Theory25(2)231ndash232 1979 doi101109TIT19791056027 p 37 4042

[Has90] Johan Hastad Tensor rank is NP-complete J Algorithms11(4)644ndash654 1990 doi1010160196-6774(90)90014-6 p 47

[HHHH09] Ryszard Horodecki Pawe l Horodecki Micha l Horodecki and KarolHorodecki Quantum entanglement Rev Modern Phys81(2)865ndash942 2009 doi101103RevModPhys81865 p 48

[HIL13] Jonathan D Hauenstein Christian Ikenmeyer and Joseph MLandsberg Equations for lower bounds on border rank ExpMath 22(4)372ndash383 2013 doi101080105864582013825892p 15 110

[Hum75] James E Humphreys Linear algebraic groups Springer-VerlagNew York-Heidelberg 1975 Graduate Texts in Mathematics No21 p 93

[HX17] Ishay Haviv and Ning Xie Sunflowers and testing triangle-freenessof functions Comput Complexity 26(2)497ndash530 Jun 2017doi101007s00037-016-0138-7 p 48

[Ike13] Christian Ikenmeyer Geometric complexity theory tensor rankand LittlewoodndashRichardson coefficients PhD thesis UniversitatPaderborn 2013 p 14

[Kar72] Richard M Karp Reducibility among combinatorial problems InComplexity of computer computations (Proc Sympos IBM ThomasJ Watson Res Center Yorktown Heights NY 1972) pages85ndash103 Plenum New York 1972 p 36

132 Bibliography

[Knu94] Donald E Knuth The sandwich theorem Electron J Combin 11994 URL httpwwwcombinatoricsorgVolume_1Abstractsv1i1a1htmlp 41

[Kra84] Hanspeter Kraft Geometrische Methoden in der InvariantentheorieSpringer 1984 doi101007978-3-663-10143-7 p 50 88 93

[KS08] Tali Kaufman and Madhu Sudan Algebraic property testing Therole of invariance In Proceedings of the Fortieth Annual ACMSymposium on Theory of Computing STOC rsquo08 pages 403ndash412New York NY USA 2008 ACMdoi10114513743761374434 p 48

[KSS16] Robert Kleinberg William F Sawin and David E Speyer Thegrowth rate of tri-colored sum-free sets arXiv 2016arXiv160700047 p 48 79 83

[Lan06] Joseph M Landsberg The border rank of the multiplication of2times 2 matrices is seven J Amer Math Soc 19(2)447ndash459 2006doi101090S0894-0347-05-00506-0 p 110

[LG14] Francois Le Gall Powers of tensors and fast matrix multiplicationIn ISSAC 2014mdashProceedings of the 39th International Symposiumon Symbolic and Algebraic Computation pages 296ndash303 ACM NewYork 2014 doi10114526086282608664 p 4 6 8 48 85

[Lic84] Thomas Lickteig A note on border rank Inf Process Lett18(3)173ndash178 1984 doi1010160020-0190(84)90023-1p 110

[LM16a] Joseph M Landsberg and Mateusz Micha lek A 2n2 minus log(n)minus 1lower bound for the border rank of matrix multiplication arXiv2016 arXiv160807486 p 110

[LM16b] Joseph M Landsberg and Mateusz Micha lek Abelian tensorsJ Math Pures Appl 2016 doi101016jmatpur201611004p 14

[LMR13] Joseph M Landsberg Laurent Manivel and Nicolas RessayreHypersurfaces with degenerate duals and the geometric complexitytheory program Comment Math Helv 88(2)469ndash484 2013doi104171CMH292 p 108

[LO15] Joseph M Landsberg and Giorgio Ottaviani New lower bounds forthe border rank of matrix multiplication Theory Comput

Bibliography 133

11285ndash298 2015 arXiv11126007doi104086toc2015v011a011 p 108 110

[Lov79] Laszlo Lovasz On the Shannon capacity of a graph IEEE TransInform Theory 25(1)1ndash7 1979 doi101109TIT19791055985p 13 35 41

[Mar08] Murray Marshall Positive polynomials and sums of squaresvolume 146 of Mathematical Surveys and Monographs AmericanMathematical Society Providence RI 2008doi101090surv146 p 34

[MP71] Robert J McEliece and Edward C Posner Hide and seek datastorage and entropy The Annals of Mathematical Statistics42(5)1706ndash1716 1971 doi101214aoms1177693169 p 41

[MP08] Guillaume Malod and Natacha Portier Characterizing Valiantrsquosalgebraic complexity classes J Complexity 24(1)16ndash38 2008doi101016jjco200609006 p 119

[MS01] Ketan D Mulmuley and Milind Sohoni Geometric complexitytheory I An approach to the P vs NP and related problemsSIAM J Comput 31(2)496ndash526 2001doi101137S009753970038715X p 14 108

[MS08] Ketan D Mulmuley and Milind Sohoni Geometric complexitytheory II Towards explicit obstructions for embeddings amongclass varieties SIAM J Comput 38(3)1175ndash1206 2008doi101137080718115 p 108

[Nes84] Linda Ness A stratification of the null cone via the moment mapAmer J Math 106(6)1281ndash1329 1984 With an appendix byDavid Mumford doi1023072374395 p 9 93 94

[Nis91] Noam Nisan Lower bounds for non-commutative computation InProceedings of the twenty-third annual ACM symposium on Theoryof computing pages 410ndash418 ACM 1991doi101145103418103462 p 110

[Nor16] Sergey Norin A distribution on triples with maximum entropymarginal arXiv 2016 arXiv160800243 p 83

[NW97] Noam Nisan and Avi Wigderson Lower bounds on arithmeticcircuits via partial derivatives Comput Complexity 6(3)217ndash234199697 doi101007BF01294256 p 108

134 Bibliography

[Pan78] Victor Ya Pan Strassenrsquos algorithm is not optimal Trilineartechnique of aggregating uniting and canceling for constructingfast algorithms for matrix operations In 19th Annual Symposiumon Foundations of Computer Science (Ann Arbor Mich 1978)pages 166ndash176 IEEE Long Beach Calif 1978 p 3

[Pan80] Victor Ya Pan New fast algorithms for matrix operations SIAMJ Comput 9(2)321ndash342 1980 doi1011370209027 p 3

[Pan81] Victor Ya Pan New combinations of methods for the accelerationof matrix multiplication Comput Math Appl 7(1)73ndash125 1981doi1010160898-1221(81)90009-2 p 3

[Pan84] Victor Ya Pan How to multiply matrices faster volume 179 ofLecture Notes in Computer Science Springer-Verlag Berlin 1984doi1010073-540-13866-8 p 3

[Pan18] Victor Ya Pan Fast feasible and unfeasible matrix multiplicationarXiv 2018 arXiv180404102 p 6

[PD01] Alexander Prestel and Charles N Delzell Positive polynomialsSpringer Monographs in Mathematics Springer-Verlag Berlin2001 From Hilbertrsquos 17th problem to real algebradoi101007978-3-662-04648-7 p 34

[Peb16] Luke Pebody Proof of a conjecture of Kleinberg-Sawin-SpeyerarXiv 2016 arXiv160805740 p 83

[PS98] George Polya and Gabor Szego Problems and theorems inanalysis I Classics in Mathematics Springer-Verlag Berlin 1998Series integral calculus theory of functions Translated from theGerman by Dorothee Aeppli Reprint of the 1978 Englishtranslation doi101007978-3-642-61905-2 p 21

[Raz09] Ran Raz Multi-linear formulas for permanent and determinant areof super-polynomial size J ACM 56(2)Art 8 17 2009doi10114515027931502797 p 108

[Raz13] Ran Raz Tensor-rank and lower bounds for arithmetic formulasJ ACM 60(6)Art 40 15 2013 doi1011452535928 p 14

[Rom82] Francesco Romani Some properties of disjoint sums of tensorsrelated to matrix multiplication SIAM J Comput 11(2)263ndash2671982 doi1011370211020 p 3

Bibliography 135

[Sap16] Ramprasad Saptharishi A survey of lower bounds in arithmeticcircuit complexity 302 2016 Online survey URLhttpsgithubcomdasarpmarlowerbounds-survey p 6 17109 112

[Sch81] Arnold Schonhage Partial and total matrix multiplication SIAMJ Comput 10(3)434ndash455 1981 p 3

[Sch03] Alexander Schrijver Combinatorial optimization polyhedra andefficiency volume 24 Springer Science amp Business Media 2003p 37 41

[Sha56] Claude E Shannon The zero error capacity of a noisy channelInstitute of Radio Engineers Transactions on Information TheoryIT-2(September)8ndash19 1956 doi101109TIT19561056798p 13 35

[Sha09] Asaf Shapira Greenrsquos conjecture and testing linear-invariantproperties In Proceedings of the Forty-first Annual ACMSymposium on Theory of Computing STOC rsquo09 pages 159ndash166New York NY USA 2009 ACMdoi10114515364141536438 p 48

[Shi16] Yaroslav Shitov How hard is the tensor rank arXiv 2016arXiv161101559 p 47

[Sin64] Richard C Singleton Maximum distance q-nary codes IEEETrans Information Theory IT-10116ndash118 1964doi101109TIT19641053661 p 101

[SOK14] Adam Sawicki Micha l Oszmaniec and Marek Kus Convexity ofmomentum map Morse index and quantum entanglement RevMath Phys 26(3)1450004 39 2014doi101142S0129055X14500044 p 9

[SSS09] Chandan Saha Ramprasad Saptharishi and Nitin Saxena Thepower of depth 2 circuits over algebras In IARCS AnnualConference on Foundations of Software Technology and TheoreticalComputer Science volume 4 pages 371ndash382 2009arXiv09042058 doi104230LIPIcsFSTTCS20092333p 109

[Sto10] Andrew James Stothers On the complexity of matrix multiplicationPhD thesis University of Edinburgh 2010httphdlhandlenet18424734 p 4 6 8 48

136 Bibliography

[Str69] Volker Strassen Gaussian elimination is not optimal NumerMath 13(4)354ndash356 1969 doi101007BF02165411 p 3 5

[Str83] Volker Strassen Rank and optimal computation of generic tensorsLinear Algebra Appl 5253645ndash685 1983doi1010160024-3795(83)80041-X p 110

[Str86] Volker Strassen The asymptotic spectrum of tensors and theexponent of matrix multiplication In Proceedings of the 27thAnnual Symposium on Foundations of Computer Science SFCS rsquo86pages 49ndash54 Washington DC USA 1986 IEEE Computer Societydoi101109SFCS198652 p 4 7

[Str87] Volker Strassen Relative bilinear complexity and matrixmultiplication J Reine Angew Math 375376406ndash443 1987doi101515crll1987375-376406 p 3 4 49 67

[Str88] Volker Strassen The asymptotic spectrum of tensors J ReineAngew Math 384102ndash152 1988doi101515crll1988384102 p 4 7 12 19 26 27 28 2930 32 33 49 50 51

[Str91] Volker Strassen Degeneration and complexity of bilinear mapssome asymptotic spectra J Reine Angew Math 413127ndash1801991 doi101515crll1991413127 p 3 4 10 48 49 5255 56 57 66 67 81 82

[Str94] Volker Strassen Algebra and complexity In First EuropeanCongress of Mathematics Vol II (Paris 1992) volume 120 ofProgr Math pages 429ndash446 Birkhauser Basel 1994doi101007s10107-008-0221-1 p 67

[Str05] Volker Strassen Komplexitat und Geometrie bilinearerAbbildungen Jahresber Deutsch Math-Verein 107(1)3ndash31 2005p 4 88 94 95 100 101

[Tao08] Terence Tao Structure and randomness pages from year one of amathematical blog American Mathematical Soc 2008 p 48

[Tao16] Terence Tao A symmetric formulation of theCrootndashLevndashPachndashEllenbergndashGijswijt capset boundhttpsterrytaowordpresscom 2016 p 48 58 81 84

[Tob91] Verena Tobler Spezialisierung und Degeneration von TensorenPhD thesis Universitat Konstanz 1991httpnbn-resolvingdeurnnbndebsz352-opus-20324p 57

Bibliography 137

[TS16] Terence Tao and Will Sawin Notes on the ldquoslice rankrdquo of tensorshttpsterrytaowordpresscom 2016 p 48 58

[Val79] Leslie G Valiant Completeness classes in algebra In ConferenceRecord of the Eleventh Annual ACM Symposium on Theory ofComputing (Atlanta Ga 1979) pages 249ndash261 ACM New York1979 doi101145800135804419 p 107 108 123

[Val80] Leslie G Valiant Reducibility by algebraic projections Universityof Edinburgh Department of Computer Science 1980 InternalReport p 109 119 123

[VC15] Peter Vrana and Matthias Christandl Asymptotic entanglementtransformation between W and GHZ states J Math Phys56(2)022204 12 2015 arXiv13103244doi10106314908106 p 69

[VDDMV02] F Verstraete J Dehaene B De Moor and H Verschelde Fourqubits can be entangled in nine different ways Phys Rev A (3)65(5 part A)052112 5 2002 doi101103PhysRevA65052112p 48

[Wal14] Michael Walter Multipartite quantum states and their marginalsPhD thesis ETH Zurich 2014 arXiv14106820 p 93

[WDGC13] Michael Walter Brent Doran David Gross and MatthiasChristandl Entanglement polytopes multiparticle entanglementfrom single-particle information Science 340(6137)1205ndash12082013 arXiv12080365 doi101126science1232957 p 8 995

[Wil12] Virginia Vassilevska Williams Multiplying matrices faster thanCoppersmith-Winograd Extended abstract InSTOCrsquo12mdashProceedings of the 2012 ACM Symposium on Theory ofComputing pages 887ndash898 ACM New York 2012doi10114522139772214056 p 4 6 8 48

[Zui17] Jeroen Zuiddam A note on the gap between rank and border rankLinear Algebra Appl 52533ndash44 2017doi101016jlaa201703015 p 2 14 110

[Zui18] Jeroen Zuiddam The asymptotic spectrum of graphs and theShannon capacity arXiv 2018 arXiv180700169 p 35

Glossary

$\langle n \rangle$: $n \times \cdots \times n$ diagonal tensor

$\langle a, b, c \rangle$: matrix multiplication tensor

$G * H$: or-product

$G \boxtimes H$: strong graph product, and-product

$\alpha(G)$: stability number

$\overline{\chi}(G)$: clique cover number

$K_k$: complete graph on $k$ vertices

$F^{\theta}(t)$: quantum functional

$G(t)$: $\mathrm{GL}_{n_1} \times \cdots \times \mathrm{GL}_{n_k}$ for $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$

$H(P)$: Shannon entropy of probability distribution $P$

$h(p)$: binary entropy of probability $p \in [0, 1]$

$\tau(\Phi)$: hitting set number

$\tilde{\tau}(\Phi)$: asymptotic hitting set number

$\omega$: matrix multiplication exponent

$P$: moment polytope

$\mathcal{P}(X)$: the set of probability distributions on $X$

$R$: rank

$\tilde{R}$: asymptotic rank

$\underline{R}(t)$: border rank

$R(G)$: rank of a graph, clique cover number

$R(t)$: tensor rank

$\mathrm{SR}(t)$: slice rank

$Q$: subrank

$\tilde{Q}$: asymptotic subrank

$\underline{Q}(t)$: border subrank

$Q(\Phi)$: combinatorial subrank

$Q(G)$: subrank of a graph, stability number

$\operatorname{supp}(t)$: support

$\Theta(G)$: Shannon capacity

$\vartheta(G)$: Lovász theta number

$G \sqcup H$: disjoint union

$W(t)$: $S_{n_1} \times \cdots \times S_{n_k}$ for $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$

$X(S, \leq)$: asymptotic spectrum of semiring $S$ with Strassen preorder $\leq$

$\zeta^{(S)}(t)$: gauge point

$\zeta^{\theta}(t)$: support functional

Samenvatting

Algebraic complexity, asymptotic spectra and entanglement polytopes

It is well known that the rank of a matrix is multiplicative under the Kronecker product, additive under the direct sum, normalised on identity matrices, and non-increasing under multiplication from the left and from the right by matrices. Matrix rank is in fact the only real parameter with these four properties. In 1986, Strassen initiated the study of the extension to tensors: find all maps from k-tensors to the real numbers that are multiplicative under the tensor Kronecker product, additive under the direct sum, normalised on "identity tensors", and non-increasing under applying linear maps to the k tensor factors. Strassen called the set of these maps the "asymptotic spectrum of k-tensors". He proved that if we understand the asymptotic spectrum, then we understand the asymptotic relations among tensors, including the asymptotic subrank and the asymptotic rank. In particular, if we know the asymptotic spectrum, then we know the arithmetic complexity of matrix multiplication, a central problem in algebraic complexity theory.

One of the main results in this dissertation is the first explicit construction of an infinite family of elements in the asymptotic spectrum of complex k-tensors, called the quantum functionals. Our construction is based on information theory and moment polytopes, also called entanglement polytopes. In addition, we study, among other things, the relation between the recently introduced slice rank and the quantum functionals, and we prove that the "asymptotic" slice rank equals the minimum over the quantum functionals. Besides studying the tensor parameters mentioned above, we give an extension of the Coppersmith–Winograd method (for obtaining lower bounds on the asymptotic combinatorial subrank) to higher-order tensors, i.e. tensors of order at least 4. We apply this extension to obtain new upper bounds on the asymptotic tensor rank of complete graph tensors via the laser method. (Joint work with Christandl and Vrana; QIP 2018, STOC 2018.)

As a new application of the abstract theory of asymptotic spectra, we introduce the asymptotic spectrum of graphs in graph theory. Analogous to the situation for tensors, if we understand the asymptotic spectrum of graphs, then we understand the Shannon capacity, a graph parameter that characterises the zero-error communication complexity of communication channels. In other words, we prove a new duality theorem for the Shannon capacity. Examples of elements in the asymptotic spectrum of graphs are the Lovász theta number and the fractional Haemers bounds.

Finally, we study an algebraic model of computation called algebraic branching programs. An algebraic branching program (abp) is the trace of a product of matrices whose entries are polynomials of degree at most 1. The maximum size of the matrices is called the width of the abp. In 1992, Ben-Or and Cleve proved that every polynomial can be computed by a width-3 abp with a number of matrices that is polynomial in the formula size of the polynomial. In contrast, Allender and Wang proved in 2011 that some polynomials cannot be computed by width-2 abps. We prove that every polynomial can be approximated by a width-2 abp with a number of matrices that is polynomial in the formula size of the polynomial, where approximation is meant in the sense of degeneration. (Joint work with Ikenmeyer and Bringmann; CCC 2017, JACM 2018.)

Summary

Algebraic complexity, asymptotic spectra and entanglement polytopes

Matrix rank is well-known to be multiplicative under the Kronecker product, additive under the direct sum, normalised on identity matrices, and non-increasing under multiplying from the left and from the right by any matrices. In fact, matrix rank is the only real matrix parameter with these four properties. In 1986, Strassen proposed to study the extension to tensors: find all maps from k-tensors to the reals that are multiplicative under the tensor Kronecker product, additive under the direct sum, normalised on "identity tensors", and non-increasing under acting with linear maps on the k tensor factors. Strassen called the collection of these maps the "asymptotic spectrum of k-tensors". He proved that understanding the asymptotic spectrum implies understanding the asymptotic relations among tensors, including the asymptotic subrank and the asymptotic rank. In particular, knowing the asymptotic spectrum means knowing the arithmetic complexity of matrix multiplication, a central problem in algebraic complexity theory.
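The four properties of matrix rank listed above can be checked numerically on small examples. The following is a minimal sketch in plain Python, using exact rational arithmetic; the helper names (`rank`, `kronecker`, `direct_sum`, `matmul`, `identity`) and the example matrices are illustrative assumptions, not taken from the dissertation.

```python
from fractions import Fraction


def rank(matrix):
    """Rank over the rationals, by Gaussian elimination on exact fractions."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        # Find a row at or below position r with a nonzero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def kronecker(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def direct_sum(a, b):
    """Block-diagonal matrix with blocks a and b."""
    wa, wb = len(a[0]), len(b[0])
    return [row + [0] * wb for row in a] + [[0] * wa + row for row in b]


def matmul(a, b):
    """Ordinary matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]


# Hypothetical example matrices: A has rank 1, B has rank 2, C has rank 2.
A = [[1, 2], [2, 4]]
B = [[1, 0, 1], [0, 1, 1]]
C = [[1, 2], [3, 4]]
L = [[1, 1], [1, 1]]  # a singular matrix applied from the left

multiplicative = rank(kronecker(A, B)) == rank(A) * rank(B)
additive = rank(direct_sum(A, B)) == rank(A) + rank(B)
normalised = rank(identity(3)) == 3
monotone = rank(matmul(matmul(L, C), identity(2))) <= rank(C)
```

Such checks only illustrate the properties on particular inputs; they say nothing about the uniqueness statement, which is a theorem rather than a computation.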

One of the main results in this dissertation is the first explicit construction of an infinite family of elements in the asymptotic spectrum of complex k-tensors, called the quantum functionals. Our construction is based on information theory and moment polytopes, i.e. entanglement polytopes. Moreover, among other things, we study the relation of the recently introduced slice rank to the quantum functionals and find that "asymptotic" slice rank equals the minimum over the quantum functionals. Besides studying the above tensor parameters, we extend the Coppersmith–Winograd method (for obtaining asymptotic combinatorial subrank lower bounds) to higher-order tensors, i.e. order at least 4. We apply this generalisation to obtain new upper bounds on the asymptotic tensor rank of complete graph tensors via the laser method. (Joint work with Christandl and Vrana; QIP 2018, STOC 2018.)

In graph theory, as a new instantiation of the abstract theory of asymptotic spectra, we introduce the asymptotic spectrum of graphs. Analogous to the situation for tensors, understanding the asymptotic spectrum of graphs means understanding the Shannon capacity, a graph parameter capturing the zero-error communication complexity of communication channels. In different words, we prove a new duality theorem for Shannon capacity. Some known elements in the asymptotic spectrum of graphs are the Lovász theta number and the fractional Haemers bounds.
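As a small illustration of why Shannon capacity is an asymptotic quantity, the sketch below checks a classical fact about the 5-cycle: its stability number is 2, yet its strong product with itself contains an independent set of size 5, so the stability number is not multiplicative under the strong product. The function names and the encoding of vertices as tuples are assumptions made for this example only.

```python
from itertools import combinations


def cycle_adjacent(a, b, n=5):
    """Adjacency in the cycle graph on vertices 0, ..., n-1."""
    return (a - b) % n in (1, n - 1)


def strong_adjacent(u, v, n=5):
    """Adjacency in a strong product of copies of the n-cycle.

    Distinct tuples are adjacent when every coordinate pair is
    equal or adjacent in the cycle.
    """
    if u == v:
        return False
    return all(x == y or cycle_adjacent(x, y, n) for x, y in zip(u, v))


def is_independent(vertices, n=5):
    return not any(strong_adjacent(u, v, n) for u, v in combinations(vertices, 2))


def stability_number_of_cycle(n=5):
    """Brute-force stability number of the n-cycle (fine for small n)."""
    vertices = [(i,) for i in range(n)]
    for size in range(n, 0, -1):
        if any(is_independent(s, n) for s in combinations(vertices, size)):
            return size
    return 0


# Five pairwise non-adjacent vertices in the strong product of two 5-cycles.
candidate = [(i, (2 * i) % 5) for i in range(5)]
```

Here `is_independent(candidate)` confirms an independent set of size 5 in the product, compared with `stability_number_of_cycle() ** 2`, which is 4.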

Finally, we study an algebraic model of computation called algebraic branching programs. An algebraic branching program (abp) is the trace of a product of matrices with affine linear forms as matrix entries. The maximum size of the matrices is called the width of the abp. In 1992, Ben-Or and Cleve proved that width-3 abps can compute any polynomial efficiently in the formula size. On the other hand, in 2011 Allender and Wang proved that some polynomials cannot be computed by any width-2 abp. We prove that any polynomial can be efficiently approximated by a width-2 abp, where approximation is defined in the sense of degeneration. (Joint work with Ikenmeyer and Bringmann; CCC 2017, JACM 2018.)
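The definition of an abp as the trace of a product of matrices of affine linear forms can be made concrete by evaluating one at a point. The sketch below is a hypothetical illustration: the representation of an affine form as a constant plus a dictionary of coefficients, the helper names, and the restriction to square matrices of one fixed size are all assumptions of this example, not the dissertation's formalism.

```python
def affine(const=0, **coeffs):
    """Affine linear form const + sum of coeff * variable (illustrative encoding)."""
    return (const, coeffs)


def evaluate_form(form, point):
    const, coeffs = form
    return const + sum(c * point[v] for v, c in coeffs.items())


def evaluate_abp(matrices, point):
    """Evaluate trace(M_1 * M_2 * ... * M_k) at the given point.

    Assumes all matrices are square of the same size (the width).
    """
    width = len(matrices[0])
    product = [[int(i == j) for j in range(width)] for i in range(width)]
    for m in matrices:
        values = [[evaluate_form(f, point) for f in row] for row in m]
        product = [[sum(product[i][k] * values[k][j] for k in range(width))
                    for j in range(width)] for i in range(width)]
    return sum(product[i][i] for i in range(width))


# A width-2 abp whose trace is the polynomial x*y + 1.
x, y = affine(x=1), affine(y=1)
one, zero = affine(1), affine(0)
abp = [
    [[x, zero], [zero, one]],
    [[y, zero], [zero, one]],
]
value = evaluate_abp(abp, {"x": 3, "y": 5})
```

The product of the two diagonal matrices is diag(x*y, 1), so the trace is x*y + 1. The example says nothing about which polynomials need larger width, which is the subject of the results quoted above.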

Titles in the ILLC Dissertation Series

ILLC DS-2009-01: Jakub Szymanik
Quantifiers in TIME and SPACE. Computational Complexity of Generalized Quantifiers in Natural Language

ILLC DS-2009-02: Hartmut Fitz
Neural Syntax

ILLC DS-2009-03: Brian Thomas Semmes
A Game for the Borel Functions

ILLC DS-2009-04: Sara L. Uckelman
Modalities in Medieval Logic

ILLC DS-2009-05: Andreas Witzel
Knowledge and Games: Theory and Implementation

ILLC DS-2009-06: Chantal Bax
Subjectivity after Wittgenstein. Wittgenstein's embodied and embedded subject and the debate about the death of man

ILLC DS-2009-07: Kata Balogh
Theme with Variations. A Context-based Analysis of Focus

ILLC DS-2009-08: Tomohiro Hoshi
Epistemic Dynamics and Protocol Information

ILLC DS-2009-09: Olivia Ladinig
Temporal expectations and their violations

ILLC DS-2009-10: Tikitu de Jager
"Now that you mention it, I wonder...": Awareness, Attention, Assumption

ILLC DS-2009-11: Michael Franke
Signal to Act: Game Theory in Pragmatics

ILLC DS-2009-12: Joel Uckelman
More Than the Sum of Its Parts: Compact Preference Representation Over Combinatorial Domains

ILLC DS-2009-13: Stefan Bold
Cardinals as Ultrapowers. A Canonical Measure Analysis under the Axiom of Determinacy

ILLC DS-2010-01: Reut Tsarfaty
Relational-Realizational Parsing

ILLC DS-2010-02: Jonathan Zvesper
Playing with Information

ILLC DS-2010-03: Cédric Dégremont
The Temporal Mind. Observations on the logic of belief change in interactive systems

ILLC DS-2010-04: Daisuke Ikegami
Games in Set Theory and Logic

ILLC DS-2010-05: Jarmo Kontinen
Coherence and Complexity in Fragments of Dependence Logic

ILLC DS-2010-06: Yanjing Wang
Epistemic Modelling and Protocol Dynamics

ILLC DS-2010-07: Marc Staudacher
Use theories of meaning between conventions and social norms

ILLC DS-2010-08: Amélie Gheerbrant
Fixed-Point Logics on Trees

ILLC DS-2010-09: Gaëlle Fontaine
Modal Fixpoint Logic: Some Model Theoretic Questions

ILLC DS-2010-10: Jacob Vosmaer
Logic, Algebra and Topology. Investigations into canonical extensions, duality theory and point-free topology

ILLC DS-2010-11: Nina Gierasimczuk
Knowing One's Limits. Logical Analysis of Inductive Inference

ILLC DS-2010-12: Martin Mose Bentzen
Stit, Iit, and Deontic Logic for Action Types

ILLC DS-2011-01: Wouter M. Koolen
Combining Strategies Efficiently: High-Quality Decisions from Conflicting Advice

ILLC DS-2011-02: Fernando Raymundo Velázquez-Quesada
Small steps in dynamics of information

ILLC DS-2011-03: Marijn Koolen
The Meaning of Structure: the Value of Link Evidence for Information Retrieval

ILLC DS-2011-04: Junte Zhang
System Evaluation of Archival Description and Access

ILLC DS-2011-05: Lauri Keskinen
Characterizing All Models in Infinite Cardinalities

ILLC DS-2011-06: Rianne Kaptein
Effective Focused Retrieval by Exploiting Query Context and Document Structure

ILLC DS-2011-07: Jop Briët
Grothendieck Inequalities, Nonlocal Games and Optimization

ILLC DS-2011-08: Stefan Minică
Dynamic Logic of Questions

ILLC DS-2011-09: Raul Andres Leal
Modalities Through the Looking Glass: A study on coalgebraic modal logic and their applications

ILLC DS-2011-10: Lena Kurzen
Complexity in Interaction

ILLC DS-2011-11: Gideon Borensztajn
The neural basis of structure in language

ILLC DS-2012-01: Federico Sangati
Decomposing and Regenerating Syntactic Trees

ILLC DS-2012-02: Markos Mylonakis
Learning the Latent Structure of Translation

ILLC DS-2012-03: Edgar José Andrade Lotero
Models of Language: Towards a practice-based account of information in natural language

ILLC DS-2012-04: Yurii Khomskii
Regularity Properties and Definability in the Real Number Continuum: idealized forcing, polarized partitions, Hausdorff gaps and mad families in the projective hierarchy

ILLC DS-2012-05: David García Soriano
Query-Efficient Computation in Property Testing and Learning Theory

ILLC DS-2012-06: Dimitris Gakis
Contextual Metaphilosophy - The Case of Wittgenstein

ILLC DS-2012-07: Pietro Galliani
The Dynamics of Imperfect Information

ILLC DS-2012-08: Umberto Grandi
Binary Aggregation with Integrity Constraints

ILLC DS-2012-09: Wesley Halcrow Holliday
Knowing What Follows: Epistemic Closure and Epistemic Logic

ILLC DS-2012-10: Jeremy Meyers
Locations, Bodies, and Sets: A model theoretic investigation into nominalistic mereologies

ILLC DS-2012-11: Floor Sietsma
Logics of Communication and Knowledge

ILLC DS-2012-12: Joris Dormans
Engineering emergence: applied theory for game design

ILLC DS-2013-01: Simon Pauw
Size Matters: Grounding Quantifiers in Spatial Perception

ILLC DS-2013-02: Virginie Fiutek
Playing with Knowledge and Belief

ILLC DS-2013-03: Giannicola Scarpa
Quantum entanglement in non-local games, graph parameters and zero-error information theory

ILLC DS-2014-01: Machiel Keestra
Sculpting the Space of Actions: Explaining Human Action by Integrating Intentions and Mechanisms

ILLC DS-2014-02: Thomas Icard
The Algorithmic Mind: A Study of Inference in Action

ILLC DS-2014-03: Harald A. Bastiaanse
Very, Many, Small, Penguins

ILLC DS-2014-04: Ben Rodenhäuser
A Matter of Trust: Dynamic Attitudes in Epistemic Logic

ILLC DS-2015-01: María Inés Crespo
Affecting Meaning. Subjectivity and evaluativity in gradable adjectives

ILLC DS-2015-02: Mathias Winther Madsen
The Kid, the Clerk, and the Gambler - Critical Studies in Statistics and Cognitive Science

ILLC DS-2015-03: Shengyang Zhong
Orthogonality and Quantum Geometry: Towards a Relational Reconstruction of Quantum Theory

ILLC DS-2015-04: Sumit Sourabh
Correspondence and Canonicity in Non-Classical Logic

ILLC DS-2015-05: Facundo Carreiro
Fragments of Fixpoint Logics: Automata and Expressiveness

ILLC DS-2016-01: Ivano A. Ciardelli
Questions in Logic

ILLC DS-2016-02: Zoé Christoff
Dynamic Logics of Networks: Information Flow and the Spread of Opinion

ILLC DS-2016-03: Fleur Leonie Bouwer
What do we need to hear a beat? The influence of attention, musical abilities, and accents on the perception of metrical rhythm

ILLC DS-2016-04: Johannes Marti
Interpreting Linguistic Behavior with Possible World Models

ILLC DS-2016-05: Phong Le
Learning Vector Representations for Sentences - The Recursive Deep Learning Approach

ILLC DS-2016-06: Gideon Maillette de Buy Wenniger
Aligning the Foundations of Hierarchical Statistical Machine Translation

ILLC DS-2016-07: Andreas van Cranenburgh
Rich Statistical Parsing and Literary Language

ILLC DS-2016-08: Florian Speelman
Position-based Quantum Cryptography and Catalytic Computation

ILLC DS-2016-09: Teresa Piovesan
Quantum entanglement: insights via graph parameters and conic optimization

ILLC DS-2016-10: Paula Henk
Nonstandard Provability for Peano Arithmetic. A Modal Perspective

ILLC DS-2017-01: Paolo Galeazzi
Play Without Regret

ILLC DS-2017-02: Riccardo Pinosio
The Logic of Kant's Temporal Continuum

ILLC DS-2017-03: Matthijs Westera
Exhaustivity and intonation: a unified theory

ILLC DS-2017-04: Giovanni Cinà
Categories for the working modal logician

ILLC DS-2017-05: Shane Noah Steinert-Threlkeld
Communication and Computation: New Questions About Compositionality

ILLC DS-2017-06: Peter Hawke
The Problem of Epistemic Relevance

ILLC DS-2017-07: Aybüke Özgün
Evidence in Epistemic Logic: A Topological Perspective

ILLC DS-2017-08: Raquel Garrido Alhama
Computational Modelling of Artificial Language Learning: Retention, Recognition & Recurrence

ILLC DS-2017-09: Miloš Stanojević
Permutation Forests for Modeling Word Order in Machine Translation

ILLC DS-2018-01: Berit Janssen
Retained or Lost in Transmission? Analyzing and Predicting Stability in Dutch Folk Songs

ILLC DS-2018-02: Hugo Huurdeman
Supporting the Complex Dynamics of the Information Seeking Process

ILLC DS-2018-03: Corina Koolen
Reading beyond the female: The relationship between perception of author gender and literary quality

ILLC DS-2018-04: Jelle Bruineberg
Anticipating Affordances: Intentionality in self-organizing brain-body-environment systems

ILLC DS-2018-05: Joachim Daiber
Typologically Robust Statistical Machine Translation: Understanding and Exploiting Differences and Similarities Between Languages in Machine Translation

ILLC DS-2018-06: Thomas Brochhagen
Signaling under Uncertainty

ILLC DS-2018-07: Julian Schlöder
Assertion and Rejection

ILLC DS-2018-08: Srinivasan Arunachalam
Quantum Algorithms and Learning Theory

ILLC DS-2018-09: Hugo de Holanda Cunha Nobrega
Games for functions: Baire classes, Weihrauch degrees, transfinite computations, and ranks

ILLC DS-2018-10: Chenwei Shi
Reason to Believe

ILLC DS-2018-11: Malvin Gattinger
New Directions in Model Checking Dynamic Epistemic Logic

ILLC DS-2018-12: Julia Ilin
Filtration Revisited: Lattices of Stable Non-Classical Logics

  • Acknowledgements
  • Introduction
    • Matrix multiplication
    • The asymptotic spectrum of tensors
    • Higher-order CW method
    • Abstract asymptotic spectra
    • The asymptotic spectrum of graphs
    • Tensor degeneration
    • Combinatorial degeneration
    • Algebraic branching program degeneration
    • Organisation
      • The theory of asymptotic spectra
        • Introduction
        • Semirings and preorders
        • Strassen preorders
        • Asymptotic preorders
        • Maximal Strassen preorders
        • The asymptotic spectrum
        • The representation theorem
        • Abstract rank and subrank
        • Topological aspects
        • Uniqueness
        • Subsemirings
        • Subsemirings generated by one element
        • Universal spectral points
        • Conclusion
          • The asymptotic spectrum of graphs Shannon capacity
            • Introduction
            • The asymptotic spectrum of graphs
              • The semiring of graph isomorphism classes
              • Strassen preorder via graph homomorphisms
              • The asymptotic spectrum of graphs
              • Shannon capacity
                • Universal spectral points
                  • Lovaacutesz theta number
                  • Fractional graph parameters
                    • Conclusion
                      • The asymptotic spectrum of tensors matrix multiplication
                        • Introduction
                        • The asymptotic spectrum of tensors
                          • The semiring of tensor equivalence classes
                          • Strassen preorder via restriction
                          • The asymptotic spectrum of tensors
                          • Asymptotic rank and asymptotic subrank
                            • Gauge points
                            • Support functionals
                            • Upper and lower support functionals
                            • Asymptotic slice rank
                            • Conclusion
                              • Tight tensors and combinatorial subrank cap sets
                                • Introduction
                                • Higher-order CoppersmithndashWinograd method
                                  • Construction
                                  • Computational remarks
                                  • Examples type sets
                                    • Combinatorial degeneration method
                                    • Cap sets
                                      • Reduced polynomial multiplication
                                      • Cap sets
                                        • Graph tensors
                                        • Conclusion
                                          • Universal points in the asymptotic spectrum of tensors entanglement polytopes moment polytopes
                                            • Introduction
                                            • SchurndashWeyl duality
                                            • Kronecker and LittlewoodndashRichardson coefficients
                                            • Entropy inequalities
                                            • Hilbert spaces and density operators
                                            • Moment polytopes
                                              • General setting
                                              • Tensor spaces
                                                • Quantum functionals
                                                • Outer approximation
                                                • Inner approximation for free tensors
                                                • Quantum functionals versus support functionals
                                                • Asymptotic slice rank
                                                • Conclusion
                                                  • Algebraic branching programs; approximation and nondeterminism
                                                    • Introduction
                                                    • Definitions and basic results
                                                      • Computational models
                                                      • Complexity classes
                                                      • The theorem of Ben-Or and Cleve
                                                      • Approximation closure
                                                      • Nondeterminism closure
                                                        • Approximation closure of VP2
                                                        • Nondeterminism closure of VP1
                                                        • Conclusion
                                                          • Bibliography
                                                          • Glossary
                                                          • Samenvatting
                                                          • Summary

Doctoral committee (Promotiecommissie)

Supervisors (promotores):
prof. dr. H.M. Buhrman, Universiteit van Amsterdam
prof. dr. M. Christandl, Københavns Universitet

Other members:
prof. dr. M. Laurent, Tilburg University
prof. dr. E.M. Opdam, Universiteit van Amsterdam
prof. dr. R.M. de Wolf, Universiteit van Amsterdam
dr. J. Briët, CWI Amsterdam
dr. M. Walter, Universiteit van Amsterdam

Faculteit der Natuurwetenschappen, Wiskunde en Informatica

Contents

Acknowledgements

1 Introduction
  1.1 Matrix multiplication
  1.2 The asymptotic spectrum of tensors
  1.3 Higher-order CW method
  1.4 Abstract asymptotic spectra
  1.5 The asymptotic spectrum of graphs
  1.6 Tensor degeneration
  1.7 Combinatorial degeneration
  1.8 Algebraic branching program degeneration
  1.9 Organisation

2 The theory of asymptotic spectra
  2.1 Introduction
  2.2 Semirings and preorders
  2.3 Strassen preorders
  2.4 Asymptotic preorders ≾
  2.5 Maximal Strassen preorders
  2.6 The asymptotic spectrum X(S, ≤)
  2.7 The representation theorem
  2.8 Abstract rank and subrank R, Q
  2.9 Topological aspects
  2.10 Uniqueness
  2.11 Subsemirings
  2.12 Subsemirings generated by one element
  2.13 Universal spectral points
  2.14 Conclusion

3 The asymptotic spectrum of graphs; Shannon capacity
  3.1 Introduction
  3.2 The asymptotic spectrum of graphs
    3.2.1 The semiring of graph isomorphism classes G
    3.2.2 Strassen preorder via graph homomorphisms
    3.2.3 The asymptotic spectrum of graphs X(G)
    3.2.4 Shannon capacity Θ
  3.3 Universal spectral points
    3.3.1 Lovász theta number ϑ
    3.3.2 Fractional graph parameters
  3.4 Conclusion

4 The asymptotic spectrum of tensors; matrix multiplication
  4.1 Introduction
  4.2 The asymptotic spectrum of tensors
    4.2.1 The semiring of tensor equivalence classes T
    4.2.2 Strassen preorder via restriction
    4.2.3 The asymptotic spectrum of tensors X(T)
    4.2.4 Asymptotic rank and asymptotic subrank
  4.3 Gauge points ζ_(i)
  4.4 Support functionals ζ^θ
  4.5 Upper and lower support functionals ζ^θ, ζ_θ
  4.6 Asymptotic slice rank
  4.7 Conclusion

5 Tight tensors and combinatorial subrank; cap sets
  5.1 Introduction
  5.2 Higher-order Coppersmith–Winograd method
    5.2.1 Construction
    5.2.2 Computational remarks
    5.2.3 Examples: type sets
  5.3 Combinatorial degeneration method
  5.4 Cap sets
    5.4.1 Reduced polynomial multiplication
    5.4.2 Cap sets
  5.5 Graph tensors
  5.6 Conclusion

6 Universal points in the asymptotic spectrum of tensors; entanglement polytopes, moment polytopes
  6.1 Introduction
  6.2 Schur–Weyl duality
  6.3 Kronecker and Littlewood–Richardson coefficients g_{λμν}, c^λ_{μν}
  6.4 Entropy inequalities
  6.5 Hilbert spaces and density operators
  6.6 Moment polytopes P(t)
    6.6.1 General setting
    6.6.2 Tensor spaces
  6.7 Quantum functionals F^θ(t)
  6.8 Outer approximation
  6.9 Inner approximation for free tensors
  6.10 Quantum functionals versus support functionals
  6.11 Asymptotic slice rank
  6.12 Conclusion

7 Algebraic branching programs; approximation and nondeterminism
  7.1 Introduction
  7.2 Definitions and basic results
    7.2.1 Computational models
    7.2.2 Complexity classes VP, VP_e, VP_k
    7.2.3 The theorem of Ben-Or and Cleve
    7.2.4 Approximation closure C̄
    7.2.5 Nondeterminism closure N(C)
  7.3 Approximation closure of VP_2
  7.4 Nondeterminism closure of VP_1
  7.5 Conclusion

Bibliography

Glossary

Samenvatting

Summary

Acknowledgements

First of all, I thank all my coauthors for very fruitful collaboration: Harry Buhrman, Matthias Christandl, Péter Vrana, Jop Briët, Chris Perry, Asger Jensen, Markus Bläser, Christian Ikenmeyer and Karl Bringmann.

Chris Zaal, Leen Torenvliet and Robert Belleman I thank for all their efforts to set up for me the "double bachelor programme" in Mathematics and Computer science at the University of Amsterdam (UvA) in 2009. This programme, as well as the "webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch national research institute for mathematics and computer science (CWI), made me decide to come to Amsterdam. My enjoyable master thesis project in mathematics with Eric Opdam made me follow the academic path, for which I thank Eric.

Of course, most importantly, I thank my PhD supervisor Harry Buhrman for introducing me to research as a bachelor student, for absorbing me into the Algorithms and Complexity group at CWI, for having enough faith in me to hire me as his PhD student in 2014, and for his general guidance throughout. I feel very lucky for the opportunities and scientific freedom that this has brought me.

Matthias Christandl has been my closest collaborator and mentor since we met in Berkeley in 2014. In practice this meant countless nights of fun Skype sessions between Amsterdam and Copenhagen, countless enjoyable visits to the University of Copenhagen, and countless kitchen table sessions at the Hallinsgade. Thanks, Matthias, for the energy, inspiration and optimism. And thanks, Matthias and Henriette, for the hospitality.

Jop Briët I thank for his general guidance and for lots of inspiration. The polynomial method reading group, which he mainly organised, inspired part of my paper with Matthias Christandl and Péter Vrana on universal points in the asymptotic spectrum of tensors. (This reading group also resulted in Dion Gijswijt's paper on cap sets.) My paper with Jop on round elimination later inspired me to write the paper on the asymptotic spectrum of graphs.

Christian Ikenmeyer I thank for numerous inspiring discussions on algebraic complexity theory and tensors, which greatly influenced my papers on tensor rank and our joint paper with Karl Bringmann on algebraic branching programs.

Péter Vrana I thank for our many enjoyable research collaborations, the results of which form a central part of this dissertation, for his clever insights, and for finding several mathematical mistakes while reading the draft of this dissertation.

Ronald de Wolf I thank for his general advice throughout my PhD and for many suggestions regarding the current version of this dissertation, which will be incorporated in the next version (but not in the printed version, because of the regulations of the University of Amsterdam).

Jop Briët, Monique Laurent, Lex Schrijver, Péter Vrana, Matthias Christandl, Māris Ozols, Michael Walter and Bart Sevenster I thank for helpful discussions regarding the results in Chapter 2 and Chapter 3 of this dissertation.

Srinivasan Arunachalam I thank for sharing the ups and downs during our four years as PhD students at CWI. Florian Speelman, Farrokh Labib, Sven Polak, Bart Litjens and Bart Sevenster I thank for numerous valuable research discussions.

Bikkie Aldeias and Rob van Rooijen I thank for their excellent library services. Martijn Zuiddam and Māris Ozols I thank for proofreading the draft of this dissertation.

Finally, I thank my parents and my brothers and my friends for their support.

Amsterdam, August 31, 2018
Jeroen Zuiddam

Publications

[BCPZ16] Harry Buhrman, Matthias Christandl, Christopher Perry, and Jeroen Zuiddam. Clean quantum and classical communication protocols. Physical Review Letters, 117:230503, 2016.
https://link.aps.org/doi/10.1103/PhysRevLett.117.230503
http://arxiv.org/abs/1605.07948

[BCZ17a] Markus Bläser, Matthias Christandl, and Jeroen Zuiddam. The border support rank of two-by-two matrix multiplication is seven. Manuscript, 2017.
https://arxiv.org/abs/1705.09652

[BCZ17b] Harry Buhrman, Matthias Christandl, and Jeroen Zuiddam. Nondeterministic Quantum Communication Complexity: the Cyclic Equality Game and Iterated Matrix Multiplication. In Christos H. Papadimitriou, editor, 8th Innovations in Theoretical Computer Science Conference (ITCS), 2017.
http://drops.dagstuhl.de/opus/volltexte/2017/8181

[BIZ17] Karl Bringmann, Christian Ikenmeyer, and Jeroen Zuiddam. On algebraic branching programs of small width. In Ryan O'Donnell, editor, 32nd Computational Complexity Conference (CCC), 2017.
https://doi.org/10.4230/LIPIcs.CCC.2017.20
https://arxiv.org/abs/1702.05328
Journal of the ACM, Vol. 65, No. 5, Article 32, 2018.
https://doi.org/10.1145/3209663

[BZ17] Jop Briët and Jeroen Zuiddam. On the orthogonal rank of Cayley graphs and impossibility of quantum round elimination. Quantum Information and Computation, 2017.
http://www.rintonpress.com/xxqic17/qic-17-12/0106-0116.pdf
https://arxiv.org/abs/1608.06113

[CJZ18] Matthias Christandl, Asger Kjærulff Jensen, and Jeroen Zuiddam. Tensor rank is not multiplicative under the tensor product. Linear Algebra and its Applications, 543:125–139, 2018.
https://doi.org/10.1016/j.laa.2017.12.020
https://arxiv.org/abs/1705.09379

[CVZ16] Matthias Christandl, Péter Vrana, and Jeroen Zuiddam. Asymptotic tensor rank of graph tensors: beyond matrix multiplication. Manuscript, 2016.
https://arxiv.org/abs/1609.07476

[CVZ18] Matthias Christandl, Péter Vrana, and Jeroen Zuiddam. Universal Points in the Asymptotic Spectrum of Tensors: Extended Abstract. In Proceedings of 50th Annual ACM SIGACT Symposium on the Theory of Computing (STOC), 2018.
https://doi.org/10.1145/3188745.3188766
https://arxiv.org/abs/1709.07851

[CZ18] Matthias Christandl and Jeroen Zuiddam. Tensor surgery and tensor rank. (Journal of) computational complexity, 2018.
https://doi.org/10.1007/s00037-018-0164-8
https://arxiv.org/abs/1606.04085

[Zui17] Jeroen Zuiddam. A note on the gap between rank and border rank. Linear Algebra and its Applications, 525:33–44, 2017.
https://doi.org/10.1016/j.laa.2017.03.015
http://arxiv.org/abs/1504.05597

[Zui18] Jeroen Zuiddam. The asymptotic spectrum of graphs and the Shannon capacity. Manuscript, 2018.
http://arxiv.org/abs/1807.00169

This dissertation is based on the above papers, with primary focus on the four highlighted papers.

Note on the relative importance of the co-authors: for each article, the importance of the co-authors is approximately equally distributed.

Chapter 1

Introduction

Volker Strassen published in 1969 his famous algorithm for multiplying any two $n \times n$ matrices using only $O(n^{2.81})$ rather than $O(n^3)$ arithmetical operations [Str69]. His discovery marked the beginning of a still ongoing line of research in the field of algebraic complexity theory, a line of research that by now touches several fields of mathematics, including algebraic geometry, representation theory, (quantum) information theory and combinatorics. This dissertation is inspired by and contributes to this line of research.

No further progress followed for almost 10 years after Strassen's discovery, despite the fact that "many scientists understood that discovery as a signal to attack the problem and to push the exponent further down" [Pan84]. Then in 1978 Pan improved the exponent from 2.81 to 2.79 [Pan78, Pan80]. One year later, Bini, Capovani, Lotti and Romani improved the exponent to 2.78 by constructing fast "approximative" algorithms for matrix multiplication and making these algorithms exact via the method of interpolation [BCRL79, Bin80]. Cast in the language of tensors, the result of Bini et al. corresponds to what we now call a "border rank" upper bound. The idea of studying approximative complexity or border complexity for algebraic problems has nowadays become an important theme in algebraic complexity theory.

Schönhage then obtained the exponent 2.55 by constructing a fast algorithm for computing many "disjoint" small matrix multiplications and transforming this into an algorithm for one large matrix multiplication [Sch81]. The upper bound was improved shortly after by works of Pan [Pan81], Romani [Rom82], and Coppersmith and Winograd [CW82], resulting in the exponent 2.50. Then in 1987 Strassen published the laser method, with which he obtained the exponent 2.48 [Str87]. The laser method was used in the same year by Coppersmith and Winograd to obtain the exponent 2.38 [CW87]. To do this they invented a method for constructing certain large combinatorial structures. This method, or actually the extended version that Strassen published in [Str91], we now call the Coppersmith–Winograd method. All further improvements on upper bounding the exponent essentially follow the framework of Coppersmith and Winograd, and the improvements do not affect the first two digits after the decimal point [CW90, Sto10, Wil12, LG14].

Define $\omega$ to be the optimal exponent in the complexity of matrix multiplication. We call $\omega$ the exponent of matrix multiplication. To summarise the above historical account on upper bounds: $\omega < 2.38$. On the other hand, the only lower bound we currently have is the trivial lower bound $2 \leq \omega$.

The history of upper bounds on the matrix multiplication exponent $\omega$, which began with Strassen's algorithm and ended with the Strassen laser method and Coppersmith–Winograd method, is well-known and well-documented, see e.g. [BCS97, Section 15.13]. However, there is remarkable work of Strassen on a theory of lower bounds for $\omega$ and similar types of exponents, and this work has received almost no attention. This theory of lower bounds is the theory of asymptotic spectra of tensors and is the topic of a series of papers by Strassen [Str86, Str87, Str88, Str91, Str05].

In the foregoing the word tensor has popped up twice, namely when we mentioned border rank and just now when we mentioned asymptotic spectra of tensors, but we have not discussed at all why tensors should be relevant for understanding the complexity of matrix multiplication. First we give a mini course on tensors. A $k$-tensor $t = (t_{i_1,\ldots,i_k})_{i_1,\ldots,i_k}$ is a $k$-dimensional array of numbers from some field, say the complex numbers $\mathbb{C}$. Thus a 2-tensor is simply a matrix. A $k$-tensor is called simple if there exist $k$ vectors $v_1, \ldots, v_k$ such that the entries of $t$ are given by the products $t_{i_1,\ldots,i_k} = (v_1)_{i_1} \cdots (v_k)_{i_k}$ for all indices $i_j$. The tensor rank of $t$ is the smallest number $n$ such that $t$ can be written as a sum of $n$ simple tensors. Thus the tensor rank of a 2-tensor is simply its matrix rank. Returning to the problem of finding the complexity of matrix multiplication, there is a special 3-tensor, called the matrix multiplication tensor, that encodes the computational problem of multiplying two $2 \times 2$ matrices. This 3-tensor is commonly denoted by $\langle 2, 2, 2 \rangle$. It turns out that the matrix multiplication exponent $\omega$ is exactly the asymptotic rate of growth of the tensor rank of the "Kronecker powers" of the tensor $\langle 2, 2, 2 \rangle$. This important observation follows from the fundamental fact that the computational problem of multiplying matrices is "self-reducible". Namely, we can multiply two matrices by viewing them as block matrices and then performing matrix multiplication at the level of the blocks.

We wrap up this introductory story. To understand the computational complexity of matrix multiplication, one should understand the asymptotic rate of growth of the tensor rank of a certain family of tensors, a family that is obtained by taking powers of a fixed tensor. The theory of asymptotic spectra is the theory of bounds on such asymptotic parameters of tensors.

The main story line of this dissertation concerns the theory of asymptotic spectra. In Section 1.1 of this introduction we discuss in more detail the computational problem of multiplying matrices. In Section 1.2 we discuss the asymptotic spectrum of tensors and discuss a new result: an explicit description of infinitely many elements in the asymptotic spectrum of tensors. In Section 1.3 we consider a new higher-order Coppersmith–Winograd method.

The theory of asymptotic spectra of tensors is a special case of an abstract theory of asymptotic spectra of preordered semirings, which we discuss in Section 1.4. In Section 1.5 we apply this abstract theory to a new setting, namely to graphs. By doing this we obtain a new dual characterisation of the Shannon capacity of graphs.

The second story line of this dissertation is about degeneration, an algebraic kind of approximation related to the concept of border rank of Bini et al. We discuss degeneration in the context of tensors in Section 1.6. There is a combinatorial version of tensor degeneration which we call combinatorial degeneration. We discuss a new result regarding combinatorial degeneration in Section 1.7. Finally, Section 1.8 is about a new result concerning degeneration for algebraic branching programs, an algebraic model of computation.

We finish in Section 1.9 with a discussion of the organisation of this dissertation into chapters.

1.1 Matrix multiplication

In this section we discuss in more detail the computational problem of multiplying two matrices.

Algebraic complexity theory studies algebraic algorithms for algebraic problems. Roughly speaking, algebraic algorithms are algorithms that use only the basic arithmetical operations $+$ and $\times$ over some field, say $\mathbb{R}$ or $\mathbb{C}$. A fundamental example of an algebraic problem is matrix multiplication.

If we multiply two $n \times n$ matrices by computing the inner products between any row of the first matrix and any column of the second matrix, one by one, we need roughly $2 \cdot n^3$ arithmetical operations ($+$ and $\times$). For example, we can multiply two $2 \times 2$ matrices with 12 arithmetical operations, namely 8 multiplications and 4 additions:

$$\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} = \begin{pmatrix} a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} \\ a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22} \end{pmatrix}.$$
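The operation count above can be checked with a small sketch. The function name `naive_matmul_with_counts` is introduced here only for illustration; the code assumes square matrices given as lists of rows and counts each scalar multiplication and addition performed by the row-by-column method.

```python
def naive_matmul_with_counts(a, b):
    """Multiply two n x n matrices row-by-column, counting scalar operations."""
    n = len(a)
    mults = adds = 0
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Inner product of row i of a with column j of b.
            acc = a[i][0] * b[0][j]
            mults += 1
            for k in range(1, n):
                acc += a[i][k] * b[k][j]
                mults += 1
                adds += 1
            c[i][j] = acc
    return c, mults, adds


product, mults, adds = naive_matmul_with_counts([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(product, mults, adds)
```

For general $n$ this method uses $n^3$ multiplications and $n^2(n-1)$ additions, which is the "roughly $2 \cdot n^3$" of the text.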

Since matrix multiplication is a basic operation in linear algebra, it is worthwhile to see if we can do better than $2 \cdot n^3$. In 1969, Strassen [Str69] published a better algorithm. The base routine of Strassen's algorithm is an algorithm for multiplying two $2 \times 2$ matrices with 7 multiplications, 18 additions and certain sign changes:

$$\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} = \begin{pmatrix} x_1 + x_4 - x_5 + x_7 & x_3 + x_5 \\ x_2 + x_4 & x_1 + x_3 - x_2 + x_6 \end{pmatrix}$$

with

$$\begin{aligned}
x_1 &= (a_{11} + a_{22})(b_{11} + b_{22}) \\
x_2 &= (a_{21} + a_{22})\,b_{11} \\
x_3 &= a_{11}(b_{12} - b_{22}) \\
x_4 &= a_{22}(-b_{11} + b_{21}) \\
x_5 &= (a_{11} + a_{12})\,b_{22} \\
x_6 &= (-a_{11} + a_{21})(b_{11} + b_{12}) \\
x_7 &= (a_{12} - a_{22})(b_{21} + b_{22}).
\end{aligned}$$

The general routine of Strassen's algorithm multiplies two $n \times n$ matrices by recursively dividing the matrices into four blocks and applying the base routine to multiply the blocks (this is the self-reducibility of matrix multiplication that we mentioned earlier). The base routine does not assume commutativity of the variables for correctness, so indeed we can take the variables to be matrices. After expanding the recurrence we see that Strassen's algorithm uses $4.7 \cdot n^{\log_2 7} \approx 4.7 \cdot n^{2.81}$ arithmetical operations. Over the years, Strassen's algorithm was improved by many researchers. The best algorithm known today uses $C \cdot n^{2.38}$ arithmetical operations, where $C$ is some constant [CW90, Sto10, Wil12, LG14]. The exponent of matrix multiplication $\omega$ is the infimum over all real numbers $\beta$ such that for some constant $C_\beta$ we can multiply, for any $n \in \mathbb{N}$, any two $n \times n$ matrices with at most $C_\beta \cdot n^\beta$ arithmetical operations. From the above it follows that $\omega \leq 2.38$. From a simple flattening argument it follows that $2 \leq \omega$. We are left with the following well-known open problem: what is the value of the matrix multiplication exponent $\omega$?
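The recursion just described can be sketched as follows. This is a minimal illustration, not an optimised implementation: it assumes $n$ is a power of two, stores matrices as lists of rows, and the helper names (`mat_add`, `mat_sub`, `split`, `naive`) are introduced here only for the example. The seven block products follow the formulas for $x_1, \ldots, x_7$ above, with operand order preserved because blocks do not commute.

```python
def mat_add(x, y):
    return [[p + q for p, q in zip(r, s)] for r, s in zip(x, y)]


def mat_sub(x, y):
    return [[p - q for p, q in zip(r, s)] for r, s in zip(x, y)]


def split(m):
    """Return the four blocks m11, m12, m21, m22 of an even-sized matrix."""
    h = len(m) // 2
    return ([row[:h] for row in m[:h]], [row[h:] for row in m[:h]],
            [row[:h] for row in m[h:]], [row[h:] for row in m[h:]])


def strassen(a, b):
    """Multiply two n x n matrices, n a power of two, with 7 recursive products."""
    if len(a) == 1:
        return [[a[0][0] * b[0][0]]]
    a11, a12, a21, a22 = split(a)
    b11, b12, b21, b22 = split(b)
    x1 = strassen(mat_add(a11, a22), mat_add(b11, b22))
    x2 = strassen(mat_add(a21, a22), b11)
    x3 = strassen(a11, mat_sub(b12, b22))
    x4 = strassen(a22, mat_sub(b21, b11))
    x5 = strassen(mat_add(a11, a12), b22)
    x6 = strassen(mat_sub(a21, a11), mat_add(b11, b12))
    x7 = strassen(mat_sub(a12, a22), mat_add(b21, b22))
    c11 = mat_add(mat_sub(mat_add(x1, x4), x5), x7)
    c12 = mat_add(x3, x5)
    c21 = mat_add(x2, x4)
    c22 = mat_add(mat_add(mat_sub(x1, x2), x3), x6)
    # Glue the four result blocks back into one matrix.
    return [r + s for r, s in zip(c11, c12)] + [r + s for r, s in zip(c21, c22)]


def naive(a, b):
    """Reference row-by-column product, used only to check the result."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


a = [[(3 * i + j) % 5 - 2 for j in range(4)] for i in range(4)]
b = [[(i * i + 2 * j) % 7 - 3 for j in range(4)] for i in range(4)]
print(strassen(a, b) == naive(a, b))
```

Each level of the recursion replaces one product of size $n$ by seven products of size $n/2$ plus a number of additions proportional to $n^2$, which is where the exponent $\log_2 7$ comes from.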

The constant $C$ for the currently best algorithm is impractically large (for a discussion of this issue see e.g. [Pan18]). For a practical fast algorithm one should either improve $C$ or find a balance between $C$ and the exponent of $n$. We will ignore the size of $C$ in this dissertation and focus on the exponent $\omega$. For an overview of the field of algebraic complexity theory the reader should consult [BCS97] and [Sap16].

1.2 The asymptotic spectrum of tensors

We now discuss the theory of asymptotic spectra for tensors.

Let $s$ and $t$ be $k$-tensors over a field $\mathbb{F}$, $s \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$, $t \in \mathbb{F}^{m_1} \otimes \cdots \otimes \mathbb{F}^{m_k}$. We say $s$ restricts to $t$, and write $s \geq t$, if there are linear maps $A_i : \mathbb{F}^{n_i} \to \mathbb{F}^{m_i}$ such that $(A_1 \otimes \cdots \otimes A_k)(s) = t$. Let $[n] = \{1, \ldots, n\}$ for $n \in \mathbb{N}$. We define the product $s \otimes t \in \mathbb{F}^{n_1 m_1} \otimes \cdots \otimes \mathbb{F}^{n_k m_k}$ by $(s \otimes t)_{(i_1,j_1),\ldots,(i_k,j_k)} = s_{i_1,\ldots,i_k}\, t_{j_1,\ldots,j_k}$ for $i \in [n_1] \times \cdots \times [n_k]$ and $j \in [m_1] \times \cdots \times [m_k]$. This product generalizes the well-known Kronecker product of matrices. We refer to this product as the tensor (Kronecker) product. We define the direct sum $s \oplus t \in \mathbb{F}^{n_1+m_1} \otimes \cdots \otimes \mathbb{F}^{n_k+m_k}$ by $(s \oplus t)_{\ell_1,\ldots,\ell_k} = s_{\ell_1,\ldots,\ell_k}$ if $\ell \in [n_1] \times \cdots \times [n_k]$, $(s \oplus t)_{n_1+\ell_1,\ldots,n_k+\ell_k} = t_{\ell_1,\ldots,\ell_k}$ if $\ell \in [m_1] \times \cdots \times [m_k]$, and $(s \oplus t)_{\ell_1,\ldots,\ell_k} = 0$ for the remaining indices.
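The two operations just defined can be made concrete with a small sketch. The representation is an assumption made for this example only: a tensor is a pair `(dims, entries)`, where `entries` maps 0-based index tuples to nonzero values, and a pair index $(i, j)$ of the Kronecker product is encoded as the single integer `i * m + j`. The names `kron`, `direct_sum` and `diagonal` are hypothetical.

```python
def kron(s, t):
    """Tensor (Kronecker) product of two sparse k-tensors."""
    (n, s_entries), (m, t_entries) = s, t
    dims = tuple(ni * mi for ni, mi in zip(n, m))
    entries = {}
    for i, sv in s_entries.items():
        for j, tv in t_entries.items():
            # Encode each pair (i_a, j_a) as one index in a space of size n_a * m_a.
            idx = tuple(ia * mi + ja for ia, ja, mi in zip(i, j, m))
            entries[idx] = sv * tv
    return dims, entries


def direct_sum(s, t):
    """Direct sum: s in the first corner, t shifted by the dimensions of s."""
    (n, s_entries), (m, t_entries) = s, t
    dims = tuple(ni + mi for ni, mi in zip(n, m))
    entries = dict(s_entries)
    for j, tv in t_entries.items():
        entries[tuple(ni + ja for ni, ja in zip(n, j))] = tv
    return dims, entries


def diagonal(n, k=3):
    """The diagonal k-tensor with ones at positions (i, ..., i), i < n."""
    return (n,) * k, {(i,) * k: 1 for i in range(n)}


print(kron(diagonal(2), diagonal(3)) == diagonal(6))
print(direct_sum(diagonal(2), diagonal(3)) == diagonal(5))
```

With this encoding the product of two diagonal tensors of sizes 2 and 3 is the diagonal tensor of size 6, and their direct sum is the diagonal tensor of size 5.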

The asymptotic restriction problem asks to compute the infimum of all real numbers $\beta \geq 0$ such that for all $n \in \mathbb{N}$

$$s^{\otimes \beta n + o(n)} \geq t^{\otimes n}.$$

We may think of the asymptotic restriction problem as having two directions, namely to find

1. obstructions, "certificates" that prohibit $s^{\otimes \beta n + o(n)} \geq t^{\otimes n}$, or

2. constructions, linear maps that carry out $s^{\otimes \beta n + o(n)} \geq t^{\otimes n}$.

Ideally, we would like to find matching obstructions and constructions, so that we indeed learn the value of $\beta$.

What do obstructions look like? We set $\beta$ equal to one; it turns out that it is sufficient to understand this case. We say $s$ restricts asymptotically to $t$, and write $s \gtrsim t$, if

$$s^{\otimes n + o(n)} \geq t^{\otimes n}.$$

What do obstructions look like for asymptotic restriction $\gtrsim$? More precisely, what do obstructions look like for $\gtrsim$ restricted to a subset $S \subseteq \{k\text{-tensors over } \mathbb{F}\}$? Let us assume $S$ is closed under direct sum and tensor product, and contains the diagonal tensors $\langle n \rangle = \sum_{i=1}^{n} e_i \otimes \cdots \otimes e_i$ for $n \in \mathbb{N}$, where $e_1, \ldots, e_n$ is the standard basis of $\mathbb{F}^n$. Let $X(S)$ be the set of all maps $\varphi : S \to \mathbb{R}_{\geq 0}$ that are

(a) monotone under restriction $\geq$,

(b) multiplicative under the tensor Kronecker product $\otimes$,

(c) additive under the direct sum $\oplus$,

(d) normalised to $\varphi(\langle n \rangle) = n$ at the diagonal tensor $\langle n \rangle$.

The elements $\varphi \in X(S)$ are called spectral points of $S$. The set $X(S)$ is called the asymptotic spectrum of $S$.

Spectral points $\varphi \in X(S)$ are obstructions. Let $s, t \in S$. If $s \gtrsim t$, then by definition we have a restriction $s^{\otimes n + o(n)} \geq t^{\otimes n}$. Then (a) and (b) imply the inequality $\varphi(s)^{n+o(n)} = \varphi(s^{\otimes n + o(n)}) \geq \varphi(t^{\otimes n}) = \varphi(t)^n$. This implies $\varphi(s) \geq \varphi(t)$. We negate that statement: if $\varphi(s) < \varphi(t)$, then not $s \gtrsim t$. In that case $\varphi$ is an obstruction to $s \gtrsim t$.
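The step from the inequality of powers to the inequality between $\varphi(s)$ and $\varphi(t)$ can be spelled out as follows. This is a sketch under two assumptions introduced here for illustration: the $o(n)$ term is written as a function $f(n)$ with $f(n)/n \to 0$, and $\varphi(s) > 0$ (if $\varphi(s) = 0$, the first line already forces $\varphi(t) = 0$).

```latex
\begin{align*}
\varphi(s)^{\,n+f(n)} &\ge \varphi(t)^{\,n}
  && \text{by (a) and (b)} \\
\varphi(s)^{\,1+f(n)/n} &\ge \varphi(t)
  && \text{taking $n$-th roots} \\
\varphi(s) = \lim_{n\to\infty} \varphi(s)^{\,1+f(n)/n} &\ge \varphi(t)
  && \text{since } f(n)/n \to 0 .
\end{align*}
```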

The remarkable fact is that $X(S)$ is a complete set of obstructions for $\gtrsim$. Namely, for $s, t \in S$ the asymptotic restriction $s \gtrsim t$ holds if and only if we have $\varphi(s) \geq \varphi(t)$ for all spectral points $\varphi \in X(S)$. This was proven by Volker Strassen in [Str86, Str88]. His proof uses a theorem of Becker and Schwarz [BS83], which is commonly referred to as the Kadison–Dubois theorem (for historical reasons) or the real representation theorem. (We will say more about this completeness result in Section 1.4.)

Let us introduce tensor rank and subrank, and their asymptotic versions. The tensor rank of $t$ is the size of the smallest diagonal tensor that restricts to $t$, $R(t) = \min\{r \in \mathbb{N} : t \leq \langle r \rangle\}$, and the subrank of $t$ is the size of the largest diagonal tensor to which $t$ restricts, $Q(t) = \max\{r \in \mathbb{N} : \langle r \rangle \leq t\}$. Asymptotic rank is defined as $\underset{\sim}{R}(t) = \lim_{n \to \infty} R(t^{\otimes n})^{1/n}$ and asymptotic subrank is defined as $\underset{\sim}{Q}(t) = \lim_{n \to \infty} Q(t^{\otimes n})^{1/n}$. From Fekete's lemma it follows that $\underset{\sim}{Q}(t) = \sup_n Q(t^{\otimes n})^{1/n}$ and $\underset{\sim}{R}(t) = \inf_n R(t^{\otimes n})^{1/n}$. One easily verifies that every spectral point $\varphi \in X(S)$ is an upper bound on asymptotic subrank and a lower bound on asymptotic rank for any tensor $t \in S$:

$$\underset{\sim}{Q}(t) \leq \varphi(t) \leq \underset{\sim}{R}(t).$$

Strassen used the completeness of $X(S)$ for $\lesssim$ to prove $\underset{\sim}{Q}(t) = \min_{\varphi \in X(S)} \varphi(t)$ and $\underset{\sim}{R}(t) = \max_{\varphi \in X(S)} \varphi(t)$. One should think of these expressions as being dual to the defining expressions for $\underset{\sim}{Q}$ and $\underset{\sim}{R}$.

We mentioned that Strassen was motivated to study the asymptotic spectrum of tensors by the study of the complexity of matrix multiplication. The precise connection with matrix multiplication is as follows. The matrix multiplication exponent $\omega$ is characterised by the asymptotic rank $\underset{\sim}{R}(\langle 2, 2, 2 \rangle)$ of the matrix multiplication tensor

$$\langle 2, 2, 2 \rangle = \sum_{i,j,k \in [2]} e_{ij} \otimes e_{jk} \otimes e_{ki} \in \mathbb{F}^4 \otimes \mathbb{F}^4 \otimes \mathbb{F}^4$$

via $\underset{\sim}{R}(\langle 2, 2, 2 \rangle) = 2^\omega$. We know the trivial lower bound $2 \leq \omega$, see Section 4.3. We know the (nontrivial) upper bound $\omega \leq 2.3728639$, which is by Coppersmith and Winograd [CW90] and improvements by Stothers [Sto10], Williams [Wil12] and Le Gall [LG14]. It may seem that for the study of matrix multiplication only the asymptotic rank $\underset{\sim}{R}$ is of interest and that the asymptotic subrank $\underset{\sim}{Q}$ is just a toy parameter. Asymptotic subrank, however, plays an important role in the currently best matrix multiplication algorithms. We will discuss this idea in the context of the asymptotic subrank of so-called complete graph tensors in Section 5.5.
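To make the matrix multiplication tensor displayed above concrete, the following sketch builds it as a sparse dictionary and checks one standard reading of it: contracted with three flattened $2 \times 2$ matrices $A$, $B$, $C$, it evaluates to the trace of $ABC$. The function names and the row-major flattening of the matrix unit $e_{ij}$ to position `i * n + j` are choices made for this example only.

```python
from itertools import product


def matmul_tensor(n=2):
    """Nonzero entries of <n,n,n> = sum_{i,j,k} e_ij (x) e_jk (x) e_ki."""
    def flat(r, c):
        return r * n + c  # position of the matrix unit e_rc in F^(n*n)
    return {(flat(i, j), flat(j, k), flat(k, i)): 1
            for i, j, k in product(range(n), repeat=3)}


def trilinear(t, a, b, c):
    """Evaluate sum over (x, y, z) of t[x, y, z] * a_x * b_y * c_z."""
    fa = [v for row in a for v in row]
    fb = [v for row in b for v in row]
    fc = [v for row in c for v in row]
    return sum(val * fa[x] * fb[y] * fc[z] for (x, y, z), val in t.items())


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(m):
    return sum(m[i][i] for i in range(len(m)))


A, B, C = [[1, 2], [3, 4]], [[5, 6], [7, 8]], [[1, 0], [2, 1]]
t222 = matmul_tensor(2)
print(len(t222), trilinear(t222, A, B, C), trace(matmul(matmul(A, B), C)))
```

The tensor has 8 nonzero entries, one for each product $a_{ij} b_{jk}$ that appears in the naive $2 \times 2$ algorithm.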

The important message is: understanding the asymptotic spectrum of tensors $X(S)$ means understanding asymptotic restriction $\lesssim$, the asymptotic subrank $\underset{\sim}{Q}$ and the asymptotic rank $\underset{\sim}{R}$ of tensors. Of course, we should now find an explicit description of $X(S)$.

Our main result regarding the asymptotic spectrum of tensors is the explicit description of an infinite family of elements in the asymptotic spectrum of all complex tensors $X(\{\text{complex } k\text{-tensors}\})$, which we call the quantum functionals (Chapter 6). Finding such an infinite family has been an open problem since the work of Strassen. Moment polytopes (studied under the name entanglement polytopes in quantum information theory [WDGC13]) play a key role here. To each tensor $t$ is associated a convex polytope $P(t)$ collecting representation-theoretic information about $t$, called the moment polytope of $t$. (See e.g. [Nes84, Bri87, WDGC13, SOK14].) The moment polytope has two important equivalent descriptions.

Quantum marginal spectra description. We begin with the description of $P(t)$ in terms of quantum marginal spectra.

Let $V$ be a (finite-dimensional) Hilbert space. In quantum information theory, a positive semidefinite hermitian operator $\rho : V \to V$ with trace one is called a density operator. The sequence of eigenvalues of a density operator $\rho$ is a probability vector. We let $\mathrm{spec}(\rho) = (p_1, \ldots, p_n)$ be the sequence of eigenvalues of $\rho$, ordered non-increasingly, $p_1 \geq \cdots \geq p_n$. Let $V_1$ and $V_2$ be Hilbert spaces. Given a density operator $\rho$ on $V_1 \otimes V_2$, the reduced density operator $\rho_1 = \mathrm{tr}_2\, \rho$ is uniquely defined by the property that $\mathrm{tr}(\rho_1 X_1) = \mathrm{tr}(\rho (X_1 \otimes \mathrm{Id}_{V_2}))$ for all operators $X_1$ on $V_1$. The operator $\rho_1$ is again a density operator. The operation $\mathrm{tr}_2$ is called the partial trace over $V_2$. In an explicit form, $\rho_1$ is given by $\langle e_i, \rho_1(e_j) \rangle = \sum_{\ell} \langle e_i \otimes f_\ell, \rho(e_j \otimes f_\ell) \rangle$, where the $e_i$ form a basis of $V_1$ and the $f_\ell$ form an orthonormal basis of $V_2$ (the statement is independent of basis choice).

Let $V_i$ be a Hilbert space and consider the tensor product $V_1 \otimes V_2 \otimes V_3$. Associate with $t \in V_1 \otimes V_2 \otimes V_3$ the dual element $t^* = \langle t, \cdot \rangle \in (V_1 \otimes V_2 \otimes V_3)^*$. Then $\rho^t = t t^* / \langle t, t \rangle = t \langle t, \cdot \rangle / \langle t, t \rangle$ is a density operator on $V_1 \otimes V_2 \otimes V_3$. Viewing $\rho^t$ as a density operator on the regrouped space $V_1 \otimes (V_2 \otimes V_3)$, we may take the partial trace of $\rho^t$ over $V_2 \otimes V_3$ as described above. We denote the resulting density operator by $\rho^t_1 = \mathrm{tr}_{23}\, \rho^t$. We similarly define $\rho^t_2$ and $\rho^t_3$.
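For a tensor given in the standard basis, the first reduced density operator can be computed directly from the entries. The sketch below makes simplifying assumptions that are not in the text: the tensor has integer entries (so exact fractions can be used and no complex conjugation is needed) and is stored as a nested list `t[i][j][k]`. The function name `first_marginal` is hypothetical.

```python
from fractions import Fraction


def first_marginal(t):
    """Matrix of rho^t_1 for a 3-tensor with integer entries t[i][j][k]."""
    n1 = len(t)
    norm = sum(x * x for mat in t for row in mat for x in row)  # <t, t>

    def overlap(i, i2):
        # Sum over the second and third index of t[i][j][k] * t[i2][j][k].
        return sum(x * y for r, r2 in zip(t[i], t[i2]) for x, y in zip(r, r2))

    return [[Fraction(overlap(i, i2), norm) for i2 in range(n1)]
            for i in range(n1)]


# Example tensor e0(x)e0(x)e1 + e0(x)e1(x)e0 + e1(x)e0(x)e0 (0-based indices).
w = [[[0, 1], [1, 0]],
     [[1, 0], [0, 0]]]
rho1 = first_marginal(w)
print(rho1)
```

For this example the matrix is diagonal with entries 2/3 and 1/3, so the marginal spectrum is (2/3, 1/3). This is the spectrum at the single point `w` only; the moment polytope defined next collects such spectra over the whole orbit closure.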

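Let $V = V_1 \otimes V_2 \otimes V_3$. Let $G = \mathrm{GL}(V_1) \times \mathrm{GL}(V_2) \times \mathrm{GL}(V_3)$ act naturally on $V$. Let $t \in V \setminus \{0\}$. The moment polytope of $t$ is

$$P(t) = P(\overline{G \cdot t}) = \{(\mathrm{spec}(\rho^u_1), \mathrm{spec}(\rho^u_2), \mathrm{spec}(\rho^u_3)) : u \in \overline{G \cdot t} \setminus \{0\}\}.$$

Here $\overline{G \cdot t}$ denotes the Zariski closure, or equivalently the Euclidean closure, in $V$ of the orbit $G \cdot t = \{g \cdot t : g \in G\}$.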

Representation-theoretic description. On the other hand, there is a description of $P(t)$ in terms of non-vanishing of representation-theoretic multiplicities. We do not state this description here, but stress that it is crucial for our proofs.

Quantum functionals. For any probability vector $\theta \in \mathbb{R}^k$ (i.e. $\sum_{i=1}^{k} \theta(i) = 1$ and $\theta(i) \geq 0$ for all $i \in [k]$) we define the quantum functional $F^\theta$ as an optimisation over the moment polytope:

$$F^\theta(t) = \max\Big\{ 2^{\sum_{i=1}^{k} \theta(i) H(x^{(i)})} : (x^{(1)}, \ldots, x^{(k)}) \in P(t) \Big\}.$$

Here $H(y)$ denotes Shannon entropy of the probability vector $y$. We prove that $F^\theta$ satisfies properties (a), (b), (c) and (d) for all complex $k$-tensors.
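The quantity being maximised can be evaluated at a single candidate point as in the sketch below. This only evaluates the objective; it does not compute the moment polytope or the maximum. The names `shannon_entropy` and `objective` are introduced for illustration, and entropy is taken in bits (base 2), matching the base of the exponential.

```python
from math import log2


def shannon_entropy(p):
    """Shannon entropy in bits of a probability vector, with 0 * log 0 = 0."""
    return -sum(x * log2(x) for x in p if x > 0)


def objective(theta, marginals):
    """2 ** (sum_i theta(i) * H(x^(i))) at one point (x^(1), ..., x^(k))."""
    return 2 ** sum(w * shannon_entropy(x) for w, x in zip(theta, marginals))


# Point with three uniform marginal spectra (1/2, 1/2), as arises from the
# diagonal tensor <2> = e1(x)e1(x)e1 + e2(x)e2(x)e2 itself.
theta = (0.5, 0.25, 0.25)
uniform_point = [(0.5, 0.5)] * 3
value = objective(theta, uniform_point)
print(value)
```

At this point the objective equals 2. Since the entropy of a probability vector of length 2 is at most 1 bit, no point can exceed 2 here, which is consistent with the normalisation property (d) for the diagonal tensor $\langle 2 \rangle$.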

Theorem (Theorem 6.11). $F^\theta \in X(\{\text{complex } k\text{-tensors}\})$.

To put our result into context: Strassen in [Str91] constructed elements in the asymptotic spectrum of $S = \{\text{oblique } k\text{-tensors over } \mathbb{F}\}$ with the preorder $\leq|_S$. The set $S$ is a strict and non-generic subset of all $k$-tensors over $\mathbb{F}$. These elements we call the (Strassen) support functionals. On oblique tensors over $\mathbb{C}$, the quantum functionals and the support functionals coincide. An advantage of the support functionals over the quantum functionals is that they are defined over any field. In fact, the support functionals are "powerful enough" to reprove the result of Ellenberg–Gijswijt on cap sets [EG17]. We discuss the support functionals in Section 4.4.

1.3 Higher-order CW method

Recall that in the asymptotic restriction problem we have an obstruction direction and a construction direction. The quantum functionals and the support functionals provide obstructions. Now we look at the construction direction. Constructions are asymptotic transformations $s^{\otimes \beta n + o(n)} \geq t^{\otimes n}$. We restrict attention to the case that $t$ is a diagonal tensor $\langle r \rangle$. Constructions in this case essentially correspond to lower bounds on the asymptotic subrank $\underset{\sim}{Q}(s)$. The goal is now to construct good lower bounds on $\underset{\sim}{Q}(s)$.

Strassen solved the problem of computing the asymptotic subrank for so-called tight 3-tensors with the Coppersmith–Winograd (CW) method and the support functionals [CW90, Str91]. The CW method is combinatorial. Let us introduce the combinatorial viewpoint. Let $I_1, \ldots, I_k$ be finite sets. We call a set $D \subseteq I_1 \times \cdots \times I_k$ a diagonal if any two distinct elements $a, b \in D$ differ in all $k$ coordinates. Let $\Phi \subseteq I_1 \times \cdots \times I_k$. We call a diagonal $D \subseteq \Phi$ free if $D = \Phi \cap (D_1 \times \cdots \times D_k)$. Here $D_i = \{a_i : a \in D\}$ is the projection of $D$ onto the $i$th coordinate. The subrank $Q(\Phi)$ of $\Phi$ is the size of the largest free diagonal $D \subseteq \Phi$. For two sets $\Phi \subseteq I_1 \times \cdots \times I_k$ and $\Psi \subseteq J_1 \times \cdots \times J_k$ we define the product $\Phi \times \Psi \subseteq (I_1 \times J_1) \times \cdots \times (I_k \times J_k)$ by $\Phi \times \Psi = \{((a_1, b_1), \ldots, (a_k, b_k)) : a \in \Phi, b \in \Psi\}$. The asymptotic subrank is defined as $\underset{\sim}{Q}(\Phi) = \lim_{n \to \infty} Q(\Phi^{\times n})^{1/n}$. One may think of $\Phi$ as a $k$-partite hypergraph and of a free diagonal in $\Phi$ as an induced $k$-partite matching.
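For very small sets, the definitions of diagonal, free diagonal and combinatorial subrank can be checked by exhaustive search. The sketch below is exponential in the size of $\Phi$ and is meant only to illustrate the definitions; the function names `is_free_diagonal` and `subrank` are hypothetical.

```python
from itertools import combinations


def is_free_diagonal(d, phi):
    """Check that the nonempty subset d of phi is a free diagonal in phi."""
    k = len(d[0])
    # Diagonal: any two distinct elements differ in every coordinate.
    for a, b in combinations(d, 2):
        if any(x == y for x, y in zip(a, b)):
            return False
    # Free: phi intersected with D_1 x ... x D_k contains nothing beyond d.
    projections = [{a[i] for a in d} for i in range(k)]
    box = {a for a in phi if all(a[i] in projections[i] for i in range(k))}
    return box == set(d)


def subrank(phi):
    """Size of the largest free diagonal in phi, by brute force."""
    phi = list(phi)
    for size in range(len(phi), 0, -1):
        for d in combinations(phi, size):
            if is_free_diagonal(d, phi):
                return size
    return 0


print(subrank({(0, 0), (1, 1)}))                   # both elements form a free diagonal
print(subrank({(0, 0), (1, 1), (0, 1)}))           # (0, 1) spoils freeness
print(subrank({(0, 0, 1), (0, 1, 0), (1, 0, 0)}))  # any two elements share a coordinate
```

In the second example, $\{(0,0), (1,1)\}$ is a diagonal but not free, because the extra element $(0,1)$ lies in $D_1 \times D_2$; this is the sense in which a free diagonal is an induced matching.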

How does this combinatorial version of subrank relate to the tensor version of subrank that we defined earlier? Let $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$. Expand $t$ in the standard basis, $t = \sum_{i \in [n_1] \times \cdots \times [n_k]} t_i \, e_{i_1} \otimes \cdots \otimes e_{i_k}$. Let $\mathrm{supp}(t)$ be the support of $t$ in the standard basis, $\mathrm{supp}(t) = \{i \in [n_1] \times \cdots \times [n_k] : t_i \neq 0\}$. Then $Q(\mathrm{supp}(t)) \leq Q(t)$.

We want to construct large free diagonals. Let $\Phi \subseteq I_1 \times \cdots \times I_k$. We call $\Phi$ tight if there are injective maps $\alpha_i : I_i \to \mathbb{Z}$ such that if $a \in \Phi$, then $\sum_{i=1}^{k} \alpha_i(a_i) = 0$. For a set $X$ let $\mathcal{P}(X)$ be the set of probability distributions on $X$. For $\theta \in \mathcal{P}([k])$ let $H_\theta(\Phi) = \max_{P \in \mathcal{P}(\Phi)} \sum_{i=1}^{k} \theta(i) H(P_i)$, where $H(P_i)$ denotes the Shannon entropy of the $i$th marginal distribution of $P$. In [Str91] Strassen used the CW method and the support functionals to characterise the asymptotic subrank $\tilde{Q}(\Phi)$ for tight $\Phi \subseteq I_1 \times I_2 \times I_3$. He proved the following. Let $\Phi \subseteq I_1 \times I_2 \times I_3$ be tight. Then

$$\tilde{Q}(\Phi) = \min_{\theta \in \mathcal{P}([3])} 2^{H_\theta(\Phi)} = \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} 2^{H(P_i)}. \qquad (1.1)$$
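The right-hand side of (1.1) is a maximum over distributions $P$, so evaluating $\min_i 2^{H(P_i)}$ at any single $P$ gives a lower bound on it. The following Python sketch does this for the uniform distribution on the set $\{(1,0,0),(0,1,0),(0,0,1)\}$. This example set is an assumption of the sketch (it is tight via $\alpha_i(0) = -1$, $\alpha_i(1) = 2$ for each $i$), and the sketch does not attempt the maximisation over $P$.

```python
from collections import defaultdict
from math import log2

def marginal_entropies(P):
    """P maps k-tuples to probabilities; returns [H(P_1), ..., H(P_k)] in bits."""
    k = len(next(iter(P)))
    entropies = []
    for i in range(k):
        marginal = defaultdict(float)
        for a, p in P.items():
            marginal[a[i]] += p  # accumulate the ith marginal distribution
        entropies.append(-sum(p * log2(p) for p in marginal.values() if p > 0))
    return entropies

def min_marginal_bound(P):
    """Evaluate min_i 2^{H(P_i)} for one fixed distribution P."""
    return min(2 ** h for h in marginal_entropies(P))

# Illustrative tight set and the uniform distribution on it.
W_SUPPORT = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
uniform = {a: 1 / 3 for a in W_SUPPORT}
value = min_marginal_bound(uniform)
```

Each marginal of the uniform distribution puts weight $1/3$ on the symbol 1 and $2/3$ on the symbol 0, so `value` equals $2^{h(1/3)} = 3 / 2^{2/3}$, where $h$ is the binary entropy function.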

We study the higher-order regime $\Phi \subseteq I_1 \times \cdots \times I_k$, $k \geq 4$.

Theorem (Theorem 5.7). Let $\Phi \subseteq I_1 \times \cdots \times I_k$ be tight. Then $\tilde{Q}(\Phi)$ is lower bounded by an expression that generalizes the right-hand side of (1.1).

Stating the lower bound requires a few definitions, so we do not state it here. It is not known whether our new lower bound matches the upper bound given by quantum or support functionals.

Using Theorem 5.7 we managed to exactly determine the asymptotic subranks of several new examples. These results, in turn, we used to obtain upper bounds on the asymptotic rank of so-called complete graph tensors via a higher-order Strassen laser method.

1.4 Abstract asymptotic spectra

Strassen mainly studied tensors, but he developed an abstract theory of asymptotic spectra in a general setting. In the next section we apply this abstract theory to graphs. We now introduce the abstract theory. One has a semiring $S$ (think of a semiring as a ring without additive inverses) that contains $\mathbb{N}$ and a preorder $\leqslant$ on $S$ that (1) behaves well with respect to the semiring operations, (2) induces the natural order on $\mathbb{N}$, and (3) for any $a, b \in S$, $b \neq 0$, there is an $r \in \mathbb{N} \subseteq S$ with $a \leqslant r \cdot b$. We call such a preorder a Strassen preorder. The main theorem is that the asymptotic version $\lesssim$ of the Strassen preorder is characterised by the monotone semiring homomorphisms $S \to \mathbb{R}_{\geq 0}$. For $a, b \in S$ let $a \lesssim b$ if there is a sequence $(x_n) \in \mathbb{N}^{\mathbb{N}}$ with $x_n^{1/n} \to 1$ when $n \to \infty$ and $a^n \leqslant b^n x_n$ for all $n \in \mathbb{N}$. Let

$$X = X(S, \leqslant) = \{\phi \in \mathrm{Hom}(S, \mathbb{R}_{\geq 0}) : \forall a, b \in S, \; a \leqslant b \Rightarrow \phi(a) \leq \phi(b)\}.$$

The set $X$ is called the asymptotic spectrum of $(S, \leqslant)$.

Theorem (Strassen). $a \lesssim b$ iff $\forall \phi \in X$: $\phi(a) \leq \phi(b)$.

Strassen applies this theorem to study rank and subrank of tensors. We define an abstract notion of rank $R(a) = \min\{n \in \mathbb{N} : a \leqslant n\}$ and an abstract notion of subrank $Q(a) = \max\{m \in \mathbb{N} : m \leqslant a\}$. We then naturally have an asymptotic rank $\tilde{R}(a) = \lim_{n \to \infty} R(a^n)^{1/n}$ and (under certain mild conditions) an asymptotic subrank $\tilde{Q}(a) = \lim_{n \to \infty} Q(a^n)^{1/n}$. In fact, $\tilde{R}(a) = \inf_n R(a^n)^{1/n}$ and $\tilde{Q}(a) = \sup_n Q(a^n)^{1/n}$ by Fekete's lemma. The theorem implies the following dual characterisations.


Corollary (Section 2.8). If $a \in S$ with $a^k \geqslant 2$ for some $k \in \mathbb{N}$, then

$$\tilde{Q}(a) = \min_{\phi \in X} \phi(a).$$

If $a \in S$ with $\phi(a) \geq 1$ for some $\phi \in X$, then

$$\tilde{R}(a) = \max_{\phi \in X} \phi(a).$$

In Chapter 2 we will discuss the abstract theory of asymptotic spectra. We will discuss a proof of the above theorem that is obtained by integrating the proofs of Strassen in [Str88] and the proof of the Kadison–Dubois theorem of Becker and Schwarz in [BS83]. We will also discuss some basic properties of general asymptotic spectra.

1.5 The asymptotic spectrum of graphs

In the previous section we have seen the abstract theory of asymptotic spectra. We now discuss a problem in graph theory where we can apply this abstract theory. Consider a communication channel with input alphabet $\{a, b, c, d, e\}$ and output alphabet $\{1, 2, 3, 4, 5\}$. When the sender gives an input to the channel, the receiver gets an output according to the following diagram, where an outgoing arrow is picked randomly (say uniformly randomly).

[Figure: the channel, drawn as arrows from the inputs a, b, c, d, e to the outputs 1, 2, 3, 4, 5.]

Output 2 has an incoming arrow from $a$ and an incoming arrow from $b$. We say $a$ and $b$ are confusable, because the receiver cannot know whether $a$ or $b$ was given as an input to the channel. In this channel the pairs of inputs $\{a, b\}, \{b, c\}, \{c, d\}, \{d, e\}, \{e, a\}$ are confusable. If we restrict the input set to a subset of pairwise non-confusable letters, say $\{a, c\}$, then we can use the channel to communicate two messages with zero error. It is clear that for this channel any non-confusable set of inputs has size at most two. Can we make better use of the channel if we use the channel twice? Yes: now the input set is the set of two-letter words $\{aa, ab, ac, ad, ae, ba, bb, \ldots\}$ and we have a set of pairwise non-confusable words $\{aa, bc, ce, db, ed\}$, which has size 5. Thus "per channel use" we can send at least $\sqrt{5}$ letters. What happens if we use the channel $n$ times?
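The two claims above (at most two pairwise non-confusable letters, and five pairwise non-confusable two-letter words) can be checked by exhaustive search. The sketch below is illustrative only: encoding the letters $a, \ldots, e$ as $0, \ldots, 4$ and all function names are assumptions, and the branching search is only meant for small graphs like these.

```python
from itertools import combinations, product

def confusable(u, v, n=5):
    """Letters are confusable (or equal) when they are at cyclic distance <= 1."""
    return (u - v) % n in (0, 1, n - 1)

def letters_adjacent(u, v):
    """Edge relation of the 5-cycle on the letters 0..4."""
    return u != v and confusable(u, v)

def words_adjacent(x, y):
    """Distinct words are confusable when every position is equal or confusable."""
    return x != y and all(confusable(a, b) for a, b in zip(x, y))

def independence_number(vertices, adjacent):
    """Exact size of a largest independent set, by simple branching."""
    if not vertices:
        return 0
    v, rest = vertices[0], vertices[1:]
    without_v = independence_number(rest, adjacent)
    with_v = 1 + independence_number([u for u in rest if not adjacent(u, v)], adjacent)
    return max(without_v, with_v)

# The words aa, bc, ce, db, ed with a..e encoded as 0..4.
words = [(0, 0), (1, 2), (2, 4), (3, 1), (4, 3)]
words_ok = all(not words_adjacent(x, y) for x, y in combinations(words, 2))

one_use = independence_number(list(range(5)), letters_adjacent)
two_uses = independence_number(list(product(range(5), repeat=2)), words_adjacent)
```

Here `one_use` is 2 and `two_uses` is 5, so five words is also the best possible for two uses of this channel.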


The situation is concisely described by drawing the confusability graph of the channel, which has the input letters as vertices and the confusable pairs of input letters as edges. For the above channel the confusability graph is the 5-cycle $C_5$.

[Figure: the 5-cycle $C_5$ on the vertices a, b, c, d, e.]

A subset of inputs that are pairwise non-confusable corresponds to a subset of the vertices in the confusability graph that contains no edges: an independent set. The independence number of any graph $G$ is the size of the largest independent set in $G$ and is denoted by $\alpha(G)$. If $G$ is the confusability graph of some channel, then the confusability graph for using the channel $n$ times is denoted by $G^{\boxtimes n}$ (the graph product $\boxtimes$ is called the strong graph product). The question of how many letters we can send asymptotically translates to computing the limit

$$\Theta(G) = \lim_{n \to \infty} \alpha(G^{\boxtimes n})^{1/n},$$

which exists because $\alpha$ is supermultiplicative under $\boxtimes$. The parameter $\Theta(G)$ was introduced by Shannon [Sha56] and is called the Shannon capacity of the graph $G$. Computing the Shannon capacity is a nontrivial problem already for small graphs. Lovász in 1979 [Lov79] computed the value $\Theta(C_5) = \sqrt{5}$ by introducing and evaluating a new graph parameter $\vartheta$, which is now known as the Lovász theta number. Already for the 7-cycle $C_7$ the Shannon capacity is not known.

Duality theorem. We propose a new application of the abstract theory of asymptotic spectra to graph theory. The main theorem that results from this is a dual characterisation of the Shannon capacity of graphs. For graphs $G$ and $H$ we say $G \leqslant H$ if there is a graph homomorphism $\overline{G} \to \overline{H}$, i.e. from the complement of $G$ to the complement of $H$. We show graphs are a semiring under the strong graph product $\boxtimes$ and the disjoint union $\sqcup$, and $\leqslant$ is a Strassen preorder on this semiring. The rank in this setting is the clique cover number $\overline{\chi}(\cdot) = \chi(\overline{\,\cdot\,})$, i.e. the chromatic number of the complement. The subrank in this setting is the independence number $\alpha(\cdot)$. Let $X(\mathcal{G})$ be the set of semiring homomorphisms from graphs to $\mathbb{R}_{\geq 0}$ that are monotone under $\leqslant$. From the abstract theory of asymptotic spectra we derive the following duality theorem.

Theorem (Theorem 3.1). $\Theta(G) = \min_{\phi \in X(\mathcal{G})} \phi(G)$.

In Chapter 3 we will prove Theorem 3.1 and we will discuss the known elements in $X(\mathcal{G})$, which are the Lovász theta number and a family of parameters obtained by "fractionalising".


1.6 Tensor degeneration

We move to the second story line that we mentioned earlier: degeneration. Degeneration is a prominent theme in algebraic complexity theory. Roughly speaking, degeneration is an algebraic notion of approximation defined via orbit closures.

For tensors, for example, degeneration is defined as follows. Let $V_1, V_2, V_3$ be finite-dimensional complex vector spaces and let $V = V_1 \otimes V_2 \otimes V_3$ be the tensor product space. Let $G = \mathrm{GL}(V_1) \times \mathrm{GL}(V_2) \times \mathrm{GL}(V_3)$ act naturally on $V$. Let $s, t \in V$. Let $G \cdot t = \{g \cdot t : g \in G\}$ be the orbit of $t$ under $G$. We say $t$ degenerates to $s$, and write $t \trianglerighteq s$, if $s$ is an element in the orbit closure $\overline{G \cdot t}$. Here the closure is taken with respect to the Zariski topology, or equivalently with respect to the Euclidean topology. One should think of this degeneration as a topologically closed version of the restriction preorder $\leqslant$ for tensors that we defined earlier. Degeneration is a "larger" preorder than restriction in the sense that $s \leqslant t$ implies $s \trianglelefteq t$.

In several algebraic models of computation, approximative computations correspond to certain degenerations. In some models such an approximative computation can be turned into an exact computation at a small cost, for example using the method of interpolation. The currently fastest matrix multiplication algorithms are constructed in this way, for example.

On the other hand, it turns out that if a lower bound technique for an algebraic measure of complexity is "continuous", then the lower bounds obtained with this technique are already lower bounds on the approximative version of the complexity measure. This observation turns approximative complexity and degeneration into an interesting topic itself. A research program in this direction is the geometric complexity theory program of Mulmuley and Sohoni towards separating the algebraic complexity class VP (and related classes) from VNP [MS01] (see also [Ike13]).

In this section we briefly discuss three results related to degeneration of tensors that are not discussed further in this dissertation. Then we will discuss results on combinatorial degeneration in Section 1.7 and algebraic branching program degeneration in Section 1.8.

Ratio of tensor rank and border rank. The approximative or degeneration version of tensor rank is called border rank and is denoted by $\underline{R}$. It has been known since the work of Bini and Strassen that tensor rank $R$ and border rank $\underline{R}$ are different. How much can they differ? In [Zui17] we showed the following lower bound. Let $k \geq 3$. There is a sequence of $k$-tensors $t_n$ in $(\mathbb{C}^{2^n})^{\otimes k}$ such that $R(t_n) / \underline{R}(t_n) \geq k - o(1)$ when $n \to \infty$. This answers a question of Landsberg and Michałek [LM16b] and disproves a conjecture of Rhodes [AJRS13]. Further progress will most likely require the construction of explicit tensors with high tensor rank, which has implications in formula complexity [Raz13].

Border support rank. Support rank is a variation on tensor rank, which has its own approximative version called border support rank. A border support rank upper bound for the matrix multiplication tensor yields an upper bound on the asymptotic complexity. This was shown by Cohn and Umans in the context of the group theoretic approach towards fast matrix multiplication [CU13]. They asked: what is the border support rank of the smallest matrix multiplication tensor $\langle 2, 2, 2 \rangle$? In [BCZ17a] we showed that it equals seven. Our proof uses the highest-weight vector technique (see also [HIL13]). Our original motivation to study support rank is a connection that we found between support rank and nondeterministic multiparty quantum communication complexity [BCZ17b].

Tensor rank under outer tensor product. We applied degeneration as a tool to study an outer tensor product on tensors. For $s \in \mathbb{C}^{n_1} \otimes \cdots \otimes \mathbb{C}^{n_k}$ and $t \in \mathbb{C}^{m_1} \otimes \cdots \otimes \mathbb{C}^{m_\ell}$, let the outer tensor product of $s$ and $t$ be the natural $(k + \ell)$-tensor in $\mathbb{C}^{n_1} \otimes \cdots \otimes \mathbb{C}^{n_k} \otimes \mathbb{C}^{m_1} \otimes \cdots \otimes \mathbb{C}^{m_\ell}$. The outer tensor product and the tensor product $\otimes$ of $k$-tensors differ by a regrouping of the tensor indices. It is well known that tensor rank is not multiplicative under $\otimes$. In [CJZ18] we showed that tensor rank is already not multiplicative under the outer tensor product, a stronger result. Nonmultiplicativity occurs when taking a power of a tensor whose border rank is strictly smaller than its tensor rank. This answers a question of Draisma [Dra15] and Saptharishi et al. [CKSV16].

1.7 Combinatorial degeneration

In the previous section we introduced the general idea of degeneration and discussed degeneration of tensors. Combinatorial degeneration is the combinatorial analogue of tensor degeneration. Consider sets $\Phi \subseteq \Psi \subseteq I_1 \times \cdots \times I_k$ of $k$-tuples. We say $\Phi$ is a combinatorial degeneration of $\Psi$, and write $\Psi \trianglerighteq \Phi$, if there are maps $u_i : I_i \to \mathbb{Z}$ such that for all $\alpha \in I_1 \times \cdots \times I_k$: if $\alpha \in \Psi \setminus \Phi$, then $\sum_{i=1}^{k} u_i(\alpha_i) > 0$, and if $\alpha \in \Phi$, then $\sum_{i=1}^{k} u_i(\alpha_i) = 0$. We prove that combinatorial asymptotic subrank is nonincreasing under combinatorial degeneration.

Theorem (Theorem 5.21). If $\Psi \trianglerighteq \Phi$, then $\tilde{Q}(\Psi) \geq \tilde{Q}(\Phi)$.

The analogous statement for subrank of tensors is trivially true. The crucial point is that Theorem 5.21 is about combinatorial subrank. As an example, Theorem 5.21 combined with the CW method yields an elegant optimal construction of tri-colored sum-free sets, which are combinatorial objects related to cap sets.
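The definition of combinatorial degeneration can be checked mechanically once candidate maps $u_i$ are given. The sketch below verifies such a certificate; it does not search for one. The function name and the small example (an upper-triangular set degenerating to its diagonal) are assumptions made for illustration.

```python
def is_combinatorial_degeneration(Psi, Phi, u):
    """Check that the maps u = [u_1, ..., u_k] (dicts I_i -> int) certify
    that Phi is a combinatorial degeneration of Psi."""
    Psi, Phi = set(Psi), set(Phi)
    if not Phi <= Psi:
        return False
    for alpha in Psi:
        total = sum(u_i[a_i] for u_i, a_i in zip(u, alpha))
        if alpha in Phi and total != 0:
            return False   # the sum must vanish on Phi
        if alpha not in Phi and total <= 0:
            return False   # the sum must be strictly positive on Psi \ Phi
    return True

# Illustrative example with k = 2.
Psi = {(0, 0), (0, 1), (1, 1)}
Phi = {(0, 0), (1, 1)}
u = [{0: 0, 1: -1}, {0: 0, 1: 1}]   # sums: (0,0) -> 0, (1,1) -> 0, (0,1) -> 1
zero = [{0: 0, 1: 0}, {0: 0, 1: 0}]  # fails: the sum is 0 on (0,1)
```

With these maps the certificate `u` is accepted and `zero` is rejected, since the element $(0,1) \in \Psi \setminus \Phi$ needs a strictly positive sum.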

1.8 Algebraic branching program degeneration

We now consider degeneration in the context of algebraic branching programs. A central theme in algebraic complexity theory is the study of the power of different algebraic models of computation and the study of the corresponding complexity classes. We have already (implicitly) used an algebraic model of computation when we discussed matrix multiplication: circuits.

• A circuit is a directed acyclic graph $G$ with one or more source vertices and one sink vertex. Each source vertex is labelled by a variable $x_i$ or a constant $\alpha \in \mathbb{F}$. The other vertices are labelled by either $+$ or $\times$ and have in-degree 2 (that is, fan-in 2). Each vertex of $G$ naturally computes a polynomial. The value of $G$ is the element computed at the sink vertex. The size of $G$ is the number of vertices. (One may also allow multiple sink vertices in order to compute multiple polynomials, e.g. to compute matrix multiplication.) Here is an example of a circuit computing $xy + 2x + y - 1$.

[Figure: a circuit with source vertices $-1$, $2$, $x$, $y$, two $\times$ vertices, two $+$ vertices, and a $+$ sink vertex.]

Consider the following two models.

• A formula is a circuit whose graph is a tree.

• An algebraic branching program (abp) is a directed acyclic graph $G$ with one source vertex $s$, one sink vertex $t$, and affine linear forms over the base field $\mathbb{F}$ as edge labels. Moreover, each vertex is labeled with an integer (its layer) and the arrows in the abp point from vertices in layer $i$ to vertices in layer $i + 1$. The cardinality of the largest layer we call the width of the abp. The number of vertices we call the size of the abp. The value of an abp is the sum of the values of all $s$–$t$-paths, where the value of an $s$–$t$-path is the product of its edge labels. We say that an abp computes its value. Here is an example of a width-3 abp computing $xy + 2x + y - 1$.

[Figure: a width-3 abp with source $s$, sink $t$, and edge labels including $x$, $2$, $y$ and $-1$.]
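The value of an abp, the sum over all $s$–$t$-paths of the product of the edge labels, can be computed layer by layer. The Python sketch below evaluates a layered abp at a point. The data layout, the function names, and the particular abp are assumptions of this sketch: the abp is a hypothetical width-2 program built from $(x+1)(y+2) - 3 = xy + 2x + y - 1$, not the width-3 abp of the figure.

```python
def eval_affine(form, point):
    """form = (constant, {variable: coefficient}); evaluate it at point."""
    const, coeffs = form
    return const + sum(c * point[v] for v, c in coeffs.items())

def eval_abp(layers, point):
    """layers[i] maps (vertex in layer i, vertex in layer i+1) to an affine form.
    Returns the sum over all s-t paths of the product of the edge labels."""
    values = {"s": 1}  # partial path sums for the vertices of the current layer
    for edges in layers:
        nxt = {}
        for (u, v), form in edges.items():
            if u in values:
                nxt[v] = nxt.get(v, 0) + values[u] * eval_affine(form, point)
        values = nxt
    return values.get("t", 0)

# Hypothetical width-2 abp (layers of sizes 1, 2, 1).
layers = [
    {("s", "v1"): (1, {"x": 1}),    # edge label x + 1
     ("s", "v2"): (-3, {})},        # edge label -3
    {("v1", "t"): (2, {"y": 1}),    # edge label y + 2
     ("v2", "t"): (1, {})},         # edge label 1
]

def target(x, y):
    return x * y + 2 * x + y - 1
```

For instance, `eval_abp(layers, {"x": 2, "y": 3})` agrees with `target(2, 3)`. The layer-by-layer update is the same as multiplying the matrices of edge labels between consecutive layers.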


The above models of computation give rise to complexity classes. A complexity class consists of families of multivariate polynomials $(f_n)_n = (f(x_1, \ldots, x_{q_n})_n)_{n \in \mathbb{N}}$ over some fixed field $\mathbb{F}$. We say a family of polynomials $(f_n)_n$ is a p-family if the degree of $f_n$ and the number of variables of $f_n$ grow polynomially in $n$. Let $\mathrm{VP}$ be the class of p-families with polynomially bounded circuit size. Let $\mathrm{VP}_e$ be the class of p-families with polynomially bounded formula size. For $k \in \mathbb{N}$ let $\mathrm{VP}_k$ be the class of families of polynomials computable by width-$k$ abps of polynomially bounded size. Let $\mathrm{VP}_s$ be the class of p-families computable by skew circuits of polynomial size. Skew circuits are a type of circuits between formulas and general circuits. The class $\mathrm{VP}_s$ coincides with the class of families of polynomials computable by abps of polynomially bounded size (see e.g. [Sap16]). Ben-Or and Cleve proved that $\mathrm{VP}_3 = \mathrm{VP}_4 = \cdots = \mathrm{VP}_e$ [BOC92]. Allender and Wang proved $\mathrm{VP}_2 \subsetneq \mathrm{VP}_3$ [AW16]. Thus $\mathrm{VP}_2 \subsetneq \mathrm{VP}_3 = \mathrm{VP}_4 = \cdots = \mathrm{VP}_e \subseteq \mathrm{VP}_s$. The following separation problem is one of the many open problems regarding algebraic complexity classes: is the inclusion $\mathrm{VP}_e \subseteq \mathrm{VP}_s$ strict? Motivated by this separation problem, we study the approximation closure of $\mathrm{VP}_e$. We mentioned that Ben-Or and Cleve proved that formula size is polynomially equivalent to width-3 abp size [BOC92]. Regarding width 2, there are explicit polynomials that cannot be computed by any width-2 abp of any size [AW16]. The abp model has a natural notion of approximation. When we allow approximation in our abps, the situation changes completely.

Theorem (Theorem 7.8). Any polynomial can be approximated by a width-2 abp of size polynomial in the formula size.

In terms of complexity classes this means $\overline{\mathrm{VP}_2} = \overline{\mathrm{VP}_e}$, where $\overline{\,\cdot\,}$ denotes the "approximation closure" of the complexity class. The theorem suggests an approach regarding the separation of $\mathrm{VP}_e$ and $\mathrm{VP}_s$. Namely, superpolynomial lower bounds on formula size may be obtained from superpolynomial lower bounds on approximate width-2 abp size. We moreover study the nondeterminism closure of complexity classes and prove a new characterisation of the complexity class $\mathrm{VNP}$.

1.9 Organisation

This dissertation is divided into chapters as follows. We will begin with the abstract theory of asymptotic spectra in Chapter 2. Then we introduce the asymptotic spectra of graphs and a new characterisation of the Shannon capacity in Chapter 3. In Chapter 4 we introduce the asymptotic spectrum of tensors, discuss the support functionals of Strassen for oblique tensors, and a characterisation of asymptotic slice rank of oblique tensors as the minimum over the support functionals. In Chapter 5 we discuss tight tensors, the higher-order Coppersmith–Winograd method, the combinatorial degeneration method, and applications to the cap set problem, type sets and graph tensors. In Chapter 6 we introduce an infinite family of elements in the asymptotic spectrum of complex $k$-tensors and characterise the asymptotic slice rank as the minimum over the quantum functionals. Finally, in Chapter 7 we study algebraic branching programs, and approximation closure and nondeterminism closure of algebraic complexity classes.

Chapter 2

The theory of asymptotic spectra

2.1 Introduction

This is an expository chapter about the abstract theory of asymptotic spectra of Volker Strassen [Str88]. The theory studies semirings $S$ that are endowed with a preorder $\leqslant$. The main result, Theorem 2.12, is that under certain conditions the asymptotic version $\lesssim$ of this preorder is characterised by the semiring homomorphisms $S \to \mathbb{R}_{\geq 0}$ that are monotone under $\leqslant$. These monotone homomorphisms make up the "asymptotic spectrum" of $(S, \leqslant)$. For the elements of $S$ we have natural notions of rank and subrank, generalising rank and subrank of tensors. The asymptotic spectrum gives a dual characterisation of the asymptotic versions of rank and subrank. This dual description may be thought of as a "lower bound" method in the sense of computational complexity theory. In Chapter 3 and Chapter 4 we will study two specific pairs $(S, \leqslant)$.

2.2 Semirings and preorders

A (commutative) semiring is a set $S$ with a binary addition operation $+$, a binary multiplication operation $\cdot$, and elements $0, 1 \in S$, such that for all $a, b, c \in S$:

(1) $+$ is associative: $(a + b) + c = a + (b + c)$

(2) $+$ is commutative: $a + b = b + a$

(3) $0 + a = a$

(4) $\cdot$ is associative: $(a \cdot b) \cdot c = a \cdot (b \cdot c)$

(5) $\cdot$ is commutative: $a \cdot b = b \cdot a$

(6) $1 \cdot a = a$

(7) $\cdot$ distributes over $+$: $a \cdot (b + c) = (a \cdot b) + (a \cdot c)$

(8) $0 \cdot a = 0$.

As usual we abbreviate $a \cdot b$ as $ab$. A preorder is a relation $\preccurlyeq$ on a set $X$ such that for all $a, b, c \in X$:

(1) $\preccurlyeq$ is reflexive: $a \preccurlyeq a$

(2) $\preccurlyeq$ is transitive: $a \preccurlyeq b$ and $b \preccurlyeq c$ implies $a \preccurlyeq c$.

As usual, $a \preccurlyeq b$ is the same as $b \succcurlyeq a$. Let $\mathbb{N} = \{0, 1, 2, \ldots\}$ be the set of natural numbers and let $\mathbb{N}_{>0} = \{1, 2, \ldots\}$ be the set of strictly positive natural numbers. We write $\leq$ for the natural order $0 \leq 1 \leq 2 \leq 3 \leq \cdots$ on $\mathbb{N}$.

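2.3 Strassen preorders

Let $S$ be a semiring with $\mathbb{N} \subseteq S$. A preorder $\preccurlyeq$ on $S$ is a Strassen preorder if

(1) $\forall n, m \in \mathbb{N}$: $n \leq m$ iff $n \preccurlyeq m$

(2) $\forall a, b, c, d \in S$: if $a \preccurlyeq b$ and $c \preccurlyeq d$, then $a + c \preccurlyeq b + d$ and $ac \preccurlyeq bd$

(3) $\forall a, b \in S$, $b \neq 0$, $\exists r \in \mathbb{N}$: $a \preccurlyeq rb$.

Note that condition (2) is equivalent to the condition: $\forall a, b, s \in S$, if $a \preccurlyeq b$, then $a + s \preccurlyeq b + s$ and $as \preccurlyeq bs$.

Let $\preccurlyeq$ be a Strassen preorder on $S$. Then $0 \preccurlyeq 1$ by condition (1). For $a \in S$ we have $a \preccurlyeq a$ by reflexivity, and thus $0 \preccurlyeq a$ by condition (2).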

Examples

We give two examples of a semiring with a Strassen preorder. Proofs and formal definitions are given later.

Graphs. Let $S$ be the set of all (isomorphism classes of) finite simple graphs. Let $G, H \in S$. Let $G \sqcup H$ be the disjoint union of $G$ and $H$. Let $G \boxtimes H$ be the strong graph product of $G$ and $H$ (see Chapter 3). With addition $\sqcup$ and multiplication $\boxtimes$ the set $S$ becomes a semiring. The 0 in $S$ is the graph with no vertices and the 1 in $S$ is the graph with a single vertex. Let $\overline{G}$ be the complement of $G$. Define a preorder $\leqslant$ on $S$ by $G \leqslant H$ if there is a graph homomorphism $\overline{G} \to \overline{H}$. Then $\leqslant$ is a Strassen preorder. We will investigate this semiring further in Chapter 3.
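As a sanity check on the graph preorder, one can test $G \leqslant H$ by brute-force search for a homomorphism between the complements. In this semiring the natural number $r$ is the graph on $r$ vertices without edges, so the largest $r$ with $r \leqslant G$ is the independence number and the smallest $r$ with $G \leqslant r$ is the clique cover number. The sketch below computes both for the 5-cycle. The graph representation `(number of vertices, edge list)` and all function names are assumptions, and the search is exponential, so it is only meant for tiny graphs.

```python
from itertools import combinations, product

def complement(n, edges):
    """Edge set of the complement of a simple graph on vertices 0..n-1."""
    present = {frozenset(e) for e in edges}
    return {frozenset(e) for e in combinations(range(n), 2)} - present

def has_homomorphism(n, edges_g, m, edges_h):
    """Brute force: is there a map {0..n-1} -> {0..m-1} sending edges to edges?"""
    for f in product(range(m), repeat=n):
        if all(frozenset((f[u], f[v])) in edges_h for u, v in map(tuple, edges_g)):
            return True
    return False

def leq(g, h):
    """G <= H iff there is a homomorphism from complement(G) to complement(H)."""
    (n, eg), (m, eh) = g, h
    return has_homomorphism(n, complement(n, eg), m, complement(m, eh))

def empty_graph(r):
    """The natural number r as a graph: r vertices, no edges."""
    return (r, [])

def subrank(g):
    return max(r for r in range(g[0] + 1) if leq(empty_graph(r), g))

def rank(g):
    return min(r for r in range(g[0] + 1) if leq(g, empty_graph(r)))

C5 = (5, [(i, (i + 1) % 5) for i in range(5)])
```

For the 5-cycle this gives subrank 2 and rank 3: the complement of $C_5$ is again a 5-cycle, which contains an edge but no triangle, and is 3-colourable but not 2-colourable.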


Tensors. Let $\mathbb{F}$ be a field. Let $k \in \mathbb{N}$. Let $S$ be the set of all $k$-tensors over $\mathbb{F}$ with arbitrary format, that is, $S = \bigcup \{\mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k} : n_1, \ldots, n_k \in \mathbb{N}\}$. For $s \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ and $t \in \mathbb{F}^{m_1} \otimes \cdots \otimes \mathbb{F}^{m_k}$, let $s \leqslant t$ if there are linear maps $A_i : \mathbb{F}^{m_i} \to \mathbb{F}^{n_i}$ with $(A_1 \otimes \cdots \otimes A_k) t = s$. We identify any $s, t \in S$ for which $s \leqslant t$ and $t \leqslant s$. Let $\oplus$ be the direct sum of $k$-tensors and let $\otimes$ be the tensor product of $k$-tensors (see Chapter 4). With addition $\oplus$ and multiplication $\otimes$ the set $S$ becomes a semiring. The 0 in $S$ is the zero tensor and the 1 in $S$ is the standard basis element $e_1 \otimes \cdots \otimes e_1 \in \mathbb{F}^1 \otimes \cdots \otimes \mathbb{F}^1$. The preorder $\leqslant$ is a Strassen preorder. We will investigate this semiring further in Chapter 4, Chapter 5 and Chapter 6.

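2.4 Asymptotic preorders $\precsim$

Definition 2.1. Let $\preccurlyeq$ be a relation on $S$. Define the relation $\precsim$ on $S$ by

$$a_2 \precsim a_1 \;\text{ if }\; \exists (x_N) \in \mathbb{N}^{\mathbb{N}}, \; \inf_N x_N^{1/N} = 1, \; \forall N \in \mathbb{N}: \; a_2^N \preccurlyeq a_1^N x_N. \qquad (2.1)$$

If $\preccurlyeq$ is a Strassen preorder, then we may in (2.1) replace the infimum $\inf_N x_N^{1/N}$ by the limit $\lim_{N \to \infty} x_N^{1/N}$, since we may assume $x_{N+M} \leq x_N x_M$ (if $a_2^N \preccurlyeq a_1^N x_N$ and $a_2^M \preccurlyeq a_1^M x_M$, then $a_2^{N+M} \preccurlyeq a_1^{N+M} x_N x_M$) and then apply Fekete's lemma (Lemma 2.2).

Lemma 2.2 (Fekete's lemma, see [PS98, No. 98]). Let $x_1, x_2, x_3, \ldots \in \mathbb{R}_{\geq 0}$ satisfy $x_{n+m} \leq x_n + x_m$. Then $\lim_{n \to \infty} x_n / n = \inf_n x_n / n$.

Proof. Let $y = \inf_n x_n / n$. Let $\varepsilon > 0$. Let $m \in \mathbb{N}_{>0}$ with $x_m / m < y + \varepsilon$. Any $n \in \mathbb{N}$ can be written in the form $n = qm + r$ where $r$ is an integer $0 \leq r \leq m - 1$. Set $x_0 = 0$. Then $x_n = x_{qm+r} \leq x_m + x_m + \cdots + x_m + x_r = q x_m + x_r$. Therefore

$$\frac{x_n}{n} = \frac{x_{qm+r}}{qm + r} \leq \frac{q x_m + x_r}{qm + r} = \frac{x_m}{m} \cdot \frac{qm}{qm + r} + \frac{x_r}{n}.$$

Thus

$$y \leq \frac{x_n}{n} < (y + \varepsilon) \frac{qm}{n} + \frac{x_r}{n}.$$

The claim follows because $x_r / n \to 0$ and $qm / n \to 1$ when $n \to \infty$.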
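A small numerical illustration of Fekete's lemma may help. The sequence $x_n = 2n + \sqrt{n}$ used below is an assumption chosen for this sketch (it is subadditive because $\sqrt{n+m} \leq \sqrt{n} + \sqrt{m}$), and the ratio $x_n / n = 2 + 1/\sqrt{n}$ decreases towards its infimum 2. The multiplicative condition $x_{N+M} \leq x_N x_M$ used in the text reduces to this additive form after taking logarithms.

```python
from math import sqrt

def x(n):
    """An illustrative subadditive sequence: x_n = 2n + sqrt(n)."""
    return 2 * n + sqrt(n)

def is_subadditive(seq, upto, tol=1e-9):
    """Check seq(n + m) <= seq(n) + seq(m) for all 1 <= n, m <= upto."""
    return all(seq(n + m) <= seq(n) + seq(m) + tol
               for n in range(1, upto + 1) for m in range(1, upto + 1))

# Ratios x_n / n along a few values of n; they approach the infimum 2.
ratios = [x(n) / n for n in (1, 10, 100, 10_000, 1_000_000)]
```

The ratios are $3$, about $2.316$, $2.1$, $2.01$ and $2.001$, consistent with $\lim_n x_n/n = \inf_n x_n/n = 2$.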

For $a_1, a_2 \in S$, if $a_1 \preccurlyeq a_2$, then clearly $a_1 \precsim a_2$.

Lemma 2.3. Let $\preccurlyeq$ be a Strassen preorder on $S$. Then $\precsim$ is a Strassen preorder on $S$, the "asymptotic preorder" corresponding to $\preccurlyeq$.

Proof. Let $a, b, c, d \in S$. We verify that $\precsim$ is a preorder. First, reflexivity. We have $a \preccurlyeq a$, so $a^N \preccurlyeq a^N \cdot 1$, so $a \precsim a$.

Second, transitivity. Let $a \precsim b$ and $b \precsim c$. This means $a^N \preccurlyeq b^N x_N$ and $b^N \preccurlyeq c^N y_N$ with $x_N^{1/N} \to 1$ and $y_N^{1/N} \to 1$. Then $a^N \preccurlyeq b^N x_N \preccurlyeq c^N x_N y_N$. Since $(x_N y_N)^{1/N} \to 1$, we conclude $a \precsim c$.

We verify condition (1). Let $n, m \in \mathbb{N}$. If $n \leq m$, then $n \preccurlyeq m$, so $n \precsim m$. If $n \precsim m$, then $n^N \preccurlyeq m^N x_N$, so $n^N \leq m^N x_N$, which implies $n \leq m$.

We verify condition (2). Let $a \precsim b$ and $c \precsim d$. This means $a^N \preccurlyeq b^N x_N$ and $c^N \preccurlyeq d^N y_N$. Thus $a^N c^N \preccurlyeq b^N d^N x_N y_N$, and so $ac \precsim bd$. Assume $x_N$ and $y_N$ are nondecreasing (otherwise set $x_N = \max_{n \leq N} x_n$). Then

$$(a + c)^N = \sum_{m=0}^{N} \binom{N}{m} a^m c^{N-m} \preccurlyeq \sum_{m=0}^{N} \binom{N}{m} b^m d^{N-m} x_m y_{N-m} \preccurlyeq \sum_{m=0}^{N} \binom{N}{m} b^m d^{N-m} x_N y_N = (b + d)^N x_N y_N.$$

Thus $a + c \precsim b + d$.

We verify (3). Let $a, b \in S$, $b \neq 0$. Then there is an $r \in \mathbb{N}$ with $a \preccurlyeq rb$, and thus $a \precsim rb$.

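Lemma 2.4. Let $\preccurlyeq$ be a Strassen preorder on $S$. Let $a_1, a_2, b \in S$.

(i) If $a_2 + b \preccurlyeq a_1 + b$, then $a_2 \precsim a_1$.

(ii) If $a_2 b \preccurlyeq a_1 b$ with $b \neq 0$, then $a_2 \precsim a_1$.

(iii) If $a_2 \precsim_{\sim} a_1$ (the asymptotic preorder of $\precsim$), then $a_2 \precsim a_1$.

(iv) If $\exists s \in S$ $\forall n \in \mathbb{N}$: $n a_2 \preccurlyeq n a_1 + s$, then $a_2 \precsim a_1$.

Proof. (ii) Let $a_2 b \preccurlyeq a_1 b$. By an inductive argument similar to the argument we used to prove (2.4),

$$\forall N \in \mathbb{N}: \; a_2^N b \preccurlyeq a_1^N b. \qquad (2.2)$$

Let $m, r \in \mathbb{N}$ with $1 \preccurlyeq mb \preccurlyeq r$. (We use $b \neq 0$.) From (2.2) follows

$$\forall N \in \mathbb{N}: \; a_2^N \preccurlyeq a_2^N m b \preccurlyeq a_1^N m b \preccurlyeq a_1^N r.$$

Thus we conclude $a_2 \precsim a_1$.

(iii) Let $a_2 \precsim_{\sim} a_1$. This means $a_2^N \precsim a_1^N x_N$ with $x_N^{1/N} \to 1$. This in turn means that $(a_2^N)^M \preccurlyeq (a_1^N x_N)^M y_{N,M}$ with $\forall N \in \mathbb{N}$: $y_{N,M}^{1/M} \to 1$, that is,

$$a_2^{NM} \preccurlyeq a_1^{NM} x_N^M y_{N,M}.$$

Choose a sequence $N \mapsto M_N$ such that $(y_{N,M_N})^{1/M_N} \leq 2$, e.g. given $N$ let $M_N$ be the smallest $M$ for which $(y_{N,M})^{1/M} \leq 2$. Then $a_2^{N M_N} \preccurlyeq a_1^{N M_N} x_N^{M_N} y_{N,M_N}$ and

$$(x_N^{M_N} y_{N,M_N})^{1/(N M_N)} = x_N^{1/N} (y_{N,M_N})^{1/(N M_N)} \leq x_N^{1/N} 2^{1/N} \to 1.$$

We conclude $a_2 \precsim a_1$.

(iv) Let $s \in S$ with $\forall n \in \mathbb{N}$: $n a_2 \preccurlyeq n a_1 + s$. We may assume $a_1 \neq 0$. Let $k \in \mathbb{N}$ with $s \preccurlyeq k a_1$. Then

$$\forall n \in \mathbb{N}: \; k n a_2 \preccurlyeq k n a_1 + k a_1 = k a_1 (n + 1). \qquad (2.3)$$

Apply (ii) to (2.3) to get

$$\forall n \in \mathbb{N}: \; a_2 n \precsim a_1 (n + 1).$$

By an inductive argument,

$$\forall N \in \mathbb{N}: \; a_2^N \precsim a_2^{N-1} a_1 \cdot 2 \precsim a_2^{N-2} a_1^2 \cdot 3 \precsim \cdots \precsim a_1^N (N + 1).$$

Since $(N + 1)^{1/N} \to 1$, we have $a_2 \precsim_{\sim} a_1$. From (iii) follows $a_2 \precsim a_1$.

(i) Let $a_2 + b \preccurlyeq a_1 + b$. We first prove

$$\forall q \in \mathbb{N}: \; q a_2 + b \preccurlyeq q a_1 + b. \qquad (2.4)$$

By assumption the statement is true for $q = 1$; suppose the statement is true for $q - 1$, then

$$q a_2 + b = (q - 1) a_2 + (a_2 + b) \preccurlyeq (q - 1) a_2 + (a_1 + b) = ((q - 1) a_2 + b) + a_1 \preccurlyeq ((q - 1) a_1 + b) + a_1 = q a_1 + b,$$

which proves the statement by induction. Then $\forall n \in \mathbb{N}$: $n a_2 \preccurlyeq n a_1 + b$. From (iv) follows $a_2 \precsim a_1$.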

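2.5 Maximal Strassen preorders

Let $\mathcal{P}$ be the set of Strassen preorders on $S$. For $\preccurlyeq_1, \preccurlyeq_2 \in \mathcal{P}$ we write $\preccurlyeq_2 \subseteq \preccurlyeq_1$ if for all $a, b \in S$: $a \preccurlyeq_2 b$ implies $a \preccurlyeq_1 b$. (The notation $\preccurlyeq_2 \subseteq \preccurlyeq_1$ is natural if we think of the relations $\preccurlyeq_i$ as sets of pairs $(a, b)$ with $a \preccurlyeq_i b$.)

Lemma 2.5. Let $\preccurlyeq \in \mathcal{P}$ with $\preccurlyeq = \precsim$ and $a_2 \not\preccurlyeq a_1$. Then there is an element $\preccurlyeq_{a_1, a_2} \in \mathcal{P}$ with $\preccurlyeq \subseteq \preccurlyeq_{a_1, a_2}$ and $a_1 \preccurlyeq_{a_1, a_2} a_2$.

Proof. For $x_1, x_2 \in S$, let

$$x_1 \preccurlyeq_{a_1, a_2} x_2 \;\text{ if }\; \exists s \in S: \; x_1 + s a_2 \preccurlyeq x_2 + s a_1.$$

The relation $\preccurlyeq_{a_1, a_2}$ is reflexive, since $x + 0 \cdot a_2 \preccurlyeq x + 0 \cdot a_1$. The relation $\preccurlyeq_{a_1, a_2}$ is transitive: if $x_1 \preccurlyeq_{a_1, a_2} x_2$ and $x_2 \preccurlyeq_{a_1, a_2} x_3$, then $x_1 + s a_2 \preccurlyeq x_2 + s a_1$ and $x_2 + t a_2 \preccurlyeq x_3 + t a_1$ for some $s, t \in S$, and so $x_1 + (t + s) a_2 \preccurlyeq x_2 + t a_2 + s a_1 \preccurlyeq x_3 + t a_1 + s a_1 = x_3 + (t + s) a_1$. Thus $x_1 \preccurlyeq_{a_1, a_2} x_3$. We conclude that $\preccurlyeq_{a_1, a_2}$ is a preorder on $S$.

We prove that $\preccurlyeq_{a_1, a_2}$ is a Strassen preorder. If $x_1 \preccurlyeq_{a_1, a_2} x_2$ and $y_1 \preccurlyeq_{a_1, a_2} y_2$, then clearly $x_1 + y_1 \preccurlyeq_{a_1, a_2} x_2 + y_2$. If $x_1 \preccurlyeq_{a_1, a_2} x_2$ and $y \in S$, then $x_1 y \preccurlyeq_{a_1, a_2} x_2 y$. From this follows: if $x_1 \preccurlyeq_{a_1, a_2} x_2$ and $y_1 \preccurlyeq_{a_1, a_2} y_2$, then $x_1 y_1 \preccurlyeq_{a_1, a_2} x_2 y_1 \preccurlyeq_{a_1, a_2} x_2 y_2$.

Let $n, m \in \mathbb{N}$. If $n \leq m$, then $n \preccurlyeq m$, so $n \preccurlyeq_{a_1, a_2} m$. If $n \not\leq m$, then $n \geq m + 1$. Suppose $n \preccurlyeq_{a_1, a_2} m$. Let $s \in S$ with $n + s a_2 \preccurlyeq m + s a_1$. Adding $m + 1 \preccurlyeq n$ gives

$$m + 1 + n + s a_2 \preccurlyeq n + m + s a_1.$$

Since $\preccurlyeq = \precsim$, we may apply Lemma 2.4 (i) to obtain

$$1 + s a_2 \preccurlyeq s a_1. \qquad (2.5)$$

From (2.5) follows $s \neq 0$. From (2.5) also follows

$$s a_2 \preccurlyeq s a_1. \qquad (2.6)$$

Since $\preccurlyeq = \precsim$, we may apply Lemma 2.4 (ii) to (2.6) to obtain the contradiction

$$a_2 \preccurlyeq a_1.$$

Therefore $n \not\preccurlyeq_{a_1, a_2} m$. We conclude that $\preccurlyeq_{a_1, a_2}$ is a Strassen preorder, that is, $\preccurlyeq_{a_1, a_2} \in \mathcal{P}$.

Finally, we have $a_1 \preccurlyeq_{a_1, a_2} a_2$, since $a_1 + 1 \cdot a_2 \preccurlyeq a_2 + 1 \cdot a_1$. Also, if $x_1 \preccurlyeq x_2$, then $x_1 + 0 \cdot a_2 \preccurlyeq x_2 + 0 \cdot a_1$, that is, $\preccurlyeq \subseteq \preccurlyeq_{a_1, a_2}$.

Let $\preccurlyeq$ be a Strassen preorder. Let $\mathcal{P}_{\preccurlyeq}$ be the set of Strassen preorders containing $\preccurlyeq$, ordered by inclusion $\subseteq$. Let $C \subseteq \mathcal{P}_{\preccurlyeq}$ be any chain. Then the union of all preorders in $C$ is an element of $\mathcal{P}_{\preccurlyeq}$ and contains all elements of $C$. Therefore, by Zorn's lemma, $\mathcal{P}_{\preccurlyeq}$ contains a maximal element (maximal with respect to inclusion $\subseteq$).

Lemma 2.6. Let $\preccurlyeq$ be maximal in $\mathcal{P}$. Then $\preccurlyeq = \precsim$.

Proof. Trivially $\preccurlyeq \subseteq \precsim$. From Lemma 2.3 we know $\precsim \in \mathcal{P}$. From maximality of $\preccurlyeq$ follows $\preccurlyeq = \precsim$.

A relation $\preccurlyeq$ on $S$ is total if for all $a, b \in S$: $a \preccurlyeq b$ or $b \preccurlyeq a$.

Lemma 2.7. Let $\preccurlyeq$ be maximal in $\mathcal{P}$. Then $\preccurlyeq$ is total.

Proof. Suppose $\preccurlyeq$ is not total, say $a_1 \not\preccurlyeq a_2$ and $a_2 \not\preccurlyeq a_1$. Since $\preccurlyeq = \precsim$ by Lemma 2.6, Lemma 2.5 applies: there is an element $\preccurlyeq_{a_1, a_2} \in \mathcal{P}$ with $\preccurlyeq \subseteq \preccurlyeq_{a_1, a_2}$ and $a_1 \preccurlyeq_{a_1, a_2} a_2$. Then $\preccurlyeq$ is strictly contained in $\preccurlyeq_{a_1, a_2}$, which contradicts the maximality of $\preccurlyeq$. We conclude $\preccurlyeq$ is total.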


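2.6 The asymptotic spectrum $X(S, \leqslant)$

Definition 2.8. Let $S$ be a semiring with $\mathbb{N} \subseteq S$ and let $\leqslant$ be a Strassen preorder on $S$. Let

$$X(S, \leqslant) = \{\phi \in \mathrm{Hom}(S, \mathbb{R}_{\geq 0}) : a \leqslant b \Rightarrow \phi(a) \leq \phi(b)\}.$$

We call $X(S, \leqslant)$ the asymptotic spectrum of $(S, \leqslant)$. We call the elements of $X(S, \leqslant)$ spectral points.

Lemma 2.9. Let $\preccurlyeq \in \mathcal{P}$ be total. There is exactly one semiring homomorphism $\phi : S \to \mathbb{R}_{\geq 0}$ with

$$a \preccurlyeq b \Rightarrow \phi(a) \leq \phi(b).$$

Moreover, if $\preccurlyeq$ is maximal in $\mathcal{P}$, then

$$a \preccurlyeq b \Leftrightarrow \phi(a) \leq \phi(b).$$

Proof. Let $\preccurlyeq \in \mathcal{P}$ be total. For $a \in S$ define

$$\phi(a) = \inf \{ \tfrac{r}{s} : r, s \in \mathbb{N}, \; sa \preccurlyeq r \}, \qquad \psi(a) = \sup \{ \tfrac{u}{v} : u, v \in \mathbb{N}, \; u \preccurlyeq va \}.$$

We prove $\psi(a) \leq \phi(a)$. Let $r, s, u, v \in \mathbb{N}$. Suppose $u \preccurlyeq va$ and $sa \preccurlyeq r$. Then follows $su \preccurlyeq vsa \preccurlyeq vr$. Thus $\frac{u}{v} \leq \frac{r}{s}$. We prove $\psi(a) \geq \phi(a)$. Suppose $\psi(a) < \phi(a)$. Let $r, s \in \mathbb{N}$ with $\psi(a) < \frac{r}{s} < \phi(a)$. Then $sa \not\preccurlyeq r$. From totality follows $sa \succcurlyeq r$. Thus $\psi(a) \geq \frac{r}{s}$, which is a contradiction. We conclude $\psi(a) = \phi(a)$.

Let $a, b \in S$. We prove $\phi(a + b) \leq \phi(a) + \phi(b)$. Let $s_a, s_b, r_a, r_b \in \mathbb{N}$. Suppose $s_a a \preccurlyeq r_a$ and $s_b b \preccurlyeq r_b$. Then $s_a s_b a \preccurlyeq s_b r_a$ and $s_a s_b b \preccurlyeq s_a r_b$. By addition, $s_a s_b (a + b) \preccurlyeq s_b r_a + s_a r_b$. Thus $\phi(a + b) \leq \frac{r_a}{s_a} + \frac{r_b}{s_b}$. We prove $\psi(a + b) \geq \psi(a) + \psi(b)$. Suppose $u_a \preccurlyeq v_a a$ and $u_b \preccurlyeq v_b b$. Then $v_b u_a \preccurlyeq v_a v_b a$ and $v_a u_b \preccurlyeq v_a v_b b$. By addition, $v_b u_a + v_a u_b \preccurlyeq v_a v_b (a + b)$. Thus $\psi(a + b) \geq \frac{u_a}{v_a} + \frac{u_b}{v_b}$. We thus have additivity.

We prove $\phi(ab) \leq \phi(a) \phi(b)$. Suppose $s_a a \preccurlyeq r_a$ and $s_b b \preccurlyeq r_b$. Then $s_a s_b ab \preccurlyeq r_a r_b$. Thus $\phi(ab) \leq \frac{r_a}{s_a} \frac{r_b}{s_b}$. We prove $\psi(ab) \geq \psi(a) \psi(b)$. Suppose $u_a \preccurlyeq v_a a$ and $u_b \preccurlyeq v_b b$. Then $u_a u_b \preccurlyeq v_a v_b ab$. Thus $\frac{u_a}{v_a} \frac{u_b}{v_b} \leq \psi(ab)$. We thus have multiplicativity.

We prove monotonicity: $a \preccurlyeq b \Rightarrow \phi(a) \leq \phi(b)$. Suppose $s_b b \preccurlyeq r_b$. From $a \preccurlyeq b$ follows $s_b a \preccurlyeq s_b b \preccurlyeq r_b$. Thus $\phi(a) \leq \frac{r_b}{s_b}$.

We prove $\phi(1) = 1$. Trivially $1 \preccurlyeq 1$. Therefore $\phi(1) \leq \frac{1}{1} = 1$ and $\psi(1) \geq \frac{1}{1} = 1$.

We prove $\phi(0) = 0$. Trivially $s_a 0 \preccurlyeq 0$, so $\phi(0) \leq \frac{0}{s_a} = 0$. Trivially $0 \preccurlyeq v_a 0$, so $\phi(0) \geq \frac{0}{v_a} = 0$.

We prove the uniqueness of $\phi$. Let $\phi_1, \phi_2$ be semiring homomorphisms $S \to \mathbb{R}_{\geq 0}$ with $a \preccurlyeq b \Rightarrow \phi_i(a) \leq \phi_i(b)$. Suppose $\phi_1(a) < \phi_2(a)$. Let $u, v \in \mathbb{N}$ with $\phi_1(a) < \frac{u}{v} < \phi_2(a)$. Then $va \not\preccurlyeq u$, so by totality $va \succcurlyeq u$. Thus $\phi_1(a) \geq \frac{u}{v}$, which is a contradiction. This proves uniqueness.

Finally, suppose $\preccurlyeq$ is maximal in $\mathcal{P}$. Lemma 2.6 gives $\preccurlyeq = \precsim$. Let $a \not\preccurlyeq b$. From Lemma 2.4 (iv) follows $\exists n$: $na \not\preccurlyeq nb + 1$. By totality, $na \succcurlyeq nb + 1$. Apply $\phi$ to get $\phi(a) \geq \phi(b) + \frac{1}{n}$. In particular, $\phi(a) > \phi(b)$.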

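Lemma 2.10. The map

$$X(S, \leqslant) \to \{\text{maximal elements in } \mathcal{P}_{\leqslant}\} : \phi \mapsto \preccurlyeq_\phi$$

with $a \preccurlyeq_\phi b$ iff $\phi(a) \leq \phi(b)$ is a bijection.

Proof. Let $\phi \in X(S, \leqslant)$. One verifies that $\preccurlyeq_\phi$ is a Strassen preorder and $\leqslant \, \subseteq \, \lesssim \, \subseteq \, \preccurlyeq_\phi$. Let $\preccurlyeq$ be maximal in $\mathcal{P}_{\preccurlyeq_\phi}$. Lemma 2.7 says that $\preccurlyeq$ is total. By Lemma 2.9 there is a $\psi \in X(S, \leqslant)$ with $\preccurlyeq \subseteq \preccurlyeq_\psi$. Clearly $\preccurlyeq_\phi \subseteq \preccurlyeq_\psi$. The uniqueness statement of Lemma 2.9 implies $\phi = \psi$. This means $\preccurlyeq_\phi = \preccurlyeq$, that is, $\preccurlyeq_\phi$ is maximal. We conclude that the map is well defined.

Let $\preccurlyeq$ be maximal in $\mathcal{P}_{\leqslant}$. Then $\preccurlyeq$ is total. By Lemma 2.9 there is a $\phi \in X(S, \leqslant)$ with $\preccurlyeq \subseteq \preccurlyeq_\phi$. We conclude the map is surjective.

Let $\phi, \psi \in X(S, \leqslant)$ with $\preccurlyeq_\phi = \preccurlyeq_\psi$. From Lemma 2.9 follows $\phi = \psi$. We conclude the map is injective.

Lemma 2.11. Let $a, b \in S$. Then $a \lesssim b$ iff $a \preccurlyeq b$ for all maximal $\preccurlyeq \in \mathcal{P}_{\leqslant}$.

Proof. Let $\preccurlyeq \in \mathcal{P}_{\leqslant}$ be maximal. Then $\lesssim \, \subseteq \, \precsim \, = \, \preccurlyeq$ by Lemma 2.6, so $a \lesssim b$ implies $a \preccurlyeq b$.

Suppose $a \not\lesssim b$. Let $n \in \mathbb{N}_{\geq 1}$ with $na \not\lesssim nb + 1$ (Lemma 2.4 (iv)). By Lemma 2.5 there is an element $\preccurlyeq_{nb+1, na} \in \mathcal{P}$ with $\lesssim \, \subseteq \, \preccurlyeq_{nb+1, na}$, and we may assume $\preccurlyeq_{nb+1, na}$ is maximal. Then $nb + 1 \preccurlyeq_{nb+1, na} na$, and so $a \not\preccurlyeq_{nb+1, na} b$.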

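2.7 The representation theorem

The following theorem is the main theorem.

Theorem 2.12 ([Str88, Th. 2.4]). Let $S$ be a commutative semiring with $\mathbb{N} \subseteq S$ and let $\leqslant$ be a Strassen preorder on $S$. Let $\mathbf{X} = \mathbf{X}(S, \leqslant)$ be the set of $\leqslant$-monotone semiring homomorphisms from $S$ to $\mathbb{R}_{\ge 0}$,
\[
\mathbf{X} = \mathbf{X}(S, \leqslant) = \{\varphi \in \mathrm{Hom}(S, \mathbb{R}_{\ge 0}) : \forall a, b \in S,\ a \leqslant b \Rightarrow \varphi(a) \le \varphi(b)\}.
\]
For $a, b \in S$ let $a \lesssim b$ if there is a sequence $(x_N) \in \mathbb{N}^{\mathbb{N}}$ with $x_N^{1/N} \to 1$ when $N \to \infty$ such that $\forall N \in \mathbb{N}$: $a^N \leqslant b^N x_N$. Then
\[
\forall a, b \in S: \quad a \lesssim b \ \text{ iff } \ \forall \varphi \in \mathbf{X}\ \ \varphi(a) \le \varphi(b).
\]

Proof. Let $a, b \in S$. Suppose $a \lesssim b$. Then clearly for all $\varphi \in \mathbf{X}$ we have $\varphi(a) \le \varphi(b)$. Suppose $a \not\lesssim b$. By Lemma 2.11 there is a maximal ${\preccurlyeq} \in \mathcal{P}_{\leqslant}$ with $a \not\preccurlyeq b$. By Lemma 2.10 there is a $\varphi \in \mathbf{X}$ with $\varphi(a) > \varphi(b)$.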
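The condition $x_N^{1/N} \to 1$ in the definition of the asymptotic preorder says that the overhead $x_N$ grows subexponentially. The following minimal sketch illustrates this numerically; the three example sequences and the function name are our own choices for illustration and do not come from [Str88].

```python
import math

def nth_root_of_overhead(x, n):
    """Return x(n) ** (1/n), computed via logarithms so that huge integers are fine."""
    return math.exp(math.log(x(n)) / n)

# Illustrative overhead sequences (assumptions, chosen only as examples).
polynomial = lambda n: n ** 3                 # allowed: N^3
subexponential = lambda n: 2 ** math.isqrt(n) # allowed: 2^floor(sqrt(N))
exponential = lambda n: 2 ** n                # not allowed: N-th root stays at 2

for n in (10, 1000, 100000):
    print(n,
          nth_root_of_overhead(polynomial, n),
          nth_root_of_overhead(subexponential, n),
          nth_root_of_overhead(exponential, n))
```

The first two columns approach 1 as $N$ grows, whereas the last stays at 2. This is why applying a monotone homomorphism $\varphi$ to $a^N \leqslant b^N x_N$ and taking $N$th roots yields $\varphi(a) \le \varphi(b)$ in the limit.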

2.8 Abstract rank and subrank R, Q

We generalise the notions of rank and subrank for tensors to arbitrary semirings with a Strassen preorder. Let $a \in S$. Define the rank
\[
\mathrm{R}(a) = \min\{r \in \mathbb{N} : a \leqslant r\}
\]
and the subrank
\[
\mathrm{Q}(a) = \max\{r \in \mathbb{N} : r \leqslant a\}.
\]
Then $\mathrm{Q}(a) \le \mathrm{R}(a)$. Define the asymptotic rank
\[
\tilde{\mathrm{R}}(a) = \lim_{N \to \infty} \mathrm{R}(a^N)^{1/N}.
\]
Define the asymptotic subrank
\[
\tilde{\mathrm{Q}}(a) = \lim_{N \to \infty} \mathrm{Q}(a^N)^{1/N}.
\]
By Fekete's lemma (Lemma 2.2), asymptotic rank is an infimum and asymptotic subrank is a supremum, as follows:
\[
\tilde{\mathrm{R}}(a) = \inf_N \mathrm{R}(a^N)^{1/N},
\]
\[
\tilde{\mathrm{Q}}(a) = \sup_N \mathrm{Q}(a^N)^{1/N} \quad \text{when } a = 0 \text{ or } a \geqslant 1.
\]

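Theorem 2.12 implies that the asymptotic rank and asymptotic subrank have the following dual characterisation in terms of the asymptotic spectrum. (This is a straightforward generalisation of [Str88, Th. 3.8].)

Corollary 2.13 (cf. [Str88, Th. 3.8]). For $a \in S$ with $\exists \varphi \in \mathbf{X}$ $\varphi(a) \ge 1$,
\[
\tilde{\mathrm{R}}(a) = \max_{\varphi \in \mathbf{X}} \varphi(a).
\]

Proof. Let $\varphi \in \mathbf{X}$. For $N \in \mathbb{N}$, $\mathrm{R}(a^N) \ge \varphi(a)^N$. Therefore $\tilde{\mathrm{R}}(a) \ge \varphi(a)$, and so $\tilde{\mathrm{R}}(a) \ge \max_{\varphi \in \mathbf{X}} \varphi(a)$. It remains to prove $\tilde{\mathrm{R}}(a) \le \max_{\varphi \in \mathbf{X}} \varphi(a)$. We let $x = \max_{\varphi \in \mathbf{X}} \varphi(a)$. By assumption $x \ge 1$. By definition of $x$ we have
\[
\forall \varphi \in \mathbf{X}: \quad \varphi(a) \le x.
\]
Take the $m$th power on both sides:
\[
\forall \varphi \in \mathbf{X},\ m \in \mathbb{N}: \quad \varphi(a^m) \le x^m.
\]
Take the ceiling on the right-hand side:
\[
\forall \varphi \in \mathbf{X},\ m \in \mathbb{N}: \quad \varphi(a^m) \le \lceil x^m \rceil.
\]
Apply Theorem 2.12 to get asymptotic preorders
\[
\forall m \in \mathbb{N}: \quad a^m \lesssim \lceil x^m \rceil.
\]
Then, by definition of asymptotic preorder,
\[
\forall m, N \in \mathbb{N}: \quad a^{mN} \leqslant \lceil x^m \rceil^N\, 2^{\varepsilon_{m,N}} \quad \text{for some } \varepsilon_{m,N} \in o(N).
\]
Then
\[
\forall m, N \in \mathbb{N}: \quad \mathrm{R}(a^{mN})^{1/(mN)} \le \lceil x^m \rceil^{1/m}\, 2^{\varepsilon_{m,N}/(mN)}.
\]
From $x \ge 1$ it follows that $\lceil x^m \rceil^{1/m} \to x$ when $m \to \infty$. Choose $m = m(N)$ with $m(N) \to \infty$ as $N \to \infty$ and $\varepsilon_{m(N),N} \in o(N)$ to get $\tilde{\mathrm{R}}(a) = \inf_N \mathrm{R}(a^N)^{1/N} \le x$.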

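Corollary 2.14 (cf. [Str88, Th. 3.8]). For $a \in S$ with $\exists k \in \mathbb{N}$ $a^k \geqslant 2$,
\[
\tilde{\mathrm{Q}}(a) = \min_{\varphi \in \mathbf{X}} \varphi(a).
\]

Proof. Let $\varphi \in \mathbf{X}$. For $N \in \mathbb{N}$, $\mathrm{Q}(a^N) \le \varphi(a)^N$. Therefore $\tilde{\mathrm{Q}}(a) \le \varphi(a)$, so $\tilde{\mathrm{Q}}(a) \le \min_{\varphi \in \mathbf{X}} \varphi(a)$. It remains to prove $\tilde{\mathrm{Q}}(a) \ge \min_{\varphi \in \mathbf{X}} \varphi(a)$. Let $y = \min_{\varphi \in \mathbf{X}} \varphi(a)$. From the assumption $a^k \geqslant 2$ it follows that $y > 1$. By definition of $y$ we have
\[
\forall \varphi \in \mathbf{X}: \quad \varphi(a) \ge y.
\]
Take the $m$th power on both sides:
\[
\forall \varphi \in \mathbf{X},\ m \in \mathbb{N}: \quad \varphi(a^m) \ge y^m.
\]
Take the floor on the right-hand side:
\[
\forall \varphi \in \mathbf{X},\ m \in \mathbb{N}: \quad \varphi(a^m) \ge \lfloor y^m \rfloor.
\]
Apply Theorem 2.12 to get asymptotic preorders
\[
\forall m \in \mathbb{N}: \quad a^m \gtrsim \lfloor y^m \rfloor.
\]
Then, by definition of asymptotic preorder,
\[
\forall m, N \in \mathbb{N}: \quad a^{mN}\, 2^{\varepsilon_{m,N}} \geqslant \lfloor y^m \rfloor^N \quad \text{for some } \varepsilon_{m,N} \in o(N).
\]
Now we use $a^k \geqslant 2$ to get
\[
\forall m, N \in \mathbb{N}: \quad a^{mN + k\varepsilon_{m,N}} \geqslant \lfloor y^m \rfloor^N.
\]
Then
\[
\forall m, N \in \mathbb{N}: \quad \mathrm{Q}\bigl(a^{mN + k\varepsilon_{m,N}}\bigr)^{\frac{1}{mN + k\varepsilon_{m,N}}} \ge \lfloor y^m \rfloor^{\frac{N}{mN + k\varepsilon_{m,N}}}.
\]
Choose $m = m(N)$ with $m(N) \to \infty$ as $N \to \infty$ and $\varepsilon_{m(N),N} \in o(N)$ to obtain $\tilde{\mathrm{Q}}(a) = \sup_N \mathrm{Q}(a^N)^{1/N} \ge y$.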

2.9 Topological aspects

Theorem 2.12 does not tell the full story. Namely, there is also a topological component, which we will now discuss. Let $S$ be a semiring with $\mathbb{N} \subseteq S$. Let $\leqslant$ be a Strassen preorder on $S$. Let $\mathbf{X} = \mathbf{X}(S, \leqslant)$ be the asymptotic spectrum of $(S, \leqslant)$. For $a \in S$ let
\[
\hat{a} : \mathbf{X} \to \mathbb{R}_{\ge 0} : \varphi \mapsto \varphi(a). \tag{2.7}
\]
The map $\hat{a}$ simply evaluates a given homomorphism $\varphi$ at $a$. One may think of $\hat{a}$ as the collection $(\varphi(a))_{\varphi \in \mathbf{X}}$ of all evaluations of the elements of $\mathbf{X}$ at $a$. Let $\mathbb{R}_{\ge 0}$ have the Euclidean topology. Endow $\mathbf{X}$ with the weak topology with respect to the family of functions $\hat{a}$, $a \in S$. That is, endow $\mathbf{X}$ with the coarsest topology such that each $\hat{a}$ becomes continuous.

Let $C(\mathbf{X}, \mathbb{R}_{\ge 0})$ be the semiring of continuous functions $\mathbf{X} \to \mathbb{R}_{\ge 0}$ with addition and multiplication defined pointwise on $\mathbf{X}$, that is, $(f + g)(x) = f(x) + g(x)$ and $(f \cdot g)(x) = f(x) g(x)$ for $f, g \in C(\mathbf{X}, \mathbb{R}_{\ge 0})$ and $x \in \mathbf{X}$. Define the semiring homomorphism
\[
\Phi : S \to C(\mathbf{X}, \mathbb{R}_{\ge 0}) : a \mapsto \hat{a},
\]
which maps $a$ to the evaluator $\hat{a}$ defined in (2.7).

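Theorem 2.15 ([Str88, Th. 2.4]).

(i) $\mathbf{X}$ is a nonempty compact Hausdorff space.

(ii) $\forall a, b \in S$: $a \lesssim b$ iff $\Phi(a) \le \Phi(b)$ pointwise on $\mathbf{X}$.

(iii) $\Phi(S)$ separates the points of $\mathbf{X}$.

Proof. Statement (ii) follows from Theorem 2.12. Statement (iii) is clear. We prove statement (i). We have $2 \not\lesssim 1$, so from Theorem 2.12 it follows that $\mathbf{X}$ cannot be empty.

For $a \in S$ let $n_a \in \mathbb{N}$ with $a \leqslant n_a$. Then for $\varphi \in \mathbf{X}$, $\varphi(a) \le n_a$ and so $\varphi(a) \in [0, n_a]$. Embed $\mathbf{X} \subseteq \prod_{a \in S} [0, n_a]$ as a set via $\varphi \mapsto (\varphi(a))_{a \in S}$. The set $\prod_{a \in S} [0, n_a]$ with the product topology is compact by the theorem of Tychonoff. To see that $\mathbf{X}$ is closed in $\prod_{a \in S} [0, n_a]$, we write $\mathbf{X}$ as an intersection of sets
\begin{align*}
\mathbf{X} ={} & \Bigl\{\varphi \in \prod_{a \in S} [0, n_a] : \varphi(0) = 0\Bigr\} \cap \Bigl\{\varphi \in \prod_{a \in S} [0, n_a] : \varphi(1) = 1\Bigr\} \\
& \cap \bigcap_{b, c \in S} \Bigl\{\varphi \in \prod_{a \in S} [0, n_a] : \varphi(b + c) - \varphi(b) - \varphi(c) = 0\Bigr\} \\
& \cap \bigcap_{b, c \in S} \Bigl\{\varphi \in \prod_{a \in S} [0, n_a] : \varphi(bc) - \varphi(b)\varphi(c) = 0\Bigr\} \\
& \cap \bigcap_{\substack{b, c \in S:\\ b \leqslant c}} \Bigl\{\varphi \in \prod_{a \in S} [0, n_a] : \varphi(b) \le \varphi(c)\Bigr\}
\end{align*}
and we observe that the intersected sets are closed:
\begin{align*}
\mathbf{X} ={} & \hat{0}^{-1}(0) \cap \hat{1}^{-1}(1) \\
& \cap \bigcap_{b, c \in S} \bigl(\widehat{(b + c)} - \hat{b} - \hat{c}\bigr)^{-1}(0) \\
& \cap \bigcap_{b, c \in S} \bigl(\widehat{(bc)} - \hat{b}\,\hat{c}\bigr)^{-1}(0) \\
& \cap \bigcap_{\substack{b, c \in S:\\ b \leqslant c}} \bigl(\hat{c} - \hat{b}\bigr)^{-1}([0, \infty)).
\end{align*}
This implies that $\mathbf{X}$ is also compact.

Let $\varphi, \psi \in \mathbf{X}$ be distinct. Let $a \in S$ with $\varphi(a) \ne \psi(a)$. Then $\hat{a}(\varphi) \ne \hat{a}(\psi)$. Let $U \ni \hat{a}(\varphi)$ and $V \ni \hat{a}(\psi)$ be open and disjoint subsets of $\mathbb{R}_{\ge 0}$. Then $\hat{a}^{-1}(U)$ and $\hat{a}^{-1}(V)$ are open and disjoint subsets of $\mathbf{X}$ containing $\varphi$ and $\psi$ respectively. We conclude that $\mathbf{X}$ is Hausdorff.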

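2.10 Uniqueness

Let $S$ be a semiring with $\mathbb{N} \subseteq S$. Let $\leqslant$ be a Strassen preorder on $S$. Let $\mathbf{X} = \mathbf{X}(S, \leqslant)$ be the asymptotic spectrum of $(S, \leqslant)$. The object $\mathbf{X}$ is unique in the following sense.

Theorem 2.16 ([Str88, Cor. 2.7]). Let $\mathbf{Y}$ be a compact Hausdorff space. Let $\Psi : S \to C(\mathbf{Y}, \mathbb{R}_{\ge 0})$ be a homomorphism of semirings such that
\[
\Psi(S) \text{ separates the points of } \mathbf{Y} \tag{2.8}
\]
and
\[
\forall a, b \in S: \quad a \lesssim b \Leftrightarrow \Psi(a) \le \Psi(b) \text{ pointwise on } \mathbf{Y}. \tag{2.9}
\]
Then there is a unique homeomorphism (continuous bijection with continuous inverse) $h : \mathbf{Y} \to \mathbf{X}$ such that the diagram
\[
\begin{array}{ccc}
 & S & \\
{\scriptstyle \Phi} \swarrow & & \searrow {\scriptstyle \Psi} \\
C(\mathbf{X}, \mathbb{R}_{\ge 0}) & \xrightarrow{\ h^*\ } & C(\mathbf{Y}, \mathbb{R}_{\ge 0})
\end{array}
\tag{2.10}
\]
commutes, where $h^* : \varphi \mapsto \varphi \circ h$. Namely, let $h : y \mapsto \bigl(a \mapsto \Psi(a)(y)\bigr)$.

Proof. We prove uniqueness. Suppose there are two such homeomorphisms
\[
h_1, h_2 : \mathbf{Y} \to \mathbf{X}.
\]
Suppose $x \ne h_2(h_1^{-1}(x))$ for some $x \in \mathbf{X}$. Since $\Phi(S)$ separates the points of $\mathbf{X}$, there is an $a \in S$ with $\Phi(a)(x) \ne \Phi(a)(h_2(h_1^{-1}(x)))$. Let $y = h_1^{-1}(x) \in \mathbf{Y}$. Then $\Phi(a)(h_1(y)) \ne \Phi(a)(h_2(y))$. Since (2.10) commutes, $\Phi(a)(h_1(y)) = \Psi(a)(y)$ and $\Phi(a)(h_2(y)) = \Psi(a)(y)$, a contradiction.

We prove existence. Let $h : \mathbf{Y} \to \mathbf{X} : y \mapsto (a \mapsto \Psi(a)(y))$. One verifies that $h$ is well-defined, continuous and injective, and that the diagram in (2.10) commutes. It remains to show that $h$ is surjective. We know that $\mathbb{Q} \cdot \Phi(S)$ is a $\mathbb{Q}$-subalgebra of $C(\mathbf{X}, \mathbb{R})$ which separates points and which contains the nonzero constant function $\Phi(1)$, so by the Stone–Weierstrass theorem, $\mathbb{Q} \cdot \Phi(S)$ is dense in $C(\mathbf{X}, \mathbb{R})$ under the sup-norm. Suppose $h$ is not surjective. Then $h(\mathbf{Y}) \subsetneq \mathbf{X}$ is a proper closed subset. Let $x_0 \in \mathbf{X} \setminus h(\mathbf{Y})$ be in the complement. Since $\mathbf{X}$ is a compact Hausdorff space, there is a continuous function $f : \mathbf{X} \to [-1, 1]$ with
\[
f(h(\mathbf{Y})) = 1, \qquad f(x_0) = -1.
\]
We know that $f$ can be approximated by elements from $\mathbb{Q} \cdot \Phi(S)$, i.e. let $\varepsilon > 0$, then there are $a_1, a_2 \in S$, $N \in \mathbb{N}$ such that
\[
\tfrac{1}{N}\bigl(\Phi(a_1)(x) - \Phi(a_2)(x)\bigr) > 1 - \varepsilon \quad \text{for all } x \in h(\mathbf{Y}),
\]
\[
\tfrac{1}{N}\bigl(\Phi(a_1)(x_0) - \Phi(a_2)(x_0)\bigr) < -1 + \varepsilon.
\]
This means $\Psi(a_1) \ge \Psi(a_2)$ pointwise on $\mathbf{Y}$, so $a_1 \gtrsim a_2$, but also $\Phi(a_1) \not\ge \Phi(a_2)$ pointwise on $\mathbf{X}$, so $a_1 \not\gtrsim a_2$. This is a contradiction.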

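2.11 Subsemirings

Let $S$ be a subsemiring of a semiring $T$ and let $\leqslant$ be a Strassen preorder on $T$. Then the restriction $\leqslant|_S$ is a Strassen preorder on $S$. How are the asymptotic spectra $\mathbf{X}(S, \leqslant|_S)$ and $\mathbf{X}(T, \leqslant)$ related? Obviously, for $\varphi \in \mathbf{X}(T, \leqslant)$ we have $\varphi|_S \in \mathbf{X}(S, \leqslant|_S)$. In fact, the uniqueness theorem of Section 2.10 implies that all elements of $\mathbf{X}(S, \leqslant|_S)$ are restrictions of elements of $\mathbf{X}(T, \leqslant)$.

Corollary 2.17. Let $S$ be a subsemiring of a semiring $T$. Let $\leqslant$ be a Strassen preorder on $T$. Then
\[
\mathbf{X}(S, \leqslant|_S) = \mathbf{X}(T, \leqslant)|_S.
\]

Proof. Let
\[
\mathbf{X} = \mathbf{X}(S, \leqslant|_S), \qquad \Phi : S \to C(\mathbf{X}, \mathbb{R}_{\ge 0}) : a \mapsto \hat{a},
\]
and let
\[
\mathbf{Y} = \mathbf{X}(T, \leqslant)|_S = \{\varphi|_S : \varphi \in \mathbf{X}(T, \leqslant)\}, \qquad \Psi : S \to C(\mathbf{Y}, \mathbb{R}_{\ge 0}) : a \mapsto \bigl(\varphi|_S \mapsto \varphi|_S(a)\bigr).
\]
Then $\mathbf{Y}$ is a compact Hausdorff space. Let $\varphi|_S, \psi|_S \in \mathbf{Y}$ be distinct. Then there is an $a \in S$ with $\varphi|_S(a) \ne \psi|_S(a)$, so (2.8) holds. For $a, b \in S$: $a \lesssim b$ iff $\Phi(a) \le \Phi(b)$ iff $\Psi(a) \le \Psi(b)$, so (2.9) holds. Therefore, by Theorem 2.16,
\[
h : \mathbf{X}(T, \leqslant)|_S \to \mathbf{X}(S, \leqslant|_S) : \varphi|_S \mapsto \bigl(a \mapsto \Psi(a)(\varphi|_S)\bigr) = \varphi|_S
\]
is a homeomorphism.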

2.12 Subsemirings generated by one element

Let $S$ be a semiring and let $\leqslant$ be a Strassen preorder on $S$. We specialise to the simplest type of subsemiring of $S$. Namely, let $a \in S$ and let
\[
\mathbb{N}[a] = \Bigl\{\sum_{i=0}^{k} n_i a^i : k \in \mathbb{N},\ n_i \in \mathbb{N}\Bigr\} \subseteq S
\]
be the subsemiring of $S$ generated by $a$. We call $\mathbf{X}(\mathbb{N}[a]) = \mathbf{X}(\mathbb{N}[a], \leqslant|_{\mathbb{N}[a]})$ the asymptotic spectrum of $a$.

Corollary 2.18 (cf. [Str88]). If $a^k \geqslant 2$ for some $k \in \mathbb{N}$, then
\[
\tilde{\mathrm{Q}} \in \mathbf{X}(\mathbb{N}[a]).
\]
If $\varphi(a) \ge 1$ for some $\varphi \in \mathbf{X}$, then
\[
\tilde{\mathrm{R}} \in \mathbf{X}(\mathbb{N}[a]).
\]

Proof. Let $\mathbf{X} = \mathbf{X}(\mathbb{N}[a])$. Let $n_1, \ldots, n_q \in \mathbb{N}$. By Corollary 2.14,
\[
\tilde{\mathrm{Q}}(a^{n_1} + \cdots + a^{n_q}) = \min_{\varphi \in \mathbf{X}} \varphi(a^{n_1} + \cdots + a^{n_q}).
\]
Since $\varphi$ is a homomorphism, $\varphi(a^{n_1} + \cdots + a^{n_q}) = \varphi(a)^{n_1} + \cdots + \varphi(a)^{n_q}$. Now we observe that $x^{n_1} + \cdots + x^{n_q}$ is nondecreasing in $x \ge 0$ and is therefore minimised by taking $x$ minimal in the domain. We conclude
\[
\tilde{\mathrm{Q}}(a^{n_1} + \cdots + a^{n_q}) = \sum_{i=1}^{q} \Bigl(\min_{\varphi \in \mathbf{X}} \varphi(a)\Bigr)^{n_i} = \tilde{\mathrm{Q}}(a)^{n_1} + \cdots + \tilde{\mathrm{Q}}(a)^{n_q}.
\]
The claim for asymptotic rank $\tilde{\mathrm{R}}$ similarly follows from Corollary 2.13.

Remark 2.19. In general, asymptotic subrank $\tilde{\mathrm{Q}}$ and asymptotic rank $\tilde{\mathrm{R}}$ are not elements of the asymptotic spectrum. We will see an example in Chapter 4, related to the matrix multiplication tensor.

Remark 2.20. Corollary 2.18 is closely related to Schönhage's $\tau$-theorem for tensors, also called Schönhage's asymptotic sum inequality. The $\tau$-theorem features in every recent fast matrix multiplication algorithm (i.e. every algorithm based on the laser method).

Remark 2.21. An element $\varphi \in \mathbf{X}(\mathbb{N}[a])$ is uniquely determined by the value of $\varphi(a) \in \mathbb{R}_{\ge 0}$. We may thus identify the asymptotic spectrum $\mathbf{X}(\mathbb{N}[a])$ with a compact (i.e. closed and bounded) subset of the nonnegative reals $\mathbb{R}_{\ge 0}$ via $\varphi \mapsto \varphi(a)$.

2.13 Universal spectral points

Having discussed the simplest type of subsemiring in the previous section, let us discuss the most difficult type of supersemiring. When applying the theory of asymptotic spectra to some setting, there is a natural largest semiring $S$ in which the objects of study live. For example, we may study the semiring $S$ of all (equivalence classes of) 3-tensors of arbitrary format over $\mathbb{F}$. Or we may study the semiring $S$ of all (isomorphism classes of) finite simple graphs. We refer to the elements of the asymptotic spectrum $\mathbf{X}(S)$ of the "ambient" semiring $S$ by the term universal spectral points (cf. [Str88, page 119]). The universal spectral points are the most useful monotone homomorphisms.

2.14 Conclusion

To a semiring $S$ with a Strassen preorder $\leqslant$ we associated an asymptotic preorder $\lesssim$. We proved that this asymptotic preorder is characterised by the $\leqslant$-monotone semiring homomorphisms $S \to \mathbb{R}_{\ge 0}$, which make up the asymptotic spectrum $\mathbf{X}(S, \leqslant)$ of $(S, \leqslant)$. For $(S, \leqslant)$ we naturally have a rank function $\mathrm{R} : S \to \mathbb{N}$ and a subrank function $\mathrm{Q} : S \to \mathbb{N}$. Their asymptotic versions $\tilde{\mathrm{R}}(a) = \inf_n \mathrm{R}(a^n)^{1/n}$ and $\tilde{\mathrm{Q}}(a) = \sup_n \mathrm{Q}(a^n)^{1/n}$ coincide with $\max_{\varphi \in \mathbf{X}(S, \leqslant)} \varphi(a)$ and $\min_{\varphi \in \mathbf{X}(S, \leqslant)} \varphi(a)$ respectively, assuming $\exists \varphi \in \mathbf{X}$ $\varphi(a) \ge 1$ and $\exists k \in \mathbb{N}$ $a^k \geqslant 2$ respectively. Unfortunately, we have proved the existence of the asymptotic spectrum by nonconstructive means. Explicitly constructing spectral points for a given pair $(S, \leqslant)$ will be a challenging task.

Some remarks about our proof in this chapter. The proof in [Str88] uses the Kadison–Dubois theorem from the paper of Becker and Schwartz [BS83] as a black box. Our presentation basically integrates the proof of Strassen with the proof of Becker and Schwartz. The notions of rank and subrank were in [Str88] only discussed for tensors. We considered the straightforward generalisation to arbitrary semirings with a Strassen preorder. An evident feature of our presentation is that we do not pass from the semiring to its Grothendieck ring, but instead stay in the semiring. In this way we stay close to the "real world" objects. I thank Jop Briët and Lex Schrijver for this idea. There is a large body of literature on the Kadison–Dubois theorem, for which we refer to the modern books by Prestel and Delzell [PD01, Theorem 5.2.6] and Marshall [Mar08, Theorem 5.4.4].

Chapter 3

The asymptotic spectrum of graphs; Shannon capacity

This chapter is based on the manuscript [Zui18].

3.1 Introduction

This chapter is about the Shannon capacity of graphs, which was introduced by Claude Shannon in the context of coding theory [Sha56]. More precisely, we will apply the theory of asymptotic spectra of Chapter 2 to gain a better understanding of Shannon capacity (and other asymptotic properties of graphs).

We first recall the definition of the Shannon capacity of a graph. Let $G$ be a (finite simple) graph with vertex set $V(G)$ and edge set $E(G)$. An independent set or stable set in $G$ is a subset of $V(G)$ that contains no edges. The independence number or stability number $\alpha(G)$ is the cardinality of the largest independent set in $G$. For graphs $G$ and $H$, the and-product $G \boxtimes H$, also called strong graph product, is defined by
\begin{align*}
V(G \boxtimes H) &= V(G) \times V(H) \\
E(G \boxtimes H) &= \bigl\{\{(g, h), (g', h')\} : \bigl(\{g, g'\} \in E(G) \text{ or } g = g'\bigr) \text{ and } \bigl(\{h, h'\} \in E(H) \text{ or } h = h'\bigr) \\
&\qquad\qquad \text{and } (g, h) \ne (g', h')\bigr\}.
\end{align*}
The Shannon capacity $\Theta(G)$ is defined as the limit
\[
\Theta(G) = \lim_{N \to \infty} \alpha(G^{\boxtimes N})^{1/N}. \tag{3.1}
\]
This limit exists and equals the supremum $\sup_N \alpha(G^{\boxtimes N})^{1/N}$ by Fekete's lemma (Lemma 2.2).
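To make the definition concrete, the following sketch computes the first two terms of the sequence $\alpha(G^{\boxtimes N})^{1/N}$ for the 5-cycle by brute force. The graph encoding (a vertex list together with a set of two-element frozensets) and all function names are our own assumptions for illustration; the search is exponential and only meant for very small graphs.

```python
from functools import lru_cache
from itertools import combinations, product

def cycle(n):
    """Cycle graph C_n (n >= 3) as (vertex list, set of frozenset edges)."""
    return list(range(n)), {frozenset((i, (i + 1) % n)) for i in range(n)}

def strong_product(G, H):
    """Strong (and-) product: adjacent iff equal-or-adjacent in both coordinates."""
    (VG, EG), (VH, EH) = G, H
    V = list(product(VG, VH))
    E = {frozenset((a, b)) for a, b in combinations(V, 2)
         if (a[0] == b[0] or frozenset((a[0], b[0])) in EG)
         and (a[1] == b[1] or frozenset((a[1], b[1])) in EH)}
    return V, E

def independence_number(G):
    """Exact independence number by branching on a vertex of maximum degree."""
    V, E = G
    adj = {v: frozenset(u for u in V if frozenset((u, v)) in E) for v in V}

    @lru_cache(maxsize=None)
    def best(cands):
        if not cands:
            return 0
        v = max(cands, key=lambda u: len(adj[u] & cands))
        if not adj[v] & cands:          # no edges left: take all remaining vertices
            return len(cands)
        # Either v is excluded, or v is included and its neighbours are excluded.
        return max(best(cands - {v}), 1 + best(cands - {v} - adj[v]))

    return best(frozenset(V))

C5 = cycle(5)
a1 = independence_number(C5)                      # 2
a2 = independence_number(strong_product(C5, C5))  # 5
print(a1, a2 ** 0.5)
```

Already the second term exceeds the first, which illustrates why the limit in (3.1) is a supremum and why $\alpha(G)$ alone does not determine $\Theta(G)$.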

Computing the Shannon capacity is nontrivial already for small graphs. Lovász in [Lov79] computed the value $\Theta(C_5) = \sqrt{5}$, where $C_k$ denotes the $k$-cycle graph, by introducing and evaluating a new graph parameter $\vartheta$ which is now known as the Lovász theta number. For example, the value of $\Theta(C_7)$ is currently not known. The Shannon capacity $\Theta$ is not known to be hard to compute in the sense of computational complexity. On the other hand, deciding whether $\alpha(G) \ge k$, given a graph $G$ and $k \in \mathbb{N}$, is NP-complete [Kar72].

New result: dual description of Shannon capacity

The new result of this chapter is a dual characterisation of the Shannon capacity of graphs. This characterisation is obtained by applying Strassen's theory of asymptotic spectra of Chapter 2. Thus this chapter also serves as an illustration of the theory of asymptotic spectra.

To state the theorem we need the standard notions graph homomorphism, graph complement and graph disjoint union. Let $G$ and $H$ be graphs. A graph homomorphism $f : G \to H$ is a map $f : V(G) \to V(H)$ such that for all $u, v \in V(G)$, if $\{u, v\} \in E(G)$, then $\{f(u), f(v)\} \in E(H)$. In other words, a graph homomorphism maps edges to edges. The complement $\overline{G}$ of $G$ is defined by
\begin{align*}
V(\overline{G}) &= V(G) \\
E(\overline{G}) &= \bigl\{\{u, v\} : \{u, v\} \notin E(G),\ u \ne v\bigr\}.
\end{align*}
We define a relation $\leqslant$ on graphs: let $G \leqslant H$ if there is a graph homomorphism $\overline{G} \to \overline{H}$ from the complement of $G$ to the complement of $H$. The disjoint union $G \sqcup H$ is defined by
\begin{align*}
V(G \sqcup H) &= V(G) \sqcup V(H) \\
E(G \sqcup H) &= E(G) \sqcup E(H).
\end{align*}
For $n \in \mathbb{N}$, the complete graph $K_n$ is the graph with $V(K_n) = [n] = \{1, 2, \ldots, n\}$ and $E(K_n) = \{\{i, j\} : i, j \in [n],\ i \ne j\}$. Thus $K_0 = \overline{K_0}$ is the empty graph and $K_1 = \overline{K_1}$ is the graph consisting of a single vertex and no edges.

Theorem 3.1. Let $S \subseteq \{\text{graphs}\}$ be a collection of graphs which is closed under the disjoint union $\sqcup$ and the strong graph product $\boxtimes$, and which contains the graph with a single vertex, $K_1$. Define the asymptotic spectrum $\mathbf{X}(S)$ as the set of all maps $\varphi : S \to \mathbb{R}_{\ge 0}$ such that, for all $G, H \in S$:

(1) if $G \leqslant H$, then $\varphi(G) \le \varphi(H)$

(2) $\varphi(G \sqcup H) = \varphi(G) + \varphi(H)$

(3) $\varphi(G \boxtimes H) = \varphi(G)\varphi(H)$

(4) $\varphi(K_1) = 1$.

Let $G \lesssim H$ if there is a sequence $(x_N) \in \mathbb{N}^{\mathbb{N}}$ with $x_N^{1/N} \to 1$ when $N \to \infty$ such that for every $N \in \mathbb{N}$
\[
G^{\boxtimes N} \leqslant (H^{\boxtimes N})^{\sqcup x_N} = \underbrace{H^{\boxtimes N} \sqcup \cdots \sqcup H^{\boxtimes N}}_{x_N}.
\]
Then

(i) $G \lesssim H$ iff $\forall \varphi \in \mathbf{X}(S)$ $\varphi(G) \le \varphi(H)$

(ii) $\Theta(G) = \min_{\varphi \in \mathbf{X}(S)} \varphi(G)$.

Statement (ii) of Theorem 3.1 is nontrivial in the sense that $\Theta$ is not an element of $\mathbf{X}(\{\text{graphs}\})$. Namely, $\Theta$ is not additive under $\sqcup$ by a result of Alon [Alo98], and $\Theta$ is not multiplicative under $\boxtimes$ by a result of Haemers [Hae79]. It turns out that the graph parameter $G \mapsto \max_{\varphi \in \mathbf{X}(\{\text{graphs}\})} \varphi(G)$ is itself an element of $\mathbf{X}(\{\text{graphs}\})$ and is equal to the fractional clique cover number $\overline{\chi}_f$ (see Section 3.3.2 and e.g. [Sch03, Eq. (67.112)]). Fritz in [Fri17] proves (independently of Strassen's line of work) a statement that is weaker than Theorem 3.1. Namely, he proves the statement of Theorem 3.1 without the additivity condition (2).

In Section 3.2 we will prove Theorem 3.1 by applying the theory of asymptotic spectra of Chapter 2 to the appropriate semiring and preorder. In Section 3.3 we will discuss the elements in the asymptotic spectrum of graphs $\mathbf{X}(\{\text{graphs}\})$ that are currently known to me: the Lovász theta number, the fractional clique cover number, the fractional orthogonal rank of the complement, and the fractional Haemers bounds. We moreover prove a sufficient condition for the "fractionalisation" of a graph parameter to be in the asymptotic spectrum of graphs.

3.2 The asymptotic spectrum of graphs

In this section we prove Theorem 3.1 by applying the theory of asymptotic spectra to the appropriate semiring.

3.2.1 The semiring of graph isomorphism classes $\mathcal{G}$

A graph homomorphism $f : G \to H$ is a graph isomorphism if $f$ is bijective as a map $V(G) \to V(H)$ and bijective as a map $E(G) \to E(H)$. We write $G \cong H$ if there is a graph isomorphism $f : G \to H$. The relation $\cong$ is an equivalence relation on $\{\text{graphs}\}$, which we call isomorphism. For example, the graphs $G$ and $H$ given by
\begin{align*}
V(G) &= \{a, b, c, d\}, & E(G) &= \{\{a, b\}, \{b, c\}, \{c, d\}, \{a, d\}\} \\
V(H) &= \{1, 2, 3, 4\}, & E(H) &= \{\{1, 3\}, \{2, 3\}, \{2, 4\}, \{1, 4\}\}
\end{align*}
are isomorphic. Let $\mathcal{G} = \{\text{graphs}\}/{\cong}$ be the set of equivalence classes in $\{\text{graphs}\}$ under $\cong$, i.e. the isomorphism classes. The relation $\leqslant$ is a preorder on $\mathcal{G}$. Recall that $K_n$ is the complete graph on $n$ vertices, and thus $\overline{K_n}$ is the graph with $n$ vertices and no edges.

Lemma 3.2. Let $A, B, C \in \{\text{graphs}\}$.

(i) $\sqcup$ and $\boxtimes$ are commutative and associative operations on $\mathcal{G}$.

(ii) $\boxtimes$ distributes over $\sqcup$ on $\mathcal{G}$, i.e. $A \boxtimes (B \sqcup C) = (A \boxtimes B) \sqcup (A \boxtimes C)$.

(iii) $\overline{K_1} \boxtimes A = A$

(iv) $\overline{K_0} \boxtimes A = \overline{K_0}$

(v) $\overline{K_0} \sqcup A = A$

(vi) $\overline{K_n} \sqcup \overline{K_m} = \overline{K_{n+m}}$.

Proof. We leave the proof to the reader.

In other words, Lemma 3.2 says that $(\mathcal{G}, \sqcup, \boxtimes, \overline{K_0}, \overline{K_1})$ is a (commutative) semiring in which the elements $\overline{K_0}, \overline{K_1}, \overline{K_2}, \ldots$ behave like the natural numbers $\mathbb{N}$. We will denote this semiring simply by $\mathcal{G}$.

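3.2.2 Strassen preorder via graph homomorphisms

Let $\mathcal{G}$ be the semiring of graphs. Recall that $G \leqslant H$ if there is a graph homomorphism $f : \overline{G} \to \overline{H}$.

Lemma 3.3. The preorder $\leqslant$ is a Strassen preorder on $\mathcal{G}$. That is, for graphs $A, B, C, D \in \mathcal{G}$ we have the following.

(i) For $n, m \in \mathbb{N}$: $\overline{K_n} \leqslant \overline{K_m}$ iff $n \le m$.

(ii) If $A \leqslant B$ and $C \leqslant D$, then $A \sqcup C \leqslant B \sqcup D$ and $A \boxtimes C \leqslant B \boxtimes D$.

(iii) For $A, B \in \mathcal{G}$, if $B \ne K_0$, then there is an $r \in \mathbb{N}$ with $A \leqslant \overline{K_r} \boxtimes B$.

Proof. Statement (i) is easy to verify. We prove (ii). Let $f : \overline{A} \to \overline{B}$ and $g : \overline{C} \to \overline{D}$ be graph homomorphisms. Let the map $f \sqcup g : V(A) \sqcup V(C) \to V(B) \sqcup V(D)$ be defined by
\begin{align*}
(f \sqcup g)(a) &= f(a) \quad \text{for } a \in V(A) \\
(f \sqcup g)(c) &= g(c) \quad \text{for } c \in V(C).
\end{align*}
One verifies directly that $f \sqcup g$ is a graph homomorphism $\overline{A \sqcup C} \to \overline{B \sqcup D}$. Let the map $f \times g : V(A) \times V(C) \to V(B) \times V(D)$ be defined by
\[
(f \times g)(a, c) = (f(a), g(c)).
\]
One verifies directly that $f \times g$ is a graph homomorphism $\overline{A \boxtimes C} \to \overline{B \boxtimes D}$. This proves (ii). We prove (iii). Let $r = |V(A)|$. Then $A \leqslant \overline{K_r}$. By assumption $B \ne K_0$, so $\overline{K_1} \leqslant B$. Therefore $A \leqslant \overline{K_r} \cong \overline{K_r} \boxtimes \overline{K_1} \leqslant \overline{K_r} \boxtimes B$. This proves (iii).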
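The preorder can be tested mechanically on small graphs by searching for a homomorphism between complements. The sketch below does this by exhaustive search and checks Lemma 3.3 (i) for small $n, m$. The graph encoding and the function names are assumptions made for this illustration, and the search takes $|V(H)|^{|V(G)|}$ steps, so it is only a sanity check, not an efficient procedure.

```python
from itertools import combinations, product

def cycle(n):
    """Cycle graph C_n (n >= 3) as (vertex list, set of frozenset edges)."""
    return list(range(n)), {frozenset((i, (i + 1) % n)) for i in range(n)}

def empty_graph(n):
    """The graph with n vertices and no edges, i.e. the complement of K_n."""
    return list(range(n)), set()

def complement(G):
    V, E = G
    return V, {frozenset(p) for p in combinations(V, 2)} - E

def has_homomorphism(G, H):
    """Is there a map V(G) -> V(H) that sends every edge of G to an edge of H?"""
    (VG, EG), (VH, EH) = G, H
    for image in product(VH, repeat=len(VG)):
        f = dict(zip(VG, image))
        if all(frozenset((f[u], f[v])) in EH for u, v in map(tuple, EG)):
            return True
    return False

def leq(G, H):
    """G <= H in the sense of the text: a homomorphism complement(G) -> complement(H)."""
    return has_homomorphism(complement(G), complement(H))

# Lemma 3.3 (i) for small n, m: the edgeless graphs are ordered like the naturals.
print(all(leq(empty_graph(n), empty_graph(m)) == (n <= m)
          for n in range(4) for m in range(4)))
```

For the 5-cycle one finds `leq(cycle(5), empty_graph(3))` but not `leq(cycle(5), empty_graph(2))`, since the complement of $C_5$ is again a 5-cycle, which maps homomorphically to $K_3$ but not to $K_2$.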

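3.2.3 The asymptotic spectrum of graphs $\mathbf{X}(\mathcal{G})$

We thus have a semiring $\mathcal{G}$ with a Strassen preorder $\leqslant$. We are therefore in the position to apply the theory of asymptotic spectra (Chapter 2). Let us translate the abstract terminology to this setting.

Let $G \lesssim H$ if there is a sequence $(x_N) \in \mathbb{N}^{\mathbb{N}}$ with $(x_N)^{1/N} \to 1$ such that for every $N \in \mathbb{N}$ we have $G^{\boxtimes N} \leqslant H^{\boxtimes N} \boxtimes \overline{K_{x_N}}$, i.e. $G^{\boxtimes N} \leqslant (H^{\boxtimes N})^{\sqcup x_N}$.

Let $S \subseteq \mathcal{G}$ be a subsemiring. For example, one may take $S = \mathcal{G}$, or one may choose any set $X \subseteq \mathcal{G}$ and let $S = \mathbb{N}[X]$ be the subsemiring of $\mathcal{G}$ generated by $X$ under $\sqcup$ and $\boxtimes$.

The asymptotic spectrum of $S$ is the set $\mathbf{X}(S)$ of $\leqslant$-monotone semiring homomorphisms $S \to \mathbb{R}_{\ge 0}$, i.e. all maps $\varphi : S \to \mathbb{R}_{\ge 0}$ such that, for all $G, H \in S$:

(1) if $G \leqslant H$, then $\varphi(G) \le \varphi(H)$

(2) $\varphi(G \sqcup H) = \varphi(G) + \varphi(H)$

(3) $\varphi(G \boxtimes H) = \varphi(G)\varphi(H)$

(4) $\varphi(K_1) = 1$.

We call $\mathbf{X}(\mathcal{G})$ the asymptotic spectrum of graphs.

Theorem 3.4. Let $G, H \in S$. Then $G \lesssim H$ iff $\forall \varphi \in \mathbf{X}(S)$ $\varphi(G) \le \varphi(H)$.

Proof. By Lemma 3.2 we have a semiring $S$, and by Lemma 3.3 we have a Strassen preorder $\leqslant$, so we may apply Theorem 2.12.

We refer to Chapter 2 for a discussion of the topological properties of $\mathbf{X}(S)$.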

3.2.4 Shannon capacity $\Theta$

Let us discuss the (asymptotic) rank and (asymptotic) subrank for $(\mathcal{G}, \leqslant)$. Recall that an independent set in $G$ is a subset of $V(G)$ that contains no edges, and the independence number $\alpha(G)$ is the cardinality of the largest independent set in $G$. A colouring of $G$ is an assignment of colours to the elements of $V(G)$ such that connected vertices get distinct colours. The chromatic number $\chi(G)$ is the smallest number of colours in any colouring of $G$. The clique cover number $\overline{\chi}(G)$ is defined as the chromatic number of the complement, $\overline{\chi}(G) = \chi(\overline{G})$.

For the semiring $\mathcal{G}$ with preorder $\leqslant$, the abstract definition of subrank of Section 2.8 becomes $\mathrm{Q}(G) = \max\{m \in \mathbb{N} : \overline{K_m} \leqslant G\}$, and the abstract definition of rank becomes $\mathrm{R}(G) = \min\{n \in \mathbb{N} : G \leqslant \overline{K_n}\}$.

Lemma 3.5.

(i) $\alpha(G) = \mathrm{Q}(G)$

(ii) $\overline{\chi}(G) = \mathrm{R}(G)$.

Proof. We leave the proof to the reader.
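Lemma 3.5 can be checked on small examples by evaluating the abstract definitions of rank and subrank through the preorder and comparing the subrank with an independently computed independence number. The following is a brute-force sketch under our own encoding of graphs; the helper names are hypothetical and the running time is exponential.

```python
from itertools import combinations, product

def cycle(n):
    return list(range(n)), {frozenset((i, (i + 1) % n)) for i in range(n)}

def empty_graph(n):
    return list(range(n)), set()

def complement(G):
    V, E = G
    return V, {frozenset(p) for p in combinations(V, 2)} - E

def leq(G, H):
    """G <= H iff some map V(G) -> V(H) is a homomorphism between the complements."""
    (VG, EG), (VH, EH) = complement(G), complement(H)
    return any(
        all(frozenset((f[u], f[v])) in EH for u, v in map(tuple, EG))
        for f in (dict(zip(VG, img)) for img in product(VH, repeat=len(VG)))
    )

def subrank(G):
    """Q(G) = max{m : complement(K_m) <= G}."""
    m = 0
    while leq(empty_graph(m + 1), G):
        m += 1
    return m

def rank(G):
    """R(G) = min{n : G <= complement(K_n)}."""
    n = 0
    while not leq(G, empty_graph(n)):
        n += 1
    return n

def independence_number(G):
    """Largest k such that some k-subset of vertices spans no edge (brute force)."""
    V, E = G
    return max(k for k in range(len(V) + 1) for S in combinations(V, k)
               if not any(frozenset(p) in E for p in combinations(S, 2)))

C5 = cycle(5)
print(subrank(C5), independence_number(C5), rank(C5))
```

For $C_5$ the subrank agrees with $\alpha(C_5) = 2$, and the rank equals 3, the chromatic number of the complement (which is again a 5-cycle).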

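We see directly that the asymptotic subrank is the Shannon capacity,
\[
\tilde{\mathrm{Q}}(G) = \lim_{N \to \infty} \mathrm{Q}(G^{\boxtimes N})^{1/N} = \lim_{N \to \infty} \alpha(G^{\boxtimes N})^{1/N} = \Theta(G),
\]
and that the asymptotic rank is the asymptotic clique cover number,
\[
\tilde{\mathrm{R}}(G) = \lim_{N \to \infty} \mathrm{R}(G^{\boxtimes N})^{1/N} = \lim_{N \to \infty} \overline{\chi}(G^{\boxtimes N})^{1/N} = \tilde{\overline{\chi}}(G).
\]

Let $S \subseteq \mathcal{G}$ be a subsemiring. Let $G \in S$.

Corollary 3.6. $\Theta(G) = \min_{\varphi \in \mathbf{X}(S)} \varphi(G)$.

Proof. Let $G$ be a graph. Either $G = K_0$, or $\overline{K_1} \leqslant G \leqslant \overline{K_1}$, or $\overline{G}$ contains at least one edge. In the first two cases the claim is clearly true. In the third case $G \geqslant \overline{K_2}$, and we may thus apply Corollary 2.14.

Corollary 3.7. $\tilde{\overline{\chi}}(G) = \max_{\varphi \in \mathbf{X}(S)} \varphi(G)$.

Proof. This is Corollary 2.13.

Remark 3.8. As mentioned earlier, it turns out that $\tilde{\overline{\chi}}$ is in fact itself an element of $\mathbf{X}(\mathcal{G})$. See Section 3.3.2. (This is a striking difference with the situation for tensors, which we will discuss in Chapter 4: there, both asymptotic rank and asymptotic subrank are not in the asymptotic spectrum, see Remark 4.4.)

Shannon capacity is not in the asymptotic spectrum

Lemma 3.9. $G \boxtimes \overline{G} \geqslant \overline{K_{|V(G)|}}$.

Proof. Let $D = \{(u, u) : u \in V(G)\}$. Let $(u, u), (v, v) \in D$ be distinct. Then either $\{u, v\} \in E(G)$ or $\{u, v\} \in E(\overline{G})$ (exclusive or), and so $\{(u, u), (v, v)\} \notin E(G \boxtimes \overline{G})$. Therefore the subgraph in $G \boxtimes \overline{G}$ induced by $D$ is isomorphic to $\overline{K_{|V(G)|}}$.

Example 3.10. Let $G$ be the Schläfli graph. This is a graph with 27 vertices. Thus $\Theta(G \boxtimes \overline{G}) \ge |V(G)| = 27$. On the other hand, Haemers in [Hae79] showed that $\Theta(G)\Theta(\overline{G}) \le 21$. This implies that the map $\Theta$ is not in $\mathbf{X}(\mathcal{G})$, since it is not multiplicative under $\boxtimes$.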
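The diagonal argument of Lemma 3.9 is easy to verify by machine on small graphs. The sketch below builds $G \boxtimes \overline{G}$ explicitly and checks that no two diagonal vertices are adjacent. The graph encoding and function names are our own assumptions, and the examples are small cycles rather than the Schläfli graph.

```python
from itertools import combinations, product

def cycle(n):
    return list(range(n)), {frozenset((i, (i + 1) % n)) for i in range(n)}

def complement(G):
    V, E = G
    return V, {frozenset(p) for p in combinations(V, 2)} - E

def strong_product(G, H):
    """Adjacent iff the coordinates are equal-or-adjacent in both factors."""
    (VG, EG), (VH, EH) = G, H
    V = list(product(VG, VH))
    E = {frozenset((a, b)) for a, b in combinations(V, 2)
         if (a[0] == b[0] or frozenset((a[0], b[0])) in EG)
         and (a[1] == b[1] or frozenset((a[1], b[1])) in EH)}
    return V, E

def diagonal_is_independent(G):
    """Check that D = {(u, u)} spans no edge of the strong product of G and its complement."""
    V, _ = G
    _, E = strong_product(G, complement(G))
    return not any(frozenset(((u, u), (v, v))) in E for u, v in combinations(V, 2))

print([diagonal_is_independent(cycle(n)) for n in (3, 4, 5, 6)])
```

Since $C_5$ is isomorphic to its own complement, this check also recovers the lower bound $\alpha(C_5 \boxtimes C_5) \ge 5$.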

3.3 Universal spectral points

The abstract theory of asymptotic spectra of Chapter 2 does not explicitly describe the elements of $\mathbf{X}(\mathcal{G})$, i.e. the universal spectral points (cf. Section 2.13). However, several graph parameters from the literature can be shown to be universal spectral points. In fact, recently in [BC18] the first infinite family of universal spectral points was found: the fractional Haemers bounds. We give a brief (and probably incomplete) overview of currently known elements in $\mathbf{X}(\mathcal{G})$.

3.3.1 Lovász theta number $\vartheta$

For any real symmetric matrix $A$, let $\Lambda(A)$ be the largest eigenvalue. The Lovász theta number $\vartheta(G)$ is defined as
\[
\vartheta(G) = \min\bigl\{\Lambda(A) : A \in \mathbb{R}^{V(G) \times V(G)} \text{ symmetric},\ \{u, v\} \notin E(G) \Rightarrow A_{uv} = 1\bigr\}.
\]
The parameter $\vartheta(G)$ was introduced by Lovász in [Lov79]. We refer to [Knu94] and [Sch03] for a survey. It follows from well-known properties that $\vartheta \in \mathbf{X}(\mathcal{G})$.

3.3.2 Fractional graph parameters

Besides the Lovász theta number, there are several elements in $\mathbf{X}(\mathcal{G})$ that are naturally obtained as fractional versions of $\boxtimes$-submultiplicative, $\sqcup$-subadditive, $\leqslant$-monotone maps $\mathcal{G} \to \mathbb{R}_{\ge 0}$. For any map $\varphi : \mathcal{G} \to \mathbb{R}_{\ge 0}$ we define a fractional version $\varphi_f$ by
\[
\varphi_f(G) = \inf_d \frac{\varphi(G \ltimes \overline{K_d})}{d},
\]
where $\ltimes$ denotes the lexicographic product defined below. We will discuss several fractional parameters from the literature and prove a general theorem about fractional parameters.
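As a small illustration of the fractional construction, the sketch below evaluates the quotient $\varphi(G \ltimes \overline{K_d})/d$ for $\varphi = \overline{\chi}$ (the clique cover number), $G = C_5$ and $d = 1, 2$. It assumes the reading of the definition used in the proofs later in this section, namely the lexicographic product with the edgeless graph $\overline{K_d}$. The encoding, the helper names and the backtracking colouring routine are ours and only suitable for tiny graphs.

```python
from fractions import Fraction
from itertools import combinations, product

def cycle(n):
    return list(range(n)), {frozenset((i, (i + 1) % n)) for i in range(n)}

def empty_graph(n):
    return list(range(n)), set()

def complement(G):
    V, E = G
    return V, {frozenset(p) for p in combinations(V, 2)} - E

def lex_product(G, H):
    """Lexicographic product: adjacent iff g ~ g', or g = g' and h ~ h'."""
    (VG, EG), (VH, EH) = G, H
    V = list(product(VG, VH))
    E = {frozenset((a, b)) for a, b in combinations(V, 2)
         if frozenset((a[0], b[0])) in EG
         or (a[0] == b[0] and frozenset((a[1], b[1])) in EH)}
    return V, E

def chromatic_number(G):
    """Smallest k admitting a proper k-colouring, found by backtracking."""
    V, E = G
    adj = {v: [u for u in V if frozenset((u, v)) in E] for v in V}

    def colourable(k):
        colour = {}
        def extend(i):
            if i == len(V):
                return True
            v = V[i]
            for c in range(k):
                if all(colour.get(u) != c for u in adj[v]):
                    colour[v] = c
                    if extend(i + 1):
                        return True
                    del colour[v]
            return False
        return extend(0)

    k = 0
    while not colourable(k):
        k += 1
    return k

def clique_cover_quotient(G, d):
    """chi-bar(G lex complement(K_d)) / d, one term of the infimum defining the fractional version."""
    blown_up = lex_product(G, empty_graph(d))
    return Fraction(chromatic_number(complement(blown_up)), d)

print(clique_cover_quotient(cycle(5), 1), clique_cover_quotient(cycle(5), 2))
```

The quotient drops from $3$ at $d = 1$ to $5/2$ at $d = 2$, which matches the well-known value $5/2$ of the fractional chromatic number of the 5-cycle.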

Fractional clique cover number

We consider the fractional version of the clique cover number $\overline{\chi}(G) = \chi(\overline{G})$. It is well known that $\overline{\chi}_f \in \mathbf{X}(\mathcal{G})$, see e.g. [Sch03]. The fractional clique cover number $\overline{\chi}_f$ in fact equals the asymptotic clique cover number $\tilde{\overline{\chi}}(G) = \lim_{N \to \infty} \overline{\chi}(G^{\boxtimes N})^{1/N}$ which we introduced in the previous section, see [MP71] and also [Sch03, Th. 67.17].

Fractional Haemers bound

Let $\mathrm{rank}(A)$ denote the matrix rank of any matrix $A$. For any set $C$ of matrices define $\mathrm{rank}(C) = \min\{\mathrm{rank}(A) : A \in C\}$. For a field $\mathbb{F}$ and a graph $G$ define the set of matrices
\[
M^{\mathbb{F}}(G) = \bigl\{A \in \mathbb{F}^{V(G) \times V(G)} : \forall u, v\ \ A_{vv} \ne 0,\ \{u, v\} \notin E(G) \Rightarrow A_{uv} = 0\bigr\}.
\]
Let $\mathrm{R}^{\mathbb{F}}(G) = \mathrm{rank}(M^{\mathbb{F}}(G))$. The parameter $\mathrm{R}^{\mathbb{F}}(G)$ was introduced by Haemers in [Hae79] and is known as the Haemers bound. The fractional Haemers bound $\mathrm{R}^{\mathbb{F}}_f$ was studied by Anna Blasiak in [Bla13] and was recently shown to be $\boxtimes$-multiplicative by Bukh and Cox in [BC18]. From this it is not hard to prove that $\mathrm{R}^{\mathbb{F}}_f \in \mathbf{X}(\mathcal{G})$. Bukh and Cox in [BC18] furthermore prove a separation result: for any field $\mathbb{F}$ of nonzero characteristic and any $\varepsilon > 0$, there is a graph $G$ such that for any field $\mathbb{F}'$ with $\mathrm{char}(\mathbb{F}) \ne \mathrm{char}(\mathbb{F}')$ the inequality $\mathrm{R}^{\mathbb{F}}_f(G) < \varepsilon\, \mathrm{R}^{\mathbb{F}'}_f(G)$ holds. This separation result implies that there are infinitely many elements in $\mathbf{X}(\mathcal{G})$.

Fractional orthogonal rank

In [CMR+14] the orthogonal rank $\xi(G)$ and its fractional version, the projective rank $\xi_f(G)$, are studied. It easily follows from results in [CMR+14] that $G \mapsto \xi_f(\overline{G})$ is in $\mathbf{X}(\mathcal{G})$.

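General fractional parameters

We will prove something general about fractional parameters. Define the lexicographic product $G \ltimes H$ by
\begin{align*}
V(G \ltimes H) &= V(G) \times V(H) \\
E(G \ltimes H) &= \bigl\{\{(g, h), (g', h')\} : \{g, g'\} \in E(G) \text{ or } (g = g' \text{ and } \{h, h'\} \in E(H))\bigr\}.
\end{align*}
The lexicographic product satisfies $\overline{G \ltimes H} = \overline{G} \ltimes \overline{H}$. Also define the or-product $G * H$ by
\begin{align*}
V(G * H) &= V(G) \times V(H) \\
E(G * H) &= \bigl\{\{(g, h), (g', h')\} : \{g, g'\} \in E(G) \text{ or } \{h, h'\} \in E(H)\bigr\}.
\end{align*}
The or-product and the strong graph product are related by $\overline{G * H} = \overline{G} \boxtimes \overline{H}$. The strong graph product gives a subgraph of the lexicographic product, which gives a subgraph of the or-product:
\[
G \boxtimes H \subseteq G \ltimes H \subseteq G * H.
\]
Therefore $G * H \leqslant G \ltimes H \leqslant G \boxtimes H$. Finally, $G \ltimes \overline{K_d} = G * \overline{K_d}$, and of course $G \boxtimes \overline{K_d} = G^{\sqcup d}$.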
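The relations between the three products can be confirmed by direct computation on small graphs. The following sketch builds the strong, lexicographic and or-product on the common vertex set $V(G) \times V(H)$ and compares edge sets. The encoding and function names are assumptions of this illustration.

```python
from itertools import combinations, product

def edge(u, v):
    return frozenset((u, v))

def cycle(n):
    return list(range(n)), {edge(i, (i + 1) % n) for i in range(n)}

def empty_graph(n):
    return list(range(n)), set()

def complement(G):
    V, E = G
    return V, {edge(u, v) for u, v in combinations(V, 2)} - E

def _product(G, H, adjacent):
    """Generic product on V(G) x V(H); `adjacent` decides adjacency of distinct pairs."""
    (VG, EG), (VH, EH) = G, H
    V = list(product(VG, VH))
    E = {edge(a, b) for a, b in combinations(V, 2)
         if adjacent(a[0] == b[0], edge(a[0], b[0]) in EG,
                     a[1] == b[1], edge(a[1], b[1]) in EH)}
    return V, E

def strong_product(G, H):
    return _product(G, H, lambda g_eq, g_adj, h_eq, h_adj: (g_eq or g_adj) and (h_eq or h_adj))

def lex_product(G, H):
    return _product(G, H, lambda g_eq, g_adj, h_eq, h_adj: g_adj or (g_eq and h_adj))

def or_product(G, H):
    return _product(G, H, lambda g_eq, g_adj, h_eq, h_adj: g_adj or h_adj)

G = cycle(5)
H = ([0, 1, 2], {edge(0, 1), edge(1, 2)})  # a path on three vertices

# Subgraph chain: strong within lexicographic within or-product.
print(strong_product(G, H)[1] <= lex_product(G, H)[1] <= or_product(G, H)[1])
# Complement identities for the lexicographic and the or-product.
print(complement(lex_product(G, H))[1] == lex_product(complement(G), complement(H))[1])
print(complement(or_product(G, H))[1] == strong_product(complement(G), complement(H))[1])
```

All three checks print `True` for this pair of graphs. Because the products share a vertex set, the subgraph chain immediately gives identity homomorphisms between the complements in the reverse direction, which is the content of $G * H \leqslant G \ltimes H \leqslant G \boxtimes H$.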

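We will prove: if $\varphi : \mathcal{G} \to \mathbb{R}_{\ge 0}$ is $\boxtimes$-submultiplicative, $\sqcup$-subadditive and $\leqslant$-monotone, then $\varphi_f$ is again $\boxtimes$-submultiplicative, $\sqcup$-subadditive and $\leqslant$-monotone. Moreover, if $\varphi : \mathcal{G} \to \mathbb{N}$ is $\leqslant$-monotone and satisfies
\[
\forall G, H \in \mathcal{G}: \quad \varphi(G \ltimes H) \ge \varphi(G \ltimes \overline{K_{\varphi(H)}}),
\]
then $\varphi_f$ is $\ltimes$-supermultiplicative and, more importantly, $\varphi_f$ is $\boxtimes$-supermultiplicative.

Lemma 3.11.

(i) If $\varphi$ is $\sqcup$-superadditive, then $\varphi_f$ is $\sqcup$-superadditive.

(ii) If $\varphi$ is $\leqslant$-monotone, then $\varphi_f$ is $\leqslant$-monotone.

(iii) If $\varphi$ is $\sqcup$-subadditive and $\leqslant$-monotone, then $\varphi_f$ is $\sqcup$-subadditive.

(iv) If $\forall n \in \mathbb{N}$ $\varphi(\overline{K_n}) = n$, then $\forall n \in \mathbb{N}$ $\varphi_f(\overline{K_n}) = n$.

(v) If $\varphi$ is $\boxtimes$-submultiplicative and $\leqslant$-monotone, then $\varphi_f$ is $\boxtimes$-submultiplicative.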

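Proof. Let $G, H \in \mathcal{G}$. Let $d \in \mathbb{N}$.

(i) The lexicographic product distributes over the disjoint union,
\[
(G \sqcup H) \ltimes \overline{K_d} = (G \ltimes \overline{K_d}) \sqcup (H \ltimes \overline{K_d}).
\]
By superadditivity,
\[
\varphi\bigl((G \ltimes \overline{K_d}) \sqcup (H \ltimes \overline{K_d})\bigr) \ge \varphi(G \ltimes \overline{K_d}) + \varphi(H \ltimes \overline{K_d}).
\]
Therefore
\begin{align*}
\varphi_f(G \sqcup H) &= \inf_d \frac{\varphi\bigl((G \sqcup H) \ltimes \overline{K_d}\bigr)}{d} \\
&= \inf_d \frac{\varphi\bigl((G \ltimes \overline{K_d}) \sqcup (H \ltimes \overline{K_d})\bigr)}{d} \\
&\ge \inf_d \Bigl(\frac{\varphi(G \ltimes \overline{K_d})}{d} + \frac{\varphi(H \ltimes \overline{K_d})}{d}\Bigr) \\
&\ge \inf_{d_1} \frac{\varphi(G \ltimes \overline{K_{d_1}})}{d_1} + \inf_{d_2} \frac{\varphi(H \ltimes \overline{K_{d_2}})}{d_2} \\
&= \varphi_f(G) + \varphi_f(H).
\end{align*}

(ii) Let $G \leqslant H$. Then $G \ltimes \overline{K_d} \leqslant H \ltimes \overline{K_d}$. Thus $\varphi(G \ltimes \overline{K_d}) \le \varphi(H \ltimes \overline{K_d})$. Therefore $\varphi_f(G) \le \varphi_f(H)$.

(iii) We have $G \ltimes \overline{K_d} \leqslant G \boxtimes \overline{K_d} = G^{\sqcup d}$. Thus, by monotonicity and subadditivity,
\[
\varphi(G \ltimes \overline{K_d}) \le d\, \varphi(G),
\]
and for $d, e \in \mathbb{N}$,
\[
\varphi(G \ltimes \overline{K_{de}}) = \varphi\bigl((G \ltimes \overline{K_d}) \ltimes \overline{K_e}\bigr) \le e\, \varphi(G \ltimes \overline{K_d}).
\]
We use this inequality to get, for $d_1, d_2 \in \mathbb{N}$,
\[
\frac{\varphi(G \ltimes \overline{K_{d_1}})}{d_1} + \frac{\varphi(H \ltimes \overline{K_{d_2}})}{d_2} \ge \frac{\varphi(G \ltimes \overline{K_{d_1 d_2}}) + \varphi(H \ltimes \overline{K_{d_1 d_2}})}{d_1 d_2}.
\]
From subadditivity it follows that
\begin{align*}
\frac{\varphi(G \ltimes \overline{K_{d_1 d_2}}) + \varphi(H \ltimes \overline{K_{d_1 d_2}})}{d_1 d_2} &\ge \frac{\varphi\bigl((G \ltimes \overline{K_{d_1 d_2}}) \sqcup (H \ltimes \overline{K_{d_1 d_2}})\bigr)}{d_1 d_2} \\
&= \frac{\varphi\bigl((G \sqcup H) \ltimes \overline{K_{d_1 d_2}}\bigr)}{d_1 d_2} \\
&\ge \varphi_f(G \sqcup H).
\end{align*}
We conclude $\varphi_f(G) + \varphi_f(H) \ge \varphi_f(G \sqcup H)$.

(iv) Let $n \in \mathbb{N}$. Then $\varphi_f(\overline{K_n}) = \inf_d \varphi(\overline{K_n} \ltimes \overline{K_d})/d = \inf_d \varphi(\overline{K_{nd}})/d = n$.

(v) Let $d_1, d_2 \in \mathbb{N}$. We claim
\[
(G \boxtimes H) \ltimes \overline{K_{d_1 d_2}} \leqslant (G \ltimes \overline{K_{d_1}}) \boxtimes (H \ltimes \overline{K_{d_2}}).
\]
This is the same as saying that there is a graph homomorphism
\[
\overline{(G \boxtimes H) \ltimes \overline{K_{d_1 d_2}}} \to \overline{(G \ltimes \overline{K_{d_1}}) \boxtimes (H \ltimes \overline{K_{d_2}})},
\]
which is the same as saying that there is a graph homomorphism
\[
(\overline{G} * \overline{H}) \ltimes K_{d_1 d_2} \to (\overline{G} \ltimes K_{d_1}) * (\overline{H} \ltimes K_{d_2}),
\]
where $*$ denotes the or-product of graphs. One verifies that $(g, h, (i, j)) \mapsto ((g, i), (h, j))$ is such a graph homomorphism, proving the claim. The claim, together with monotonicity and submultiplicativity, gives
\[
\varphi\bigl((G \boxtimes H) \ltimes \overline{K_{d_1 d_2}}\bigr) \le \varphi\bigl((G \ltimes \overline{K_{d_1}}) \boxtimes (H \ltimes \overline{K_{d_2}})\bigr) \le \varphi(G \ltimes \overline{K_{d_1}})\, \varphi(H \ltimes \overline{K_{d_2}}).
\]
Therefore
\begin{align*}
\varphi_f(G \boxtimes H) &= \inf_d \frac{\varphi\bigl((G \boxtimes H) \ltimes \overline{K_d}\bigr)}{d} \\
&= \inf_{d_1, d_2} \frac{\varphi\bigl((G \boxtimes H) \ltimes \overline{K_{d_1 d_2}}\bigr)}{d_1 d_2} \\
&\le \inf_{d_1, d_2} \frac{\varphi(G \ltimes \overline{K_{d_1}})}{d_1} \cdot \frac{\varphi(H \ltimes \overline{K_{d_2}})}{d_2} \\
&= \varphi_f(G)\, \varphi_f(H).
\end{align*}
This concludes the proof of the lemma.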

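Lemma 3.12. Let $\varphi : \mathcal{G} \to \mathbb{N}$ satisfy
\[
\forall G, H \in \mathcal{G}: \quad \varphi(G \ltimes H) \ge \varphi(G \ltimes \overline{K_{\varphi(H)}}). \tag{3.2}
\]
Then
\[
\inf_H \frac{\varphi(G \ltimes H)}{\varphi(H)} = \inf_d \frac{\varphi(G \ltimes \overline{K_d})}{d}.
\]

Proof. From (3.2) it follows that
\[
\frac{\varphi(G \ltimes H)}{\varphi(H)} \ge \frac{\varphi(G \ltimes \overline{K_{\varphi(H)}})}{\varphi(H)}
\]
and so
\[
\frac{\varphi(G \ltimes H)}{\varphi(H)} \ge \inf_d \frac{\varphi(G \ltimes \overline{K_d})}{d}.
\]
We take the infimum over $H$ to get
\[
\inf_H \frac{\varphi(G \ltimes H)}{\varphi(H)} \ge \inf_d \frac{\varphi(G \ltimes \overline{K_d})}{d}.
\]
The inequality in the other direction,
\[
\inf_H \frac{\varphi(G \ltimes H)}{\varphi(H)} \le \inf_d \frac{\varphi(G \ltimes \overline{K_d})}{d},
\]
is trivially true.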

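Lemma 3.13. Let $\varphi : \mathcal{G} \to \mathbb{N}$ be $\leqslant$-monotone and satisfy
\[
\forall G, H \in \mathcal{G}: \quad \varphi(G \ltimes H) \ge \varphi(G \ltimes \overline{K_{\varphi(H)}}).
\]
Then $\varphi_f$ is $\ltimes$- and $\boxtimes$-supermultiplicative.

Proof. Let $A, B \in \mathcal{G}$. We have $A \boxtimes B \geqslant A \ltimes B$, so
\[
\varphi_f(A \boxtimes B) \ge \varphi_f(A \ltimes B).
\]
It remains to show $\varphi_f(A \ltimes B) \ge \varphi_f(A)\varphi_f(B)$. We have
\[
\frac{\varphi(A \ltimes B \ltimes H)}{\varphi(H)} = \frac{\varphi(A \ltimes (B \ltimes H))}{\varphi(B \ltimes H)} \cdot \frac{\varphi(B \ltimes H)}{\varphi(H)},
\]
which, by Lemma 3.12, implies
\[
\frac{\varphi(A \ltimes B \ltimes H)}{\varphi(H)} \ge \inf_{H'} \frac{\varphi(A \ltimes H')}{\varphi(H')} \cdot \inf_{H''} \frac{\varphi(B \ltimes H'')}{\varphi(H'')} = \varphi_f(A)\varphi_f(B).
\]
Take the infimum over $H$ to obtain $\varphi_f(A \ltimes B) \ge \varphi_f(A)\varphi_f(B)$.

Theorem 3.14. Let $\varphi : \mathcal{G} \to \mathbb{N}$ be $\sqcup$-additive, $\boxtimes$-submultiplicative, $\leqslant$-monotone and $\overline{K_n}$-normalised, and satisfy
\[
\forall G, H \in \mathcal{G}: \quad \varphi(G \ltimes H) \ge \varphi(G \ltimes \overline{K_{\varphi(H)}}).
\]
Then $\varphi_f$ is in $\mathbf{X}(\mathcal{G})$.

Proof. This follows from Lemma 3.11, Lemma 3.12 and Lemma 3.13.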

3.4 Conclusion

In this chapter we introduced a new connection between Strassen's theory of asymptotic spectra and the Shannon capacity of graphs. In particular, we characterised the Shannon capacity (which is defined as a supremum) as a minimisation over elements in the asymptotic spectrum of graphs. Known elements in the asymptotic spectrum of graphs include the fractional clique cover number, the Lovász theta number, the projective rank and the fractional Haemers bound. We are left with a clear goal for future work: find all elements in the asymptotic spectrum of graphs.

Chapter 4

The asymptotic spectrum of tensors; exponent of matrix multiplication

This chapter is based on joint work with Matthias Christandl and Péter Vrana [CVZ18].

4.1 Introduction

This chapter is about tensors $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ and their asymptotic properties. The theory of asymptotic spectra of Chapter 2 was developed by Strassen exactly for the purpose of understanding the asymptotic properties of tensors. This chapter is expository and provides the necessary background for understanding Chapter 5 and Chapter 6.

Let us first define the asymptotic properties of interest and discuss some of their applications. We need the concepts restriction, tensor product and diagonal tensor. Let $s \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ and $t \in \mathbb{F}^{m_1} \otimes \cdots \otimes \mathbb{F}^{m_k}$ be tensors. We say $s$ restricts to $t$, and write $s \geqslant t$, if there are linear maps $A_i : \mathbb{F}^{n_i} \to \mathbb{F}^{m_i}$ such that $t = (A_1 \otimes \cdots \otimes A_k) \cdot s$. The tensor product of $s$ and $t$ is the element $s \otimes t \in \mathbb{F}^{n_1 m_1} \otimes \cdots \otimes \mathbb{F}^{n_k m_k}$ with coordinates $(s \otimes t)_{ij} = s_i t_j$. We naturally define the direct sum $s \oplus t \in \mathbb{F}^{n_1 + m_1} \otimes \cdots \otimes \mathbb{F}^{n_k + m_k}$. We define the diagonal tensors $\langle n \rangle = \sum_{i=1}^{n} e_i \otimes \cdots \otimes e_i$ for $n \in \mathbb{N}$, where $e_1, \ldots, e_n$ is the standard basis of $\mathbb{F}^n$. The tensor rank $\mathrm{R}(t)$ is the smallest number $n \in \mathbb{N}$ such that $t$ can be written as a sum of $n$ simple tensors, a simple tensor being a tensor of the form $v_1 \otimes \cdots \otimes v_k$. Equivalently, $\mathrm{R}(t) = \min\{n \in \mathbb{N} : t \leqslant \langle n \rangle\}$. The asymptotic rank is the regularisation $\tilde{\mathrm{R}}(t) = \lim_{n \to \infty} \mathrm{R}(t^{\otimes n})^{1/n}$. While tensor rank is known to be hard to compute [Has90, Shi16], we do not know whether asymptotic rank is hard to compute.

The exponent of matrix multiplication

The motivating example for studying asymptotic rank is the problem of finding the exponent of matrix multiplication $\omega$. Recall from the introduction that $\omega$ is the infimum over $a \in \mathbb{R}$ such that two $n \times n$ matrices can be multiplied using $O(n^a)$ arithmetic operations (in the algebraic circuit model). It turns out (see [BCS97]) that $\omega$ is characterised by the asymptotic rank $\tilde{R}(\langle 2,2,2 \rangle)$ of the matrix multiplication tensor

\[ \langle 2,2,2 \rangle = \sum_{i,j,k \in [2]} e_{ij} \otimes e_{jk} \otimes e_{ki} \in \mathbb{F}^4 \otimes \mathbb{F}^4 \otimes \mathbb{F}^4. \]

Namely, $\tilde{R}(\langle 2,2,2 \rangle) = 2^{\omega}$. We know the trivial lower bound $2 \leq \omega$, see Section 4.3. We know the (non-trivial) upper bound $\omega \leq 2.3728639$, which is by Coppersmith and Winograd [CW90] and improvements by Stothers, Williams and Le Gall [Sto10, Wil12, LG14].
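To make the link between tensor rank and $\omega$ concrete, the following sketch implements Strassen's classical scheme for multiplying two $2 \times 2$ matrices with 7 multiplications instead of 8. The scheme itself is not spelled out in this chapter; it is the standard witness for $R(\langle 2,2,2 \rangle) \leq 7$, which gives $2^{\omega} = \tilde{R}(\langle 2,2,2 \rangle) \leq 7$ and hence $\omega \leq \log_2 7$. The function names are illustrative only.

```python
import math

def strassen_2x2(A, B):
    """Multiply two 2x2 matrices with 7 multiplications (Strassen's scheme)."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]

def naive_2x2(A, B):
    """Reference product using the usual 8 multiplications."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, 2], [3, 4]]
B = [[5, -6], [7, 8]]
same_product = strassen_2x2(A, B) == naive_2x2(A, B)

# Rank at most 7 yields the upper bound omega <= log2(7).
omega_upper_bound = math.log2(7)
```

Applying the scheme recursively to block matrices is what turns the rank bound into an operation count of $O(n^{\log_2 7})$.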

Asymptotic subrank and asymptotic restriction

Besides (asymptotic) rank, we naturally define subrank $Q(t) = \max\{m \in \mathbb{N} : \langle m \rangle \leqslant t\}$ and the asymptotic subrank $\tilde{Q}(t) = \lim_{n\to\infty} Q(t^{\otimes n})^{1/n}$. Moreover, we say $s$ restricts asymptotically to $t$, written $s \gtrsim t$, if there is a sequence of natural numbers $a(n) \in o(n)$ such that for all $n \in \mathbb{N}$

\[ s^{\otimes n} \otimes \langle 2 \rangle^{\otimes a(n)} \geqslant t^{\otimes n}. \]

One can prove (see [Str91]) that

\[ s^{\otimes n} \otimes \langle 2 \rangle^{\otimes o(n)} \geqslant t^{\otimes n} \quad\text{iff}\quad s^{\otimes (n+o(n))} \geqslant t^{\otimes n}. \]

Our goal is to understand asymptotic restriction, asymptotic rank and asymptotic subrank.

More connections: quantum information, combinatorics, algebraic property testing

Besides matrix multiplication, other applications of asymptotic restriction of tensors, asymptotic rank of tensors and asymptotic subrank of tensors include deciding the feasibility of an asymptotic transformation between pure quantum states via stochastic local operations and classical communication (SLOCC) in quantum information theory [BPR+00, DVC00, VDDMV02, HHHH09], bounding the size of combinatorial structures like cap sets and tri-colored sum-free sets in additive combinatorics [Ede04, Tao08, ASU13, CLP17, EG17, Tao16, BCC+17, KSS16, TS16] (see Chapter 5), and bounding the query complexity of certain properties in algebraic property testing [KS08, BCSX10, Sha09, BX15, HX17, FK14].

This chapter is organised as follows. In Section 4.2 we briefly discuss the semiring of tensors, the asymptotic spectrum of tensors, and asymptotic rank and subrank. In Section 4.3 we discuss the gauge points, a simple construction of finitely many elements in the asymptotic spectrum of tensors. In Section 4.4 we discuss the Strassen support functionals, a family of elements in the asymptotic spectrum of "oblique" tensors. This family is parametrised by probability distributions on $[k]$. In Section 4.5 we discuss an extension of the support functionals called the Strassen upper support functionals, which have the potential to be universal. Finally, in Section 4.6 we prove a new result: we show how asymptotic slice rank is related to the support functionals.

4.2 The asymptotic spectrum of tensors

Let us properly set up the semiring of tensors and the asymptotic spectrum. For the proofs we refer to [Str87, Str88, Str91].

4.2.1 The semiring of tensor equivalence classes $\mathcal{T}$

We begin by putting an equivalence relation on tensors. For example, we want to identify isomorphic tensors, and also, for any tensor $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$, we want to identify $t$ with $t \oplus 0$, where $0 \in \mathbb{F}^{m_1} \otimes \cdots \otimes \mathbb{F}^{m_k}$ is a zero tensor of any format.

We say $s$ is isomorphic to $t$, and write $s \cong t$, if there are bijective linear maps $A_i \colon \mathbb{F}^{m_i} \to \mathbb{F}^{n_i}$ such that $t = (A_1 \otimes \cdots \otimes A_k) \cdot s$.

We say $s$ and $t$ are equivalent, and write $s \sim t$, if there are zero tensors $s_0 = 0 \in \mathbb{F}^{a_1} \otimes \cdots \otimes \mathbb{F}^{a_k}$ and $t_0 = 0 \in \mathbb{F}^{b_1} \otimes \cdots \otimes \mathbb{F}^{b_k}$ such that $s \oplus s_0 \cong t \oplus t_0$. The equivalence relation $\sim$ is in fact the equivalence relation generated by the restriction preorder $\leqslant$.

Let $\mathcal{T}$ be the set of $\sim$-equivalence classes of $k$-tensors over $\mathbb{F}$, for some fixed $k$ and field $\mathbb{F}$. The direct sum and the tensor product naturally carry over to $\mathcal{T}$, and $\mathcal{T}$ becomes a semiring with additive unit $\langle 0 \rangle$ and multiplicative unit $\langle 1 \rangle$ (more precisely, the equivalence classes of those tensors, but we will not make this distinction).

4.2.2 Strassen preorder via restriction

Restriction $\leqslant$ induces a partial order on $\mathcal{T}$ which behaves well with respect to the semiring operations, and naturally $n \leq m$ if and only if $\langle n \rangle \leqslant \langle m \rangle$. Therefore, restriction $\leqslant$ is a Strassen preorder on $\mathcal{T}$.

4.2.3 The asymptotic spectrum of tensors $X(\mathcal{T})$

Let $S \subseteq \mathcal{T}$ be a subsemiring. Let

\[ X(S) = X(S, \leqslant) = \{\varphi \in \mathrm{Hom}(S, \mathbb{R}_{\geq 0}) : \forall a, b \in S,\ a \leqslant b \Rightarrow \varphi(a) \leq \varphi(b)\}. \]

We call $X(S)$ the asymptotic spectrum of $S$, and we call $X(\mathcal{T})$ the asymptotic spectrum of $k$-tensors over $\mathbb{F}$.

Theorem 4.1 ([Str88]). Let $s, t \in S$. Then $s \lesssim t$ iff $\forall \varphi \in X(S)$ $\varphi(s) \leq \varphi(t)$.

Proof. This follows from Theorem 2.12.

We refer to Chapter 2 for a discussion of the topological properties of X(S)

Remark 4.2. We mention that $X(S)$ may equivalently be defined with degeneration $\trianglerighteq$ instead of restriction $\geqslant$. Over $\mathbb{C}$, we say $f$ degenerates to $g$, written $f \trianglerighteq g$, if $f \cong f'$ and $g \cong g'$ and $g'$ is in the Euclidean closure (or equivalently Zariski closure) of the orbit $\mathrm{GL}_{n_1} \times \cdots \times \mathrm{GL}_{n_k} \cdot f'$. It is a nontrivial fact from algebraic geometry (see [Kra84, Lemma III.2.3.1] or [BCS97]) that there is a degeneration $f \trianglerighteq g$ if and only if there are matrices $A_i$ with entries polynomial in $\varepsilon$ such that $(A_1 \otimes \cdots \otimes A_k) \cdot f = \varepsilon^{d} g + \varepsilon^{d+1} g_1 + \cdots + \varepsilon^{d+e} g_e$ for some elements $g_1, \ldots, g_e$. The latter definition of degeneration is valid when $\mathbb{C}$ is replaced by an arbitrary field $\mathbb{F}$, and that is how degeneration is defined for an arbitrary field. Degeneration is weaker than restriction: $f \geqslant g$ implies $f \trianglerighteq g$. Asymptotically, however, the notions coincide: $f \gtrsim g$ if and only if $f^{\otimes n} \otimes \langle 2 \rangle^{\otimes o(n)} \trianglerighteq g^{\otimes n}$. We mention that, analogous to restriction, degeneration gives rise to border rank and border subrank, $\underline{R}(f) = \min\{r \in \mathbb{N} : f \trianglelefteq \langle r \rangle\}$ and $\underline{Q}(f) = \max\{s \in \mathbb{N} : \langle s \rangle \trianglelefteq f\}$, respectively.

4.2.4 Asymptotic rank and asymptotic subrank

The abstract theory of asymptotic spectra characterises asymptotic subrank and asymptotic rank as follows.

Corollary 4.3. Let $S \subseteq \mathcal{T}$ be a subsemiring. Let $a \in S$. Then

\[ \tilde{Q}(a) = \min_{\varphi \in X(S)} \varphi(a), \tag{4.1} \]

\[ \tilde{R}(a) = \max_{\varphi \in X(S)} \varphi(a). \tag{4.2} \]

Proof. Statement (4.2) follows from Corollary 2.13, since either $a = 0$ or $a \geqslant 1$. For statement (4.1), if $a^{\otimes k} \geqslant 2$ for some $k \in \mathbb{N}$, then we apply Corollary 2.14. Otherwise, one can show that $\tilde{Q}(a)$ equals 0 or 1, using the gauge points of the next section (see [Str88, Lemma 3.7]).

Remark 4.4. One verifies that $\tilde{R}$ and $\tilde{Q}$ are $\leqslant$-monotones and have value $n$ on $\langle n \rangle$. They are not universal spectral points, however. Namely, the asymptotic rank of each of the three tensors

\[ \langle 2,1,1 \rangle = e_1 \otimes e_1 \otimes 1 + e_2 \otimes e_2 \otimes 1 \in \mathbb{F}^2 \otimes \mathbb{F}^2 \otimes \mathbb{F}^1 \]
\[ \langle 1,1,2 \rangle = e_1 \otimes 1 \otimes e_1 + e_2 \otimes 1 \otimes e_2 \in \mathbb{F}^2 \otimes \mathbb{F}^1 \otimes \mathbb{F}^2 \]
\[ \langle 1,2,2 \rangle = 1 \otimes e_1 \otimes e_1 + 1 \otimes e_2 \otimes e_2 \in \mathbb{F}^1 \otimes \mathbb{F}^2 \otimes \mathbb{F}^2 \]

equals 2, whereas their tensor product equals the matrix multiplication tensor $\langle 2,2,2 \rangle$, whose tensor rank equals 7 and whose asymptotic rank is thus at most 7, i.e. strictly smaller than $2^3$. Therefore, asymptotic rank is not multiplicative. On the other hand, the asymptotic subrank of each of the above three tensors equals 1, whereas the asymptotic subrank of $\langle 2,2,2 \rangle$ equals 4, see Chapter 5. Therefore, asymptotic subrank is not multiplicative.

Goal 4.5. Our goal is now to explicitly describe elements in $X(\mathcal{T})$ (universal spectral points) or, more modestly, to describe elements in $X(S)$ for interesting subsemirings $S \subseteq \mathcal{T}$.

Strassen constructed a finite family of elements in $X(\mathcal{T})$, the gauge points, and an infinite family of elements in $X(\{\text{oblique tensors}\})$, the support functionals. The support functionals are powerful enough to determine the asymptotic subrank of any "tight tensor". Tight tensors are discussed in Chapter 5. In Chapter 6 we construct an infinite family in $X(\{k\text{-tensors over } \mathbb{C}\})$, the quantum functionals. In the rest of this chapter we discuss the gauge points and the support functionals. We will focus on the case $k = 3$ for clarity of exposition.

4.3 Gauge points $\zeta^{(i)}$

Strassen in [Str88] introduced a finite family of elements in $X(\mathcal{T})$ called the gauge points. We focus on 3-tensors, but the construction generalises immediately to $k$-tensors. Let $V_i = \mathbb{F}^{n_i}$. Let $t \in V_1 \otimes V_2 \otimes V_3$. Let $i \in [3]$. Let $\mathrm{flatten}_i(t)$ be the image of $t$ under the grouping $V_1 \otimes V_2 \otimes V_3 \to V_i \otimes (\bigotimes_{j \neq i} V_j)$. We think of $\mathrm{flatten}_i(t)$ as a matrix. Let $\zeta^{(i)} \colon \mathcal{T} \to \mathbb{N} \colon t \mapsto \mathrm{rank}(\mathrm{flatten}_i(t))$, with rank denoting matrix rank. We call $\zeta^{(1)}, \zeta^{(2)}, \zeta^{(3)}$ the gauge points. From the properties of matrix rank follows directly that $\zeta^{(i)}$ is multiplicative under $\otimes$, additive under $\oplus$, monotone under restriction $\leqslant$ (and under degeneration $\trianglelefteq$), and normalised to 1 on $\langle 1 \rangle = e_1 \otimes e_1 \otimes e_1$.
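As an illustration of the gauge points, the sketch below computes the three flattening ranks of a 3-tensor over the rationals. The sparse dictionary representation, the 0-based indexing, and the helper names (`matrix_rank`, `gauge_point`, `matmul_tensor`) are choices made for this example, not notation from the text.

```python
from fractions import Fraction
from itertools import product

def matrix_rank(rows):
    """Rank of a matrix (list of rows) over the rationals, by Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    ncols = len(m[0]) if m else 0
    rank = 0
    for col in range(ncols):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

def gauge_point(t, dims, i):
    """Rank of the flattening of t along leg i (0-based).
    t maps 0-based index triples to coefficients; dims are the leg dimensions."""
    j, k = [leg for leg in range(3) if leg != i]
    rows = []
    for a in range(dims[i]):
        row = []
        for b, c in product(range(dims[j]), range(dims[k])):
            idx = [0, 0, 0]
            idx[i], idx[j], idx[k] = a, b, c
            row.append(t.get(tuple(idx), 0))
        rows.append(row)
    return matrix_rank(rows)

def matmul_tensor(n):
    """<n,n,n> = sum of e_ij (x) e_jk (x) e_ki, with e_ij stored at index i*n + j."""
    return {(i * n + j, j * n + k, k * n + i): 1
            for i in range(n) for j in range(n) for k in range(n)}

mm_gauges = [gauge_point(matmul_tensor(2), (4, 4, 4), i) for i in range(3)]

# W = e1(x)e1(x)e2 + e1(x)e2(x)e1 + e2(x)e1(x)e1, written with 0-based indices.
W = {(0, 0, 1): 1, (0, 1, 0): 1, (1, 0, 0): 1}
w_gauges = [gauge_point(W, (2, 2, 2), i) for i in range(3)]
```

For $\langle 2,2,2 \rangle$ all three flattening ranks come out as 4, which is the value used in the comparison with $2^{\omega}$ later in this section.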

Theorem 4.6. $\zeta^{(1)}, \zeta^{(2)}, \zeta^{(3)} \in X(\mathcal{T})$.

Recall $\tilde{Q}(t) \leq \varphi(t) \leq \tilde{R}(t)$ for $\varphi \in X(\mathcal{T})$. In particular, $\max_i \zeta^{(i)}(t) \leq \tilde{R}(t)$. We do not know whether $\max_{i \in [3]} \zeta^{(i)}$ equals $\tilde{R}$. To be precise: we do not know any $t$ for which $\max_i \zeta^{(i)}(t) < \tilde{R}(t)$, and we do not know a proof that $\max_i \zeta^{(i)}(t) = \tilde{R}(t)$ for all $t$. There are various families of tensors $t$ for which $\max_i \zeta^{(i)}(t) = \tilde{R}(t)$ is proven. We will see such a family in Section 5.4.2. For the matrix multiplication tensor $\langle 2,2,2 \rangle$ we have $4 = \max_i \zeta^{(i)}(\langle 2,2,2 \rangle) \leq 2^{\omega}$, so $\max_i \zeta^{(i)}(t) = \tilde{R}(t)$ would imply that the matrix multiplication exponent $\omega$ equals 2.

On the other hand, $\tilde{Q}(t) \leq \min_i \zeta^{(i)}(t)$. There exist $t$ for which $\tilde{Q}(t)$ is strictly smaller than $\min_{i \in [3]} \zeta^{(i)}(t)$. To show this strict inequality we need another technique of Strassen: the support functionals. The support functionals are the topic of the next section.

4.4 Support functionals $\zeta^{\theta}$

Strassen in [Str91] constructed an infinite family of elements in the asymptotic spectrum of oblique $k$-tensors called the support functionals. In this section we explain the construction of the support functionals. The support functionals provide the benchmark for our new quantum functionals (Chapter 6) and are relevant in the context of combinatorial problems like the cap set problem (Section 5.4.2). For clarity of exposition we focus on 3-tensors. The ideas extend directly to $k$-tensors.

Oblique tensors are tensors for which, in some basis, the support has the following special structure. Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$. Let $e_1, \ldots, e_{n_i}$ be the standard basis of $\mathbb{F}^{n_i}$. Write $t = \sum_{i,j,k} t_{ijk}\, e_i \otimes e_j \otimes e_k$. Let $[n_i] = \{1, 2, \ldots, n_i\}$. Let $\mathrm{supp}(t) = \{(i,j,k) : t_{ijk} \neq 0\} \subseteq [n_1] \times [n_2] \times [n_3]$ be the support of $t$ with respect to the standard basis. Let $[n_i]$ have the natural ordering $1 < 2 < \cdots < n_i$ and let $[n_1] \times [n_2] \times [n_3]$ have the product order, denoted by $\leq$. That is, $x \leq y$ if $x_i \leq y_i$ holds for all $i \in [3]$. We call $\mathrm{supp}(t)$ oblique if $\mathrm{supp}(t)$ is an antichain with respect to $\leq$, i.e. if any two elements in $\mathrm{supp}(t)$ are incomparable with respect to $\leq$. We call a tensor $t$ oblique if $\mathrm{supp}(g \cdot t)$ is oblique for some group element $g \in G(t) = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$. The family of oblique tensors is a semiring under $\oplus$ and $\otimes$.
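The antichain condition on a support is easy to test directly. The sketch below (helper names are hypothetical) checks it for two small supports and illustrates that obliqueness of a support depends on the chosen basis order: the diagonal support is a chain in the standard order, but becomes an antichain after reversing the order of one leg, which corresponds to acting with a permutation matrix in $G(t)$.

```python
from itertools import combinations

def leq(x, y):
    """Product order on index triples: x <= y iff x_i <= y_i for every leg i."""
    return all(a <= b for a, b in zip(x, y))

def is_antichain(support):
    """True if no two distinct elements of the support are comparable."""
    return not any(leq(x, y) or leq(y, x)
                   for x, y in combinations(set(support), 2))

# Support of W = e1(x)e1(x)e2 + e1(x)e2(x)e1 + e2(x)e1(x)e1 (1-based indices).
supp_W = {(1, 1, 2), (1, 2, 1), (2, 1, 1)}

# Support of the diagonal tensor <2> in the standard basis: a chain.
supp_diag = {(1, 1, 1), (2, 2, 2)}

# Reversing the basis order on the third leg turns it into an antichain.
supp_diag_reordered = {(i, j, 3 - k) for (i, j, k) in supp_diag}

results = (is_antichain(supp_W),
           is_antichain(supp_diag),
           is_antichain(supp_diag_reordered))
```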

Not all tensors are oblique. Obliqueness is not a generic property (see Proposition 6.21). However, many tensors that are of interest in algebraic complexity theory are oblique, notably the matrix multiplication tensors

\[ \langle a,b,c \rangle = \sum_{i \in [a]} \sum_{j \in [b]} \sum_{k \in [c]} e_{ij} \otimes e_{jk} \otimes e_{ki} \in \mathbb{F}^{ab} \otimes \mathbb{F}^{bc} \otimes \mathbb{F}^{ca}. \]

For any finite set $X$, let $\mathcal{P}(X)$ be the set of all probability distributions on $X$. For any probability distribution $P \in \mathcal{P}(X)$, the Shannon entropy of $P$ is defined as $H(P) = -\sum_{x \in X} P(x) \log_2 P(x)$, with $0 \log_2 0$ understood as 0. Given finite sets $X_1, \ldots, X_k$ and a probability distribution $P \in \mathcal{P}(X_1 \times \cdots \times X_k)$ on the product set $X_1 \times \cdots \times X_k$, we denote the marginal distribution of $P$ on $X_i$ by $P_i$, that is, $P_i(a) = \sum_{x : x_i = a} P(x)$ for any $a \in X_i$.

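Definition 4.7. Let $\theta \in \Theta = \mathcal{P}([3])$. For $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3} \setminus 0$ with $\mathrm{supp}(t)$ oblique, define

\[ \zeta^{\theta}(t) = \max\{2^{\sum_{i=1}^{3} \theta(i) H(P_i)} : P \in \mathcal{P}(\mathrm{supp}(t))\}. \]

We call the $\zeta^{\theta}$ for $\theta \in \Theta$ the support functionals.

Theorem 4.8. $\zeta^{\theta} \in X(\{\text{oblique}\})$ for $\theta \in \Theta$.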
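The quantity inside the maximum of Definition 4.7 can be evaluated for any given distribution $P$ on a support. The sketch below does this for the uniform distribution on the standard-basis support of $\langle 2,2,2 \rangle$; the function names are illustrative. Every marginal of this $P$ is uniform on 4 points, so each $H(P_i) = 2$, which is the largest entropy possible on a 4-element set. Hence the uniform distribution attains the maximum over this support, with value $2^{2} = 4$ for every $\theta$. The sketch only evaluates the formula on one support; it does not search over bases or check obliqueness.

```python
import math
from collections import defaultdict

def entropy(probabilities):
    """Shannon entropy in bits, with 0*log2(0) read as 0."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

def marginals(P):
    """Marginal distributions P_1, P_2, P_3 of a distribution P on index triples."""
    margs = [defaultdict(float) for _ in range(3)]
    for x, px in P.items():
        for i in range(3):
            margs[i][x[i]] += px
    return margs

def H_theta(P, theta):
    """Weighted marginal entropy sum_i theta(i) * H(P_i)."""
    return sum(th * entropy(m.values()) for th, m in zip(theta, marginals(P)))

# Support of <2,2,2>, with e_ij stored at index 2*i + j (0-based).
support = [(2 * i + j, 2 * j + k, 2 * k + i)
           for i in range(2) for j in range(2) for k in range(2)]
uniform = {x: 1 / len(support) for x in support}

theta = (0.2, 0.3, 0.5)
value = 2 ** H_theta(uniform, theta)
```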

We work towards the proof of Theorem 4.8. For $p \in [0,1]$ let $h(p)$ be the binary entropy function $h(p) = -p \log_2 p - (1-p) \log_2 (1-p)$, i.e. $h(p)$ is the Shannon entropy of the probability vector $(p, 1-p)$. The following properties of the Shannon entropy are well-known.

Lemma 4.9.

(i) $H(P \otimes Q) = H(P) + H(Q)$ for $P \in \mathcal{P}(X_1)$, $Q \in \mathcal{P}(X_2)$.

(ii) $H(P) \leq H(P_1) + H(P_2)$ for $P \in \mathcal{P}(X_1 \times X_2)$.

(iii) $H(pP \oplus (1-p)Q) = pH(P) + (1-p)H(Q) + h(p)$ for $P, Q \in \mathcal{P}(X)$, $p \in [0,1]$.

(iv) $2^{a} + 2^{b} = \max_{0 \leq p \leq 1} 2^{pa + (1-p)b + h(p)}$ for $a, b \in \mathbb{R}$.
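Statement (iv) of Lemma 4.9 is the identity that later turns a maximum over mixtures into a sum, so a numerical sanity check may be helpful. The sketch below assumes the standard fact that the maximum is attained at $p^{*} = 2^{a}/(2^{a}+2^{b})$ (this closed form is not stated in the text) and compares it with a grid search; the grid resolution is an arbitrary choice.

```python
import math

def h(p):
    """Binary entropy in bits, with h(0) = h(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def objective(p, a, b):
    """The expression maximised in Lemma 4.9(iv)."""
    return 2 ** (p * a + (1 - p) * b + h(p))

def grid_max(a, b, steps=10000):
    """Approximate the maximum over p in [0, 1] on a uniform grid."""
    return max(objective(k / steps, a, b) for k in range(steps + 1))

a, b = 1.5, -0.3
lhs = 2 ** a + 2 ** b
p_star = 2 ** a / (2 ** a + 2 ** b)   # assumed maximiser, see lead-in
at_p_star = objective(p_star, a, b)
approx_max = grid_max(a, b)
```

At $p^{*}$ the exponent simplifies to $\log_2(2^{a}+2^{b})$, which is why `at_p_star` agrees with `lhs` up to rounding.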

For $X \subseteq [n_1] \times [n_2] \times [n_3]$, let $X_{\leq} = \{y \in [n_1] \times [n_2] \times [n_3] : \exists x \in X,\ y \leq x\}$ be the downward closure of $X$. Let $\max(X) = \{y \in X : \forall x \in X,\ y \leq x \Rightarrow y = x\}$ be the maximal points of $X$ with respect to $\leq$. Let $S_n$ be the symmetric group of permutations of $[n]$. Then the product group $S_{n_1} \times S_{n_2} \times S_{n_3}$ acts naturally on $[n_1] \times [n_2] \times [n_3]$.

Lemma 4.10. Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$. For every $g \in G(t)$ there is a triple of permutations $w \in W(t) = S_{n_1} \times S_{n_2} \times S_{n_3}$ with $w \cdot \max(\mathrm{supp}(g \cdot t)) \subseteq \mathrm{supp}(t)_{\leq}$.

Proof. We prepare for the construction of $w$. Let $n \in \mathbb{N}$. Let $e_1, \ldots, e_n$ be the standard basis of $\mathbb{F}^n$. Let $g \in \mathrm{GL}_n$. Let $f_1, \ldots, f_n$ with $f_j = g \cdot e_j$ be the transformed basis of $\mathbb{F}^n$. Let $(E_i)_{i \in [n]}$ and $(F_j)_{j \in [n]}$ be the complete flags of $\mathbb{F}^n$ with

\[ E_i = \mathrm{Span}\{e_i, e_{i+1}, \ldots, e_n\}, \qquad F_j = \mathrm{Span}\{f_j, f_{j+1}, \ldots, f_n\}. \]

Define the map

\[ \pi \colon [n] \to [n] \colon j \mapsto \max\{i \in [n] : E_i \cap (f_j + F_{j+1}) \neq \emptyset\}. \tag{4.3} \]

We prove $\pi$ is injective. Let $j, k \in [n]$ with $j \leq k$ and suppose $i = \pi(j) = \pi(k)$. Let $\mathbb{F}^{\times} = \mathbb{F} \setminus 0$. From (4.3) follows

\[ (\mathbb{F}^{\times} e_i + E_{i+1}) \cap (f_j + F_{j+1}) \neq \emptyset \tag{4.4} \]
\[ E_{i+1} \cap (f_j + F_{j+1}) = \emptyset \tag{4.5} \]
\[ (\mathbb{F}^{\times} e_i + E_{i+1}) \cap (f_k + F_{k+1}) \neq \emptyset. \tag{4.6} \]

Suppose $j < k$. Then from (4.4) and (4.6) we obtain a contradiction to (4.5). We conclude that $j = k$. Thus $\pi$ is injective.

For each $\mathbb{F}^{n_i}$ define, as above, the standard complete flag $(E^i_j)_{j \in [n_i]}$ of $\mathbb{F}^{n_i}$, the complete flag $(F^i_j)_{j \in [n_i]}$ corresponding to the basis given by $g_i$, and the permutation $\pi_i \colon [n_i] \to [n_i]$. Let $w = (\pi_1, \pi_2, \pi_3) \in W(t)$.

We will prove $w \cdot \max(\mathrm{supp}(g \cdot t)) \subseteq \mathrm{supp}(t)_{\leq}$. Let $y \in \max(\mathrm{supp}(g \cdot t))$. Let $x = w \cdot y$. By construction of $\pi_i$, the intersection $E^i_{x_i} \cap (f^i_{y_i} + F^i_{y_i+1})$ is not empty. Choose

\[ \tilde{f}^i_{y_i} \in E^i_{x_i} \cap (f^i_{y_i} + F^i_{y_i+1}). \]

Let $t^{*}$ be the multilinear map $\mathbb{F}^{n_1} \times \mathbb{F}^{n_2} \times \mathbb{F}^{n_3} \to \mathbb{F}$ with $t^{*}(e_i, e_j, e_k) = t_{ijk}$ for all $i \in [n_1]$, $j \in [n_2]$, $k \in [n_3]$. Then

\[ t^{*}(\tilde{f}^1_{y_1}, \tilde{f}^2_{y_2}, \tilde{f}^3_{y_3}) = t^{*}(f^1_{y_1}, f^2_{y_2}, f^3_{y_3}) + \sum_{\substack{z \in [n_1] \times [n_2] \times [n_3] \\ z > y}} c_z\, t^{*}(f^1_{z_1}, f^2_{z_2}, f^3_{z_3}) \tag{4.7} \]

for some $c_z \in \mathbb{F}$. Since $y$ is maximal in $\mathrm{supp}(g \cdot t)$, the sum over $z > y$ in (4.7) equals zero. We conclude $t^{*}(\tilde{f}^1_{y_1}, \tilde{f}^2_{y_2}, \tilde{f}^3_{y_3}) = t^{*}(f^1_{y_1}, f^2_{y_2}, f^3_{y_3}) \neq 0$. Thus $t^{*}(E^1_{x_1} \times E^2_{x_2} \times E^3_{x_3})$ is not zero, and thus $x \in \mathrm{supp}(t)_{\leq}$.

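Proof of Theorem 4.8. We prove $\zeta^{\theta}$ on oblique tensors is $\otimes$-multiplicative, $\oplus$-additive, $\leqslant$-monotone and normalised to 1 on $\langle 1 \rangle = e_1 \otimes e_1 \otimes e_1$. The normalisation $\zeta^{\theta}(\langle 1 \rangle) = 1$ is clear.

We prove $\zeta^{\theta}$ is $\otimes$-supermultiplicative. Let $s \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$ and let $t \in \mathbb{F}^{m_1} \otimes \mathbb{F}^{m_2} \otimes \mathbb{F}^{m_3}$. Let $P \in \mathcal{P}(\mathrm{supp}(s))$ and $Q \in \mathcal{P}(\mathrm{supp}(t))$. Then the product $P \otimes Q \in \mathcal{P}(\mathrm{supp}(s \otimes t))$ has marginals $P_i \otimes Q_i$. Since $H(P_i \otimes Q_i) = H(P_i) + H(Q_i)$ (Lemma 4.9(i)), we conclude $\zeta^{\theta}(s)\zeta^{\theta}(t) \leq \zeta^{\theta}(s \otimes t)$.

We prove $\zeta^{\theta}$ is $\otimes$-submultiplicative. For $P \in \mathcal{P}(\mathrm{supp}(t))$ and $\theta \in \Theta$ we use the notation $H_{\theta}(P) = \sum_{i=1}^{3} \theta(i) H(P_i)$. We naturally identify $\mathrm{supp}(s \otimes t)$ with a subset of $[n_1] \times [n_2] \times [n_3] \times [m_1] \times [m_2] \times [m_3]$. Let $P \in \mathcal{P}(\mathrm{supp}(s \otimes t))$. Let $P_{[3]}$ be the marginal distribution of $P$ on $[n_1] \times [n_2] \times [n_3]$ and let $P_{3+[3]}$ be the marginal distribution of $P$ on $[m_1] \times [m_2] \times [m_3]$. Then $H_{\theta}(P) \leq H_{\theta}(P_{[3]}) + H_{\theta}(P_{3+[3]})$ by Lemma 4.9(ii). We conclude $\zeta^{\theta}(s \otimes t) \leq \zeta^{\theta}(s)\zeta^{\theta}(t)$.

We prove $\zeta^{\theta}$ is $\oplus$-additive. By definition,

\[ \zeta^{\theta}(s \oplus t) = \max\{2^{H_{\theta}(P)} : P \in \mathcal{P}(\mathrm{supp}(s \oplus t))\} = \max\Bigl\{\max_{0 \leq p \leq 1} 2^{H_{\theta}(pP \oplus (1-p)Q)} : P \in \mathcal{P}(\mathrm{supp}(s)),\ Q \in \mathcal{P}(\mathrm{supp}(t))\Bigr\}. \]

From Lemma 4.9(iii) and (iv) follows

\[ \max\Bigl\{\max_{0 \leq p \leq 1} 2^{H_{\theta}(pP \oplus (1-p)Q)} : P \in \mathcal{P}(\mathrm{supp}(s)),\ Q \in \mathcal{P}(\mathrm{supp}(t))\Bigr\} \]
\[ = \max\Bigl\{\max_{0 \leq p \leq 1} 2^{pH_{\theta}(P) + (1-p)H_{\theta}(Q) + h(p)} : P \in \mathcal{P}(\mathrm{supp}(s)),\ Q \in \mathcal{P}(\mathrm{supp}(t))\Bigr\} \]
\[ = \max\bigl\{2^{H_{\theta}(P)} + 2^{H_{\theta}(Q)} : P \in \mathcal{P}(\mathrm{supp}(s)),\ Q \in \mathcal{P}(\mathrm{supp}(t))\bigr\} \]
\[ = \zeta^{\theta}(s) + \zeta^{\theta}(t). \]

We conclude $\zeta^{\theta}(s \oplus t) = \zeta^{\theta}(s) + \zeta^{\theta}(t)$.

We prove $\zeta^{\theta}$ is $\leqslant$-monotone. Let $s \leqslant t$ with $\mathrm{supp}(s)$ and $\mathrm{supp}(t)$ oblique. Then there are linear maps $A_i$ with $s = (A_1 \otimes A_2 \otimes A_3) \cdot t$. If $A_1, A_2, A_3$ are of the form $\mathrm{diag}(1, \ldots, 1, 0, \ldots, 0)$, then $\zeta^{\theta}(s) \leq \zeta^{\theta}(t)$. Suppose $g = (A_1, A_2, A_3) \in G(t)$. Let $P \in \mathcal{P}(\mathrm{supp}(t))$ maximise $H_{\theta}$ on $\mathcal{P}(\mathrm{supp}(t))$. Let $\sigma \in W$ such that $\sigma \cdot P$ has non-increasing marginals. Then $H_{\theta}(\sigma \cdot P) = H_{\theta}(P)$ and $\sigma \cdot P$ maximises $H_{\theta}$ on $\mathcal{P}(\mathrm{supp}(\sigma \cdot t))$. Then $\sigma \cdot P$ maximises $H_{\theta}$ on $\mathcal{P}(\mathrm{supp}(\sigma \cdot t)_{\leq})$ by Lemma 4.12 below. Let $Q \in \mathcal{P}(\mathrm{supp}(g \cdot t))$ maximise $H_{\theta}$ on $\mathcal{P}(\mathrm{supp}(g \cdot t))$. By Lemma 4.10 there is a $w \in W$ with $w \cdot \mathrm{supp}(g \cdot t) \subseteq \mathrm{supp}(\sigma \cdot t)_{\leq}$. Then $H_{\theta}(w \cdot Q) = H_{\theta}(Q) \leq H_{\theta}(\sigma \cdot P) = H_{\theta}(P)$. Thus $\max_{P \in \mathcal{P}(\mathrm{supp}(g \cdot t))} H_{\theta}(P) \leq \max_{P \in \mathcal{P}(\mathrm{supp}(t))} H_{\theta}(P)$. We conclude $\zeta^{\theta}(g \cdot t) \leq \zeta^{\theta}(t)$.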

The following two lemmas finish the above proof of Theorem 4.8. Recall that in the proof we defined $H_{\theta}(P) = \sum_{i=1}^{3} \theta(i) H(P_i)$ for $\theta \in \Theta$.

Lemma 4.11 ([Str91, Prop. 2.1]). Let $\Phi \subseteq [n_1] \times [n_2] \times [n_3]$. Let $P \in \mathcal{P}(\Phi)$. Let $\mathrm{supp}(P)$ be the support $\{x \in \Phi : P(x) \neq 0\}$. For $x \in \Phi$ define $h_P(x) = -\sum_{i=1}^{3} \theta(i) \log_2 P_i(x_i)$. Then $P$ maximises $H_{\theta}$ on $\mathcal{P}(\Phi)$ if and only if

\[ \forall x \in \mathrm{supp}(P) \quad h_P(x) = \max_{y \in \Phi} h_P(y). \tag{4.8} \]

Proof. We write $H_{\theta}(P)$ in terms of $h_P$:

\[ H_{\theta}(P) = \sum_{i=1}^{3} \theta(i) H(P_i) = \sum_{x \in \mathrm{supp}(P)} P(x) h_P(x). \tag{4.9} \]

For $Q \in \mathcal{P}(\Phi)$,

\[ \lim_{\varepsilon \to 0^{+}} \frac{d}{d\varepsilon} H_{\theta}\bigl((1-\varepsilon)P + \varepsilon Q\bigr) = \lim_{\varepsilon \to 0^{+}} \frac{d}{d\varepsilon} \sum_{x} \bigl((1-\varepsilon)P(x) + \varepsilon Q(x)\bigr)\, h_{(1-\varepsilon)P + \varepsilon Q}(x) \]
\[ = \sum_{x} P(x) \Bigl(\sum_{i=1}^{3} \theta(i) \frac{P_i(x_i) - Q_i(x_i)}{P_i(x_i) \ln(2)}\Bigr) + \sum_{x} \bigl(-P(x) + Q(x)\bigr) h_P(x) \]
\[ = \sum_{x} Q(x) h_P(x) - \sum_{x} P(x) h_P(x). \]

Therefore, since $H_{\theta}$ is continuous and concave, $P$ maximises $H_{\theta}$ if and only if

\[ \forall Q \in \mathcal{P}(\Phi) \quad \sum_{x} Q(x) h_P(x) - \sum_{x} P(x) h_P(x) \leq 0. \tag{4.10} \]

We will prove (4.10) is equivalent to (4.8). Suppose $\sum_{x} Q(x) h_P(x) \leq \sum_{x} P(x) h_P(x)$ for every $Q \in \mathcal{P}(\Phi)$. In particular, $h_P(y) \leq \sum_{x} P(x) h_P(x)$ for every $y \in \Phi$, so $\max_{y \in \Phi} h_P(y) \leq \sum_{x} P(x) h_P(x)$. Then $\max_{y \in \Phi} h_P(y) = \sum_{x} P(x) h_P(x)$. We conclude $\max_{y \in \Phi} h_P(y) = h_P(x)$ for every $x \in \mathrm{supp}(P)$.

Suppose $\max_{y \in \Phi} h_P(y) = h_P(x)$ for every $x \in \mathrm{supp}(P)$. Then $h_P(y) \leq h_P(x)$ for every $Q \in \mathcal{P}(\Phi)$, $y \in \mathrm{supp}(Q)$, $x \in \mathrm{supp}(P)$. We conclude $\sum_{x} Q(x) h_P(x) \leq \sum_{x} P(x) h_P(x)$.

Lemma 4.12 ([Str91, Cor. 2.2]). Let $\Phi \subseteq [n_1] \times [n_2] \times [n_3]$. Let $P$ maximise $H_{\theta}$ on $\mathcal{P}(\Phi)$. Suppose $P_i$ is nonincreasing on $[n_i]$ for each $i \in [3]$. Then $P$ maximises $H_{\theta}$ on $\mathcal{P}(\Phi_{\leq})$, where $\Phi_{\leq}$ is the downward closure of $\Phi$ with respect to $\leq$.

Proof. We know $P$ satisfies (4.8). We will prove $P$ satisfies (4.8) with $\Phi$ replaced by $\Phi_{\leq}$. Then we are done by Lemma 4.11. Let $x \in \Phi_{\leq}$. Then $x \leq y$ for some $y \in \Phi$. Then $(P_1(x_1), P_2(x_2), P_3(x_3)) \geq (P_1(y_1), P_2(y_2), P_3(y_3))$, since each $P_i$ is nonincreasing. Then $h_P(x) \leq h_P(y)$. We conclude $\max_{\Phi_{\leq}} h_P \leq \max_{\Phi} h_P$. On the other hand, $\Phi \subseteq \Phi_{\leq}$. Therefore $\max_{\Phi} h_P \leq \max_{\Phi_{\leq}} h_P$.

Using the support functionals, Strassen managed to fully compute the asymptotic spectrum of several semirings generated by oblique tensors. We will see an example in Section 5.4.2.

4.5 Upper and lower support functionals $\zeta^{\theta}$, $\zeta_{\theta}$

In Section 4.4 we defined the support functionals $\zeta^{\theta} \colon \{\text{oblique}\} \to \mathbb{R}_{\geq 0}$ and proved that $\zeta^{\theta} \in X(\{\text{oblique}\})$. From the general theory of asymptotic spectra (Chapter 2) we know $\zeta^{\theta}$ is the restriction of some map $\varphi \colon \{\text{tensors}\} \to \mathbb{R}_{\geq 0}$ in $X(\mathcal{T})$. However, the proof of that fact was non-constructive. In other words, we know that $\zeta^{\theta}$ can be extended to an element of $X(\mathcal{T})$. In this short section we discuss a candidate extension proposed by Strassen, called the upper support functional. We also discuss a companion called the lower support functional.

For arbitrary $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$, the upper support functional and the lower support functional are defined as

\[ \zeta^{\theta}(t) = \min_{g \in G(t)} \max\{2^{H_{\theta}(P)} : P \in \mathcal{P}(\mathrm{supp}(g \cdot t))\} \]
\[ \zeta_{\theta}(t) = \max_{g \in G(t)} \max\{2^{H_{\theta}(P)} : P \in \mathcal{P}(\max(\mathrm{supp}(g \cdot t)))\} \]

with $G(t) = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$ and $H_{\theta}(P) = \sum_{i=1}^{3} \theta(i) H(P_i)$. We summarise the known properties of the upper and lower support functional.

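Theorem 4.13 ([Str91]). Let $s \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$ and $t \in \mathbb{F}^{m_1} \otimes \mathbb{F}^{m_2} \otimes \mathbb{F}^{m_3}$. Let $\theta \in \Theta = \mathcal{P}([3])$.

(i) $\zeta^{\theta}(\langle n \rangle) = n$ for $n \in \mathbb{N}$.

(ii) $\zeta^{\theta}(s \oplus t) = \zeta^{\theta}(s) + \zeta^{\theta}(t)$.

(iii) $\zeta^{\theta}(s \otimes t) \leq \zeta^{\theta}(s)\zeta^{\theta}(t)$.

(iv) If $s \geqslant t$, then $\zeta^{\theta}(s) \geq \zeta^{\theta}(t)$.

Theorem 4.14 ([Str91]). Let $s \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$ and $t \in \mathbb{F}^{m_1} \otimes \mathbb{F}^{m_2} \otimes \mathbb{F}^{m_3}$. Let $\theta \in \Theta$.

(i) $\zeta_{\theta}(\langle n \rangle) = n$ for $n \in \mathbb{N}$.

(ii) $\zeta_{\theta}(s \oplus t) \geq \zeta_{\theta}(s) + \zeta_{\theta}(t)$.

(iii) $\zeta_{\theta}(s \otimes t) \geq \zeta_{\theta}(s)\zeta_{\theta}(t)$.

(iv) If $s \geqslant t$, then $\zeta_{\theta}(s) \geq \zeta_{\theta}(t)$.

Theorem 4.15 ([Str91]). $\zeta^{\theta}(s \otimes t) \geq \zeta^{\theta}(s)\zeta_{\theta}(t)$ and $\zeta^{\theta}(t) \geq \zeta_{\theta}(t)$ for $\theta \in \Theta$.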

Regarding statement (ii) in Theorem 4.14, Bürgisser [Bur90] shows that the lower support functional $\zeta_{\theta}$ is not in general additive under the direct sum when $\theta_i > 0$ for all $i$. See also [Str91, Comment (iii)]. In particular, this implies that the upper support functional $\zeta^{\theta}(t)$ and the lower support functional $\zeta_{\theta}(t)$ are not equal in general, the upper support functional being additive. In fact, to show that the lower support functional is not additive, Bürgisser first shows that, when $\mathbb{F}$ is algebraically closed, the generic value of $\zeta_{\theta}$ on $\mathbb{F}^n \otimes \mathbb{F}^n \otimes \mathbb{F}^n$ equals $(1 - \min_i \theta_i) \log_2 n + o(n)$. On the other hand, Tobler [Tob91] shows that the generic value of $\zeta^{\theta}$ on $\mathbb{F}^n \otimes \mathbb{F}^n \otimes \mathbb{F}^n$ equals $\log_2 n$. So even generically, $\zeta^{\theta}$ and $\zeta_{\theta}$ are different on $\mathbb{F}^n \otimes \mathbb{F}^n \otimes \mathbb{F}^n$.

For $\theta \in \Theta$ we say $t$ is $\theta$-robust if $\zeta^{\theta}(t) = \zeta_{\theta}(t)$. We say $t$ is robust if $t$ is $\theta$-robust for all $\theta \in \Theta$. Let us try to understand what robust tensors look like. Since $\zeta^{\theta}(t) \geq \zeta_{\theta}(t)$ always holds (Theorem 4.15), a tensor $t$ is $\theta$-robust if and only if

\[ \zeta^{\theta}(t) \leq \zeta_{\theta}(t). \tag{4.11} \]

The set of $\theta$-robust tensors is closed under $\oplus$ and $\otimes$, since

\[ \zeta^{\theta}(s \oplus t) = \zeta^{\theta}(s) + \zeta^{\theta}(t) = \zeta_{\theta}(s) + \zeta_{\theta}(t) \leq \zeta_{\theta}(s \oplus t) \]

and

\[ \zeta^{\theta}(s \otimes t) \leq \zeta^{\theta}(s)\zeta^{\theta}(t) = \zeta_{\theta}(s)\zeta_{\theta}(t) \leq \zeta_{\theta}(s \otimes t). \]

For $X \subseteq [n_1] \times [n_2] \times [n_3]$ we use the notation $H_{\theta}(X) = \max_{P \in \mathcal{P}(X)} H_{\theta}(P)$. Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3} \setminus 0$. Equation (4.11) means that there are $g, h \in G(t)$ and $P \in \mathcal{P}(\max \mathrm{supp}(h \cdot t))$ such that $H_{\theta}(\mathrm{supp}(g \cdot t)) \leq H_{\theta}(P)$. In this case we have $\zeta^{\theta}(t) = \zeta_{\theta}(t) = 2^{H_{\theta}(P)}$. In particular, $t$ is $\theta$-robust if there is a $g \in G(t)$ such that the maximisation $H_{\theta}(\mathrm{supp}(g \cdot t))$ is attained by a $P \in \mathcal{P}(\max(\mathrm{supp}(g \cdot t)))$. This criterion is automatically satisfied for all $\theta$ when $\mathrm{supp}(g \cdot t) = \max(\mathrm{supp}(g \cdot t))$ for some $g \in G(t)$. Suppose $t$ is oblique. Then $\mathrm{supp}(g \cdot t)$ is an antichain for some $g \in G(t)$, and thus $\mathrm{supp}(g \cdot t) = \max \mathrm{supp}(g \cdot t)$. Then $t$ is robust and $\zeta^{\theta}(t) = \zeta_{\theta}(t) = 2^{H_{\theta}(\mathrm{supp}(g \cdot t))}$.

4.6 Asymptotic slice rank

Slice rank is a variation on tensor rank that was introduced by Terence Tao in [Tao16] to study cap sets. We will look at cap sets in Section 5.4. Here we study the relationship between asymptotic slice rank and the support functionals.

Consider the following characterisation of tensor rank. Let a simple tensor be any tensor of the form $v_1 \otimes v_2 \otimes v_3 \in V_1 \otimes V_2 \otimes V_3$ with $v_i \in V_i$ for $i \in [k]$ (here $k = 3$). Then the rank $R(t)$ of $t \in V_1 \otimes V_2 \otimes V_3$ is the smallest number $r$ such that $t$ can be written as a sum of $r$ simple tensors.

Slice rank is defined similarly, but with simple tensors replaced by slices. For $S \subseteq [k]$ let $V_S = \bigotimes_{i \in S} V_i$. For $j \in [k]$ let $\bar{j} = [k] \setminus \{j\}$. A tensor in $V_1 \otimes V_2 \otimes V_3$ is called a slice if it is of the form $v \otimes w$ with $v \in V_j$ and $w \in V_{\bar{j}}$ for some $j \in [k]$ (under the natural reordering of the tensor legs). Let $t \in V_1 \otimes V_2 \otimes V_3$. The slice rank of $t$, denoted by $\mathrm{SR}(t)$, is the smallest number $r$ such that $t$ can be written as a sum of $r$ slices. For example, the tensor

\[ W = e_1 \otimes e_1 \otimes e_2 + e_1 \otimes e_2 \otimes e_1 + e_2 \otimes e_1 \otimes e_1 \in \mathbb{F}^2 \otimes \mathbb{F}^2 \otimes \mathbb{F}^2 \tag{4.12} \]

has slice rank 2, since we can write $W = e_1 \otimes (e_1 \otimes e_2 + e_2 \otimes e_1) + e_2 \otimes e_1 \otimes e_1$. In fact, the slice rank of any element in $V_1 \otimes V_2 \otimes V_3$ is at most $\min_i \dim V_i$. The tensor rank of $W$, on the other hand, is known to be 3.
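The two-slice decomposition of $W$ can be checked mechanically. In the sketch below, tensors are sparse dictionaries on 1-based index triples, and a slice is built from a vector on leg $j$ and a matrix on the remaining two legs; the representation and function names are assumptions of this example. The check only confirms the upper bound $\mathrm{SR}(W) \leq 2$.

```python
from collections import defaultdict

def slice_tensor(j, v, w):
    """Build the slice v (x) w, where v lives on leg j (0-based) and w on the
    other two legs, listed in increasing leg order.
    v: dict index -> coeff; w: dict (index, index) -> coeff."""
    t = defaultdict(int)
    for a, va in v.items():
        for (b, c), wbc in w.items():
            idx = [b, c]
            idx.insert(j, a)          # put the leg-j index back in position j
            t[tuple(idx)] += va * wbc
    return t

def add_tensors(*tensors):
    """Entrywise sum of sparse tensors, dropping zero coefficients."""
    total = defaultdict(int)
    for t in tensors:
        for idx, coeff in t.items():
            total[idx] += coeff
    return {idx: c for idx, c in total.items() if c != 0}

# W = e1(x)e1(x)e2 + e1(x)e2(x)e1 + e2(x)e1(x)e1, as in (4.12).
W = {(1, 1, 2): 1, (1, 2, 1): 1, (2, 1, 1): 1}

# First slice: e1 on leg 1, tensored with e1(x)e2 + e2(x)e1 on legs 2 and 3.
slice_a = slice_tensor(0, {1: 1}, {(1, 2): 1, (2, 1): 1})
# Second slice: e2 on leg 1, tensored with e1(x)e1 on legs 2 and 3.
slice_b = slice_tensor(0, {2: 1}, {(1, 1): 1})

decomposition_ok = add_tensors(slice_a, slice_b) == W
```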

Slice rank is clearly monotone under restriction. The slice rank of the diagonal tensor $\langle r \rangle$ equals $r$ [Tao16]. It follows that subrank is at most slice rank,

\[ Q(t) \leq \mathrm{SR}(t). \]

The motivation for the introduction of slice rank in [Tao16] was finding upper bounds on subrank $Q(t)$ and asymptotic subrank $\tilde{Q}(t)$.

The main result of this section is the following theorem. Recall that a tensor $t$ is oblique if the support $\mathrm{supp}(g \cdot t)$ is an antichain for some $g \in G(t)$.

Theorem 4.16. Let $t$ be oblique. Then

\[ \lim_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n} = \min_{\theta \in \mathcal{P}([3])} \zeta^{\theta}(t). \]

Our proof of Theorem 4.16 is based on a proof of Tao and Sawin in [TS16] and discussions of the author with Dion Gijswijt. The explicit connection between asymptotic slice rank and the support functionals is new.

We use Theorem 4.16, before giving its proof, to see that $\mathrm{SR}$ is not submultiplicative and not supermultiplicative under the tensor product $\otimes$. In particular, we cannot use Fekete's lemma (Lemma 2.2) to prove that the limit $\lim_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n}$ exists. Thus the existence of the limit is a non-trivial consequence of Theorem 4.16.

Let $W$ be as in (4.12). Then $\mathrm{SR}(W) = 2$. We have $\zeta^{(1/3,1/3,1/3)}(W) = 2^{h(1/3)} < 2$. From Theorem 4.16 follows $\mathrm{SR}(W^{\otimes n}) \leq 2^{n h(1/3) + o(n)}$. We conclude $\mathrm{SR}(W^{\otimes n}) < 2^{n}$ for $n$ large enough. We conclude $\mathrm{SR}$ is not supermultiplicative. Now it is also clear that slice rank is not the same as (border) subrank, since (border) subrank is supermultiplicative.
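The value $2^{h(1/3)}$ can be reproduced numerically. A distribution on $\mathrm{supp}(W)$ is given by three masses $a, b, c$ on $(1,1,2)$, $(1,2,1)$, $(2,1,1)$, and its marginals are $(1-c, c)$, $(1-b, b)$, $(1-a, a)$, so for $\theta = (1/3,1/3,1/3)$ the weighted entropy is $(h(a)+h(b)+h(c))/3$. The sketch below, a rough check with an arbitrary grid resolution, searches the simplex and compares with the closed form $2^{h(1/3)} = 3 \cdot 2^{-2/3}$.

```python
import math

def h(p):
    """Binary entropy in bits, with h(0) = h(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def H_uniform_theta(a, b, c):
    """H_theta(P) for theta = (1/3, 1/3, 1/3) and P = (a, b, c) on supp(W)."""
    return (h(a) + h(b) + h(c)) / 3

# Grid search over the simplex a + b + c = 1; steps is divisible by 3 so the
# uniform distribution lies on the grid.
steps = 60
best = max(H_uniform_theta(i / steps, j / steps, (steps - i - j) / steps)
           for i in range(steps + 1) for j in range(steps + 1 - i))

zeta_W = 2 ** best                   # approximately 1.8899
closed_form = 3 / 2 ** (2 / 3)       # 2^{h(1/3)} = 2^{log2(3) - 2/3}
```

By concavity of $h$, the maximum is attained at $a = b = c = 1/3$, which is why the grid search and the closed form agree.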

Next, the tensors $\sum_{i=1}^{n} e_i \otimes e_i \otimes 1$, $\sum_{i=1}^{n} e_i \otimes 1 \otimes e_i$, $\sum_{i=1}^{n} 1 \otimes e_i \otimes e_i$ have slice rank one, while their tensor product equals the matrix multiplication tensor $\langle n,n,n \rangle$, which has slice rank $n^2$ by Theorem 4.16 and Theorem 5.3 in the next chapter applied to the tight tensor $\langle n,n,n \rangle$. We conclude $\mathrm{SR}$ is not submultiplicative.

Slice rank and hitting set number

We study the hitting set number of the support of a tensor. Let $\Phi \subseteq [n_1] \times [n_2] \times [n_3]$. A hitting set for $\Phi$ is a 3-tuple of sets $A_1 \subseteq [n_1]$, $A_2 \subseteq [n_2]$, $A_3 \subseteq [n_3]$ such that for every $a \in \Phi$ there is an $i \in [3]$ with $a_i \in A_i$. We may think of $\Phi$ as a 3-partite 3-uniform hypergraph. Then the definition of hitting set says every edge $a \in \Phi$ is hit by an element of some $A_i$. A hitting set is also called a vertex cover (every edge being covered by some vertex) or a transversal. The size of the hitting set $(A_1, A_2, A_3)$ is $|A_1| + |A_2| + |A_3|$. The hitting set number $\tau(\Phi)$ is the size of the smallest hitting set for $\Phi$. Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$.
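For small supports the hitting set number can be found by exhaustive search. The following brute-force sketch (exponential time, intended only for tiny examples; the function name and the encoding of a vertex as a pair (leg, value) are choices made here) computes $\tau$ for the support of $W$ and for a diagonal support.

```python
from itertools import combinations

def hitting_set_number(phi, dims):
    """Smallest |A1| + |A2| + |A3| such that every a in phi has a_i in A_i
    for some leg i. Indices are 1-based; dims gives (n1, n2, n3)."""
    vertices = [(i, v) for i in range(3) for v in range(1, dims[i] + 1)]
    for size in range(len(vertices) + 1):
        for chosen in combinations(vertices, size):
            chosen = set(chosen)
            if all(any((i, a[i]) in chosen for i in range(3)) for a in phi):
                return size
    return len(vertices)  # not reached: choosing all vertices hits every edge

# Support of W from (4.12): no single vertex hits all three edges.
supp_W = {(1, 1, 2), (1, 2, 1), (2, 1, 1)}
tau_W = hitting_set_number(supp_W, (2, 2, 2))

# Diagonal support {(i, i, i)}: the edges are disjoint, so each needs its own vertex.
supp_diag = {(i, i, i) for i in range(1, 4)}
tau_diag = hitting_set_number(supp_diag, (3, 3, 3))
```

Here `tau_W` equals 2, matching the slice rank of $W$, and `tau_diag` equals 3, matching the slice rank of $\langle 3 \rangle$.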

Lemma 4.17. Let $g \in G(t) = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$. Then $\mathrm{SR}(t) \leq \tau(\mathrm{supp}(g \cdot t))$.

Proof. This is clear.

Lemma 4.18. Let $g \in G(t)$. Then $\mathrm{SR}(t) \geq \tau(\max(\mathrm{supp}(g \cdot t)))$.

Proof. It is sufficient to consider $g = e$. Let

\[ t = \sum_{i=1}^{r_1} v^1_i \otimes u^1_i + \sum_{i=1}^{r_2} v^2_i \otimes u^2_i + \sum_{i=1}^{r_3} v^3_i \otimes u^3_i \]

be a slice decomposition. We may assume $v^j_1, \ldots, v^j_{r_j}$ are linearly independent. Let $V_j = \mathrm{Span}\{v^j_1, \ldots, v^j_{r_j}\} \subseteq \mathbb{F}^{n_j}$. Let $W_j \subseteq (\mathbb{F}^{n_j})^{*}$ be the elements in the dual space that vanish on $V_j$. Let $B_j \subseteq W_j$ be a basis with the following property: with respect to the standard basis, the matrix with the elements of $B_j$ as columns is in reduced row echelon form, i.e. each column is of the form $(* \cdots * \; 1 \; 0 \cdots 0)^{T}$ and the pivot elements (the 1's) are all in different rows. Let $S_j \subseteq [n_j]$ be the indices of the pivot elements. Let $\bar{S}_j = [n_j] \setminus S_j$ be the complement. Then $|\bar{S}_j| = r_j$. We claim $(\bar{S}_1, \bar{S}_2, \bar{S}_3)$ is a hitting set for $\max(\mathrm{supp}(t))$. Then $r_1 + r_2 + r_3 = |\bar{S}_1| + |\bar{S}_2| + |\bar{S}_3| \geq \tau(\max(\mathrm{supp}(t)))$. Let $x \in \max(\mathrm{supp}(t))$. Suppose $x \in S_1 \times S_2 \times S_3$. For every $j \in [3]$, let $\varphi_j \in B_j$ have its pivot element at index $x_j$. Let $\varphi = \varphi_1 \otimes \varphi_2 \otimes \varphi_3$. Then $\varphi \in W_1 \otimes W_2 \otimes W_3$, so $\varphi(t) = 0$. Since $x$ is maximal and each $B_j$ is in reduced row echelon form,

\[ \varphi(t) = \sum_{y \leq x} t_y\, \varphi(e_{y_1} \otimes e_{y_2} \otimes e_{y_3}) \]
\[ = \sum_{y < x} t_y\, \varphi(e_{y_1} \otimes e_{y_2} \otimes e_{y_3}) + t_x\, e_{x_1} \otimes e_{x_2} \otimes e_{x_3} \]
\[ = \sum_{y < x} s_y\, e_{y_1} \otimes e_{y_2} \otimes e_{y_3} + t_x\, e_{x_1} \otimes e_{x_2} \otimes e_{x_3} \]

for some $s_y \in \mathbb{F}$. From $\varphi(t) = 0$ follows $t_x = 0$. This contradicts $x \in \mathrm{supp}(t)$, so $x \notin S_1 \times S_2 \times S_3$, i.e. there is a $j \in [3]$ with $x_j \in \bar{S}_j$.

Asymptotic hitting set number

We now study the asymptotic hitting set number $\tilde{\tau}(\Phi) = \lim_{n \to \infty} \tau(\Phi^{\times n})^{1/n}$.

We will use some basic facts of types and type classes. Let $X$ be a finite set. Let $N \in \mathbb{N}$. An $N$-type on $X$ is a probability distribution $P$ on $X$ with $N \cdot P(x) \in \mathbb{N}$ for all $x \in X$. Let $P$ be an $N$-type on $X$. The type class $T^N_P \subseteq X^N$ is the set of sequences $s = (s_1, \ldots, s_N)$ with $x$ occurring $N \cdot P(x)$ times in $s$ for every $x \in X$, i.e. $|\{i \in [N] : s_i = x\}| = N \cdot P(x)$.

Lemma 4.19. The number of $N$-types on $X$ equals $\binom{N + |X| - 1}{|X| - 1}$. Let $P$ be an $N$-type. The size of the type class $T^N_P$ equals the multinomial coefficient $\binom{N}{NP}$.

Proof. We leave the proof to the reader.

Lemma 4.20. Let $P$ be an $N$-type on $X$. Then

\[ \frac{1}{(N+1)^{|X|}}\, 2^{N H(P)} \leq \binom{N}{NP} \leq 2^{N H(P)}. \]

Proof. See e.g. [CT12, Theorem 11.1.3].
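Lemma 4.19 and Lemma 4.20 can be checked by enumeration for a small alphabet. The sketch below lists all sequences in $X^N$ for $|X| = 3$ and $N = 6$ (both arbitrary choices for this example), groups them by type, and compares the class sizes with the multinomial coefficient and with the entropy bounds. A small relative tolerance absorbs floating-point rounding in the cases where a bound holds with equality.

```python
import math
from itertools import product
from collections import Counter

def type_class_size(counts):
    """Multinomial coefficient N! / (c_1! ... c_m!) for an N-type with these counts."""
    size = math.factorial(sum(counts))
    for c in counts:
        size //= math.factorial(c)
    return size

def entropy_of_counts(counts):
    """Shannon entropy (bits) of the N-type P(x) = counts[x] / N."""
    N = sum(counts)
    return -sum(c / N * math.log2(c / N) for c in counts if c > 0)

alphabet, N = "abc", 6

# Group all |X|^N sequences by their type (vector of letter counts).
classes = Counter(tuple(seq.count(x) for x in alphabet)
                  for seq in product(alphabet, repeat=N))

number_of_types = len(classes)
expected_types = math.comb(N + len(alphabet) - 1, len(alphabet) - 1)

sizes_match = all(classes[c] == type_class_size(c) for c in classes)

tol = 1 + 1e-9
bounds_hold = all(
    2 ** (N * entropy_of_counts(c)) / (N + 1) ** len(alphabet) <= classes[c] * tol
    and classes[c] <= 2 ** (N * entropy_of_counts(c)) * tol
    for c in classes
)
```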

Lemma 4.21. $\log_2 \tilde{\tau}(\Phi) \leq \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} H(P_i)$.

Proof. Let $P$ maximise $\max_{P \in \mathcal{P}(\Phi)} \min_i H(P_i)$. Let $n \in \mathbb{N}$. We construct a hitting set $(A_1, A_2, A_3)$ for $\Phi^{\times n}$ as follows. Let $x \in \Phi^{\times n}$. Viewing $x$ as an $n$-tuple of elements in $\Phi$, let $Q \in \mathcal{P}_n(\Phi)$ be the type of $x$ (i.e. the empirical distribution). Let $j \in [3]$ with $H(Q_j) = \min_{i \in [3]} H(Q_i)$. By our choice of $P$ we have

\[ H(Q_j) = \min_{i \in [3]} H(Q_i) \leq \min_{i \in [3]} H(P_i). \]

Viewing $x$ as a 3-tuple $(x_1, x_2, x_3)$, add $x_j$ to $A_j$. We repeat this for all $x \in \Phi^{\times n}$. The final $(A_1, A_2, A_3)$ is a hitting set for $\Phi^{\times n}$ by construction. For each $j \in [3]$,

\[ |A_j| \leq \sum_{Q_j} |T^n_{Q_j}| \leq \sum_{Q_j} 2^{n H(Q_j)}, \]

where the sum is over $Q_j \in \mathcal{P}_n(\Phi_j)$ with $H(Q_j) \leq \min_{i \in [3]} H(P_i)$. Then

\[ |A_j| \leq |\mathcal{P}_n(\Phi_j)|\, 2^{n \min_i H(P_i)} = \mathrm{poly}(n)\, 2^{n \min_i H(P_i)}. \]

We conclude $|A_1| + |A_2| + |A_3| \leq \mathrm{poly}(n)\, 2^{n \min_i H(P_i)}$.

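Lemma 4.22. $\log_2 \tilde{\tau}(\Phi) \ge \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} H(P_i)$.

Proof. Let $P \in \mathcal{P}(\Phi)$ attain the maximum $\max_{P \in \mathcal{P}(\Phi)} \min_i H(P_i)$. Let $n \in \mathbb{N}$. Let $(A_1, A_2, A_3)$ be a hitting set for $\Phi^n$. Let $Q \in \mathcal{P}_n(\Phi)$ be an $n$-type with $\min_i H(Q_i) \ge \min_i H(P_i) - o(1)$. Let $\Psi = T^n_Q \subseteq \Phi^n$ be the set of strings with type $Q$. Then $(A_1, A_2, A_3)$ is a hitting set for $\Psi$. Let $\pi_i : \Psi \to \Phi_i^n : (x_1, x_2, x_3) \mapsto x_i$ and write $\Psi_i = \pi_i(\Psi)$. Then

\[ \Psi = \pi_1^{-1}(A_1) \cup \pi_2^{-1}(A_2) \cup \pi_3^{-1}(A_3). \]

Let $j \in [3]$ with $|\pi_j^{-1}(A_j)| \ge \tfrac{1}{3}|\Psi|$. The fiber $\pi_j^{-1}(a)$ has constant size over $a \in \Psi_j$. Let $c_j = |\pi_j^{-1}(a)|$ be this size. Then

\[ |\Psi| = \sum_{a \in \Psi_j} |\pi_j^{-1}(a)| = \sum_{a \in \Psi_j} c_j = |\Psi_j|\, c_j. \]

And

\[ |\pi_j^{-1}(A_j)| = \sum_{a \in A_j \cap \Psi_j} |\pi_j^{-1}(a)| = |A_j \cap \Psi_j|\, c_j \le |A_j|\, c_j. \]

Therefore

\[ |A_j| \ge \frac{|\pi_j^{-1}(A_j)|}{c_j} \ge \frac{\tfrac{1}{3}|\Psi|}{c_j} = \tfrac{1}{3}|\Psi_j|. \]

We have $|\Psi_j| \ge 2^{n H(Q_j) - o(n)} \ge 2^{n \min_i H(Q_i) - o(n)} \ge 2^{n \min_i H(P_i) - o(n)}$. We conclude $|A_1| + |A_2| + |A_3| \ge |A_j| \ge \tfrac{1}{3}|\Psi_j| \ge \tfrac{1}{3}\, 2^{n \min_i H(P_i) - o(n)}$.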

Lemma 4.23. $\log_2 \tilde{\tau}(\Phi) = \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} H(P_i)$.

Proof. This follows directly from Lemma 4.21 and Lemma 4.22.


Asymptotic slice rank

We now combine the above lemmas about slice rank and the asymptotic hitting set number to prove Theorem 4.16. First we have the following basic lemma.

Lemma 4.24. $\min_{\theta \in \Theta} \max_{P \in \mathcal{P}(\Phi)} H_\theta(P) = \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} H(P_i)$.

Proof. Since $H_\theta(P)$ is convex in $\theta$ and concave in $P$, von Neumann's minimax theorem gives $\min_\theta \max_{P \in \mathcal{P}(\Phi)} H_\theta(P) = \max_{P \in \mathcal{P}(\Phi)} \min_\theta H_\theta(P)$. Finally, we use that $\min_\theta H_\theta(P) = \min_i H(P_i)$.

Define $f^{\sim}(t) = \limsup_{n \to \infty} f(t^{\otimes n})^{1/n}$ and $f_{\sim}(t) = \liminf_{n \to \infty} f(t^{\otimes n})^{1/n}$.

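Lemma 4.25. Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$. Then

\[ \max_{g \in G(t)} \max_{P \in \mathcal{P}(\max \operatorname{supp}(g \cdot t))} \min_i 2^{H(P_i)} \le \mathrm{SR}_{\sim}(t) \le \mathrm{SR}^{\sim}(t) \le \min_\theta \zeta^\theta(t). \]

Proof. By definition $\mathrm{SR}_{\sim}(t) \le \mathrm{SR}^{\sim}(t)$. From Lemma 4.17 follows

\[ \mathrm{SR}^{\sim}(t) \le \tilde{\tau}(\operatorname{supp}(g \cdot t)) \]

for any $g \in G(t)$. Lemma 4.23 gives $\tilde{\tau}(\operatorname{supp}(g \cdot t)) = \max_{P \in \mathcal{P}(\operatorname{supp}(g \cdot t))} \min_i 2^{H(P_i)}$. Thus, with the help of Lemma 4.24,

\[ \mathrm{SR}^{\sim}(t) \le \min_{g \in G(t)} \max_{P \in \mathcal{P}(\operatorname{supp}(g \cdot t))} \min_i 2^{H(P_i)} = \min_\theta \zeta^\theta(t). \]

From Lemma 4.18 follows

\[ \tilde{\tau}(\max(\operatorname{supp}(g \cdot t))) \le \mathrm{SR}_{\sim}(t) \]

for any $g \in G(t)$. Lemma 4.23 gives

\[ \max_{g \in G(t)} \max_{P \in \mathcal{P}(\max(\operatorname{supp}(g \cdot t)))} \min_i 2^{H(P_i)} \le \mathrm{SR}_{\sim}(t). \]

This proves the lemma.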

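Proof of Theorem 4.16. We may assume $\Phi = \operatorname{supp}(t)$ is oblique. Then, with the help of Lemma 4.24 and Lemma 4.25,

\begin{align*}
\min_{\theta \in \Theta} \zeta^\theta(t) &= \min_{\theta \in \Theta} \zeta_\theta(t) \\
&= \min_{\theta \in \Theta} \max_{P \in \mathcal{P}(\max(\Phi))} 2^{H_\theta(P)} \\
&= \max_{P \in \mathcal{P}(\max(\Phi))} \min_{i \in [3]} 2^{H(P_i)} \\
&\le \max_{g \in G(t)} \max_{P \in \mathcal{P}(\max(\operatorname{supp}(g \cdot t)))} \min_{i \in [3]} 2^{H(P_i)} \\
&\le \mathrm{SR}_{\sim}(t) \\
&\le \mathrm{SR}^{\sim}(t) \\
&\le \min_{\theta \in \Theta} \zeta^\theta(t).
\end{align*}

This proves the claim.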

4.7 Conclusion

The study of asymptotic rank of tensors is motivated by the open problem of finding the exponent of matrix multiplication. Asymptotic subrank has applications in, for example, combinatorics and algebraic property testing. Via the theory of asymptotic spectra, Strassen characterised asymptotic rank and asymptotic subrank in terms of the asymptotic spectrum of tensors. Strassen introduced the gauge points in $X(\mathcal{T})$ and the support functionals in $X(\text{oblique})$. More precisely, there are the lower support functionals and the upper support functionals. The lower support functionals are not additive and thus cannot be universal spectral points. The upper support functionals may be universal spectral points, but this cannot be shown with the help of the lower support functionals. Finally, we showed that for oblique tensors the asymptotic slice rank exists and equals the minimum value over the support functionals. In the next chapter we will see a subfamily of the oblique 3-tensors for which the support functionals are powerful enough to compute the asymptotic subrank.

Chapter 5

Tight tensors and combinatorial subrank; cap sets

This chapter is based on joint work with Matthias Christandl and Peter Vrana [CVZ16, CVZ18].

5.1 Introduction

In the previous chapter we discussed the gauge points and the support functionals $\zeta^\theta$. The gauge points are in the asymptotic spectrum of all tensors, while the support functionals are in the asymptotic spectrum of oblique tensors.

How "powerful" are the support functionals? We know $\tilde{\mathrm{Q}}(t) \le \zeta^\theta(t) \le \tilde{\mathrm{R}}(t)$ for oblique $t$. Thus $\max_\theta \zeta^\theta(t) \le \tilde{\mathrm{R}}(t)$. In fact, $\max_\theta \zeta^\theta(t)$ is at most the maximum over the gauge points $\max_S \zeta_{(S)}(t)$, and in turn $\max_S \zeta_{(S)}(t)$ is at most $\tilde{\mathrm{R}}(t)$. As remarked earlier, it is not known whether $\max_S \zeta_{(S)}(t)$ equals $\tilde{\mathrm{R}}(t)$ in general.

On the other hand, we have $\tilde{\mathrm{Q}}(t) \le \min_\theta \zeta^\theta(t)$. Do we attain equality here in general, i.e. $\tilde{\mathrm{Q}}(t) = \min_\theta \zeta^\theta(t)$? The answer is "yes" for the subsemiring of tight 3-tensors. In this chapter we study tight $k$-tensors.

Tight tensors

Let $I_1, \ldots, I_k$ be finite sets. Let $\Phi \subseteq I_1 \times \cdots \times I_k$. We say $\Phi$ is tight if there are injective maps $u_i : I_i \to \mathbb{Z}$ for $i \in [k]$ such that

\[ \forall \alpha \in \Phi \quad u_1(\alpha_1) + \cdots + u_k(\alpha_k) = 0. \]

We say $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ is tight if there is a $g \in G(t) = \mathrm{GL}_{n_1} \times \cdots \times \mathrm{GL}_{n_k}$ such that the support $\operatorname{supp}(g \cdot t)$ is tight.

Recall that a tensor is oblique if the support is an antichain in some basis. Clearly, tight tensors are oblique. To summarise the families of tensors that we have defined up to now, we have

\[ \{\text{tight}\} \subseteq \{\text{oblique}\} \subseteq \{\text{robust}\} \subseteq \{\theta\text{-robust}\}. \]

Recall that the families of oblique, robust and $\theta$-robust tensors each form a semiring under $\otimes$ and $\oplus$. Tight tensors have the same property [Str91, Section 5]. Another property is that any subset of a tight set is tight.

Example 5.1. Let $k \ge 3$ be fixed. For any integer $n \ge 1$ and $c \in [n]$ the set

\[ \Phi_n(c) = \{\alpha \in \{0, \ldots, n-1\}^k : \alpha_1 + \cdots + \alpha_k = c\} \]

is tight. For any integer $n \ge 2$ and any $c \in [n]$ the set

\[ \Psi_n(c) = \{\alpha \in \{0, \ldots, n-1\}^k : \alpha_1 + \cdots + \alpha_k = c \bmod n\} \]

is not tight (cf. Exercise 15.20 in [BCS97]).
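To make the definition of tightness concrete, the following sketch checks a candidate family of maps $u_i$ for the set $\Phi_n(c)$ of Example 5.1. The witness $u_i(x) = x$ for $i < k$ and $u_k(x) = x - c$ is our own illustrative choice, and the function names are hypothetical. Note that the check only certifies tightness when it succeeds; a failing witness does not prove that a set is not tight.

```python
from itertools import product


def phi(n, k, c):
    """The set Phi_n(c): tuples in {0..n-1}^k whose entries sum to c."""
    return [a for a in product(range(n), repeat=k) if sum(a) == c]


def is_tight_witness(support, maps):
    """Check that `maps` (one dict per coordinate) are injective and that
    u_1(a_1) + ... + u_k(a_k) = 0 for every element a of the support."""
    injective = all(len(set(u.values())) == len(u) for u in maps)
    zero_sum = all(sum(u[x] for u, x in zip(maps, a)) == 0 for a in support)
    return injective and zero_sum


n, k, c = 4, 3, 2
support = phi(n, k, c)
# Assumed witness: identity on the first k-1 coordinates, shift by -c on the last.
maps = [{x: x for x in range(n)} for _ in range(k - 1)]
maps.append({x: x - c for x in range(n)})
assert is_tight_witness(support, maps)
```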

Example 5.2. When $\mathbb{F}$ contains a primitive $n$th root of unity $\zeta$, the tensor

\[ t_n = \sum_{\alpha \in \Psi_n(n-1)} e_{\alpha_1} \otimes \cdots \otimes e_{\alpha_k} \in (\mathbb{F}^n)^{\otimes k}, \]

which has support $\Psi_n(n-1)$, is tight. Namely, the elements $v_j = \sum_{i=1}^n \zeta^{ij} e_i$ for $j \in [n]$ form a basis of $\mathbb{F}^n$. Let $g \in G(t_n)$ be the corresponding basis transformation. Then we have $t_n = \sum_{j=1}^n v_j \otimes \cdots \otimes v_j$ and we see that the support $\operatorname{supp}(g \cdot t_n) = \{\alpha \in [n]^k : \alpha_1 = \cdots = \alpha_k\}$ is tight. (See also [BCS97, Exercise 15.25].) When the characteristic of $\mathbb{F}$ equals $n$, the tensor $t_n$ is also tight, as we will see in Section 5.4.2.

Combinatorial subrank and the Coppersmith–Winograd method

We care about tight tensors because of a remarkable theorem for tight 3-tensors of Strassen (Theorem 5.3 below). To understand the theorem we need the concept of combinatorial asymptotic subrank (cf. [Str91, Section 5]). We say $D \subseteq I_1 \times \cdots \times I_k$ is a diagonal when any two distinct $\alpha, \beta \in D$ are distinct in all $k$ coordinates. In other words, for elements in $D$, the value at one coordinate uniquely determines the value at the other $k - 1$ coordinates. Let $\Phi \subseteq I_1 \times \cdots \times I_k$. We say a diagonal $D \subseteq I_1 \times \cdots \times I_k$ is free for $\Phi$, or simply $D \subseteq \Phi$ is a free diagonal, if $D = \Phi \cap (D_1 \times \cdots \times D_k)$, where $D_i = \{x_i : (x_1, \ldots, x_k) \in D\}$. Define the (combinatorial) subrank $\mathrm{Q}(\Phi)$ as the size of the largest free diagonal $D \subseteq \Phi$. For $\Phi \subseteq I_1 \times \cdots \times I_k$ and $\Psi \subseteq J_1 \times \cdots \times J_k$ we naturally define the product $\Phi \times \Psi \subseteq (I_1 \times J_1) \times \cdots \times (I_k \times J_k)$ by

\[ \Phi \times \Psi = \{((\alpha_1, \beta_1), \ldots, (\alpha_k, \beta_k)) : \alpha \in \Phi, \beta \in \Psi\}. \]

Define the (combinatorial) asymptotic subrank $\tilde{\mathrm{Q}}(\Phi) = \lim_{n \to \infty} \mathrm{Q}(\Phi^{\times n})^{1/n}$. Let $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ and let $\Phi$ be the support of $t$ in the standard basis. Then $\mathrm{Q}(\Phi) \le \mathrm{Q}(t)$ and $\tilde{\mathrm{Q}}(\Phi) \le \tilde{\mathrm{Q}}(t)$. The number $\mathrm{Q}(\Phi)$ may be interpreted as the largest number $n$ such that $\langle n \rangle$ can be obtained from $t$ using a restriction that consists of matrices that have at most one nonzero entry in each row and in each column. (This is called M-restriction in [Str87, Section 6], which stands for monomial restriction.) We may also interpret $\Phi$ as a $k$-partite hypergraph. Then $\mathrm{Q}(\Phi)$ is the size of the largest induced $k$-partite matching in $\Phi$.
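For very small supports, the combinatorial subrank can be computed by exhaustive search directly from the definition of a free diagonal. The sketch below is illustrative only: the function names are our own, and the search is exponential in $|\Phi|$, so it is not a practical algorithm.

```python
from itertools import combinations


def is_free_diagonal(D, support):
    """Check that D is a diagonal and that D = support ∩ (D_1 x ... x D_k)."""
    D = list(D)
    k = len(D[0])
    # Diagonal: any two distinct elements differ in every coordinate.
    for a, b in combinations(D, 2):
        if any(a[i] == b[i] for i in range(k)):
            return False
    # Free: the sub-box spanned by the coordinate sets D_i meets the support in D only.
    sides = [{a[i] for a in D} for i in range(k)]
    induced = {a for a in support if all(a[i] in sides[i] for i in range(k))}
    return induced == set(D)


def subrank(support):
    """Size of the largest free diagonal, by brute force over all subsets."""
    support = set(support)
    for size in range(len(support), 0, -1):
        for D in combinations(sorted(support), size):
            if is_free_diagonal(D, support):
                return size
    return 0


# Support of a diagonal tensor of size 2: the whole support is a free diagonal.
diagonal = [(0, 0, 0), (1, 1, 1)]
# Support {alpha in {0,1}^3 : sum = 1}: any two elements agree in some coordinate.
w_support = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
print(subrank(diagonal), subrank(w_support))
```

In the second example the subrank is 1 although the asymptotic subrank is larger, which is why the asymptotic notion is the one studied in this chapter.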

Let $\Phi \subseteq [n_1] \times \cdots \times [n_k]$ and let $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ be any tensor with support equal to $\Phi$. Then the (asymptotic) subranks of $\Phi$ and $t$ are related as follows:

\[ \mathrm{Q}(\Phi) \le \mathrm{Q}(t) \quad \text{and} \quad \tilde{\mathrm{Q}}(\Phi) \le \tilde{\mathrm{Q}}(t). \]

Strassen proved the following theorem using the method of Coppersmith and Winograd [CW90]. Recall that for $\Phi \subseteq I_1 \times I_2 \times I_3$ we let $\mathcal{P}(\Phi)$ be the set of probability distributions on $\Phi$. For $P \in \mathcal{P}(\Phi)$ let $P_1, P_2, P_3$ be the marginal distributions of $P$ on the 3 components of $I_1 \times I_2 \times I_3$.

Theorem 5.3 ([Str91, Lemma 5.1]). Let $\Phi \subseteq I_1 \times I_2 \times I_3$ be tight. Then

\[ \tilde{\mathrm{Q}}(\Phi) = \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} 2^{H(P_i)}. \tag{5.1} \]
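Evaluating the right-hand side of (5.1) at any single distribution $P$ gives a lower bound on the maximum. As a small illustration (our own example, with hypothetical function names), the sketch below evaluates $\min_i 2^{H(P_i)}$ at the uniform distribution on the tight set $\{\alpha \in \{0,1\}^3 : \alpha_1 + \alpha_2 + \alpha_3 = 1\}$. Each marginal is $(2/3, 1/3)$, so the value is $2^{h(1/3)} = 3/2^{2/3}$, which agrees with the value $z(2)$ listed in Section 5.4.1.

```python
from collections import Counter
from itertools import product
from math import log2


def marginal_entropies(support, weights=None):
    """Entropies H(P_1), ..., H(P_k) of the marginals of a distribution P on
    `support`; P is uniform unless explicit weights are given."""
    support = list(support)
    if weights is None:
        weights = [1 / len(support)] * len(support)
    k = len(support[0])
    entropies = []
    for i in range(k):
        marginal = Counter()
        for a, w in zip(support, weights):
            marginal[a[i]] += w
        entropies.append(-sum(p * log2(p) for p in marginal.values() if p > 0))
    return entropies


def objective(support, weights=None):
    """The quantity min_i 2^{H(P_i)} appearing in (5.1)."""
    return 2 ** min(marginal_entropies(support, weights))


phi_2 = [a for a in product(range(2), repeat=3) if sum(a) == 1]
value = objective(phi_2)  # uniform P on the three elements
print(value)
```

This only evaluates the objective at one point; finding the maximising $P$ in general requires an optimisation over $\mathcal{P}(\Phi)$, which the sketch does not attempt.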

The consequence of Theorem 5.3 is that the support functionals are sufficiently powerful to compute the asymptotic subrank of tight 3-tensors.

Corollary 5.4 ([Str91, Proposition 5.4]). Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$ be tight. Then

\[ \tilde{\mathrm{Q}}(t) = \min_{\theta \in \mathcal{P}([3])} \zeta^\theta(t). \]

Moreover, if $\Phi = \operatorname{supp}(g \cdot t)$ is tight for some $g \in G(t)$, then $\tilde{\mathrm{Q}}(t) = \tilde{\mathrm{Q}}(\Phi)$.

Remark 5.5. Strassen conjectured in [Str94, Conjecture 5.3] that for the family of tight 3-tensors the support functionals give all spectral points in the asymptotic spectrum $X(\{\text{tight 3-tensors}\})$. In [Str91] numerous examples are given of subfamilies of tight 3-tensors for which this is the case.

Remark 5.6. Equation (5.1) becomes false when we let $\Phi \subseteq I_1 \times \cdots \times I_k$ with $k \ge 4$ and we let the right-hand side of the equation be $\max_{P \in \mathcal{P}(\Phi)} \min_i 2^{H(P_i)}$; see [CVZ16, Example 1138].

New results in this chapter

This chapter is an investigation of tight tensors, combinatorial asymptotic subrank, and applications. More precisely, this chapter contains the following new results.

Higher-order Coppersmith–Winograd method. In Section 5.2 we extend Theorem 5.3 to obtain a lower bound for $\tilde{\mathrm{Q}}(\Phi)$ for tight sets $\Phi \subseteq I_1 \times \cdots \times I_k$ with $k \ge 4$. Our lower bound is not known to be optimal in general. We compute examples for which the lower bound is optimal.

Combinatorial degeneration method. In Section 5.3 we further extend the range of application of the Coppersmith–Winograd method via a partial order $\trianglelefteq$ on supports of tensors called combinatorial degeneration. We prove that if $\Phi \trianglelefteq \Psi$, then $\tilde{\mathrm{Q}}(\Phi) \le \tilde{\mathrm{Q}}(\Psi)$. Suppose $\Psi$ is not tight, but $\Phi$ is tight. Then we may apply the (higher-order) Coppersmith–Winograd method to obtain a lower bound on $\tilde{\mathrm{Q}}(\Phi)$ and thus on $\tilde{\mathrm{Q}}(\Psi)$.

Cap sets. In Section 5.4 we relate the theory of asymptotic spectra, the Coppersmith–Winograd method and the combinatorial degeneration method to the problem of upper bounding the maximum size of cap sets in $\mathbb{F}_p^n$.

Graph tensors. Graph tensors are generalisations of the matrix multiplication tensor $\langle 2, 2, 2 \rangle$ parametrised by graphs. In Section 5.5 we discuss how one can apply the higher-order Coppersmith–Winograd method to obtain upper bounds on the asymptotic rank of complete graph tensors. We also briefly discuss the surgery method, which gives good upper bounds on the asymptotic rank of graph tensors for sparse graphs like cycle graphs.

5.2 Higher-order CW method

In this section we extend Theorem 5.3 to tight $\Phi \subseteq I_1 \times \cdots \times I_k$ with $k \ge 4$. We introduce some notation. Let $\mathcal{P}(\Phi)$ be the set of probability distributions on $\Phi$. For $P \in \mathcal{P}(\Phi)$ let $P_1, \ldots, P_k$ be the marginal distributions of $P$ on the $k$ components of $I_1 \times \cdots \times I_k$. Let $\mathcal{R}(\Phi)$ be the set of all subsets $R \subseteq \Phi^2$ such that $R \not\subseteq \{(x, x) : x \in \Phi\}$ and $R \subseteq \{(x, y) \in \Phi^2 : x_i = y_i\}$ for some $i \in [k]$. For $P \in \mathcal{P}(\Phi)$ and $R \in \mathcal{R}(\Phi)$ let $\mathcal{Q}(R, (P_1, \ldots, P_k))$ be the set of probability distributions $Q$ on $R$ whose marginal distributions on the $2k$ components of $R$ satisfy $Q_i = Q_{k+i} = P_i$ for $i \in [k]$.

Let $I_1, \ldots, I_k$ be finite subsets of $\mathbb{Z}$. The result of this section is a lower bound on the asymptotic subrank of any $\Phi \subseteq I_1 \times \cdots \times I_k$ satisfying $\forall a \in \Phi\ \sum_{i=1}^k a_i = 0$. For $R \in \mathcal{R}(\Phi)$ let $r(R)$ be the rank over $\mathbb{Q}$ of the matrix with rows $\{x - y : (x, y) \in R\}$.

Theorem 5.7. Let $\Phi \subseteq \mathbb{Z}^k$ be a finite set with $\forall a \in \Phi\ \sum_{i=1}^k a_i = 0$. Then

\[ \log_2 \tilde{\mathrm{Q}}(\Phi) \ge \max_P \min_{R, Q} \Bigl( H(P) - (k-2)\, \frac{H(Q) - H(P)}{r(R)} \Bigr) \]

with $P \in \mathcal{P}(\Phi)$, $R \in \mathcal{R}(\Phi)$ and $Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))$.

5.2.1 Construction

We prepare for the proof of Theorem 5.7 by discussing some basic facts.

Average-free sets

Lemma 5.8. Let $k \in \mathbb{N}$. Let $M \in \mathbb{N}$. We say a subset $B \subseteq \mathbb{Z}/M\mathbb{Z}$ is $(k-1)$-average-free if

\[ \forall x_1, \ldots, x_k \in B \quad x_1 + \cdots + x_{k-1} = (k-1) x_k \Rightarrow x_1 = \cdots = x_k. \]

There is a $(k-1)$-average-free set $B \subseteq \mathbb{Z}/M\mathbb{Z}$ of size $|B| = M^{1 - o(1)}$.

Proof. There is a set $A \subseteq \{1, \ldots, \lfloor \tfrac{M-1}{k-1} \rfloor\}$ of size $|A| = M^{1 - o(1)}$ with

\[ \forall x_1, \ldots, x_k \in A \quad x_1 + \cdots + x_{k-1} = (k-1) x_k \Rightarrow x_1 = \cdots = x_k, \tag{5.2} \]

see [VC15, Lemma 10]. Let $B = \{a \bmod M : a \in A\} \subseteq \mathbb{Z}/M\mathbb{Z}$. Then $|B| = |A|$. Let $x_1, \ldots, x_k \in B$ with $x_1 + \cdots + x_{k-1} = (k-1) x_k$. View $x_1, \ldots, x_k$ as elements in $\{1, \ldots, \lfloor \tfrac{M-1}{k-1} \rfloor\}$. Then $x_1 + \cdots + x_{k-1} = (k-1) x_k$ still holds in $\mathbb{Z}$, since both sides are at most $M - 1$. From (5.2) follows $x_1 = \cdots = x_k$ in $\mathbb{Z}$ and hence also in $\mathbb{Z}/M\mathbb{Z}$.
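The role of the bound $\lfloor (M-1)/(k-1) \rfloor$ in the proof can be seen on a toy example. The sketch below is a brute-force check with hypothetical names; the set $\{1, 2, 4, 5\}$ is our own example of a set without nontrivial three-term arithmetic progressions in $\mathbb{Z}$ (the case $k = 3$). Reduced modulo $M = 11$, where all elements are at most $\lfloor 10/2 \rfloor = 5$, it stays 2-average-free; modulo $M = 7$ the sums wrap around and the property fails.

```python
from itertools import product


def is_average_free(B, M, k):
    """Brute-force test of (k-1)-average-freeness of B as a subset of Z/MZ."""
    B = sorted({b % M for b in B})
    for xs in product(B, repeat=k):
        lhs = sum(xs[:-1]) % M
        rhs = ((k - 1) * xs[-1]) % M
        # A solution with not all x_i equal violates the property.
        if lhs == rhs and len(set(xs)) > 1:
            return False
    return True


A = [1, 2, 4, 5]
print(is_average_free(A, 11, 3))  # elements fit below floor((M-1)/(k-1))
print(is_average_free(A, 7, 3))   # 4 + 5 = 9 = 2 * 1 (mod 7) is a violation
```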

Linear combinations of uniform variables

Lemma 5.9. Let $M$ be a prime. Let $u_1, \ldots, u_n$ be independently uniformly distributed over $\mathbb{Z}/M\mathbb{Z}$. Let $v_1, \ldots, v_m$ be $(\mathbb{Z}/M\mathbb{Z})$-linear combinations of $u_1, \ldots, u_n$. Then the vector $v = (v_1, \ldots, v_m)$ is uniformly distributed over the range of $v$ in $(\mathbb{Z}/M\mathbb{Z})^m$.

Proof. Let $v_i = \sum_j c_{ij} u_j$ with $c_{ij} \in \mathbb{Z}/M\mathbb{Z}$. Then $v = Cu$ with $u = (u_1, \ldots, u_n)$ and $C$ the matrix with entries $C_{ij} = c_{ij}$. Let $y$ be in the image of $C$. Then the cardinality of the preimage $C^{-1}(y)$ equals the cardinality of the kernel of $C$. Indeed, if $Cx = y$, then $C^{-1}(y) = x + \ker(C)$. Since $u$ is uniform, we conclude that $v$ is uniform on the image of $C$.
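Lemma 5.9 can be verified by exhaustive enumeration for a small prime. In the sketch below (the matrix $C$ and the prime $M = 5$ are illustrative choices of ours), the third row of $C$ is the sum of the first two, so $C$ has rank 2 and the image has $M^2$ elements, each hit by exactly $|\ker C| = M$ inputs.

```python
from collections import Counter
from itertools import product


def image_distribution(C, M):
    """Count, for every vector v in the image of C over Z/MZ, the number of
    inputs u in (Z/MZ)^n with C u = v."""
    n = len(C[0])
    counts = Counter()
    for u in product(range(M), repeat=n):
        v = tuple(sum(c * x for c, x in zip(row, u)) % M for row in C)
        counts[v] += 1
    return counts


M = 5
C = [[1, 2, 0], [0, 1, 1], [1, 3, 1]]  # third row = first row + second row
counts = image_distribution(C, M)
# Uniformity: the image has M^2 points and every point has M preimages.
assert len(counts) == M ** 2
assert set(counts.values()) == {M}
```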

Free diagonals

Lemma 5.10. Let $G$ be a graph with $n$ vertices and $m$ edges. Then $G$ has at least $n - m$ connected components.

Proof. A graph without edges has $n$ connected components. For every edge that we add to the graph, we lose at most one connected component.

Lemma 5.11. Let $I_1, \ldots, I_k$ be finite sets. Let $\Psi \subseteq I_1 \times \cdots \times I_k$. Let

\[ C = \{\{a, b\} \subseteq \Psi : a \ne b,\ \exists i \in [k]\ a_i = b_i\}. \]

Then $\mathrm{Q}(\Psi) \ge |\Psi| - |C|$. Obviously, the statement remains true if we replace $C$ by the larger set $\{(a, b) \in \Psi^2 : a \ne b,\ \exists i \in [k]\ a_i = b_i\}$.

Proof. Let $G = (\Psi, C)$ be the graph with vertex set $\Psi$ and edge set $C$. Let $\Gamma \subseteq \Psi$ contain exactly one vertex per connected component of $G$. The vertices in $\Gamma$ are pairwise not adjacent. So $\Gamma$ is a diagonal. Of course $\Gamma \subseteq \Psi \cap (\Gamma_1 \times \cdots \times \Gamma_k)$. Let $a \in \Psi \cap (\Gamma_1 \times \cdots \times \Gamma_k)$. Let $x_1, \ldots, x_k \in \Gamma$ with

\[ (x_1)_1 = a_1,\ (x_2)_2 = a_2,\ \ldots,\ (x_k)_k = a_k. \]

Then $x_1, \ldots, x_k$ are all equal or adjacent to $a$ in $G$, i.e. they are all in the same connected component. Then $x_1 = \cdots = x_k$, since $\Gamma$ contains precisely one vertex per connected component. So $a = x_1 = \cdots = x_k$. So $a \in \Gamma$. We conclude that $\Gamma \supseteq \Psi \cap (\Gamma_1 \times \cdots \times \Gamma_k)$. Finally, $|\Gamma| \ge |\Psi| - |C|$ by Lemma 5.10.
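The construction in the proof of Lemma 5.11 is directly algorithmic: build the graph, pick one representative per connected component, and the representatives form a free diagonal. The following sketch implements this with a small union-find structure; the function names and the example set are our own illustrative assumptions.

```python
from itertools import combinations


def is_free_diagonal(D, support):
    """Check that D is a diagonal and that D = support ∩ (D_1 x ... x D_k)."""
    D = list(D)
    k = len(D[0])
    for a, b in combinations(D, 2):
        if any(a[i] == b[i] for i in range(k)):
            return False
    sides = [{a[i] for a in D} for i in range(k)]
    induced = {a for a in support if all(a[i] in sides[i] for i in range(k))}
    return induced == set(D)


def free_diagonal_from_components(psi):
    """Return (Gamma, number of edges): one representative per connected
    component of the graph whose edges join elements sharing a coordinate."""
    psi = sorted(set(psi))
    k = len(psi[0])
    parent = {a: a for a in psi}

    def find(a):
        # Union-find root lookup with path halving.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    edges = 0
    for a, b in combinations(psi, 2):
        if any(a[i] == b[i] for i in range(k)):
            edges += 1
            parent[find(a)] = find(b)
    gamma = sorted({find(a) for a in psi})
    return gamma, edges


psi = [(0, 0, 0), (1, 1, 1), (2, 2, 2), (2, 3, 3), (4, 4, 4)]
gamma, edges = free_diagonal_from_components(psi)
assert is_free_diagonal(gamma, psi)      # Gamma is a free diagonal
assert len(gamma) >= len(psi) - edges    # the bound of Lemma 5.11
```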

We now give the proof of Theorem 5.7. We repeat some notation from above. Let $k \ge 3$. Let $\Phi \subseteq \mathbb{Z}^k$ be a finite set. Let $\mathcal{P}(\Phi)$ be the set of probability distributions on $\Phi$. For $P \in \mathcal{P}(\Phi)$ let $P_1, \ldots, P_k$ be the marginal distributions of $P$ on the $k$ components of $\mathbb{Z}^k$. Let $\mathcal{R}(\Phi)$ be the set of all subsets $R \subseteq \Phi^2$ such that $R \not\subseteq \{(x, x) : x \in \Phi\}$ and $R \subseteq \{(x, y) \in \Phi^2 : x_i = y_i\}$ for some $i \in [k]$. For $P \in \mathcal{P}(\Phi)$ and $R \in \mathcal{R}(\Phi)$ let $\mathcal{Q}(R, (P_1, \ldots, P_k))$ be the set of probability distributions $Q$ on $R$ whose marginal distributions on the $2k$ components of $R$ satisfy $Q_i = Q_{k+i} = P_i$ for $i \in [k]$. For $R \in \mathcal{R}(\Phi)$ let $r(R)$ be the rank over $\mathbb{Q}$ of the matrix with rows

\[ \{x - y : (x, y) \in R\}. \]

For any prime $M$ let $r_M(R)$ be the rank over $\mathbb{Z}/M\mathbb{Z}$ of the same matrix.

Theorem (Theorem 5.7). Let $\Phi \subseteq \mathbb{Z}^k$ be a finite set with $\forall a \in \Phi\ \sum_{i=1}^k a_i = 0$. Then

\[ \log_2 \tilde{\mathrm{Q}}(\Phi) \ge \max_P \min_{R, Q} \Bigl( H(P) - (k-2)\, \frac{H(Q) - H(P)}{r(R)} \Bigr) \]

with $P \in \mathcal{P}(\Phi)$, $R \in \mathcal{R}(\Phi)$ and $Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))$.

Proof. Let $P$ be a rational probability distribution on $\Phi$, i.e. $\forall a \in \Phi\ P(a) \in \mathbb{Q}$.

Choice of parameters

This proof involves a variable $N$ that we will let go to infinity and a prime number $M$ that depends on $N$. For the sake of rigor, we first set the dependence of $M$ on $N$ and make sure that $N$ is large enough for $M$ to have good properties.

Let $n \in \mathbb{N}$ such that $P$ is an $n$-type, i.e. $\forall a \in \Phi\ nP(a) \in \mathbb{N}$. Let $N = tn$ be a multiple of $n$. Let

\[ f(N) = \log_2 \Bigl( 2^{|\Phi|^2} \max_{R \in \mathcal{R}(\Phi)} \binom{N + |R| - 1}{|R| - 1} \Bigr) \in o(N). \tag{5.3} \]

Let

\[ g(N) = |\Phi| \log_2(N + 1) \in o(N). \]

By Lemma 4.20,

\[ 2^{N H(P) - g(N)} \le \binom{N}{NP}. \tag{5.4} \]

Let

\[ \mu(N) = \max_{R, Q} \frac{H(Q) - H(P) + (1 + g(N) + f(N)) \tfrac{1}{N}}{r(R)} \tag{5.5} \]

with $R \in \mathcal{R}(\Phi)$ and $Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))$. Let $M$ be a prime with

\[ \lceil 2^{\mu(N) N} \rceil \le M \le 2 \lceil 2^{\mu(N) N} \rceil. \tag{5.6} \]

Such a prime exists by Bertrand's postulate, see e.g. [AZ14]. We can make $M$ arbitrarily large by choosing $N$ large enough. Choose $N = tn$ large enough such that

\[ M > k - 1, \tag{5.7} \]

\[ \forall R \in \mathcal{R}(\Phi) \quad r_M(R) = r(R). \tag{5.8} \]

We will later let $t$ and thus $N$ go to infinity.

Restrict to marginal type classes

The set $\Phi^{\otimes N}$ is a finite subset of $(\mathbb{Z}^N)^k$. Let $a \in \Phi^{\otimes N}$. Then we have that $a_i = ((a_i)_1, \ldots, (a_i)_N) \in \mathbb{Z}^N$ for $i \in [k]$. We restrict to those $a$ for which $a_i$ is in the type class $T^N_{P_i}$ for all $i \in [k]$. Thus let

\[ \Psi = \Phi^{\otimes N} \cap (T^N_{P_1} \times \cdots \times T^N_{P_k}). \]

We prove a lower bound on the size of $\Psi$. Let $(s_1, \ldots, s_N) \in T^N_P$. Then $s_j \in \Phi$ for $j \in [N]$ and $((s_1)_i, \ldots, (s_N)_i) \in T^N_{P_i}$ for $i \in [k]$. So

\[ \bigl( ((s_1)_1, \ldots, (s_N)_1), \ldots, ((s_1)_k, \ldots, (s_N)_k) \bigr) \in \Phi^{\otimes N} \cap (T^N_{P_1} \times \cdots \times T^N_{P_k}) = \Psi. \]

Thus $|\Psi| \ge |T^N_P|$. By Lemma 4.19, $|T^N_P| = \binom{N}{NP}$. By Lemma 4.20, $\binom{N}{NP} \ge 2^{N H(P) - g(N)}$. Therefore

\[ |\Psi| \ge 2^{N H(P) - g(N)}. \tag{5.9} \]

Hashing

Let $u_1, \ldots, u_{k-1}, v_1, \ldots, v_N \in \mathbb{Z}/M\mathbb{Z}$. For $i \in [k]$ let

\[ h_i : \mathbb{Z}^N \to \mathbb{Z}/M\mathbb{Z}, \quad x \mapsto \begin{cases} u_i + \sum_{j=1}^N x_j v_j & \text{for } 1 \le i \le k-1, \\ \frac{1}{k-1} \bigl( u_1 + \cdots + u_{k-1} - \sum_{j=1}^N x_j v_j \bigr) & \text{for } i = k. \end{cases} \]

Note that $k - 1$ is invertible in $\mathbb{Z}/M\mathbb{Z}$ by (5.7). Let $a \in \Psi$. Then $((a_1)_j, \ldots, (a_k)_j) \in \Phi$ for $j \in [N]$. So $\sum_{i=1}^k (a_i)_j = 0$ for every $j \in [N]$. Thus

\[ \sum_{i=1}^k \sum_{j=1}^N (a_i)_j v_j = \sum_{j=1}^N v_j \sum_{i=1}^k (a_i)_j = 0. \]

Therefore

\[ h_1(a_1) + \cdots + h_{k-1}(a_{k-1}) = (k-1)\, h_k(a_k). \]
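The hash identity above only uses that every element of $\Phi$ has coordinate sum zero and that $k - 1$ is invertible modulo $M$. The sketch below checks it on random inputs; the set `phi`, the parameters $k = 3$, $N = 8$, $M = 101$ and all function names are illustrative assumptions. For simplicity, the sampled element lies in $\Phi^{\otimes N}$ but is not restricted to the type classes that define $\Psi$.

```python
import random


def check_hash_identity(phi, k, N, M, seed=0):
    """Draw random u, v and a random element a of the N-th power of phi, and
    test h_1(a_1) + ... + h_{k-1}(a_{k-1}) = (k-1) h_k(a_k) in Z/MZ."""
    rng = random.Random(seed)
    u = [rng.randrange(M) for _ in range(k - 1)]
    v = [rng.randrange(M) for _ in range(N)]
    inv = pow(k - 1, -1, M)  # inverse of k-1 modulo the prime M (Python 3.8+)

    def dot(x):
        return sum(xj * vj for xj, vj in zip(x, v)) % M

    def h(i, x):
        # i is 0-based: indices 0..k-2 are h_1..h_{k-1}, index k-1 is h_k.
        if i < k - 1:
            return (u[i] + dot(x)) % M
        return inv * (sum(u) - dot(x)) % M

    # N elements of phi, regrouped by coordinate into a = (a_1, ..., a_k).
    sample = [rng.choice(phi) for _ in range(N)]
    a = [tuple(s[i] for s in sample) for i in range(k)]
    lhs = sum(h(i, a[i]) for i in range(k - 1)) % M
    rhs = (k - 1) * h(k - 1, a[k - 1]) % M
    return lhs == rhs


# A hypothetical zero-sum set in Z^3.
phi = [(1, 0, -1), (0, 1, -1), (-1, -1, 2), (0, 0, 0)]
assert all(check_hash_identity(phi, 3, 8, 101, seed=s) for s in range(20))
```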

Restrict to average-free set

Let $B \subseteq \mathbb{Z}/M\mathbb{Z}$ be a $(k-1)$-average-free set of size

\[ |B| \ge M^{1 - \kappa(M)} \quad \text{with } \kappa(M) \in o(1), \tag{5.10} \]

meaning

\[ \forall x_1, \ldots, x_k \in B \quad x_1 + \cdots + x_{k-1} = (k-1) x_k \Rightarrow x_1 = \cdots = x_k \tag{5.11} \]

(Lemma 5.8). Let $\Psi' \subseteq \Psi$ be the subset

\[ \Psi' = \{a \in \Psi : \forall i \in [k]\ h_i(a_i) \in B\}. \]

Let $a \in \Psi'$. Then $a \in \Psi$, so

\[ h_1(a_1) + \cdots + h_{k-1}(a_{k-1}) = (k-1)\, h_k(a_k). \]

Since $h_i(a_i) \in B$ for every $i \in [k]$, (5.11) implies

\[ h_1(a_1) = \cdots = h_k(a_k). \]

Probabilistic method

Clearly $\mathrm{Q}(\Phi^{\otimes N}) \ge \mathrm{Q}(\Psi) \ge \mathrm{Q}(\Psi')$. Let

\[ C' = \{(a, b) \in \Psi'^2 : a \ne b,\ \exists i \in [k]\ a_i = b_i\}. \]

Let $X = |\Psi'|$ and $Y = |C'|$. By Lemma 5.11,

\[ \mathrm{Q}(\Psi') \ge X - Y. \]

Let $u_1, \ldots, u_{k-1}, v_1, \ldots, v_N$ be independent uniformly random variables over the field $\mathbb{Z}/M\mathbb{Z}$. Then $X$ and $Y$ are random variables, and for some choice of $u_1, \ldots, u_{k-1}, v_1, \ldots, v_N$ we have

\[ \mathrm{Q}(\Psi') \ge \mathbb{E}[X - Y] = \mathbb{E}[X] - \mathbb{E}[Y], \]

where the expectation is over $u_1, \ldots, u_{k-1}, v_1, \ldots, v_N$. We will prove

\[ \mathbb{E}[X] = |B|\, |\Psi|\, M^{-(k-1)}, \tag{5.12} \]

\[ \mathbb{E}[Y] \le |B| \max_{R, Q} 2^{N H(Q) + f(N)} M^{-(k-1) - r(R)}, \tag{5.13} \]

with $f(N)$ as defined in (5.3) and $R \in \mathcal{R}(\Phi)$, $Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))$. Before proving (5.12) and (5.13) we derive the final bound.

Derivation of final bound

From (5.12) and (5.13) follows

\[ \mathbb{E}[X] - \mathbb{E}[Y] \ge |B|\, |\Psi|\, M^{-(k-1)} - |B| \max_{R, Q} 2^{N H(Q) + f(N)} M^{-(k-1) - r(R)}. \]

We factor out $|B|$, $|\Psi|$ and $M^{-(k-1)}$:

\[ \mathbb{E}[X] - \mathbb{E}[Y] \ge |B|\, |\Psi|\, M^{-(k-1)} \Bigl( 1 - \frac{1}{|\Psi|} \max_{R, Q} 2^{N H(Q) + f(N)} M^{-r(R)} \Bigr). \]

From our choice of $\mu(N)$ from (5.5),

\[ \mu(N) = \max_{R, Q} \frac{H(Q) - H(P) + (1 + g(N) + f(N)) \tfrac{1}{N}}{r(R)}, \]

follows

\[ \max_{R, Q} 2^{N (H(Q) - H(P) - r(R) \mu(N)) + g(N) + f(N)} \le \tfrac{1}{2}. \tag{5.14} \]

Apply $|B| \ge M^{1 - \kappa(M)}$ from (5.10) and $|\Psi| \ge 2^{N H(P) - g(N)}$ from (5.9) to get

\begin{align*}
\mathbb{E}[X] - \mathbb{E}[Y] &\ge M^{1 - \kappa(M)}\, 2^{N H(P) - g(N)}\, M^{-(k-1)} \cdot \Bigl( 1 - 2^{-N H(P) + g(N)} \max_{R, Q} 2^{N H(Q) + f(N)} M^{-r(R)} \Bigr) \\
&\ge M^{-(k-2+\kappa(M))}\, 2^{N H(P) - g(N)} \cdot \Bigl( 1 - \max_{R, Q} 2^{N H(Q) - N H(P) + g(N) + f(N)} M^{-r(R)} \Bigr).
\end{align*}

(Here we used (5.14) to see that the second factor is nonnegative.) Apply the bounds $2^{\mu(N) N} \le M \le 2^{\mu(N) N + 2}$ from (5.6) to get

\begin{align*}
\mathbb{E}[X] - \mathbb{E}[Y] &\ge (2^{\mu(N) N + 2})^{-(k-2+\kappa(M))}\, 2^{N H(P) - g(N)} \cdot \Bigl( 1 - \max_{R, Q} 2^{N H(Q) - N H(P) + g(N) + f(N)} (2^{\mu(N) N})^{-r(R)} \Bigr) \\
&= 2^{N (H(P) - (k-2+\kappa(M)) \mu(N)) - 2(k-2+\kappa(M)) - g(N)} \cdot \Bigl( 1 - \max_{R, Q} 2^{N (H(Q) - H(P) - r(R) \mu(N)) + g(N) + f(N)} \Bigr).
\end{align*}

Using (5.14) we get

\begin{align*}
\mathbb{E}[X] - \mathbb{E}[Y] &\ge 2^{N (H(P) - (k-2+\kappa(M)) \mu(N)) - 2(k-2+\kappa(M)) - g(N)} \bigl( 1 - \tfrac{1}{2} \bigr) \\
&= 2^{N (H(P) - (k-2+\kappa(M)) \mu(N)) - 2(k-2+\kappa(M)) - g(N) - 1}.
\end{align*}

Then

\begin{align*}
\frac{1}{N} \log_2 \mathrm{Q}(\Phi^{\otimes N}) &\ge \frac{1}{N} \log_2 (\mathbb{E}[X] - \mathbb{E}[Y]) \\
&\ge H(P) - (k-2+\kappa(M)) \max_{R, Q} \frac{H(Q) - H(P) + (1 + g(N) + f(N)) \tfrac{1}{N}}{r(R)} - \frac{2(k-2+\kappa(M)) + g(N) + 1}{N}.
\end{align*}

We let $t$ and thus $N$ go to infinity and obtain

\[ \log_2 \tilde{\mathrm{Q}}(\Phi) \ge H(P) - (k-2) \max_{R, Q} \frac{H(Q) - H(P)}{r(R)}. \]

This lower bound holds for any rational probability distribution $P$ on $\Phi$, and by continuity for any real probability distribution $P$ on $\Phi$.

It remains to prove (5.12) and (5.13). We do this in the lemmas below.

Lemma 5.12. $\mathbb{E}[X] = |B|\, |\Psi|\, M^{-(k-1)}$.

Proof. Let $a \in \Psi$. Then $h_1(a_1) + \cdots + h_{k-1}(a_{k-1}) = (k-1)\, h_k(a_k)$. The following four statements are equivalent:

\[ a \in \Psi' \]

\[ \forall i \in [k] \quad h_i(a_i) \in B \]

\[ \exists b \in B \quad h_1(a_1) = \cdots = h_k(a_k) = b \]

\[ \exists b \in B \quad h_1(a_1) = \cdots = h_{k-1}(a_{k-1}) = b. \]

Therefore

\[ \mathbb{P}[a \in \Psi'] = \sum_{b \in B} \mathbb{P}[h_1(a_1) = \cdots = h_{k-1}(a_{k-1}) = b]. \]

For $b \in B$,

\[ \mathbb{P}[h_1(a_1) = \cdots = h_{k-1}(a_{k-1}) = b] = (M^{-1})^{k-1}. \]

We conclude

\begin{align*}
\mathbb{E}[X] &= \sum_{a \in \Psi} \mathbb{P}[a \in \Psi'] \\
&= \sum_{a \in \Psi} \sum_{b \in B} \mathbb{P}[h_1(a_1) = \cdots = h_{k-1}(a_{k-1}) = b] \\
&= \sum_{a \in \Psi} \sum_{b \in B} (M^{-1})^{k-1} \\
&= |\Psi|\, |B|\, M^{-(k-1)}.
\end{align*}

This proves the lemma.

Lemma 5.13. $\mathbb{E}[Y] \le |B| \max_{R, Q} 2^{N H(Q) + f(N)} M^{-(k-1) - r(R)}$.

Proof. Let

\[ C = \{(a, a') \in \Psi^2 : a \ne a',\ \exists i \in [k]\ a_i = a'_i\}. \]

Let $(a, a') \in C$. The following statements are equivalent:

\[ (a, a') \in C' \tag{5.15} \]

\[ a, a' \in \Psi' \tag{5.16} \]

\[ \forall i \in [k] \quad h_i(a_i), h_i(a'_i) \in B \tag{5.17} \]

\[ \exists b \in B \quad h_1(a_1) = \cdots = h_k(a_k) = h_1(a'_1) = \cdots = h_k(a'_k) = b. \tag{5.18} \]

Therefore

\begin{align*}
\mathbb{E}[Y] &= \sum_{(a, a') \in C} \mathbb{P}[(a, a') \in C'] \\
&= \sum_{(a, a') \in C} \sum_{b \in B} \mathbb{P}[h_1(a_1) = \cdots = h_k(a_k) = h_1(a'_1) = \cdots = h_k(a'_k) = b].
\end{align*}

Let $(a, a') \in C$. Then $h_i(a_i)$ and $h_i(a'_i)$ are $\mathbb{Z}/M\mathbb{Z}$-linear combinations of $u_1, \ldots, u_{k-1}, v_1, \ldots, v_N$. By Lemma 5.9, the random variable

\[ (h_1(a_1), \ldots, h_k(a_k), h_1(a'_1), \ldots, h_k(a'_k)) \]

is uniformly distributed over the image subspace $V \subseteq (\mathbb{Z}/M\mathbb{Z})^{2k}$. Let $b \in B$. Then $(b, \ldots, b) \in V$, since $u_1 = \cdots = u_{k-1} = b$, $v_1, \ldots, v_N = 0$ is a valid assignment. Therefore

\[ \mathbb{P}[h_1(a_1) = \cdots = h_k(a_k) = h_1(a'_1) = \cdots = h_k(a'_k) = b] = |V|^{-1}. \]

And $|V|$ equals $M$ to the power the rank of the matrix

\[ \begin{pmatrix}
1 & 0 & \cdots & 0 & \frac{1}{k-1} & 1 & 0 & \cdots & 0 & \frac{1}{k-1} \\
0 & 1 & & 0 & \frac{1}{k-1} & 0 & 1 & & 0 & \frac{1}{k-1} \\
\vdots & & \ddots & & \vdots & \vdots & & \ddots & & \vdots \\
0 & 0 & \cdots & 1 & \frac{1}{k-1} & 0 & 0 & \cdots & 1 & \frac{1}{k-1} \\
a_1 & a_2 & \cdots & a_{k-1} & -\frac{a_k}{k-1} & a'_1 & a'_2 & \cdots & a'_{k-1} & -\frac{a'_k}{k-1}
\end{pmatrix} \tag{5.19} \]

over $\mathbb{Z}/M\mathbb{Z}$, with $a_1, \ldots, a_k, a'_1, \ldots, a'_k$ thought of as column vectors in $(\mathbb{Z}/M\mathbb{Z})^N$. With column operations we transform (5.19) into

\[ \begin{pmatrix}
0 & 0 & \cdots & 0 & 0 & 1 & 0 & \cdots & 0 & 0 \\
0 & 0 & \cdots & 0 & 0 & 0 & 1 & & 0 & 0 \\
\vdots & & & & \vdots & \vdots & & \ddots & & \vdots \\
0 & 0 & \cdots & 0 & 0 & 0 & 0 & \cdots & 1 & 0 \\
a_1 - a'_1 & a_2 - a'_2 & \cdots & a_{k-1} - a'_{k-1} & a_k - a'_k & a'_1 & a'_2 & \cdots & a'_{k-1} & 0
\end{pmatrix} \tag{5.20} \]

Matrix (5.20) has rank equal to $k - 1$ plus $r_M(a, a') = \operatorname{rk}(A(a, a'))$, where

\[ A(a, a') = \begin{pmatrix} a_1 - a'_1 & a_2 - a'_2 & \cdots & a_k - a'_k \end{pmatrix}. \]

We obtain

\[ \mathbb{E}[Y] \le \sum_{(a, a') \in C} \sum_{b \in B} M^{-(k-1+r_M(a, a'))}. \]

Since the summands are independent of $b$, we get

\[ \mathbb{E}[Y] \le |B| \sum_{(a, a') \in C} M^{-(k-1+r_M(a, a'))}. \]

Let $(a, a') \in C$. Consider the rows of $A(a, a')$. The $N$ rows are of the form $x_i - y_i$ with $(x_i, y_i) \in \Phi^2$. Let $s = ((x_1, y_1), \ldots, (x_N, y_N))$. Let $R = \{(x_1, y_1), \ldots, (x_N, y_N)\}$. We have $r_M(a, a') = r_M(R)$ and $r_M(R) = r(R)$ by (5.8). Let $Q$ be the $N$-type with $\operatorname{supp}(Q) = R$ and $s \in T^N_Q$. From $a \ne a'$ follows $R \not\subseteq \{(x, x) : x \in \Phi\}$. From $\exists i \in [k]\ a_i = a'_i$ follows $\exists i \in [k]\ R \subseteq \{(x, y) : x_i = y_i\}$. From $a, a' \in T^N_{P_1} \times \cdots \times T^N_{P_k}$ follows $Q_i = Q_{k+i} = P_i$ for all $i \in [k]$. We thus have

\[ \mathbb{E}[Y] \le |B| \sum_{R \in \mathcal{R}(\Phi)} \ \sum_{\substack{Q \in \mathcal{Q}(R, (P_1, \ldots, P_k)) \\ \operatorname{supp}(Q) = R \\ Q \text{ is } N\text{-type}}} \ \sum_{s \in T^N_Q} M^{-(k-1+r(R))}. \]

The number of $N$-types $Q$ with $\operatorname{supp}(Q) = R$ is at most the number of $N$-types on $R$, which is at most $\binom{N + |R| - 1}{|R| - 1}$ (Lemma 4.19). For any $Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))$, $|T^N_Q| \le 2^{N H(Q)}$ (Lemma 4.19 and Lemma 4.20). Therefore

\[ \mathbb{E}[Y] \le |B| \sum_{R \in \mathcal{R}(\Phi)} \binom{N + |R| - 1}{|R| - 1} \max_{Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))} 2^{N H(Q)} M^{-(k-1+r(R))}. \]

Also $|\mathcal{R}(\Phi)| \le 2^{|\Phi|^2}$. Therefore

\[ \mathbb{E}[Y] \le |B|\, 2^{|\Phi|^2} \max_{R \in \mathcal{R}(\Phi)} \binom{N + |R| - 1}{|R| - 1} \max_{Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))} 2^{N H(Q)} M^{-(k-1+r(R))}. \]

We conclude that

\[ \mathbb{E}[Y] \le |B| \max_{R, Q} 2^{N H(Q) + f(N)} M^{-(k-1) - r(R)}. \]

This proves the lemma.

5.2.2 Computational remarks

The following two lemmas are helpful when applying Theorem 5.7. We leave the proofs to the reader.

Lemma 5.14. Let $P \in \mathcal{P}(\Phi)$. Let $R, R' \in \mathcal{R}(\Phi)$ with $R \subseteq R'$ and $r(R) = r(R')$. Then

\[ \max_{Q \in \mathcal{Q}(R, (P_1, \ldots, P_k))} \frac{H(Q) - H(P)}{r(R)} \le \max_{Q \in \mathcal{Q}(R', (P_1, \ldots, P_k))} \frac{H(Q) - H(P)}{r(R')}. \]

Lemma 5.15. Let $R \in \mathcal{R}(\Phi)$. There is an equivalence relation $R' \in \mathcal{R}(\Phi)$ with $R \subseteq R'$ and $r(R) = r(R')$.

5.2.3 Examples: type sets

We discuss some examples. The first example we will use to get good upper bounds on the asymptotic rank of complete graph tensors in Section 5.5. We focus on one family of examples that is parametrised by partitions. Let $\lambda \vdash k$ be an integer partition of $k$ with $d$ parts. Let

\[ \Phi_\lambda = \{a \in \{0, 1, \ldots, d-1\}^k : \operatorname{type}(a) = \lambda\}. \]

The set $\Phi_\lambda$ is tight.

Theorem 5.16. $\log_2 \tilde{\mathrm{Q}}(\Phi_{(2,2)}) = 1$.

Proof. Let $\Phi = \Phi_{(2,2)}$. Clearly $\tilde{\mathrm{Q}}(\Phi) \le 2$. After relabelling, $\forall a \in \Phi\ \sum_{i=1}^k a_i = 0$. We may thus apply Theorem 5.7. Let $P$ be the uniform probability distribution on $\Phi$. Then $H(P) = \log_2 6$.

Let $R \in \mathcal{R}(\Phi)$. We may assume that

\[ R \subseteq \{(1,1,0,0), (1,0,1,0), (1,0,0,1)\}^2 \cup \{(0,0,1,1), (0,1,0,1), (0,1,1,0)\}^2. \]

We may assume $R$ is an equivalence relation (Lemma 5.15). Let $(x, y) \in R$. Let $R' = R \cup \{((1,1,1,1) - x, (1,1,1,1) - y)\}$. Then $R \subseteq R'$ and $R' \in \mathcal{R}(\Phi)$ and $r(R) = r(R')$. We may thus assume that if $(x, y) \in R$, then also $((1,1,1,1) - x, (1,1,1,1) - y) \in R$ (Lemma 5.14).

Let $S = \{(1,1,0,0), (1,0,1,0), (1,0,0,1)\}$. By the above observation it suffices to consider equivalence relations on $S$. There are three types of such equivalence relations.

Type (3): all three elements of $S$ are equivalent. Then $|R| = 18$ and $r(R) = 2$.

Type (2,1): two elements of $S$ are equivalent and inequivalent to the third element (which is equivalent to itself). Then $|R| = 10$ and $r(R) = 1$.

Type (1,1,1): all elements of $S$ are inequivalent. Then $R \subseteq \{(x, x) : x \in \Phi\}$, which is a contradiction.

For type (3) and (2,1) the uniform probability distribution $Q$ on $R$ has marginals $Q_i = Q_{4+i} = P_i$ for $i \in [4]$. The uniform $Q$ is optimal. Then $H(Q) = \log_2 |R|$. Let $R_{(3)}$ and $R_{(2,1)}$ be equivalence relations of type (3) and (2,1). Then

\begin{align*}
\log_2 \tilde{\mathrm{Q}}(\Phi) &\ge \min \Bigl\{ H(P) - \frac{2}{r(R_{(3)})} \bigl( \log_2 |R_{(3)}| - H(P) \bigr),\ H(P) - \frac{2}{r(R_{(2,1)})} \bigl( \log_2 |R_{(2,1)}| - H(P) \bigr) \Bigr\} \\
&= \min \Bigl\{ \log_2 6 - \tfrac{2}{2} (\log_2 18 - \log_2 6),\ \log_2 6 - \tfrac{2}{1} (\log_2 10 - \log_2 6) \Bigr\} \\
&= \min \bigl\{ 1, \log_2 \tfrac{54}{25} \bigr\} = 1.
\end{align*}

This proves the theorem.
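The numbers $|R|$, $r(R)$ and the two candidate values in the proof of Theorem 5.16 can be recomputed mechanically. The sketch below builds the complement-closed equivalence relations of type (3) and (2,1), computes the rank of the difference matrix over $\mathbb{Q}$ with exact fractions, and evaluates $H(P) - \frac{k-2}{r(R)}(\log_2 |R| - H(P))$ with $k = 4$ and $H(P) = \log_2 6$. The helper names are our own, and the sketch takes the optimality of the uniform $Q$ from the proof rather than verifying it.

```python
from fractions import Fraction
from itertools import product
from math import log2


def rank(rows):
    """Rank over Q of an integer matrix, by Gauss-Jordan elimination."""
    rows = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][col] != 0:
                factor = rows[i][col] / rows[rk][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[rk])]
        rk += 1
    return rk


S = [(1, 1, 0, 0), (1, 0, 1, 0), (1, 0, 0, 1)]


def complement(x):
    return tuple(1 - xi for xi in x)


def relation(classes):
    """Equivalence relation with the given classes on S, closed under x -> (1,1,1,1) - x."""
    R = set()
    for cls in classes:
        for x, y in product(cls, repeat=2):
            R.add((x, y))
            R.add((complement(x), complement(y)))
    return R


def bound(R, k=4, entropy_P=log2(6)):
    """Return (|R|, r(R), H(P) - (k-2)/r(R) * (log2|R| - H(P)))."""
    r = rank([[a - b for a, b in zip(x, y)] for x, y in R])
    return len(R), r, entropy_P - (k - 2) / r * (log2(len(R)) - entropy_P)


size3, rank3, value3 = bound(relation([S]))              # type (3)
size21, rank21, value21 = bound(relation([S[:2], S[2:]]))  # type (2,1)
print(size3, rank3, value3)
print(size21, rank21, value21)
```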

Theorem 5.17. $\log_2 \tilde{\mathrm{Q}}(\Phi_{(0^{k-1}1)}) = h(1/k)$.

Proof. We refer to [CVZ16].

With Srinivasan Arunachalam and Peter Vrana we have the following unpublished result.

Theorem 5.18. $\log_2 \tilde{\mathrm{Q}}(\Phi_{(0^{k/2}1^{k/2})}) = 1$.

5.3 Combinatorial degeneration method

In this section we extend the (higher-order) Coppersmith–Winograd method via a preorder called combinatorial degeneration. Suppose $\Psi \subseteq I_1 \times \cdots \times I_k$ is not tight, but has a tight subset $\Phi \subseteq \Psi$. In the rest of this section we focus on obtaining a lower bound on $\tilde{\mathrm{Q}}(\Psi)$ via $\Phi$. This has an application in the context of tri-colored sum-free sets (Section 5.4.2), for example.

Definition 5.19 ([BCS97]). Let $\Phi \subseteq \Psi \subseteq I_1 \times \cdots \times I_k$. We say that $\Phi$ is a combinatorial degeneration of $\Psi$, and write $\Psi \trianglerighteq \Phi$, if there are maps $u_i : I_i \to \mathbb{Z}$ ($i \in [k]$) such that for all $\alpha \in I_1 \times \cdots \times I_k$, if $\alpha \in \Psi \setminus \Phi$, then $\sum_{i=1}^k u_i(\alpha_i) > 0$, and if $\alpha \in \Phi$, then $\sum_{i=1}^k u_i(\alpha_i) = 0$. Note that the maps $u_i$ need not be injective.
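Definition 5.19 can be checked mechanically once candidate maps $u_i$ are given. As an illustration of our own (not taken from this section), take $k = 3$, $\Psi = \{\alpha \in \{0, \ldots, n-1\}^3 : \alpha_1 + \alpha_2 + \alpha_3 \equiv n - 1 \bmod n\}$ and $\Phi = \{\alpha : \alpha_1 + \alpha_2 + \alpha_3 = n - 1\}$, with the assumed maps $u_1(x) = u_2(x) = x$ and $u_3(x) = x - (n-1)$. The elements of $\Psi \setminus \Phi$ have coordinate sum $2n - 1$, so the weighted sum is $n > 0$ there and $0$ on $\Phi$.

```python
from itertools import product


def is_combinatorial_degeneration(psi, phi, maps):
    """Check that `maps` (one dict per coordinate) witness that phi is a
    combinatorial degeneration of psi in the sense of Definition 5.19."""
    psi, phi = set(psi), set(phi)
    if not phi <= psi:
        return False
    for alpha in psi:
        total = sum(u[x] for u, x in zip(maps, alpha))
        if alpha in phi and total != 0:
            return False
        if alpha not in phi and total <= 0:
            return False
    return True


n = 3
cube = list(product(range(n), repeat=3))
psi = [a for a in cube if sum(a) % n == n - 1]
phi = [a for a in cube if sum(a) == n - 1]
# Assumed witness maps; they happen to be injective here, but need not be.
maps = [{x: x for x in range(n)}, {x: x for x in range(n)},
        {x: x - (n - 1) for x in range(n)}]
assert is_combinatorial_degeneration(psi, phi, maps)
```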

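Combinatorial degeneration gets its name from the following standard proposition, see e.g. [BCS97, Proposition 15.30].

Proposition 5.20. Let $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$. Let $\Psi = \operatorname{supp}(t)$. Let $\Phi \subseteq \Psi$ such that $\Psi \trianglerighteq \Phi$. Then $t \trianglerighteq t|_\Phi$.

Proposition 5.20 brings us only slightly closer to our goal. Namely, given $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ with $\Psi = \operatorname{supp}(t)$ and given $\Phi \subseteq \Psi$ such that $\Psi \trianglerighteq \Phi$, it follows directly from Proposition 5.20 that $t \trianglerighteq t|_\Phi$ and thus $\tilde{\mathrm{Q}}(t) \ge \tilde{\mathrm{Q}}(t|_\Phi)$. This, however, does not give us a lower bound on the combinatorial asymptotic subrank $\tilde{\mathrm{Q}}(\Psi)$. The following theorem does. Our theorem extends a result in [KSS16].

Theorem 5.21. Let $\Phi \subseteq \Psi \subseteq I_1 \times \cdots \times I_k$. If $\Psi \trianglerighteq \Phi$, then

\[ \tilde{\mathrm{Q}}(\Psi) \ge \tilde{\mathrm{Q}}(\Phi). \]

Lemma 5.22. Let $\Phi \subseteq \Psi \subseteq I_1 \times \cdots \times I_k$. If $\Psi \trianglerighteq \Phi$, then $\tilde{\mathrm{Q}}(\Psi) \ge \mathrm{Q}(\Phi)$.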

Proof. Pick maps $u_i : I_i \to \mathbb{Z}$ such that

\[ \sum_{i=1}^k u_i(\alpha_i) = 0 \quad \text{for } \alpha \in \Phi, \]

\[ \sum_{i=1}^k u_i(\alpha_i) > 0 \quad \text{for } \alpha \in \Psi \setminus \Phi. \]

Let $D$ be a free diagonal in $\Phi$ with $|D| = \mathrm{Q}(\Phi)$ and let

\[ w_i = \sum_{x \in D_i} u_i(x). \]

Let $n \in \mathbb{N}$ and define

\[ W_i = \Bigl\{ (x_1, \ldots, x_{n|D|}) \in I_i^{\times n|D|} : \sum_{j=1}^{n|D|} u_i(x_j) = n w_i \Bigr\}. \]

Then

\[ \Psi^{\times n|D|} \cap (W_1 \times \cdots \times W_k) = \Phi^{\times n|D|} \cap (W_1 \times \cdots \times W_k). \]

The inclusion $\supseteq$ is clear. To show $\subseteq$, let $(x_1, \ldots, x_k) \in \Psi^{\times n|D|} \cap (W_1 \times \cdots \times W_k)$. Write $x_i = (x_{i,1}, x_{i,2}, \ldots, x_{i,n|D|})$ and consider the $n|D| \times k$ matrix of evaluations

\[ \begin{pmatrix}
u_1(x_{1,1}) & u_2(x_{2,1}) & \cdots & u_k(x_{k,1}) \\
u_1(x_{1,2}) & u_2(x_{2,2}) & \cdots & u_k(x_{k,2}) \\
\vdots & \vdots & & \vdots \\
u_1(x_{1,n|D|}) & u_2(x_{2,n|D|}) & \cdots & u_k(x_{k,n|D|})
\end{pmatrix}. \]

The sum of the $i$th column is $n w_i$ by definition of $W_i$, and $\sum_{i=1}^k n w_i = 0$ (because $D \subseteq \Phi$ is a diagonal, so $\sum_i w_i = \sum_{\alpha \in D} \sum_i u_i(\alpha_i) = 0$). The row sums are nonnegative by definition of the maps $u_1, \ldots, u_k$. We conclude that the row sums are zero. Therefore $(x_1, \ldots, x_k)$ is an element of $\Phi^{\times n|D|}$.

Since $D$ is a free diagonal in $\Phi$, $D^{\times n|D|}$ is a free diagonal in $\Phi^{\times n|D|}$, and also $D^{\times n|D|} \cap (W_1 \times \cdots \times W_k)$ is a free diagonal in $\Phi^{\times n|D|} \cap (W_1 \times \cdots \times W_k)$, which in turn is equal to $\Psi^{\times n|D|} \cap (W_1 \times \cdots \times W_k)$. Therefore $D^{\times n|D|} \cap (W_1 \times \cdots \times W_k)$ is also a free diagonal in $\Psi^{\times n|D|}$, i.e.

\[ \mathrm{Q}(\Psi^{\times n|D|}) \ge |D^{\times n|D|} \cap (W_1 \times \cdots \times W_k)|. \]

In the set $D^{\times n|D|}$ consider the strings with uniform type, i.e. where all $|D|$ elements of $D$ occur exactly $n$ times. These are clearly in $W_1 \times \cdots \times W_k$ and their number is $\binom{n|D|}{n, \ldots, n}$. Therefore

\[ \mathrm{Q}(\Psi^{\times n|D|}) \ge \binom{n|D|}{n, \ldots, n} = |D|^{n|D| - o(n)}, \]

which implies $\tilde{\mathrm{Q}}(\Psi) = \lim_{n \to \infty} \mathrm{Q}(\Psi^{\times n|D|})^{\frac{1}{n|D|}} \ge |D|$.
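The last step uses that the number of uniform-type strings, $\binom{n|D|}{n, \ldots, n} = (n|D|)! / (n!)^{|D|}$, grows like $|D|^{n|D| - o(n)}$. The following sketch (with hypothetical function names, and $|D| = 3$ as an arbitrary example) computes the normalised exponent $\frac{1}{n|D|} \log_2 \binom{n|D|}{n, \ldots, n}$ for growing $n$ and lets one observe that it approaches $\log_2 |D|$ from below.

```python
from math import factorial, log2


def uniform_type_count(n, d):
    """Number of strings of length n*d over d symbols in which every symbol
    occurs exactly n times: the multinomial coefficient (n*d)! / (n!)^d."""
    return factorial(n * d) // factorial(n) ** d


def rate(n, d):
    """Normalised exponent (1 / (n*d)) * log2 of the uniform-type count."""
    return log2(uniform_type_count(n, d)) / (n * d)


d = 3
for n in (1, 10, 50, 200):
    print(n, rate(n, d), log2(d))
```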

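Proof of Theorem 5.21. We have $\tilde{\mathrm{Q}}(\Psi) = \lim_{n \to \infty} \tilde{\mathrm{Q}}(\Psi^{\times n})^{1/n}$. Since $\Psi \trianglerighteq \Phi$ implies $\Psi^{\times n} \trianglerighteq \Phi^{\times n}$, it follows from Lemma 5.22 that

\[ \lim_{n \to \infty} \tilde{\mathrm{Q}}(\Psi^{\times n})^{1/n} \ge \lim_{n \to \infty} \mathrm{Q}(\Phi^{\times n})^{1/n}. \]

The right-hand side is $\tilde{\mathrm{Q}}(\Phi)$.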

5.4 Cap sets

A subset $A \subseteq (\mathbb{Z}/3\mathbb{Z})^n$ is called a cap set if any line in $A$ is a point, a line being a triple of points of the form $(u, u + v, u + 2v)$. Until recently it was not known whether the maximal size of a cap set in $(\mathbb{Z}/3\mathbb{Z})^n$ grows like $3^{n - o(n)}$ or like $c^{n - o(n)}$ for some $c < 3$. Gijswijt and Ellenberg in [EG17], inspired by the work of Croot, Lev and Pach in [CLP17], settled this question, showing that $c \le 3(207 + 33\sqrt{33})^{1/3}/8 \approx 2.755$. Tao realised in [Tao16] that the cap set question may naturally be phrased as the problem of computing the size of the largest main diagonal in powers of the "cap set tensor" $\sum_\alpha e_{\alpha_1} \otimes e_{\alpha_2} \otimes e_{\alpha_3}$, where the sum is over $\alpha_1, \alpha_2, \alpha_3 \in \mathbb{F}_3$ with $\alpha_1 + \alpha_2 + \alpha_3 = 0$. Here main diagonal refers to a subset $A$ of the basis elements such that restricting the cap set tensor to $A \times A \times A$ gives the tensor $\sum_{v \in A} v \otimes v \otimes v$. We show that the cap set tensor is in the $\mathrm{GL}_3(\mathbb{F}_3)^{\times 3}$ orbit of the "reduced polynomial multiplication tensor", which was studied in [Str91], and we show how recent results follow from this connection using Theorem 5.21.

5.4.1 Reduced polynomial multiplication

Let $t_n$ be the tensor $\sum_{\alpha} e_{\alpha_1} \otimes e_{\alpha_2} \otimes e_{\alpha_3}$, where the sum is over $(\alpha_1, \alpha_2, \alpha_3)$ in $\{0, 1, \ldots, n-1\}^3$ such that $\alpha_1 + \alpha_2 = \alpha_3$. We call $t_n$ the reduced polynomial multiplication tensor, since $t_n$ is essentially the structure tensor of the algebra $\mathbb{F}[x]/(x^n)$ of univariate polynomials modulo the ideal generated by $x^n$. The support of $t_n$ equals

$$\{ (\alpha_1, \alpha_2, \alpha_3) \in \{0, \ldots, n-1\}^3 \mid \alpha_1 + \alpha_2 = \alpha_3 \},$$

which via $\alpha_3 \mapsto n - 1 - \alpha_3$ we may identify with the set

$$\Phi_n = \{ (\alpha_1, \alpha_2, \alpha_3) \in \{0, \ldots, n-1\}^3 \mid \alpha_1 + \alpha_2 + \alpha_3 = n - 1 \}. \tag{5.21}$$

The support $\Phi_n$ is tight (cf. Example 5.1). Strassen proves in [Str91, Theorem 6.7], using Corollary 5.4, that $\tilde{Q}(t_n) = \tilde{Q}(\Phi_n) = z(n)$, where $z(n)$ is defined as

$$z(n) = \frac{\gamma^n - 1}{\gamma - 1}\, \gamma^{-2(n-1)/3} \tag{5.22}$$

with $\gamma$ equal to the unique positive real solution of the equation

$$\frac{1}{\gamma - 1} - \frac{n}{\gamma^n - 1} = \frac{n-1}{3}.$$
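The definition (5.22) is implicit in $\gamma$, so a small numerical sketch may help. The Python function below (our own illustration, not taken from [Str91]) finds $\gamma$ by bisection and then evaluates $z(n)$. To avoid the removable singularity at $\gamma = 1$ it uses the elementary rewriting $\frac{1}{\gamma-1} - \frac{n}{\gamma^n-1} = T(\gamma)/S(\gamma)$ with $S(\gamma) = \sum_{j=0}^{n-1} \gamma^j$ and $T(\gamma) = \sum_{j=1}^{n-1} \sum_{i=0}^{j-1} \gamma^i$. The search interval $[1, 4]$ is an assumption justified in the comments.

```python
def z(n: int, iterations: int = 200) -> float:
    """Evaluate z(n) from (5.22), solving for gamma by bisection."""
    if n < 2:
        raise ValueError("n must be at least 2")

    def geometric_sum(g: float, terms: int) -> float:
        # 1 + g + ... + g**(terms - 1)
        return sum(g ** i for i in range(terms))

    def lhs(g: float) -> float:
        # Equals 1/(g - 1) - n/(g**n - 1), written without division by g - 1.
        s = geometric_sum(g, n)
        t = sum(geometric_sum(g, j) for j in range(1, n))
        return t / s

    target = (n - 1) / 3
    # lhs(1) = (n - 1)/2 > target, and lhs(4) < 1/(4 - 1) <= target,
    # so the solution lies in [1, 4].
    lo, hi = 1.0, 4.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if lhs(mid) > target:
            lo = mid
        else:
            hi = mid
    gamma = (lo + hi) / 2
    return geometric_sum(gamma, n) * gamma ** (-2 * (n - 1) / 3)


if __name__ == "__main__":
    for n in range(2, 11):
        print(n, round(z(n), 5))
```

For $n = 2$ the equation reduces to $1/(\gamma+1) = 1/3$, so $\gamma = 2$ and $z(2) = 3/2^{2/3}$, which the sketch reproduces.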

The following table contains values of $z(n)$ for small $n$. See also [Str91, Table 1].

n     z(n) rounded    z(n) exact
2     1.88988         $3/2^{2/3} = 2^{h(1/3)}$
3     2.75510         $3(207 + 33\sqrt{33})^{1/3}/8$
4     3.61072
5     4.46158
6     5.30973
7     6.15620
8     7.00155
9     7.84612
10    8.69012

In fact, [Str91, Theorem 6.7] says that the asymptotic spectrum of $t_n$ is completely determined by the support functionals, and that the possible values that the spectral points can take on $t_n$ form the closed interval $[z(n), n]$ (cf. Remark 2.21):

$$X(\mathbb{N}[t_n]) = \{ \zeta^\theta|_{\mathbb{N}[t_n]} : \theta \in \mathcal{P}([3]) \}, \qquad \{ \phi(t_n) : \phi \in X(\mathbb{N}[t_n]) \} = [z(n), n].$$

5.4.2 Cap sets

We turn to cap sets.

Definition 5.23. A three-term progression-free set is a set $A \subseteq (\mathbb{Z}/m\mathbb{Z})^n$ satisfying the following. For all $(x_1, x_2, x_3) \in A^{\times 3}$: there are $u, v \in (\mathbb{Z}/m\mathbb{Z})^n$ such that $(x_1, x_2, x_3) = (u, u+v, u+2v)$ if and only if $x_1 = x_2 = x_3$. Let $r_3((\mathbb{Z}/m\mathbb{Z})^n)$ be the size of the largest three-term progression-free set in $(\mathbb{Z}/m\mathbb{Z})^n$, and define the regularisation $\tilde{r}_3(\mathbb{Z}/m\mathbb{Z}) = \lim_{n \to \infty} r_3((\mathbb{Z}/m\mathbb{Z})^n)^{1/n}$.

A three-term progression-free set in $(\mathbb{Z}/3\mathbb{Z})^n$ is called a cap or cap set. We next discuss an asymmetric variation on three-term progression-free sets, called tri-colored sum-free sets, which are potentially larger. They are interesting since all known upper bound techniques for the size of three-term progression-free sets turn out to be upper bounds on the size of tri-colored sum-free sets.

Definition 5.24. Let $G$ be an abelian group. Let $\Gamma \subseteq G \times G \times G$. For $i \in [3]$ we define the marginal sets $\Gamma_i = \{ x \in G : \exists \alpha \in \Gamma,\ \alpha_i = x \}$. We say $\Gamma$ is tricolored sum-free if the following holds. The set $\Gamma$ is a diagonal, and for any $\alpha \in \Gamma_1 \times \Gamma_2 \times \Gamma_3$: $\alpha_1 + \alpha_2 + \alpha_3 = 0$ if and only if $\alpha \in \Gamma$. (Recall that $\Gamma \subseteq I_1 \times I_2 \times I_3$ is a diagonal when any two distinct $\alpha, \beta \in \Gamma$ are distinct in all coordinates.) Let $s_3(G)$ be the size of the largest tricolored sum-free set in $G \times G \times G$, and define the regularisation

$$\tilde{s}_3(G) = \lim_{n \to \infty} s_3(G^{\times n})^{1/n}.$$

Equivalently, $\Gamma \subseteq G \times G \times G$ is a tricolored sum-free set if and only if $\Gamma$ is a free diagonal in $\{ \alpha \in G \times G \times G : \alpha_1 + \alpha_2 + \alpha_3 = 0 \}$.

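If the set $A \subseteq G = (\mathbb{Z}/m\mathbb{Z})^n$ is three-term progression-free, then the set $\Gamma = \{ (a, a, -2a) : a \in A \} \subseteq G \times G \times G$ is tri-colored sum-free. Therefore we have $\tilde{r}_3(\mathbb{Z}/m\mathbb{Z}) \leq \tilde{s}_3(\mathbb{Z}/m\mathbb{Z})$.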
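The two definitions and the passage from $A$ to $\Gamma$ can be checked by brute force on small examples. The following Python sketch is only an illustration under our own naming (`is_progression_free`, `tricolored_from`, `is_tricolored_sum_free` are hypothetical helpers); it represents points of $(\mathbb{Z}/m\mathbb{Z})^n$ as tuples of integers and is exponential in the size of the input.

```python
from itertools import product

Point = tuple[int, ...]


def add(x: Point, y: Point, m: int) -> Point:
    return tuple((a + b) % m for a, b in zip(x, y))


def scale(c: int, x: Point, m: int) -> Point:
    return tuple((c * a) % m for a in x)


def is_progression_free(A: set[Point], m: int) -> bool:
    """Definition 5.23: every (u, u+v, u+2v) with all entries in A is constant."""
    for x1, x2 in product(A, repeat=2):
        v = add(x2, scale(-1, x1, m), m)
        x3 = add(x1, scale(2, v, m), m)
        if x3 in A and not (x1 == x2 == x3):
            return False
    return True


def tricolored_from(A: set[Point], m: int) -> set[tuple[Point, Point, Point]]:
    """The set Gamma = {(a, a, -2a) : a in A} from the text."""
    return {(a, a, scale(-2, a, m)) for a in A}


def is_tricolored_sum_free(gamma: set[tuple[Point, Point, Point]], m: int) -> bool:
    """Definition 5.24, checked over all of Gamma_1 x Gamma_2 x Gamma_3."""
    triples = list(gamma)
    # Diagonal: distinct elements of gamma differ in every coordinate.
    for i in range(3):
        if len({t[i] for t in triples}) != len(triples):
            return False
    marginals = [{t[i] for t in triples} for i in range(3)]
    for alpha in product(*marginals):
        total = add(add(alpha[0], alpha[1], m), alpha[2], m)
        sums_to_zero = all(c == 0 for c in total)
        if sums_to_zero != (alpha in gamma):
            return False
    return True


if __name__ == "__main__":
    cap = set(product((0, 1), repeat=2))  # {0,1}^2 inside (Z/3Z)^2
    print(is_progression_free(cap, 3))
    print(is_tricolored_sum_free(tricolored_from(cap, 3), 3))
```

For $m = 3$ one has $-2a = a$, so $\Gamma$ is simply $\{(a, a, a) : a \in A\}$.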

We summarise the recent history of results on cap sets. For clarity we focus on $m = 3$; we refer the reader to the references for the general results. Edel in [Ede04] proved the lower bound $2.21739 \leq \tilde{r}_3(\mathbb{Z}/3\mathbb{Z})$. In [EG17] Ellenberg and Gijswijt proved the upper bound

$$\tilde{r}_3(\mathbb{Z}/3\mathbb{Z}) \leq 3(207 + 33\sqrt{33})^{1/3}/8 \approx 2.755.$$

Blasiak et al. [BCC+17] proved that in fact

$$\tilde{s}_3(\mathbb{Z}/3\mathbb{Z}) \leq 3(207 + 33\sqrt{33})^{1/3}/8.$$

This upper bound was shown to be an equality in [KSS16, Nor16, Peb16].

Theorem 5.25. $\tilde{s}_3(\mathbb{Z}/3\mathbb{Z}) = 3(207 + 33\sqrt{33})^{1/3}/8$.

We reprove Theorem 5.25 by proving that $\tilde{s}_3(\mathbb{Z}/m\mathbb{Z})$ equals the asymptotic subrank $z(m)$ of $t_m$ discussed in Section 5.4.1, when $m$ is a prime power. The significance of our proof lies in the explicit connection to the framework of asymptotic spectra, and not in the obtained value, which also for prime powers $m$ was already computed in [BCC+17, KSS16, Nor16, Peb16].

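Proof. We will prove $\tilde{s}_3(\mathbb{Z}/m\mathbb{Z}) = z(m)$ when $m$ is a prime power. By definition, $\tilde{s}_3(\mathbb{Z}/m\mathbb{Z})$ equals the asymptotic subrank of the set

$$\{ \alpha \in \{0, \ldots, m-1\}^3 : \alpha_1 + \alpha_2 + \alpha_3 = 0 \bmod m \},$$

which via $\alpha_3 \mapsto \alpha_3 - (m-1)$ we may identify with the set

$$\Psi_m = \{ \alpha \in \{0, \ldots, m-1\}^3 : \alpha_1 + \alpha_2 + \alpha_3 = m - 1 \bmod m \},$$

and so $\tilde{s}_3(\mathbb{Z}/m\mathbb{Z}) = \tilde{Q}(\Psi_m)$. Let

$$\Phi_m = \{ \alpha \in \{0, \ldots, m-1\}^3 : \alpha_1 + \alpha_2 + \alpha_3 = m - 1 \}.$$

We know $\tilde{Q}(\Phi_m) = z(m)$ (Section 5.4.1). We will show that $\tilde{Q}(\Phi_m) = \tilde{Q}(\Psi_m)$ when $m$ is a prime power. This proves the theorem.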

We prove $\tilde{Q}(\Phi_m) \leq \tilde{Q}(\Psi_m)$. There is a combinatorial degeneration $\Phi_m \trianglelefteq \Psi_m$. Indeed, let $u_i : \{0, \ldots, m-1\} \to \{0, \ldots, m-1\}$ be the identity map. If $\alpha \in \Phi_m$, then $\sum_{i=1}^3 u_i(\alpha_i) = m - 1$, and if $\alpha \in \Psi_m \setminus \Phi_m$, then $\sum_{i=1}^3 u_i(\alpha_i)$ equals $m - 1$ plus a positive multiple of $m$. This means Theorem 5.21 applies, and we thus obtain $\tilde{Q}(\Phi_m) \leq \tilde{Q}(\Psi_m)$. This proves the claim.

We show $\tilde{Q}(\Psi_m) \leq \tilde{Q}(\Phi_m)$ when $m$ is a power of the prime $p$. Let $\mathbb{F} = \mathbb{F}_p$. Let $f_m \in \mathbb{F}^m \otimes \mathbb{F}^m \otimes \mathbb{F}^m$ have support $\Psi_m$ with all nonzero coefficients equal to 1. Obviously, $\tilde{Q}(\Psi_m) \leq \tilde{Q}(f_m)$. To compute $\tilde{Q}(f_m)$ we show that there is a basis in which the support of $f_m$ equals the tight set $\Phi_m$. Then $\tilde{Q}(f_m) = \tilde{Q}(\Phi_m)$ (Corollary 5.4). This implies the claim.

We prepare to give the basis (which is the same basis as used in [BCC+17]). First observe that the rule $x \mapsto \binom{x}{a}$ gives a well-defined map $\mathbb{Z}/m\mathbb{Z} \to \mathbb{Z}/p\mathbb{Z}$, since for $a \in \{0, 1, \ldots, m-1\}$, if $x = y \bmod m$, then $\binom{x}{a} = \binom{y}{a} \bmod p$ by Lucas' theorem. Let $(e_x)_x$ be the standard basis of $\mathbb{F}^m$. The elements $\bigl(\sum_{x \in \mathbb{Z}/m\mathbb{Z}} \binom{x}{a} e_x\bigr)_{a \in \mathbb{Z}/m\mathbb{Z}}$ form a basis of $\mathbb{F}^m$, since the matrix $\bigl(\binom{x}{a}\bigr)_{a,x}$ is upper triangular with ones on the diagonal. We will now rewrite $f_m$ in the basis

$$\Bigl( \bigl(\textstyle\sum_x \binom{x}{a} e_x\bigr)_a,\ \bigl(\sum_y \binom{y}{b} e_y\bigr)_b,\ \bigl(\sum_z \binom{z}{c} e_z\bigr)_c \Bigr).$$

Observe that $\binom{x}{m-1}$ equals 1 if and only if $x$ equals $m - 1$, and hence

$$f_m = \sum_{\substack{x,y,z \in \mathbb{Z}/m\mathbb{Z}:\\ x+y+z = m-1}} e_x \otimes e_y \otimes e_z = \sum_{x,y,z \in \mathbb{Z}/m\mathbb{Z}} \binom{x+y+z}{m-1}\, e_x \otimes e_y \otimes e_z.$$

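The identity $\binom{x+y+z}{w} = \sum \binom{x}{a}\binom{y}{b}\binom{z}{c}$, with sum over $a, b, c \in \{0, 1, \ldots, m-1\}$ such that $a + b + c = w$, is true, and thus

$$\sum_{x,y,z \in \mathbb{Z}/m\mathbb{Z}} \binom{x+y+z}{m-1}\, e_x \otimes e_y \otimes e_z = \sum_{x,y,z \in \mathbb{Z}/m\mathbb{Z}}\ \sum_{\substack{a,b,c \in \{0,1,\ldots,m-1\}:\\ a+b+c = m-1}} \binom{x}{a}\binom{y}{b}\binom{z}{c}\, e_x \otimes e_y \otimes e_z. \tag{5.23}$$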

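We may simply rewrite (5.23) as

$$\sum_{\substack{a,b,c \in \{0,1,\ldots,m-1\}:\\ a+b+c = m-1}}\ \sum_{x \in \mathbb{Z}/m\mathbb{Z}} \binom{x}{a} e_x \otimes \sum_{y \in \mathbb{Z}/m\mathbb{Z}} \binom{y}{b} e_y \otimes \sum_{z \in \mathbb{Z}/m\mathbb{Z}} \binom{z}{c} e_z.$$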

Therefore, with respect to the basis $\bigl( (\sum_x \binom{x}{a} e_x)_a,\ (\sum_y \binom{y}{b} e_y)_b,\ (\sum_z \binom{z}{c} e_z)_c \bigr)$, the support of $f_m$ equals the tight set $\Phi_m$. (And even stronger, $f_m$ is isomorphic to the tensor $\mathbb{F}[x]/(x^m)$ of Section 5.4.1.)
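The three arithmetic facts used in this change of basis can be tested directly for small prime powers. The Python sketch below is our own illustration (the function names are hypothetical); it checks that $\binom{x}{a} \bmod p$ only depends on $x \bmod m$ for $a < m$, that $\binom{s}{m-1} \bmod p$ is the indicator of $s \equiv m-1 \pmod m$ on the relevant range of integer sums $s = x + y + z$, and that the Vandermonde-type identity holds for $w = m - 1$.

```python
from math import comb


def binomial_is_well_defined(m: int, p: int) -> bool:
    """binom(x, a) = binom(x + m, a) mod p for all 0 <= a < m, 0 <= x < 2m."""
    return all(
        comb(x, a) % p == comb(x + m, a) % p
        for a in range(m)
        for x in range(2 * m)
    )


def top_binomial_is_indicator(m: int, p: int) -> bool:
    """binom(s, m-1) mod p is 1 if s = m-1 mod m and 0 otherwise,
    for every integer sum s = x + y + z with 0 <= x, y, z < m."""
    return all(
        comb(s, m - 1) % p == (1 if s % m == m - 1 else 0)
        for s in range(3 * (m - 1) + 1)
    )


def vandermonde_holds(m: int) -> bool:
    """binom(x+y+z, w) = sum over a+b+c=w of binom(x,a) binom(y,b) binom(z,c), w = m-1."""
    w = m - 1
    for x in range(m):
        for y in range(m):
            for z in range(m):
                total = sum(
                    comb(x, a) * comb(y, b) * comb(z, w - a - b)
                    for a in range(w + 1)
                    for b in range(w + 1 - a)
                )
                if total != comb(x + y + z, w):
                    return False
    return True


if __name__ == "__main__":
    for m, p in [(3, 3), (4, 2), (8, 2), (9, 3)]:
        print(m, p, binomial_is_well_defined(m, p),
              top_binomial_is_indicator(m, p), vandermonde_holds(m))
```

The indicator property fails when $m$ is not a power of $p$, for example for $m = 6$ and $p = 2$, which is where the prime power assumption enters.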

Remark 5.26. Why did we reprove the cap set result Theorem 5.25? Our motivation, being interested in the asymptotic spectrum of tensors, was to see if the techniques in the cap set papers are stronger than the Strassen support functionals, i.e. whether they give any new spectral points. Above we have seen that the cap set result itself can be proven with the support functionals. In fact, we show in Section 4.6 that for oblique tensors the asymptotic slice-rank, which was introduced in [Tao16] to give a concise proof of [EG17], equals the minimum value over the support functionals. In Section 6.11 we show that for all complex tensors, asymptotic slice-rank equals the minimum value of the quantum functionals.

5.5 Graph tensors

In this section we briefly discuss the application that motivated us to prove Theorem 5.7 in [CVZ16], namely upper bounding the asymptotic rank of so-called graph tensors. Graph tensors are defined as follows.

Let $G = (V, E)$ be a graph (or hypergraph) with vertex set $V$ and edge set $E$. Let $n \in \mathbb{N}$. Let $(b_i)_{i \in [n]}$ be the standard basis of $\mathbb{F}^n$. We define the graph tensor $T_n(G)$ as

$$T_n(G) = \sum_{i \in [n]^E}\ \bigotimes_{v \in V} \Bigl( \bigotimes_{e \in E:\, v \in e} b_{i_e} \Bigr),$$

seen as a $|V|$-tensor. Given a vertex $v \in V$, let $d(v)$ denote the degree of $v$, that is, $d(v)$ equals the number of edges $e \in E$ that contain $v$. Then $T_n(G)$ is naturally in $\bigotimes_{v \in V} \mathbb{F}^{n^{d(v)}}$. We write $T(G)$ for $T_2(G)$. For example, for the complete graph on four vertices $K_4$, the graph tensor is

[Figure: $T(K_4)$ written as a tensor product of six graph tensors $T(\cdot)$; the graph pictures are not recoverable.]

$$T(K_4) = \sum_{i \in \{0,1\}^6} (b_{i_1} \otimes b_{i_2} \otimes b_{i_5}) \otimes (b_{i_2} \otimes b_{i_3} \otimes b_{i_6}) \otimes (b_{i_3} \otimes b_{i_4} \otimes b_{i_5}) \otimes (b_{i_1} \otimes b_{i_4} \otimes b_{i_6}),$$

living in $(\mathbb{C}^8)^{\otimes 4}$. Let $K_k$ be the complete graph on $k$ vertices. The $2 \times 2$ matrix multiplication tensor $\langle 2, 2, 2 \rangle$ equals the tensor $T(K_3)$. Define the exponent $\omega(T(G)) = \log_2 \tilde{R}(T(G))$. We study the exponent per edge $\tau(T(G)) = \omega(T(G))/|E(G)|$.

Our result is an upper bound on $\tau(T(K_4))$ in terms of the combinatorial asymptotic subrank $\tilde{Q}(\Phi_{(2,2)})$, which we studied in Theorem 5.16.

Theorem 5.27. For any $q \geq 1$,

$$\tau(T(K_4)) \leq \log_q \Bigl( \frac{q + 2}{\tilde{Q}(\Phi_{(2,2)})} \Bigr).$$

Proof. We apply a generalisation of the laser method. See [CVZ16].

Corollary 5.28. Let $k \geq 4$. Then $\tau(T(K_k)) \leq 0.772943$.

Proof. In the bound of Theorem 5.27 we plug in the value $\tilde{Q}(\Phi_{(2,2)}) = 2$ from Theorem 5.16. Then we optimise over $q$ to obtain the value 0.772943. By a "covering argument" we can show that $\tau(T(K_k))$ is non-increasing when $k$ increases.
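The optimisation over $q$ in this proof is a one-line computation. The Python sketch below is an illustration only: it assumes that $q$ ranges over integers and skips $q = 1$, where $\log_q$ is undefined; the function names and the search limit `q_max` are our own choices.

```python
from math import log


def tau_bound(q: int, asymptotic_subrank: float = 2.0) -> float:
    """Right-hand side of Theorem 5.27: log_q((q + 2) / Q)."""
    return log((q + 2) / asymptotic_subrank) / log(q)


def best_q(q_max: int = 1000) -> tuple[int, float]:
    """Minimise the bound over integers 2 <= q <= q_max."""
    q = min(range(2, q_max + 1), key=tau_bound)
    return q, tau_bound(q)


if __name__ == "__main__":
    q, value = best_q()
    print(q, round(value, 6))
```

Under these assumptions the minimum is attained at $q = 7$, where the bound reads $\log_7 (9/2)$.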

For $k \geq 4$, Corollary 5.28 improves the upper bound $\tau(T(K_k)) \leq 0.790955$ that can be derived from the well-known upper bound of Le Gall [LG14] on the exponent of matrix multiplication $\omega = \omega(T(K_3))$.

A standard "flattening argument" (i.e. using the gauge points from the asymptotic spectrum) yields the lower bound $\tau(T(K_k)) \geq \frac{1}{2}\, k/(k-1)$ if $k$ is even and $\tau(T(K_k)) \geq \frac{1}{2}\, (k+1)/k$ if $k$ is odd. As a consequence, if the exponent of matrix multiplication $\omega$ equals 2, then $\tau(T(K_4)) = \tau(T(K_3)) = \frac{2}{3}$. We raise the following question: is there a $k \geq 5$ such that $\tau(T(K_k)) < \frac{2}{3}$?

Tensor surgery; cycle graphs

For graph tensors given by sparse graphs, good upper bounds on the asymptotic rank can be obtained with an entirely different method called tensor surgery, which we introduced in [CZ18]. As an illustration, let me mention the results we obtained for cycle graphs with tensor surgery. Recall $\omega = \log_2 \tilde{R}(\langle 2, 2, 2 \rangle) = \log_2 \tilde{R}(T(C_3))$. Let $\omega_k = \log_2 \tilde{R}(T(C_k))$. First observe that $\omega_k = k$ for even $k$. For odd $k$, trivially $k - 1 \leq \omega_k \leq k$. We prove the following.

Theorem 5.29. For $k, \ell$ odd: $\omega_{k+\ell-1} \leq \omega_k + \omega_\ell$.

Corollary 5.30. Let $k \geq 5$ be odd. Then $\omega_k \leq \omega_{k-2} + \omega_3$, and thus $\omega_k \leq \frac{k-1}{2}\, \omega$.

Corollary 5.31. If $\omega = 2$, then $\omega_k = k - 1$ for all odd $k$.

See [CZ18] for the proofs.

5.6 Conclusion

Tight tensors are a subfamily of the oblique tensors. For tight 3-tensors, the minimum over the support functionals equals the asymptotic subrank. This is proven via the Coppersmith–Winograd method. The construction is in fact of a very combinatorial nature. In this chapter we studied the combinatorial notion of subrank. We proved that combinatorial subrank is monotone under combinatorial degeneration. We studied the cap set problem via the support functionals. We extended the Coppersmith–Winograd method to higher-order tensors and applied this method to study graph tensors.

Chapter 6

Universal points in the asymptotic spectrum of tensors; entanglement polytopes, moment polytopes

This chapter is based on joint work with Matthias Christandl and Peter Vrana [CVZ18].

6.1 Introduction

In Chapter 4, following Strassen, we introduced the asymptotic spectrum of tensors $X(\mathcal{T}) = X(\mathcal{T}, \leq)$ for $\mathcal{T}$ the semiring of $k$-tensors over $\mathbb{F}$ for some fixed integer $k$ and field $\mathbb{F}$, with addition given by direct sum $\oplus$, multiplication given by tensor product $\otimes$, and preorder $\leq$ given by restriction (or degeneration). The asymptotic spectrum characterises the asymptotic rank $\tilde{R}$ and the asymptotic subrank $\tilde{Q}$. We have seen that the asymptotic rank plays an important role in algebraic complexity theory: the asymptotic rank of the matrix multiplication tensor $\langle 2, 2, 2 \rangle = \sum_{i,j,k \in [2]} e_{ij} \otimes e_{jk} \otimes e_{ki} \in \mathbb{F}^4 \otimes \mathbb{F}^4 \otimes \mathbb{F}^4$ characterises the exponent of the arithmetic complexity of multiplying two $n \times n$ matrices over $\mathbb{F}$, that is, $\tilde{R}(\langle 2, 2, 2 \rangle) = 2^\omega$. We have also seen in Chapter 5 how one may use the asymptotic subrank to upper bound the size of combinatorial objects like, for example, cap sets in $\mathbb{F}_3^n$.

New results in this chapter

So far, the only elements we have seen in $X(\mathcal{T})$ (i.e. universal spectral points, cf. Section 2.13) are the gauge points (Section 4.3). Besides that, we have seen in Section 4.4 that the Strassen support functionals $\zeta^\theta$ are in $X(\{\text{oblique}\})$. In this chapter we introduce for the first time an explicit infinite family of universal spectral points (over the complex numbers), the quantum functionals. Our new insight is to use the moment polytope. Given a tensor $t \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$, the moment polytope $P(t)$ is a convex polytope that carries representation-theoretic information about $t$. The quantum functionals are defined as maximisations over moment polytopes.

Let me immediately put a disclaimer. The quantum functionals do not give a new lower bound on the asymptotic rank of matrix multiplication $\langle 2, 2, 2 \rangle$; namely, the quantum functionals give the same lower bound as the gauge points. Also, the quantum functionals being defined for tensors over complex numbers only, we do not expect to get new upper bounds on the size of combinatorial objects that are "like cap sets".

So what have we gained? Arguably, we have found the "right" viewpoint on how to construct universal spectral points for tensors. (In fact, after writing our paper [CVZ18] we realised that Strassen had begun a study of moment polytopes in the appendix of the German survey [Str05]. Strassen did not construct new universal spectral points, however, not in that publication at least.) If there are more universal spectral points, then our viewpoint may lead the way to finding them. Moreover, whereas no efficient algorithm is known for evaluating the support functionals, the moment polytope viewpoint may open the way to having efficient algorithms for evaluating the quantum functionals.

In Sections 6.2–6.7 we work towards the construction of the quantum functionals and we give a proof that they are universal spectral points. In Sections 6.8–6.10 we compare the quantum functionals and the support functionals, and in Section 6.11 we relate asymptotic slice rank to the quantum functionals.

In this chapter we will focus on 3-tensors, but the theory naturally generalises to $k$-tensors.

6.2 Schur–Weyl duality

For background on representation theory we refer to [Kra84], [Ful97] and [GW09]. Let $S_n$ be the symmetric group on $n$ symbols. Let $S_n$ act on the tensor space $(\mathbb{C}^d)^{\otimes n}$ by permuting the tensor legs,

$$\pi \cdot v_1 \otimes \cdots \otimes v_n = v_{\pi^{-1}(1)} \otimes \cdots \otimes v_{\pi^{-1}(n)}, \qquad \pi \in S_n.$$

Let $\mathrm{GL}_d$ be the general linear group of $\mathbb{C}^d$. Let $\mathrm{GL}_d$ act on $(\mathbb{C}^d)^{\otimes n}$ via the diagonal embedding $\mathrm{GL}_d \to \mathrm{GL}_d^{\times n} : g \mapsto (g, \ldots, g)$,

$$g \cdot v_1 \otimes \cdots \otimes v_n = (g v_1) \otimes \cdots \otimes (g v_n), \qquad g \in \mathrm{GL}_d.$$

The actions of $S_n$ and $\mathrm{GL}_d$ commute, so we have a well-defined action of the product group $S_n \times \mathrm{GL}_d$ on $(\mathbb{C}^d)^{\otimes n}$. Schur–Weyl duality describes the decomposition of the space $(\mathbb{C}^d)^{\otimes n}$ into a direct sum of irreducible $S_n \times \mathrm{GL}_d$ representations. This decomposition is

$$(\mathbb{C}^d)^{\otimes n} \cong \bigoplus_{\lambda \vdash_d n} [\lambda] \otimes S_\lambda(\mathbb{C}^d) \tag{6.1}$$

with $[\lambda]$ an irreducible $S_n$-representation of type $\lambda$ and $S_\lambda(\mathbb{C}^d)$ an irreducible $\mathrm{GL}_d$-representation of type $\lambda$ when $\ell(\lambda) \leq d$ and 0 when $\ell(\lambda) > d$. We use the notation $\lambda \vdash_d n$ for the partitions of $n$ with at most $d$ parts. Let

$$P_\lambda : (\mathbb{C}^d)^{\otimes n} \to (\mathbb{C}^d)^{\otimes n}$$

be the equivariant projector onto the isotypical component of type $\lambda$, i.e. onto the subspace of $(\mathbb{C}^d)^{\otimes n}$ isomorphic to $[\lambda] \otimes S_\lambda(\mathbb{C}^d)$. The projector $P_\lambda$ is given by the action of the group algebra element

$$P_\lambda = \Bigl( \frac{\dim[\lambda]}{n!} \Bigr)^2 \sum_{T \in \mathrm{Tab}(\lambda)} c_T \in \mathbb{C}[S_n],$$

where $\mathrm{Tab}(\lambda)$ is the set of Young tableaux of shape $\lambda$ filled with $[n]$, and with $c_T$ the Young symmetrizer

$$c_T = \sum_{\sigma \in C(T)} \mathrm{sgn}(\sigma)\, \sigma \sum_{\pi \in R(T)} \pi,$$

where $C(T), R(T) \subseteq S_n$ are the subgroups of permutations inside columns and permutations inside rows, respectively. The element $P_\lambda$ is a minimal central idempotent in $\mathbb{C}[S_n]$, and $\sum_{\lambda \vdash n} P_\lambda = e$.

Back to the decomposition of $(\mathbb{C}^d)^{\otimes n}$. We need a handle on the size of the components in the direct sum decomposition (6.1). For our application it is good to think of $d$ as a constant and $n$ as a large number. The number of summands in the direct sum decomposition (6.1) is upper bounded by a polynomial in $n$,

$$|\{ \lambda \vdash_d n \}| \leq (n+1)^d,$$

i.e. there are only few summands compared to the total dimension $d^n$. There are the following well-known bounds on the dimensions of the irreducible representations $[\lambda]$ and $S_\lambda(\mathbb{C}^d)$ that make up the summands:

$$\frac{n!}{\prod_{\ell=1}^d (\lambda_\ell + d - \ell)!} \leq \dim[\lambda] \leq \frac{n!}{\prod_{\ell=1}^d \lambda_\ell!}, \tag{6.2}$$

$$\dim S_\lambda(\mathbb{C}^d) \leq (n+1)^{d(d-1)/2}. \tag{6.3}$$
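The bounds above are easy to test on small cases. The following Python sketch is an illustration under our own naming: it enumerates the partitions $\lambda \vdash_d n$, computes $\dim[\lambda]$ with the hook length formula and $\dim S_\lambda(\mathbb{C}^d)$ with the Weyl dimension formula (both standard facts that are not stated in the text), and compares them with (6.2), (6.3) and with the dimension count $\sum_\lambda \dim[\lambda] \dim S_\lambda(\mathbb{C}^d) = d^n$ implied by (6.1).

```python
from fractions import Fraction
from math import factorial, prod
from typing import Iterator, Optional


def partitions(n: int, d: int, largest: Optional[int] = None) -> Iterator[tuple[int, ...]]:
    """Yield the partitions of n with at most d parts as non-increasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    if d == 0:
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, d - 1, first):
            yield (first,) + rest


def dim_specht(shape: tuple[int, ...]) -> int:
    """dim[lambda] via the hook length formula."""
    hooks = 1
    for i, row in enumerate(shape):
        for j in range(row):
            arm = row - j - 1
            leg = sum(1 for lower in shape[i + 1:] if lower > j)
            hooks *= arm + leg + 1
    return factorial(sum(shape)) // hooks


def dim_weyl(shape: tuple[int, ...], d: int) -> int:
    """dim S_lambda(C^d) via the Weyl dimension formula."""
    padded = shape + (0,) * (d - len(shape))
    value = Fraction(1)
    for i in range(d):
        for j in range(i + 1, d):
            value *= Fraction(padded[i] - padded[j] + j - i, j - i)
    return int(value)


def bounds_6_2(shape: tuple[int, ...], d: int) -> tuple[Fraction, int]:
    """Lower and upper bound on dim[lambda] from (6.2)."""
    n = sum(shape)
    padded = shape + (0,) * (d - len(shape))
    # ell runs over 1..d in the text; here ell = i + 1.
    lower_den = prod(factorial(part + d - (i + 1)) for i, part in enumerate(padded))
    upper_den = prod(factorial(part) for part in padded)
    return Fraction(factorial(n), lower_den), factorial(n) // upper_den


if __name__ == "__main__":
    n, d = 8, 3
    for shape in partitions(n, d):
        lower, upper = bounds_6_2(shape, d)
        print(shape, float(lower), dim_specht(shape), upper, dim_weyl(shape, d))
```

The lower bound in (6.2) is in general not an integer, which is why the sketch keeps it as a `Fraction`.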

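Let $p \in \mathbb{R}^n$ be a probability vector, i.e. $\sum_{i=1}^n p_i = 1$ and $p_i \geq 0$ for $i \in [n]$. Let $H(p)$ be the Shannon entropy of the probability vector $p$,

$$H(p) = \sum_{i=1}^n p_i \log_2 \frac{1}{p_i}.$$

For $\alpha \in [0, 1]$ let $h(\alpha) = H((\alpha, 1 - \alpha))$ be the binary entropy. For a partition $\lambda = (\lambda_1, \ldots, \lambda_\ell) \vdash n$ let $\bar{\lambda} = \lambda/n = (\lambda_1/n, \ldots, \lambda_\ell/n)$ be the probability vector obtained by normalising $\lambda$.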

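Let $\lambda \vdash n$. For $N \in \mathbb{N}$ let $N\lambda = (N\lambda_1, N\lambda_2, \ldots, N\lambda_\ell)$ be the stretched partition. We see that, asymptotically in the stretching factor $N$, the dimension of $[N\lambda]$ behaves like a multinomial coefficient, and

$$2^{N n H(\bar{\lambda}) - o(N)} \leq \dim[N\lambda] \leq 2^{N n H(\bar{\lambda})}. \tag{6.4}$$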
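To see (6.4) at work, one can compare $\frac{1}{Nn} \log_2 \dim[N\lambda]$ with $H(\bar{\lambda})$ for growing $N$. The Python sketch below is an illustration only; it computes $\dim[\lambda]$ with the standard product formula $n! \prod_{i<j} (l_i - l_j) / \prod_i l_i!$, where $l_i = \lambda_i + d - i$, which is an input not stated in the text, and the helper names are our own.

```python
from math import factorial, log2, prod


def entropy(p: tuple[float, ...]) -> float:
    """Shannon entropy in bits, with the convention 0 * log(1/0) = 0."""
    return sum(x * log2(1 / x) for x in p if x > 0)


def dim_specht(shape: tuple[int, ...]) -> int:
    """dim[lambda] via n! * prod_{i<j} (l_i - l_j) / prod_i l_i!."""
    d = len(shape)
    n = sum(shape)
    l = [part + d - 1 - i for i, part in enumerate(shape)]
    numerator = factorial(n) * prod(
        l[i] - l[j] for i in range(d) for j in range(i + 1, d)
    )
    denominator = prod(factorial(x) for x in l)
    return numerator // denominator


def rate(shape: tuple[int, ...], stretch: int) -> float:
    """(1 / (N n)) * log2 dim[N lambda] for N = stretch."""
    stretched = tuple(stretch * part for part in shape)
    return log2(dim_specht(stretched)) / (stretch * sum(shape))


if __name__ == "__main__":
    shape = (2, 1)
    normalised = tuple(part / sum(shape) for part in shape)
    print("H =", entropy(normalised))
    for stretch in (1, 20, 200):
        print(stretch, rate(shape, stretch))
```

The printed rates stay below the entropy and approach it slowly, consistent with the $o(N)$ correction in the lower bound.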

6.3 Kronecker and Littlewood–Richardson coefficients $g_{\lambda\mu\nu}$, $c^\lambda_{\mu\nu}$

Let $\mu, \nu \vdash n$. Let $S_n \to S_n \times S_n : \pi \mapsto (\pi, \pi)$ be the diagonal embedding. Consider the decomposition of the tensor product $[\mu] \otimes [\nu]$ restricted along the diagonal embedding,

$$[\mu] \otimes [\nu] \downarrow^{S_n \times S_n}_{S_n} \cong \bigoplus_{\lambda \vdash n} \mathrm{Hom}_{S_n}([\lambda], [\mu] \otimes [\nu]) \otimes [\lambda].$$

Define the Kronecker coefficient

$$g_{\lambda\mu\nu} = \dim \mathrm{Hom}_{S_n}([\lambda], [\mu] \otimes [\nu]),$$

i.e. $g_{\lambda\mu\nu}$ is the multiplicity of $[\lambda]$ in $[\mu] \otimes [\nu]$.

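Let $\lambda \vdash_{a+b}$. Let $\mathrm{GL}_a \times \mathrm{GL}_b \to \mathrm{GL}_{a+b} : (A, B) \mapsto A \oplus B$ be the block-diagonal embedding. Consider the decomposition of the representation $S_\lambda(\mathbb{C}^{a+b})$ restricted along the block-diagonal embedding,

$$S_\lambda(\mathbb{C}^{a+b}) \downarrow^{\mathrm{GL}_{a+b}}_{\mathrm{GL}_a \times \mathrm{GL}_b} \cong \bigoplus_{\mu \vdash_a,\, \nu \vdash_b} H^\lambda_{\mu\nu} \otimes S_\mu(\mathbb{C}^a) \otimes S_\nu(\mathbb{C}^b)$$

with

$$H^\lambda_{\mu\nu} = \mathrm{Hom}_{\mathrm{GL}_a \times \mathrm{GL}_b}(S_\mu(\mathbb{C}^a) \otimes S_\nu(\mathbb{C}^b), S_\lambda(\mathbb{C}^{a+b})).$$

Define the Littlewood–Richardson coefficient $c^\lambda_{\mu\nu} = \dim H^\lambda_{\mu\nu}$.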

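For partitions $\lambda, \lambda'$ define $\lambda + \lambda'$ elementwise. The Kronecker and the Littlewood–Richardson coefficients have the following semigroup property (see e.g. [CHM07]).

Lemma 6.1. Let $\lambda, \mu, \nu, \alpha, \beta, \gamma$ be partitions.

(i) If $g_{\lambda\mu\nu} > 0$ and $g_{\alpha\beta\gamma} > 0$, then $g_{\lambda+\alpha,\, \mu+\beta,\, \nu+\gamma} > 0$.

(ii) If $c^\lambda_{\mu\nu} > 0$ and $c^\alpha_{\beta\gamma} > 0$, then $c^{\lambda+\alpha}_{\mu+\beta,\, \nu+\gamma} > 0$.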

6.4 Entropy inequalities

The semigroup properties imply the following lemma. Of this lemma, the first statement can be found in a paper by Christandl and Mitchison [CM06], while we do not know of any source that explicitly states the second statement. For the convenience of the reader we give the proofs of both statements.

Lemma 6.2. Let $\lambda, \mu, \nu$ be partitions.

(i) If $g_{\lambda\mu\nu} > 0$, then $H(\bar{\lambda}) \leq H(\bar{\mu}) + H(\bar{\nu})$.

(ii) If $c^\lambda_{\mu\nu} > 0$, then $H(\bar{\lambda}) \leq \frac{|\mu|}{|\mu|+|\nu|} H(\bar{\mu}) + \frac{|\nu|}{|\mu|+|\nu|} H(\bar{\nu}) + h\bigl( \frac{|\mu|}{|\mu|+|\nu|} \bigr)$.

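Proof. (i) Let $g_{\lambda\mu\nu} > 0$. Suppose $\lambda, \mu, \nu \vdash n$. Let $N \in \mathbb{N}$. Then Lemma 6.1 implies $g_{N\lambda, N\mu, N\nu} > 0$. This means $\mathrm{Hom}_{S_{nN}}([N\lambda], [N\mu] \otimes [N\nu]) \neq 0$, which implies $\dim[N\lambda] \leq \dim[N\mu] \dim[N\nu]$. From (6.4) we have the dimension bounds

$$2^{N n H(\bar{\lambda}) - o(N)} \leq \dim[N\lambda],$$
$$\dim[N\mu] \leq 2^{N n H(\bar{\mu})},$$
$$\dim[N\nu] \leq 2^{N n H(\bar{\nu})}.$$

Thus $N n H(\bar{\lambda}) - o(N) \leq N n H(\bar{\mu}) + N n H(\bar{\nu})$. Divide by $Nn$ and let $N$ go to infinity to get $H(\bar{\lambda}) \leq H(\bar{\mu}) + H(\bar{\nu})$.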

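(ii) We restrict the decomposition

$$(\mathbb{C}^{a+b})^{\otimes n} \cong \bigoplus_{\lambda \vdash_{a+b} n} [\lambda] \otimes S_\lambda(\mathbb{C}^{a+b})$$

along the block-diagonal embedding to get

$$\begin{aligned}
(\mathbb{C}^{a+b})^{\otimes n} \downarrow^{\mathrm{GL}_{a+b}}_{\mathrm{GL}_a \times \mathrm{GL}_b}
&\cong \bigoplus_{\lambda \vdash_{a+b} n} [\lambda] \otimes S_\lambda(\mathbb{C}^{a+b}) \downarrow^{\mathrm{GL}_{a+b}}_{\mathrm{GL}_a \times \mathrm{GL}_b} \\
&\cong \bigoplus_{\lambda \vdash_{a+b} n} [\lambda] \otimes \bigoplus_{\mu \vdash_a,\, \nu \vdash_b} \mathbb{C}^{c^\lambda_{\mu\nu}} \otimes S_\mu(\mathbb{C}^a) \otimes S_\nu(\mathbb{C}^b) \\
&\cong \bigoplus_{\mu \vdash_a,\, \nu \vdash_b} \Bigl( \bigoplus_{\lambda \vdash_{a+b} n} [\lambda] \otimes \mathbb{C}^{c^\lambda_{\mu\nu}} \Bigr) \otimes S_\mu(\mathbb{C}^a) \otimes S_\nu(\mathbb{C}^b).
\end{aligned}$$

On the other hand,

$$\begin{aligned}
(\mathbb{C}^{a+b})^{\otimes n} \downarrow
&\cong (\mathbb{C}^a \oplus \mathbb{C}^b)^{\otimes n} \downarrow \\
&\cong (\mathbb{C}^a)^{\otimes n} \oplus ((\mathbb{C}^a)^{\otimes n-1} \otimes \mathbb{C}^b) \oplus \cdots \oplus (\mathbb{C}^b)^{\otimes n} \downarrow \\
&\cong \bigoplus_{k=0}^n \mathbb{C}^{\binom{n}{k}} \otimes \bigoplus_{\mu \vdash_a k} ([\mu] \otimes S_\mu(\mathbb{C}^a)) \otimes \bigoplus_{\nu \vdash_b n-k} ([\nu] \otimes S_\nu(\mathbb{C}^b)) \\
&\cong \bigoplus_{k=0}^n\ \bigoplus_{\mu \vdash_a k,\, \nu \vdash_b n-k} \bigl( \mathbb{C}^{\binom{n}{k}} \otimes [\mu] \otimes [\nu] \bigr) \otimes S_\mu(\mathbb{C}^a) \otimes S_\nu(\mathbb{C}^b).
\end{aligned}$$

Suppose $c^\lambda_{\mu\nu} > 0$. Comparing the above expressions gives the inequality $\dim[\lambda] \leq \binom{n}{|\mu|} \dim[\mu] \dim[\nu]$. By the semigroup property Lemma 6.1, we have $c^{N\lambda}_{N\mu, N\nu} > 0$ for all $N \in \mathbb{N}$. Thus $\dim[N\lambda] \leq \binom{Nn}{N|\mu|} \dim[N\mu] \dim[N\nu]$ for all $N \in \mathbb{N}$. Then from (6.4) follows

$$2^{N n H(\bar{\lambda}) - o(N)} \leq 2^{N n\, h(|\mu|/n)}\, 2^{N |\mu| H(\bar{\mu})}\, 2^{N |\nu| H(\bar{\nu})}.$$

We conclude $H(\bar{\lambda}) \leq h\bigl( \frac{|\mu|}{n} \bigr) + \frac{|\mu|}{n} H(\bar{\mu}) + \frac{|\nu|}{n} H(\bar{\nu})$.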

Let $x = (x^{(1)}, x^{(2)}, x^{(3)})$ be a triple of probability vectors $x^{(i)} \in \mathbb{R}^{n_i}$. Let $\theta \in \Theta$ be a weighting. Let $H_\theta(x)$ be the $\theta$-weighted average of the Shannon entropies of the probability vectors $x^{(1)}$, $x^{(2)}$ and $x^{(3)}$,

$$H_\theta(x) = \theta(1) H(x^{(1)}) + \theta(2) H(x^{(2)}) + \theta(3) H(x^{(3)}).$$

(Note that this notation is slightly different from the notation used in Chapter 4.) We will use the notation $\lambda \vdash_3 n$ to say that $\lambda$ is a triple of partitions of $n$, i.e. $\lambda$ equals $(\lambda^{(1)}, \lambda^{(2)}, \lambda^{(3)})$ where each $\lambda^{(i)}$ is a partition of $n$. We write $\bar{\lambda}$ for the normalised triple $(\bar{\lambda}^{(1)}, \bar{\lambda}^{(2)}, \bar{\lambda}^{(3)})$.

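Lemma 6.3. Let $\lambda, \mu, \nu$ be three triples of partitions.

(i) If $g_{\lambda^{(i)} \mu^{(i)} \nu^{(i)}} > 0$ for all $i$, then $2^{H_\theta(\bar{\lambda})} \leq 2^{H_\theta(\bar{\mu})}\, 2^{H_\theta(\bar{\nu})}$.

(ii) If $\mu \vdash_3 m$, $\nu \vdash_3 n - m$ and $c^{\lambda^{(i)}}_{\mu^{(i)} \nu^{(i)}} > 0$ for all $i$, then $2^{H_\theta(\bar{\lambda})} \leq 2^{H_\theta(\bar{\mu})} + 2^{H_\theta(\bar{\nu})}$.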

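Proof. (i) Suppose $g_{\lambda^{(i)} \mu^{(i)} \nu^{(i)}} > 0$ for all $i$. Then $H(\bar{\lambda}^{(i)}) \leq H(\bar{\mu}^{(i)}) + H(\bar{\nu}^{(i)})$ for all $i$ by Lemma 6.2. Thus $\sum_i \theta(i) H(\bar{\lambda}^{(i)}) \leq \sum_i \theta(i) H(\bar{\mu}^{(i)}) + \sum_i \theta(i) H(\bar{\nu}^{(i)})$. Then $H_\theta(\bar{\lambda}) \leq H_\theta(\bar{\mu}) + H_\theta(\bar{\nu})$. We conclude $2^{H_\theta(\bar{\lambda})} \leq 2^{H_\theta(\bar{\mu})}\, 2^{H_\theta(\bar{\nu})}$.

(ii) Suppose $c^{\lambda^{(i)}}_{\mu^{(i)} \nu^{(i)}} > 0$ for all $i$. Then $H(\bar{\lambda}^{(i)}) \leq \frac{m}{n} H(\bar{\mu}^{(i)}) + \frac{n-m}{n} H(\bar{\nu}^{(i)}) + h\bigl(\frac{m}{n}\bigr)$ by Lemma 6.2. We take the $\theta$-weighted average to get $H_\theta(\bar{\lambda}) \leq \frac{m}{n} H_\theta(\bar{\mu}) + \frac{n-m}{n} H_\theta(\bar{\nu}) + h\bigl(\frac{m}{n}\bigr)$. We conclude $2^{H_\theta(\bar{\lambda})} \leq 2^{H_\theta(\bar{\mu})} + 2^{H_\theta(\bar{\nu})}$ by Lemma 4.9(iv).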
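The last step rests on the elementary fact that $\alpha a + (1-\alpha) b + h(\alpha) \leq \log_2(2^a + 2^b)$ for all $\alpha \in [0,1]$, with equality at $\alpha = 2^a/(2^a + 2^b)$; we assume this is the content of Lemma 4.9(iv), which is not reproduced in this section. The Python sketch below checks the inequality numerically for sample values of $a$ and $b$; the function names are our own.

```python
from math import log2


def h(alpha: float) -> float:
    """Binary entropy in bits, with h(0) = h(1) = 0."""
    if alpha in (0.0, 1.0):
        return 0.0
    return alpha * log2(1 / alpha) + (1 - alpha) * log2(1 / (1 - alpha))


def exponent(alpha: float, a: float, b: float) -> float:
    """alpha * a + (1 - alpha) * b + h(alpha)."""
    return alpha * a + (1 - alpha) * b + h(alpha)


def optimal_alpha(a: float, b: float) -> float:
    """The weight at which 2**exponent equals 2**a + 2**b."""
    return 2 ** a / (2 ** a + 2 ** b)


if __name__ == "__main__":
    a, b = 1.3, 0.4
    print(2 ** exponent(optimal_alpha(a, b), a, b), 2 ** a + 2 ** b)
    print(max(2 ** exponent(k / 100, a, b) for k in range(101)))
```

In the proof above one takes $a = H_\theta(\bar{\mu})$, $b = H_\theta(\bar{\nu})$ and $\alpha = m/n$.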

6.5 Hilbert spaces and density operators

Endow the vector space $\mathbb{C}^n$ with a hermitian inner product (one may take the standard hermitian inner product $\langle u, v \rangle = \sum_{i=1}^n \overline{u_i} v_i$ for $u, v \in \mathbb{C}^n$, where $\overline{\,\cdot\,}$ denotes taking the complex conjugate), so that it is a Hilbert space.

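Let $(V_1, \langle \cdot, \cdot \rangle)$ and $(V_2, \langle \cdot, \cdot \rangle)$ be Hilbert spaces. On $V_1 \oplus V_2$ we define the inner product by $\langle u_1 \oplus u_2, v_1 \oplus v_2 \rangle = \langle u_1, v_1 \rangle + \langle u_2, v_2 \rangle$. On $V_1 \otimes V_2$ we define the inner product by $\langle u_1 \otimes u_2, v_1 \otimes v_2 \rangle = \langle u_1, v_1 \rangle \langle u_2, v_2 \rangle$ and extending linearly.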

Let $V$ be a Hilbert space. A positive semidefinite hermitian operator $\rho : V \to V$ with trace one is called a density operator. The sequence of eigenvalues of a density operator $\rho$ is a probability vector. Let $\mathrm{spec}(\rho) = (p_1, \ldots, p_n)$ be the sequence of eigenvalues of $\rho$, ordered non-increasingly, $p_1 \geq \cdots \geq p_n$.

Let $V_1$ and $V_2$ be Hilbert spaces. Given a density operator $\rho$ on $V_1 \otimes V_2$, the reduced density operator $\rho_1 = \mathrm{tr}_2\, \rho$ is uniquely defined by the property that $\mathrm{tr}(\rho_1 X_1) = \mathrm{tr}(\rho (X_1 \otimes \mathrm{Id}_{V_2}))$ for all operators $X_1$ on $V_1$. The operator $\rho_1$ is again a density operator. The operation $\mathrm{tr}_2$ is called the partial trace over $V_2$. Explicitly, $\rho_1$ is given by $\langle e_i, \rho_1(e_j) \rangle = \sum_\ell \langle e_i \otimes f_\ell, \rho(e_j \otimes f_\ell) \rangle$, where the $e_i$ are some basis of $V_1$ and the $f_i$ are some basis of $V_2$ (the statement is independent of basis choice).

Let $V_i$ be a Hilbert space and consider the tensor product $V_1 \otimes V_2 \otimes V_3$. Associate with $t \in V_1 \otimes V_2 \otimes V_3$ the dual element $t^* = \langle t, \cdot \rangle \in (V_1 \otimes V_2 \otimes V_3)^*$. Then

$$\rho^t = t t^* / \langle t, t \rangle = t \langle t, \cdot \rangle / \langle t, t \rangle$$

is a density operator on $V_1 \otimes V_2 \otimes V_3$. Viewing $\rho^t$ as a density operator on the regrouped space $V_1 \otimes (V_2 \otimes V_3)$, we may take the partial trace of $\rho^t$ over $V_2 \otimes V_3$ as described above. We denote the resulting density operator by $\rho^t_1 = \mathrm{tr}_{23}\, \rho^t$. We similarly define $\rho^t_2 = \mathrm{tr}_{13}\, \rho^t$ and $\rho^t_3 = \mathrm{tr}_{12}\, \rho^t$.
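In coordinates, the explicit formula for the partial trace gives $(\rho^t_1)_{ij} = \sum_{r,s} t_{irs}\, \overline{t_{jrs}} / \langle t, t \rangle$ with respect to the standard basis, and similarly for the other two legs. The following Python sketch is a minimal illustration in pure Python (nested lists, no linear algebra library); the function names and the example tensor $e_0 \otimes e_0 \otimes e_1 + e_0 \otimes e_1 \otimes e_0 + e_1 \otimes e_0 \otimes e_0$ are our own choices. For a general tensor the spectrum would still have to be computed with an eigenvalue routine, which is not shown here.

```python
from itertools import product

Tensor = list[list[list[complex]]]


def squared_norm(t: Tensor) -> float:
    """<t, t> with respect to the standard hermitian inner product."""
    return sum(abs(x) ** 2 for plane in t for row in plane for x in row)


def reduced_density(t: Tensor, leg: int) -> list[list[complex]]:
    """Matrix of rho^t_{leg+1} in the standard basis (leg is 0, 1 or 2)."""
    dims = (len(t), len(t[0]), len(t[0][0]))
    norm = squared_norm(t)
    others = [k for k in range(3) if k != leg]
    size = dims[leg]
    rho = [[0j] * size for _ in range(size)]
    for i, j in product(range(size), repeat=2):
        total = 0j
        for r, s in product(range(dims[others[0]]), range(dims[others[1]])):
            a = [0, 0, 0]
            b = [0, 0, 0]
            a[leg], b[leg] = i, j
            a[others[0]] = b[others[0]] = r
            a[others[1]] = b[others[1]] = s
            total += t[a[0]][a[1]][a[2]] * complex(t[b[0]][b[1]][b[2]]).conjugate()
        rho[i][j] = total / norm
    return rho


if __name__ == "__main__":
    # Example tensor with entries t[0][0][1] = t[0][1][0] = t[1][0][0] = 1.
    w = [[[0, 1], [1, 0]], [[1, 0], [0, 0]]]
    for leg in range(3):
        print(reduced_density(w, leg))
```

For this example each reduced density matrix is already diagonal, so its ordered spectrum can be read off from the diagonal.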

6.6 Moment polytopes $P(t)$

We give a brief introduction to moment polytopes. We refer to [Nes84, Bri87, Fra02, Wal14] for more information. We begin with the general setting and then specialise to orbit closures in tensor spaces.

6.6.1 General setting

Let $G$ be a connected reductive algebraic group. (We refer to Kraft [Kra84] and Humphreys [Hum75] for an introduction to algebraic groups.) Fix a maximal torus $T \subseteq G$ and a Borel subgroup $T \subseteq B \subseteq G$. We have the character group $X(T)$, the Weyl group $W$, the root system $\Phi \subseteq X(T)$, and the system of positive roots $\Phi^+ \subseteq \Phi$. For $\lambda, \mu \in X(T)$ we set $\lambda \preccurlyeq \mu$ if $\mu - \lambda$ is a sum of positive roots. Let $V$ be a rational $G$-representation. The restriction of the action of $G$ to $T$ gives a decomposition

$$V = \bigoplus_{\lambda \in X(T)} V_\lambda, \qquad V_\lambda = \{ v \in V : \forall t \in T,\ t \cdot v = \lambda(t) v \}.$$

This decomposition is called the weight decomposition of $V$. The $\lambda \in X(T)$ with $V_\lambda \neq 0$ are called the weights of $V$ with respect to $T$. The $V_\lambda$ are the weight spaces of $V$. For $v \in V$, let $v_\lambda$ be the component of $v$ in $V_\lambda$. Let $\mathrm{supp}(v) = \{ \lambda : v_\lambda \neq 0 \}$.

Let $E$ be the real vector space $E = X(T) \otimes \mathbb{R}$. The Weyl group $W$ acts on $X(T)$ and thus on $E$. We enlarge $\preccurlyeq$ to a partial order on $E$ as follows. For $x, y \in E$, let $x \preccurlyeq y$ if $y - x$ is a nonnegative linear combination of positive roots. Let $D \subseteq E$ be the positive Weyl chamber. For every $x \in E$ the orbit $W \cdot x$ intersects the positive Weyl chamber $D$ in exactly one point, which we denote by $\mathrm{dom}(x)$.

Let $V$ be a finite-dimensional rational $G$-module. Let $\chi \in X(T) \cap D$ be a dominant character. We denote the $\chi$-isotypical component of $V$ with $V_{(\chi)}$. Let $Z \subseteq V$ be a Zariski closed set. We denote the coordinate ring of $Z$ with $\mathbb{C}[Z]$. We denote the degree $d$ part of $\mathbb{C}[Z]$ with $\mathbb{C}[Z]_d$. If $Z$ is $G$-stable, then $\mathbb{C}[Z]_d$ is a $G$-module.

Definition 6.4. Let $V$ be a rational $G$-module and $Z \subseteq V$ a nontrivial irreducible closed $G$-stable cone. The moment polytope of $Z$, denoted by $P(Z)$, is defined as the Euclidean closure in $E$ of the set

$$R(Z) = \{ \chi/d : (\mathbb{C}[Z]_d)_{(\chi^*)} \neq 0 \}$$

of normalised characters $\chi/d$ for which the $\chi^*$-isotypical component $(\mathbb{C}[Z]_d)_{(\chi^*)}$ is not zero.

Theorem 6.5 (Mumford–Ness [Nes84], Brion [Bri87], Franz [Fra02]). The moment polytope is indeed a convex polytope, and it is equal to the image of the so-called moment map intersected with the positive Weyl chamber,

$$P(Z) = \mu(Z \setminus 0) \cap D.$$

Let $Z = \overline{G \cdot v}$ be the orbit closure (in the Zariski topology) of a vector $v \in V \setminus 0$, and suppose $\overline{G \cdot v}$ is a cone.

Lemma 6.6 (See e.g. [Str05]). Suppose $\overline{G \cdot v}$ is a cone. Then

$$R(\overline{G \cdot v}) = \{ \chi/d : (\mathbb{C}[\overline{G \cdot v}]_d)_{(\chi^*)} \neq 0 \} = \{ \chi/d : (\mathrm{lin}(G \cdot v^{\otimes d}))_{(\chi)} \neq 0 \}.$$

6.6.2 Tensor spaces

We specialise to 3-tensors. Let $V = V_1 \otimes V_2 \otimes V_3$ with $V_i = \mathbb{C}^{n_i}$. Let

$$G = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3},$$
$$T = T_1 \times T_2 \times T_3,$$

with $T_i$ the diagonal matrices in $\mathrm{GL}_{n_i}$. The weight decomposition of $V$ is the decomposition with respect to the standard basis elements $e_{x_1} \otimes e_{x_2} \otimes e_{x_3}$, where $x \in [n_1] \times [n_2] \times [n_3]$. The support $\mathrm{supp}(v)$ is the support of $v$ with respect to the standard basis.

In the current setting there is a beautiful rephrasing of Theorem 6.5 in terms of ordered spectra of reduced density matrices. Recall from Section 6.5 that for $v \in V \setminus 0$ we have a density matrix $\rho^v$ and reduced density matrices $\rho^v_i$, of which we may take the non-increasingly ordered spectra $\mathrm{spec}(\rho^v_i)$.

Theorem 6.7 (Walter–Doran–Gross–Christandl [WDGC13]). Let $Z \subseteq V$ be a nontrivial irreducible closed $G$-stable cone. Then

$$P(Z) = \{ (\mathrm{spec}\, \rho^z_1, \mathrm{spec}\, \rho^z_2, \mathrm{spec}\, \rho^z_3) : z \in Z \setminus 0 \}.$$

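Let $v \in V \setminus 0$. We consider the moment polytope of the orbit closure $Z = \overline{G \cdot v}$. In this setting, Lemma 6.6 specialises to the following.

Lemma 6.8 (See e.g. [Str05]).

$$\begin{aligned}
R(\overline{G \cdot v}) &= \{ \chi/d : (\mathbb{C}[\overline{G \cdot v}]_d)_{(\chi^*)} \neq 0 \} \\
&= \{ \chi/d : (\mathrm{lin}(G \cdot v^{\otimes d}))_{(\chi)} \neq 0 \} \\
&= \{ \chi/d : P_\chi v^{\otimes d} \neq 0 \},
\end{aligned}$$

where $P_\chi = P_{\chi^{(1)}} \otimes P_{\chi^{(2)}} \otimes P_{\chi^{(3)}}$ with $P_{\chi^{(i)}} : V_i^{\otimes d} \to V_i^{\otimes d}$ the projector onto the isotypical component of type $\chi^{(i)}$ discussed in Section 6.2.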

On the other hand, Theorem 6.7 immediately gives a description of the moment polytope $P(\overline{G \cdot v})$ in terms of ordered spectra of reduced density matrices.

Theorem 6.9. Let $v \in V \setminus 0$. Then

$$P(\overline{G \cdot v}) = \{ (\mathrm{spec}\, \rho^u_1, \mathrm{spec}\, \rho^u_2, \mathrm{spec}\, \rho^u_3) : u \in \overline{G \cdot v} \setminus 0 \}.$$

Summarising, we have two descriptions of the moment polytope: a representation-theoretic or invariant-theoretic description (Lemma 6.8) and a quantum marginal spectra description (Theorem 6.9). These two descriptions are the key to proving the properties of the quantum functionals that we need.

6.7 Quantum functionals $F^\theta(t)$

We will now define the quantum functionals and prove that they are universal spectral points.

Let $p = (p_1, \ldots, p_n) \in \mathbb{R}^n$ be a probability vector, i.e. $\sum_{i=1}^n p_i = 1$ and $p_i \geq 0$ for all $i \in [n]$. Recall that $H(p)$ denotes the Shannon entropy of the probability vector $p$, $H(p) = \sum_{i=1}^n p_i \log_2 (1/p_i)$. Let $x = (x^{(1)}, x^{(2)}, x^{(3)})$ be a triple of probability vectors $x^{(i)} \in \mathbb{R}^{n_i}$. Let $\theta \in \Theta$ be a weighting. Recall that $H_\theta(x)$ denotes the $\theta$-weighted average of the Shannon entropies of the three probability vectors $x^{(1)}, x^{(2)}, x^{(3)}$,

$$H_\theta(x) = \theta(1) H(x^{(1)}) + \theta(2) H(x^{(2)}) + \theta(3) H(x^{(3)}).$$

Let $V = \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$. Let $G = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$. Let $v \in V \setminus 0$. We use the notation $P(v) = P(\overline{G \cdot v})$ for the moment polytope of the orbit closure of $v$.

Definition 6.10. For $\theta \in \Theta$ and $v \in V \setminus 0$ let

$$F^\theta(v) = \max \{ 2^{H_\theta(x)} : x \in P(v) \}.$$

Let $F^\theta(0) = 0$. We call the functions $F^\theta$ the quantum functionals. The name quantum functional comes from the fact that the moment polytope $P(t)$ consists of triples of quantum marginal entropies.
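Since $F^\theta(v)$ is a maximum over $P(v)$, any single point of the moment polytope gives a lower bound $2^{H_\theta(x)} \leq F^\theta(v)$; by Theorem 6.9 the ordered marginal spectra of $v$ itself form such a point. The Python sketch below only evaluates $2^{H_\theta(x)}$ for a given triple of spectra and does not compute the maximum. The function names, the uniform weighting and the two example triples are assumptions made for illustration.

```python
from math import log2
from typing import Sequence


def shannon_entropy(p: Sequence[float]) -> float:
    """Shannon entropy in bits, with the convention 0 * log(1/0) = 0."""
    return sum(x * log2(1 / x) for x in p if x > 0)


def weighted_entropy(theta: Sequence[float], x: Sequence[Sequence[float]]) -> float:
    """H_theta(x) = sum_i theta(i) * H(x^(i))."""
    return sum(weight * shannon_entropy(p) for weight, p in zip(theta, x))


def lower_bound_from_point(theta: Sequence[float], x: Sequence[Sequence[float]]) -> float:
    """2 ** H_theta(x), a lower bound on F^theta(v) whenever x lies in P(v)."""
    return 2 ** weighted_entropy(theta, x)


if __name__ == "__main__":
    theta = (1 / 3, 1 / 3, 1 / 3)
    # Marginal spectra of e0*e0*e0 + e1*e1*e1: all three marginals are uniform.
    uniform = ((0.5, 0.5),) * 3
    # Marginal spectra of e0*e0*e1 + e0*e1*e0 + e1*e0*e0.
    skewed = ((2 / 3, 1 / 3),) * 3
    print(lower_bound_from_point(theta, uniform))
    print(lower_bound_from_point(theta, skewed))
```

For the first example the bound equals 2, which is also the largest value $2^{H_\theta(x)}$ can take for probability vectors of length 2, so in that case the lower bound is attained.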

Theorem 6.11. Let $\mathcal{T}$ be the semiring of 3-tensors over $\mathbb{C}$. Let $\leq$ be the restriction preorder. For $\theta \in \Theta$,

$$F^\theta \in X(\mathcal{T}, \leq).$$

In other words, $F^\theta$ is a semiring homomorphism $\mathcal{T} \to \mathbb{R}_{\geq 0}$ which is monotone under restriction $\leq$. In fact, $F^\theta$ is monotone under degeneration $\trianglelefteq$.

Remark 6.12. The results in this chapter generalise to $k$-tensors over $\mathbb{C}$. In our paper [CVZ18] we discuss this general situation in detail and make a distinction between upper quantum functionals and lower quantum functionals.

Let p isin Rn and q isin Rm be probability vectors The tensor product potimesq isin Rnm

defined by

potimes q = (piqj i isin [n] j isin [m])

is a probability vector The direct sum poplus q isin Rn+m defined by

poplus q = (p1 pn q1 qm)

is a probability vectorLet x = (x(1) x(2) x(3)) and y = (y(1) y(2) y(3)) be triples of probability vectors

We define the tensor product xotimes y elementwise

xotimes y = (x(1) otimes y(1) x(2) otimes y(2) x(3) otimes y(3))

67 Quantum functionals F θ(t) 97

We define the direct sum xoplus y elementwise

xoplus y = (x(1) oplus y(1) x(2) oplus y(2) x(3) oplus y(3))

For x otimes y and x oplus y to be in the moment polytope we will need to reorder thecomponents non-increasingly For a triple of probability vectors x = (x(1) x(2) x(3))let

dom(x)

be the triple of probability vectors obtained from x be reordering the compo-nents x(i) such that they become non-increasing Let dom(S) = dom(x) x isin S

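For $v \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$ we will use the notation $G(v) = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$ to denote the group that naturally corresponds to the space that $v$ lives in. We will use the notation $\mathbf{P}(v) = \mathbf{P}(\overline{G(v) \cdot v})$ for the moment polytope of the orbit closure of $v$.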

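Theorem 6.13. Let $s \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$ and $t \in \mathbb{C}^{m_1} \otimes \mathbb{C}^{m_2} \otimes \mathbb{C}^{m_3}$.

(i) $\mathrm{dom}\bigl(\mathbf{P}(s) \otimes \mathbf{P}(t)\bigr) \subseteq \mathbf{P}(s \otimes t)$.

(ii) $\forall \alpha \in [0,1]$: $\mathrm{dom}\bigl(\alpha \mathbf{P}(s) \oplus (1-\alpha) \mathbf{P}(t)\bigr) \subseteq \mathbf{P}(s \oplus t)$.

(iii) If $s, t \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3} \setminus \{0\}$ and $s \in \overline{G(t) \cdot t}$, then $\mathbf{P}(s) \subseteq \mathbf{P}(t)$.

(iv) $\mathbf{P}(s \oplus 0) = \mathbf{P}(s) \oplus 0$.

(v) $\mathbf{P}(\langle 1 \rangle) = \{((1), (1), (1))\}$ with $\langle 1 \rangle = e_1 \otimes e_1 \otimes e_1 \in \mathbb{C}^1 \otimes \mathbb{C}^1 \otimes \mathbb{C}^1$.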

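Proof. To prove statements (i) and (ii), let $x \in \mathbf{P}(s)$ and $y \in \mathbf{P}(t)$. Then there are elements $a \in \overline{G(s) \cdot s}$ and $b \in \overline{G(t) \cdot t}$ with ordered marginal spectra $x$ and $y$,

\[ x = (\mathrm{spec}\, \rho^a_1,\ \mathrm{spec}\, \rho^a_2,\ \mathrm{spec}\, \rho^a_3), \]

\[ y = (\mathrm{spec}\, \rho^b_1,\ \mathrm{spec}\, \rho^b_2,\ \mathrm{spec}\, \rho^b_3). \]

We prove statement (i). We have $a \otimes b \in \overline{G(s \otimes t) \cdot s \otimes t}$. Thus

\[ \mathrm{dom}(x \otimes y) = (\mathrm{spec}\, \rho^{a \otimes b}_1,\ \mathrm{spec}\, \rho^{a \otimes b}_2,\ \mathrm{spec}\, \rho^{a \otimes b}_3) \in \mathbf{P}(s \otimes t). \]

We conclude $\mathrm{dom}(\mathbf{P}(s) \otimes \mathbf{P}(t)) \subseteq \mathbf{P}(s \otimes t)$.

We prove statement (ii). Let $\alpha \in [0,1]$. Define the tensor $u(\alpha) \in \mathbb{C}^{n_1+m_1} \otimes \mathbb{C}^{n_2+m_2} \otimes \mathbb{C}^{n_3+m_3}$ by

\[ u(\alpha) = \frac{\sqrt{\alpha}}{\sqrt{\langle s, s \rangle}}\, a \oplus \frac{\sqrt{1-\alpha}}{\sqrt{\langle t, t \rangle}}\, b. \]

Then $u(\alpha) \in \overline{G(s \oplus t) \cdot s \oplus t}$. We have $\rho^{u(\alpha)}_i = \alpha \rho^a_i \oplus (1-\alpha) \rho^b_i$. From the observation

\[ \mathrm{spec}(\alpha \rho^a_i \oplus (1-\alpha) \rho^b_i) = \mathrm{dom}(\alpha x \oplus (1-\alpha) y) \]

follows $\mathrm{dom}(\alpha x \oplus (1-\alpha) y) \in \mathbf{P}(\overline{G(s \oplus t) \cdot s \oplus t})$. We conclude

\[ \mathrm{dom}(\alpha \mathbf{P}(s) \oplus (1-\alpha) \mathbf{P}(t)) \subseteq \mathbf{P}(s \oplus t). \]

We have thus proven statements (i) and (ii).

We prove statement (iii). Let $G = G(t) = G(s)$. Let $s \in \overline{G \cdot t}$. Then $\overline{G \cdot s} \subseteq \overline{G \cdot t}$, so we have a $G$-equivariant restriction map $\mathbb{C}[\overline{G \cdot s}] \twoheadleftarrow \mathbb{C}[\overline{G \cdot t}]$ on the coordinate rings. Let $\chi/d \in R(\overline{G \cdot s})$ with $(\mathbb{C}[\overline{G \cdot s}]_d)_{(\chi^*)} \neq 0$. Then also $(\mathbb{C}[\overline{G \cdot t}]_d)_{(\chi^*)} \neq 0$ by Schur's lemma. Thus $\chi/d \in R(\overline{G \cdot t}) \subseteq \mathbf{P}(\overline{G \cdot t})$. We conclude $\mathbf{P}(s) \subseteq \mathbf{P}(t)$.

We prove statement (iv). Let $\chi/d \in R(\overline{G(s \oplus 0) \cdot (s \oplus 0)})$ with $P_\chi (s \oplus 0)^{\otimes d} \neq 0$. Recall from Section 6.2 that $P_\chi$ is given by the action of an element in the group algebra $\mathbb{C}[S_d]$, which we also denoted by $P_\chi$. From this viewpoint we see that also $P_\chi s^{\otimes d} \neq 0$. So $\chi/d \in R(\overline{G(s) \cdot s})$.

Statement (v) is a direct observation.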

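Corollary 6.14.

(i) $F^\theta(s) F^\theta(t) \le F^\theta(s \otimes t)$.

(ii) $F^\theta(s) + F^\theta(t) \le F^\theta(s \oplus t)$.

(iii) If $s \trianglelefteq t$, then $F^\theta(s) \le F^\theta(t)$.

(iv) $F^\theta(\langle 1 \rangle) = 1$.

Proof. (i) Let $x \in \mathbf{P}(s)$ and $y \in \mathbf{P}(t)$. Then $\mathrm{dom}(x \otimes y) \in \mathbf{P}(s \otimes t)$ by Theorem 6.13, and reordering does not change entropy. It is a basic fact that $H_\theta(x) + H_\theta(y) = H_\theta(x \otimes y)$ (Lemma 4.9), so $2^{H_\theta(x)} 2^{H_\theta(y)} = 2^{H_\theta(x \otimes y)}$. We conclude $F^\theta(s) F^\theta(t) \le F^\theta(s \otimes t)$.

(ii) Let $x \in \mathbf{P}(s)$ and $y \in \mathbf{P}(t)$. Then by Theorem 6.13, for all $\alpha \in [0,1]$,

\[ \mathrm{dom}(\alpha x \oplus (1-\alpha) y) \in \mathbf{P}(s \oplus t). \]

It is a basic fact that $\alpha H_\theta(x) + (1-\alpha) H_\theta(y) + h(\alpha) = H_\theta(\alpha x \oplus (1-\alpha) y)$ (Lemma 4.9). Thus for any $\alpha \in [0,1]$ we have $2^{\alpha H_\theta(x) + (1-\alpha) H_\theta(y) + h(\alpha)} \le F^\theta(s \oplus t)$. Using Lemma 4.9(iv) we conclude $F^\theta(s) + F^\theta(t) \le F^\theta(s \oplus t)$.

(iii) This follows from statements (iii) and (iv) of Theorem 6.13, since by definition degeneration $s \trianglelefteq t$ means $s \oplus 0 \in \overline{G(t \oplus 0) \cdot (t \oplus 0)}$.

(iv) This follows from statement (v) of Theorem 6.13.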
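The two entropy identities used in this proof, and the optimisation over $\alpha$ that turns the bound in (ii) into a sum, can be checked numerically. The sketch below works with single probability vectors rather than triples (the weighted version follows by linearity); the sample vectors are arbitrary, and the closed form $\alpha^* = 2^a/(2^a + 2^b)$ is the standard maximiser, which we assume is what the step citing Lemma 4.9(iv) relies on.

```python
from math import log2

def H(p):
    """Shannon entropy in bits."""
    return -sum(pi * log2(pi) for pi in p if pi > 0)

def h(alpha):
    """Binary entropy."""
    return H([alpha, 1 - alpha])

p, q = [0.5, 0.3, 0.2], [0.9, 0.1]   # arbitrary sample vectors
alpha = 0.35

tensor = [pi * qj for pi in p for qj in q]
mix = [alpha * pi for pi in p] + [(1 - alpha) * qj for qj in q]

additive_gap = H(tensor) - (H(p) + H(q))
mixing_gap = H(mix) - (alpha * H(p) + (1 - alpha) * H(q) + h(alpha))

# Maximising 2^(alpha*a + (1-alpha)*b + h(alpha)) over alpha gives 2^a + 2^b.
a, b = H(p), H(q)
alpha_star = 2 ** a / (2 ** a + 2 ** b)
best = 2 ** (alpha_star * a + (1 - alpha_star) * b + h(alpha_star))
grid_max = max(2 ** (t * a + (1 - t) * b + h(t))
               for t in (k / 1000 for k in range(1001)))
```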


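Theorem 6.15.

(i) $R(s \otimes t) \subseteq \{ \lambda/N : \exists\, \mu/N \in R(s),\ \nu/N \in R(t) : g_{\lambda^{(i)} \mu^{(i)} \nu^{(i)}} > 0 \text{ for all } i \}$.

(ii) $R(s \oplus t) \subseteq \{ \lambda/N : \exists\, \mu/m \in R(s),\ \nu/(N-m) \in R(t) : c^{\lambda^{(i)}}_{\mu^{(i)} \nu^{(i)}} > 0 \text{ for all } i \}$.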

Proof. (i) Let $s \in V_1 \otimes V_2 \otimes V_3$ and let $t \in W_1 \otimes W_2 \otimes W_3$. Let $\lambda/N \in R(s \otimes t)$ with $P_\lambda (s \otimes t)^{\otimes N} \neq 0$. Let $\pi$ be the natural reordering map

\[ \pi : ((V_1 \otimes W_1) \otimes (V_2 \otimes W_2) \otimes (V_3 \otimes W_3))^{\otimes N} \to (V_1 \otimes V_2 \otimes V_3)^{\otimes N} \otimes (W_1 \otimes W_2 \otimes W_3)^{\otimes N}. \]

Then

\[ (s \otimes t)^{\otimes N} = \sum_{\mu, \nu} \pi^{-1} (P_\mu \otimes P_\nu) \pi (s \otimes t)^{\otimes N}. \]

Let $\mu, \nu \vdash_3 N$ with $P_\lambda \pi^{-1} (P_\mu \otimes P_\nu) \pi (s \otimes t)^{\otimes N} \neq 0$. Then $P_\mu s^{\otimes N} \neq 0$ and $P_\nu t^{\otimes N} \neq 0$, i.e. $\mu/N \in R(s)$ and $\nu/N \in R(t)$. Moreover, $P_\lambda \pi^{-1} (P_\mu \otimes P_\nu) \pi \neq 0$, which means the Kronecker coefficients $g_{\lambda^{(i)} \mu^{(i)} \nu^{(i)}}$ are nonzero.

(ii) Let $\lambda/N \in R(s \oplus t)$ with $P_\lambda (s \oplus t)^{\otimes N} \neq 0$. Let us expand $(s \oplus t)^{\otimes N}$ as

\[ (s \oplus t)^{\otimes N} = s^{\otimes N} \oplus (s^{\otimes N-1} \otimes t) \oplus \cdots \oplus t^{\otimes N}. \]

Then $P_\lambda$ does not vanish on some summand, which we may assume to be of the form $s^{\otimes m} \otimes t^{\otimes N-m}$. Let $\pi$ be the natural projection

\[ \pi : ((V_1 \oplus W_1) \otimes (V_2 \oplus W_2) \otimes (V_3 \oplus W_3))^{\otimes N} \to (V_1 \otimes V_2 \otimes V_3)^{\otimes m} \otimes (W_1 \otimes W_2 \otimes W_3)^{\otimes N-m}. \]

Let $\mu, \nu$ with $P_\lambda \pi^{-1} (P_\mu \otimes P_\nu) \pi (s \oplus t)^{\otimes N} \neq 0$. Then $P_\mu s^{\otimes m} \neq 0$ and $P_\nu t^{\otimes N-m} \neq 0$. Moreover, $P_\lambda \pi^{-1} (P_\mu \otimes P_\nu) \pi \neq 0$. Therefore the Littlewood–Richardson coefficients $c^{\lambda^{(i)}}_{\mu^{(i)} \nu^{(i)}}$ are nonzero.

Corollary 6.16.

(i) $F^\theta(s \otimes t) \le F^\theta(s) F^\theta(t)$.

(ii) $F^\theta(s \oplus t) \le F^\theta(s) + F^\theta(t)$.

Proof. (i) Let $\lambda/N \in R(s \otimes t)$. By Theorem 6.15 there is a $\mu/N \in R(s)$ and a $\nu/N \in R(t)$ such that the Kronecker coefficient $g_{\lambda^{(i)} \mu^{(i)} \nu^{(i)}}$ is nonzero for every $i$. Then $2^{H_\theta(\mu)} \le F^\theta(s)$ and $2^{H_\theta(\nu)} \le F^\theta(t)$ by definition of $F^\theta$. The Kronecker coefficients being nonzero implies

\[ 2^{H_\theta(\lambda)} \le 2^{H_\theta(\mu)}\, 2^{H_\theta(\nu)} \]

by Lemma 6.3. We conclude $F^\theta(s \otimes t) \le F^\theta(s) F^\theta(t)$.

(ii) Let $\lambda/N \in R(s \oplus t)$. Then by Theorem 6.15 there are $\mu/m \in R(s)$ and $\nu/(N-m) \in R(t)$ such that the Littlewood–Richardson coefficient $c^{\lambda^{(i)}}_{\mu^{(i)} \nu^{(i)}}$ is nonzero for every $i$. This means

\[ 2^{H_\theta(\lambda)} \le 2^{H_\theta(\mu)} + 2^{H_\theta(\nu)} \]

by Lemma 6.3. We conclude $F^\theta(s \oplus t) \le F^\theta(s) + F^\theta(t)$.

Proof of Theorem 6.11. Corollary 6.14 and Corollary 6.16 together prove Theorem 6.11.

6.8 Outer approximation

In this section we discuss an outer approximation of $\mathbf{P}(t)$. We will use this outer approximation to show that the quantum functionals are at most the support functionals.

Let $\preceq$ be the dominance order, i.e. the majorization order, on triples of probability vectors. For any set $S \subseteq \mathbb{R}^{n_1} \times \mathbb{R}^{n_2} \times \mathbb{R}^{n_3}$ of triples of probability vectors let $S^{\preceq}$ denote the upward closure with respect to $\preceq$,

\[ S^{\preceq} = \{ y \in \mathbb{R}^{n_1} \times \mathbb{R}^{n_2} \times \mathbb{R}^{n_3} : \exists x \in S,\ x \preceq y \}. \]

Let $\mathrm{conv}(S)$ denote the convex hull of $S$ in $\mathbb{R}^{n_1} \times \mathbb{R}^{n_2} \times \mathbb{R}^{n_3}$. Recall that for $x \in S$ we defined $\mathrm{dom}(x)$ as the triple of probability vectors obtained from $x = (x^{(1)}, x^{(2)}, x^{(3)})$ by reordering the entries of each component $x^{(i)}$ such that they become non-increasing, and $\mathrm{dom}(S) = \{\mathrm{dom}(x) : x \in S\}$.
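For a single pair of probability vectors the dominance order can be tested by comparing prefix sums of the sorted vectors; for triples one applies the test componentwise. The sketch below is an illustration under that standard definition of majorization; the function name `dominated_by` and the sample vectors are our own. It also shows the behaviour that makes the upward closure harmless for entropy maximisation: moving up in the order never increases Shannon entropy.

```python
from math import log2

def H(p):
    """Shannon entropy in bits."""
    return -sum(pi * log2(pi) for pi in p if pi > 0)

def dominated_by(x, y, tol=1e-12):
    """True if x is below y in the dominance (majorization) order.

    Both vectors are sorted non-increasingly, padded with zeros to equal
    length, and compared prefix sum by prefix sum. Assumes both sum to 1.
    """
    n = max(len(x), len(y))
    xs = sorted(x, reverse=True) + [0.0] * (n - len(x))
    ys = sorted(y, reverse=True) + [0.0] * (n - len(y))
    sx = sy = 0.0
    for a, b in zip(xs, ys):
        sx += a
        sy += b
        if sx > sy + tol:
            return False
    return True

uniform = [1 / 3, 1 / 3, 1 / 3]
middle = [0.5, 0.3, 0.2]
point = [1.0, 0.0, 0.0]
```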

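Theorem 6.17 (Strassen [Str05]). Let $v \in V \setminus \{0\}$. Then

\[ \mathbf{P}(v) \subseteq (\mathrm{dom}\ \mathrm{conv}\ \mathrm{supp}\ v)^{\preceq}. \qquad (6.5) \]

Proof. We give the proof for the convenience of the reader. Let $\chi/d \in R(\overline{G \cdot v})$. Then $(\mathrm{lin}(G \cdot v^{\otimes d}))_{(\chi)} \neq 0$. Let $M_\chi \subseteq \mathrm{lin}(G \cdot v^{\otimes d})$ be a simple $G$-submodule with highest weight $\chi$. Let $N \subseteq V^{\otimes d}$ be the $G$-module complement, $N \oplus M_\chi = V^{\otimes d}$. Then $v^{\otimes d}$ is not in $N$. Let $v = \bigoplus_{\gamma \in \mathrm{supp}\, v} v_\gamma$ be the weight decomposition. Then $v^{\otimes d}$ is a sum of tensor products of the $v_\gamma$. At least one summand is not in $N$, say of weight $\eta = \sum_\gamma d_\gamma \gamma$ with $\sum_\gamma d_\gamma = d$. The projection $V^{\otimes d} \to M_\chi$ along $N$ maps this summand onto a nonzero weight vector of weight $\eta$. So $\eta$ is a weight of $M_\chi$. Then also $\mathrm{dom}(\eta)$ is a weight of $M_\chi$. Since $\chi$ is the highest weight of $M_\chi$, $\mathrm{dom}(\eta) \preceq \chi$. Then $\mathrm{dom}(\eta/d) \preceq \chi/d$. We have $\eta/d = \sum_\gamma \frac{d_\gamma}{d} \gamma \in \mathrm{conv}\ \mathrm{supp}\ v$. We conclude $R(\overline{G \cdot v}) \subseteq (\mathrm{dom}\ \mathrm{conv}\ \mathrm{supp}\ v)^{\preceq}$ and thus $\mathbf{P}(\overline{G \cdot v}) \subseteq (\mathrm{dom}\ \mathrm{conv}\ \mathrm{supp}\ v)^{\preceq}$.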


6.9 Inner approximation for free tensors

In this section we discuss an inner approximation for the moment polytope of a free tensor. We will use this inner approximation in the next section to prove that the quantum functionals coincide with the support functionals when restricted to free tensors. We will also prove that not all tensors are free.

We say a set $\Phi \subseteq [n_1] \times [n_2] \times [n_3]$ is free if every two different elements of $\Phi$ differ in at least two coordinates, in other words, if the elements of $\Phi$ have Hamming distance at least two. We say $v \in V = \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$ is free if for some $g \in G(v) = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$ the support $\mathrm{supp}(g \cdot v) \subseteq [n_1] \times [n_2] \times [n_3]$ is free. (Free is called schlicht in [Str05].)

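Theorem 6.18 (Strassen [Str05]). Let $v \in V \setminus \{0\}$ with $\mathrm{supp}(v)$ free. Then

\[ \mathrm{dom}\ \mathrm{conv}\ \mathrm{supp}\ v \subseteq \mathbf{P}(v). \]

Proof. We refer to [Str05].

Corollary 6.19. Let $v \in V \setminus \{0\}$ with $\mathrm{supp}(v)$ free. Then

\[ \mathbf{P}(v)^{\preceq} = (\mathrm{dom}\ \mathrm{conv}\ \mathrm{supp}\ v)^{\preceq}. \]

Proof. By Theorem 6.18, $\mathrm{dom}\ \mathrm{conv}\ \mathrm{supp}\ v \subseteq \mathbf{P}(v)$. We take the upward closure on both sides to get $(\mathrm{dom}\ \mathrm{conv}\ \mathrm{supp}\ v)^{\preceq} \subseteq \mathbf{P}(v)^{\preceq}$. On the other hand, from Theorem 6.17 follows $\mathbf{P}(v)^{\preceq} \subseteq (\mathrm{dom}\ \mathrm{conv}\ \mathrm{supp}\ v)^{\preceq}$.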

Remark 6.20. Recall that $v \in V$ is oblique if the support $\mathrm{supp}(g \cdot v)$ is an antichain for some $g \in G(v)$ (Section 4.4). Such antichains are free, so oblique tensors are free. Thus $\{\text{tight}\} \subseteq \{\text{oblique}\} \subseteq \{\text{free}\}$. Like the tight tensors and oblique tensors, free tensors form a semigroup under $\otimes$ and $\oplus$.

Proposition 6.21. For $n \ge 5$ there exists a tensor that is not free in $\mathbb{C}^n \otimes \mathbb{C}^n \otimes \mathbb{C}^n$.

Proof. We upper bound the maximal size of a free support. Let $\Phi \subseteq [n] \times [n] \times [n]$ be free. Any two distinct elements in $\Phi$ are still distinct if we forget the third coordinate of each. Therefore $|\Phi| = |\{(\alpha_1, \alpha_2) : \alpha \in \Phi\}| \le n^2$. (This is a special case of the Singleton bound [Sin64] from coding theory. This upper bound is tight, since $\Phi = \{(a, b, c) : a, b, c \in [n],\ c = a + b \bmod n\}$ is free and has size $n^2$.) Second, we apply the following observation of Bürgisser [Bur90, page 3]. Let

\[ Z_n = \{ t \in \mathbb{C}^n \otimes \mathbb{C}^n \otimes \mathbb{C}^n : \exists g \in G(t),\ |\mathrm{supp}(g \cdot t)| < n^3 - 3n^2 \}. \]

Let $Y_n = \mathbb{C}^n \otimes \mathbb{C}^n \otimes \mathbb{C}^n \setminus Z_n$. Then the set $Y_n$ is Zariski open and nonempty. Now let $n \ge 5$ and let $t \in Y_n$. Then $\forall g \in G(t)$: $|\mathrm{supp}(g \cdot t)| \ge n^3 - 3n^2 > n^2$. We conclude $t$ is not free.
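The combinatorial part of this proof can be checked by brute force for small $n$. The sketch below tests freeness via pairwise Hamming distance, builds the cyclic support from the proof (with 0-based indices, which is our own convention), and compares the two counts $n^3 - 3n^2$ and $n^2$. The function names are illustrative.

```python
from itertools import combinations, product

def hamming(a, b):
    """Number of coordinates in which the tuples a and b differ."""
    return sum(1 for ai, bi in zip(a, b) if ai != bi)

def is_free(support):
    """True if any two distinct elements differ in at least two coordinates."""
    return all(hamming(a, b) >= 2 for a, b in combinations(set(support), 2))

def cyclic_support(n):
    """The set {(a, b, c) : c = a + b mod n}, using indices 0..n-1."""
    return {(a, b, (a + b) % n) for a, b in product(range(n), repeat=2)}

n = 5
phi = cyclic_support(n)
generic_lower_bound = n**3 - 3 * n**2   # lower bound on support size for t in Y_n
```

For $n = 4$ the two counts coincide ($4^3 - 3 \cdot 4^2 = 4^2$), which is why the strict inequality in the proof needs $n \ge 5$.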


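6.10 Quantum functionals versus support functionals

We discussed the support functionals $\zeta^\theta \in \mathbf{X}(\{\text{oblique 3-tensors over } \mathbb{F}\})$ in Chapter 4. We recall the definition over $\mathbb{C}$. Let $V = \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$. For $\theta \in \Theta = \mathcal{P}([3])$ and $t \in V \setminus \{0\}$ with $\mathrm{supp}(t)$ oblique,

\[ \zeta^\theta(t) = \max \{ 2^{H_\theta(P)} : P \in \mathcal{P}(\mathrm{supp}(t)) \}. \]

We also discussed an extension of $\zeta^\theta$ to all 3-tensors over $\mathbb{C}$, the upper support functional,

\[ \zeta^\theta(t) = \min_{g \in G(t)} \max \{ 2^{H_\theta(P)} : P \in \mathcal{P}(\mathrm{supp}(g \cdot t)) \}. \]

We know $\zeta^\theta(s \otimes t) \le \zeta^\theta(s) \zeta^\theta(t)$, $\zeta^\theta(s \oplus t) = \zeta^\theta(s) + \zeta^\theta(t)$, $\zeta^\theta(\langle 1 \rangle) = 1$, and $s \le t \Rightarrow \zeta^\theta(s) \le \zeta^\theta(t)$ for any $s, t \in V$.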
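To make the inner maximisation concrete, the sketch below evaluates $2^{H_\theta(P)}$ for one distribution $P$ on a given support, where $H_\theta(P)$ is the $\theta$-weighted average of the entropies of the three marginals of $P$. The support used (three index triples forming an antichain), the uniform distribution and the uniform weighting are illustrative assumptions. A single evaluation only gives a lower bound on the maximum over all distributions on that support, and it ignores the minimisation over $g$.

```python
from math import log2

def H(p):
    """Shannon entropy in bits."""
    return -sum(pi * log2(pi) for pi in p if pi > 0)

def marginals(dist, dims):
    """Marginals of a distribution {(i, j, k): prob} with 0-based indices."""
    out = [[0.0] * n for n in dims]
    for index, prob in dist.items():
        for axis, i in enumerate(index):
            out[axis][i] += prob
    return out

def objective(theta, dist, dims):
    """2^{H_theta(P)} for the distribution P = dist."""
    return 2 ** sum(w * H(m) for w, m in zip(theta, marginals(dist, dims)))

# Hypothetical example: an antichain support in [2] x [2] x [2].
support = [(0, 0, 1), (0, 1, 0), (1, 0, 0)]
uniform = {idx: 1 / 3 for idx in support}
theta = (1 / 3, 1 / 3, 1 / 3)
value = objective(theta, uniform, (2, 2, 2))   # each marginal is (2/3, 1/3)
```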

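The set $\mathrm{conv}\ \mathrm{supp}(g \cdot t)$ is the set of marginals of probability distributions on $\mathrm{supp}(g \cdot t)$. Thus $\mathrm{dom}\ \mathrm{conv}\ \mathrm{supp}(g \cdot t)$ is the set of ordered marginals of probability distributions on $\mathrm{supp}(g \cdot t)$. Therefore

\[ \zeta^\theta(t) = \min_{g \in G(t)} \max_{x \in S(g \cdot t)} 2^{H_\theta(x)} \]

with $S(w) = \mathrm{dom}\ \mathrm{conv}\ \mathrm{supp}\ w$. Let $X \subseteq \mathbb{R}^{n_1} \times \mathbb{R}^{n_2} \times \mathbb{R}^{n_3}$ be a set of triples of probability vectors. From Schur-concavity of the Shannon entropy function follows $\max_{x \in X} 2^{H_\theta(x)} = \max_{x \in X^{\preceq}} 2^{H_\theta(x)}$. Also $H_\theta(x) = H_\theta(\mathrm{dom}\, x)$.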

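Theorem 6.22. $\zeta^\theta(t) \ge F^\theta(t)$.

Proof. Let $g \in G(t)$ such that

\[ \max_{x \in S} 2^{H_\theta(x)} = \zeta^\theta(t) \]

with $S = \mathrm{dom}\ \mathrm{conv}\ \mathrm{supp}(g \cdot t)$. We have

\[ \max_{x \in S} 2^{H_\theta(x)} = \max_{x \in S^{\preceq}} 2^{H_\theta(x)}. \]

By Theorem 6.17 applied to $g \cdot t$, and since $\mathbf{P}(g \cdot t) = \mathbf{P}(t)$, we have $\mathbf{P}(t) \subseteq S^{\preceq}$. We conclude $F^\theta(t) \le \zeta^\theta(t)$.

Theorem 6.23. Let $t \in V$ be free. Then $\zeta^\theta(t) = F^\theta(t)$.

Proof. We know from Theorem 6.22 that $\zeta^\theta(t) \ge F^\theta(t)$. We prove $\zeta^\theta(t) \le F^\theta(t)$. Let $g \in G(t)$ such that $\mathrm{supp}(g \cdot t)$ is free. Let $S = \mathrm{dom}\ \mathrm{conv}\ \mathrm{supp}(g \cdot t)$. Then $\zeta^\theta(t) \le \max_{x \in S} 2^{H_\theta(x)} = \max_{x \in S^{\preceq}} 2^{H_\theta(x)}$. By Theorem 6.18 (in the form of Corollary 6.19) we have $S^{\preceq} = \mathbf{P}(t)^{\preceq}$. We conclude $\zeta^\theta(t) \le F^\theta(t)$.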


We can show that the regularised upper support functional equals the quantum functional. As a consequence, the quantum functional is at least the lower support functional, which was discussed in Chapter 4.

Theorem 6.24. $\lim_{n \to \infty} \zeta^\theta(t^{\otimes n})^{1/n} = F^\theta(t)$.

Proof We refer the reader to [CVZ18]

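Corollary 6.25. $F^\theta(v) \ge \zeta_\theta(v)$.

Proof. By Theorem 6.24, $F^\theta(v) = \lim_{n \to \infty} \zeta^\theta(v^{\otimes n})^{1/n}$. We know $\zeta^\theta(v) \ge \zeta_\theta(v)$ by Theorem 4.15, and thus $\lim_{n \to \infty} \zeta^\theta(v^{\otimes n})^{1/n} \ge \lim_{n \to \infty} \zeta_\theta(v^{\otimes n})^{1/n}$. The lower support functional $\zeta_\theta$ is supermultiplicative under $\otimes$ (Theorem 4.14), so

\[ \lim_{n \to \infty} \zeta_\theta(v^{\otimes n})^{1/n} \ge \zeta_\theta(v). \]

Combining these three facts proves the corollary.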

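6.11 Asymptotic slice rank

We proved in Section 4.6 that for oblique $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$ the asymptotic slice rank $\lim_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n}$ exists and equals $\min_{\theta \in \Theta} \zeta^\theta(t)$ with $\Theta = \mathcal{P}([3])$. In this section we prove the analogous statement for the quantum functionals.

Theorem 6.26. Let $t \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$. Then

\[ \lim_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n} = \min_{\theta \in \Theta} F^\theta(t). \]

We work towards the proof of Theorem 6.26. Let $t \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3} \setminus \{0\}$. Let $E^\theta(t) = \log_2 F^\theta(t)$.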

Lemma 6.27. For any $\varepsilon > 0$ there is an $n_0 \in \mathbb{N}$ such that for all $n \ge n_0$ there is a $\lambda/n \in R(t)$ with $\min_{i \in [3]} H(\lambda^{(i)}) \ge \min_{\theta \in \Theta} E^\theta(t) - \varepsilon$.

Proof. By definition,

\[ \min_{\theta \in \Theta} E^\theta(t) = \min_{\theta \in \Theta} \max_{x \in \mathbf{P}(t)} \sum_{j \in [3]} \theta(j) H(x^{(j)}). \]

By von Neumann's minimax theorem, the right-hand side equals

\[ \max_{x \in \mathbf{P}(t)} \min_{\theta \in \Theta} \sum_{j \in [3]} \theta(j) H(x^{(j)}), \]

which equals

\[ \max_{x \in \mathbf{P}(t)} \min_{j \in [3]} H(x^{(j)}). \]

Let $\varepsilon > 0$. Let $\mu/m \in R(t)$ with $\min_{j \in [3]} H(\mu^{(j)}) \ge \min_{\theta \in \Theta} E^\theta(t) - \varepsilon/2$. We will use two facts. We have $(P_{(1)} \otimes P_{(1)} \otimes P_{(1)}) t = t \neq 0$. The triples of partitions $\lambda$ with $P_\lambda t^{\otimes n} \neq 0$ for some $n$ form a semigroup. Let $n \in \mathbb{N}$. We can write $n = qm + r$ with $q, r \in \mathbb{N}$, $0 \le r < m$. Let $\lambda^{(j)} = q\mu^{(j)} + (r)$. Then by the semigroup property $P_\lambda t^{\otimes n} \neq 0$, i.e. $\lambda/n \in R(t)$. We have $\tfrac{1}{n}(q\mu^{(j)} + (r)) = \tfrac{qm}{n} \mu^{(j)} + \tfrac{r}{n} (r)$. By concavity of Shannon entropy,

\[ H\bigl(\tfrac{1}{n}(q\mu^{(j)} + (r))\bigr) = H\bigl(\tfrac{qm}{n} \mu^{(j)} + \tfrac{r}{n} (r)\bigr) \ge \tfrac{qm}{n} H(\mu^{(j)}) \ge \bigl(1 - \tfrac{m}{n}\bigr) H(\mu^{(j)}). \]

When $n$ is large enough, $(1 - \tfrac{m}{n}) H(\mu^{(j)})$ is at least $H(\mu^{(j)}) - \varepsilon/2$. Let $n_0 \in \mathbb{N}$ be such that this is the case for all $j \in [3]$.

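Lemma 6.28. Let $\lambda/n \in R(t)$. Then $\mathrm{SR}(t^{\otimes n}) \ge \min_{i \in [3]} \dim [\lambda^{(i)}]$.

Proof. We have the restriction $t^{\otimes n} \ge P_\lambda t^{\otimes n} \neq 0$. Choose rank-one projections $A_j$ in the vector spaces $S_{\lambda^{(j)}}(\mathbb{C}^{n_j})$ with

\[ s = (\mathrm{id}_{[\lambda^{(1)}]} \otimes A_1) \otimes (\mathrm{id}_{[\lambda^{(2)}]} \otimes A_2) \otimes (\mathrm{id}_{[\lambda^{(3)}]} \otimes A_3)\, P_\lambda t^{\otimes n} \neq 0. \]

The tensor $s$ is invariant under $S_n$ acting diagonally on $(\mathbb{C}^{n_1})^{\otimes n} \otimes (\mathbb{C}^{n_2})^{\otimes n} \otimes (\mathbb{C}^{n_3})^{\otimes n}$. Thus the marginal spectra $\mathrm{spec}\, \rho^s_i$ are uniform. This implies $s$ is semistable. From [BCC+17, Theorem 4.6] follows that $\mathrm{SR}(s)$ equals $\min_{i \in [3]} \dim [\lambda^{(i)}]$.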

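Lemma 6.29. $\liminf_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n} \ge \min_{\theta \in \Theta} F^\theta(t)$.

Proof. Let $\varepsilon > 0$. For $n$ large enough, choose $\lambda/n \in R(t)$ as in Lemma 6.27. By Lemma 6.28, $\mathrm{SR}(t^{\otimes n}) \ge \min_{i \in [3]} \dim [\lambda^{(i)}]$. The right-hand side we lower bound by

\[ \min_{i \in [3]} \dim [\lambda^{(i)}] \ge \min_{i \in [3]} 2^{n H(\lambda^{(i)})}\, 2^{-o(n)} \ge 2^{n(\min_{\theta \in \Theta} E^\theta(t) - \varepsilon)}\, 2^{-o(n)}. \]

Then $\liminf_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n} \ge 2^{\min_{\theta \in \Theta} E^\theta(t) - \varepsilon}$.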
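The estimate $\dim [\lambda] \ge 2^{n H(\lambda/n) - o(n)}$ used above can be observed numerically. The sketch below computes $\dim [\lambda]$ with the hook length formula (a standard formula that we bring in for illustration; the text itself refers to Section 6.2 for the bound) and compares $\tfrac{1}{n} \log_2 \dim [\lambda]$ with the entropy of the normalised partition. The partitions chosen are arbitrary examples.

```python
from math import factorial, log2

def H(p):
    """Shannon entropy in bits."""
    return -sum(pi * log2(pi) for pi in p if pi > 0)

def dim_irrep(lam):
    """Dimension of the S_n irrep [lam] via the hook length formula."""
    n = sum(lam)
    # cols[j] = number of rows of the Young diagram with more than j boxes
    cols = [sum(1 for part in lam if part > j) for j in range(lam[0])]
    hooks = 1
    for i, part in enumerate(lam):
        for j in range(part):
            arm = part - j - 1
            leg = cols[j] - i - 1
            hooks *= arm + leg + 1
    return factorial(n) // hooks

def rate(lam):
    """Pair ((1/n) log2 dim[lam], H(lam/n))."""
    n = sum(lam)
    return log2(dim_irrep(lam)) / n, H([part / n for part in lam])

small = rate((6, 3, 1))      # n = 10
large = rate((60, 30, 10))   # n = 100, same normalised partition
```

Scaling the partition by ten keeps the entropy fixed while the gap between the two rates shrinks, in line with the $o(n)$ correction.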

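Lemma 6.30. For every $\theta \in \Theta$, $\limsup_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n} \le F^\theta(t)$.

Proof. Let $n \in \mathbb{N}$. Define $s_1, s_2, s_3 \in (\mathbb{C}^{n_1})^{\otimes n} \otimes (\mathbb{C}^{n_2})^{\otimes n} \otimes (\mathbb{C}^{n_3})^{\otimes n}$ by

\[ s_1 = \Bigl( \sum_{\substack{\lambda^{(1)} \vdash n \\ H(\lambda^{(1)}) \le E^\theta(t)}} P_{\lambda^{(1)}} \otimes \mathrm{Id} \otimes \mathrm{Id} \Bigr) t^{\otimes n}, \]

\[ s_2 = \Bigl( \sum_{\substack{\lambda^{(2)} \vdash n \\ H(\lambda^{(2)}) \le E^\theta(t)}} \mathrm{Id} \otimes P_{\lambda^{(2)}} \otimes \mathrm{Id} \Bigr) (t^{\otimes n} - s_1), \]

\[ s_3 = \Bigl( \sum_{\substack{\lambda^{(3)} \vdash n \\ H(\lambda^{(3)}) \le E^\theta(t)}} \mathrm{Id} \otimes \mathrm{Id} \otimes P_{\lambda^{(3)}} \Bigr) (t^{\otimes n} - s_1 - s_2). \]

Then $t^{\otimes n} = s_1 + s_2 + s_3$. The slice rank of an element in the image of $P_{\lambda^{(1)}} \otimes \mathrm{Id} \otimes \mathrm{Id}$ is at most $\dim\bigl([\lambda^{(1)}] \otimes S_{\lambda^{(1)}}(\mathbb{C}^{n_1})\bigr)$, which is at most $2^{n H(\lambda^{(1)}) + o(n)}$ (Section 6.2). Similarly for $\mathrm{Id} \otimes P_{\lambda^{(2)}} \otimes \mathrm{Id}$ and $\mathrm{Id} \otimes \mathrm{Id} \otimes P_{\lambda^{(3)}}$. The tensor $s_1$ is in the image of the sum $\sum_{\lambda^{(1)}} P_{\lambda^{(1)}} \otimes \mathrm{Id} \otimes \mathrm{Id}$ over $\lambda^{(1)} \vdash n$ with at most $n_1$ parts. There are at most $(n+1)^{n_1}$ such partitions. Thus $\mathrm{SR}(s_1) \le (n+1)^{n_1}\, 2^{n E^\theta(t) + o(n)}$. Similarly for $s_2$ and $s_3$. Therefore

\[ \limsup_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n} \le \limsup_{n \to \infty} \bigl( 3 (n+1)^{\max_{i \in [3]} n_i}\, 2^{n E^\theta(t) + o(n)} \bigr)^{1/n}. \qquad (6.6) \]

The right-hand side of (6.6) equals $F^\theta(t)$.

Proof of Theorem 6.26. Lemma 6.29 and Lemma 6.30 together prove Theorem 6.26.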

6.12 Conclusion

In this chapter we constructed the first infinite family of spectral points for 3-tensors over $\mathbb{C}$: the quantum functionals. For 30 years the only explicit spectral points known were the gauge points. The constructions in this chapter naturally generalise to higher-order tensors, for which we refer to our paper [CVZ18]. We do not know whether the quantum functionals are all spectral points for 3-tensors over $\mathbb{C}$. Finally, we showed that for complex tensors the asymptotic slice rank exists and equals the minimum value over the quantum functionals.

Chapter 7

Algebraic branching programs; approximation and nondeterminism

This chapter is based on joint work with Karl Bringmann and Christian Ikenmeyer [BIZ17].

7.1 Introduction

The study of asymptotic tensor rank in previous chapters was originally motivated by the study of the complexity of matrix multiplication in the algebraic circuit model, an algebraic model of computation. In this chapter we will study several other algebraic models of computation and algebraic complexity classes.

Formulas, the class $\mathrm{VP}_e$ and the determinant

An (arithmetic) formula is a rooted binary tree whose leaves are each labeled with a variable or a field constant, and whose root and intermediate vertices are labeled with either $+$ (addition) or $\times$ (multiplication). In the natural way, via recursion over the tree structure, a formula computes a multivariate polynomial $f$. The formula size of a multivariate polynomial $f$ is the smallest number of vertices required for any formula to compute $f$. Here is an example of a formula of size 7 computing the polynomial $(3 + x)(3 + y)$.

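[Figure: formula tree with root $\times$, whose two children are $+$ vertices, with leaves $3$, $x$ and $3$, $y$ respectively.]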
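The recursive definition of what a formula computes, and of its size, can be sketched in a few lines. The nested-tuple encoding below is our own illustrative representation, not one used in the text.

```python
# A leaf is a variable name (str) or a constant (int);
# an inner vertex is a tuple (op, left, right) with op in {"+", "*"}.
formula = ("*", ("+", 3, "x"), ("+", 3, "y"))

def size(node):
    """Number of vertices of the formula tree."""
    if isinstance(node, tuple):
        _, left, right = node
        return 1 + size(left) + size(right)
    return 1

def evaluate(node, env):
    """Value of the formula under the variable assignment env."""
    if isinstance(node, tuple):
        op, left, right = node
        a, b = evaluate(left, env), evaluate(right, env)
        return a + b if op == "+" else a * b
    return env[node] if isinstance(node, str) else node
```

For instance, `size(formula)` counts the four leaves, two additions and one multiplication of the example, and `evaluate(formula, {"x": 2, "y": 5})` returns $(3+2)(3+5)$.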

A sequence of multivariate polynomials $(f_n)_{n \in \mathbb{N}}$ is called a family. Valiant, in his seminal paper [Val79], introduced the complexity class $\mathrm{VP}_e$ that is defined as the set of all families whose formula size is polynomially bounded. (We say a sequence $(a_n)_n \in \mathbb{N}^{\mathbb{N}}$ of natural numbers is polynomially bounded if there exists a univariate polynomial $q$ such that $a_n \le q(n)$ for all $n$.) For example, the family $((x_1)^n + (x_2)^n + \cdots + (x_n)^n)_n$ is in $\mathrm{VP}_e$, because the formula size of this family grows quadratically.
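The quadratic growth can be seen by counting the vertices of the obvious formula, which multiplies out each power and then adds the results. The sketch below builds that naive formula in a nested-tuple encoding (our own illustrative choice) and counts its vertices; this gives an upper bound of $2n^2 - 1$ on the formula size, not the exact minimum.

```python
def size(node):
    """Number of vertices of a formula given as nested (op, left, right) tuples."""
    if isinstance(node, tuple):
        _, left, right = node
        return 1 + size(left) + size(right)
    return 1

def power(var, k):
    """Naive formula for var^k using k leaves and k-1 multiplications."""
    node = var
    for _ in range(k - 1):
        node = ("*", node, var)
    return node

def power_sum_formula(n):
    """Naive formula for x1^n + x2^n + ... + xn^n."""
    node = power("x1", n)
    for i in range(2, n + 1):
        node = ("+", node, power(f"x{i}", n))
    return node

sizes = [size(power_sum_formula(n)) for n in range(1, 6)]
```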

The smallest known formulas for the determinant family $\det_n$ have size $n^{O(\log n)}$. This follows from Berkowitz' algorithm [Ber84], which gives an algebraic circuit of depth $O(\log^2 n)$, and thus by expanding we get an algebraic formula of depth $O(\log^2 n)$, whose size is then trivially bounded by $2^{O(\log^2 n)} = n^{O(\log n)}$. It is a major open question in algebraic complexity theory whether formulas of polynomially bounded size exist for $\det_n$. This question can be phrased in terms of complexity classes as asking whether or not the inclusion $\mathrm{VP}_e \subseteq \mathrm{VP}_s$ is strict. (We will define $\mathrm{VP}_s$ shortly.)

Motivated by this question we study the closure class $\overline{\mathrm{VP}_e}$ of families of polynomials that can be approximated arbitrarily closely by families in $\mathrm{VP}_e$ (see Section 7.2.4 for the formal definition). Over the field $\mathbb{R}$ or $\mathbb{C}$ one can think of $\overline{\mathrm{VP}_e}$ as the set of families whose border formula size is polynomially bounded. The border formula size of a polynomial $f$ is the smallest number $c$ such that there exists a sequence $g_i$ of polynomials with formula size at most $c$ and $\lim_{i \to \infty} g_i = f$.

Continuous lower bounds

In algebraic complexity theory, problem instances correspond to vectors $v \in \mathbb{F}^n$. A complexity lower bound often takes the form of a function $f : \mathbb{F}^n \to \mathbb{F}$ that is zero on the vectors of "low complexity" and nonzero on $v$. We refer to Grochow [Gro13] for a discussion of settings where complexity lower bounds are obtained in this way (e.g. [NW97, Raz09, LO15, GKKS13, LMR13, BI13]). Over the complex numbers we can in fact assume that these functions $f$ are continuous [Gro13] (and even so-called highest-weight vector polynomials). If $C$ and $D$ are algebraic complexity classes with $C \subseteq D$ (for example $C = \mathrm{VP}_e$ and $D = \mathrm{VP}_s$), then a proof of separation $D \not\subseteq C$ in this continuous manner implies the stronger separation $D \not\subseteq \overline{C}$. In our case it is thus natural to aim for the separation $\mathrm{VP}_s \not\subseteq \overline{\mathrm{VP}_e}$ instead of the slightly weaker $\mathrm{VP}_s \not\subseteq \mathrm{VP}_e$, which provides further motivation for studying $\overline{\mathrm{VP}_e}$. This is exactly analogous to the geometric complexity theory approach of Mulmuley and Sohoni (see e.g. [MS01, MS08] and the exposition [BLMW11, Sec. 9]), which aims to prove the separation $\mathrm{VNP} \not\subseteq \overline{\mathrm{VP}_s}$ to attack Valiant's famous conjecture $\mathrm{VP}_s \neq \mathrm{VNP}$ [Val79]. (Here $\mathrm{VNP}$ is the class of p-definable families, see Section 7.2.4.)

New results in this chapter

We prove two new results in this chapter.

Algebraic branching programs of width 2. An algebraic branching program (abp) is a directed acyclic graph with a source vertex $s$ and a sink vertex $t$ that has affine linear forms over the base field $\mathbb{F}$ as edge labels. Moreover, we require that each vertex is labeled with an integer (its layer) and that edges in the abp only point from vertices in layer $i$ to vertices in layer $i+1$. The width of an abp is the cardinality of its largest layer. The size of an abp is the number of its vertices. The value of an abp is the sum of the values of all $s$–$t$-paths, where the value of an $s$–$t$-path is the product of its edge labels. We say that an abp computes its value. The class $\mathrm{VP}_s$ coincides with the class of families of polynomials that can be computed by abps of polynomially bounded size, see e.g. [Sap16].

For $k \in \mathbb{N}$ we introduce the class $\mathrm{VP}_k$ as the class of families of polynomials computable by width-$k$ abps of polynomially bounded size. It is well-known (see Lemma 7.2) that $\mathrm{VP}_k \subseteq \mathrm{VP}_e$ for all $k \ge 1$. In 1992 Ben-Or and Cleve [BOC92] showed that $\mathrm{VP}_k = \mathrm{VP}_e$ for all $k \ge 3$. In 2011 Allender and Wang [AW16] showed that width-2 abps cannot compute every polynomial, so in particular we have a strict inclusion $\mathrm{VP}_2 \subsetneq \mathrm{VP}_3$.

We prove that the closure of $\mathrm{VP}_2$ and the closure of $\mathrm{VP}_e$ are equal,

\[ \overline{\mathrm{VP}_2} = \overline{\mathrm{VP}_e}, \qquad (7.1) \]

when $\mathrm{char}(\mathbb{F}) \neq 2$. From (7.1) and the result of Allender and Wang follows directly that the inclusion $\mathrm{VP}_2 \subsetneq \overline{\mathrm{VP}_2}$ is strict. We have thus separated a complexity class from its approximation closure.

VNP via affine linear forms. Every algebraic complexity class has a nondeterministic closure (see Section 7.2.5 for the definition). The nondeterministic closure of $\mathrm{VP}$ is called $\mathrm{VNP}$ and the nondeterministic closure of $\mathrm{VP}_e$ is called $\mathrm{VNP}_e$. In 1980 Valiant [Val80] proved $\mathrm{VNP}_e = \mathrm{VNP}$. The nondeterministic closures of $\mathrm{VP}_1$ and $\mathrm{VP}_2$ we call $\mathrm{VNP}_1$ and $\mathrm{VNP}_2$. Using interpolation techniques we can deduce $\mathrm{VNP}_2 = \mathrm{VNP}$ from (7.1), provided the field is infinite. Using more sophisticated techniques we prove

\[ \mathrm{VNP}_1 = \mathrm{VNP}. \qquad (7.2) \]

From (7.2) easily follows $\mathrm{VP}_1 \subsetneq \mathrm{VNP}_1$. Also, from [AW16] we get $\mathrm{VP}_2 \subsetneq \mathrm{VNP}_2$. We have thus separated complexity classes from their nondeterministic closures.

Further related work

An excellent exposition on the history of small-width computation can be found in [AW16], along with an explicit polynomial that cannot be computed by width-2 abps, namely $x_1 x_2 + x_3 x_4 + \cdots + x_{15} x_{16}$. Saha, Saptharishi and Saxena in [SSS09, Cor. 14] showed that $x_1 x_2 + x_3 x_4 + x_5 x_6$ cannot be computed by width-2 abps that correspond to the iterated matrix multiplication of upper triangular matrices. Bürgisser in [Bur04] studied approximations in the model of general algebraic circuits, finding general upper bounds on the error degree. For most algebraic complexity classes $C$ the relation between $C$ and $\overline{C}$ has not been an active object of study. As pointed out recently by Forbes [For16], Nisan's result [Nis91] implies that $C = \overline{C}$ for $C$ being the class of size-$k$ algebraic branching programs on noncommuting variables. A structured study of $\overline{\mathrm{VP}}$ and $\overline{\mathrm{VP}_s}$ was started in [GMQ16]. Much work in lower bounds for algebraic approximation algorithms has been done in the area of bilinear complexity, dating back to [BCRL79, Str83, Lic84] and more recently e.g. [Lan06, LO15, HIL13, Zui17, LM16a].

This chapter is organised as follows. In Section 7.2 we discuss definitions and basic results. In Section 7.3 we prove that the approximation closure of $\mathrm{VP}_2$ equals the approximation closure of $\mathrm{VP}_e$, i.e. $\overline{\mathrm{VP}_2} = \overline{\mathrm{VP}_e}$. In Section 7.4 we prove that the nondeterminism closure of $\mathrm{VP}_1$ equals $\mathrm{VNP}$.

7.2 Definitions and basic results

We briefly recall the definition of circuits, formulas and branching programs, and we recall the definition of the corresponding complexity classes. Then we discuss some straightforward relationships among these classes and review the proof of a theorem by Ben-Or and Cleve, which inspired our work. Finally, we discuss the approximation closure and the nondeterminism closure for algebraic complexity classes.

7.2.1 Computational models

Let $x_1, x_2, \ldots$ be formal variables. By $\mathbb{F}[x]$ we mean the ring of polynomials over $\mathbb{F}$ with variables $x_1, x_2, \ldots, x_k$ with $k$ large enough.

A circuit is a directed acyclic graph $G$ with one or more source vertices and one sink vertex. Each source vertex is labelled by a variable $x_i$ or a constant $c \in \mathbb{F}$. The other vertices are labelled by either $+$ or $\times$ and have in-degree 2 (that is, fan-in 2). Each vertex computes an element in $\mathbb{F}[x]$ by recursion over the graph. The element computed by the sink is the element computed by the circuit. The size of a circuit is the number of vertices.

A formula is a circuit whose graph is a tree.

An algebraic branching program (abp) is a directed acyclic graph with a source vertex $s$ and a sink vertex $t$ that has affine linear forms $\alpha x_i + \beta$, $\alpha, \beta \in \mathbb{F}$, as edge labels. Moreover, we require that each vertex is labeled with an integer (its layer) and that edges in the abp only point from vertices in layer $i$ to vertices in layer $i+1$. The width of an abp is the cardinality of its largest layer. The size of an abp is the number of its vertices. The value of an abp is the sum of the values of all $s$–$t$-paths, where the value of an $s$–$t$-path is the product of its edge labels. We say that an abp computes its value.

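For example, the following abp has depth 5, width 3, and computes the polynomial $x_1 x_2 + x_2 + 2x_1 - 1$.

[Figure: an abp whose labelled edges carry the labels $x_1$, $2$, $x_1$, $x_2$ and $-1$.]

An abp $G$ corresponds naturally to an iterated product of matrices: for any two consecutive layers $L_i$, $L_{i+1}$ in $G$, let $M_i$ be the matrix $(e_{v,w})_{v \in L_i, w \in L_{i+1}}$ with $e_{v,w}$ the label of the edge from $v$ to $w$ (or 0 if there is no edge from $v$ to $w$). Then the value of $G$ equals the product $M_k \cdots M_2 M_1$.

For example, the above abp corresponds to the following iterated matrix product:

\[ \begin{pmatrix} 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} -1 & 0 & 0 \\ 0 & x_2 & 0 \\ 0 & 0 & x_1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ x_1 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}. \]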
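The correspondence between an abp and an iterated matrix product can be checked on this example by evaluating the four matrices at integer points. The sketch below uses plain nested lists; since the target polynomial has degree at most 1 in each variable, agreement on a grid with more than one value per variable already confirms the identity. The function names are illustrative.

```python
def matmul(A, B):
    """Product of two matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def abp_value(x1, x2):
    """Evaluate the example's iterated matrix product at the point (x1, x2)."""
    layers = [
        [[1, 1, 1]],
        [[-1, 0, 0], [0, x2, 0], [0, 0, x1]],
        [[1, 0, 0], [x1, 1, 0], [0, 0, 2]],
        [[1], [1], [1]],
    ]
    result = layers[0]
    for M in layers[1:]:
        result = matmul(result, M)
    return result[0][0]

def target(x1, x2):
    """The polynomial x1*x2 + x2 + 2*x1 - 1 from the text."""
    return x1 * x2 + x2 + 2 * x1 - 1
```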

7.2.2 Complexity classes $\mathrm{VP}$, $\mathrm{VP}_e$, $\mathrm{VP}_k$

The circuit size of a polynomial $f$ is the size of the smallest circuit computing $f$. The formula size of a polynomial $f$ is the size of the smallest formula computing $f$.

A family is a sequence $(f_n)_{n \in \mathbb{N}}$ of multivariate polynomials over $\mathbb{F}$. A class is a set of families. The class $\mathrm{VP}$ consists of all families $(f_n)$ with circuit size, degree and number of variables in $\mathrm{poly}(n)$. The class $\mathrm{VP}_e$ consists of all families $(f_n)$ with formula size in $\mathrm{poly}(n)$. (The origin of the subscript e in $\mathrm{VP}_e$ is the term "arithmetic expression".) Clearly $\mathrm{VP}_e \subseteq \mathrm{VP}$.

We introduce classes defined by abps. Let $k \ge 1$. The class $\mathrm{VP}_k$ consists of all families computed by polynomial-size width-$k$ abps with edges labelled by affine linear forms $\sum_i \alpha_i x_i + \beta$ with coefficients $\alpha_i, \beta \in \mathbb{F}$.

We note that the above classes depend on the choice of the ground field $\mathbb{F}$.

In our paper [BIZ17] we make a distinction between three different types of edge labels for abps. The class $\mathrm{VP}_k$ in this chapter corresponds to the class $\mathrm{VP}^g_k$ in [BIZ17].


7.2.3 The theorem of Ben-Or and Cleve

This subsection is about the relations among $\mathrm{VP}_k$ and $\mathrm{VP}_e$.

Lemma 7.1. $\mathrm{VP}_k \subseteq \mathrm{VP}_\ell$ when $k \le \ell$.

Proof. This is clearly true.

Lemma 7.2. $\mathrm{VP}_k \subseteq \mathrm{VP}_e$ for any $k$.

Proof. For the simple proof we refer to [BIZ17].

Ben-Or and Cleve [BOC92] showed that for $k \ge 3$ the classes $\mathrm{VP}_k$ and $\mathrm{VP}_e$ are in fact equal.

Theorem 7.3 (Ben-Or and Cleve [BOC92]). For $k \ge 3$, $\mathrm{VP}_k = \mathrm{VP}_e$.

We will review the construction of Ben-Or and Cleve here, because we will use it to prove Theorem 7.8 and Theorem 7.15. The following depth-reduction lemma for formulas by Brent is a crucial ingredient.

Lemma 7.4 (Brent [Bre74]). Let $f$ be an $n$-variate degree-$d$ polynomial computed by a formula of size $s$. Then $f$ can also be computed by a formula of size $\mathrm{poly}(s, n, d)$ and depth $O(\log s)$.

Proof. See the survey of Saptharishi [Sap16, Lemma 5.5] for a modern proof.

Proof of Theorem 7.3. Lemma 7.2 says $\mathrm{VP}_k \subseteq \mathrm{VP}_e$. We will prove the inclusion $\mathrm{VP}_e \subseteq \mathrm{VP}_3$, from which follows $\mathrm{VP}_e \subseteq \mathrm{VP}_k$ by Lemma 7.1, and thus $\mathrm{VP}_k = \mathrm{VP}_e$. For a polynomial $h$ define the matrix

\[ M(h) = \begin{pmatrix} 1 & 0 & 0 \\ h & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \]

which as part of an abp looks like

[Figure: the partial abp corresponding to $M(h)$; one of its edges is labelled $h$.]

We call the following matrices primitive:

• $M(h)$ with $h$ any variable or any constant in $\mathbb{F}$;

• the $3 \times 3$ permutation matrices, denoted by $M_\pi$ with $\pi \in S_3$;

• the diagonal matrices $M_{a,b,c} = \mathrm{diag}(a, b, c)$ with $a, b, c \in \mathbb{F}$.

The entries of the primitives are variables or constants in $\mathbb{F}$, making them suitable to use in the construction of a width-3 abp.

Let $(f_n) \in \mathrm{VP}_e$. Then $f_n$ can be computed by a formula of size $s(n) \in \mathrm{poly}(n)$. By Brent's depth-reduction theorem for formulas (Lemma 7.4), $f_n$ can be computed by a formula of size $\mathrm{poly}(n)$ and depth $d(n) \in O(\log s(n))$.

We will construct a sequence of primitives $A_1, \ldots, A_{m(n)}$ such that

\[ A_1 \cdots A_{m(n)} = \begin{pmatrix} 1 & 0 & 0 \\ f_n & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \]

with $m(n) \in O(4^{d(n)}) = \mathrm{poly}(n)$. Then

\[ f_n(x) = \begin{pmatrix} 1 & 1 & 1 \end{pmatrix} M_{-1,1,0}\, A_1 \cdots A_{m(n)} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \]

so $f_n(x)$ can be computed by a width-3 abp of length $\mathrm{poly}(n)$, proving the theorem.

To explain the construction, let $h$ be a polynomial and consider a formula computing $h$ of depth $d$. The goal is to construct (recursively on the formula structure) primitives $A_1, \ldots, A_m$ such that

\[ A_1 \cdots A_m = \begin{pmatrix} 1 & 0 & 0 \\ h & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \]

with $m \in O(4^d)$.

Suppose $h$ is a variable or a constant. Then $M(h)$ is itself a primitive matrix.

Suppose $h = f + g$ is a sum of two polynomials $f, g$, and suppose $M(f)$ and $M(g)$ can be written as a product of primitives. Then $M(f + g)$ equals a product of primitives, because $M(f + g) = M(f) M(g)$. This can easily be verified directly, or by noting that in the corresponding partial abps the top-bottom paths ($u_i$-$v_j$ paths) have the same value.

[Figure: two partial abps on top vertices $u_1, u_2, u_3$ and bottom vertices $v_1, v_2, v_3$; the left one has edges labelled $f$ and $g$, the right one a single edge labelled $f + g$.]

Suppose $h = fg$ is a product of two polynomials $f, g$, and suppose $M(f)$ and $M(g)$ can be written as a product of primitives. Then $M(fg)$ equals a product of primitives, because

\[ M(f \cdot g) = M_{(23)} \bigl( M_{1,-1,1}\, M_{(123)}\, M(g)\, M_{(132)}\, M(f) \bigr)^2 M_{(23)} \]

(here $(23) \in S_3$ denotes the transposition $1 \mapsto 1$, $2 \mapsto 3$, $3 \mapsto 2$ and $(123) \in S_3$ denotes the cyclic shift $1 \mapsto 2$, $2 \mapsto 3$, $3 \mapsto 1$), as can be verified either directly or by checking that in the corresponding partial abps the top-bottom paths ($u_i$-$v_j$ paths) have the same value.

[Figure: two partial abps on top vertices $u_1, u_2, u_3$ and bottom vertices $v_1, v_2, v_3$; the left one has edges labelled $f$, $-1$, $g$, $f$, $g$, $-1$, the right one a single edge labelled $f \cdot g$.]

This completes the construction
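Both matrix identities, and the read-out of $f_n$ from $M(f_n)$, can be checked by evaluating at integers. The sketch below assumes the convention that the permutation matrix $M_\pi$ sends the basis vector $e_i$ to $e_{\pi(i)}$; this convention is our assumption (the text does not spell it out), chosen because the product identity holds with it. All helper names are illustrative.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def product_of(*matrices):
    result = matrices[0]
    for M_ in matrices[1:]:
        result = matmul(result, M_)
    return result

def M(h):
    return [[1, 0, 0], [h, 1, 0], [0, 0, 1]]

def diag(a, b, c):
    return [[a, 0, 0], [0, b, 0], [0, 0, c]]

def perm(pi):
    """Permutation matrix sending e_i to e_{pi(i)} (assumed convention, 1-based)."""
    P = [[0] * 3 for _ in range(3)]
    for i, j in pi.items():
        P[j - 1][i - 1] = 1
    return P

P23 = perm({1: 1, 2: 3, 3: 2})
P123 = perm({1: 2, 2: 3, 3: 1})
P132 = perm({1: 3, 2: 1, 3: 2})   # inverse of (123)

def M_product(f, g):
    """Right-hand side of the product identity, evaluated at numbers f and g."""
    inner = product_of(diag(1, -1, 1), P123, M(g), P132, M(f))
    return product_of(P23, inner, inner, P23)

def read_out(f):
    """(1 1 1) * M_{-1,1,0} * M(f) * (1 1 1)^T."""
    return product_of([[1, 1, 1]], diag(-1, 1, 0), M(f), [[1], [1], [1]])[0][0]
```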

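The length $m$ of the construction is $m(h) = 1$ for $h$ a variable or constant, and recursively $m(f + g) = m(f) + m(g)$ and $m(f \cdot g) = 2(m(f) + m(g)) + O(1)$, where the $O(1)$ term accounts for the permutation and diagonal primitives. So $m \in O(4^d)$, where $d$ is the depth of the formula computing $h$.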

The above result of Ben-Or and Cleve (Theorem 7.3) raises the intriguing question whether the inclusion $\mathrm{VP}_2 \subseteq \mathrm{VP}_e$ is strict. Allender and Wang [AW16] show that the inclusion is indeed strict; in fact, they show that some polynomials cannot be computed by any width-2 abp.

Theorem 7.5 (Allender and Wang [AW16]). The polynomial

\[ x_1 x_2 + x_3 x_4 + \cdots + x_{15} x_{16} \]

cannot be computed by any width-2 abp. Therefore, we have the separation of classes $\mathrm{VP}_2 \subsetneq \mathrm{VP}_3 = \mathrm{VP}_e$.


724 Approximation closure C

We define the norm of a complex multivariate polynomial as the sum of the absolute values of its coefficients. This defines a topology on the polynomial ring $\mathbb{C}[x_1, \ldots, x_m]$. Given a complexity measure $L$, say abp size or formula size, there is a natural notion of approximate complexity that is called border complexity. Namely, a polynomial $f \in \mathbb{C}[x]$ has border complexity $L^{\mathrm{top}}$ at most $c$ if there is a sequence of polynomials $g_1, g_2, \ldots$ in $\mathbb{C}[x]$ converging to $f$ such that each $g_i$ satisfies $L(g_i) \le c$. It turns out that for reasonable classes over the field of complex numbers $\mathbb{C}$, this topological notion of approximation is equivalent to what we call algebraic approximation (see e.g. [Bur04]). Namely, a polynomial $f \in \mathbb{C}[x]$ satisfies $L(f)^{\mathrm{alg}} \le c$ iff there are polynomials $f_1, \ldots, f_e \in \mathbb{C}[x]$ such that the polynomial

\[ h = f + \varepsilon f_1 + \varepsilon^2 f_2 + \cdots + \varepsilon^e f_e \in \mathbb{C}[\varepsilon, x] \]

has complexity $L_{\mathbb{C}(\varepsilon)}(h) \le c$, where $\varepsilon$ is a formal variable and $L_{\mathbb{C}(\varepsilon)}(h)$ denotes the complexity of $h$ over the field extension $\mathbb{C}(\varepsilon)$. This algebraic notion of approximation makes sense over any base field, and we will use it in the statements and proofs of this chapter.

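Definition 7.6. Let $C(\mathbb{F})$ be a class over the field $\mathbb{F}$. We define the approximation closure $\overline{C}(\mathbb{F})$ as follows: a family $(f_n)$ over $\mathbb{F}$ is in $\overline{C}(\mathbb{F})$ if there are polynomials $f_{n,i}(x) \in \mathbb{F}[x]$ and a function $e : \mathbb{N} \to \mathbb{N}$ such that the family $(g_n)$ defined by

\[ g_n(x) = f_n(x) + \varepsilon f_{n,1}(x) + \varepsilon^2 f_{n,2}(x) + \cdots + \varepsilon^{e(n)} f_{n,e(n)}(x) \]

is in $C(\mathbb{F}(\varepsilon))$. We define the poly-approximation closure $\overline{C}^{\mathrm{poly}}(\mathbb{F})$ similarly, but with the additional requirement that $e(n) \in \mathrm{poly}(n)$. We call $e(n)$ the error degree.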

7.2.5 Nondeterminism closure $N(C)$

We introduce the nondeterminism closure for algebraic complexity classes.

Definition 7.7. Let $C$ be a class. The class $N(C)$ consists of families $(f_n)$ with the following property: there is a family $(g_n) \in C$ and $p(n), q(n) \in \mathrm{poly}(n)$ such that

\[ f_n(x) = \sum_{b \in \{0,1\}^{p(n)}} g_{q(n)}(b, x), \]

where $x$ and $b$ denote sequences of variables $x_1, x_2, \ldots$ and $b_1, b_2, \ldots, b_{p(n)}$. We say that $f(x)$ is a hypercube sum over $g$ and that $b_1, b_2, \ldots, b_{p(n)}$ are the hypercube variables. For any subscript $x$, we will use the notation $\mathrm{VNP}_x$ to denote $N(\mathrm{VP}_x)$. We remark that the map $C \mapsto N(C)$ trivially satisfies all properties of being a Kuratowski closure operator, i.e. $N(\emptyset) = \emptyset$, $C \subseteq N(C)$, $N(C \cup D) = N(C) \cup N(D)$ and $N(N(C)) = N(C)$.
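To make the notion of a hypercube sum concrete, here is a small sketch. The polynomial `g` below is a hypothetical example chosen for illustration (it does not appear in the text): $g(b, x) = \prod_i (1 + b_i (x_i - 1))$, whose hypercube sum over $b \in \{0,1\}^p$ is $\prod_i (1 + x_i)$. The polynomials are evaluated at an integer point instead of being handled symbolically.

```python
from itertools import product
from math import prod

def g(b, x):
    """Hypothetical example: prod_i (1 + b_i * (x_i - 1)).

    Each factor equals 1 when b_i = 0 and x_i when b_i = 1.
    """
    return prod(1 + b_i * (x_i - 1) for b_i, x_i in zip(b, x))

def hypercube_sum(g, p, x):
    """Sum g(b, x) over all assignments b in {0,1}^p."""
    return sum(g(b, x) for b in product((0, 1), repeat=p))

x = (2, 3, 4)
print(hypercube_sum(g, len(x), x))   # agrees with (1+2)*(1+3)*(1+4)
```

The sum has $2^p$ terms, so this brute-force evaluation only illustrates the definition; it says nothing about the complexity of the summed polynomial.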


7.3 Approximation closure of $\mathrm{VP}_2$

We show that every polynomial can be approximated by a width-2 abp. Even better, we show that every polynomial can be approximated by a width-2 abp of size polynomial in the formula size and with error degree polynomial in the formula size. This is the main result of the current chapter.

Theorem 7.8. $\mathrm{VP}_e \subseteq \overline{\mathrm{VP}_2}^{\mathrm{poly}}$ when $\operatorname{char}(\mathbb{F}) \neq 2$.

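Proof. For a polynomial $h$, define the matrix $M(h) = \begin{pmatrix} 1 & 0 \\ h & 1 \end{pmatrix}$. We call the following matrices primitives:

• $M(h)$ with $h$ any variable or constant in $\mathbb{F}$,

• $\begin{pmatrix} \frac{1}{2\varepsilon} & 0 \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} -2\varepsilon & 0 \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} -1 & \varepsilon \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} -1 & -\varepsilon \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$.

The entries of the primitives are variables or constants in the base field $\mathbb{F}(\varepsilon)$, making them suitable to use in a width-2 abp over the base field $\mathbb{F}(\varepsilon)$.

Let $(f_n) \in \mathrm{VP}_e$, so $f_n(x)$ can be computed by a formula of size $s(n) \in \mathrm{poly}(n)$. By Brent's depth reduction theorem for formulas (Lemma 7.4), $f_n$ can be computed by a formula of size $\mathrm{poly}(n)$ and depth $d(n) \in O(\log s(n))$.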

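We will construct a sequence of primitives $A_1, \ldots, A_{m(n)}$ such that

\[ A_1 \cdots A_{m(n)} = \begin{pmatrix} 1 & 0 \\ f_n & 1 \end{pmatrix} + \varepsilon \begin{pmatrix} f_{n,1,1,1} & f_{n,1,1,2} \\ f_{n,1,2,1} & f_{n,1,2,2} \end{pmatrix} + \varepsilon^2 \begin{pmatrix} f_{n,2,1,1} & f_{n,2,1,2} \\ f_{n,2,2,1} & f_{n,2,2,2} \end{pmatrix} + \cdots + \varepsilon^{e} \begin{pmatrix} f_{n,e,1,1} & f_{n,e,1,2} \\ f_{n,e,2,1} & f_{n,e,2,2} \end{pmatrix} \]

for some $f_{n,i,j,k} \in \mathbb{F}[x]$ with $m(n), e(n) \in O(8^{d(n)}) = \mathrm{poly}(n)$. Then

\[ \begin{pmatrix} 1 & 1 \end{pmatrix} \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} A_1 \cdots A_{m(n)} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = f_n(x) + O(\varepsilon), \]

so $f_n(x)$ can be approximated by a width-2 abp of length $\mathrm{poly}(n)$ and with error degree $\mathrm{poly}(n)$, proving the theorem.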

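We begin with the construction. Let $h$ be a polynomial and consider a formula computing $h$ of depth $d$. The goal is to construct, recursively on the tree structure of the formula, a sequence of primitives $A_1, \ldots, A_m$ such that for some $h_{i,j,k} \in \mathbb{F}[x]$

\[ A_1 \cdots A_m = \begin{pmatrix} 1 & 0 \\ h & 1 \end{pmatrix} + \varepsilon \begin{pmatrix} 0 & 0 \\ h_{1,2,1} & 0 \end{pmatrix} + \varepsilon^2 \begin{pmatrix} h_{2,1,1} & h_{2,1,2} \\ h_{2,2,1} & h_{2,2,2} \end{pmatrix} + \cdots + \varepsilon^{e} \begin{pmatrix} h_{e,1,1} & h_{e,1,2} \\ h_{e,2,1} & h_{e,2,2} \end{pmatrix} \tag{7.3} \]

with $m, e \in O(8^d)$. Notice the particular first-degree error pattern in (7.3), which our recursion will rely on.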


Suppose $h$ is a variable or a constant. Then $M(h)$ is itself a primitive satisfying (7.3).

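Suppose $h = f + g$ is a sum of two polynomials $f, g$, and suppose that

\[ F = \begin{pmatrix} 1 & 0 \\ f & 1 \end{pmatrix} + \varepsilon \begin{pmatrix} 0 & 0 \\ f' & 0 \end{pmatrix} + O(\varepsilon^2) \tag{7.4} \]

\[ G = \begin{pmatrix} 1 & 0 \\ g & 1 \end{pmatrix} + \varepsilon \begin{pmatrix} 0 & 0 \\ g' & 0 \end{pmatrix} + O(\varepsilon^2) \tag{7.5} \]

are products of primitives for some $f', g' \in \mathbb{F}[x]$. Then

\[ G \cdot F = \begin{pmatrix} 1 & 0 \\ f + g & 1 \end{pmatrix} + \varepsilon \begin{pmatrix} 0 & 0 \\ f' + g' & 0 \end{pmatrix} + O(\varepsilon^2) \]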

is a product of primitives satisfying (7.3).

Suppose $h = fg$ is a product of two polynomials, and suppose that $F$ and $G$ are of the form (7.4) and (7.5) and are products of primitives. We will construct $M((f + g)^2)$, $M(-f^2)$, $M(-g^2)$ approximately, in such a way that when we use the identity $(f + g)^2 - f^2 - g^2 = 2fg$, the error terms cancel properly. Define the expressions $\mathrm{sq}_+(A)$ and $\mathrm{sq}_-(A)$ by

\[ \mathrm{sq}_\pm(A) = \begin{pmatrix} -\varepsilon & 0 \\ 0 & 1 \end{pmatrix} \cdot A \cdot \begin{pmatrix} -1 & \pm\varepsilon \\ 0 & 1 \end{pmatrix} \cdot A \cdot \begin{pmatrix} \frac{1}{\varepsilon} & 0 \\ 0 & 1 \end{pmatrix}. \]

Then

\[ \mathrm{sq}_\pm(F) = \begin{pmatrix} 1 \mp \varepsilon f & 0 \\ \pm f^2 + O(\varepsilon) & 1 \pm \varepsilon f \end{pmatrix} + O(\varepsilon^2). \]

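We have

\[ \mathrm{sq}_-(G) \cdot \mathrm{sq}_-(F) \cdot \mathrm{sq}_+(G \cdot F) = \begin{pmatrix} 1 + \varepsilon g & 0 \\ -g^2 + O(\varepsilon) & 1 - \varepsilon g \end{pmatrix} \cdot \begin{pmatrix} 1 + \varepsilon f & 0 \\ -f^2 + O(\varepsilon) & 1 - \varepsilon f \end{pmatrix} \cdot \begin{pmatrix} 1 - \varepsilon (f + g) & 0 \\ (f + g)^2 + O(\varepsilon) & 1 + \varepsilon (f + g) \end{pmatrix} + O(\varepsilon^2), \]

which simplifies to

\[ \mathrm{sq}_-(G) \cdot \mathrm{sq}_-(F) \cdot \mathrm{sq}_+(G \cdot F) = \begin{pmatrix} 1 & 0 \\ 2fg + O(\varepsilon) & 1 \end{pmatrix} + O(\varepsilon^2). \]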


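We conclude

\[ \begin{aligned} &\begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix} \cdot \mathrm{sq}_-(G) \cdot \mathrm{sq}_-(F) \cdot \mathrm{sq}_+(G \cdot F) \cdot \begin{pmatrix} \frac{1}{2} & 0 \\ 0 & 1 \end{pmatrix} \\ &\quad = \begin{pmatrix} -2\varepsilon & 0 \\ 0 & 1 \end{pmatrix} \cdot G \cdot \begin{pmatrix} -1 & -\varepsilon \\ 0 & 1 \end{pmatrix} \cdot G \cdot \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \cdot F \cdot \begin{pmatrix} -1 & -\varepsilon \\ 0 & 1 \end{pmatrix} \cdot F \\ &\qquad \cdot \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \cdot G \cdot F \cdot \begin{pmatrix} -1 & \varepsilon \\ 0 & 1 \end{pmatrix} \cdot G \cdot F \cdot \begin{pmatrix} \frac{1}{2\varepsilon} & 0 \\ 0 & 1 \end{pmatrix} \\ &\quad = \begin{pmatrix} 1 & 0 \\ fg + O(\varepsilon) & 1 \end{pmatrix} + O(\varepsilon^2). \end{aligned} \]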
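The product step can be checked with exact arithmetic. The sketch below is an illustration under simplifying assumptions: $f$ and $g$ are evaluated at rational numbers, $F = M(f)$ and $G = M(g)$ are taken without error terms, and entries are stored as Laurent polynomials in $\varepsilon$ (dictionaries from exponent to coefficient). All helper names are ours.

```python
from fractions import Fraction
from functools import reduce

def poly(*terms):
    """Laurent polynomial in eps from (exponent, coefficient) pairs."""
    return {e: Fraction(c) for e, c in terms if c != 0}

def p_add(p, q):
    out = dict(p)
    for e, c in q.items():
        out[e] = out.get(e, 0) + c
    return {e: c for e, c in out.items() if c != 0}

def p_mul(p, q):
    out = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            out[e1 + e2] = out.get(e1 + e2, 0) + c1 * c2
    return {e: c for e, c in out.items() if c != 0}

def mat_mul(A, B):
    return [[p_add(p_mul(A[i][0], B[0][j]), p_mul(A[i][1], B[1][j]))
             for j in range(2)] for i in range(2)]

def chain(*mats):
    return reduce(mat_mul, mats)

ZERO, ONE = poly(), poly((0, 1))

def M(h):
    return [[ONE, ZERO], [poly((0, h)), ONE]]

def diag(p):
    """The primitive (p 0; 0 1)."""
    return [[p, ZERO], [ZERO, ONE]]

def shear(sign):
    """The primitive (-1 sign*eps; 0 1)."""
    return [[poly((0, -1)), poly((1, sign))], [ZERO, ONE]]

def approx_product(F, G):
    """The product of primitives from the displayed identity."""
    GF = mat_mul(G, F)
    return chain(
        diag(poly((1, -2))), G, shear(-1), G,
        diag(poly((0, -1))), F, shear(-1), F,
        diag(poly((0, -1))), GF, shear(+1), GF,
        diag(poly((-1, Fraction(1, 2)))),
    )

def coefficient(A, k):
    """Matrix of eps^k coefficients of A."""
    return [[A[i][j].get(k, Fraction(0)) for j in range(2)] for i in range(2)]

def readout(A):
    """(1 1) * diag(-1, 1) * A * (1 1)^T as a Laurent polynomial."""
    total = ZERO
    for j in range(2):
        total = p_add(total, p_add(p_mul(poly((0, -1)), A[0][j]), A[1][j]))
    return total

f, g = 3, 5
P = approx_product(M(f), M(g))
lowest = min(e for row in P for entry in row for e in entry)
print(lowest, coefficient(P, 0), coefficient(P, 1))
```

In this instance no negative powers of $\varepsilon$ survive, the constant term is $M(fg)$, and the coefficient of $\varepsilon$ is supported on the lower-left entry only, which is the error pattern required by (7.3).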

This completes the construction.

The length $m$ of the construction is $m(h) = 1$ for $h$ a variable or constant, and recursively $m(f + g) = m(f) + m(g)$, $m(f \cdot g) = 4(m(f) + m(g)) + 7$. We conclude $m \in O(8^d)$. The error degree $e$ of the construction satisfies the same recursion, so $e \in O(8^d)$.
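The recursion for $m$ is easy to tabulate. The following sketch uses a hypothetical encoding of formulas as nested tuples (not taken from the text) and evaluates the recursion on complete trees of multiplications, the worst case for a given depth. For these trees the recursion $m(d) = 8\,m(d-1) + 7$ with $m(0) = 1$ solves to $m(d) = 2 \cdot 8^d - 1$, consistent with $m \in O(8^d)$.

```python
def length(formula):
    """Number of primitives used, following the recursion for m.

    A leaf is any string (a variable or constant); an inner node is
    ('+', left, right) or ('*', left, right).
    """
    if isinstance(formula, str):
        return 1
    op, left, right = formula
    total = length(left) + length(right)
    return total if op == '+' else 4 * total + 7

def depth(formula):
    """Depth of the formula tree; leaves have depth 0."""
    if isinstance(formula, str):
        return 0
    return 1 + max(depth(formula[1]), depth(formula[2]))

def balanced_product(d):
    """Complete binary tree of multiplications of depth d."""
    if d == 0:
        return 'x'
    sub = balanced_product(d - 1)
    return ('*', sub, sub)

for d in range(5):
    tree = balanced_product(d)
    print(depth(tree), length(tree), 2 * 8 ** d - 1)
```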

Remark 7.9. The construction in the above proof of Theorem 7.8 is different from the construction in our paper [BIZ17]. The recursion in the above proof is simpler, while the construction in [BIZ17] has a better error degree and has a special form which relates it to a family of polynomials called continuants.

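Corollary 7.10. $\overline{\mathrm{VP}_2} = \overline{\mathrm{VP}_e}$ and $\overline{\mathrm{VP}_2}^{\mathrm{poly}} = \overline{\mathrm{VP}_e}^{\mathrm{poly}}$ when $\operatorname{char}(\mathbb{F}) \neq 2$.

Proof. We have $\mathrm{VP}_2 \subseteq \mathrm{VP}_e$ by Lemma 7.2. Taking closures on both sides, we obtain $\overline{\mathrm{VP}_2} \subseteq \overline{\mathrm{VP}_e}$ and $\overline{\mathrm{VP}_2}^{\mathrm{poly}} \subseteq \overline{\mathrm{VP}_e}^{\mathrm{poly}}$. When $\operatorname{char}(\mathbb{F}) \neq 2$, $\mathrm{VP}_e \subseteq \overline{\mathrm{VP}_2}^{\mathrm{poly}}$ (Theorem 7.8). Taking closures, it follows that $\overline{\mathrm{VP}_e} \subseteq \overline{\mathrm{VP}_2}$ and $\overline{\mathrm{VP}_e}^{\mathrm{poly}} \subseteq \overline{\mathrm{VP}_2}^{\mathrm{poly}}$.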

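Corollary 7.11. $\overline{\mathrm{VP}_2}^{\mathrm{poly}} = \mathrm{VP}_e$ when $\operatorname{char}(\mathbb{F}) \neq 2$ and $\mathbb{F}$ is infinite.

Proof. By Corollary 7.10, $\overline{\mathrm{VP}_2}^{\mathrm{poly}} = \overline{\mathrm{VP}_e}^{\mathrm{poly}}$. We prove $\overline{\mathrm{VP}_e}^{\mathrm{poly}} = \mathrm{VP}_e$ in Lemma 7.12 below.

Lemma 7.12. $\overline{\mathrm{VP}_e}^{\mathrm{poly}} = \mathrm{VP}_e$ when $\operatorname{char}(\mathbb{F}) \neq 2$ and $\mathbb{F}$ is infinite.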

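Proof. The inclusion $\mathrm{VP}_e \subseteq \overline{\mathrm{VP}_e}^{\mathrm{poly}}$ is trivially true. We prove the other direction. Let $(f_n) \in \overline{\mathrm{VP}_e}^{\mathrm{poly}}$. Then there are polynomials $f_{n,i}(x) \in \mathbb{F}[x]$ and $e(n) \in \mathrm{poly}(n)$ such that

\[ f_n(x) + \varepsilon f_{n,1}(x) + \varepsilon^2 f_{n,2}(x) + \cdots + \varepsilon^{e(n)} f_{n,e(n)}(x) \]

is computed by a poly-size formula $\Gamma$ over $\mathbb{F}(\varepsilon)$. Let $\alpha_0, \alpha_1, \ldots, \alpha_{e(n)}$ be distinct elements in $\mathbb{F}$ such that replacing $\varepsilon$ by $\alpha_j$ in $\Gamma$ is a valid substitution, i.e. not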


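causing division by zero. These $\alpha_j$ exist since our field is infinite by assumption. View

\[ g_n(\varepsilon) = f_n(x) + \varepsilon f_{n,1}(x) + \varepsilon^2 f_{n,2}(x) + \cdots + \varepsilon^{e(n)} f_{n,e(n)}(x) \]

as a polynomial in $\varepsilon$. The polynomial $g_n(\varepsilon)$ has degree at most $e(n)$, so we can write $g_n(\varepsilon)$ as follows (Lagrange interpolation on $e(n) + 1$ points):

\[ g_n(\varepsilon) = \sum_{j=0}^{e(n)} g_n(\alpha_j) \prod_{\substack{0 \le m \le e(n) \\ m \neq j}} \frac{\varepsilon - \alpha_m}{\alpha_j - \alpha_m}. \tag{7.6} \]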

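Clearly, $f_n(x) = g_n(0)$. However, replacing $\varepsilon$ by $0$ in $\Gamma$ is not a valid substitution in general. From (7.6) we see directly how to write $g_n(0)$ as a linear combination of the values $g_n(\alpha_j)$, namely

\[ g_n(0) = \sum_{j=0}^{e(n)} g_n(\alpha_j) \prod_{\substack{0 \le m \le e(n) \\ m \neq j}} \frac{-\alpha_m}{\alpha_j - \alpha_m}, \]

that is,

\[ g_n(0) = \sum_{j=0}^{e(n)} \beta_j\, g_n(\alpha_j) \quad \text{with} \quad \beta_j = \prod_{\substack{0 \le m \le e(n) \\ m \neq j}} \frac{\alpha_m}{\alpha_m - \alpha_j}. \]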

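The value $g_n(\alpha_j)$ is computed by the formula $\Gamma$ with $\varepsilon$ replaced by $\alpha_j$, which we denote by $\Gamma|_{\varepsilon = \alpha_j}$. Thus $f_n(x)$ is computed by the poly-size formula $\sum_{j=0}^{e(n)} \beta_j\, \Gamma|_{\varepsilon = \alpha_j}$. We conclude $(f_n) \in \mathrm{VP}_e$.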
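The interpolation argument can be illustrated on a toy instance. In the sketch below, `gamma` is a hypothetical stand-in for the formula $\Gamma$ (it is not from the text): it computes $((x + \varepsilon)^3 - x^3)/\varepsilon = 3x^2 + 3x\varepsilon + \varepsilon^2$ using a division by $\varepsilon$, so substituting $\varepsilon = 0$ directly is invalid. The error degree is 2, so three nonzero evaluation points suffice.

```python
from fractions import Fraction

def interpolation_weights(alphas):
    """beta_j = prod over m != j of alpha_m / (alpha_m - alpha_j)."""
    betas = []
    for j, a_j in enumerate(alphas):
        beta = Fraction(1)
        for m, a_m in enumerate(alphas):
            if m != j:
                beta *= Fraction(a_m) / (a_m - a_j)
        betas.append(beta)
    return betas

def gamma(eps, x):
    """Hypothetical formula over F(eps); it divides by eps."""
    return ((x + eps) ** 3 - x ** 3) / eps

def value_at_zero(x, error_degree=2):
    """Recover g(0) from error_degree + 1 evaluations at nonzero points."""
    alphas = [Fraction(k) for k in range(1, error_degree + 2)]
    betas = interpolation_weights(alphas)
    return sum(b * gamma(a, x) for a, b in zip(alphas, betas))

print(interpolation_weights([Fraction(1), Fraction(2), Fraction(3)]))
print(value_at_zero(Fraction(5)))   # agrees with 3 * 5**2
```

The weights $\beta_j$ depend only on the chosen points $\alpha_j$, not on the formula, which is why the resulting formula is only a factor $e(n) + 1$ larger than $\Gamma$.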

Remark 7.13. The statement of Lemma 7.12 also holds with $\mathrm{VP}_e$ replaced with $\mathrm{VP}_s$ or with $\mathrm{VP}$, by a similar proof.

7.4 Nondeterminism closure of $\mathrm{VP}_1$

Recall the definition of $\mathrm{VNP}_x = N(\mathrm{VP}_x)$ from Definition 7.7. Valiant proved the following characterisation of $\mathrm{VNP}$ in his seminal work [Val80]. See also [BCS97, Thm. 21.26], [Bur00, Thm. 2.13] and [MP08, Thm. 2].

Theorem 7.14 (Valiant [Val80]). $\mathrm{VNP}_e = \mathrm{VNP}$.

We strengthen Valiant's characterisation of $\mathrm{VNP}$ from $\mathrm{VNP}_e$ to $\mathrm{VNP}_1$.

Theorem 7.15. $\mathrm{VNP}_1 = \mathrm{VNP}$ when $\operatorname{char}(\mathbb{F}) \neq 2$.


The idea of the proof is "to simulate in $\mathrm{VNP}_1$" the primitives that we used in the proof of $\mathrm{VP}_e \subseteq \mathrm{VP}_3$ (Theorem 7.3).

Proof of Theorem 7.15. Clearly $\mathrm{VNP}_1 \subseteq \mathrm{VNP}$ by Lemma 7.2 and taking the nondeterminism closure $N$. We will prove that $\mathrm{VNP} \subseteq \mathrm{VNP}_1$. Recall that in the proof of $\mathrm{VP}_e \subseteq \mathrm{VP}_3$ (Theorem 7.3) we defined, for any polynomial $h$, the matrix

\[ M(h) = \begin{pmatrix} 1 & 0 & 0 \\ h & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \]

and we called the following matrices primitives:

• $M(h)$ with $h$ any variable or any constant in $\mathbb{F}$,

• the $3 \times 3$ permutation matrices, denoted by $M_\pi$ for $\pi \in S_3$,

• the diagonal matrices $M_{a,b,c} = \operatorname{diag}(a, b, c)$ with $a, b, c \in \mathbb{F}$.

In the proof of $\mathrm{VP}_e \subseteq \mathrm{VP}_3$ we constructed, for any family $(f_n) \in \mathrm{VP}_e$, a sequence of primitive matrices $A_{n,1}, \ldots, A_{n,t(n)}$ with $t(n) \in \mathrm{poly}(n)$ such that

\[ f_n(x) = \begin{pmatrix} 1 & 1 & 1 \end{pmatrix} M_{-1,1,0}\, A_{n,1} \cdots A_{n,t(n)} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}. \tag{7.7} \]

We will show $\mathrm{VP}_e \subseteq \mathrm{VNP}_1$ by constructing a hypercube sum over a width-1 abp that evaluates the right-hand side of (7.7). This implies $\mathrm{VNP}_e \subseteq \mathrm{VNP}_1$ by taking the $N$-closure. Then, by Valiant's Theorem 7.14, $\mathrm{VNP} \subseteq \mathrm{VNP}_1$.

Let $f(x)$ be a polynomial and let $A_1, \ldots, A_k$ be primitive matrices such that $f(x)$ is computed as

\[ f(x) = \begin{pmatrix} 1 & 1 & 1 \end{pmatrix} A_k \cdots A_1 \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}. \]

View this expression as a width-3 abp $G$ with vertex layers labeled as shown in the left-hand diagram in Fig. 7.1. Assume for simplicity that all edges between layers are present, possibly with label 0. The sum of the values of every $s$–$t$ path in $G$ equals $f(x)$:

\[ f(x) = \sum_{j \in [3]^k} A_k[j_k, j_{k-1}] \cdots A_1[j_2, j_1]. \tag{7.8} \]

We introduce some hypercube variables. To every vertex of $G$ except $s$ and $t$ we associate a bit; the bits in the $i$th layer we call $b_1[i], b_2[i], b_3[i]$. To an $s$–$t$ path in $G$ we associate an assignment of the $b_j[i]$ by setting the bits of vertices visited by the path to 1 and the others to 0. For example, in the right-hand


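[Figure 7.1. Left: a width-3 abp with source $s$, vertex layers $0, 1, 2, \ldots, k-1, k$ joined by the matrices $A_1, A_2, \ldots, A_k$, and sink $t$. Right: an $s$–$t$ path with the bit assignments $(1,0,0)$, $(0,1,0)$, $(0,1,0)$, $(0,0,1)$, $(0,1,0)$ on successive layers.]

Figure 7.1: Illustration of the layer labelling and the path labelling used in the proof of Theorem 7.15.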

diagram in Fig. 7.1 we show an $s$–$t$ path with the corresponding assignment of the bits $b_j[i]$. The assignments of the $b_j[i]$ corresponding to $s$–$t$ paths are precisely the assignments such that for every $i \in [k]$ exactly one of $b_1[i], b_2[i], b_3[i]$ equals 1. Let

\[ V(b_1, b_2, b_3) = \prod_{i \in [k]} \bigl( b_1[i] + b_2[i] + b_3[i] \bigr) \prod_{\substack{s, t \in [3] \\ s \neq t}} \bigl( 1 - b_s[i]\, b_t[i] \bigr). \tag{7.9} \]

Then the assignments of the $b_j[i]$ corresponding to $s$–$t$ paths are precisely the assignments such that $V(b_1, b_2, b_3) = 1$. Otherwise, $V(b_1, b_2, b_3) = 0$.

We will write $f(x)$ as a hypercube sum by replacing each $A_i[j_i, j_{i-1}]$ in (7.8) by a product of affine linear forms $S_i(A_i)$ with variables $b$ and $x$:

\[ \sum_{b} V(b_1, b_2, b_3)\, S_k(A_k) \cdots S_1(A_1). \]

Define the expression $\mathrm{Eq}(\alpha, \beta) = (1 - \alpha - \beta)(1 - \alpha - \beta)$ for $\alpha, \beta \in \{0, 1\}$. The expression $\mathrm{Eq}(\alpha, \beta)$ evaluates to 1 if $\alpha$ equals $\beta$ and evaluates to 0 otherwise.
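Both claims about $V$ and $\mathrm{Eq}$ can be confirmed by brute force over bits. The sketch below checks a single layer factor of $V$ from (7.9), taking the inner product over ordered pairs $s \neq t$; the function names are ours.

```python
from itertools import permutations, product

def eq(alpha, beta):
    """Eq(alpha, beta) = (1 - alpha - beta)^2 on bits."""
    return (1 - alpha - beta) * (1 - alpha - beta)

def layer_factor(bits):
    """The factor of V belonging to one layer i.

    bits = (b1[i], b2[i], b3[i]).
    """
    value = sum(bits)
    for s, t in permutations(range(3), 2):   # ordered pairs with s != t
        value *= 1 - bits[s] * bits[t]
    return value

def V(layers):
    """Product of the layer factors over all given layers."""
    result = 1
    for bits in layers:
        result *= layer_factor(bits)
    return result

for bits in product((0, 1), repeat=3):
    print(bits, layer_factor(bits))
print([(a, b, eq(a, b)) for a in (0, 1) for b in (0, 1)])
```

The layer factor is 1 exactly on the three one-hot assignments, so $V$ is 1 exactly when every layer has one selected vertex.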

• For any variable or constant $x$, define

\[ S_i(M(x)) = \bigl( 1 + (x - 1)(b_1[i-1] - b_1[i]) \bigr) \cdot \bigl( 1 - (1 - b_2[i])\, b_2[i-1] \bigr) \cdot \mathrm{Eq}\bigl( b_3[i-1], b_3[i] \bigr). \]


• For any permutation $\pi \in S_3$, define

\[ S_i(M_\pi) = \mathrm{Eq}\bigl( b_1[i-1], b_{\pi(1)}[i] \bigr) \cdot \mathrm{Eq}\bigl( b_2[i-1], b_{\pi(2)}[i] \bigr) \cdot \mathrm{Eq}\bigl( b_3[i-1], b_{\pi(3)}[i] \bigr). \]

• For any constants $a, b, c \in \mathbb{F}$, define

\[ S_i(M_{a,b,c}) = \bigl( a \cdot b_1[i-1] + b \cdot b_2[i-1] + c \cdot b_3[i-1] \bigr) \cdot \mathrm{Eq}\bigl( b_1[i-1], b_1[i] \bigr) \cdot \mathrm{Eq}\bigl( b_2[i-1], b_2[i] \bigr) \cdot \mathrm{Eq}\bigl( b_3[i-1], b_3[i] \bigr). \]

One verifies that

\[ f(x) = \sum_{b} V(b_1, b_2, b_3)\, S_k(A_k) \cdots S_1(A_1). \]

Some of the factors in the expressions for the $S_i(A_i)$ are not affine linear. As a final step, we apply the equality $1 + xy = \frac{1}{2} \sum_{c \in \{0,1\}} (x + 1 - 2c)(y + 1 - 2c)$ to write these factors as products of affine linear forms, introducing new hypercube variables.
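The linearisation identity follows by expanding the two summands: $(x+1)(y+1) + (x-1)(y-1) = 2xy + 2$. As a quick sanity check, the sketch below evaluates both sides at integer points and applies the identity to the factor $1 - (1 - b_2[i])\, b_2[i-1]$, read as $1 + xy$ with $x = b_2[i] - 1$ and $y = b_2[i-1]$. The function names are ours.

```python
from fractions import Fraction
from itertools import product

def linearised(x, y):
    """Right-hand side: (1/2) * sum over c in {0,1} of (x+1-2c)(y+1-2c)."""
    return Fraction(1, 2) * sum((x + 1 - 2 * c) * (y + 1 - 2 * c) for c in (0, 1))

def second_factor(b2_cur, b2_prev):
    """1 - (1 - b2[i]) * b2[i-1], written as 1 + x*y and then linearised."""
    return linearised(b2_cur - 1, b2_prev)

print(all(linearised(x, y) == 1 + x * y
          for x, y in product(range(-4, 5), repeat=2)))
print([(a, b, second_factor(a, b)) for a, b in product((0, 1), repeat=2)])
```

The factor $\frac{1}{2}$ in the identity is where the assumption $\operatorname{char}(\mathbb{F}) \neq 2$ is needed in this step.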

7.5 Conclusion

We finish with an overview of inclusions, equalities and separations among the classes $\mathrm{VP}_k$, $\mathrm{VP}_e$, $\mathrm{VP}$ and their approximation and nondeterminism closures (when $\operatorname{char}(\mathbb{F}) \neq 2$); see Fig. 7.2. The figure relies on the following two simple lemmas, of which proofs can be found in our paper [BIZ17].

Lemma 7.16 ([BIZ17, Prop. 5.10]). $\overline{\mathrm{VP}_1} = \mathrm{VP}_1$.

Lemma 7.17 ([BIZ17, Prop. 5.11]). $\mathrm{VP}_1 \subsetneq \mathrm{VNP}_1$ when $\operatorname{char}(\mathbb{F}) \neq 2$.


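[Figure 7.2: diagram with three rows of classes: $\mathrm{VP}_1, \mathrm{VP}_2, \mathrm{VP}_e, \mathrm{VP}$; their approximation closures $\overline{\mathrm{VP}_1}, \overline{\mathrm{VP}_2}, \overline{\mathrm{VP}_e}, \overline{\mathrm{VP}}$; and $\mathrm{VNP}_1, \mathrm{VNP}_2, \mathrm{VNP}_e, \mathrm{VNP}$. The relations between them ($=$, $\subseteq$, $\subsetneq$) carry the labels [AW16], 7.16, 7.17, 7.10, 7.15, [Val80] and [Val79].]

Figure 7.2: Overview of relations among the algebraic complexity classes $\mathrm{VP}_k$, $\mathrm{VP}_e$, $\mathrm{VP}$ and their approximation and nondeterminism closures (when $\operatorname{char}(\mathbb{F})$ is not 2). The relations without reference are either by definition or follow logically from the other relations.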

Bibliography

[AJRS13] Elizabeth S. Allman, Peter D. Jarvis, John A. Rhodes, and Jeremy G. Sumner. Tensor rank, invariants, inequalities, and applications. SIAM J. Matrix Anal. Appl., 34(3):1014–1045, 2013. doi:10.1137/120899066. p. 14.

[Alo98] Noga Alon. The Shannon capacity of a union. Combinatorica, 18(3):301–310, 1998. doi:10.1007/PL00009824. p. 37.

[ASU13] Noga Alon, Amir Shpilka, and Christopher Umans. On sunflowers and matrix multiplication. Comput. Complexity, 22(2):219–243, Jun 2013. doi:10.1007/s00037-013-0060-1. p. 48.

[AW16] Eric Allender and Fengming Wang. On the power of algebraic branching programs of width two. Comput. Complexity, 25(1):217–253, 2016. doi:10.1007/s00037-015-0114-7. p. 17, 109, 114, 123.

[AZ14] Martin Aigner and Günter M. Ziegler. Proofs from The Book. Springer-Verlag, Berlin, fifth edition, 2014. doi:10.1007/978-3-662-44205-0. p. 71.

[BC18] Boris Bukh and Christopher Cox. On a fractional version of Haemers' bound. arXiv, 2018. arXiv:1802.00476. p. 41, 42.

[BCC+17] Jonah Blasiak, Thomas Church, Henry Cohn, Joshua A. Grochow, Eric Naslund, William F. Sawin, and Chris Umans. On cap sets and the group-theoretic approach to matrix multiplication. Discrete Anal., 2017. arXiv:1605.06702, doi:10.19086/da.1245. p. 48, 83, 84, 104.


[BCPZ16] Harry Buhrman, Matthias Christandl, Christopher Perry, and Jeroen Zuiddam. Clean quantum and classical communication protocols. Phys. Rev. Lett., 117:230503, Dec 2016. doi:10.1103/PhysRevLett.117.230503. p. 1.

[BCRL79] Dario Bini, Milvio Capovani, Francesco Romani, and Grazia Lotti. $O(n^{2.7799})$ complexity for $n \times n$ approximate matrix multiplication. Inf. Process. Lett., 8(5):234–235, 1979. doi:10.1016/0020-0190(79)90113-3. p. 3, 110.

[BCS97] Peter Bürgisser, Michael Clausen, and M. Amin Shokrollahi. Algebraic complexity theory, volume 315 of Grundlehren Math. Wiss. Springer-Verlag, Berlin, 1997. doi:10.1007/978-3-662-03338-8. p. 4, 6, 48, 50, 66, 79, 119.

[BCSX10] Arnab Bhattacharyya, Victor Chen, Madhu Sudan, and Ning Xie. Testing Linear-Invariant Non-linear Properties: A Short Report, pages 260–268. Springer Berlin Heidelberg, Berlin, Heidelberg, 2010. doi:10.1007/978-3-642-16367-8_18. p. 48.

[BCZ17a] Markus Bläser, Matthias Christandl, and Jeroen Zuiddam. The border support rank of two-by-two matrix multiplication is seven. arXiv, 2017. arXiv:1705.09652. p. 1, 15.

[BCZ17b] Harry Buhrman, Matthias Christandl, and Jeroen Zuiddam. Nondeterministic Quantum Communication Complexity: the Cyclic Equality Game and Iterated Matrix Multiplication. In Christos H. Papadimitriou, editor, 8th Innovations in Theoretical Computer Science Conference (ITCS 2017), pages 24:1–24:18, 2017. arXiv:1603.03757, doi:10.4230/LIPIcs.ITCS.2017.24. p. 1, 15.

[Ber84] Stuart J. Berkowitz. On computing the determinant in small parallel time using a small number of processors. Inform. Process. Lett., 18(3):147–150, 1984. doi:10.1016/0020-0190(84)90018-8. p. 108.

[BI13] Peter Bürgisser and Christian Ikenmeyer. Explicit lower bounds via geometric complexity theory. Proceedings 45th Annual ACM Symposium on Theory of Computing 2013, pages 141–150, 2013. doi:10.1145/2488608.2488627. p. 108.

[Bin80] Dario Bini. Relations between exact and approximate bilinear algorithms. Applications. Calcolo, 17(1):87–97, 1980. doi:10.1007/BF02575865. p. 3.


[BIZ17] Karl Bringmann, Christian Ikenmeyer, and Jeroen Zuiddam. On Algebraic Branching Programs of Small Width. In Ryan O'Donnell, editor, 32nd Computational Complexity Conference (CCC 2017), pages 20:1–20:31, 2017. doi:10.4230/LIPIcs.CCC.2017.20. p. 1, 107, 111, 112, 118, 122.

[Bla13] Anna Blasiak. A graph-theoretic approach to network coding. PhD thesis, Cornell University, 2013. URL: https://ecommons.cornell.edu/bitstream/handle/1813/34147/ab675.pdf. p. 42.

[BLMW11] Peter Bürgisser, Joseph M. Landsberg, Laurent Manivel, and Jerzy Weyman. An overview of mathematical issues arising in the geometric complexity theory approach to VP ≠ VNP. SIAM J. Comput., 40(4):1179–1209, 2011. doi:10.1137/090765328. p. 108.

[BOC92] Michael Ben-Or and Richard Cleve. Computing algebraic formulas using a constant number of registers. SIAM J. Comput., 21(1):54–58, 1992. doi:10.1137/0221006. p. 17, 109, 112.

[BPR+00] Charles H. Bennett, Sandu Popescu, Daniel Rohrlich, John A. Smolin, and Ashish V. Thapliyal. Exact and asymptotic measures of multipartite pure-state entanglement. Phys. Rev. A, 63(1):012307, 2000. doi:10.1103/PhysRevA.63.012307. p. 48.

[Bre74] Richard P. Brent. The parallel evaluation of general arithmetic expressions. J. ACM, 21(2):201–206, April 1974. doi:10.1145/321812.321815. p. 112.

[Bri87] Michel Brion. Sur l'image de l'application moment. In Séminaire d'algèbre Paul Dubreil et Marie-Paule Malliavin (Paris, 1986), volume 1296 of Lecture Notes in Math., pages 177–192. Springer, Berlin, 1987. doi:10.1007/BFb0078526. p. 9, 93, 94.

[BS83] Eberhard Becker and Niels Schwartz. Zum Darstellungssatz von Kadison-Dubois. Arch. Math. (Basel), 40(5):421–428, 1983. doi:10.1007/BF01192806. p. 7, 12, 33.

[Bur90] Peter Bürgisser. Degenerationsordnung und Trägerfunktional bilinearer Abbildungen. PhD thesis, Universität Konstanz, 1990. http://nbn-resolving.de/urn:nbn:de:bsz:352-opus-20311. p. 57, 101.

[Bur00] Peter Bürgisser. Completeness and reduction in algebraic complexity theory, volume 7 of Algorithms and Computation in Mathematics. Springer-Verlag, Berlin, 2000. doi:10.1007/978-3-662-04179-6. p. 119.


[Bur04] Peter Bürgisser. The complexity of factors of multivariate polynomials. Found. Comput. Math., 4(4):369–396, 2004. doi:10.1007/s10208-002-0059-5. p. 110, 115.

[BX15] Arnab Bhattacharyya and Ning Xie. Lower bounds for testing triangle-freeness in boolean functions. Comput. Complexity, 24(1):65–101, 2015. doi:10.1007/s00037-014-0092-1. p. 48.

[BZ17] Jop Briët and Jeroen Zuiddam. On the orthogonal rank of Cayley graphs and impossibility of quantum round elimination. Quantum Inf. Comput., 17(1&2), 2017. URL: http://www.rintonpress.com/xxqic17/qic-17-12/0106-0116.pdf, arXiv:1608.06113. p. 2.

[CHM07] Matthias Christandl, Aram W. Harrow, and Graeme Mitchison. Nonzero Kronecker coefficients and what they tell us about spectra. Comm. Math. Phys., 270(3):575–585, 2007. doi:10.1007/s00220-006-0157-3. p. 90.

[CJZ18] Matthias Christandl, Asger Kjærulff Jensen, and Jeroen Zuiddam. Tensor rank is not multiplicative under the tensor product. Linear Algebra Appl., 543:125–139, 2018. doi:10.1016/j.laa.2017.12.020. p. 2, 15.

[CKSV16] Suryajith Chillara, Mrinal Kumar, Ramprasad Saptharishi, and V. Vinay. The chasm at depth four, and tensor rank: Old results, new insights. arXiv, 2016. arXiv:1606.04200. p. 15.

[CLP17] Ernie Croot, Vsevolod F. Lev, and Péter Pál Pach. Progression-free sets in $\mathbb{Z}_4^n$ are exponentially small. Ann. of Math. (2), 185(1):331–337, 2017. doi:10.4007/annals.2017.185.1.7. p. 48, 81.

[CM06] Matthias Christandl and Graeme Mitchison. The spectra of quantum states and the Kronecker coefficients of the symmetric group. Comm. Math. Phys., 261(3):789–797, 2006. doi:10.1007/s00220-005-1435-1. p. 91.

[CMR+14] Toby Cubitt, Laura Mančinska, David E. Roberson, Simone Severini, Dan Stahlke, and Andreas Winter. Bounds on entanglement-assisted source-channel coding via the Lovász theta number and its variants. IEEE Trans. Inform. Theory, 60(11):7330–7344, 2014. arXiv:1310.7120, doi:10.1109/TIT.2014.2349502. p. 42.


[CT12] Thomas M. Cover and Joy A. Thomas. Elements of information theory. John Wiley & Sons, 2012. p. 60.

[CU13] Henry Cohn and Christopher Umans. Fast matrix multiplication using coherent configurations. In Proceedings of the Twenty-Fourth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1074–1086. SIAM, 2013. p. 15.

[CVZ16] Matthias Christandl, Péter Vrana, and Jeroen Zuiddam. Asymptotic tensor rank of graph tensors: beyond matrix multiplication. arXiv, 2016. arXiv:1609.07476. p. 2, 65, 67, 79, 85.

[CVZ18] Matthias Christandl, Péter Vrana, and Jeroen Zuiddam. Universal points in the asymptotic spectrum of tensors. In Proceedings of 50th Annual ACM SIGACT Symposium on the Theory of Computing (STOC'18). ACM, New York, 2018. arXiv:1709.07851, doi:10.1145/3188745.3188766. p. 2, 47, 65, 87, 88, 96, 103, 105.

[CW82] Don Coppersmith and Shmuel Winograd. On the asymptotic complexity of matrix multiplication. SIAM J. Comput., 11(3):472–492, 1982. doi:10.1137/0211038. p. 3.

[CW87] Don Coppersmith and Shmuel Winograd. Matrix multiplication via arithmetic progressions. In Proceedings of the nineteenth annual ACM symposium on Theory of computing, pages 1–6. ACM, 1987. p. 3.

[CW90] Don Coppersmith and Shmuel Winograd. Matrix multiplication via arithmetic progressions. J. Symbolic Comput., 9(3):251–280, 1990. doi:10.1016/S0747-7171(08)80013-2. p. 4, 6, 8, 10, 48, 67.

[CZ18] Matthias Christandl and Jeroen Zuiddam. Tensor surgery and tensor rank. Comput. Complexity, Mar 2018. doi:10.1007/s00037-018-0164-8. p. 2, 86.

[Dra15] Jan Draisma. Multilinear Algebra and Applications (lecture notes), 2015. URL: https://mathsites.unibe.ch/jdraisma/publications/mlappl.pdf. p. 15.

[DVC00] Wolfgang Dür, Guifré Vidal, and Juan Ignacio Cirac. Three qubits can be entangled in two inequivalent ways. Phys. Rev. A (3), 62(6):062314, 12, 2000. doi:10.1103/PhysRevA.62.062314. p. 48.


[Ede04] Yves Edel. Extensions of generalized product caps. Des. Codes Cryptogr., 31(1):5–14, 2004. doi:10.1023/A:1027365901231. p. 48, 83.

[EG17] Jordan S. Ellenberg and Dion Gijswijt. On large subsets of $\mathbb{F}_q^n$ with no three-term arithmetic progression. Ann. of Math. (2), 185(1):339–343, 2017. doi:10.4007/annals.2017.185.1.8. p. 10, 48, 81, 83, 84.

[FK14] Hu Fu and Robert Kleinberg. Improved lower bounds for testing triangle-freeness in boolean functions via fast matrix multiplication. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2014), pages 669–676, 2014. doi:10.4230/LIPIcs.APPROX-RANDOM.2014.669. p. 48.

[For16] Michael Forbes. Some concrete questions on the border complexity of polynomials. Presentation given at the Workshop on Algebraic Complexity Theory WACT 2016 in Tel Aviv, https://www.youtube.com/watch?v=1HMogQIHT6Q, 2016. p. 110.

[Fra02] Matthias Franz. Moment polytopes of projective G-varieties and tensor products of symmetric group representations. J. Lie Theory, 12(2):539–549, 2002. URL: http://emis.ams.org/journals/JLT/vol.12_no.2/16.html. p. 93, 94.

[Fri17] Tobias Fritz. Resource convertibility and ordered commutative monoids. Math. Structures Comput. Sci., 27(6):850–938, 2017. doi:10.1017/S0960129515000444. p. 37.

[Ful97] William Fulton. Young tableaux, volume 35 of London Mathematical Society Student Texts. Cambridge University Press, Cambridge, 1997. With applications to representation theory and geometry. p. 88.

[GKKS13] Ankit Gupta, Pritish Kamath, Neeraj Kayal, and Ramprasad Saptharishi. Approaching the chasm at depth four. In 2013 IEEE Conference on Computational Complexity (CCC 2013), pages 65–73. IEEE Computer Soc., Los Alamitos, CA, 2013. doi:10.1109/CCC.2013.16. p. 108.

[GMQ16] Joshua A. Grochow, Ketan D. Mulmuley, and Youming Qiao. Boundaries of VP and VNP. In Ioannis Chatzigiannakis, Michael Mitzenmacher, Yuval Rabani, and Davide Sangiorgi, editors, 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016), volume 55, pages 34:1–34:14, 2016. arXiv:1605.02815, doi:10.4230/LIPIcs.ICALP.2016.34. p. 110.

[Gro13] Joshua A. Grochow. Unifying and generalizing known lower bounds via geometric complexity theory. arXiv, 2013. arXiv:1304.6333. p. 108.

[GW09] Roe Goodman and Nolan R. Wallach. Symmetry, representations, and invariants, volume 255 of Graduate Texts in Mathematics. Springer, Dordrecht, 2009. doi:10.1007/978-0-387-79852-3. p. 88.

[Hae79] Willem Haemers. On some problems of Lovász concerning the Shannon capacity of a graph. IEEE Trans. Inform. Theory, 25(2):231–232, 1979. doi:10.1109/TIT.1979.1056027. p. 37, 40, 42.

[Has90] Johan Håstad. Tensor rank is NP-complete. J. Algorithms, 11(4):644–654, 1990. doi:10.1016/0196-6774(90)90014-6. p. 47.

[HHHH09] Ryszard Horodecki, Paweł Horodecki, Michał Horodecki, and Karol Horodecki. Quantum entanglement. Rev. Modern Phys., 81(2):865–942, 2009. doi:10.1103/RevModPhys.81.865. p. 48.

[HIL13] Jonathan D. Hauenstein, Christian Ikenmeyer, and Joseph M. Landsberg. Equations for lower bounds on border rank. Exp. Math., 22(4):372–383, 2013. doi:10.1080/10586458.2013.825892. p. 15, 110.

[Hum75] James E. Humphreys. Linear algebraic groups. Springer-Verlag, New York-Heidelberg, 1975. Graduate Texts in Mathematics, No. 21. p. 93.

[HX17] Ishay Haviv and Ning Xie. Sunflowers and testing triangle-freeness of functions. Comput. Complexity, 26(2):497–530, Jun 2017. doi:10.1007/s00037-016-0138-7. p. 48.

[Ike13] Christian Ikenmeyer. Geometric complexity theory, tensor rank, and Littlewood–Richardson coefficients. PhD thesis, Universität Paderborn, 2013. p. 14.

[Kar72] Richard M. Karp. Reducibility among combinatorial problems. In Complexity of computer computations (Proc. Sympos., IBM Thomas J. Watson Res. Center, Yorktown Heights, N.Y., 1972), pages 85–103. Plenum, New York, 1972. p. 36.


[Knu94] Donald E. Knuth. The sandwich theorem. Electron. J. Combin., 1, 1994. URL: http://www.combinatorics.org/Volume_1/Abstracts/v1i1a1.html. p. 41.

[Kra84] Hanspeter Kraft. Geometrische Methoden in der Invariantentheorie. Springer, 1984. doi:10.1007/978-3-663-10143-7. p. 50, 88, 93.

[KS08] Tali Kaufman and Madhu Sudan. Algebraic property testing: The role of invariance. In Proceedings of the Fortieth Annual ACM Symposium on Theory of Computing, STOC '08, pages 403–412, New York, NY, USA, 2008. ACM. doi:10.1145/1374376.1374434. p. 48.

[KSS16] Robert Kleinberg, William F. Sawin, and David E. Speyer. The growth rate of tri-colored sum-free sets. arXiv, 2016. arXiv:1607.00047. p. 48, 79, 83.

[Lan06] Joseph M. Landsberg. The border rank of the multiplication of $2 \times 2$ matrices is seven. J. Amer. Math. Soc., 19(2):447–459, 2006. doi:10.1090/S0894-0347-05-00506-0. p. 110.

[LG14] François Le Gall. Powers of tensors and fast matrix multiplication. In ISSAC 2014: Proceedings of the 39th International Symposium on Symbolic and Algebraic Computation, pages 296–303. ACM, New York, 2014. doi:10.1145/2608628.2608664. p. 4, 6, 8, 48, 85.

[Lic84] Thomas Lickteig. A note on border rank. Inf. Process. Lett., 18(3):173–178, 1984. doi:10.1016/0020-0190(84)90023-1. p. 110.

[LM16a] Joseph M. Landsberg and Mateusz Michałek. A $2n^2 - \log(n) - 1$ lower bound for the border rank of matrix multiplication. arXiv, 2016. arXiv:1608.07486. p. 110.

[LM16b] Joseph M. Landsberg and Mateusz Michałek. Abelian tensors. J. Math. Pures Appl., 2016. doi:10.1016/j.matpur.2016.11.004. p. 14.

[LMR13] Joseph M. Landsberg, Laurent Manivel, and Nicolas Ressayre. Hypersurfaces with degenerate duals and the geometric complexity theory program. Comment. Math. Helv., 88(2):469–484, 2013. doi:10.4171/CMH/292. p. 108.

[LO15] Joseph M. Landsberg and Giorgio Ottaviani. New lower bounds for the border rank of matrix multiplication. Theory Comput., 11:285–298, 2015. arXiv:1112.6007, doi:10.4086/toc.2015.v011a011. p. 108, 110.

[Lov79] László Lovász. On the Shannon capacity of a graph. IEEE Trans. Inform. Theory, 25(1):1–7, 1979. doi:10.1109/TIT.1979.1055985. p. 13, 35, 41.

[Mar08] Murray Marshall. Positive polynomials and sums of squares, volume 146 of Mathematical Surveys and Monographs. American Mathematical Society, Providence, RI, 2008. doi:10.1090/surv/146. p. 34.

[MP71] Robert J. McEliece and Edward C. Posner. Hide and seek, data storage, and entropy. The Annals of Mathematical Statistics, 42(5):1706–1716, 1971. doi:10.1214/aoms/1177693169. p. 41.

[MP08] Guillaume Malod and Natacha Portier. Characterizing Valiant's algebraic complexity classes. J. Complexity, 24(1):16–38, 2008. doi:10.1016/j.jco.2006.09.006. p. 119.

[MS01] Ketan D. Mulmuley and Milind Sohoni. Geometric complexity theory. I. An approach to the P vs. NP and related problems. SIAM J. Comput., 31(2):496–526, 2001. doi:10.1137/S009753970038715X. p. 14, 108.

[MS08] Ketan D. Mulmuley and Milind Sohoni. Geometric complexity theory. II. Towards explicit obstructions for embeddings among class varieties. SIAM J. Comput., 38(3):1175–1206, 2008. doi:10.1137/080718115. p. 108.

[Nes84] Linda Ness. A stratification of the null cone via the moment map. Amer. J. Math., 106(6):1281–1329, 1984. With an appendix by David Mumford. doi:10.2307/2374395. p. 9, 93, 94.

[Nis91] Noam Nisan. Lower bounds for non-commutative computation. In Proceedings of the twenty-third annual ACM symposium on Theory of computing, pages 410–418. ACM, 1991. doi:10.1145/103418.103462. p. 110.

[Nor16] Sergey Norin. A distribution on triples with maximum entropy marginal. arXiv, 2016. arXiv:1608.00243. p. 83.

[NW97] Noam Nisan and Avi Wigderson. Lower bounds on arithmetic circuits via partial derivatives. Comput. Complexity, 6(3):217–234, 1996/97. doi:10.1007/BF01294256. p. 108.


[Pan78] Victor Ya. Pan. Strassen's algorithm is not optimal: Trilinear technique of aggregating, uniting and canceling for constructing fast algorithms for matrix operations. In 19th Annual Symposium on Foundations of Computer Science (Ann Arbor, Mich., 1978), pages 166–176. IEEE, Long Beach, Calif., 1978. p. 3.

[Pan80] Victor Ya. Pan. New fast algorithms for matrix operations. SIAM J. Comput., 9(2):321–342, 1980. doi:10.1137/0209027. p. 3.

[Pan81] Victor Ya. Pan. New combinations of methods for the acceleration of matrix multiplication. Comput. Math. Appl., 7(1):73–125, 1981. doi:10.1016/0898-1221(81)90009-2. p. 3.

[Pan84] Victor Ya. Pan. How to multiply matrices faster, volume 179 of Lecture Notes in Computer Science. Springer-Verlag, Berlin, 1984. doi:10.1007/3-540-13866-8. p. 3.

[Pan18] Victor Ya. Pan. Fast feasible and unfeasible matrix multiplication. arXiv, 2018. arXiv:1804.04102. p. 6.

[PD01] Alexander Prestel and Charles N. Delzell. Positive polynomials. Springer Monographs in Mathematics. Springer-Verlag, Berlin, 2001. From Hilbert's 17th problem to real algebra. doi:10.1007/978-3-662-04648-7. p. 34.

[Peb16] Luke Pebody. Proof of a conjecture of Kleinberg-Sawin-Speyer. arXiv, 2016. arXiv:1608.05740. p. 83.

[PS98] George Pólya and Gábor Szegő. Problems and theorems in analysis. I. Classics in Mathematics. Springer-Verlag, Berlin, 1998. Series, integral calculus, theory of functions. Translated from the German by Dorothee Aeppli. Reprint of the 1978 English translation. doi:10.1007/978-3-642-61905-2. p. 21.

[Raz09] Ran Raz. Multi-linear formulas for permanent and determinant are of super-polynomial size. J. ACM, 56(2):Art. 8, 17, 2009. doi:10.1145/1502793.1502797. p. 108.

[Raz13] Ran Raz. Tensor-rank and lower bounds for arithmetic formulas. J. ACM, 60(6):Art. 40, 15, 2013. doi:10.1145/2535928. p. 14.

[Rom82] Francesco Romani. Some properties of disjoint sums of tensors related to matrix multiplication. SIAM J. Comput., 11(2):263–267, 1982. doi:10.1137/0211020. p. 3.


[Sap16] Ramprasad Saptharishi. A survey of lower bounds in arithmetic circuit complexity 3.0.2, 2016. Online survey, URL: https://github.com/dasarpmar/lowerbounds-survey. p. 6, 17, 109, 112.

[Sch81] Arnold Schönhage. Partial and total matrix multiplication. SIAM J. Comput., 10(3):434–455, 1981. p. 3.

[Sch03] Alexander Schrijver. Combinatorial optimization: polyhedra and efficiency, volume 24. Springer Science & Business Media, 2003. p. 37, 41.

[Sha56] Claude E. Shannon. The zero error capacity of a noisy channel. Institute of Radio Engineers Transactions on Information Theory, IT-2(September):8–19, 1956. doi:10.1109/TIT.1956.1056798. p. 13, 35.

[Sha09] Asaf Shapira. Green's conjecture and testing linear-invariant properties. In Proceedings of the Forty-first Annual ACM Symposium on Theory of Computing, STOC '09, pages 159–166, New York, NY, USA, 2009. ACM. doi:10.1145/1536414.1536438. p. 48.

[Shi16] Yaroslav Shitov. How hard is the tensor rank? arXiv, 2016. arXiv:1611.01559. p. 47.

[Sin64] Richard C. Singleton. Maximum distance q-nary codes. IEEE Trans. Information Theory, IT-10:116–118, 1964. doi:10.1109/TIT.1964.1053661. p. 101.

[SOK14] Adam Sawicki, Michał Oszmaniec, and Marek Kuś. Convexity of momentum map, Morse index, and quantum entanglement. Rev. Math. Phys., 26(3):1450004, 39, 2014. doi:10.1142/S0129055X14500044. p. 9.

[SSS09] Chandan Saha, Ramprasad Saptharishi, and Nitin Saxena. The power of depth 2 circuits over algebras. In IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science, volume 4, pages 371–382, 2009. arXiv:0904.2058, doi:10.4230/LIPIcs.FSTTCS.2009.2333. p. 109.

[Sto10] Andrew James Stothers. On the complexity of matrix multiplication. PhD thesis, University of Edinburgh, 2010. http://hdl.handle.net/1842/4734. p. 4, 6, 8, 48.


[Str69] Volker Strassen. Gaussian elimination is not optimal. Numer. Math., 13(4):354–356, 1969. doi:10.1007/BF02165411. p. 3, 5

[Str83] Volker Strassen. Rank and optimal computation of generic tensors. Linear Algebra Appl., 52/53:645–685, 1983. doi:10.1016/0024-3795(83)80041-X. p. 110

[Str86] Volker Strassen. The asymptotic spectrum of tensors and the exponent of matrix multiplication. In Proceedings of the 27th Annual Symposium on Foundations of Computer Science, SFCS ’86, pages 49–54, Washington, DC, USA, 1986. IEEE Computer Society. doi:10.1109/SFCS.1986.52. p. 4, 7

[Str87] Volker Strassen. Relative bilinear complexity and matrix multiplication. J. Reine Angew. Math., 375/376:406–443, 1987. doi:10.1515/crll.1987.375-376.406. p. 3, 4, 49, 67

[Str88] Volker Strassen. The asymptotic spectrum of tensors. J. Reine Angew. Math., 384:102–152, 1988. doi:10.1515/crll.1988.384.102. p. 4, 7, 12, 19, 26, 27, 28, 29, 30, 32, 33, 49, 50, 51

[Str91] Volker Strassen. Degeneration and complexity of bilinear maps: some asymptotic spectra. J. Reine Angew. Math., 413:127–180, 1991. doi:10.1515/crll.1991.413.127. p. 3, 4, 10, 48, 49, 52, 55, 56, 57, 66, 67, 81, 82

[Str94] Volker Strassen. Algebra and complexity. In First European Congress of Mathematics, Vol. II (Paris, 1992), volume 120 of Progr. Math., pages 429–446. Birkhäuser, Basel, 1994. doi:10.1007/s10107-008-0221-1. p. 67

[Str05] Volker Strassen. Komplexität und Geometrie bilinearer Abbildungen. Jahresber. Deutsch. Math.-Verein., 107(1):3–31, 2005. p. 4, 88, 94, 95, 100, 101

[Tao08] Terence Tao. Structure and randomness: pages from year one of a mathematical blog. American Mathematical Soc., 2008. p. 48

[Tao16] Terence Tao. A symmetric formulation of the Croot–Lev–Pach–Ellenberg–Gijswijt capset bound. https://terrytao.wordpress.com, 2016. p. 48, 58, 81, 84

[Tob91] Verena Tobler. Spezialisierung und Degeneration von Tensoren. PhD thesis, Universität Konstanz, 1991. http://nbn-resolving.de/urn:nbn:de:bsz:352-opus-20324. p. 57


[TS16] Terence Tao and Will Sawin. Notes on the “slice rank” of tensors. https://terrytao.wordpress.com, 2016. p. 48, 58

[Val79] Leslie G. Valiant. Completeness classes in algebra. In Conference Record of the Eleventh Annual ACM Symposium on Theory of Computing (Atlanta, Ga., 1979), pages 249–261. ACM, New York, 1979. doi:10.1145/800135.804419. p. 107, 108, 123

[Val80] Leslie G. Valiant. Reducibility by algebraic projections. University of Edinburgh, Department of Computer Science, 1980. Internal Report. p. 109, 119, 123

[VC15] Peter Vrana and Matthias Christandl. Asymptotic entanglement transformation between W and GHZ states. J. Math. Phys., 56(2):022204, 12, 2015. arXiv:1310.3244, doi:10.1063/1.4908106. p. 69

[VDDMV02] F. Verstraete, J. Dehaene, B. De Moor, and H. Verschelde. Four qubits can be entangled in nine different ways. Phys. Rev. A (3), 65(5, part A):052112, 5, 2002. doi:10.1103/PhysRevA.65.052112. p. 48

[Wal14] Michael Walter. Multipartite quantum states and their marginals. PhD thesis, ETH Zurich, 2014. arXiv:1410.6820. p. 93

[WDGC13] Michael Walter, Brent Doran, David Gross, and Matthias Christandl. Entanglement polytopes: multiparticle entanglement from single-particle information. Science, 340(6137):1205–1208, 2013. arXiv:1208.0365, doi:10.1126/science.1232957. p. 8, 9, 95

[Wil12] Virginia Vassilevska Williams. Multiplying matrices faster than Coppersmith-Winograd. Extended abstract. In STOC’12: Proceedings of the 2012 ACM Symposium on Theory of Computing, pages 887–898. ACM, New York, 2012. doi:10.1145/2213977.2214056. p. 4, 6, 8, 48

[Zui17] Jeroen Zuiddam. A note on the gap between rank and border rank. Linear Algebra Appl., 525:33–44, 2017. doi:10.1016/j.laa.2017.03.015. p. 2, 14, 110

[Zui18] Jeroen Zuiddam. The asymptotic spectrum of graphs and the Shannon capacity. arXiv, 2018. arXiv:1807.00169. p. 35

Glossary

〈n〉 n × ⋯ × n diagonal tensor 47

〈a, b, c〉 matrix multiplication tensor 48

G ∗ H or-product 42

G ⊠ H strong graph product, and-product 35

α(G) stability number 35

χ(G) clique cover number 40

K_k complete graph on k vertices 36

F^θ(t) quantum functional 96

G(t) GL_{n_1} × ⋯ × GL_{n_k} for t ∈ F^{n_1} ⊗ ⋯ ⊗ F^{n_k} 52

H(P) Shannon entropy of probability distribution P 52

h(p) binary entropy of probability p ∈ [0, 1] 53

τ(Φ) hitting set number 59

τ̃(Φ) asymptotic hitting set number 60

ω matrix multiplication exponent 47

P moment polytope 94


P(X) the set of probability distributions on X 52

R rank 27

R̃ asymptotic rank 27

R̲(t) border rank 50

R(G) rank of a graph, clique cover number 40

R(t) tensor rank 47

SR(t) slice rank 58

Q subrank 27

Q̃ asymptotic subrank 27

Q̲(t) border subrank 50

Q(Φ) combinatorial subrank 10

Q(G) subrank of a graph, stability number 40

supp(t) support 52

Θ(G) Shannon capacity 35

ϑ(G) Lovász theta number 41

G ⊔ H disjoint union 36

W(t) S_{n_1} × ⋯ × S_{n_k} for t ∈ F^{n_1} ⊗ ⋯ ⊗ F^{n_k} 53

X(S, ≤) asymptotic spectrum of semiring S with Strassen preorder ≤ 25

ζ_{(S)}(t) gauge point 51

ζ^θ(t) support functional 52

Samenvatting

Algebraic complexity, asymptotic spectra and entanglement polytopes

It is well known that the rank of a matrix is multiplicative under the Kronecker product, additive under the direct sum, normalised on identity matrices, and non-increasing under multiplication from the left and from the right by matrices. Matrix rank is in fact the only real parameter with these four properties. In 1986 Strassen initiated the study of the extension to tensors: find all maps from k-tensors to the real numbers that are multiplicative under the tensor Kronecker product, additive under the direct sum, normalised on “identity tensors”, and non-increasing under applying linear maps to the k tensor factors. Strassen called the set of these maps the “asymptotic spectrum of k-tensors”. He proved: if we understand the asymptotic spectrum, then we understand the asymptotic relations among tensors, including the asymptotic subrank and the asymptotic rank. In particular, if we know the asymptotic spectrum, then we know the arithmetic complexity of matrix multiplication, a central problem in algebraic complexity theory.

One of the main results in this dissertation is the first explicit construction of an infinite family of elements in the asymptotic spectrum of complex k-tensors, called the quantum functionals. Our construction is based on information theory and moment polytopes, also called entanglement polytopes. In addition, we study, among other things, the relation between the recently introduced slice rank and the quantum functionals, and we prove that the “asymptotic” slice rank equals the minimum over the quantum functionals. Besides studying the above tensor parameters, we give an extension of the Coppersmith–Winograd method (for obtaining lower bounds on the asymptotic combinatorial subrank) to higher-order tensors, i.e. tensors of order at least 4. We apply this extension to obtain new upper bounds on the asymptotic tensor rank of complete graph tensors via the laser method. (Joint work with Christandl and Vrana; QIP 2018, STOC 2018.)

As a new application of the abstract theory of asymptotic spectra, we introduce the asymptotic spectrum of graphs in graph theory. Analogous to the situation for tensors: if we understand the asymptotic spectrum of graphs, then we understand the Shannon capacity, a graph parameter that characterises the zero-error communication complexity of communication channels. In other words, we prove a new duality theorem for the Shannon capacity. Examples of elements in the asymptotic spectrum of graphs are the Lovász theta number and the fractional Haemers bounds.

Finally, we study an algebraic model of computation called algebraic branching programs. An algebraic branching program (abp) is the trace of a product of matrices whose entries are polynomials of degree at most 1. The maximum size of the matrices is called the width of the abp. In 1992 Ben-Or and Cleve proved that every polynomial can be computed by a width-3 abp with a number of matrices that is polynomial in the formula size of the polynomial. In contrast, Allender and Wang proved in 2011 that some polynomials cannot be computed by width-2 abps. We prove that every polynomial can be approximated by a width-2 abp with a number of matrices that is polynomial in the formula size of the polynomial, where approximation is meant in the sense of degeneration. (Joint work with Ikenmeyer and Bringmann; CCC 2017, JACM 2018.)

Summary

Algebraic complexity, asymptotic spectra and entanglement polytopes

Matrix rank is well known to be multiplicative under the Kronecker product, additive under the direct sum, normalised on identity matrices, and non-increasing under multiplying from the left and from the right by any matrices. In fact, matrix rank is the only real matrix parameter with these four properties. In 1986 Strassen proposed to study the extension to tensors: find all maps from k-tensors to the reals that are multiplicative under the tensor Kronecker product, additive under the direct sum, normalised on “identity tensors”, and non-increasing under acting with linear maps on the k tensor factors. Strassen called the collection of these maps the “asymptotic spectrum of k-tensors”. He proved that understanding the asymptotic spectrum implies understanding the asymptotic relations among tensors, including the asymptotic subrank and the asymptotic rank. In particular, knowing the asymptotic spectrum means knowing the arithmetic complexity of matrix multiplication, a central problem in algebraic complexity theory.
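The four properties of matrix rank listed above can be checked numerically on small examples. The following is a minimal sketch in plain Python, not taken from the dissertation: the helper names (`rank`, `kron`, `direct_sum`, `matmul`, `identity`) and the sample matrices are illustrative assumptions, and rank is computed over the rationals by Gaussian elimination.

```python
from fractions import Fraction

def rank(M):
    """Rank of a matrix (list of rows) over the rationals, via Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        # Find a row at or below position r with a nonzero entry in column c.
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def kron(A, B):
    """Kronecker product: entry ((i,k),(j,l)) equals A[i][j] * B[k][l]."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def direct_sum(A, B):
    """Block-diagonal matrix with blocks A and B."""
    wa, wb = len(A[0]), len(B[0])
    return [row + [0] * wb for row in A] + [[0] * wa + row for row in B]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

# Illustrative sample matrices (assumptions, chosen by hand).
A = [[1, 2], [2, 4]]          # rank 1
B = [[1, 0, 1], [0, 1, 1]]    # rank 2
P = [[1, 1], [0, 0]]          # arbitrary left multiplier
Q = [[0, 1], [1, 1]]          # arbitrary right multiplier

multiplicative = rank(kron(A, B)) == rank(A) * rank(B)
additive = rank(direct_sum(A, B)) == rank(A) + rank(B)
normalised = all(rank(identity(n)) == n for n in range(1, 5))
monotone = rank(matmul(P, matmul(A, Q))) <= rank(A)
print(multiplicative, additive, normalised, monotone)
```

Such checks only illustrate the properties on examples; they are of course no substitute for the proofs.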

One of the main results in this dissertation is the first explicit construction of an infinite family of elements in the asymptotic spectrum of complex k-tensors, called the quantum functionals. Our construction is based on information theory and moment polytopes, i.e. entanglement polytopes. Moreover, among other things, we study the relation of the recently introduced slice rank to the quantum functionals and find that “asymptotic” slice rank equals the minimum over the quantum functionals. Besides studying the above tensor parameters, we extend the Coppersmith–Winograd method (for obtaining asymptotic combinatorial subrank lower bounds) to higher-order tensors, i.e. order at least 4. We apply this generalisation to obtain new upper bounds on the asymptotic tensor rank of complete graph tensors via the laser method. (Joint work with Christandl and Vrana; QIP 2018, STOC 2018.)


In graph theory, as a new instantiation of the abstract theory of asymptotic spectra, we introduce the asymptotic spectrum of graphs. Analogous to the situation for tensors, understanding the asymptotic spectrum of graphs means understanding the Shannon capacity, a graph parameter capturing the zero-error communication complexity of communication channels. In different words: we prove a new duality theorem for Shannon capacity. Some known elements in the asymptotic spectrum of graphs are the Lovász theta number and the fractional Haemers bounds.
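To make the Shannon capacity concrete, recall that it is defined as Θ(G) = lim α(G^{⊠n})^{1/n}, where α is the stability number and ⊠ the strong graph product (both appear in the Glossary). The sketch below uses the classical 5-cycle example, which is an assumption chosen for illustration rather than something worked out in this summary: it verifies by brute force that α(C5) = 2, and checks that the five vertices (i, 2i mod 5) form an independent set in C5 ⊠ C5, so α(C5 ⊠ C5) ≥ 5 > α(C5)² and hence Θ(C5) ≥ √5. All function names are hypothetical.

```python
from itertools import combinations

def cycle_adjacent(u, v, n=5):
    """Adjacency in the cycle C_n on vertices 0..n-1."""
    return (u - v) % n in (1, n - 1)

def strong_adjacent(p, q, n=5):
    """Adjacency in C_n ⊠ C_n: distinct pairs whose coordinates are each equal or adjacent."""
    if p == q:
        return False
    return all(a == b or cycle_adjacent(a, b, n) for a, b in zip(p, q))

def is_independent(vertices, adjacent):
    return not any(adjacent(u, v) for u, v in combinations(vertices, 2))

def independence_number(vertices, adjacent):
    """Brute-force stability number; only feasible for very small graphs."""
    best = 0
    for k in range(1, len(vertices) + 1):
        if any(is_independent(s, adjacent) for s in combinations(vertices, k)):
            best = k
        else:
            break  # independence is hereditary, so no larger independent set exists
    return best

alpha_c5 = independence_number(list(range(5)), cycle_adjacent)

# Candidate independent set of size 5 in the strong product C5 ⊠ C5.
pentagon_code = [(i, (2 * i) % 5) for i in range(5)]
code_is_independent = is_independent(pentagon_code, strong_adjacent)

print(alpha_c5, len(pentagon_code), code_is_independent)
```

The brute-force search is exponential and is meant only to confirm the small case; it says nothing about how Θ is bounded from above.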

Finally, we study an algebraic model of computation called algebraic branching programs. An algebraic branching program (abp) is the trace of a product of matrices with affine linear forms as matrix entries. The maximum size of the matrices is called the width of the abp. In 1992 Ben-Or and Cleve proved that width-3 abps can compute any polynomial efficiently in the formula size. On the other hand, in 2011 Allender and Wang proved that some polynomials cannot be computed by any width-2 abp. We prove that any polynomial can be efficiently approximated by a width-2 abp, where approximation is defined in the sense of degeneration. (Joint work with Ikenmeyer and Bringmann; CCC 2017, JACM 2018.)
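As a small illustration of the definition, the sketch below evaluates an abp numerically at a point. The representation of an affine linear form as a pair (constant, coefficient dictionary), the function names, and the particular width-2 abp are assumptions made for this example only; the abp shown computes the polynomial xy + z as the trace of a product of two 2 × 2 matrices.

```python
def evaluate_form(form, point):
    """Evaluate an affine linear form (const, {variable: coefficient}) at a point."""
    const, coeffs = form
    return const + sum(c * point[v] for v, c in coeffs.items())

def evaluate_abp(matrices, point):
    """Evaluate trace(M_1 M_2 ... M_m) after substituting the point into every entry."""
    numeric = [[[evaluate_form(f, point) for f in row] for row in M] for M in matrices]
    product = numeric[0]
    for M in numeric[1:]:
        product = [[sum(a * b for a, b in zip(row, col)) for col in zip(*M)]
                   for row in product]
    return sum(product[i][i] for i in range(len(product)))

def width(matrices):
    """Width of the abp: the maximum dimension of any matrix in the product."""
    return max(max(len(M), len(M[0])) for M in matrices)

# Affine linear forms used as matrix entries (hypothetical encoding).
X, Y, Z = (0, {"x": 1}), (0, {"y": 1}), (0, {"z": 1})
ZERO, ONE = (0, {}), (1, {})

# M1 * M2 = [[x*y + z, 0], [0, 0]], so the trace is x*y + z.
M1 = [[X, Z], [ZERO, ZERO]]
M2 = [[Y, ZERO], [ONE, ZERO]]
abp = [M1, M2]

value = evaluate_abp(abp, {"x": 2, "y": 3, "z": 5})
print(width(abp), value)
```

Evaluating at points rather than expanding symbolically keeps the sketch dependency-free; agreement with xy + z at a handful of points is evidence, not a proof, that the abp computes that polynomial.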



Contents

Acknowledgements ix

1 Introduction 3
  1.1 Matrix multiplication 5
  1.2 The asymptotic spectrum of tensors 6
  1.3 Higher-order CW method 10
  1.4 Abstract asymptotic spectra 11
  1.5 The asymptotic spectrum of graphs 12
  1.6 Tensor degeneration 14
  1.7 Combinatorial degeneration 15
  1.8 Algebraic branching program degeneration 15
  1.9 Organisation 17

2 The theory of asymptotic spectra 19
  2.1 Introduction 19
  2.2 Semirings and preorders 19
  2.3 Strassen preorders 20
  2.4 Asymptotic preorders ≲ 21
  2.5 Maximal Strassen preorders 23
  2.6 The asymptotic spectrum X(S, ≤) 25
  2.7 The representation theorem 26
  2.8 Abstract rank and subrank R, Q 27
  2.9 Topological aspects 29
  2.10 Uniqueness 30
  2.11 Subsemirings 31
  2.12 Subsemirings generated by one element 32
  2.13 Universal spectral points 33
  2.14 Conclusion 33

3 The asymptotic spectrum of graphs; Shannon capacity 35
  3.1 Introduction 35
  3.2 The asymptotic spectrum of graphs 37
    3.2.1 The semiring of graph isomorphism classes G 37
    3.2.2 Strassen preorder via graph homomorphisms 38
    3.2.3 The asymptotic spectrum of graphs X(G) 39
    3.2.4 Shannon capacity Θ 39
  3.3 Universal spectral points 41
    3.3.1 Lovász theta number ϑ 41
    3.3.2 Fractional graph parameters 41
  3.4 Conclusion 46

4 The asymptotic spectrum of tensors; matrix multiplication 47
  4.1 Introduction 47
  4.2 The asymptotic spectrum of tensors 49
    4.2.1 The semiring of tensor equivalence classes T 49
    4.2.2 Strassen preorder via restriction 49
    4.2.3 The asymptotic spectrum of tensors X(T) 49
    4.2.4 Asymptotic rank and asymptotic subrank 50
  4.3 Gauge points ζ_{(i)} 51
  4.4 Support functionals ζ^θ 52
  4.5 Upper and lower support functionals ζ^θ, ζ_θ 56
  4.6 Asymptotic slice rank 58
  4.7 Conclusion 63

5 Tight tensors and combinatorial subrank; cap sets 65
  5.1 Introduction 65
  5.2 Higher-order Coppersmith–Winograd method 68
    5.2.1 Construction 69
    5.2.2 Computational remarks 77
    5.2.3 Examples: type sets 78
  5.3 Combinatorial degeneration method 79
  5.4 Cap sets 81
    5.4.1 Reduced polynomial multiplication 81
    5.4.2 Cap sets 82
  5.5 Graph tensors 85
  5.6 Conclusion 86

6 Universal points in the asymptotic spectrum of tensors; entanglement polytopes, moment polytopes 87
  6.1 Introduction 87
  6.2 Schur–Weyl duality 88
  6.3 Kronecker and Littlewood–Richardson coefficients g_{λμν}, c^λ_{μν} 90
  6.4 Entropy inequalities 91
  6.5 Hilbert spaces and density operators 92
  6.6 Moment polytopes P(t) 93
    6.6.1 General setting 93
    6.6.2 Tensor spaces 94
  6.7 Quantum functionals F^θ(t) 95
  6.8 Outer approximation 100
  6.9 Inner approximation for free tensors 101
  6.10 Quantum functionals versus support functionals 102
  6.11 Asymptotic slice rank 103
  6.12 Conclusion 105

7 Algebraic branching programs; approximation and nondeterminism 107
  7.1 Introduction 107
  7.2 Definitions and basic results 110
    7.2.1 Computational models 110
    7.2.2 Complexity classes VP, VP_e, VP_k 111
    7.2.3 The theorem of Ben-Or and Cleve 112
    7.2.4 Approximation closure C̄ 115
    7.2.5 Nondeterminism closure N(C) 115
  7.3 Approximation closure of VP_2 116
  7.4 Nondeterminism closure of VP_1 119
  7.5 Conclusion 122

Bibliography 125

Glossary 139

Samenvatting 141

Summary 143

Acknowledgements

First of all I thank all my coauthors for very fruitful collaboration: Harry Buhrman, Matthias Christandl, Peter Vrana, Jop Briët, Chris Perry, Asger Jensen, Markus Bläser, Christian Ikenmeyer and Karl Bringmann.

Chris Zaal, Leen Torenvliet and Robert Belleman I thank for all their efforts to set up for me the “double bachelor programme” in Mathematics and Computer science at the University of Amsterdam (UvA) in 2009. This programme, as well as the “webklas” on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch national research institute for mathematics and computer science (CWI), made me decide to come to Amsterdam. My enjoyable master thesis project in mathematics with Eric Opdam made me follow the academic path, for which I thank Eric.

Of course, most importantly, I thank my PhD supervisor Harry Buhrman for introducing me to research as a bachelor student, for absorbing me into the Algorithms and Complexity group at CWI, for having enough faith in me to hire me as his PhD student in 2014, and for his general guidance throughout. I feel very lucky for the opportunities and scientific freedom that this has brought me.

Matthias Christandl has been my closest collaborator and mentor since we met in Berkeley in 2014. In practice this meant countless nights of fun Skype sessions between Amsterdam and Copenhagen, countless enjoyable visits to the University of Copenhagen, and countless kitchen table sessions at the Hallinsgade. Thanks, Matthias, for the energy, inspiration and optimism. And thanks, Matthias and Henriette, for the hospitality.

Jop Briët I thank for his general guidance and for lots of inspiration. The polynomial method reading group, which he mainly organised, inspired part of my paper with Matthias Christandl and Peter Vrana on universal points in the asymptotic spectrum of tensors. (This reading group also resulted in Dion Gijswijt’s paper on cap sets.) My paper with Jop on round elimination later inspired me to write the paper on the asymptotic spectrum of graphs.


Christian Ikenmeyer I thank for numerous inspiring discussions on algebraiccomplexity theory and tensors which greatly influenced my papers on tensor rankand our joint paper with Karl Bringmann on algebraic branching programs

Peter Vrana I thank for our many enjoyable research collaborations the resultsof which form a central part of this dissertation for his clever insights and forfinding several mathematical mistakes while reading the draft of this dissertation

Ronald de Wolf I thank for his general advice throughout my PhD and formany suggestions regarding the current version of this dissertation which will beincorporated in the next version (but not in the printed version because of theregulations of the University of Amsterdam)

Jop Briet Monique Laurent Lex Schrijver Peter Vrana Matthias ChristandlMaris Ozols Michael Walter and Bart Sevenster I thank for helpful discussionsregarding the results in Chapter 2 and Chapter 3 of this dissertation

Srinivasan Arunachalam I thank for sharing the ups and downs during ourfour years as PhD students at CWI Florian Speelman Farrokh Labib SvenPolak Bart Litjens and Bart Sevenster I thank for numerous valuable researchdiscussions

Bikkie Aldeias and Rob van Rooijen I thank for their excellent library services. Martijn Zuiddam and Maris Ozols I thank for proofreading the draft of this dissertation.

Finally, I thank my parents and my brothers and my friends for their support.

Amsterdam, August 31, 2018

Jeroen Zuiddam


Publications

[BCPZ16] Harry Buhrman, Matthias Christandl, Christopher Perry, and Jeroen Zuiddam. Clean quantum and classical communication protocols. Physical Review Letters, 117:230503, 2016.
https://link.aps.org/doi/10.1103/PhysRevLett.117.230503
http://arxiv.org/abs/1605.07948

[BCZ17a] Markus Bläser, Matthias Christandl, and Jeroen Zuiddam. The border support rank of two-by-two matrix multiplication is seven. Manuscript, 2017.
https://arxiv.org/abs/1705.09652

[BCZ17b] Harry Buhrman, Matthias Christandl, and Jeroen Zuiddam. Nondeterministic Quantum Communication Complexity: the Cyclic Equality Game and Iterated Matrix Multiplication. In Christos H. Papadimitriou, editor, 8th Innovations in Theoretical Computer Science Conference (ITCS), 2017.
http://drops.dagstuhl.de/opus/volltexte/2017/8181

[BIZ17] Karl Bringmann, Christian Ikenmeyer, and Jeroen Zuiddam. On algebraic branching programs of small width. In Ryan O'Donnell, editor, 32nd Computational Complexity Conference (CCC), 2017.
https://doi.org/10.4230/LIPIcs.CCC.2017.20
https://arxiv.org/abs/1702.05328
Journal of the ACM, Vol. 65, No. 5, Article 32, 2018.
https://doi.org/10.1145/3209663

[BZ17] Jop Briët and Jeroen Zuiddam. On the orthogonal rank of Cayley graphs and impossibility of quantum round elimination. Quantum Information and Computation, 2017.
http://www.rintonpress.com/xxqic17/qic-17-12/0106-0116.pdf
https://arxiv.org/abs/1608.06113

[CJZ18] Matthias Christandl, Asger Kjærulff Jensen, and Jeroen Zuiddam. Tensor rank is not multiplicative under the tensor product. Linear Algebra and its Applications, 543:125–139, 2018.
https://doi.org/10.1016/j.laa.2017.12.020
https://arxiv.org/abs/1705.09379

[CVZ16] Matthias Christandl, Peter Vrana, and Jeroen Zuiddam. Asymptotic tensor rank of graph tensors: beyond matrix multiplication. Manuscript, 2016.
https://arxiv.org/abs/1609.07476

[CVZ18] Matthias Christandl, Peter Vrana, and Jeroen Zuiddam. Universal Points in the Asymptotic Spectrum of Tensors: Extended Abstract. In Proceedings of 50th Annual ACM SIGACT Symposium on the Theory of Computing (STOC), 2018.
https://doi.org/10.1145/3188745.3188766
https://arxiv.org/abs/1709.07851

[CZ18] Matthias Christandl and Jeroen Zuiddam. Tensor surgery and tensor rank. (Journal of) computational complexity, 2018.
https://doi.org/10.1007/s00037-018-0164-8
https://arxiv.org/abs/1606.04085

[Zui17] Jeroen Zuiddam. A note on the gap between rank and border rank. Linear Algebra and its Applications, 525:33–44, 2017.
https://doi.org/10.1016/j.laa.2017.03.015
http://arxiv.org/abs/1504.05597

[Zui18] Jeroen Zuiddam. The asymptotic spectrum of graphs and the Shannon capacity. Manuscript, 2018.
http://arxiv.org/abs/1807.00169

This dissertation is based on the above papers, with primary focus on the four highlighted papers.

Explanation of the relative importance of the co-authors: for each article, the importance of the co-authors is roughly equally divided.

Chapter 1

Introduction

Volker Strassen published in 1969 his famous algorithm for multiplying any two $n \times n$ matrices using only $O(n^{2.81})$ rather than $O(n^3)$ arithmetical operations [Str69]. His discovery marked the beginning of a still ongoing line of research in the field of algebraic complexity theory, a line of research that by now touches several fields of mathematics, including algebraic geometry, representation theory, (quantum) information theory and combinatorics. This dissertation is inspired by and contributes to this line of research.

No further progress followed for almost 10 years after Strassen's discovery, despite the fact that "many scientists understood that discovery as a signal to attack the problem and to push the exponent further down" [Pan84]. Then in 1978, Pan improved the exponent from 2.81 to 2.79 [Pan78, Pan80]. One year later, Bini, Capovani, Lotti and Romani improved the exponent to 2.78 by constructing fast "approximative" algorithms for matrix multiplication and making these algorithms exact via the method of interpolation [BCRL79, Bin80]. Cast in the language of tensors, the result of Bini et al. corresponds to what we now call a "border rank" upper bound. The idea of studying approximative complexity or border complexity for algebraic problems has nowadays become an important theme in algebraic complexity theory.

Schönhage then obtained the exponent 2.55 by constructing a fast algorithm for computing many "disjoint" small matrix multiplications and transforming this into an algorithm for one large matrix multiplication [Sch81]. The upper bound was improved shortly after by works of Pan [Pan81], Romani [Rom82], and Coppersmith and Winograd [CW82], resulting in the exponent 2.50. Then in 1987, Strassen published the laser method, with which he obtained the exponent 2.48 [Str87]. The laser method was used in the same year by Coppersmith and Winograd to obtain the exponent 2.38 [CW87]. To do this they invented a method for constructing certain large combinatorial structures. This method, or actually the extended version that Strassen published in [Str91], we now call the Coppersmith–Winograd method. All further improvements on upper bounding the exponent essentially follow the framework of Coppersmith and Winograd, and the improvements do not affect the first two digits after the decimal point [CW90, Sto10, Wil12, LG14].

Define $\omega$ to be the optimal exponent in the complexity of matrix multiplication. We call $\omega$ the exponent of matrix multiplication. To summarise the above historical account on upper bounds: $\omega < 2.38$. On the other hand, the only lower bound we currently have is the trivial lower bound $2 \leq \omega$.

The history of upper bounds on the matrix multiplication exponent $\omega$, which began with Strassen's algorithm and ended with the Strassen laser method and Coppersmith–Winograd method, is well-known and well-documented, see e.g. [BCS97, Section 15.13]. However, there is remarkable work of Strassen on a theory of lower bounds for $\omega$ and similar types of exponents, and this work has received almost no attention. This theory of lower bounds is the theory of asymptotic spectra of tensors and is the topic of a series of papers by Strassen [Str86, Str87, Str88, Str91, Str05].

In the foregoing, the word tensor has popped up twice (namely when we mentioned border rank, and just now when we mentioned asymptotic spectra of tensors), but we have not discussed at all why tensors should be relevant for understanding the complexity of matrix multiplication. First we give a mini course on tensors. A $k$-tensor $t = (t_{i_1, \ldots, i_k})_{i_1, \ldots, i_k}$ is a $k$-dimensional array of numbers from some field, say the complex numbers $\mathbb{C}$. Thus a 2-tensor is simply a matrix. A $k$-tensor is called simple if there exist $k$ vectors $v_1, \ldots, v_k$ such that the entries of $t$ are given by the products $t_{i_1, \ldots, i_k} = (v_1)_{i_1} \cdots (v_k)_{i_k}$ for all indices $i_j$. The tensor rank of $t$ is the smallest number $n$ such that $t$ can be written as a sum of $n$ simple tensors. Thus the tensor rank of a 2-tensor is simply its matrix rank.

Returning to the problem of finding the complexity of matrix multiplication, there is a special 3-tensor, called the matrix multiplication tensor, that encodes the computational problem of multiplying two $2 \times 2$ matrices. This 3-tensor is commonly denoted by $\langle 2, 2, 2 \rangle$. It turns out that the matrix multiplication exponent $\omega$ is exactly the asymptotic rate of growth of the tensor rank of the "Kronecker powers" of the tensor $\langle 2, 2, 2 \rangle$. This important observation follows from the fundamental fact that the computational problem of multiplying matrices is "self-reducible". Namely, we can multiply two matrices by viewing them as block matrices and then performing matrix multiplication at the level of the blocks.

We wrap up this introductory story. To understand the computational complexity of matrix multiplication, one should understand the asymptotic rate of growth of the tensor rank of a certain family of tensors, a family that is obtained by taking powers of a fixed tensor. The theory of asymptotic spectra is the theory of bounds on such asymptotic parameters of tensors.

The main story line of this dissertation concerns the theory of asymptotic spectra. In Section 1.1 of this introduction we discuss in more detail the computational problem of multiplying matrices. In Section 1.2 we discuss the asymptotic spectrum of tensors and discuss a new result: an explicit description of infinitely many elements in the asymptotic spectrum of tensors. In Section 1.3 we consider a new higher-order Coppersmith–Winograd method.

The theory of asymptotic spectra of tensors is a special case of an abstract theory of asymptotic spectra of preordered semirings, which we discuss in Section 1.4. In Section 1.5 we apply this abstract theory to a new setting, namely to graphs. By doing this we obtain a new dual characterisation of the Shannon capacity of graphs.

The second story line of this dissertation is about degeneration, an algebraic kind of approximation related to the concept of border rank of Bini et al. We discuss degeneration in the context of tensors in Section 1.6. There is a combinatorial version of tensor degeneration which we call combinatorial degeneration. We discuss a new result regarding combinatorial degeneration in Section 1.7. Finally, Section 1.8 is about a new result concerning degeneration for algebraic branching programs, an algebraic model of computation.

We finish in Section 1.9 with a discussion of the organisation of this dissertation into chapters.

1.1 Matrix multiplication

In this section we discuss in more detail the computational problem of multiplying two matrices.

Algebraic complexity theory studies algebraic algorithms for algebraic problems. Roughly speaking, algebraic algorithms are algorithms that use only the basic arithmetical operations $+$ and $\times$ over some field, say $\mathbb{R}$ or $\mathbb{C}$. A fundamental example of an algebraic problem is matrix multiplication.

If we multiply two $n \times n$ matrices by computing the inner products between any row of the first matrix and any column of the second matrix, one by one, we need roughly $2 \cdot n^3$ arithmetical operations ($+$ and $\times$). For example, we can multiply two $2 \times 2$ matrices with 12 arithmetical operations, namely 8 multiplications and 4 additions:

$$\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} = \begin{pmatrix} a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} \\ a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22} \end{pmatrix}.$$

Since matrix multiplication is a basic operation in linear algebra, it is worthwhile to see if we can do better than $2 \cdot n^3$. In 1969, Strassen [Str69] published a better algorithm. The base routine of Strassen's algorithm is an algorithm for multiplying two $2 \times 2$ matrices with 7 multiplications, 18 additions and certain sign changes:

$$\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} = \begin{pmatrix} x_1 + x_4 - x_5 + x_7 & x_3 + x_5 \\ x_2 + x_4 & x_1 + x_3 - x_2 + x_6 \end{pmatrix}$$

with

$$\begin{aligned}
x_1 &= (a_{11} + a_{22})(b_{11} + b_{22}) \\
x_2 &= (a_{21} + a_{22})\, b_{11} \\
x_3 &= a_{11}(b_{12} - b_{22}) \\
x_4 &= a_{22}(-b_{11} + b_{21}) \\
x_5 &= (a_{11} + a_{12})\, b_{22} \\
x_6 &= (-a_{11} + a_{21})(b_{11} + b_{12}) \\
x_7 &= (a_{12} - a_{22})(b_{21} + b_{22}).
\end{aligned}$$
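The base routine can be checked mechanically. The following is a minimal Python sketch, not part of the original text; the function names `strassen_2x2`, `naive_2x2` and `check_all` are illustrative choices. It evaluates the seven products exactly as written above and compares the result with the row-by-column product on a small grid of integer inputs.

```python
from itertools import product


def strassen_2x2(a, b):
    """Multiply two 2x2 matrices (nested lists) using 7 multiplications."""
    (a11, a12), (a21, a22) = a
    (b11, b12), (b21, b22) = b
    x1 = (a11 + a22) * (b11 + b22)
    x2 = (a21 + a22) * b11
    x3 = a11 * (b12 - b22)
    x4 = a22 * (-b11 + b21)
    x5 = (a11 + a12) * b22
    x6 = (-a11 + a21) * (b11 + b12)
    x7 = (a12 - a22) * (b21 + b22)
    return [[x1 + x4 - x5 + x7, x3 + x5],
            [x2 + x4, x1 + x3 - x2 + x6]]


def naive_2x2(a, b):
    """Row-by-column product: 8 multiplications and 4 additions."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def check_all(entries=(-1, 0, 2)):
    """Compare both routines on every pair of matrices with the given entries."""
    for vals in product(entries, repeat=8):
        a = [list(vals[0:2]), list(vals[2:4])]
        b = [list(vals[4:6]), list(vals[6:8])]
        if strassen_2x2(a, b) != naive_2x2(a, b):
            return False
    return True
```

In every product the left factor comes from the first matrix and the right factor from the second, so the same formulas remain valid when the entries are themselves matrices.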

The general routine of Strassen's algorithm multiplies two $n \times n$ matrices by recursively dividing the matrices into four blocks and applying the base routine to multiply the blocks (this is the self-reducibility of matrix multiplication that we mentioned earlier). The base routine does not assume commutativity of the variables for correctness, so indeed we can take the variables to be matrices. After expanding the recurrence, we see that Strassen's algorithm uses $4.7 \cdot n^{\log_2 7} \approx 4.7 \cdot n^{2.81}$ arithmetical operations. Over the years, Strassen's algorithm was improved by many researchers. The best algorithm known today uses $C \cdot n^{2.38}$ arithmetical operations, where $C$ is some constant [CW90, Sto10, Wil12, LG14]. The exponent of matrix multiplication $\omega$ is the infimum over all real numbers $\beta$ such that for some constant $C_\beta$ we can multiply, for any $n \in \mathbb{N}$, any two $n \times n$ matrices with at most $C_\beta \cdot n^\beta$ arithmetical operations. From the above it follows that $\omega \leq 2.38$. From a simple flattening argument it follows that $2 \leq \omega$. We are left with the following well-known open problem: what is the value of the matrix multiplication exponent $\omega$?
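The recursive routine described above can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes that $n$ is a power of two, recurses all the way down to $1 \times 1$ blocks, and counts scalar multiplications in a one-element list `counter`; all names are hypothetical. It does not try to reproduce the constant 4.7, which depends on implementation details not spelled out here.

```python
def add(a, b):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(a, b)]


def sub(a, b):
    return [[x - y for x, y in zip(r, s)] for r, s in zip(a, b)]


def split(m):
    """Split a square matrix of even size into four equal blocks."""
    h = len(m) // 2
    return ([row[:h] for row in m[:h]], [row[h:] for row in m[:h]],
            [row[:h] for row in m[h:]], [row[h:] for row in m[h:]])


def strassen(a, b, counter):
    """Recursive Strassen product; counter[0] counts scalar multiplications."""
    if len(a) == 1:
        counter[0] += 1
        return [[a[0][0] * b[0][0]]]
    a11, a12, a21, a22 = split(a)
    b11, b12, b21, b22 = split(b)
    # The seven block products of the base routine (order of factors matters).
    x1 = strassen(add(a11, a22), add(b11, b22), counter)
    x2 = strassen(add(a21, a22), b11, counter)
    x3 = strassen(a11, sub(b12, b22), counter)
    x4 = strassen(a22, sub(b21, b11), counter)
    x5 = strassen(add(a11, a12), b22, counter)
    x6 = strassen(sub(a21, a11), add(b11, b12), counter)
    x7 = strassen(sub(a12, a22), add(b21, b22), counter)
    c11 = add(sub(add(x1, x4), x5), x7)
    c12 = add(x3, x5)
    c21 = add(x2, x4)
    c22 = add(sub(add(x1, x3), x2), x6)
    top = [r + s for r, s in zip(c11, c12)]
    bottom = [r + s for r, s in zip(c21, c22)]
    return top + bottom


def naive(a, b):
    """Row-by-column product, used only as a reference."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]
```

Each level of recursion replaces one product by seven half-size products, so for $n = 2^m$ the number of scalar multiplications is $7^m = n^{\log_2 7}$; the additions are lower order and only affect the constant.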

The constant $C$ for the currently best algorithm is impractically large (for a discussion of this issue see e.g. [Pan18]). For a practical fast algorithm one should either improve $C$ or find a balance between $C$ and the exponent of $n$. We will ignore the size of $C$ in this dissertation and focus on the exponent $\omega$. For an overview of the field of algebraic complexity theory the reader should consult [BCS97] and [Sap16].

1.2 The asymptotic spectrum of tensors

We now discuss the theory of asymptotic spectra for tensors.

Let $s$ and $t$ be $k$-tensors over a field $\mathbb{F}$, $s \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$, $t \in \mathbb{F}^{m_1} \otimes \cdots \otimes \mathbb{F}^{m_k}$. We say $s$ restricts to $t$ and write $s \geq t$ if there are linear maps $A_i : \mathbb{F}^{n_i} \to \mathbb{F}^{m_i}$ such that $(A_1 \otimes \cdots \otimes A_k)(s) = t$. Let $[n] = \{1, \ldots, n\}$ for $n \in \mathbb{N}$. We define the product $s \otimes t \in \mathbb{F}^{n_1 m_1} \otimes \cdots \otimes \mathbb{F}^{n_k m_k}$ by $(s \otimes t)_{(i_1, j_1), \ldots, (i_k, j_k)} = s_{i_1, \ldots, i_k} \, t_{j_1, \ldots, j_k}$ for $i \in [n_1] \times \cdots \times [n_k]$ and $j \in [m_1] \times \cdots \times [m_k]$. This product generalizes the well-known Kronecker product of matrices. We refer to this product as the tensor (Kronecker) product. We define the direct sum $s \oplus t \in \mathbb{F}^{n_1 + m_1} \otimes \cdots \otimes \mathbb{F}^{n_k + m_k}$ by $(s \oplus t)_{\ell_1, \ldots, \ell_k} = s_{\ell_1, \ldots, \ell_k}$ if $\ell \in [n_1] \times \cdots \times [n_k]$, $(s \oplus t)_{n_1 + \ell_1, \ldots, n_k + \ell_k} = t_{\ell_1, \ldots, \ell_k}$ if $\ell \in [m_1] \times \cdots \times [m_k]$, and $(s \oplus t)_{\ell_1, \ldots, \ell_k} = 0$ for the remaining indices.
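As an illustration of these two operations, here is a small Python sketch. It rests on assumptions made for this example only: a tensor is stored as a dictionary from 0-based index tuples to nonzero entries together with its shape, and an index pair $(i, j)$ of the product is flattened to the single index $i \cdot m + j$. The names `kron`, `direct_sum` and `diagonal` are hypothetical.

```python
def kron(s, s_shape, t, t_shape):
    """Tensor (Kronecker) product of two sparse k-tensors."""
    shape = tuple(n * m for n, m in zip(s_shape, t_shape))
    out = {}
    for i, sv in s.items():
        for j, tv in t.items():
            # Flatten the index pair (i_l, j_l) to i_l * m_l + j_l in each leg.
            idx = tuple(a * m + b for a, b, m in zip(i, j, t_shape))
            out[idx] = sv * tv
    return out, shape


def direct_sum(s, s_shape, t, t_shape):
    """Direct sum: s in the first block, t shifted by the shape of s."""
    shape = tuple(n + m for n, m in zip(s_shape, t_shape))
    out = dict(s)
    for j, tv in t.items():
        out[tuple(n + b for n, b in zip(s_shape, j))] = tv
    return out, shape


def diagonal(n, k=3):
    """The diagonal k-tensor with n ones on the main diagonal."""
    return {(i,) * k: 1 for i in range(n)}, (n,) * k
```

For $k = 2$ the function `kron` reduces to the usual Kronecker product of matrices. On diagonal tensors the two operations behave like multiplication and addition of natural numbers: the product of the diagonal tensors of sizes 2 and 3 is the diagonal tensor of size 6, and their direct sum is the diagonal tensor of size 5.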


The asymptotic restriction problem asks to compute the infimum of all real numbers $\beta \geq 0$ such that for all $n \in \mathbb{N}$

$$s^{\otimes \beta n + o(n)} \geq t^{\otimes n}.$$

We may think of the asymptotic restriction problem as having two directions, namely to find

1. obstructions, "certificates" that prohibit $s^{\otimes \beta n + o(n)} \geq t^{\otimes n}$, or

2. constructions, linear maps that carry out $s^{\otimes \beta n + o(n)} \geq t^{\otimes n}$.

Ideally, we would like to find matching obstructions and constructions, so that we indeed learn the value of $\beta$.

What do obstructions look like? We set $\beta$ equal to one; it turns out that it is sufficient to understand this case. We say $s$ restricts asymptotically to $t$ and write $s \gtrsim t$ if

$$s^{\otimes n + o(n)} \geq t^{\otimes n}.$$

What do obstructions look like for asymptotic restriction $\gtrsim$? More precisely, what do obstructions look like for $\gtrsim$ restricted to a subset $S \subseteq \{k\text{-tensors over } \mathbb{F}\}$? Let us assume $S$ is closed under direct sum and tensor product and contains the diagonal tensors $\langle n \rangle = \sum_{i=1}^{n} e_i \otimes \cdots \otimes e_i$ for $n \in \mathbb{N}$, where $e_1, \ldots, e_n$ is the standard basis of $\mathbb{F}^n$. Let $X(S)$ be the set of all maps $\phi : S \to \mathbb{R}_{\geq 0}$ that are

(a) monotone under restriction $\geq$

(b) multiplicative under the tensor Kronecker product $\otimes$

(c) additive under the direct sum $\oplus$

(d) normalised to $\phi(\langle n \rangle) = n$ at the diagonal tensor $\langle n \rangle$.

The elements $\phi \in X(S)$ are called spectral points of $S$. The set $X(S)$ is called the asymptotic spectrum of $S$.

Spectral points $\phi \in X(S)$ are obstructions. Let $s, t \in S$. If $s \gtrsim t$, then by definition we have a restriction $s^{\otimes n + o(n)} \geq t^{\otimes n}$. Then (a) and (b) imply the inequality $\phi(s)^{n + o(n)} = \phi(s^{\otimes n + o(n)}) \geq \phi(t^{\otimes n}) = \phi(t)^n$. This implies $\phi(s) \geq \phi(t)$. We negate that statement: if $\phi(s) < \phi(t)$, then not $s \gtrsim t$. In that case $\phi$ is an obstruction to $s \gtrsim t$.

The remarkable fact is that $X(S)$ is a complete set of obstructions for $\gtrsim$. Namely, for $s, t \in S$ the asymptotic restriction $s \gtrsim t$ holds if and only if we have $\phi(s) \geq \phi(t)$ for all spectral points $\phi \in X(S)$. This was proven by Volker Strassen in [Str86, Str88]. His proof uses a theorem of Becker and Schwarz [BS83], which is commonly referred to as the Kadison–Dubois theorem (for historical reasons) or the real representation theorem. (We will say more about this completeness result in Section 1.4.)

Let us introduce tensor rank and subrank, and their asymptotic versions. The tensor rank of $t$ is the size of the smallest diagonal tensor that restricts to $t$, $R(t) = \min\{r \in \mathbb{N} : t \leq \langle r \rangle\}$, and the subrank of $t$ is the size of the largest diagonal tensor to which $t$ restricts, $Q(t) = \max\{r \in \mathbb{N} : \langle r \rangle \leq t\}$. Asymptotic rank is defined as $\tilde{R}(t) = \lim_{n \to \infty} R(t^{\otimes n})^{1/n}$ and asymptotic subrank is defined as $\tilde{Q}(t) = \lim_{n \to \infty} Q(t^{\otimes n})^{1/n}$. From Fekete's lemma it follows that $\tilde{Q}(t) = \sup_n Q(t^{\otimes n})^{1/n}$ and $\tilde{R}(t) = \inf_n R(t^{\otimes n})^{1/n}$. One easily verifies that every spectral point $\phi \in X(S)$ is an upper bound on asymptotic subrank and a lower bound on asymptotic rank for any tensor $t \in S$:

$$\tilde{Q}(t) \leq \phi(t) \leq \tilde{R}(t).$$

Strassen used the completeness of $X(S)$ for $\lesssim$ to prove $\tilde{Q}(t) = \min_{\phi \in X(S)} \phi(t)$ and $\tilde{R}(t) = \max_{\phi \in X(S)} \phi(t)$. One should think of these expressions as being dual to the defining expressions for $\tilde{Q}$ and $\tilde{R}$.

We mentioned that Strassen was motivated to study the asymptotic spectrum of tensors by the study of the complexity of matrix multiplication. The precise connection with matrix multiplication is as follows. The matrix multiplication exponent $\omega$ is characterised by the asymptotic rank $\tilde{R}(\langle 2, 2, 2 \rangle)$ of the matrix multiplication tensor

$$\langle 2, 2, 2 \rangle = \sum_{i, j, k \in [2]} e_{ij} \otimes e_{jk} \otimes e_{ki} \in \mathbb{F}^4 \otimes \mathbb{F}^4 \otimes \mathbb{F}^4$$

via $\tilde{R}(\langle 2, 2, 2 \rangle) = 2^{\omega}$. We know the trivial lower bound $2 \leq \omega$, see Section 4.3. We know the (nontrivial) upper bound $\omega \leq 2.3728639$, which is by Coppersmith and Winograd [CW90] and improvements by Stothers [Sto10], Williams [Wil12] and Le Gall [LG14]. It may seem that for the study of matrix multiplication only the asymptotic rank $\tilde{R}$ is of interest and that the asymptotic subrank $\tilde{Q}$ is just a toy parameter. Asymptotic subrank, however, plays an important role in the currently best matrix multiplication algorithms. We will discuss this idea in the context of the asymptotic subrank of so-called complete graph tensors in Section 5.5.
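To make the matrix multiplication tensor concrete, the following Python sketch builds it as a sparse array and checks a standard property: contracting it with the entries of three matrices $A$, $B$, $C$ gives the trace of the product $ABC$. The sketch assumes that the basis vector $e_{ij}$ is stored at the 0-based flat position $i \cdot n + j$; the function names are illustrative.

```python
def matmul_tensor(n):
    """Sparse n x n matrix multiplication tensor; e_ij sits at index i*n + j."""
    return {(i * n + j, j * n + k, k * n + i): 1
            for i in range(n) for j in range(n) for k in range(n)}


def trilinear(t, a, b, c):
    """Evaluate the sum of t[x, y, z] * A_x * B_y * C_z over the support of t."""
    n = len(a)
    flat = lambda m: [m[i][j] for i in range(n) for j in range(n)]
    fa, fb, fc = flat(a), flat(b), flat(c)
    return sum(v * fa[x] * fb[y] * fc[z] for (x, y, z), v in t.items())


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace_of_product(a, b, c):
    """Trace of A*B*C computed through ordinary matrix products."""
    p = matmul(matmul(a, b), c)
    return sum(p[i][i] for i in range(len(p)))
```

For $n = 2$ the tensor has 8 nonzero entries, one for each scalar multiplication of the row-by-column algorithm; Strassen's base routine corresponds to writing the same tensor as a sum of 7 simple tensors.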

The important message is: understanding the asymptotic spectrum of tensors $X(S)$ means understanding asymptotic restriction $\lesssim$, the asymptotic subrank $\tilde{Q}$ and the asymptotic rank $\tilde{R}$ of tensors. Of course, we should now find an explicit description of $X(S)$.

Our main result regarding the asymptotic spectrum of tensors is the explicit description of an infinite family of elements in the asymptotic spectrum of all complex tensors $X(\{\text{complex } k\text{-tensors}\})$, which we call the quantum functionals (Chapter 6). Finding such an infinite family has been an open problem since the work of Strassen. Moment polytopes (studied under the name entanglement polytopes in quantum information theory [WDGC13]) play a key role here. To each tensor $t$ is associated a convex polytope $P(t)$ collecting representation-theoretic information about $t$, called the moment polytope of $t$. (See e.g. [Nes84, Bri87, WDGC13, SOK14].) The moment polytope has two important equivalent descriptions.

Quantum marginal spectra description. We begin with the description of $P(t)$ in terms of quantum marginal spectra.

Let $V$ be a (finite-dimensional) Hilbert space. In quantum information theory, a positive semidefinite hermitian operator $\rho : V \to V$ with trace one is called a density operator. The sequence of eigenvalues of a density operator $\rho$ is a probability vector. We let $\mathrm{spec}(\rho) = (p_1, \ldots, p_n)$ be the sequence of eigenvalues of $\rho$, ordered non-increasingly, $p_1 \geq \cdots \geq p_n$. Let $V_1$ and $V_2$ be Hilbert spaces. Given a density operator $\rho$ on $V_1 \otimes V_2$, the reduced density operator $\rho_1 = \mathrm{tr}_2\, \rho$ is uniquely defined by the property that $\mathrm{tr}(\rho_1 X_1) = \mathrm{tr}(\rho (X_1 \otimes \mathrm{Id}_{V_2}))$ for all operators $X_1$ on $V_1$. The operator $\rho_1$ is again a density operator. The operation $\mathrm{tr}_2$ is called the partial trace over $V_2$. In an explicit form, $\rho_1$ is given by $\langle e_i, \rho_1(e_j) \rangle = \sum_{\ell} \langle e_i \otimes f_\ell, \rho(e_j \otimes f_\ell) \rangle$, where the $e_i$ form a basis of $V_1$ and the $f_i$ form an orthonormal basis of $V_2$ (the statement is independent of basis choice).

Let $V_i$ be a Hilbert space and consider the tensor product $V_1 \otimes V_2 \otimes V_3$. Associate with $t \in V_1 \otimes V_2 \otimes V_3$ the dual element $t^* = \langle t, \cdot \rangle \in (V_1 \otimes V_2 \otimes V_3)^*$. Then $\rho^t = t t^* / \langle t, t \rangle = t \langle t, \cdot \rangle / \langle t, t \rangle$ is a density operator on $V_1 \otimes V_2 \otimes V_3$. Viewing $\rho^t$ as a density operator on the regrouped space $V_1 \otimes (V_2 \otimes V_3)$, we may take the partial trace of $\rho^t$ over $V_2 \otimes V_3$ as described above. We denote the resulting density operator by $\rho^t_1 = \mathrm{tr}_{23}\, \rho^t$. We similarly define $\rho^t_2$ and $\rho^t_3$.
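The reduced density operators can be computed directly from the entries of $t$. The sketch below is a simplified illustration under stated assumptions: the tensor has real rational entries (so no complex conjugation is needed), it is given in the standard orthonormal basis as a dictionary from 0-based index tuples to values, and the function name `marginal` is hypothetical. It uses the explicit form of the partial trace given above, which for $\rho^t$ reads $(\rho^t_1)_{ij} = \sum_{\ell} t_{i\ell}\, t_{j\ell} / \langle t, t \rangle$, where $\ell$ runs over the remaining indices.

```python
from fractions import Fraction


def marginal(t, shape, axis):
    """Reduced density matrix of rho^t on tensor leg `axis` (real entries)."""
    norm = sum(Fraction(v) ** 2 for v in t.values())
    d = shape[axis]
    rho = [[Fraction(0)] * d for _ in range(d)]
    for idx, v in t.items():
        for jdx, w in t.items():
            # A pair of entries contributes only if all other indices agree.
            if all(p == q for pos, (p, q) in enumerate(zip(idx, jdx))
                   if pos != axis):
                rho[idx[axis]][jdx[axis]] += Fraction(v) * Fraction(w)
    return [[x / norm for x in row] for row in rho]


# Example: the 3-tensor e0*e0*e1 + e0*e1*e0 + e1*e0*e0 (often called W).
W = {(0, 0, 1): 1, (0, 1, 0): 1, (1, 0, 0): 1}
rho_1 = marginal(W, (2, 2, 2), 0)
```

For this example each reduced density matrix is diagonal with entries $2/3$ and $1/3$, so the marginal spectra can be read off without an eigenvalue computation.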

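Let $V = V_1 \otimes V_2 \otimes V_3$. Let $G = \mathrm{GL}(V_1) \times \mathrm{GL}(V_2) \times \mathrm{GL}(V_3)$ act naturally on $V$. Let $t \in V \setminus 0$. The moment polytope of $t$ is

$$P(t) = P(\overline{G \cdot t}) = \{(\mathrm{spec}(\rho^u_1), \mathrm{spec}(\rho^u_2), \mathrm{spec}(\rho^u_3)) : u \in \overline{G \cdot t} \setminus 0\}.$$

Here $\overline{G \cdot t}$ denotes the Zariski closure, or equivalently the Euclidean closure, in $V$ of the orbit $G \cdot t = \{g \cdot t : g \in G\}$.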

Representation-theoretic description. On the other hand, there is a description of $P(t)$ in terms of non-vanishing of representation-theoretic multiplicities. We do not state this description here, but stress that it is crucial for our proofs.

Quantum functionals. For any probability vector $\theta \in \mathbb{R}^k$ (i.e. $\sum_{i=1}^{k} \theta(i) = 1$ and $\theta(i) \geq 0$ for all $i \in [k]$) we define the quantum functional $F^\theta$ as an optimisation over the moment polytope:

$$F^\theta(t) = \max\Bigl\{ 2^{\sum_{i=1}^{k} \theta(i) H(x^{(i)})} : (x^{(1)}, \ldots, x^{(k)}) \in P(t) \Bigr\}.$$

Here $H(y)$ denotes the Shannon entropy of the probability vector $y$. We prove that $F^\theta$ satisfies properties (a), (b), (c) and (d) for all complex $k$-tensors.

Theorem (Theorem 6.11). $F^\theta \in X(\{\text{complex } k\text{-tensors}\})$.


To put our result into context, Strassen in [Str91] constructed elements in the asymptotic spectrum of $S = \{\text{oblique } k\text{-tensors over } \mathbb{F}\}$ with the preorder $\leq|_S$. The set $S$ is a strict and non-generic subset of all $k$-tensors over $\mathbb{F}$. These elements we call the (Strassen) support functionals. On oblique tensors over $\mathbb{C}$, the quantum functionals and the support functionals coincide. An advantage of the support functionals over the quantum functionals is that they are defined over any field. In fact, the support functionals are "powerful enough" to reprove the result of Ellenberg–Gijswijt on cap sets [EG17]. We discuss the support functionals in Section 4.4.

1.3 Higher-order CW method

Recall that in the asymptotic restriction problem we have an obstruction direction and a construction direction. The quantum functionals and the support functionals provide obstructions. Now we look at the construction direction. Constructions are asymptotic transformations $s^{\otimes \beta n + o(n)} \geq t^{\otimes n}$. We restrict attention to the case that $t$ is a diagonal tensor $\langle r \rangle$. Constructions in this case essentially correspond to lower bounds on the asymptotic subrank $\tilde{Q}(s)$. The goal is now to construct good lower bounds on $\tilde{Q}(s)$.

Strassen solved the problem of computing the asymptotic subrank for so-called tight 3-tensors with the Coppersmith–Winograd (CW) method and the support functionals [CW90, Str91]. The CW method is combinatorial. Let us introduce the combinatorial viewpoint. Let $I_1, \ldots, I_k$ be finite sets. We call a set $D \subseteq I_1 \times \cdots \times I_k$ a diagonal if any two distinct elements $a, b \in D$ differ in all $k$ coordinates. Let $\Phi \subseteq I_1 \times \cdots \times I_k$. We call a diagonal $D \subseteq \Phi$ free if $D = \Phi \cap (D_1 \times \cdots \times D_k)$. Here $D_i = \{a_i : a \in D\}$ is the projection of $D$ onto the $i$th coordinate. The subrank $Q(\Phi)$ of $\Phi$ is the size of the largest free diagonal $D \subseteq \Phi$. For two sets $\Phi \subseteq I_1 \times \cdots \times I_k$ and $\Psi \subseteq J_1 \times \cdots \times J_k$ we define the product $\Phi \times \Psi \subseteq (I_1 \times J_1) \times \cdots \times (I_k \times J_k)$ by $\Phi \times \Psi = \{((a_1, b_1), \ldots, (a_k, b_k)) : a \in \Phi, b \in \Psi\}$. The asymptotic subrank is defined as $\tilde{Q}(\Phi) = \lim_{n \to \infty} Q(\Phi^{\times n})^{1/n}$. One may think of $\Phi$ as a $k$-partite hypergraph and of a free diagonal in $\Phi$ as an induced $k$-partite matching.
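The definitions of diagonal, free diagonal and combinatorial subrank can be tested on small examples with a brute-force search. The following Python sketch is only meant to illustrate the definitions (it is exponential in the size of $\Phi$ and is not the CW method); the function names are hypothetical.

```python
from itertools import combinations


def is_diagonal(d):
    """Any two distinct elements must differ in every coordinate."""
    return all(all(x != y for x, y in zip(a, b))
               for a, b in combinations(d, 2))


def is_free(d, phi):
    """Check D = Phi intersected with D_1 x ... x D_k for a nonempty D."""
    k = len(d[0])
    proj = [{a[i] for a in d} for i in range(k)]
    inside = {a for a in phi if all(a[i] in proj[i] for i in range(k))}
    return inside == set(d)


def subrank(phi):
    """Size of the largest free diagonal in phi, by exhaustive search."""
    phi = set(phi)
    for size in range(len(phi), 0, -1):
        for d in combinations(sorted(phi), size):
            if is_diagonal(d) and is_free(d, phi):
                return size
    return 0
```

For instance, $\{(0,0,0), (1,1,1)\}$ is itself a free diagonal, so its subrank is 2. Adding the element $(0,1,1)$ keeps $\{(0,0,0), (1,1,1)\}$ a diagonal but destroys freeness, because $(0,1,1)$ lies in the product of the projections, and the subrank drops to 1.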

How does this combinatorial version of subrank relate to the tensor version of subrank that we defined earlier? Let $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$. Expand $t$ in the standard basis, $t = \sum_{i \in [n_1] \times \cdots \times [n_k]} t_i \, e_{i_1} \otimes \cdots \otimes e_{i_k}$. Let $\mathrm{supp}(t)$ be the support of $t$ in the standard basis, $\mathrm{supp}(t) = \{i \in [n_1] \times \cdots \times [n_k] : t_i \neq 0\}$. Then $Q(\mathrm{supp}(t)) \leq Q(t)$.

We want to construct large free diagonals. Let $\Phi \subseteq I_1 \times \cdots \times I_k$. We call $\Phi$ tight if there are injective maps $\alpha_i : I_i \to \mathbb{Z}$ such that: if $a \in \Phi$, then $\sum_{i=1}^{k} \alpha_i(a_i) = 0$. For a set $X$, let $\mathcal{P}(X)$ be the set of probability distributions on $X$. For $\theta \in \mathcal{P}([k])$, let $H_\theta(\Phi) = \max_{P \in \mathcal{P}(\Phi)} \sum_{i=1}^{k} \theta(i) H(P_i)$, where $H(P_i)$ denotes the Shannon entropy of the $i$th marginal distribution of $P$. In [Str91], Strassen used the CW method and the support functionals to characterise the asymptotic subrank $\tilde{Q}(\Phi)$ for tight $\Phi \subseteq I_1 \times I_2 \times I_3$. He proved the following. Let $\Phi \subseteq I_1 \times I_2 \times I_3$ be tight. Then

$$\tilde{Q}(\Phi) = \min_{\theta \in \mathcal{P}([3])} 2^{H_\theta(\Phi)} = \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} 2^{H(P_i)}. \tag{1.1}$$

We study the higher-order regime $\Phi \subseteq I_1 \times \cdots \times I_k$, $k \geq 4$.

Theorem (Theorem 5.7). Let $\Phi \subseteq I_1 \times \cdots \times I_k$ be tight. Then $\tilde{Q}(\Phi)$ is lower bounded by an expression that generalizes the right-hand side of (1.1).

Stating the lower bound requires a few definitions, so we do not state it here. It is not known whether our new lower bound matches the upper bound given by quantum or support functionals.

Using Theorem 5.7 we managed to exactly determine the asymptotic subranks of several new examples. These results we in turn used to obtain upper bounds on the asymptotic rank of so-called complete graph tensors, via a higher-order Strassen laser method.

1.4 Abstract asymptotic spectra

Strassen mainly studied tensors, but he developed an abstract theory of asymptotic spectra in a general setting. In the next section we apply this abstract theory to graphs. We now introduce the abstract theory. One has a semiring $S$ (think of a semiring as a ring without additive inverses) that contains $\mathbb{N}$, and a preorder $\leq$ on $S$ that (1) behaves well with respect to the semiring operations, (2) induces the natural order on $\mathbb{N}$, and (3) for any $a, b \in S$, $b \neq 0$, there is an $r \in \mathbb{N} \subseteq S$ with $a \leq r \cdot b$. We call such a preorder a Strassen preorder. The main theorem is that the asymptotic version $\lesssim$ of the Strassen preorder is characterised by the monotone semiring homomorphisms $S \to \mathbb{R}_{\geq 0}$. For $a, b \in S$, let $a \lesssim b$ if there is a sequence $(x_n) \in \mathbb{N}^{\mathbb{N}}$ with $x_n^{1/n} \to 1$ when $n \to \infty$ and $a^n \leq b^n x_n$ for all $n \in \mathbb{N}$. Let

$$X = X(S, \leq) = \{\phi \in \mathrm{Hom}(S, \mathbb{R}_{\geq 0}) : \forall a, b \in S \;\; a \leq b \Rightarrow \phi(a) \leq \phi(b)\}.$$

The set $X$ is called the asymptotic spectrum of $(S, \leq)$.

Theorem (Strassen). $a \lesssim b$ iff $\forall \phi \in X \;\; \phi(a) \leq \phi(b)$.

Strassen applies this theorem to study rank and subrank of tensors. We define an abstract notion of rank $R(a) = \min\{n \in \mathbb{N} : a \leq n\}$ and an abstract notion of subrank $Q(a) = \max\{m \in \mathbb{N} : m \leq a\}$. We then naturally have an asymptotic rank $\tilde{R}(a) = \lim_{n \to \infty} R(a^n)^{1/n}$ and (under certain mild conditions) an asymptotic subrank $\tilde{Q}(a) = \lim_{n \to \infty} Q(a^n)^{1/n}$. In fact $\tilde{R}(a) = \inf_n R(a^n)^{1/n}$ and $\tilde{Q}(a) = \sup_n Q(a^n)^{1/n}$ by Fekete's lemma. The theorem implies the following dual characterisations.


Corollary (Section 2.8). If $a \in S$ with $a^k \geq 2$ for some $k \in \mathbb{N}$, then

$$\tilde{Q}(a) = \min_{\phi \in X} \phi(a).$$

If $a \in S$ with $\phi(a) \geq 1$ for some $\phi \in X$, then

$$\tilde{R}(a) = \max_{\phi \in X} \phi(a).$$

In Chapter 2 we will discuss the abstract theory of asymptotic spectra. We will discuss a proof of the above theorem that is obtained by integrating the proofs of Strassen in [Str88] and the proof of the Kadison–Dubois theorem of Becker and Schwarz in [BS83]. We will also discuss some basic properties of general asymptotic spectra.

1.5 The asymptotic spectrum of graphs

In the previous section we have seen the abstract theory of asymptotic spectra. We now discuss a problem in graph theory where we can apply this abstract theory. Consider a communication channel with input alphabet $\{a, b, c, d, e\}$ and output alphabet $\{1, 2, 3, 4, 5\}$. When the sender gives an input to the channel, the receiver gets an output according to the following diagram, where an outgoing arrow is picked randomly (say uniformly randomly).

[Figure: channel diagram with the inputs a, b, c, d, e on the left, the outputs 1, 2, 3, 4, 5 on the right, and arrows from each input to its possible outputs.]

Output 2 has an incoming arrow from $a$ and an incoming arrow from $b$. We say $a$ and $b$ are confusable, because the receiver cannot know whether $a$ or $b$ was given as an input to the channel. In this channel the pairs of inputs $\{a, b\}$, $\{b, c\}$, $\{c, d\}$, $\{d, e\}$, $\{e, a\}$ are confusable. If we restrict the input set to a subset of pairwise non-confusable letters, say $\{a, c\}$, then we can use the channel to communicate two messages with zero error. It is clear that for this channel any non-confusable set of inputs has size at most two. Can we make better use of the channel if we use the channel twice? Yes: now the input set is the set of two-letter words $\{aa, ab, ac, ad, ae, ba, bb, \ldots\}$ and we have a set of pairwise non-confusable words $\{aa, bc, ce, db, ed\}$, which has size 5. Thus "per channel use" we can send at least $\sqrt{5}$ letters. What happens if we use the channel $n$ times?
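The claims about this channel are small enough to verify by machine. The following Python sketch assumes only the list of confusable pairs stated in the text, which means that two distinct letters are confusable exactly when they are cyclic neighbours in the order a, b, c, d, e; the function names are illustrative.

```python
from itertools import combinations

LETTERS = "abcde"


def confusable(x, y):
    """Distinct letters are confusable when they are cyclic neighbours."""
    return (LETTERS.index(x) - LETTERS.index(y)) % 5 in (1, 4)


def words_confusable(u, v):
    """Distinct words are confusable if every position is equal or confusable."""
    return u != v and all(x == y or confusable(x, y) for x, y in zip(u, v))


def is_zero_error_code(words):
    """True if no two words in the list can be confused by the receiver."""
    return not any(words_confusable(u, v) for u, v in combinations(words, 2))


def largest_single_letter_code():
    """Brute-force the largest set of pairwise non-confusable letters."""
    for size in range(len(LETTERS), 0, -1):
        for subset in combinations(LETTERS, size):
            if is_zero_error_code(list(subset)):
                return size
    return 0
```

Calling `is_zero_error_code(["aa", "bc", "ce", "db", "ed"])` confirms that the five two-letter words from the text are pairwise non-confusable, and `largest_single_letter_code()` confirms that a single use of the channel allows at most two messages.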


The situation is concisely described by drawing the confusability graph of the channel, which has the input letters as vertices and the confusable pairs of input letters as edges. For the above channel the confusability graph is the 5-cycle $C_5$.

[Figure: the 5-cycle $C_5$ on the vertices a, b, c, d, e.]

A subset of inputs that are pairwise non-confusable corresponds to a subset of the vertices in the confusability graph that contains no edges, an independent set. The independence number of any graph $G$ is the size of the largest independent set in $G$, and is denoted by $\alpha(G)$. If $G$ is the confusability graph of some channel, then the confusability graph for using the channel $n$ times is denoted by $G^{\boxtimes n}$ (the graph product $\boxtimes$ is called the strong graph product). The question of how many letters we can send asymptotically translates to computing the limit

$$\Theta(G) = \lim_{n \to \infty} \alpha(G^{\boxtimes n})^{1/n},$$

which exists because $\alpha$ is supermultiplicative under $\boxtimes$. The parameter $\Theta(G)$ was introduced by Shannon [Sha56] and is called the Shannon capacity of the graph $G$. Computing the Shannon capacity is a nontrivial problem already for small graphs. Lovász in 1979 [Lov79] computed the value $\Theta(C_5) = \sqrt{5}$ by introducing and evaluating a new graph parameter $\vartheta$ which is now known as the Lovász theta number. Already for the 7-cycle $C_7$ the Shannon capacity is not known.

Duality theorem. We propose a new application of the abstract theory of asymptotic spectra to graph theory. The main theorem that results from this is a dual characterisation of the Shannon capacity of graphs. For graphs $G$ and $H$ we say $G \leq H$ if there is a graph homomorphism $\overline{G} \to \overline{H}$, i.e. from the complement of $G$ to the complement of $H$. We show graphs are a semiring under the strong graph product $\boxtimes$ and the disjoint union $\sqcup$, and $\leq$ is a Strassen preorder on this semiring. The rank in this setting is the clique cover number $\overline{\chi}(\cdot) = \chi(\overline{\,\cdot\,})$, i.e. the chromatic number of the complement. The subrank in this setting is the independence number $\alpha(\cdot)$. Let $X(\mathcal{G})$ be the set of semiring homomorphisms from graphs to $\mathbb{R}_{\geq 0}$ that are monotone under $\leq$. From the abstract theory of asymptotic spectra we derive the following duality theorem.

Theorem (Theorem 3.1). $\Theta(G) = \min_{\phi \in X(\mathcal{G})} \phi(G)$.

In Chapter 3 we will prove Theorem 3.1 and we will discuss the known elements in $X(\mathcal{G})$, which are the Lovász theta number and a family of parameters obtained by "fractionalising".


1.6 Tensor degeneration

We move to the second story line that we mentioned earlier: degeneration. Degeneration is a prominent theme in algebraic complexity theory. Roughly speaking, degeneration is an algebraic notion of approximation defined via orbit closures.

For tensors, for example, degeneration is defined as follows. Let $V_1, V_2, V_3$ be finite-dimensional complex vector spaces and let $V = V_1 \otimes V_2 \otimes V_3$ be the tensor product space. Let $G = \mathrm{GL}(V_1) \times \mathrm{GL}(V_2) \times \mathrm{GL}(V_3)$ act naturally on $V$. Let $s, t \in V$. Let $G \cdot t = \{g \cdot t : g \in G\}$ be the orbit of $t$ under $G$. We say $t$ degenerates to $s$ and write $t \trianglerighteq s$ if $s$ is an element in the orbit closure $\overline{G \cdot t}$. Here the closure is taken with respect to the Zariski topology, or equivalently with respect to the Euclidean topology. One should think of this degeneration as a topologically closed version of the restriction preorder $\leq$ for tensors that we defined earlier. Degeneration is a "larger" preorder than restriction in the sense that $s \leq t$ implies $s \trianglelefteq t$: every linear map is a limit of invertible linear maps, so every restriction of $t$ lies in the orbit closure of $t$.

In several algebraic models of computation, approximative computations correspond to certain degenerations. In some models, such an approximative computation can be turned into an exact computation at a small cost, for example using the method of interpolation. The currently fastest matrix multiplication algorithms are constructed in this way, for example.

On the other hand, it turns out that if a lower bound technique for an algebraic measure of complexity is "continuous", then the lower bounds obtained with this technique are already lower bounds on the approximative version of the complexity measure. This observation turns approximative complexity and degeneration into an interesting topic itself. A research program in this direction is the geometric complexity theory program of Mulmuley and Sohoni towards separating the algebraic complexity class VP (and related classes) from VNP [MS01] (see also [Ike13]).

In this section we briefly discuss three results related to degeneration of tensors that are not discussed further in this dissertation. Then we will discuss results on combinatorial degeneration in Section 1.7 and algebraic branching program degeneration in Section 1.8.

Ratio of tensor rank and border rank. The approximative or degeneration version of tensor rank is called border rank and is denoted by $\underline{\mathrm{R}}$. It has been known since the work of Bini and Strassen that tensor rank $\mathrm{R}$ and border rank $\underline{\mathrm{R}}$ are different. How much can they be different? In [Zui17] we showed the following lower bound. Let $k \geq 3$. There is a sequence of $k$-tensors $t_n$ in $(\mathbb{C}^{2^n})^{\otimes k}$ such that $\mathrm{R}(t_n)/\underline{\mathrm{R}}(t_n) \geq k - o(1)$ when $n \to \infty$. This answers a question of Landsberg and Michałek [LM16b] and disproves a conjecture of Rhodes [AJRS13]. Further progress will most likely require the construction of explicit tensors with high tensor rank, which has implications in formula complexity [Raz13].

Border support rank. Support rank is a variation on tensor rank, which has its own approximative version called border support rank. A border support rank upper bound for the matrix multiplication tensor yields an upper bound on the asymptotic complexity. This was shown by Cohn and Umans in the context of the group theoretic approach towards fast matrix multiplication [CU13]. They asked: what is the border support rank of the smallest matrix multiplication tensor $\langle 2, 2, 2 \rangle$? In [BCZ17a] we showed that it equals seven. Our proof uses the highest-weight vector technique (see also [HIL13]). Our original motivation to study support rank is a connection that we found between support rank and nondeterministic multiparty quantum communication complexity [BCZ17b].

Tensor rank under outer tensor product. We applied degeneration as a tool to study an outer tensor product on tensors. For $s \in \mathbb{C}^{n_1} \otimes \cdots \otimes \mathbb{C}^{n_k}$ and $t \in \mathbb{C}^{m_1} \otimes \cdots \otimes \mathbb{C}^{m_\ell}$, the outer tensor product of $s$ and $t$ is the natural $(k + \ell)$-tensor in $\mathbb{C}^{n_1} \otimes \cdots \otimes \mathbb{C}^{n_k} \otimes \mathbb{C}^{m_1} \otimes \cdots \otimes \mathbb{C}^{m_\ell}$. The outer tensor product and the usual tensor product of $k$-tensors differ by a regrouping of the tensor indices. It is well known that tensor rank is not multiplicative under the usual tensor product. In [CJZ18] we showed that tensor rank is already not multiplicative under the outer tensor product, a stronger result. Nonmultiplicativity occurs when taking a power of a tensor whose border rank is strictly smaller than its tensor rank. This answers a question of Draisma [Dra15] and Saptharishi et al. [CKSV16].

1.7 Combinatorial degeneration

In the previous section we introduced the general idea of degeneration and discussed degeneration of tensors. Combinatorial degeneration is the combinatorial analogue of tensor degeneration. Consider sets $\Phi \subseteq \Psi \subseteq I_1 \times \cdots \times I_k$ of $k$-tuples. We say $\Phi$ is a combinatorial degeneration of $\Psi$, and write $\Psi \trianglerighteq \Phi$, if there are maps $u_i : I_i \to \mathbb{Z}$ such that for all $\alpha \in I_1 \times \cdots \times I_k$, if $\alpha \in \Psi \setminus \Phi$, then

\[ \sum_{i=1}^{k} u_i(\alpha_i) > 0, \]

and if $\alpha \in \Phi$, then $\sum_{i=1}^{k} u_i(\alpha_i) = 0$. We prove that combinatorial asymptotic subrank is nonincreasing under combinatorial degeneration.

Theorem (Theorem 5.21). If $\Psi \trianglerighteq \Phi$, then $\widetilde{\mathrm{Q}}(\Psi) \geq \widetilde{\mathrm{Q}}(\Phi)$.

The analogous statement for subrank of tensors is trivially true. The crucial point is that Theorem 5.21 is about combinatorial subrank. As an example, Theorem 5.21 combined with the CW method yields an elegant optimal construction of tri-colored sum-free sets, which are combinatorial objects related to cap sets.
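The definition of combinatorial degeneration can be checked mechanically once candidate maps $u_i$ are given. The following is a minimal sketch, not taken from the dissertation: the function name, the example sets `psi` and `phi`, and the weight map `u` are all hypothetical choices made for illustration. It only verifies a proposed witness; it does not search for one.

```python
from itertools import product

def is_combinatorial_degeneration(psi, phi, index_sets, maps):
    """Check the two defining conditions for given witness maps u_i (dicts I_i -> Z)."""
    psi, phi = set(psi), set(phi)
    if not phi <= psi:
        return False
    for alpha in product(*index_sets):
        weight = sum(u[a] for u, a in zip(maps, alpha))
        if alpha in phi and weight != 0:
            return False          # tuples in Phi must have total weight 0
        if alpha in psi - phi and weight <= 0:
            return False          # tuples in Psi \ Phi must have positive weight
    return True

# Hypothetical example with k = 3 and I_1 = I_2 = I_3 = {0, 1, 2}.
I = [range(3)] * 3
psi = {a for a in product(*I) if sum(a) == 2}
phi = {(1, 1, 0), (1, 0, 1), (0, 1, 1)}
u = {0: 0, 1: 0, 2: 1}            # assumed weight map, used for every coordinate
zero = {0: 0, 1: 0, 2: 0}

witness_ok = is_combinatorial_degeneration(psi, phi, I, [u, u, u])
zero_ok = is_combinatorial_degeneration(psi, phi, I, [zero, zero, zero])
print(witness_ok, zero_ok)
```

Here the map `u` gives weight 0 to the three tuples of `phi` and weight 1 to the permutations of (2, 0, 0), so it is a valid witness, while the all-zero map is not.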

1.8 Algebraic branching program degeneration

We now consider degeneration in the context of algebraic branching programs. A central theme in algebraic complexity theory is the study of the power of different algebraic models of computation and the study of the corresponding complexity classes. We have already (implicitly) used an algebraic model of computation when we discussed matrix multiplication: circuits.


• A circuit is a directed acyclic graph $G$ with one or more source vertices and one sink vertex. Each source vertex is labelled by a variable $x_i$ or a constant $\alpha \in \mathbb{F}$. The other vertices are labelled by either $+$ or $\times$ and have in-degree 2 (that is, fan-in 2). Each vertex of $G$ naturally computes a polynomial. The value of $G$ is the element computed at the sink vertex. The size of $G$ is the number of vertices. (One may also allow multiple sink vertices in order to compute multiple polynomials, e.g. to compute matrix multiplication.) Here is an example of a circuit computing $xy + 2x + y - 1$.

[Figure: a circuit with source vertices $-1$, $2$, $x$, $y$, two $\times$ gates, two $+$ gates, and a final $+$ gate at the sink vertex.]
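The circuit model can be made concrete with a small evaluator. This is a minimal sketch under stated assumptions: the wiring in `CIRCUIT` is a hypothetical circuit for $xy + 2x + y - 1$ with the same sources and gate counts as described above, not necessarily the wiring of the original figure, and the names `CIRCUIT`, `evaluate` and `size` are illustrative.

```python
import operator

# Hypothetical wiring: each gate maps to (operation, left input, right input).
# Inputs that are not gates are sources: variables or integer constants.
CIRCUIT = {
    "m1": ("*", "x", "y"),      # x * y
    "m2": ("*", "2", "x"),      # 2 * x
    "p1": ("+", "m1", "m2"),    # xy + 2x
    "p2": ("+", "y", "-1"),     # y - 1
    "out": ("+", "p1", "p2"),   # sink vertex
}
OPS = {"+": operator.add, "*": operator.mul}

def evaluate(circuit, sink, assignment):
    """Evaluate the polynomial computed at `sink` for the given variable values."""
    cache = {}
    def value(v):
        if v in assignment:
            return assignment[v]          # variable source
        if v not in circuit:
            return int(v)                 # constant source
        if v not in cache:
            op, left, right = circuit[v]
            cache[v] = OPS[op](value(left), value(right))
        return cache[v]
    return value(sink)

def size(circuit):
    """Number of vertices: gates plus distinct sources."""
    sources = {w for _, l, r in circuit.values() for w in (l, r) if w not in circuit}
    return len(circuit) + len(sources)

agrees = all(
    evaluate(CIRCUIT, "out", {"x": x, "y": y}) == x * y + 2 * x + y - 1
    for x in range(-3, 4) for y in range(-3, 4)
)
circuit_size = size(CIRCUIT)
print(agrees, circuit_size)
```

The check compares the circuit with the target polynomial on a grid of integer points; the size counts the four sources and five gates.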

Consider the following two models.

• A formula is a circuit whose graph is a tree.

• An algebraic branching program (abp) is a directed acyclic graph $G$ with one source vertex $s$, one sink vertex $t$, and affine linear forms over the base field $\mathbb{F}$ as edge labels. Moreover, each vertex is labeled with an integer (its layer) and the arrows in the abp point from vertices in layer $i$ to vertices in layer $i + 1$. The cardinality of the largest layer we call the width of the abp. The number of vertices we call the size of the abp. The value of an abp is the sum of the values of all $s$–$t$ paths, where the value of an $s$–$t$ path is the product of its edge labels. We say that an abp computes its value. Here is an example of a width-3 abp computing $xy + 2x + y - 1$.

[Figure: a width-3 abp with source vertex $s$, sink vertex $t$, and edge labels $x$, $2$, $x$, $y$, $-1$.]
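The value of an abp, the sum over all $s$–$t$ paths of the product of the edge labels, can be computed layer by layer. The sketch below makes several assumptions that are not in the text: the abp in `LAYERS` and `EDGES` is a hypothetical width-2 abp for $xy + 2x + y - 1$ (not the width-3 abp of the figure), and an affine linear form is encoded as a pair (constant, dictionary of variable coefficients).

```python
def abp_value(layers, edges, point):
    """Sum over all s-t paths of the product of edge labels, evaluated at `point`."""
    def label(form):
        const, linear = form
        return const + sum(c * point[var] for var, c in linear.items())

    value = {layers[0][0]: 1}             # the empty path at s has value 1
    for prev, cur in zip(layers, layers[1:]):
        for v in cur:
            value[v] = sum(
                value[u] * label(edges[(u, v)]) for u in prev if (u, v) in edges
            )
    return value[layers[-1][0]]

# Hypothetical abp: two paths, s -> a -> t and s -> b -> t.
LAYERS = [["s"], ["a", "b"], ["t"]]
EDGES = {
    ("s", "a"): (0, {"x": 1}),    # label x
    ("a", "t"): (2, {"y": 1}),    # label y + 2
    ("s", "b"): (1, {}),          # label 1
    ("b", "t"): (-1, {"y": 1}),   # label y - 1
}

width = max(len(layer) for layer in LAYERS)
abp_agrees = all(
    abp_value(LAYERS, EDGES, {"x": x, "y": y}) == x * y + 2 * x + y - 1
    for x in range(-3, 4) for y in range(-3, 4)
)
print(width, abp_agrees)
```

The two paths contribute $x(y + 2)$ and $1 \cdot (y - 1)$, which sum to the target polynomial. Processing one layer at a time is the same as multiplying the matrices of edge labels between consecutive layers.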


The above models of computation give rise to complexity classes. A complexity class consists of families of multivariate polynomials $(f_n)_n = (f_n(x_1, \ldots, x_{q_n}))_{n \in \mathbb{N}}$ over some fixed field $\mathbb{F}$. We say a family of polynomials $(f_n)_n$ is a p-family if the degree of $f_n$ and the number of variables of $f_n$ grow polynomially in $n$. Let $\mathrm{VP}$ be the class of p-families with polynomially bounded circuit size. Let $\mathrm{VP}_e$ be the class of p-families with polynomially bounded formula size. For $k \in \mathbb{N}$ let $\mathrm{VP}_k$ be the class of families of polynomials computable by width-$k$ abps of polynomially bounded size. Let $\mathrm{VP}_s$ be the class of p-families computable by skew circuits of polynomial size. Skew circuits are a type of circuits between formulas and general circuits. The class $\mathrm{VP}_s$ coincides with the class of families of polynomials computable by abps of polynomially bounded size (see e.g. [Sap16]). Ben-Or and Cleve proved that $\mathrm{VP}_3 = \mathrm{VP}_4 = \cdots = \mathrm{VP}_e$ [BOC92]. Allender and Wang proved $\mathrm{VP}_2 \subsetneq \mathrm{VP}_3$ [AW16]. Thus $\mathrm{VP}_2 \subsetneq \mathrm{VP}_3 = \mathrm{VP}_4 = \cdots = \mathrm{VP}_e \subseteq \mathrm{VP}_s$.

The following separation problem is one of the many open problems regarding algebraic complexity classes: is the inclusion $\mathrm{VP}_e \subseteq \mathrm{VP}_s$ strict? Motivated by this separation problem we study the approximation closure of $\mathrm{VP}_e$. We mentioned that Ben-Or and Cleve proved that formula size is polynomially equivalent to width-3 abp size [BOC92]. Regarding width 2, there are explicit polynomials that cannot be computed by any width-2 abp of any size [AW16]. The abp model has a natural notion of approximation. When we allow approximation in our abps the situation changes completely.

Theorem (Theorem 7.8). Any polynomial can be approximated by a width-2 abp of size polynomial in the formula size.

In terms of complexity classes this means $\overline{\mathrm{VP}_2} = \overline{\mathrm{VP}_e}$, where $\overline{\,\cdot\,}$ denotes the "approximation closure" of the complexity class. The theorem suggests an approach regarding the separation of $\mathrm{VP}_e$ and $\mathrm{VP}_s$. Namely, superpolynomial lower bounds on formula size may be obtained from superpolynomial lower bounds on approximate width-2 abp size. We moreover study the nondeterminism closure of complexity classes and prove a new characterisation of the complexity class $\mathrm{VNP}$.

1.9 Organisation

This dissertation is divided into chapters as follows. We will begin with the abstract theory of asymptotic spectra in Chapter 2. Then we introduce the asymptotic spectra of graphs and a new characterisation of the Shannon capacity in Chapter 3. In Chapter 4 we introduce the asymptotic spectrum of tensors, discuss the support functionals of Strassen for oblique tensors, and a characterisation of asymptotic slice rank of oblique tensors as the minimum over the support functionals. In Chapter 5 we discuss tight tensors, the higher-order Coppersmith–Winograd method, the combinatorial degeneration method, and applications to the cap set problem, type sets and graph tensors. In Chapter 6 we introduce an infinite family of elements in the asymptotic spectrum of complex $k$-tensors and characterise the asymptotic slice rank as the minimum over the quantum functionals. Finally, in Chapter 7 we study algebraic branching programs, and approximation closure and nondeterminism closure of algebraic complexity classes.

Chapter 2

The theory of asymptotic spectra

2.1 Introduction

This is an expository chapter about the abstract theory of asymptotic spectra of Volker Strassen [Str88]. The theory studies semirings $S$ that are endowed with a preorder $\leqslant$. The main result, Theorem 2.12, is that under certain conditions the asymptotic version $\lesssim$ of this preorder is characterised by the semiring homomorphisms $S \to \mathbb{R}_{\geq 0}$ that are monotone under $\leqslant$. These monotone homomorphisms make up the "asymptotic spectrum" of $(S, \leqslant)$. For the elements of $S$ we have natural notions of rank and subrank, generalising rank and subrank of tensors. The asymptotic spectrum gives a dual characterisation of the asymptotic versions of rank and subrank. This dual description may be thought of as a "lower bound" method in the sense of computational complexity theory. In Chapter 3 and Chapter 4 we will study two specific pairs $(S, \leqslant)$.

2.2 Semirings and preorders

A (commutative) semiring is a set $S$ with a binary addition operation $+$, a binary multiplication operation $\cdot$, and elements $0, 1 \in S$, such that for all $a, b, c \in S$:

(1) $+$ is associative: $(a + b) + c = a + (b + c)$

(2) $+$ is commutative: $a + b = b + a$

(3) $0 + a = a$

(4) $\cdot$ is associative: $(a \cdot b) \cdot c = a \cdot (b \cdot c)$

(5) $\cdot$ is commutative: $a \cdot b = b \cdot a$

(6) $1 \cdot a = a$

(7) $\cdot$ distributes over $+$: $a \cdot (b + c) = (a \cdot b) + (a \cdot c)$

(8) $0 \cdot a = 0$

As usual we abbreviate $a \cdot b$ as $ab$. A preorder is a relation $\preccurlyeq$ on a set $X$ such that for all $a, b, c \in X$:

(1) $\preccurlyeq$ is reflexive: $a \preccurlyeq a$

(2) $\preccurlyeq$ is transitive: $a \preccurlyeq b$ and $b \preccurlyeq c$ implies $a \preccurlyeq c$

As usual $a \preccurlyeq b$ is the same as $b \succcurlyeq a$. Let $\mathbb{N} = \{0, 1, 2, \ldots\}$ be the set of natural numbers and let $\mathbb{N}_{>0} = \{1, 2, \ldots\}$ be the set of strictly-positive natural numbers. We write $\leq$ for the natural order $0 \leq 1 \leq 2 \leq 3 \leq \cdots$ on $\mathbb{N}$.

2.3 Strassen preorders

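Let $S$ be a semiring with $\mathbb{N} \subseteq S$. A preorder $\preccurlyeq$ on $S$ is a Strassen preorder if

(1) $\forall n, m \in \mathbb{N}$: $n \leq m$ iff $n \preccurlyeq m$

(2) $\forall a, b, c, d \in S$: if $a \preccurlyeq b$ and $c \preccurlyeq d$, then $a + c \preccurlyeq b + d$ and $ac \preccurlyeq bd$

(3) $\forall a, b \in S$, $b \neq 0$, $\exists r \in \mathbb{N}$: $a \preccurlyeq rb$.

Note that condition (2) is equivalent to the condition: $\forall a, b, s \in S$, if $a \preccurlyeq b$, then $a + s \preccurlyeq b + s$ and $as \preccurlyeq bs$.

Let $\preccurlyeq$ be a Strassen preorder on $S$. Then $0 \preccurlyeq 1$ by condition (1). For $a \in S$ we have $a \preccurlyeq a$ by reflexivity and thus $0 \preccurlyeq a$ by condition (2).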

Examples

We give two examples of a semiring with a Strassen preorder. Proofs and formal definitions are given later.

Graphs. Let $S$ be the set of all (isomorphism classes of) finite simple graphs. Let $G, H \in S$. Let $G \sqcup H$ be the disjoint union of $G$ and $H$. Let $G \boxtimes H$ be the strong graph product of $G$ and $H$ (see Chapter 3). With addition $\sqcup$ and multiplication $\boxtimes$ the set $S$ becomes a semiring. The 0 in $S$ is the graph with no vertices and the 1 in $S$ is the graph with a single vertex. Let $\overline{G}$ be the complement of $G$. Define a preorder $\leqslant$ on $S$ by $G \leqslant H$ if there is a graph homomorphism $\overline{G} \to \overline{H}$. Then $\leqslant$ is a Strassen preorder. We will investigate this semiring further in Chapter 3.
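For very small graphs the preorder on graphs can be tested by brute force over all vertex maps. The following sketch is illustrative only: the encoding of a graph as a pair (number of vertices, edge set) and all function names are assumptions, and the search is exponential in the number of vertices. The natural number $r$ is represented by the graph with $r$ vertices and no edges, and the sketch finds the smallest $r$ with $G \leqslant r$ and the largest $r$ with $r \leqslant G$ for the 5-cycle.

```python
from itertools import combinations, product

def complement(graph):
    n, edges = graph
    present = {frozenset(e) for e in edges}
    return n, {frozenset(p) for p in combinations(range(n), 2)} - present

def has_homomorphism(source, target):
    """Brute force: is there a vertex map sending every edge to an edge?"""
    n, edges = source
    m, target_edges = target
    target_edges = {frozenset(e) for e in target_edges}
    return any(
        all(frozenset((f[u], f[v])) in target_edges for u, v in map(tuple, edges))
        for f in product(range(m), repeat=n)
    )

def leq(G, H):
    """G <= H iff there is a homomorphism from complement(G) to complement(H)."""
    return has_homomorphism(complement(G), complement(H))

def number(r):
    return r, set()               # r isolated vertices: the element r of the semiring

def smallest_upper(G):
    return next(r for r in range(G[0] + 1) if leq(G, number(r)))

def largest_lower(G):
    return next(r for r in range(G[0], -1, -1) if leq(number(r), G))

C5 = (5, {(i, (i + 1) % 5) for i in range(5)})
upper, lower = smallest_upper(C5), largest_lower(C5)
print(upper, lower)
```

For the 5-cycle the two values are the clique cover number 3 and the independence number 2, in line with the description of rank and subrank for graphs.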


Tensors. Let $\mathbb{F}$ be a field. Let $k \in \mathbb{N}$. Let $S$ be the set of all $k$-tensors over $\mathbb{F}$ with arbitrary format, that is, $S = \bigcup \{ \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k} : n_1, \ldots, n_k \in \mathbb{N} \}$. For $s \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ and $t \in \mathbb{F}^{m_1} \otimes \cdots \otimes \mathbb{F}^{m_k}$ let $s \leqslant t$ if there are linear maps $A_i : \mathbb{F}^{m_i} \to \mathbb{F}^{n_i}$ with $(A_1 \otimes \cdots \otimes A_k) t = s$. We identify any $s, t \in S$ for which $s \leqslant t$ and $t \leqslant s$. Let $\oplus$ be the direct sum of $k$-tensors and let $\otimes$ be the tensor product of $k$-tensors (see Chapter 4). With addition $\oplus$ and multiplication $\otimes$ the set $S$ becomes a semiring. The 0 in $S$ is the zero tensor and the 1 in $S$ is the standard basis element $e_1 \otimes \cdots \otimes e_1 \in \mathbb{F}^1 \otimes \cdots \otimes \mathbb{F}^1$. The preorder $\leqslant$ is a Strassen preorder. We will investigate this semiring further in Chapter 4, Chapter 5 and Chapter 6.

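2.4 Asymptotic preorders $\precsim$

Definition 2.1. Let $\preccurlyeq$ be a relation on $S$. Define the relation $\precsim$ on $S$ by

\[ a_2 \precsim a_1 \quad \text{if} \quad \exists (x_N) \in \mathbb{N}^{\mathbb{N}},\ \inf_N x_N^{1/N} = 1,\ \forall N \in \mathbb{N}:\ a_2^N \preccurlyeq a_1^N x_N. \tag{2.1} \]

If $\preccurlyeq$ is a Strassen preorder, then we may in (2.1) replace the infimum $\inf_N x_N^{1/N}$ by the limit $\lim_{N \to \infty} x_N^{1/N}$, since we may assume $x_{N+M} \leq x_N x_M$ (if $a_2^N \preccurlyeq a_1^N x_N$ and $a_2^M \preccurlyeq a_1^M x_M$, then $a_2^{N+M} \preccurlyeq a_1^{N+M} x_N x_M$) and then apply Fekete's lemma (Lemma 2.2).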

Lemma 2.2 (Fekete's lemma, see [PS98, No. 98]). Let $x_1, x_2, x_3, \ldots \in \mathbb{R}_{\geq 0}$ satisfy $x_{n+m} \leq x_n + x_m$. Then $\lim_{n \to \infty} x_n / n = \inf_n x_n / n$.

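Proof. Let $y = \inf_n x_n / n$. Let $\varepsilon > 0$. Let $m \in \mathbb{N}_{>0}$ with $x_m / m < y + \varepsilon$. Any $n \in \mathbb{N}$ can be written in the form $n = qm + r$ where $r$ is an integer $0 \leq r \leq m - 1$. Set $x_0 = 0$. Then $x_n = x_{qm+r} \leq x_m + x_m + \cdots + x_m + x_r = q x_m + x_r$. Therefore

\[ \frac{x_n}{n} = \frac{x_{qm+r}}{qm+r} \leq \frac{q x_m + x_r}{qm+r} = \frac{x_m}{m} \cdot \frac{qm}{qm+r} + \frac{x_r}{n}. \]

Thus, for $n \geq m$ (so that $q \geq 1$),

\[ y \leq \frac{x_n}{n} < (y + \varepsilon) \frac{qm}{n} + \frac{x_r}{n}. \]

The claim follows because $x_r / n \to 0$ and $qm / n \to 1$ when $n \to \infty$.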
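A quick numerical illustration of Fekete's lemma, as a sketch only: the sequence $x_n = 2n + \sqrt{n}$ is a hypothetical example chosen here because it is subadditive (the square root is subadditive), and its ratio $x_n / n = 2 + 1/\sqrt{n}$ decreases to the infimum 2 without attaining it.

```python
from math import sqrt

def x(n):
    # Hypothetical subadditive sequence: sqrt(n + m) <= sqrt(n) + sqrt(m).
    return 2 * n + sqrt(n)

# Check subadditivity on a finite range (small tolerance for rounding).
subadditive = all(
    x(n + m) <= x(n) + x(m) + 1e-12
    for n in range(1, 200) for m in range(1, 200)
)

# The ratios x_n / n decrease towards inf_n x_n / n = 2.
ratios = [x(n) / n for n in (1, 10, 100, 10_000, 1_000_000)]
print(subadditive, ratios)
```

The finite check is of course not a proof; it only shows the behaviour that the lemma guarantees in general.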

For $a_1, a_2 \in S$, if $a_1 \preccurlyeq a_2$, then clearly $a_1 \precsim a_2$.

Lemma 2.3. Let $\preccurlyeq$ be a Strassen preorder on $S$. Then $\precsim$ is a Strassen preorder on $S$, the "asymptotic preorder" corresponding to $\preccurlyeq$.

Proof. Let $a, b, c, d \in S$. We verify that $\precsim$ is a preorder.

First, reflexivity. We have $a \preccurlyeq a$, so $a^N \preccurlyeq a^N \cdot 1$, so $a \precsim a$.

Second, transitivity. Let $a \precsim b$ and $b \precsim c$. This means $a^N \preccurlyeq b^N x_N$ and $b^N \preccurlyeq c^N y_N$ with $x_N^{1/N} \to 1$ and $y_N^{1/N} \to 1$. Then $a^N \preccurlyeq b^N x_N \preccurlyeq c^N x_N y_N$. Since $(x_N y_N)^{1/N} \to 1$, we conclude $a \precsim c$.

We verify condition (1). Let $n, m \in \mathbb{N}$. If $n \leq m$, then $n \preccurlyeq m$, so $n \precsim m$. If $n \precsim m$, then $n^N \preccurlyeq m^N x_N$, so $n^N \leq m^N x_N$, which implies $n \leq m$.

We verify condition (2). Let $a \precsim b$ and $c \precsim d$. This means $a^N \preccurlyeq b^N x_N$ and $c^N \preccurlyeq d^N y_N$. Thus $a^N c^N \preccurlyeq b^N d^N x_N y_N$ and so $ac \precsim bd$. Assume $x_N$ and $y_N$ are nondecreasing (otherwise set $x_N = \max_{n \leq N} x_n$). Then

\[ (a + c)^N = \sum_{m=0}^{N} \binom{N}{m} a^m c^{N-m} \preccurlyeq \sum_{m=0}^{N} \binom{N}{m} b^m d^{N-m} x_m y_{N-m} \preccurlyeq \sum_{m=0}^{N} \binom{N}{m} b^m d^{N-m} x_N y_N = (b + d)^N x_N y_N. \]

Thus $a + c \precsim b + d$.

We verify (3). Let $a, b \in S$, $b \neq 0$. Then there is an $r \in \mathbb{N}$ with $a \preccurlyeq rb$ and thus $a \precsim rb$.

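Lemma 2.4. Let $\preccurlyeq$ be a Strassen preorder on $S$. Let $a_1, a_2, b \in S$.

(i) If $a_2 + b \preccurlyeq a_1 + b$, then $a_2 \precsim a_1$.

(ii) If $a_2 b \preccurlyeq a_1 b$ with $b \neq 0$, then $a_2 \precsim a_1$.

(iii) If $a_2 \mathrel{\underset{\sim}{\precsim}} a_1$, then $a_2 \precsim a_1$. (Here $\underset{\sim}{\precsim}$ is the asymptotic preorder corresponding to $\precsim$.)

(iv) If $\exists s \in S\ \forall n \in \mathbb{N}$: $n a_2 \preccurlyeq n a_1 + s$, then $a_2 \precsim a_1$.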

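Proof. (ii) Let $a_2 b \preccurlyeq a_1 b$. By an inductive argument similar to the argument we used to prove (2.4),

\[ \forall N \in \mathbb{N}: \quad a_2^N b \preccurlyeq a_1^N b. \tag{2.2} \]

Let $m, r \in \mathbb{N}$ with $1 \preccurlyeq mb \preccurlyeq r$. (We use $b \neq 0$.) From (2.2) follows

\[ \forall N \in \mathbb{N}: \quad a_2^N \preccurlyeq a_2^N m b \preccurlyeq a_1^N m b \preccurlyeq a_1^N r. \]

Thus we conclude $a_2 \precsim a_1$.

(iii) Let $a_2 \mathrel{\underset{\sim}{\precsim}} a_1$. This means $a_2^N \precsim a_1^N x_N$ with $x_N^{1/N} \to 1$. This in turn means that $(a_2^N)^M \preccurlyeq (a_1^N x_N)^M y_{N,M}$ with $\forall N \in \mathbb{N}$: $y_{N,M}^{1/M} \to 1$, that is,

\[ a_2^{NM} \preccurlyeq a_1^{NM} x_N^M y_{N,M}. \]

Choose a sequence $N \mapsto M_N$ such that $(y_{N,M_N})^{1/M_N} \leq 2$, e.g. given $N$ let $M_N$ be the smallest $M$ for which $(y_{N,M})^{1/M} \leq 2$. Then $a_2^{N M_N} \preccurlyeq a_1^{N M_N} x_N^{M_N} y_{N,M_N}$ and

\[ (x_N^{M_N} y_{N,M_N})^{1/(N M_N)} = x_N^{1/N} (y_{N,M_N})^{1/(N M_N)} \leq x_N^{1/N} 2^{1/N} \to 1. \]

We conclude $a_2 \precsim a_1$.

(iv) Let $s \in S$ with $\forall n \in \mathbb{N}$: $n a_2 \preccurlyeq n a_1 + s$. We may assume $a_1 \neq 0$. Let $k \in \mathbb{N}$ with $s \preccurlyeq k a_1$. Then

\[ \forall n \in \mathbb{N}: \quad k n a_2 \preccurlyeq k n a_1 + k a_1 = k a_1 (n + 1). \tag{2.3} \]

Apply (ii) to (2.3) to get

\[ \forall n \in \mathbb{N}: \quad a_2 n \precsim a_1 (n + 1). \]

By an inductive argument,

\[ \forall N \in \mathbb{N}: \quad a_2^N \precsim a_2^{N-1} a_1 \cdot 2 \precsim a_2^{N-2} a_1^2 \cdot 3 \precsim \cdots \precsim a_1^N (N + 1). \]

Since $(N + 1)^{1/N} \to 1$, we have $a_2 \mathrel{\underset{\sim}{\precsim}} a_1$. From (iii) follows $a_2 \precsim a_1$.

(i) Let $a_2 + b \preccurlyeq a_1 + b$. We first prove

\[ \forall q \in \mathbb{N}: \quad q a_2 + b \preccurlyeq q a_1 + b. \tag{2.4} \]

By assumption, the statement is true for $q = 1$. Suppose the statement is true for $q - 1$. Then

\[ q a_2 + b = (q - 1) a_2 + (a_2 + b) \preccurlyeq (q - 1) a_2 + (a_1 + b) = ((q - 1) a_2 + b) + a_1 \preccurlyeq ((q - 1) a_1 + b) + a_1 = q a_1 + b, \]

which proves the statement by induction. Then $\forall n \in \mathbb{N}$: $n a_2 \preccurlyeq n a_1 + b$. From (iv) follows $a_2 \precsim a_1$.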

2.5 Maximal Strassen preorders

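Let $\mathcal{P}$ be the set of Strassen preorders on $S$. For $\preccurlyeq_1, \preccurlyeq_2 \in \mathcal{P}$ we write $\preccurlyeq_2 \subseteq \preccurlyeq_1$ if for all $a, b \in S$: $a \preccurlyeq_2 b$ implies $a \preccurlyeq_1 b$. (The notation $\preccurlyeq_2 \subseteq \preccurlyeq_1$ is natural if we think of the relations $\preccurlyeq_i$ as sets of pairs $(a, b)$ with $a \preccurlyeq_i b$.)

Lemma 2.5. Let $\preccurlyeq \in \mathcal{P}$ with $\preccurlyeq = \precsim$ and $a_2 \not\preccurlyeq a_1$. Then there is an element $\preccurlyeq_{a_1 a_2} \in \mathcal{P}$ with $\preccurlyeq \subseteq \preccurlyeq_{a_1 a_2}$ and $a_1 \preccurlyeq_{a_1 a_2} a_2$.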

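Proof. For $x_1, x_2 \in S$, let

\[ x_1 \preccurlyeq_{a_1 a_2} x_2 \quad \text{if} \quad \exists s \in S:\ x_1 + s a_2 \preccurlyeq x_2 + s a_1. \]

The relation $\preccurlyeq_{a_1 a_2}$ is reflexive, since $x + 0 \cdot a_2 \preccurlyeq x + 0 \cdot a_1$. The relation $\preccurlyeq_{a_1 a_2}$ is transitive: if $x_1 \preccurlyeq_{a_1 a_2} x_2$ and $x_2 \preccurlyeq_{a_1 a_2} x_3$, then $x_1 + s a_2 \preccurlyeq x_2 + s a_1$ and $x_2 + t a_2 \preccurlyeq x_3 + t a_1$ for some $s, t \in S$, and so $x_1 + (t + s) a_2 \preccurlyeq x_2 + t a_2 + s a_1 \preccurlyeq x_3 + t a_1 + s a_1 = x_3 + (t + s) a_1$. Thus $x_1 \preccurlyeq_{a_1 a_2} x_3$. We conclude that $\preccurlyeq_{a_1 a_2}$ is a preorder on $S$.

We prove that $\preccurlyeq_{a_1 a_2}$ is a Strassen preorder. If $x_1 \preccurlyeq_{a_1 a_2} x_2$ and $y_1 \preccurlyeq_{a_1 a_2} y_2$, then clearly $x_1 + y_1 \preccurlyeq_{a_1 a_2} x_2 + y_2$. If $x_1 \preccurlyeq_{a_1 a_2} x_2$ and $y \in S$, then $x_1 y \preccurlyeq_{a_1 a_2} x_2 y$. From this follows: if $x_1 \preccurlyeq_{a_1 a_2} x_2$ and $y_1 \preccurlyeq_{a_1 a_2} y_2$, then $x_1 y_1 \preccurlyeq_{a_1 a_2} x_1 y_2 \preccurlyeq_{a_1 a_2} x_2 y_2$.

Let $n, m \in \mathbb{N}$. If $n \leq m$, then $n \preccurlyeq m$, so $n \preccurlyeq_{a_1 a_2} m$. If $n \not\leq m$, then $n \geq m + 1$. Suppose $n \preccurlyeq_{a_1 a_2} m$. Let $s \in S$ with $n + s a_2 \preccurlyeq m + s a_1$. Adding $m + 1 \preccurlyeq n$ gives

\[ m + 1 + n + s a_2 \preccurlyeq n + m + s a_1. \]

Since $\preccurlyeq = \precsim$, we may apply Lemma 2.4 (i) to obtain

\[ 1 + s a_2 \preccurlyeq s a_1. \tag{2.5} \]

From (2.5) follows $s \neq 0$. From (2.5) also follows

\[ s a_2 \preccurlyeq s a_1. \tag{2.6} \]

Since $\preccurlyeq = \precsim$, we may apply Lemma 2.4 (ii) to (2.6) to obtain the contradiction

\[ a_2 \preccurlyeq a_1. \]

Therefore $n \not\preccurlyeq_{a_1 a_2} m$. We conclude that $\preccurlyeq_{a_1 a_2}$ is a Strassen preorder, that is, $\preccurlyeq_{a_1 a_2} \in \mathcal{P}$.

Finally, we have $a_1 \preccurlyeq_{a_1 a_2} a_2$, since $a_1 + 1 \cdot a_2 \preccurlyeq a_2 + 1 \cdot a_1$. Also, if $x_1 \preccurlyeq x_2$, then $x_1 + 0 \cdot a_2 \preccurlyeq x_2 + 0 \cdot a_1$, that is, $\preccurlyeq \subseteq \preccurlyeq_{a_1 a_2}$.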

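Let $\preccurlyeq$ be a Strassen preorder. Let $\mathcal{P}_{\preccurlyeq}$ be the set of Strassen preorders containing $\preccurlyeq$, ordered by inclusion $\subseteq$. Let $C \subseteq \mathcal{P}_{\preccurlyeq}$ be any chain. Then the union of all preorders in $C$ is an element of $\mathcal{P}_{\preccurlyeq}$ and contains all elements of $C$. Therefore, by Zorn's lemma, $\mathcal{P}_{\preccurlyeq}$ contains a maximal element (maximal with respect to inclusion $\subseteq$).

Lemma 2.6. Let $\preccurlyeq$ be maximal in $\mathcal{P}$. Then $\preccurlyeq = \precsim$.

Proof. Trivially $\preccurlyeq \subseteq \precsim$. From Lemma 2.3 we know $\precsim \in \mathcal{P}$. From maximality of $\preccurlyeq$ follows $\preccurlyeq = \precsim$.

A relation $\preccurlyeq$ on $S$ is total if for all $a, b \in S$: $a \preccurlyeq b$ or $b \preccurlyeq a$.

Lemma 2.7. Let $\preccurlyeq$ be maximal in $\mathcal{P}$. Then $\preccurlyeq$ is total.

Proof. Suppose $\preccurlyeq$ is not total, say $a_1 \not\preccurlyeq a_2$ and $a_2 \not\preccurlyeq a_1$. By Lemma 2.6 we have $\preccurlyeq = \precsim$, so by Lemma 2.5 there is an element $\preccurlyeq_{a_1 a_2} \in \mathcal{P}$ with $\preccurlyeq \subseteq \preccurlyeq_{a_1 a_2}$ and $a_1 \preccurlyeq_{a_1 a_2} a_2$. Then $\preccurlyeq$ is strictly contained in $\preccurlyeq_{a_1 a_2}$, which contradicts the maximality of $\preccurlyeq$. We conclude $\preccurlyeq$ is total.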


2.6 The asymptotic spectrum $\mathbf{X}(S, \leqslant)$

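Definition 2.8. Let $S$ be a semiring with $\mathbb{N} \subseteq S$ and let $\leqslant$ be a Strassen preorder on $S$. Let

\[ \mathbf{X}(S, \leqslant) = \{ \phi \in \mathrm{Hom}(S, \mathbb{R}_{\geq 0}) : a \leqslant b \Rightarrow \phi(a) \leq \phi(b) \}. \]

We call $\mathbf{X}(S, \leqslant)$ the asymptotic spectrum of $(S, \leqslant)$. We call the elements of $\mathbf{X}(S, \leqslant)$ spectral points.

Lemma 2.9. Let $\preccurlyeq \in \mathcal{P}$ be total. There is exactly one semiring homomorphism $\phi : S \to \mathbb{R}_{\geq 0}$ with

\[ a \preccurlyeq b \Rightarrow \phi(a) \leq \phi(b). \]

Moreover, if $\preccurlyeq$ is maximal in $\mathcal{P}$, then

\[ a \preccurlyeq b \Leftrightarrow \phi(a) \leq \phi(b). \]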

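Proof. Let $\preccurlyeq \in \mathcal{P}$ be total. For $a \in S$ define

\[ \phi(a) = \inf \Big\{ \frac{r}{s} : r, s \in \mathbb{N},\ s a \preccurlyeq r \Big\}, \qquad \psi(a) = \sup \Big\{ \frac{u}{v} : u, v \in \mathbb{N},\ u \preccurlyeq v a \Big\}. \]

We prove $\psi(a) \leq \phi(a)$. Let $r, s, u, v \in \mathbb{N}$. Suppose $u \preccurlyeq va$ and $sa \preccurlyeq r$. Then follows $su \preccurlyeq vsa \preccurlyeq vr$. Thus $u/v \leq r/s$. We prove $\psi(a) \geq \phi(a)$. Suppose $\psi(a) < \phi(a)$. Let $r, s \in \mathbb{N}$ with $\psi(a) < r/s < \phi(a)$. Then $sa \not\preccurlyeq r$. From totality follows $sa \succcurlyeq r$. Thus $\psi(a) \geq r/s$, which is a contradiction. We conclude $\psi(a) = \phi(a)$.

Let $a, b \in S$. We prove $\phi(a + b) \leq \phi(a) + \phi(b)$. Let $s_a, s_b, r_a, r_b \in \mathbb{N}$. Suppose $s_a a \preccurlyeq r_a$ and $s_b b \preccurlyeq r_b$. Then $s_a s_b a \preccurlyeq s_b r_a$ and $s_a s_b b \preccurlyeq s_a r_b$. By addition, $s_a s_b (a + b) \preccurlyeq s_b r_a + s_a r_b$. Thus $\phi(a + b) \leq \frac{r_a}{s_a} + \frac{r_b}{s_b}$. We prove $\psi(a + b) \geq \psi(a) + \psi(b)$. Suppose $u_a \preccurlyeq v_a a$ and $u_b \preccurlyeq v_b b$. Then $v_b u_a \preccurlyeq v_a v_b a$ and $v_a u_b \preccurlyeq v_a v_b b$. By addition, $v_b u_a + v_a u_b \preccurlyeq v_a v_b (a + b)$. Thus $\psi(a + b) \geq \frac{u_a}{v_a} + \frac{u_b}{v_b}$. We thus have additivity.

We prove $\phi(ab) \leq \phi(a) \phi(b)$. Suppose $s_a a \preccurlyeq r_a$ and $s_b b \preccurlyeq r_b$. Then $s_a s_b ab \preccurlyeq r_a r_b$. Thus $\phi(ab) \leq \frac{r_a}{s_a} \frac{r_b}{s_b}$. We prove $\psi(ab) \geq \psi(a) \psi(b)$. Suppose $u_a \preccurlyeq v_a a$ and $u_b \preccurlyeq v_b b$. Then $u_a u_b \preccurlyeq v_a v_b ab$. Thus $\frac{u_a}{v_a} \frac{u_b}{v_b} \leq \psi(ab)$. We thus have multiplicativity.

We prove monotonicity: $a \preccurlyeq b \Rightarrow \phi(a) \leq \phi(b)$. Suppose $s_b b \preccurlyeq r_b$. From $a \preccurlyeq b$ follows $s_b a \preccurlyeq s_b b \preccurlyeq r_b$. Thus $\phi(a) \leq \frac{r_b}{s_b}$.

We prove $\phi(1) = 1$. Trivially $1 \preccurlyeq 1$. Therefore $\phi(1) \leq \frac{1}{1} = 1$ and $\psi(1) \geq \frac{1}{1} = 1$. We prove $\phi(0) = 0$. Trivially $s_a 0 \preccurlyeq 0$, so $\phi(0) \leq \frac{0}{s_a} = 0$. Trivially $0 \preccurlyeq v_a 0$, so $\phi(0) \geq \frac{0}{v_a} = 0$.

We prove the uniqueness of $\phi$. Let $\phi_1, \phi_2$ be semiring homomorphisms $S \to \mathbb{R}_{\geq 0}$ with $a \preccurlyeq b \Rightarrow \phi_i(a) \leq \phi_i(b)$. Suppose $\phi_1(a) < \phi_2(a)$. Let $u, v \in \mathbb{N}$ with $\phi_1(a) < \frac{u}{v} < \phi_2(a)$. Then $va \not\preccurlyeq u$, so by totality $va \succcurlyeq u$. Thus $\phi_1(a) \geq \frac{u}{v}$, which is a contradiction. This proves uniqueness.

Finally, suppose $\preccurlyeq$ is maximal in $\mathcal{P}$. Lemma 2.6 gives $\preccurlyeq = \precsim$. Let $a \not\preccurlyeq b$. From Lemma 2.4 (iv) follows $\exists n$: $na \not\preccurlyeq nb + 1$. By totality, $na \succcurlyeq nb + 1$. Apply $\phi$ to get $\phi(a) \geq \phi(b) + \frac{1}{n}$. In particular, $\phi(a) > \phi(b)$.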

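Lemma 2.10. The map

\[ \mathbf{X}(S, \leqslant) \to \{ \text{maximal elements in } \mathcal{P}_{\leqslant} \} : \phi \mapsto \preccurlyeq_\phi \]

with $a \preccurlyeq_\phi b$ iff $\phi(a) \leq \phi(b)$ is a bijection.

Proof. Let $\phi \in \mathbf{X}(S, \leqslant)$. One verifies that $\preccurlyeq_\phi$ is a Strassen preorder and ${\leqslant} \subseteq {\lesssim} \subseteq {\preccurlyeq_\phi}$. Let $\preccurlyeq$ be maximal in $\mathcal{P}_{\preccurlyeq_\phi}$. Lemma 2.7 says that $\preccurlyeq$ is total. By Lemma 2.9 there is a $\psi \in \mathbf{X}(S, \leqslant)$ with $\preccurlyeq \subseteq \preccurlyeq_\psi$. Clearly $\preccurlyeq_\phi \subseteq \preccurlyeq_\psi$. The uniqueness statement of Lemma 2.9 implies $\phi = \psi$. This means $\preccurlyeq_\phi = \preccurlyeq$, that is, $\preccurlyeq_\phi$ is maximal. We conclude that the map is well defined.

Let $\preccurlyeq$ be maximal in $\mathcal{P}_{\leqslant}$. Then $\preccurlyeq$ is total. By Lemma 2.9 there is a $\phi \in \mathbf{X}(S, \leqslant)$ with $\preccurlyeq \subseteq \preccurlyeq_\phi$. We conclude the map is surjective.

Let $\phi, \psi \in \mathbf{X}(S, \leqslant)$ with $\preccurlyeq_\phi = \preccurlyeq_\psi$. From Lemma 2.9 follows $\phi = \psi$. We conclude the map is injective.

Lemma 2.11. Let $a, b \in S$. Then $a \lesssim b$ iff $a \preccurlyeq b$ for all maximal $\preccurlyeq \in \mathcal{P}_{\leqslant}$.

Proof. Let $\preccurlyeq \in \mathcal{P}_{\leqslant}$ be maximal. Then ${\lesssim} \subseteq {\precsim} = {\preccurlyeq}$ by Lemma 2.6, so $a \lesssim b$ implies $a \preccurlyeq b$.

Suppose $a \not\lesssim b$. Let $n \in \mathbb{N}_{\geq 1}$ with $na \not\lesssim nb + 1$ (Lemma 2.4 (iv)). By Lemma 2.5 there is an element $\preccurlyeq_{nb+1,\,na} \in \mathcal{P}$ with ${\lesssim} \subseteq {\preccurlyeq_{nb+1,\,na}}$, and we may assume $\preccurlyeq_{nb+1,\,na}$ is maximal. Then $nb + 1 \preccurlyeq_{nb+1,\,na} na$ and so $a \not\preccurlyeq_{nb+1,\,na} b$.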

2.7 The representation theorem

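The following theorem is the main theorem.

Theorem 2.12 ([Str88, Th. 2.4]). Let $S$ be a commutative semiring with $\mathbb{N} \subseteq S$ and let $\leqslant$ be a Strassen preorder on $S$. Let $\mathbf{X} = \mathbf{X}(S, \leqslant)$ be the set of $\leqslant$-monotone semiring homomorphisms from $S$ to $\mathbb{R}_{\geq 0}$,

\[ \mathbf{X} = \mathbf{X}(S, \leqslant) = \{ \phi \in \mathrm{Hom}(S, \mathbb{R}_{\geq 0}) : \forall a, b \in S\ a \leqslant b \Rightarrow \phi(a) \leq \phi(b) \}. \]

For $a, b \in S$ let $a \lesssim b$ if there is a sequence $(x_N) \in \mathbb{N}^{\mathbb{N}}$ with $x_N^{1/N} \to 1$ when $N \to \infty$ such that $\forall N \in \mathbb{N}$: $a^N \leqslant b^N x_N$. Then

\[ \forall a, b \in S: \quad a \lesssim b \ \text{ iff } \ \forall \phi \in \mathbf{X}\ \phi(a) \leq \phi(b). \]

Proof. Let $a, b \in S$. Suppose $a \lesssim b$. Then clearly for all $\phi \in \mathbf{X}$ we have $\phi(a) \leq \phi(b)$. Suppose $a \not\lesssim b$. By Lemma 2.11 there is a maximal $\preccurlyeq \in \mathcal{P}_{\leqslant}$ with $a \not\preccurlyeq b$. By Lemma 2.10 there is a $\phi \in \mathbf{X}$ with $\phi(a) > \phi(b)$.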


2.8 Abstract rank and subrank $\mathrm{R}$, $\mathrm{Q}$

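We generalise the notions of rank and subrank for tensors to arbitrary semirings with a Strassen preorder. Let $a \in S$. Define the rank

\[ \mathrm{R}(a) = \min \{ r \in \mathbb{N} : a \leqslant r \} \]

and the subrank

\[ \mathrm{Q}(a) = \max \{ r \in \mathbb{N} : r \leqslant a \}. \]

Then $\mathrm{Q}(a) \leq \mathrm{R}(a)$. Define the asymptotic rank

\[ \widetilde{\mathrm{R}}(a) = \lim_{N \to \infty} \mathrm{R}(a^N)^{1/N}. \]

Define the asymptotic subrank

\[ \widetilde{\mathrm{Q}}(a) = \lim_{N \to \infty} \mathrm{Q}(a^N)^{1/N}. \]

By Fekete's lemma (Lemma 2.2), asymptotic rank is an infimum and asymptotic subrank is a supremum as follows:

\[ \widetilde{\mathrm{R}}(a) = \inf_N \mathrm{R}(a^N)^{1/N}, \]

\[ \widetilde{\mathrm{Q}}(a) = \sup_N \mathrm{Q}(a^N)^{1/N} \quad \text{when } a = 0 \text{ or } a \geqslant 1. \]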
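In the semiring of graphs, where the subrank is the independence number, the supremum can be strictly larger than the first term. The sketch below computes the independence number of the 5-cycle and of its strong product with itself by a simple branching search. The adjacency-dictionary encoding and the function names are assumptions made for this illustration, and the search is only practical for small graphs.

```python
from itertools import product

def cycle(n):
    return {v: {(v - 1) % n, (v + 1) % n} for v in range(n)}

def strong_product(G, H):
    """Distinct pairs are adjacent iff each coordinate is equal or adjacent."""
    return {
        (u, v): {
            (a, b)
            for a in G[u] | {u}
            for b in H[v] | {v}
            if (a, b) != (u, v)
        }
        for u, v in product(G, H)
    }

def independence_number(adj):
    def rec(cands):
        if not cands:
            return 0
        # Branch on a vertex of maximum degree among the remaining candidates.
        v = max(cands, key=lambda w: len(adj[w] & cands))
        nbrs = adj[v] & cands
        take = 1 + rec(cands - nbrs - {v})
        if not nbrs:
            return take               # v is isolated, so taking it is optimal
        return max(take, rec(cands - {v}))
    return rec(frozenset(adj))

C5 = cycle(5)
q1 = independence_number(C5)
q2 = independence_number(strong_product(C5, C5))
print(q1, q2, q2 ** 0.5)
```

The first power gives 2 while the second power gives 5, so the second term of the sequence is $\sqrt{5} > 2$. This is the classical example behind the Shannon capacity of the 5-cycle.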

Theorem 2.12 implies that the asymptotic rank and asymptotic subrank have the following dual characterisation in terms of the asymptotic spectrum. (This is a straightforward generalisation of [Str88, Th. 3.8].)

Corollary 2.13 (cf. [Str88, Th. 3.8]). For $a \in S$ with $\exists \phi \in \mathbf{X}$: $\phi(a) \geq 1$,

\[ \widetilde{\mathrm{R}}(a) = \max_{\phi \in \mathbf{X}} \phi(a). \]

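Proof. Let $\phi \in \mathbf{X}$. For $N \in \mathbb{N}$, $\mathrm{R}(a^N) \geq \phi(a)^N$. Therefore $\widetilde{\mathrm{R}}(a) \geq \phi(a)$, and so $\widetilde{\mathrm{R}}(a) \geq \max_{\phi \in \mathbf{X}} \phi(a)$. It remains to prove $\widetilde{\mathrm{R}}(a) \leq \max_{\phi \in \mathbf{X}} \phi(a)$. We let $x = \max_{\phi \in \mathbf{X}} \phi(a)$. By assumption, $x \geq 1$. By definition of $x$ we have

\[ \forall \phi \in \mathbf{X}: \quad \phi(a) \leq x. \]

Take the $m$th power on both sides:

\[ \forall \phi \in \mathbf{X},\ m \in \mathbb{N}: \quad \phi(a^m) \leq x^m. \]

Take the ceiling on the right-hand side:

\[ \forall \phi \in \mathbf{X},\ m \in \mathbb{N}: \quad \phi(a^m) \leq \lceil x^m \rceil. \]

Apply Theorem 2.12 to get asymptotic preorders:

\[ \forall m \in \mathbb{N}: \quad a^m \lesssim \lceil x^m \rceil. \]

Then by definition of asymptotic preorder,

\[ \forall m, N \in \mathbb{N}: \quad a^{mN} \leqslant \lceil x^m \rceil^N 2^{\varepsilon_{m,N}} \quad \text{for some } \varepsilon_{m,N} \in o(N). \]

Then

\[ \forall m, N \in \mathbb{N}: \quad \mathrm{R}(a^{mN})^{1/(mN)} \leq \lceil x^m \rceil^{1/m} 2^{\varepsilon_{m,N}/(mN)}. \]

From $x \geq 1$ follows $\lceil x^m \rceil^{1/m} \to x$ when $m \to \infty$. Choose $m = m(N)$ with $m(N) \to \infty$ as $N \to \infty$ and $\varepsilon_{m(N),N} \in o(N)$ to get $\widetilde{\mathrm{R}}(a) = \inf_N \mathrm{R}(a^N)^{1/N} \leq x$.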
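The role of the assumption $x \geq 1$ in the ceiling step can be seen numerically. This is a small sketch with hypothetical sample values $x = 1.5$ and $x = 0.5$: for $x \geq 1$ the quantity $\lceil x^m \rceil^{1/m}$ approaches $x$, while for $0 < x < 1$ the ceiling is always 1 and the limit is 1 rather than $x$.

```python
from math import ceil

def ceil_root(x, m):
    """Return ceil(x**m) ** (1/m), the quantity used in the ceiling step."""
    return ceil(x ** m) ** (1 / m)

exponents = (1, 5, 20, 60)
approach = [ceil_root(1.5, m) for m in exponents]   # tends to 1.5
stuck = [ceil_root(0.5, m) for m in exponents]      # constant 1.0, not 0.5
print(approach, stuck)
```

Rounding up costs at most an additive 1 on $x^m$, which is negligible after taking the $m$th root only when $x^m$ does not shrink to 0.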

Corollary 2.14 (cf. [Str88, Th. 3.8]). For $a \in S$ with $\exists k \in \mathbb{N}$: $a^k \geqslant 2$,

\[ \widetilde{\mathrm{Q}}(a) = \min_{\phi \in \mathbf{X}} \phi(a). \]

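Proof. Let $\phi \in \mathbf{X}$. For $N \in \mathbb{N}$, $\mathrm{Q}(a^N) \leq \phi(a)^N$. Therefore $\widetilde{\mathrm{Q}}(a) \leq \phi(a)$, so $\widetilde{\mathrm{Q}}(a) \leq \min_{\phi \in \mathbf{X}} \phi(a)$. It remains to prove $\widetilde{\mathrm{Q}}(a) \geq \min_{\phi \in \mathbf{X}} \phi(a)$. Let $y = \min_{\phi \in \mathbf{X}} \phi(a)$. From the assumption $a^k \geqslant 2$ follows $y > 1$. By definition of $y$ we have

\[ \forall \phi \in \mathbf{X}: \quad \phi(a) \geq y. \]

Take the $m$th power on both sides:

\[ \forall \phi \in \mathbf{X},\ m \in \mathbb{N}: \quad \phi(a^m) \geq y^m. \]

Take the floor on the right-hand side:

\[ \forall \phi \in \mathbf{X},\ m \in \mathbb{N}: \quad \phi(a^m) \geq \lfloor y^m \rfloor. \]

Apply Theorem 2.12 to get asymptotic preorders:

\[ \forall m \in \mathbb{N}: \quad a^m \gtrsim \lfloor y^m \rfloor. \]

Then by definition of asymptotic preorder,

\[ \forall m, N \in \mathbb{N}: \quad a^{mN} 2^{\varepsilon_{m,N}} \geqslant \lfloor y^m \rfloor^N \quad \text{for some } \varepsilon_{m,N} \in o(N). \]

Now we use $a^k \geqslant 2$ to get

\[ \forall m, N \in \mathbb{N}: \quad a^{mN + k \varepsilon_{m,N}} \geqslant \lfloor y^m \rfloor^N. \]

Then

\[ \forall m, N \in \mathbb{N}: \quad \mathrm{Q}(a^{mN + k \varepsilon_{m,N}})^{\frac{1}{mN + k \varepsilon_{m,N}}} \geq \lfloor y^m \rfloor^{\frac{N}{mN + k \varepsilon_{m,N}}}. \]

Choose $m = m(N)$ with $m(N) \to \infty$ as $N \to \infty$ and $\varepsilon_{m(N),N} \in o(N)$ to obtain $\widetilde{\mathrm{Q}}(a) = \sup_N \mathrm{Q}(a^N)^{1/N} \geq y$.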


2.9 Topological aspects

Theorem 2.12 does not tell the full story. Namely, there is also a topological component, which we will now discuss. Let $S$ be a semiring with $\mathbb{N} \subseteq S$. Let $\leqslant$ be a Strassen preorder on $S$. Let $\mathbf{X} = \mathbf{X}(S, \leqslant)$ be the asymptotic spectrum of $(S, \leqslant)$. For $a \in S$ let

\[ \hat{a} : \mathbf{X} \to \mathbb{R}_{\geq 0} : \phi \mapsto \phi(a). \tag{2.7} \]

The map $\hat{a}$ simply evaluates a given homomorphism $\phi$ at $a$. One may think of $\hat{a}$ as the collection $(\phi(a))_{\phi \in \mathbf{X}}$ of all evaluations of the elements of $\mathbf{X}$ at $a$. Let $\mathbb{R}_{\geq 0}$ have the Euclidean topology. Endow $\mathbf{X}$ with the weak topology with respect to the family of functions $\hat{a}$, $a \in S$. That is, endow $\mathbf{X}$ with the coarsest topology such that each $\hat{a}$ becomes continuous.

Let $C(\mathbf{X}, \mathbb{R}_{\geq 0})$ be the semiring of continuous functions $\mathbf{X} \to \mathbb{R}_{\geq 0}$ with addition and multiplication defined pointwise on $\mathbf{X}$, that is, $(f + g)(x) = f(x) + g(x)$ and $(f \cdot g)(x) = f(x) g(x)$ for $f, g \in C(\mathbf{X}, \mathbb{R}_{\geq 0})$ and $x \in \mathbf{X}$. Define the semiring homomorphism

\[ \Phi : S \to C(\mathbf{X}, \mathbb{R}_{\geq 0}) : a \mapsto \hat{a}, \]

which maps $a$ to the evaluator $\hat{a}$ defined in (2.7).

Theorem 2.15 ([Str88, Th. 2.4]).

(i) $\mathbf{X}$ is a nonempty compact Hausdorff space.

(ii) $\forall a, b \in S$: $a \lesssim b$ iff $\Phi(a) \leq \Phi(b)$ pointwise on $\mathbf{X}$.

(iii) $\Phi(S)$ separates the points of $\mathbf{X}$.

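Proof. Statement (ii) follows from Theorem 2.12. Statement (iii) is clear. We prove statement (i). We have $2 \not\lesssim 1$, so from Theorem 2.12 follows that $\mathbf{X}$ cannot be empty.

For $a \in S$ let $n_a \in \mathbb{N}$ with $a \leqslant n_a$. Then for $\phi \in \mathbf{X}$, $\phi(a) \leq n_a$, and so $\phi(a) \in [0, n_a]$. Embed $\mathbf{X} \subseteq \prod_{a \in S} [0, n_a]$ as a set via $\phi \mapsto (\phi(a))_{a \in S}$. The set $\prod_{a \in S} [0, n_a]$ with the product topology is compact by the theorem of Tychonoff. To see that $\mathbf{X}$ is closed in $\prod_{a \in S} [0, n_a]$, we write $\mathbf{X}$ as an intersection of sets,

\[
\begin{aligned}
\mathbf{X} ={} & \Big\{ \phi \in \prod_{a \in S} [0, n_a] : \phi(0) = 0 \Big\} \cap \Big\{ \phi \in \prod_{a \in S} [0, n_a] : \phi(1) = 1 \Big\} \\
& \cap \bigcap_{b, c \in S} \Big\{ \phi \in \prod_{a \in S} [0, n_a] : \phi(b + c) - \phi(b) - \phi(c) = 0 \Big\} \\
& \cap \bigcap_{b, c \in S} \Big\{ \phi \in \prod_{a \in S} [0, n_a] : \phi(bc) - \phi(b) \phi(c) = 0 \Big\} \\
& \cap \bigcap_{b, c \in S:\ b \leqslant c} \Big\{ \phi \in \prod_{a \in S} [0, n_a] : \phi(b) \leq \phi(c) \Big\},
\end{aligned}
\]

and we observe that the intersected sets are closed:

\[
\begin{aligned}
\mathbf{X} ={} & \hat{0}^{-1}(0) \cap \hat{1}^{-1}(1) \\
& \cap \bigcap_{b, c \in S} \big( \widehat{(b + c)} - \hat{b} - \hat{c} \big)^{-1}(0) \\
& \cap \bigcap_{b, c \in S} \big( \widehat{(bc)} - \hat{b} \hat{c} \big)^{-1}(0) \\
& \cap \bigcap_{b, c \in S:\ b \leqslant c} \big( \hat{c} - \hat{b} \big)^{-1}([0, \infty)).
\end{aligned}
\]

This implies $\mathbf{X}$ is also compact.

Let $\phi, \psi \in \mathbf{X}$ be distinct. Let $a \in S$ with $\phi(a) \neq \psi(a)$. Then $\hat{a}(\phi) \neq \hat{a}(\psi)$. Let $U \ni \hat{a}(\phi)$, $V \ni \hat{a}(\psi)$ be open and disjoint subsets of $\mathbb{R}_{\geq 0}$. Then $\hat{a}^{-1}(U)$ and $\hat{a}^{-1}(V)$ are open and disjoint subsets of $\mathbf{X}$. We conclude that $\mathbf{X}$ is Hausdorff.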

2.10 Uniqueness

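Let $S$ be a semiring with $\mathbb{N} \subseteq S$. Let $\leqslant$ be a Strassen preorder on $S$. Let $\mathbf{X} = \mathbf{X}(S, \leqslant)$ be the asymptotic spectrum of $(S, \leqslant)$. The object $\mathbf{X}$ is unique in the following sense.

Theorem 2.16 ([Str88, Cor. 2.7]). Let $\mathbf{Y}$ be a compact Hausdorff space. Let $\Psi : S \to C(\mathbf{Y}, \mathbb{R}_{\geq 0})$ be a homomorphism of semirings such that

\[ \Psi(S) \text{ separates the points of } \mathbf{Y} \tag{2.8} \]

and

\[ \forall a, b \in S: \quad a \lesssim b \Leftrightarrow \Psi(a) \leq \Psi(b) \text{ pointwise on } \mathbf{Y}. \tag{2.9} \]

Then there is a unique homeomorphism (continuous bijection with continuous inverse) $h : \mathbf{Y} \to \mathbf{X}$ such that the diagram

\[ \Phi : S \to C(\mathbf{X}, \mathbb{R}_{\geq 0}), \qquad \Psi : S \to C(\mathbf{Y}, \mathbb{R}_{\geq 0}), \qquad h^* : C(\mathbf{X}, \mathbb{R}_{\geq 0}) \to C(\mathbf{Y}, \mathbb{R}_{\geq 0}), \qquad h^* \circ \Phi = \Psi \tag{2.10} \]

commutes, where $h^* : \phi \mapsto \phi \circ h$. Namely, let $h : y \mapsto \big( a \mapsto \Psi(a)(y) \big)$.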


Proof. We prove uniqueness. Suppose there are two such homeomorphisms

\[ h_1, h_2 : \mathbf{Y} \to \mathbf{X}. \]

Suppose $x \neq h_2(h_1^{-1}(x))$ for some $x \in \mathbf{X}$. Since $\Phi(S)$ separates the points of $\mathbf{X}$, there is an $a \in S$ with $\Phi(a)(x) \neq \Phi(a)(h_2(h_1^{-1}(x)))$. Let $y = h_1^{-1}(x) \in \mathbf{Y}$. Then $\Phi(a)(h_1(y)) \neq \Phi(a)(h_2(y))$. Since (2.10) commutes, $\Phi(a)(h_1(y)) = \Psi(a)(y)$ and $\Phi(a)(h_2(y)) = \Psi(a)(y)$, a contradiction.

We prove existence. Let $h : \mathbf{Y} \to \mathbf{X} : y \mapsto (a \mapsto \Psi(a)(y))$. One verifies that $h$ is well-defined, continuous, injective, and that the diagram in (2.10) commutes. It remains to show that $h$ is surjective. We know that $\mathbb{Q} \cdot \Phi(S)$ is a $\mathbb{Q}$-subalgebra of $C(\mathbf{X}, \mathbb{R})$ which separates points and which contains the nonzero constant function $\Phi(1)$, so by the Stone–Weierstrass theorem $\mathbb{Q} \cdot \Phi(S)$ is dense in $C(\mathbf{X}, \mathbb{R})$ under the sup-norm. Suppose $h$ is not surjective. Then $h(\mathbf{Y}) \subsetneq \mathbf{X}$ is a proper closed subset. Let $x_0 \in \mathbf{X} \setminus h(\mathbf{Y})$ be in the complement. Since $\mathbf{X}$ is a compact Hausdorff space, there is a continuous function $f : \mathbf{X} \to [-1, 1]$ with

\[ f(h(\mathbf{Y})) = 1, \qquad f(x_0) = -1. \]

We know that $f$ can be approximated by elements from $\mathbb{Q} \cdot \Phi(S)$, i.e. let $\varepsilon > 0$, then there are $a_1, a_2 \in S$, $N \in \mathbb{N}$ such that

\[ \tfrac{1}{N} \big( \Phi(a_1)(x) - \Phi(a_2)(x) \big) > 1 - \varepsilon \quad \text{for all } x \in h(\mathbf{Y}), \]

\[ \tfrac{1}{N} \big( \Phi(a_1)(x_0) - \Phi(a_2)(x_0) \big) < -1 + \varepsilon. \]

This means $\Psi(a_1) \geq \Psi(a_2)$ pointwise on $\mathbf{Y}$, so $a_1 \gtrsim a_2$, but also $\Phi(a_1) \not\geq \Phi(a_2)$ pointwise on $\mathbf{X}$, so $a_1 \not\gtrsim a_2$. This is a contradiction.

2.11 Subsemirings

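Let $S$ be a subsemiring of a semiring $T$ and let $\leqslant$ be a Strassen preorder on $T$. Then the restriction $\leqslant|_S$ is a Strassen preorder on $S$. How are the asymptotic spectra $\mathbf{X}(S, \leqslant|_S)$ and $\mathbf{X}(T, \leqslant)$ related? Obviously, for $\phi \in \mathbf{X}(T, \leqslant)$ we have $\phi|_S \in \mathbf{X}(S, \leqslant|_S)$. In fact, the uniqueness theorem of Section 2.10 implies that all elements of $\mathbf{X}(S, \leqslant|_S)$ are restrictions of elements of $\mathbf{X}(T, \leqslant)$.

Corollary 2.17. Let $S$ be a subsemiring of a semiring $T$. Let $\leqslant$ be a Strassen preorder on $T$. Then

\[ \mathbf{X}(S, \leqslant|_S) = \mathbf{X}(T, \leqslant)|_S. \]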

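Proof. Let

\[ \mathbf{X} = \mathbf{X}(S, \leqslant|_S), \qquad \Phi : S \to C(\mathbf{X}, \mathbb{R}_{\geq 0}) : a \mapsto \hat{a}, \]

and let

\[ \mathbf{Y} = \mathbf{X}(T, \leqslant)|_S = \{ \phi|_S : \phi \in \mathbf{X}(T, \leqslant) \}, \qquad \Psi : S \to C(\mathbf{Y}, \mathbb{R}_{\geq 0}) : a \mapsto \big( \phi|_S \mapsto \phi|_S(a) \big). \]

Then $\mathbf{Y}$ is a compact Hausdorff space. Let $\phi|_S, \psi|_S \in \mathbf{Y}$ be distinct. Then there is an $a \in S$ with $\phi|_S(a) \neq \psi|_S(a)$, so (2.8) holds. For $a, b \in S$: $a \lesssim b$ iff $\Phi(a) \leq \Phi(b)$ iff $\Psi(a) \leq \Psi(b)$, so (2.9) holds. Therefore

\[ h : \mathbf{X}(T, \leqslant)|_S \to \mathbf{X}(S, \leqslant|_S) : \phi|_S \mapsto \big( a \mapsto \Psi(a)(\phi|_S) \big) = \phi|_S \]

is a homeomorphism.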

2.12 Subsemirings generated by one element

Let $S$ be a semiring and let $\leqslant$ be a Strassen preorder on $S$. We specialise to the simplest type of subsemiring of $S$. Namely, let $a \in S$ and let

\[ \mathbb{N}[a] = \Bigl\{ \sum_{i=0}^{k} n_i a^i : k \in \mathbb{N},\ n_i \in \mathbb{N} \Bigr\} \subseteq S \]

be the subsemiring of $S$ generated by $a$. We call $\mathbf{X}(\mathbb{N}[a]) = \mathbf{X}(\mathbb{N}[a], \leqslant|_{\mathbb{N}[a]})$ the asymptotic spectrum of $a$.

Corollary 2.18 (cf. [Str88]). If $a^k \geqslant 2$ for some $k \in \mathbb{N}$, then

\[ \tilde{Q} \in \mathbf{X}(\mathbb{N}[a]). \]

If $\phi(a) \geq 1$ for some $\phi \in X$, then

\[ \tilde{R} \in \mathbf{X}(\mathbb{N}[a]). \]

Proof. Let $X = \mathbf{X}(\mathbb{N}[a])$. Let $n_1, \ldots, n_q \in \mathbb{N}$. By Corollary 2.14,

\[ \tilde{Q}(a^{n_1} + \cdots + a^{n_q}) = \min_{\phi \in X} \phi(a^{n_1} + \cdots + a^{n_q}). \]

Since $\phi$ is a homomorphism, $\phi(a^{n_1} + \cdots + a^{n_q}) = \phi(a)^{n_1} + \cdots + \phi(a)^{n_q}$. Now we observe that $x^{n_1} + \cdots + x^{n_q}$ is minimised by taking $x$ minimal in the domain. We conclude

\[ \tilde{Q}(a^{n_1} + \cdots + a^{n_q}) = \sum_{i=1}^{q} \Bigl(\min_{\phi \in X} \phi(a)\Bigr)^{n_i} = \tilde{Q}(a)^{n_1} + \cdots + \tilde{Q}(a)^{n_q}. \]

The claim for asymptotic rank $\tilde{R}$ similarly follows from Corollary 2.13.

Remark 2.19. In general, asymptotic subrank $\tilde{Q}$ and asymptotic rank $\tilde{R}$ are not elements of the asymptotic spectrum. We will see an example in Chapter 4, related to the matrix multiplication tensor.

Remark 2.20. Corollary 2.18 is closely related to Schönhage's $\tau$-theorem for tensors, also called Schönhage's asymptotic sum inequality. The $\tau$-theorem features in every recent fast matrix multiplication algorithm (i.e. every algorithm based on the laser method).

Remark 2.21. An element $\phi \in \mathbf{X}(\mathbb{N}[a])$ is uniquely determined by the value of $\phi(a) \in \mathbb{R}_{\geq 0}$. We may thus identify the asymptotic spectrum $\mathbf{X}(\mathbb{N}[a])$ with a compact (i.e. closed and bounded) subset of the nonnegative reals $\mathbb{R}_{\geq 0}$ via $\phi \mapsto \phi(a)$.

2.13 Universal spectral points

Having discussed the simplest type of subsemiring in the previous section, let us discuss the most difficult type of supersemiring. When applying the theory of asymptotic spectra to some setting, there is a natural largest semiring $S$ in which the objects of study live. For example, we may study the semiring $S$ of all (equivalence classes of) 3-tensors of arbitrary format over $\mathbb{F}$. Or we may study the semiring $S$ of all (isomorphism classes of) finite simple graphs. We refer to the elements of the asymptotic spectrum $\mathbf{X}(S)$ of the "ambient" semiring $S$ by the term universal spectral points (cf. [Str88, page 119]). The universal spectral points are the most useful monotone homomorphisms.

2.14 Conclusion

To a semiring $S$ with a Strassen preorder $\leqslant$ we associated an asymptotic preorder $\lesssim$. We proved that this asymptotic preorder is characterised by the $\leqslant$-monotone semiring homomorphisms $S \to \mathbb{R}_{\geq 0}$, which make up the asymptotic spectrum $\mathbf{X}(S, \leqslant)$ of $(S, \leqslant)$. For $(S, \leqslant)$ we naturally have a rank function $R : S \to \mathbb{N}$ and a subrank function $Q : S \to \mathbb{N}$. Their asymptotic versions $\tilde{R}(a) = \inf_n R(a^n)^{1/n}$ and $\tilde{Q}(a) = \sup_n Q(a^n)^{1/n}$ coincide with $\max_{\phi \in \mathbf{X}(S, \leqslant)} \phi(a)$ and $\min_{\phi \in \mathbf{X}(S, \leqslant)} \phi(a)$ respectively, assuming $\exists \phi \in X : \phi(a) \geq 1$ and $\exists k \in \mathbb{N} : a^k \geqslant 2$ respectively. Unfortunately, we have proved the existence of the asymptotic spectrum by nonconstructive means. Explicitly constructing spectral points for a given pair $(S, \leqslant)$ will be a challenging task.

Some remarks about our proof in this chapter. The proof in [Str88] uses the Kadison–Dubois theorem from the paper of Becker and Schwartz [BS83] as a black box. Our presentation basically integrates the proof of Strassen with the proof of Becker and Schwartz. The notions of rank and subrank were in [Str88] only discussed for tensors. We considered the straightforward generalisation to arbitrary semirings with a Strassen preorder. An evident feature of our presentation is that we do not pass from the semiring to its Grothendieck ring, but instead stay in the semiring. In this way we stay close to the "real world" objects. I thank Jop Briët and Lex Schrijver for this idea. There is a large body of literature on the Kadison–Dubois theorem, for which we refer to the modern books by Prestel and Delzell [PD01, Theorem 5.2.6] and Marshall [Mar08, Theorem 5.4.4].

Chapter 3

The asymptotic spectrum of graphs; Shannon capacity

This chapter is based on the manuscript [Zui18].

3.1 Introduction

This chapter is about the Shannon capacity of graphs, which was introduced by Claude Shannon in the context of coding theory [Sha56]. More precisely, we will apply the theory of asymptotic spectra of Chapter 2 to gain a better understanding of Shannon capacity (and other asymptotic properties of graphs).

We first recall the definition of the Shannon capacity of a graph. Let $G$ be a (finite simple) graph with vertex set $V(G)$ and edge set $E(G)$. An independent set or stable set in $G$ is a subset of $V(G)$ that contains no edges. The independence number or stability number $\alpha(G)$ is the cardinality of the largest independent set in $G$. For graphs $G$ and $H$, the and-product $G \boxtimes H$, also called strong graph product, is defined by

\[ V(G \boxtimes H) = V(G) \times V(H), \]

\[ E(G \boxtimes H) = \bigl\{ \{(g, h), (g', h')\} : (\{g, g'\} \in E(G) \text{ or } g = g') \text{ and } (\{h, h'\} \in E(H) \text{ or } h = h') \text{ and } (g, h) \neq (g', h') \bigr\}. \]

The Shannon capacity $\Theta(G)$ is defined as the limit

\[ \Theta(G) = \lim_{N \to \infty} \alpha(G^{\boxtimes N})^{1/N}. \tag{3.1} \]

This limit exists and equals the supremum $\sup_N \alpha(G^{\boxtimes N})^{1/N}$ by Fekete's lemma (Lemma 2.2).
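The definition can be explored numerically on small graphs. The following is a minimal sketch, assuming graphs are represented as a pair (vertex list, set of `frozenset` edges); the helper names `cycle`, `strong_product` and `independence_number` are illustrative and not from the text. It computes $\alpha(C_5)$ and $\alpha(C_5 \boxtimes C_5)$ by exhaustive branching, which gives the lower bound $\Theta(C_5) \geq \alpha(C_5^{\boxtimes 2})^{1/2}$.

```python
from functools import lru_cache
from itertools import product

def cycle(n):
    """Vertices and edges of the cycle graph C_n on vertices 0..n-1 (n >= 3)."""
    return list(range(n)), {frozenset((i, (i + 1) % n)) for i in range(n)}

def strong_product(G, H):
    """Strong (and-)product: distinct pairs are adjacent iff each coordinate is equal or adjacent."""
    (VG, EG), (VH, EH) = G, H
    vertices = list(product(VG, VH))
    edges = set()
    for (g, h) in vertices:
        for (g2, h2) in vertices:
            if (g, h) == (g2, h2):
                continue
            if (g == g2 or frozenset((g, g2)) in EG) and (h == h2 or frozenset((h, h2)) in EH):
                edges.add(frozenset(((g, h), (g2, h2))))
    return vertices, edges

def independence_number(G):
    """Exact alpha(G): branch on a vertex, either skipping it or taking it and dropping its neighbours."""
    vertices, edges = G
    neighbours = {v: set() for v in vertices}
    for e in edges:
        u, v = tuple(e)
        neighbours[u].add(v)
        neighbours[v].add(u)

    @lru_cache(maxsize=None)
    def solve(candidates):
        if not candidates:
            return 0
        v = next(iter(candidates))
        without_v = solve(candidates - {v})
        with_v = 1 + solve(candidates - {v} - neighbours[v])
        return max(with_v, without_v)

    return solve(frozenset(vertices))

C5 = cycle(5)
alpha_1 = independence_number(C5)                      # 2
alpha_2 = independence_number(strong_product(C5, C5))  # 5
print(alpha_1, alpha_2, alpha_2 ** 0.5)
```

The exhaustive search is only practical for very small graphs; it is meant to make the definition concrete, not to compute $\Theta$.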

Computing the Shannon capacity is nontrivial already for small graphs. Lovász in [Lov79] computed the value $\Theta(C_5) = \sqrt{5}$, where $C_k$ denotes the $k$-cycle graph, by introducing and evaluating a new graph parameter $\vartheta$ which is now known as the Lovász theta number. The value of $\Theta(C_7)$, for example, is currently not known. The Shannon capacity $\Theta$ is not known to be hard to compute in the sense of computational complexity. On the other hand, deciding whether $\alpha(G) \geq k$, given a graph $G$ and $k \in \mathbb{N}$, is NP-complete [Kar72].

New result: dual description of Shannon capacity

The new result of this chapter is a dual characterisation of the Shannon capacity of graphs. This characterisation is obtained by applying Strassen's theory of asymptotic spectra of Chapter 2. Thus this chapter also serves as an illustration of the theory of asymptotic spectra.

To state the theorem we need the standard notions graph homomorphism, graph complement and graph disjoint union. Let $G$ and $H$ be graphs. A graph homomorphism $f : G \to H$ is a map $f : V(G) \to V(H)$ such that for all $u, v \in V(G)$, if $\{u, v\} \in E(G)$, then $\{f(u), f(v)\} \in E(H)$. In other words, a graph homomorphism maps edges to edges. The complement $\overline{G}$ of $G$ is defined by

\[ V(\overline{G}) = V(G), \]

\[ E(\overline{G}) = \bigl\{ \{u, v\} : \{u, v\} \notin E(G),\ u \neq v \bigr\}. \]

We define a relation $\leqslant$ on graphs: let $G \leqslant H$ if there is a graph homomorphism $\overline{G} \to \overline{H}$ from the complement of $G$ to the complement of $H$. The disjoint union $G \sqcup H$ is defined by

\[ V(G \sqcup H) = V(G) \sqcup V(H), \]

\[ E(G \sqcup H) = E(G) \sqcup E(H). \]

For $n \in \mathbb{N}$, the complete graph $K_n$ is the graph with $V(K_n) = [n] = \{1, 2, \ldots, n\}$ and $E(K_n) = \{\{i, j\} : i, j \in [n],\ i \neq j\}$. Thus $K_0 = \overline{K_0}$ is the empty graph and $K_1 = \overline{K_1}$ is the graph consisting of a single vertex and no edges.

Theorem 3.1. Let $S \subseteq \{\text{graphs}\}$ be a collection of graphs which is closed under the disjoint union $\sqcup$ and the strong graph product $\boxtimes$, and which contains the graph with a single vertex $K_1$. Define the asymptotic spectrum $\mathbf{X}(S)$ as the set of all maps $\phi : S \to \mathbb{R}_{\geq 0}$ such that, for all $G, H \in S$:

(1) if $G \leqslant H$, then $\phi(G) \leq \phi(H)$

(2) $\phi(G \sqcup H) = \phi(G) + \phi(H)$

(3) $\phi(G \boxtimes H) = \phi(G)\phi(H)$

(4) $\phi(K_1) = 1$.

Let $G \lesssim H$ if there is a sequence $(x_N) \in \mathbb{N}^{\mathbb{N}}$ with $x_N^{1/N} \to 1$ when $N \to \infty$ such that for every $N \in \mathbb{N}$

\[ G^{\boxtimes N} \leqslant (H^{\boxtimes N})^{\sqcup x_N} = \underbrace{H^{\boxtimes N} \sqcup \cdots \sqcup H^{\boxtimes N}}_{x_N}. \]

Then

(i) $G \lesssim H$ iff $\forall \phi \in \mathbf{X}(S)\ \phi(G) \leq \phi(H)$

(ii) $\Theta(G) = \min_{\phi \in \mathbf{X}(S)} \phi(G)$.

Statement (ii) of Theorem 3.1 is nontrivial in the sense that $\Theta$ is not an element of $\mathbf{X}(\{\text{graphs}\})$. Namely, $\Theta$ is not additive under $\sqcup$ by a result of Alon [Alo98], and $\Theta$ is not multiplicative under $\boxtimes$ by a result of Haemers [Hae79]. It turns out that the graph parameter $G \mapsto \max_{\phi \in \mathbf{X}(\{\text{graphs}\})} \phi(G)$ is itself an element of $\mathbf{X}(\{\text{graphs}\})$, and is equal to the fractional clique cover number $\overline{\chi}_f$ (see Section 3.3.2 and e.g. [Sch03, Eq. (67.112)]). Fritz in [Fri17] proves (independently of Strassen's line of work) a statement that is weaker than Theorem 3.1. Namely, he proves the statement of Theorem 3.1 without the additivity condition (2).

In Section 3.2 we will prove Theorem 3.1 by applying the theory of asymptotic spectra of Chapter 2 to the appropriate semiring and preorder. In Section 3.3 we will discuss the elements in the asymptotic spectrum of graphs $\mathbf{X}(\{\text{graphs}\})$ that are currently known to me: the Lovász theta number, the fractional clique cover number, the fractional orthogonal rank of the complement, and the fractional Haemers bounds. We moreover prove a sufficient condition for the "fractionalisation" of a graph parameter to be in the asymptotic spectrum of graphs.

3.2 The asymptotic spectrum of graphs

In this section we prove Theorem 3.1 by applying the theory of asymptotic spectra to the appropriate semiring.

3.2.1 The semiring of graph isomorphism classes $\mathcal{G}$

A graph homomorphism $f : G \to H$ is a graph isomorphism if $f$ is bijective as a map $V(G) \to V(H)$ and bijective as a map $E(G) \to E(H)$. We write $G \cong H$ if there is a graph isomorphism $f : G \to H$. The relation $\cong$ is an equivalence relation on $\{\text{graphs}\}$, which we call isomorphism. For example, the graphs $G$ and $H$ given by

\[ V(G) = \{a, b, c, d\}, \qquad E(G) = \{\{a, b\}, \{b, c\}, \{c, d\}, \{a, d\}\}, \]

\[ V(H) = \{1, 2, 3, 4\}, \qquad E(H) = \{\{1, 3\}, \{2, 3\}, \{2, 4\}, \{1, 4\}\} \]

are isomorphic. Let $\mathcal{G} = \{\text{graphs}\}/{\cong}$ be the set of equivalence classes in $\{\text{graphs}\}$ under $\cong$, i.e. the isomorphism classes. The relation $\leqslant$ is a preorder on $\mathcal{G}$. Recall that $K_n$ is the complete graph on $n$ vertices, and thus $\overline{K_n}$ is the graph with $n$ vertices and no edges.

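Lemma 3.2. Let $A, B, C \in \{\text{graphs}\}$.

(i) $\sqcup$ and $\boxtimes$ are commutative and associative operations on $\mathcal{G}$

(ii) $\boxtimes$ distributes over $\sqcup$ on $\mathcal{G}$, i.e. $A \boxtimes (B \sqcup C) = (A \boxtimes B) \sqcup (A \boxtimes C)$

(iii) $\overline{K_1} \boxtimes A = A$

(iv) $K_0 \boxtimes A = K_0$

(v) $K_0 \sqcup A = A$

(vi) $\overline{K_n} \sqcup \overline{K_m} = \overline{K_{n+m}}$.

Proof. We leave the proof to the reader.

In other words, Lemma 3.2 says that $(\mathcal{G}, \sqcup, \boxtimes, K_0, K_1)$ is a (commutative) semiring in which the elements $\overline{K_0}, \overline{K_1}, \overline{K_2}, \ldots$ behave like the natural numbers $\mathbb{N}$. We will denote this semiring simply by $\mathcal{G}$.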

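3.2.2 Strassen preorder via graph homomorphisms

Let $\mathcal{G}$ be the semiring of graphs. Recall that $G \leqslant H$ if there is a graph homomorphism $f : \overline{G} \to \overline{H}$.

Lemma 3.3. The preorder $\leqslant$ is a Strassen preorder on $\mathcal{G}$. That is, for graphs $A, B, C, D \in \mathcal{G}$ we have the following.

(i) For $n, m \in \mathbb{N}$, $\overline{K_n} \leqslant \overline{K_m}$ iff $n \leq m$.

(ii) If $A \leqslant B$ and $C \leqslant D$, then $A \sqcup C \leqslant B \sqcup D$ and $A \boxtimes C \leqslant B \boxtimes D$.

(iii) For $A, B \in \mathcal{G}$, if $B \neq K_0$, then there is an $r \in \mathbb{N}$ with $A \leqslant \overline{K_r} \boxtimes B$.

Proof. Statement (i) is easy to verify. We prove (ii). Let $f : \overline{A} \to \overline{B}$ and $g : \overline{C} \to \overline{D}$ be graph homomorphisms. Let the map $f \sqcup g : V(A) \sqcup V(C) \to V(B) \sqcup V(D)$ be defined by

\[ (f \sqcup g)(a) = f(a) \text{ for } a \in V(A), \]

\[ (f \sqcup g)(c) = g(c) \text{ for } c \in V(C). \]

One verifies directly that $f \sqcup g$ is a graph homomorphism $\overline{A \sqcup C} \to \overline{B \sqcup D}$. Let the map $f \boxtimes g : V(A) \times V(C) \to V(B) \times V(D)$ be defined by

\[ (f \boxtimes g)(a, c) = (f(a), g(c)). \]

One verifies directly that $f \boxtimes g$ is a graph homomorphism $\overline{A \boxtimes C} \to \overline{B \boxtimes D}$. This proves (ii). We prove (iii). Let $r = |V(A)|$. Then $A \leqslant \overline{K_r}$. By assumption $B \neq K_0$, so $\overline{K_1} \leqslant B$. Therefore $A \leqslant \overline{K_r} \cong \overline{K_r} \boxtimes \overline{K_1} \leqslant \overline{K_r} \boxtimes B$. This proves (iii).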

3.2.3 The asymptotic spectrum of graphs $\mathbf{X}(\mathcal{G})$

We thus have a semiring $\mathcal{G}$ with a Strassen preorder $\leqslant$. We are therefore in the position to apply the theory of asymptotic spectra (Chapter 2). Let us translate the abstract terminology to this setting.

Let $G \lesssim H$ if there is a sequence $(x_N) \in \mathbb{N}^{\mathbb{N}}$ with $(x_N)^{1/N} \to 1$ such that for every $N \in \mathbb{N}$ we have $G^{\boxtimes N} \leqslant H^{\boxtimes N} \boxtimes \overline{K_{x_N}}$, i.e. $G^{\boxtimes N} \leqslant (H^{\boxtimes N})^{\sqcup x_N}$.

Let $S \subseteq \mathcal{G}$ be a subsemiring. For example, one may take $S = \mathcal{G}$, or one may choose any set $X \subseteq \mathcal{G}$ and let $S = \mathbb{N}[X]$ be the subsemiring of $\mathcal{G}$ generated by $X$ under $\sqcup$ and $\boxtimes$.

The asymptotic spectrum of $S$ is the set $\mathbf{X}(S)$ of $\leqslant$-monotone semiring homomorphisms $S \to \mathbb{R}_{\geq 0}$, i.e. all maps $\phi : S \to \mathbb{R}_{\geq 0}$ such that, for all $G, H \in S$:

(1) if $G \leqslant H$, then $\phi(G) \leq \phi(H)$

(2) $\phi(G \sqcup H) = \phi(G) + \phi(H)$

(3) $\phi(G \boxtimes H) = \phi(G)\phi(H)$

(4) $\phi(K_1) = 1$.

We call $\mathbf{X}(\mathcal{G})$ the asymptotic spectrum of graphs.

Theorem 3.4. Let $G, H \in S$. Then $G \lesssim H$ iff $\forall \phi \in \mathbf{X}(S)\ \phi(G) \leq \phi(H)$.

Proof. By Lemma 3.2 we have a semiring $S$ and by Lemma 3.3 we have a Strassen preorder $\leqslant$, so we may apply Theorem 2.12.

We refer to Chapter 2 for a discussion of the topological properties of $\mathbf{X}(S)$.

3.2.4 Shannon capacity $\Theta$

Let us discuss the (asymptotic) rank and (asymptotic) subrank for $(\mathcal{G}, \leqslant)$. Recall that an independent set in $G$ is a subset of $V(G)$ that contains no edges, and the independence number $\alpha(G)$ is the cardinality of the largest independent set in $G$. A colouring of $G$ is an assignment of colours to the elements of $V(G)$ such that connected vertices get distinct colours. The chromatic number $\chi(G)$ is the smallest number of colours in any colouring of $G$. The clique cover number $\overline{\chi}(G)$ is defined as the chromatic number of the complement, $\overline{\chi}(G) = \chi(\overline{G})$.

For the semiring $\mathcal{G}$ with preorder $\leqslant$, the abstract definition of subrank of Section 2.8 becomes $Q(G) = \max\{m \in \mathbb{N} : \overline{K_m} \leqslant G\}$, and the abstract definition of rank becomes $R(G) = \min\{n \in \mathbb{N} : G \leqslant \overline{K_n}\}$.

Lemma 3.5.

(i) $\alpha(G) = Q(G)$

(ii) $\overline{\chi}(G) = R(G)$.

Proof. We leave the proof to the reader.
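Lemma 3.5 can be checked by brute force on a small graph. The sketch below assumes graphs are stored as (vertex list, set of `frozenset` edges); the function names `leq`, `subrank` and `rank` are illustrative only. It decides $G \leqslant H$ by searching for a homomorphism $\overline{G} \to \overline{H}$, and then evaluates $Q$ and $R$ directly from their definitions for the 5-cycle.

```python
from itertools import combinations, product

def complement(G):
    """Complement graph: same vertices, exactly the non-edges become edges."""
    vertices, edges = G
    return vertices, {frozenset(p) for p in combinations(vertices, 2)} - edges

def has_homomorphism(G, H):
    """Brute force over all maps V(G) -> V(H); every edge must map to an edge."""
    (VG, EG), (VH, EH) = G, H
    for image in product(VH, repeat=len(VG)):
        f = dict(zip(VG, image))
        if all(frozenset((f[u], f[v])) in EH for u, v in map(tuple, EG)):
            return True
    return False

def leq(G, H):
    """The preorder of this chapter: G <= H iff complement(G) maps homomorphically to complement(H)."""
    return has_homomorphism(complement(G), complement(H))

def edgeless(n):
    """The graph with n vertices and no edges, i.e. the complement of K_n."""
    return list(range(n)), set()

def subrank(G):
    """Q(G) = max m such that edgeless(m) <= G."""
    m = 0
    while leq(edgeless(m + 1), G):
        m += 1
    return m

def rank(G):
    """R(G) = min n such that G <= edgeless(n)."""
    n = 0
    while not leq(G, edgeless(n)):
        n += 1
    return n

C5 = (list(range(5)), {frozenset((i, (i + 1) % 5)) for i in range(5)})
print(subrank(C5), rank(C5))  # 2 3
```

For $C_5$ these values agree with $\alpha(C_5) = 2$ and $\overline{\chi}(C_5) = \chi(\overline{C_5}) = 3$, as the lemma predicts. The search is exponential in the number of vertices, so it is only suitable for toy examples.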

We see directly from Lemma 3.5 that the asymptotic subrank is the Shannon capacity,

\[ \tilde{Q}(G) = \lim_{N \to \infty} Q(G^{\boxtimes N})^{1/N} = \lim_{N \to \infty} \alpha(G^{\boxtimes N})^{1/N} = \Theta(G), \]

and that the asymptotic rank is the asymptotic clique cover number,

\[ \tilde{R}(G) = \lim_{N \to \infty} R(G^{\boxtimes N})^{1/N} = \lim_{N \to \infty} \overline{\chi}(G^{\boxtimes N})^{1/N} = \tilde{\overline{\chi}}(G). \]

Let $S \subseteq \mathcal{G}$ be a subsemiring. Let $G \in S$.

Corollary 3.6. $\Theta(G) = \min_{\phi \in \mathbf{X}(S)} \phi(G)$.

Proof. Let $G$ be a graph. Either $G = K_0$, or $K_1 \leqslant G \leqslant K_1$, or $\overline{G}$ contains at least one edge. In the first two cases the claim is clearly true. In the third case $G \geqslant \overline{K_2}$, and we may thus apply Corollary 2.13.

Corollary 3.7. $\tilde{\overline{\chi}}(G) = \max_{\phi \in \mathbf{X}(S)} \phi(G)$.

Proof. This is Corollary 2.14.

Remark 3.8. As mentioned earlier, it turns out that $\tilde{\overline{\chi}}$ is in fact itself an element of $\mathbf{X}(\mathcal{G})$. See Section 3.3.2. (This is a striking difference with the situation for tensors, which we will discuss in Chapter 4: there both asymptotic rank and asymptotic subrank are not in the asymptotic spectrum, see Remark 4.4.)

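Shannon capacity is not in the asymptotic spectrum

Lemma 3.9. $G \boxtimes \overline{G} \geqslant \overline{K_{|V(G)|}}$.

Proof. Let $D = \{(u, u) : u \in V(G)\}$. Let $(u, u), (v, v) \in D$ be distinct. Then either $\{u, v\} \in E(G)$ or $\{u, v\} \in E(\overline{G})$ (exclusive or), and so $\{(u, u), (v, v)\} \notin E(G \boxtimes \overline{G})$. Therefore the subgraph in $G \boxtimes \overline{G}$ induced by $D$ is isomorphic to $\overline{K_{|V(G)|}}$.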
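The diagonal argument in the proof can be verified mechanically. This is a small sketch on a hypothetical test graph (the path 0 - 1 - 2 - 3, chosen only for illustration); the helper names are not from the text.

```python
from itertools import combinations

def complement_edges(vertices, edges):
    """Edge set of the complement graph."""
    return {frozenset(p) for p in combinations(vertices, 2)} - edges

def adjacent_in_strong_product(EG, EH, x, y):
    """Adjacency of x = (g, h) and y = (g2, h2) in the strong product of graphs with edge sets EG, EH."""
    (g, h), (g2, h2) = x, y
    return (
        x != y
        and (g == g2 or frozenset((g, g2)) in EG)
        and (h == h2 or frozenset((h, h2)) in EH)
    )

# Hypothetical test graph: the path 0 - 1 - 2 - 3.
V = [0, 1, 2, 3]
E = {frozenset((0, 1)), frozenset((1, 2)), frozenset((2, 3))}
E_bar = complement_edges(V, E)

# The diagonal D = {(u, u)} of the product of G with its complement.
diagonal = [(u, u) for u in V]
diagonal_is_independent = not any(
    adjacent_in_strong_product(E, E_bar, x, y) for x, y in combinations(diagonal, 2)
)
print(len(diagonal), diagonal_is_independent)  # 4 True
```

Two distinct diagonal vertices would need $\{u, v\}$ to be an edge of both $G$ and $\overline{G}$ to be adjacent, which is exactly what the check rules out.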

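Example 3.10. Let $G$ be the Schläfli graph. This is a graph with 27 vertices. Thus, by Lemma 3.9, $\Theta(G \boxtimes \overline{G}) \geq |V(G)| = 27$. On the other hand, Haemers in [Hae79] showed that $\Theta(G)\Theta(\overline{G}) \leq 21$. This implies that the map $\Theta$ is not in $\mathbf{X}(\mathcal{G})$, since it is not multiplicative under $\boxtimes$.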

3.3 Universal spectral points

The abstract theory of asymptotic spectra of Chapter 2 does not explicitly describe the elements of $\mathbf{X}(\mathcal{G})$, i.e. the universal spectral points (cf. Section 2.13). However, several graph parameters from the literature can be shown to be universal spectral points. In fact, recently in [BC18] the first infinite family of universal spectral points was found: the fractional Haemers bounds. We give a brief (and probably incomplete) overview of currently known elements in $\mathbf{X}(\mathcal{G})$.

3.3.1 Lovász theta number $\vartheta$

For any real symmetric matrix $A$, let $\Lambda(A)$ be the largest eigenvalue. The Lovász theta number $\vartheta(G)$ is defined as

\[ \vartheta(G) = \min\bigl\{ \Lambda(A) : A \in \mathbb{R}^{V(G) \times V(G)} \text{ symmetric},\ \{u, v\} \notin E(G) \Rightarrow A_{uv} = 1 \bigr\}. \]

The parameter $\vartheta(G)$ was introduced by Lovász in [Lov79]. We refer to [Knu94] and [Sch03] for a survey. It follows from well-known properties that $\vartheta \in \mathbf{X}(\mathcal{G})$.
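Any feasible matrix $A$ in the definition gives an upper bound $\vartheta(G) \leq \Lambda(A)$. As a hedged illustration for $G = C_5$, the sketch below assumes the circulant ansatz $A = J + t \cdot \mathrm{Adj}(C_5)$ with $t = (\sqrt{5} - 5)/2$ (this particular choice of $t$ is an assumption made for the example, not taken from the text) and evaluates the eigenvalues with the standard cosine formula for symmetric circulant matrices.

```python
import math

def circulant_eigenvalues(first_row):
    """Eigenvalues of a real symmetric circulant matrix, given its first row."""
    n = len(first_row)
    return [
        sum(c * math.cos(2 * math.pi * j * k / n) for j, c in enumerate(first_row))
        for k in range(n)
    ]

# C_5 on vertices 0..4 with edges {i, i+1 mod 5}.
# Feasibility: entries are 1 on the diagonal and on non-edges; edge entries are free.
t = (math.sqrt(5) - 5) / 2                      # assumed choice; edge entries equal 1 + t
first_row = [1.0, 1.0 + t, 1.0, 1.0, 1.0 + t]   # positions 1 and 4 are the neighbours of vertex 0
largest = max(circulant_eigenvalues(first_row))
print(largest, math.sqrt(5))
```

The largest eigenvalue of this matrix is $\sqrt{5}$ up to floating-point error, so $\vartheta(C_5) \leq \sqrt{5}$; that this bound is tight is the result of [Lov79] quoted in the introduction.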

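3.3.2 Fractional graph parameters

Besides the Lovász theta number, there are several elements in $\mathbf{X}(\mathcal{G})$ that are naturally obtained as fractional versions of $\boxtimes$-submultiplicative, $\sqcup$-subadditive, $\leqslant$-monotone maps $\mathcal{G} \to \mathbb{R}_{\geq 0}$. For any map $\phi : \mathcal{G} \to \mathbb{R}_{\geq 0}$ we define a fractional version $\phi_f$ by

\[ \phi_f(G) = \inf_d \frac{\phi(G \ltimes \overline{K_d})}{d}, \]

where $\ltimes$ denotes the lexicographic product defined below. We will discuss several fractional parameters from the literature and prove a general theorem about fractional parameters.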

Fractional clique cover number

We consider the fractional version of the clique cover number $\overline{\chi}(G) = \chi(\overline{G})$. It is well-known that $\overline{\chi}_f \in \mathbf{X}(\mathcal{G})$, see e.g. [Sch03]. The fractional clique cover number $\overline{\chi}_f$ in fact equals the asymptotic clique cover number $\tilde{\overline{\chi}}(G) = \lim_{N \to \infty} \overline{\chi}(G^{\boxtimes N})^{1/N}$ which we introduced in the previous section, see [MP71] and also [Sch03, Th. 67.17].
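To make the fractional version concrete, the following sketch evaluates $\overline{\chi}(G \ltimes \overline{K_d})/d$ for $G = C_5$ and $d = 1, 2$. It assumes the graph representation (vertex list, set of `frozenset` edges) and uses a plain backtracking colouring routine; all function names are illustrative.

```python
from itertools import combinations, product

def complement(G):
    vertices, edges = G
    return vertices, {frozenset(p) for p in combinations(vertices, 2)} - edges

def lexicographic_product(G, H):
    """(g,h) ~ (g2,h2) iff {g,g2} is an edge of G, or g == g2 and {h,h2} is an edge of H."""
    (VG, EG), (VH, EH) = G, H
    vertices = list(product(VG, VH))
    edges = set()
    for (g, h), (g2, h2) in combinations(vertices, 2):
        if frozenset((g, g2)) in EG or (g == g2 and frozenset((h, h2)) in EH):
            edges.add(frozenset(((g, h), (g2, h2))))
    return vertices, edges

def chromatic_number(G):
    """Smallest k admitting a proper k-colouring, found by backtracking."""
    vertices, edges = G
    neighbours = {v: set() for v in vertices}
    for e in edges:
        u, v = tuple(e)
        neighbours[u].add(v)
        neighbours[v].add(u)

    def colourable(k):
        colour = {}

        def extend(i, used):
            if i == len(vertices):
                return True
            v = vertices[i]
            # Symmetry breaking: reuse an old colour or open at most one new colour.
            for c in range(min(used + 1, k)):
                if all(colour.get(w) != c for w in neighbours[v]):
                    colour[v] = c
                    if extend(i + 1, max(used, c + 1)):
                        return True
                    del colour[v]
            return False

        return extend(0, 0)

    k = 0
    while not colourable(k):
        k += 1
    return k

def clique_cover_number(G):
    """Clique cover number = chromatic number of the complement."""
    return chromatic_number(complement(G))

C5 = (list(range(5)), {frozenset((i, (i + 1) % 5)) for i in range(5)})
ratios = []
for d in (1, 2):
    edgeless_d = (list(range(d)), set())   # complement of K_d
    value = clique_cover_number(lexicographic_product(C5, edgeless_d))
    ratios.append((d, value, value / d))
print(ratios)  # [(1, 3, 3.0), (2, 5, 2.5)]
```

Already at $d = 2$ the ratio drops from $3$ to $5/2$, which is the known value of the fractional clique cover number of $C_5$; the infimum over all $d$ is not computed here.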

Fractional Haemers bound

Let $\operatorname{rank}(A)$ denote the matrix rank of any matrix $A$. For any set $C$ of matrices define $\operatorname{rank}(C) = \min\{\operatorname{rank}(A) : A \in C\}$. For a field $\mathbb{F}$ and a graph $G$ define the set of matrices

\[ M^{\mathbb{F}}(G) = \bigl\{ A \in \mathbb{F}^{V(G) \times V(G)} : \forall u, v\ A_{vv} \neq 0,\ \{u, v\} \notin E(G) \Rightarrow A_{uv} = 0 \bigr\}. \]

Let $R^{\mathbb{F}}(G) = \operatorname{rank}(M^{\mathbb{F}}(G))$. The parameter $R^{\mathbb{F}}(G)$ was introduced by Haemers in [Hae79] and is known as the Haemers bound. The fractional Haemers bound $R^{\mathbb{F}}_f$ was studied by Anna Blasiak in [Bla13] and was recently shown to be $\boxtimes$-multiplicative by Bukh and Cox in [BC18]. From this it is not hard to prove that $R^{\mathbb{F}}_f \in \mathbf{X}(\mathcal{G})$. Bukh and Cox in [BC18] furthermore prove a separation result: for any field $\mathbb{F}$ of nonzero characteristic and any $\varepsilon > 0$, there is a graph $G$ such that for any field $\mathbb{F}'$ with $\operatorname{char}(\mathbb{F}) \neq \operatorname{char}(\mathbb{F}')$ the inequality $R^{\mathbb{F}}_f(G) < \varepsilon R^{\mathbb{F}'}_f(G)$ holds. This separation result implies that there are infinitely many elements in $\mathbf{X}(\mathcal{G})$.

Fractional orthogonal rank

In [CMR+14] the orthogonal rank $\xi(G)$ and its fractional version, the projective rank $\xi_f(G)$, are studied. It easily follows from results in [CMR+14] that $G \mapsto \xi_f(\overline{G})$ is in $\mathbf{X}(\mathcal{G})$.

General fractional parameters

We will prove something general about fractional parameters. Define the lexicographic product $G \ltimes H$ by

\[ V(G \ltimes H) = V(G) \times V(H), \]

\[ E(G \ltimes H) = \bigl\{ \{(g, h), (g', h')\} : \{g, g'\} \in E(G) \text{ or } (g = g' \text{ and } \{h, h'\} \in E(H)) \bigr\}. \]

The lexicographic product satisfies $\overline{G \ltimes H} = \overline{G} \ltimes \overline{H}$. Also define the or-product $G * H$ by

\[ V(G * H) = V(G) \times V(H), \]

\[ E(G * H) = \bigl\{ \{(g, h), (g', h')\} : \{g, g'\} \in E(G) \text{ or } \{h, h'\} \in E(H) \bigr\}. \]

The or-product and the strong graph product are related by $\overline{G * H} = \overline{G} \boxtimes \overline{H}$. The strong graph product gives a subgraph of the lexicographic product, which gives a subgraph of the or-product:

\[ G \boxtimes H \subseteq G \ltimes H \subseteq G * H. \]

Therefore $G * H \leqslant G \ltimes H \leqslant G \boxtimes H$. Finally, $G \ltimes \overline{K_d} = G * \overline{K_d}$ and of course $G \boxtimes \overline{K_d} = G^{\sqcup d}$.
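The chain of inclusions and the two identities for $\overline{K_d}$ can be checked on small examples. The sketch below builds the three edge sets side by side; the test graphs (a path on three vertices and a 4-cycle) and the function name `product_edge_sets` are chosen only for illustration.

```python
from itertools import combinations, product

def product_edge_sets(G, H):
    """Edge sets of the strong, lexicographic and or-product on the vertex set V(G) x V(H)."""
    (VG, EG), (VH, EH) = G, H
    strong, lex, orp = set(), set(), set()
    for (g, h), (g2, h2) in combinations(list(product(VG, VH)), 2):
        eg = frozenset((g, g2)) in EG
        eh = frozenset((h, h2)) in EH
        pair = frozenset(((g, h), (g2, h2)))
        if (eg or g == g2) and (eh or h == h2):
            strong.add(pair)
        if eg or (g == g2 and eh):
            lex.add(pair)
        if eg or eh:
            orp.add(pair)
    return strong, lex, orp

# Hypothetical test graphs: a path on 3 vertices and a 4-cycle.
P3 = ([0, 1, 2], {frozenset((0, 1)), frozenset((1, 2))})
C4 = ([0, 1, 2, 3], {frozenset((i, (i + 1) % 4)) for i in range(4)})

strong, lex, orp = product_edge_sets(P3, C4)
print(strong <= lex <= orp)  # True

# With the edgeless graph on d vertices as second factor:
d = 3
edgeless = (list(range(d)), set())
strong_e, lex_e, orp_e = product_edge_sets(C4, edgeless)
print(lex_e == orp_e, len(strong_e) == d * len(C4[1]))  # True True
```

The last line reflects that the strong product with the edgeless graph on $d$ vertices consists of $d$ disjoint copies of $G$, so it has $d \cdot |E(G)|$ edges.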

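We will prove: if $\phi : \mathcal{G} \to \mathbb{R}_{\geq 0}$ is $\boxtimes$-submultiplicative, $\sqcup$-subadditive and $\leqslant$-monotone, then $\phi_f$ is again $\boxtimes$-submultiplicative, $\sqcup$-subadditive and $\leqslant$-monotone. Moreover, if $\phi : \mathcal{G} \to \mathbb{N}$ is $\leqslant$-monotone and satisfies

\[ \forall G, H \in \mathcal{G} \quad \phi(G \ltimes H) \geq \phi(G \ltimes \overline{K_{\phi(H)}}), \]

then $\phi_f$ is $\ltimes$-supermultiplicative and, more importantly, $\phi_f$ is $\boxtimes$-supermultiplicative.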

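Lemma 3.11.

(i) If $\phi$ is $\sqcup$-superadditive, then $\phi_f$ is $\sqcup$-superadditive.

(ii) If $\phi$ is $\leqslant$-monotone, then $\phi_f$ is $\leqslant$-monotone.

(iii) If $\phi$ is $\sqcup$-subadditive and $\leqslant$-monotone, then $\phi_f$ is $\sqcup$-subadditive.

(iv) If $\forall n \in \mathbb{N}\ \phi(\overline{K_n}) = n$, then $\forall n \in \mathbb{N}\ \phi_f(\overline{K_n}) = n$.

(v) If $\phi$ is $\boxtimes$-submultiplicative and $\leqslant$-monotone, then $\phi_f$ is $\boxtimes$-submultiplicative.

Proof. Let $G, H \in \mathcal{G}$. Let $d \in \mathbb{N}$.

(i) The lexicographic product distributes over the disjoint union,

\[ (G \sqcup H) \ltimes \overline{K_d} = (G \ltimes \overline{K_d}) \sqcup (H \ltimes \overline{K_d}). \]

By superadditivity,

\[ \phi\bigl((G \ltimes \overline{K_d}) \sqcup (H \ltimes \overline{K_d})\bigr) \geq \phi(G \ltimes \overline{K_d}) + \phi(H \ltimes \overline{K_d}). \]

Therefore

\[
\begin{aligned}
\phi_f(G \sqcup H) &= \inf_d \frac{\phi((G \sqcup H) \ltimes \overline{K_d})}{d} \\
&= \inf_d \frac{\phi((G \ltimes \overline{K_d}) \sqcup (H \ltimes \overline{K_d}))}{d} \\
&\geq \inf_d \Bigl( \frac{\phi(G \ltimes \overline{K_d})}{d} + \frac{\phi(H \ltimes \overline{K_d})}{d} \Bigr) \\
&\geq \inf_{d_1} \frac{\phi(G \ltimes \overline{K_{d_1}})}{d_1} + \inf_{d_2} \frac{\phi(H \ltimes \overline{K_{d_2}})}{d_2} \\
&= \phi_f(G) + \phi_f(H).
\end{aligned}
\]

(ii) Let $G \leqslant H$. Then $G \ltimes \overline{K_d} \leqslant H \ltimes \overline{K_d}$. Thus $\phi(G \ltimes \overline{K_d}) \leq \phi(H \ltimes \overline{K_d})$. Therefore $\phi_f(G) \leq \phi_f(H)$.

(iii) We have $G \ltimes \overline{K_d} \leqslant G \boxtimes \overline{K_d} = G^{\sqcup d}$. Thus, by monotonicity and subadditivity,

\[ \phi(G \ltimes \overline{K_d}) \leq d\,\phi(G), \]

and for $d, e \in \mathbb{N}$,

\[ \phi(G \ltimes \overline{K_{de}}) = \phi((G \ltimes \overline{K_d}) \ltimes \overline{K_e}) \leq e\,\phi(G \ltimes \overline{K_d}). \]

We use this inequality to get, for $d_1, d_2 \in \mathbb{N}$,

\[ \frac{\phi(G \ltimes \overline{K_{d_1}})}{d_1} + \frac{\phi(H \ltimes \overline{K_{d_2}})}{d_2} \geq \frac{\phi(G \ltimes \overline{K_{d_1 d_2}}) + \phi(H \ltimes \overline{K_{d_1 d_2}})}{d_1 d_2}. \]

From subadditivity follows

\[
\begin{aligned}
\frac{\phi(G \ltimes \overline{K_{d_1 d_2}}) + \phi(H \ltimes \overline{K_{d_1 d_2}})}{d_1 d_2}
&\geq \frac{\phi((G \ltimes \overline{K_{d_1 d_2}}) \sqcup (H \ltimes \overline{K_{d_1 d_2}}))}{d_1 d_2} \\
&= \frac{\phi((G \sqcup H) \ltimes \overline{K_{d_1 d_2}})}{d_1 d_2} \\
&\geq \phi_f(G \sqcup H).
\end{aligned}
\]

We conclude $\phi_f(G) + \phi_f(H) \geq \phi_f(G \sqcup H)$.

(iv) Let $n \in \mathbb{N}$. Then $\phi_f(\overline{K_n}) = \inf_d \phi(\overline{K_n} \ltimes \overline{K_d})/d = \inf_d \phi(\overline{K_{nd}})/d = n$.

(v) Let $d_1, d_2 \in \mathbb{N}$. We claim

\[ (G \boxtimes H) \ltimes \overline{K_{d_1 d_2}} \leqslant (G \ltimes \overline{K_{d_1}}) \boxtimes (H \ltimes \overline{K_{d_2}}). \]

This is the same as saying there is a graph homomorphism

\[ \overline{(G \boxtimes H) \ltimes \overline{K_{d_1 d_2}}} \to \overline{(G \ltimes \overline{K_{d_1}}) \boxtimes (H \ltimes \overline{K_{d_2}})}, \]

which is the same as saying there is a graph homomorphism

\[ (\overline{G} * \overline{H}) \ltimes K_{d_1 d_2} \to (\overline{G} \ltimes K_{d_1}) * (\overline{H} \ltimes K_{d_2}), \]

where $*$ denotes the or-product of graphs. One verifies that $(g, h, (i, j)) \mapsto ((g, i), (h, j))$ is such a graph homomorphism, proving the claim. The claim together with monotonicity and submultiplicativity gives

\[ \phi((G \boxtimes H) \ltimes \overline{K_{d_1 d_2}}) \leq \phi((G \ltimes \overline{K_{d_1}}) \boxtimes (H \ltimes \overline{K_{d_2}})) \leq \phi(G \ltimes \overline{K_{d_1}})\,\phi(H \ltimes \overline{K_{d_2}}). \]

Therefore

\[
\begin{aligned}
\phi_f(G \boxtimes H) &= \inf_d \frac{\phi((G \boxtimes H) \ltimes \overline{K_d})}{d} \\
&= \inf_{d_1, d_2} \frac{\phi((G \boxtimes H) \ltimes \overline{K_{d_1 d_2}})}{d_1 d_2} \\
&\leq \inf_{d_1, d_2} \frac{\phi(G \ltimes \overline{K_{d_1}})}{d_1} \cdot \frac{\phi(H \ltimes \overline{K_{d_2}})}{d_2} \\
&= \phi_f(G)\,\phi_f(H).
\end{aligned}
\]

This concludes the proof of the lemma.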

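Lemma 3.12. Let $\phi : \mathcal{G} \to \mathbb{N}$ satisfy

\[ \forall G, H \in \mathcal{G} \quad \phi(G \ltimes H) \geq \phi(G \ltimes \overline{K_{\phi(H)}}). \tag{3.2} \]

Then

\[ \inf_H \frac{\phi(G \ltimes H)}{\phi(H)} = \inf_d \frac{\phi(G \ltimes \overline{K_d})}{d}. \]

Proof. From (3.2) follows

\[ \frac{\phi(G \ltimes H)}{\phi(H)} \geq \frac{\phi(G \ltimes \overline{K_{\phi(H)}})}{\phi(H)}, \]

and so

\[ \frac{\phi(G \ltimes H)}{\phi(H)} \geq \inf_d \frac{\phi(G \ltimes \overline{K_d})}{d}. \]

We take the infimum over $H$ to get

\[ \inf_H \frac{\phi(G \ltimes H)}{\phi(H)} \geq \inf_d \frac{\phi(G \ltimes \overline{K_d})}{d}. \]

The inequality in the other direction,

\[ \inf_H \frac{\phi(G \ltimes H)}{\phi(H)} \leq \inf_d \frac{\phi(G \ltimes \overline{K_d})}{d}, \]

is trivially true.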

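Lemma 3.13. Let $\phi : \mathcal{G} \to \mathbb{N}$ be $\leqslant$-monotone and satisfy

\[ \forall G, H \in \mathcal{G} \quad \phi(G \ltimes H) \geq \phi(G \ltimes \overline{K_{\phi(H)}}). \]

Then $\phi_f$ is $\ltimes$- and $\boxtimes$-supermultiplicative.

Proof. Let $A, B \in \mathcal{G}$. We have $A \boxtimes B \geqslant A \ltimes B$, so

\[ \phi_f(A \boxtimes B) \geq \phi_f(A \ltimes B). \]

It remains to show $\phi_f(A \ltimes B) \geq \phi_f(A)\phi_f(B)$. We have

\[ \frac{\phi(A \ltimes B \ltimes H)}{\phi(H)} = \frac{\phi(A \ltimes (B \ltimes H))}{\phi(B \ltimes H)} \cdot \frac{\phi(B \ltimes H)}{\phi(H)}, \]

which, by Lemma 3.12, implies

\[ \frac{\phi(A \ltimes B \ltimes H)}{\phi(H)} \geq \inf_{H'} \frac{\phi(A \ltimes H')}{\phi(H')} \cdot \inf_{H''} \frac{\phi(B \ltimes H'')}{\phi(H'')} = \phi_f(A)\,\phi_f(B). \]

Take the infimum over $H$ to obtain $\phi_f(A \ltimes B) \geq \phi_f(A)\phi_f(B)$.

Theorem 3.14. Let $\phi : \mathcal{G} \to \mathbb{N}$ be $\sqcup$-additive, $\boxtimes$-submultiplicative, $\leqslant$-monotone and $\overline{K_n}$-normalised, and satisfy

\[ \forall G, H \in \mathcal{G} \quad \phi(G \ltimes H) \geq \phi(G \ltimes \overline{K_{\phi(H)}}). \]

Then $\phi_f$ is in $\mathbf{X}(\mathcal{G})$.

Proof. This follows from Lemma 3.11, Lemma 3.12 and Lemma 3.13.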

3.4 Conclusion

In this chapter we introduced a new connection between Strassen's theory of asymptotic spectra and the Shannon capacity of graphs. In particular, we characterised the Shannon capacity (which is defined as a supremum) as a minimisation over elements in the asymptotic spectrum of graphs. Known elements in the asymptotic spectrum of graphs include the fractional clique cover number, the Lovász theta number, the projective rank and the fractional Haemers bound. We are left with a clear goal for future work: find all elements in the asymptotic spectrum of graphs.

Chapter 4

The asymptotic spectrum of tensors; exponent of matrix multiplication

This chapter is based on joint work with Matthias Christandl and Péter Vrana [CVZ18].

4.1 Introduction

This chapter is about tensors $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ and their asymptotic properties. The theory of asymptotic spectra of Chapter 2 was developed by Strassen exactly for the purpose of understanding the asymptotic properties of tensors. This chapter is expository and provides the necessary background for understanding Chapter 5 and Chapter 6.

Let us first define the asymptotic properties of interest and discuss some of their applications. We need the concepts restriction, tensor product and diagonal tensor. Let $s \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ and $t \in \mathbb{F}^{m_1} \otimes \cdots \otimes \mathbb{F}^{m_k}$ be tensors. We say $s$ restricts to $t$, and write $s \geqslant t$, if there are linear maps $A_i : \mathbb{F}^{n_i} \to \mathbb{F}^{m_i}$ such that $t = (A_1 \otimes \cdots \otimes A_k) \cdot s$. The tensor product of $s$ and $t$ is the element $s \otimes t \in \mathbb{F}^{n_1 m_1} \otimes \cdots \otimes \mathbb{F}^{n_k m_k}$ with coordinates $(s \otimes t)_{ij} = s_i t_j$. We naturally define the direct sum $s \oplus t \in \mathbb{F}^{n_1 + m_1} \otimes \cdots \otimes \mathbb{F}^{n_k + m_k}$. We define the diagonal tensors $\langle n \rangle = \sum_{i=1}^{n} e_i \otimes \cdots \otimes e_i$ for $n \in \mathbb{N}$, where $e_1, \ldots, e_n$ is the standard basis of $\mathbb{F}^n$. The tensor rank $R(t)$ is the smallest number $n \in \mathbb{N}$ such that $t$ can be written as a sum of $n$ simple tensors, a simple tensor being a tensor of the form $v_1 \otimes \cdots \otimes v_k$. Equivalently, $R(t) = \min\{n \in \mathbb{N} : t \leqslant \langle n \rangle\}$. The asymptotic rank is the regularisation $\tilde{R}(t) = \lim_{n \to \infty} R(t^{\otimes n})^{1/n}$. While tensor rank is known to be hard to compute [Has90, Shi16], we do not know whether asymptotic rank is hard to compute.

The exponent of matrix multiplication

The motivating example for studying asymptotic rank is the problem of finding the exponent of matrix multiplication $\omega$. Recall from the introduction that $\omega$ is the infimum over $a \in \mathbb{R}$ such that two $n \times n$ matrices can be multiplied using $O(n^a)$ arithmetic operations (in the algebraic circuit model). It turns out (see [BCS97]) that $\omega$ is characterised by the asymptotic rank $\tilde{R}(\langle 2, 2, 2 \rangle)$ of the matrix multiplication tensor

\[ \langle 2, 2, 2 \rangle = \sum_{i, j, k \in [2]} e_{ij} \otimes e_{jk} \otimes e_{ki} \in \mathbb{F}^4 \otimes \mathbb{F}^4 \otimes \mathbb{F}^4. \]

Namely, $\tilde{R}(\langle 2, 2, 2 \rangle) = 2^{\omega}$. We know the trivial lower bound $2 \leq \omega$, see Section 4.3. We know the (non-trivial) upper bound $\omega \leq 2.3728639$, which is by Coppersmith and Winograd [CW90] and improvements by Stothers, Williams and Le Gall [Sto10, Wil12, LG14].

Asymptotic subrank and asymptotic restriction

Besides (asymptotic) rank, we naturally define subrank $Q(t) = \max\{m \in \mathbb{N} : \langle m \rangle \leqslant t\}$ and the asymptotic subrank $\tilde{Q}(t) = \lim_{n \to \infty} Q(t^{\otimes n})^{1/n}$. Moreover, we say $s$ restricts asymptotically to $t$, written $s \gtrsim t$, if there is a sequence of natural numbers $a(n) \in o(n)$ such that for all $n \in \mathbb{N}$

\[ s^{\otimes n} \otimes \langle 2 \rangle^{\otimes a(n)} \geqslant t^{\otimes n}. \]

One can prove (see [Str91]) that

\[ s^{\otimes n} \otimes \langle 2 \rangle^{\otimes o(n)} \geqslant t^{\otimes n} \quad \text{iff} \quad s^{\otimes (n + o(n))} \geqslant t^{\otimes n}. \]

Our goal is to understand asymptotic restriction, asymptotic rank and asymptotic subrank.

More connections: quantum information, combinatorics, algebraic property testing

Besides matrix multiplication, other applications of asymptotic restriction of tensors, asymptotic rank of tensors and asymptotic subrank of tensors include deciding the feasibility of an asymptotic transformation between pure quantum states via stochastic local operations and classical communication (SLOCC) in quantum information theory [BPR+00, DVC00, VDDMV02, HHHH09], bounding the size of combinatorial structures like cap sets and tri-colored sum-free sets in additive combinatorics [Ede04, Tao08, ASU13, CLP17, EG17, Tao16, BCC+17, KSS16, TS16] (see Chapter 5), and bounding the query complexity of certain properties in algebraic property testing [KS08, BCSX10, Sha09, BX15, HX17, FK14].

This chapter is organised as follows. In Section 4.2 we briefly discuss the semiring of tensors, the asymptotic spectrum of tensors, and asymptotic rank and subrank. In Section 4.3 we discuss the gauge points, a simple construction of finitely many elements in the asymptotic spectrum of tensors. In Section 4.4 we discuss the Strassen support functionals, a family of elements in the asymptotic spectrum of "oblique" tensors. This family is parametrised by probability distributions on $[k]$. In Section 4.5 we discuss an extension of the support functionals, called the Strassen upper support functionals, which have the potential to be universal. Finally, in Section 4.6, we prove a new result: we show how asymptotic slice rank is related to the support functionals.

4.2 The asymptotic spectrum of tensors

Let us properly set up the semiring of tensors and the asymptotic spectrum. For the proofs we refer to [Str87, Str88, Str91].

4.2.1 The semiring of tensor equivalence classes $\mathcal{T}$

We begin by putting an equivalence relation on tensors. For example, we want to identify isomorphic tensors, and also, for any tensor $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$, we want to identify $t$ with $t \oplus 0$, where $0 \in \mathbb{F}^{m_1} \otimes \cdots \otimes \mathbb{F}^{m_k}$ is a zero tensor of any format.

We say $s$ is isomorphic to $t$, and write $s \cong t$, if there are bijective linear maps $A_i : \mathbb{F}^{m_i} \to \mathbb{F}^{n_i}$ such that $t = (A_1, \ldots, A_k) \cdot s$.

We say $s$ and $t$ are equivalent, and write $s \sim t$, if there are zero tensors $s_0 = 0 \in \mathbb{F}^{a_1} \otimes \cdots \otimes \mathbb{F}^{a_k}$ and $t_0 = 0 \in \mathbb{F}^{b_1} \otimes \cdots \otimes \mathbb{F}^{b_k}$ such that $s \oplus s_0 \cong t \oplus t_0$. The equivalence relation $\sim$ is in fact the equivalence relation generated by the restriction preorder $\leqslant$.

Let $\mathcal{T}$ be the set of $\sim$-equivalence classes of $k$-tensors over $\mathbb{F}$, for some fixed $k$ and field $\mathbb{F}$. The direct sum and the tensor product naturally carry over to $\mathcal{T}$, and $\mathcal{T}$ becomes a semiring with additive unit $\langle 0 \rangle$ and multiplicative unit $\langle 1 \rangle$ (more precisely, the equivalence classes of those tensors, but we will not make this distinction).

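4.2.2 Strassen preorder via restriction

Restriction $\leqslant$ induces a partial order on $\mathcal{T}$ which behaves well with respect to the semiring operations, and naturally $n \leq m$ if and only if $\langle n \rangle \leqslant \langle m \rangle$. Therefore restriction $\leqslant$ is a Strassen preorder on $\mathcal{T}$.

4.2.3 The asymptotic spectrum of tensors $\mathbf{X}(\mathcal{T})$

Let $S \subseteq \mathcal{T}$ be a subsemiring. Let

\[ \mathbf{X}(S) = \mathbf{X}(S, \leqslant) = \{ \phi \in \operatorname{Hom}(S, \mathbb{R}_{\geq 0}) : \forall a, b \in S\ a \leqslant b \Rightarrow \phi(a) \leq \phi(b) \}. \]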

We call $\mathbf{X}(S)$ the asymptotic spectrum of $S$ and we call $\mathbf{X}(\mathcal{T})$ the asymptotic spectrum of $k$-tensors over $\mathbb{F}$.

Theorem 4.1 ([Str88]). Let $s, t \in S$. Then $s \lesssim t$ iff $\forall \phi \in \mathbf{X}(S)\ \phi(s) \leq \phi(t)$.

Proof. This follows from Theorem 2.12.

We refer to Chapter 2 for a discussion of the topological properties of $\mathbf{X}(S)$.

Remark 4.2. We mention that $\mathbf{X}(S)$ may equivalently be defined with degeneration $\trianglerighteq$ instead of restriction $\geqslant$. Over $\mathbb{C}$, we say $f$ degenerates to $g$, written $f \trianglerighteq g$, if $f \cong f'$ and $g \cong g'$ and $g'$ is in the Euclidean closure (or equivalently Zariski closure) of the orbit $\mathrm{GL}_{n_1} \times \cdots \times \mathrm{GL}_{n_k} \cdot f'$. It is a nontrivial fact from algebraic geometry (see [Kra84, Lemma III.2.3.1] or [BCS97]) that there is a degeneration $f \trianglerighteq g$ if and only if there are matrices $A_i$ with entries polynomial in $\varepsilon$ such that $(A_1, \ldots, A_k) \cdot f = \varepsilon^d g + \varepsilon^{d+1} g_1 + \cdots + \varepsilon^{d+e} g_e$ for some elements $g_1, \ldots, g_e$. The latter definition of degeneration is valid when $\mathbb{C}$ is replaced by an arbitrary field $\mathbb{F}$, and that is how degeneration is defined for an arbitrary field. Degeneration is weaker than restriction: $f \geqslant g$ implies $f \trianglerighteq g$. Asymptotically, however, the notions coincide: $f \gtrsim g$ if and only if $f^{\otimes n} \otimes \langle 2 \rangle^{\otimes o(n)} \trianglerighteq g^{\otimes n}$. We mention that, analogous to restriction, degeneration gives rise to border rank and border subrank, $\underline{R}(f) = \min\{r \in \mathbb{N} : f \trianglelefteq \langle r \rangle\}$ and $\underline{Q}(f) = \max\{s \in \mathbb{N} : \langle s \rangle \trianglelefteq f\}$ respectively.

4.2.4 Asymptotic rank and asymptotic subrank

The abstract theory of asymptotic spectra characterises asymptotic subrank and asymptotic rank as follows.

Corollary 4.3. Let $S \subseteq \mathcal{T}$ be a subsemiring. Let $a \in S$. Then

\[ \tilde{Q}(a) = \min_{\phi \in \mathbf{X}(S)} \phi(a), \tag{4.1} \]

\[ \tilde{R}(a) = \max_{\phi \in \mathbf{X}(S)} \phi(a). \tag{4.2} \]

Proof. Statement (4.2) follows from Corollary 2.13, since either $a = 0$ or $a \geqslant 1$. For statement (4.1), if $a^{\otimes k} \geqslant 2$ for some $k \in \mathbb{N}$, then we apply Corollary 2.14. Otherwise, one can show that $\tilde{Q}(a)$ equals 0 or 1, using the gauge points of the next section (see [Str88, Lemma 3.7]).

Remark 4.4. One verifies that $\tilde{R}$ and $\tilde{Q}$ are $\leqslant$-monotones and have value $n$ on $\langle n \rangle$. They are not universal spectral points, however. Namely, the asymptotic rank of each of the three tensors

\[ \langle 2, 1, 1 \rangle = e_1 \otimes e_1 \otimes 1 + e_2 \otimes e_2 \otimes 1 \in \mathbb{F}^2 \otimes \mathbb{F}^2 \otimes \mathbb{F}^1, \]

\[ \langle 1, 1, 2 \rangle = e_1 \otimes 1 \otimes e_1 + e_2 \otimes 1 \otimes e_2 \in \mathbb{F}^2 \otimes \mathbb{F}^1 \otimes \mathbb{F}^2, \]

\[ \langle 1, 2, 1 \rangle = 1 \otimes e_1 \otimes e_1 + 1 \otimes e_2 \otimes e_2 \in \mathbb{F}^1 \otimes \mathbb{F}^2 \otimes \mathbb{F}^2 \]

equals 2, whereas their tensor product equals the matrix multiplication tensor $\langle 2, 2, 2 \rangle$, whose tensor rank equals 7 and whose asymptotic rank is thus at most 7, i.e. strictly smaller than $2^3$. Therefore asymptotic rank is not multiplicative. On the other hand, the asymptotic subrank of each of the above three tensors equals 1, whereas the asymptotic subrank of $\langle 2, 2, 2 \rangle$ equals 4, see Chapter 5. Therefore asymptotic subrank is not multiplicative.
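The upper bound 7 on the tensor rank of $\langle 2, 2, 2 \rangle$ corresponds to Strassen's classical scheme for multiplying two $2 \times 2$ matrices with seven multiplications. The sketch below writes out that scheme (the function names are illustrative) and compares it with the naive product; it illustrates only the upper bound, not the matching lower bound.

```python
def strassen_2x2(A, B):
    """Multiply two 2x2 matrices using 7 multiplications (Strassen's scheme)."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    return [
        [m1 + m4 - m5 + m7, m3 + m5],
        [m2 + m4, m1 - m2 + m3 + m6],
    ]

def naive_2x2(A, B):
    """Standard product with 8 multiplications, for comparison."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(strassen_2x2(A, B))  # [[19, 22], [43, 50]]
```

Each of the seven products is a product of a linear form in the entries of $A$ with a linear form in the entries of $B$, which is what a decomposition of $\langle 2, 2, 2 \rangle$ into seven simple tensors encodes.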

Goal 4.5. Our goal is now to explicitly describe elements in $\mathbf{X}(\mathcal{T})$ (universal spectral points) or, more modestly, to describe elements in $\mathbf{X}(S)$ for interesting subsemirings $S \subseteq \mathcal{T}$.

Strassen constructed a finite family of elements in $\mathbf{X}(\mathcal{T})$, the gauge points, and an infinite family of elements in $\mathbf{X}(\{\text{oblique tensors}\})$, the support functionals. The support functionals are powerful enough to determine the asymptotic subrank of any "tight tensor". Tight tensors are discussed in Chapter 5. In Chapter 6 we construct an infinite family in $\mathbf{X}(\{k\text{-tensors over } \mathbb{C}\})$, the quantum functionals. In the rest of this chapter we discuss the gauge points and the support functionals. We will focus on the case $k = 3$ for clarity of exposition.

4.3 Gauge points $\zeta^{(i)}$

Strassen in [Str88] introduced a finite family of elements in $X(\mathcal{T})$ called the gauge points. We focus on 3-tensors, but the construction generalises immediately to $k$-tensors. Let $V_i = \mathbb{F}^{n_i}$. Let $t \in V_1 \otimes V_2 \otimes V_3$. Let $i \in [3]$. Let $\mathrm{flatten}_i(t)$ be the image of $t$ under the grouping $V_1 \otimes V_2 \otimes V_3 \to V_i \otimes (\bigotimes_{j \neq i} V_j)$. We think of $\mathrm{flatten}_i(t)$ as a matrix. Let $\zeta^{(i)} : \mathcal{T} \to \mathbb{N} : t \mapsto \mathrm{rank}(\mathrm{flatten}_i(t))$, with rank denoting matrix rank. We call $\zeta^{(1)}, \zeta^{(2)}, \zeta^{(3)}$ the gauge points. From the properties of matrix rank it follows directly that $\zeta^{(i)}$ is multiplicative under $\otimes$, additive under $\oplus$, monotone under restriction $\leq$ (and under degeneration $\trianglelefteq$), and normalised to 1 on $\langle 1 \rangle = e_1 \otimes e_1 \otimes e_1$.

Theorem 4.6. $\zeta^{(1)}, \zeta^{(2)}, \zeta^{(3)} \in X(\mathcal{T})$.
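To make the flattening ranks concrete, the following sketch computes the three gauge points of a small 3-tensor over the rationals. The dictionary encoding of tensors, the helper names `matrix_rank`, `gauge_point` and `matmul_tensor`, and the use of exact rational arithmetic are assumptions made for this illustration and do not appear in the original text.

```python
from fractions import Fraction
from itertools import product

def matrix_rank(rows):
    """Rank of a matrix (list of rows) over the rationals, by Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    ncols = len(m[0]) if m else 0
    rank = 0
    for col in range(ncols):
        # Find a row at or below position `rank` with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

def gauge_point(t, dims, i):
    """Rank of the flattening of t that separates leg i from the other two legs.

    t maps 0-based index triples to coefficients; dims = (n1, n2, n3).
    """
    others = [j for j in range(3) if j != i]
    cols = list(product(*(range(dims[j]) for j in others)))
    col_index = {c: n for n, c in enumerate(cols)}
    rows = [[0] * len(cols) for _ in range(dims[i])]
    for idx, value in t.items():
        rows[idx[i]][col_index[tuple(idx[j] for j in others)]] = value
    return matrix_rank(rows)

def matmul_tensor(a, b, c):
    """Matrix multiplication tensor <a,b,c>, with index pairs encoded as single indices."""
    return {(i * b + j, j * c + k, k * a + i): 1
            for i in range(a) for j in range(b) for k in range(c)}

M = matmul_tensor(2, 2, 2)
print([gauge_point(M, (4, 4, 4), i) for i in range(3)])
```

For the tensor $\langle 2, 2, 2 \rangle$ all three flattenings have full rank, which matches the value $\max_i \zeta^{(i)}(\langle 2, 2, 2 \rangle) = 4$ used in the discussion of the matrix multiplication exponent.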

Recall $\tilde{Q}(t) \leq \phi(t) \leq \tilde{R}(t)$ for $\phi \in X(\mathcal{T})$. In particular, $\max_i \zeta^{(i)}(t) \leq \tilde{R}(t)$. We do not know whether $\max_{i \in [3]} \zeta^{(i)}$ equals $\tilde{R}$. To be precise, we do not know any $t$ for which $\max_i \zeta^{(i)}(t) < \tilde{R}(t)$, and we do not know a proof that $\max_i \zeta^{(i)}(t) = \tilde{R}(t)$ for all $t$. There are various families of tensors $t$ for which $\max_i \zeta^{(i)}(t) = \tilde{R}(t)$ is proven. We will see such a family in Section 5.4.2. For the matrix multiplication tensor $\langle 2, 2, 2 \rangle$ we have $4 = \max_i \zeta^{(i)}(\langle 2, 2, 2 \rangle) \leq 2^{\omega}$, so $\max_i \zeta^{(i)}(t) = \tilde{R}(t)$ would imply that the matrix multiplication exponent $\omega$ equals 2.

On the other hand, $\tilde{Q}(t) \leq \min_i \zeta^{(i)}(t)$. There exist $t$ for which $\tilde{Q}(t)$ is strictly smaller than $\min_{i \in [3]} \zeta^{(i)}(t)$. To show this strict inequality we need another technique of Strassen: the support functionals. The support functionals are the topic of the next section.

4.4 Support functionals $\zeta^{\theta}$

Strassen in [Str91] constructed an infinite family of elements in the asymptotic spectrum of oblique $k$-tensors, called the support functionals. In this section we explain the construction of the support functionals. The support functionals provide the benchmark for our new quantum functionals (Chapter 6) and are relevant in the context of combinatorial problems like the cap set problem (Section 5.4.2). For clarity of exposition we focus on 3-tensors. The ideas extend directly to $k$-tensors.

Oblique tensors are tensors for which, in some basis, the support has the following special structure. Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$. Let $e_1, \ldots, e_{n_i}$ be the standard basis of $\mathbb{F}^{n_i}$. Write $t = \sum_{i,j,k} t_{ijk}\, e_i \otimes e_j \otimes e_k$. Let $[n_i] = \{1, 2, \ldots, n_i\}$. Let $\mathrm{supp}(t) = \{(i, j, k) : t_{ijk} \neq 0\} \subseteq [n_1] \times [n_2] \times [n_3]$ be the support of $t$ with respect to the standard basis. Let $[n_i]$ have the natural ordering $1 < 2 < \cdots < n_i$ and let $[n_1] \times [n_2] \times [n_3]$ have the product order, denoted by $\leq$. That is, $x \leq y$ if $x_i \leq y_i$ holds for all $i \in [3]$. We call $\mathrm{supp}(t)$ oblique if $\mathrm{supp}(t)$ is an antichain with respect to $\leq$, i.e. if any two elements in $\mathrm{supp}(t)$ are incomparable with respect to $\leq$. We call a tensor $t$ oblique if $\mathrm{supp}(g \cdot t)$ is oblique for some group element $g \in G(t) = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$. The family of oblique tensors is a semiring under $\oplus$ and $\otimes$.
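The antichain condition on a support is easy to test directly. The sketch below is an illustration only: the function names `leq` and `is_antichain` and the encoding of supports as sets of integer triples are assumptions, not notation from the text. It also shows why the definition of an oblique tensor allows a change of basis: the support of the diagonal tensor $\langle 2 \rangle$ in the standard basis is a chain, but becomes an antichain after permuting the basis of the third factor.

```python
from itertools import combinations

def leq(x, y):
    """Product order: x <= y if and only if x_i <= y_i in every coordinate."""
    return all(a <= b for a, b in zip(x, y))

def is_antichain(support):
    """True if no two distinct elements of the support are comparable."""
    return not any(leq(x, y) or leq(y, x)
                   for x, y in combinations(set(support), 2))

# Support of e1(x)e1(x)e2 + e1(x)e2(x)e1 + e2(x)e1(x)e1: an antichain.
print(is_antichain({(1, 1, 2), (1, 2, 1), (2, 1, 1)}))
# Support of <2> in the standard basis: (1,1,1) <= (2,2,2), so not an antichain.
print(is_antichain({(1, 1, 1), (2, 2, 2)}))
# After swapping the two basis vectors of the third factor: an antichain.
print(is_antichain({(1, 1, 2), (2, 2, 1)}))
```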

Not all tensors are oblique. Obliqueness is not a generic property (see Proposition 6.21). However, many tensors that are of interest in algebraic complexity theory are oblique, notably the matrix multiplication tensors

\[ \langle a, b, c \rangle = \sum_{i \in [a]} \sum_{j \in [b]} \sum_{k \in [c]} e_{ij} \otimes e_{jk} \otimes e_{ki} \in \mathbb{F}^{ab} \otimes \mathbb{F}^{bc} \otimes \mathbb{F}^{ca}. \]

For any finite set $X$, let $\mathcal{P}(X)$ be the set of all probability distributions on $X$. For any probability distribution $P \in \mathcal{P}(X)$, the Shannon entropy of $P$ is defined as $H(P) = -\sum_{x \in X} P(x) \log_2 P(x)$, with $0 \log_2 0$ understood as 0. Given finite sets $X_1, \ldots, X_k$ and a probability distribution $P \in \mathcal{P}(X_1 \times \cdots \times X_k)$ on the product set $X_1 \times \cdots \times X_k$, we denote the marginal distribution of $P$ on $X_i$ by $P_i$, that is, $P_i(a) = \sum_{x : x_i = a} P(x)$ for any $a \in X_i$.
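A minimal sketch of these two definitions, assuming distributions are stored as dictionaries from outcomes to probabilities (the names `entropy` and `marginal` are illustrative, not from the text):

```python
from math import log2
from collections import defaultdict

def entropy(dist):
    """Shannon entropy in bits; terms with probability 0 contribute 0."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

def marginal(P, i):
    """Marginal distribution P_i of a distribution P on tuples (0-based coordinate i)."""
    m = defaultdict(float)
    for x, p in P.items():
        m[x[i]] += p
    return dict(m)

# Uniform distribution on three triples.
P = {(1, 1, 2): 1 / 3, (1, 2, 1): 1 / 3, (2, 1, 1): 1 / 3}
print(marginal(P, 0))           # value 1 has weight 2/3, value 2 has weight 1/3
print(entropy(marginal(P, 0)))  # binary entropy h(1/3)
```

In this example every marginal is the distribution $(2/3, 1/3)$, so each marginal entropy equals $h(1/3) = \log_2 3 - 2/3$.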

Definition 4.7. Let $\theta \in \Theta := \mathcal{P}([3])$. For $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3} \setminus 0$ with $\mathrm{supp}(t)$ oblique, define

\[ \zeta^{\theta}(t) = \max\bigl\{ 2^{\sum_{i=1}^{3} \theta(i) H(P_i)} : P \in \mathcal{P}(\mathrm{supp}(t)) \bigr\}. \]

We call the $\zeta^{\theta}$ for $\theta \in \Theta$ the support functionals.

Theorem 4.8. $\zeta^{\theta} \in X(\{\text{oblique}\})$ for $\theta \in \Theta$.

We work towards the proof of Theorem 4.8. For $p \in [0, 1]$ let $h(p)$ be the binary entropy function $h(p) = -p \log_2 p - (1 - p) \log_2 (1 - p)$, i.e. $h(p)$ is the Shannon entropy of the probability vector $(p, 1 - p)$. The following properties of the Shannon entropy are well-known.

Lemma 4.9.

(i) $H(P \otimes Q) = H(P) + H(Q)$ for $P \in \mathcal{P}(X_1)$, $Q \in \mathcal{P}(X_2)$.

(ii) $H(P) \leq H(P_1) + H(P_2)$ for $P \in \mathcal{P}(X_1 \times X_2)$.

(iii) $H(pP \oplus (1 - p)Q) = pH(P) + (1 - p)H(Q) + h(p)$ for $P, Q \in \mathcal{P}(X)$, $p \in [0, 1]$.

(iv) $2^a + 2^b = \max_{0 \leq p \leq 1} 2^{pa + (1 - p)b + h(p)}$ for $a, b \in \mathbb{R}$.
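Identity (iv) can be checked numerically. A short calculation shows that the exponent $pa + (1 - p)b + h(p)$ equals $\log_2(2^a + 2^b)$ at $p^* = 2^a / (2^a + 2^b)$; the sketch below evaluates the exponent there and compares it with a grid of other values of $p$. The function names and the sample values of $a$ and $b$ are assumptions chosen for illustration.

```python
from math import log2

def h(p):
    """Binary entropy in bits, with h(0) = h(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def exponent(p, a, b):
    """The exponent p*a + (1-p)*b + h(p) appearing in Lemma 4.9(iv)."""
    return p * a + (1 - p) * b + h(p)

a, b = 1.5, 0.3
target = 2 ** a + 2 ** b
p_star = 2 ** a / target

# Value at the claimed maximiser and the best value on a uniform grid.
at_p_star = 2 ** exponent(p_star, a, b)
grid_best = max(2 ** exponent(k / 1000, a, b) for k in range(1001))
print(target, at_p_star, grid_best)
```

The grid maximum stays at or below $2^a + 2^b$, and the value at $p^*$ agrees with it up to floating-point error.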

For $X \subseteq [n_1] \times [n_2] \times [n_3]$ let $X_{\leq} = \{ y \in [n_1] \times [n_2] \times [n_3] : \exists x \in X,\ y \leq x \}$ be the downward closure of $X$. Let $\max(X) = \{ y \in X : \forall x \in X,\ y \leq x \Rightarrow y = x \}$ be the maximal points of $X$ with respect to $\leq$. Let $S_n$ be the symmetric group of permutations of $[n]$. Then the product group $S_{n_1} \times S_{n_2} \times S_{n_3}$ acts naturally on $[n_1] \times [n_2] \times [n_3]$.

Lemma 4.10. Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$. For every $g \in G(t)$ there is a triple of permutations $w \in W(t) = S_{n_1} \times S_{n_2} \times S_{n_3}$ with $w \cdot \max(\mathrm{supp}(g \cdot t)) \subseteq \mathrm{supp}(t)_{\leq}$.

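Proof. We prepare for the construction of $w$. Let $n \in \mathbb{N}$. Let $e_1, \ldots, e_n$ be the standard basis of $\mathbb{F}^n$. Let $g \in \mathrm{GL}_n$. Let $f_1, \ldots, f_n$ with $f_j = g \cdot e_j$ be the transformed basis of $\mathbb{F}^n$. Let $(E_i)_{i \in [n]}$ and $(F_j)_{j \in [n]}$ be the complete flags of $\mathbb{F}^n$ with

\[ E_i = \mathrm{Span}\{e_i, e_{i+1}, \ldots, e_n\}, \qquad F_j = \mathrm{Span}\{f_j, f_{j+1}, \ldots, f_n\}. \]

Define the map

\[ \pi : [n] \to [n] : j \mapsto \max\{ i \in [n] : E_i \cap (f_j + F_{j+1}) \neq \emptyset \}. \tag{4.3} \]

We prove $\pi$ is injective. Let $j, k \in [n]$ with $j \leq k$ and suppose $i = \pi(j) = \pi(k)$. Let $\mathbb{F}^{\times} = \mathbb{F} \setminus 0$. From (4.3) follows

\[ (\mathbb{F}^{\times} e_i + E_{i+1}) \cap (f_j + F_{j+1}) \neq \emptyset \tag{4.4} \]

\[ E_{i+1} \cap (f_j + F_{j+1}) = \emptyset \tag{4.5} \]

\[ (\mathbb{F}^{\times} e_i + E_{i+1}) \cap (f_k + F_{k+1}) \neq \emptyset. \tag{4.6} \]

Suppose $j < k$. Then from (4.4) and (4.6) we obtain a contradiction to (4.5). We conclude that $j = k$. Thus $\pi$ is injective.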

For each $\mathbb{F}^{n_i}$ define as above the standard complete flag $(E^i_j)_{j \in [n_i]}$ of $\mathbb{F}^{n_i}$, the complete flag $(F^i_j)_{j \in [n_i]}$ corresponding to the basis given by $g_i$, and the permutation $\pi_i : [n_i] \to [n_i]$. Let $w = (\pi_1, \pi_2, \pi_3) \in W(t)$.

We will prove $w \cdot \max(\mathrm{supp}(g \cdot t)) \subseteq \mathrm{supp}(t)_{\leq}$. Let $y \in \max(\mathrm{supp}(g \cdot t))$. Let $x = w \cdot y$. By construction of $\pi_i$, the intersection $E^i_{x_i} \cap (f^i_{y_i} + F^i_{y_i + 1})$ is not empty. Choose

\[ \tilde{f}^i_{y_i} \in E^i_{x_i} \cap (f^i_{y_i} + F^i_{y_i + 1}). \]

Let $t^*$ be the multilinear map $\mathbb{F}^{n_1} \times \mathbb{F}^{n_2} \times \mathbb{F}^{n_3} \to \mathbb{F}$ with $t^*(e_i, e_j, e_k) = t_{ijk}$ for all $i \in [n_1]$, $j \in [n_2]$, $k \in [n_3]$. Then

\[ t^*(\tilde{f}^1_{y_1}, \tilde{f}^2_{y_2}, \tilde{f}^3_{y_3}) = t^*(f^1_{y_1}, f^2_{y_2}, f^3_{y_3}) + \sum_{\substack{z \in [n_1] \times [n_2] \times [n_3] \\ z > y}} c_z\, t^*(f^1_{z_1}, f^2_{z_2}, f^3_{z_3}) \tag{4.7} \]

for some $c_z \in \mathbb{F}$. Since $y$ is maximal in $\mathrm{supp}(g \cdot t)$, the sum over $z > y$ in (4.7) equals zero. We conclude $t^*(\tilde{f}^1_{y_1}, \tilde{f}^2_{y_2}, \tilde{f}^3_{y_3}) = t^*(f^1_{y_1}, f^2_{y_2}, f^3_{y_3}) \neq 0$. Thus $t^*(E^1_{x_1} \times E^2_{x_2} \times E^3_{x_3})$ is not zero and thus $x \in \mathrm{supp}(t)_{\leq}$.

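Proof of Theorem 4.8. We prove that $\zeta^{\theta}$ on oblique tensors is $\otimes$-multiplicative, $\oplus$-additive, $\leq$-monotone and normalised to 1 on $\langle 1 \rangle = e_1 \otimes e_1 \otimes e_1$. The normalisation $\zeta^{\theta}(\langle 1 \rangle) = 1$ is clear.

We prove $\zeta^{\theta}$ is $\otimes$-supermultiplicative. Let $s \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$ and let $t \in \mathbb{F}^{m_1} \otimes \mathbb{F}^{m_2} \otimes \mathbb{F}^{m_3}$. Let $P \in \mathcal{P}(\mathrm{supp}(t))$ and $Q \in \mathcal{P}(\mathrm{supp}(s))$. Then the product $P \otimes Q \in \mathcal{P}(\mathrm{supp}(s \otimes t))$ has marginals $P_i \otimes Q_i$. Since $H(P_i \otimes Q_i) = H(P_i) + H(Q_i)$ (Lemma 4.9(i)), we conclude $\zeta^{\theta}(s) \zeta^{\theta}(t) \leq \zeta^{\theta}(s \otimes t)$.

We prove $\zeta^{\theta}$ is $\otimes$-submultiplicative. For a probability distribution $P$ and $\theta \in \Theta$ we use the notation $H_{\theta}(P) = \sum_{i=1}^{3} \theta(i) H(P_i)$. We naturally identify $\mathrm{supp}(s \otimes t)$ with a subset of $[n_1] \times [n_2] \times [n_3] \times [m_1] \times [m_2] \times [m_3]$. Let $P \in \mathcal{P}(\mathrm{supp}(s \otimes t))$. Let $P_{[3]}$ be the marginal distribution of $P$ on $[n_1] \times [n_2] \times [n_3]$ and let $P_{3 + [3]}$ be the marginal distribution of $P$ on $[m_1] \times [m_2] \times [m_3]$. Then $H_{\theta}(P) \leq H_{\theta}(P_{[3]}) + H_{\theta}(P_{3 + [3]})$ by Lemma 4.9(ii). We conclude $\zeta^{\theta}(s \otimes t) \leq \zeta^{\theta}(s) \zeta^{\theta}(t)$.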

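We prove $\zeta^{\theta}$ is $\oplus$-additive. By definition,

\[ \zeta^{\theta}(s \oplus t) = \max\{ 2^{H_{\theta}(P)} : P \in \mathcal{P}(\mathrm{supp}(s \oplus t)) \} = \max\Bigl\{ \max_{0 \leq p \leq 1} 2^{H_{\theta}(pP \oplus (1 - p)Q)} : P \in \mathcal{P}(\mathrm{supp}(s)),\ Q \in \mathcal{P}(\mathrm{supp}(t)) \Bigr\}. \]

From Lemma 4.9(iii) and (iv) follows

\[ \max\Bigl\{ \max_{0 \leq p \leq 1} 2^{H_{\theta}(pP \oplus (1 - p)Q)} : P \in \mathcal{P}(\mathrm{supp}(s)),\ Q \in \mathcal{P}(\mathrm{supp}(t)) \Bigr\} \]

\[ = \max\Bigl\{ \max_{0 \leq p \leq 1} 2^{pH_{\theta}(P) + (1 - p)H_{\theta}(Q) + h(p)} : P \in \mathcal{P}(\mathrm{supp}(s)),\ Q \in \mathcal{P}(\mathrm{supp}(t)) \Bigr\} \]

\[ = \max\bigl\{ 2^{H_{\theta}(P)} + 2^{H_{\theta}(Q)} : P \in \mathcal{P}(\mathrm{supp}(s)),\ Q \in \mathcal{P}(\mathrm{supp}(t)) \bigr\} \]

\[ = \zeta^{\theta}(s) + \zeta^{\theta}(t). \]

We conclude $\zeta^{\theta}(s \oplus t) = \zeta^{\theta}(s) + \zeta^{\theta}(t)$.

We prove $\zeta^{\theta}$ is $\leq$-monotone. Let $s \leq t$ with $\mathrm{supp}(s)$ and $\mathrm{supp}(t)$ oblique. Then there are linear maps $A_i$ with $s = (A_1 \otimes A_2 \otimes A_3) \cdot t$. If $A_1, A_2, A_3$ are of the form $\mathrm{diag}(1, \ldots, 1, 0, \ldots, 0)$, then $\zeta^{\theta}(s) \leq \zeta^{\theta}(t)$. Suppose $g = (A_1, A_2, A_3) \in G(t)$. Let $P \in \mathcal{P}(\mathrm{supp}(t))$ maximise $H_{\theta}$ on $\mathcal{P}(\mathrm{supp}(t))$. Let $\sigma \in W$ be such that $\sigma \cdot P$ has non-increasing marginals. Then $H_{\theta}(\sigma \cdot P) = H_{\theta}(P)$ and $\sigma \cdot P$ maximises $H_{\theta}$ on $\mathcal{P}(\mathrm{supp}(\sigma \cdot t))$. Then $\sigma \cdot P$ maximises $H_{\theta}$ on $\mathcal{P}(\mathrm{supp}(\sigma \cdot t)_{\leq})$ by Lemma 4.12 below. Let $Q \in \mathcal{P}(\mathrm{supp}(g \cdot t))$ maximise $H_{\theta}$ on $\mathcal{P}(\mathrm{supp}(g \cdot t))$. By Lemma 4.10 there is a $w \in W$ with $w \cdot \mathrm{supp}(g \cdot t) \subseteq \mathrm{supp}(\sigma \cdot t)_{\leq}$. Then $H_{\theta}(w \cdot Q) = H_{\theta}(Q) \leq H_{\theta}(\sigma \cdot P) = H_{\theta}(P)$. Thus $\max_{P \in \mathcal{P}(\mathrm{supp}(g \cdot t))} H_{\theta}(P) \leq \max_{P \in \mathcal{P}(\mathrm{supp}(t))} H_{\theta}(P)$. We conclude $\zeta^{\theta}(g \cdot t) \leq \zeta^{\theta}(t)$.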

The following two lemmas finish the above proof of Theorem 4.8. Recall that in the proof we defined $H_{\theta}(P) = \sum_{i=1}^{3} \theta(i) H(P_i)$ for $\theta \in \Theta$.

Lemma 4.11 ([Str91, Prop. 2.1]). Let $\Phi \subseteq [n_1] \times [n_2] \times [n_3]$. Let $P \in \mathcal{P}(\Phi)$. Let $\mathrm{supp}(P)$ be the support $\{x \in \Phi : P(x) \neq 0\}$. For $x \in \Phi$ define $h_P(x) = -\sum_{i=1}^{3} \theta(i) \log_2 P_i(x_i)$. Then $P$ maximises $H_{\theta}$ on $\mathcal{P}(\Phi)$ if and only if

\[ \forall x \in \mathrm{supp}(P) : \quad h_P(x) = \max_{y \in \Phi} h_P(y). \tag{4.8} \]

Proof. We write $H_{\theta}(P)$ in terms of $h_P$:

\[ H_{\theta}(P) = \sum_{i=1}^{3} \theta(i) H(P_i) = \sum_{x \in \mathrm{supp}(P)} P(x)\, h_P(x). \tag{4.9} \]

For $Q \in \mathcal{P}(\Phi)$,

\[ \lim_{\varepsilon \to 0^+} \frac{d}{d\varepsilon} H_{\theta}\bigl((1 - \varepsilon)P + \varepsilon Q\bigr) = \lim_{\varepsilon \to 0^+} \frac{d}{d\varepsilon} \sum_x \bigl((1 - \varepsilon)P(x) + \varepsilon Q(x)\bigr)\, h_{(1 - \varepsilon)P + \varepsilon Q}(x) \]

\[ = \sum_x P(x) \Bigl( \sum_{i=1}^{3} \theta(i) \frac{P_i(x_i) - Q_i(x_i)}{P_i(x_i) \ln(2)} \Bigr) + \sum_x \bigl(-P(x) + Q(x)\bigr)\, h_P(x) \]

\[ = \sum_x Q(x)\, h_P(x) - \sum_x P(x)\, h_P(x). \]

Therefore, since $H_{\theta}$ is continuous and concave, $P$ maximises $H_{\theta}$ if and only if

\[ \forall Q \in \mathcal{P}(\Phi) : \quad \sum_x Q(x)\, h_P(x) - \sum_x P(x)\, h_P(x) \leq 0. \tag{4.10} \]

We will prove (4.10) is equivalent to (4.8). Suppose $\sum_x Q(x) h_P(x) \leq \sum_x P(x) h_P(x)$ for every $Q \in \mathcal{P}(\Phi)$. In particular, $h_P(y) \leq \sum_x P(x) h_P(x)$ for every $y \in \Phi$, so $\max_{y \in \Phi} h_P(y) \leq \sum_x P(x) h_P(x)$. Then $\max_{y \in \Phi} h_P(y) = \sum_x P(x) h_P(x)$. We conclude $\max_{y \in \Phi} h_P(y) = h_P(x)$ for every $x \in \mathrm{supp}(P)$.

Suppose $\max_{y \in \Phi} h_P(y) = h_P(x)$ for every $x \in \mathrm{supp}(P)$. Then $h_P(y) \leq h_P(x)$ for every $Q \in \mathcal{P}(\Phi)$, $y \in \mathrm{supp}(Q)$, $x \in \mathrm{supp}(P)$. We conclude $\sum_x Q(x) h_P(x) \leq \sum_x P(x) h_P(x)$.
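Criterion (4.8) gives a finite certificate that a given distribution $P$ maximises $H_{\theta}$, which is convenient for evaluating a support functional on a small example. The sketch below checks the criterion for candidate distributions; the function names, the tolerance parameter and the convention of returning $+\infty$ when a marginal vanishes are assumptions made for this illustration.

```python
from math import log2, inf
from collections import defaultdict

def marginals(P, k=3):
    """List of the k marginal distributions of P (dict from tuples to probabilities)."""
    margs = [defaultdict(float) for _ in range(k)]
    for x, p in P.items():
        for i in range(k):
            margs[i][x[i]] += p
    return margs

def H_theta(P, theta):
    """Weighted marginal entropy: sum_i theta(i) * H(P_i)."""
    return sum(w * -sum(q * log2(q) for q in m.values() if q > 0)
               for w, m in zip(theta, marginals(P, len(theta))))

def h_P(P, theta, y):
    """h_P(y) = -sum_i theta(i) log2 P_i(y_i); +infinity if a needed marginal is 0."""
    margs = marginals(P, len(theta))
    total = 0.0
    for i, weight in enumerate(theta):
        if weight == 0:
            continue
        q = margs[i].get(y[i], 0.0)
        if q == 0:
            return inf
        total -= weight * log2(q)
    return total

def satisfies_criterion(P, Phi, theta, tol=1e-9):
    """Check (4.8): h_P is constant on supp(P) and equal to its maximum over Phi."""
    best = max(h_P(P, theta, y) for y in Phi)
    return all(abs(h_P(P, theta, x) - best) <= tol
               for x, p in P.items() if p > 0)

Phi = [(1, 1, 2), (1, 2, 1), (2, 1, 1)]
theta = (1 / 3, 1 / 3, 1 / 3)
uniform = {x: 1 / 3 for x in Phi}
print(satisfies_criterion(uniform, Phi, theta), 2 ** H_theta(uniform, theta))
```

For this support and the uniform $\theta$, the uniform distribution passes the check, so by Lemma 4.11 it is a maximiser and the corresponding value is $2^{h(1/3)}$ with $h(1/3) = \log_2 3 - 2/3$.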

Lemma 4.12 ([Str91, Cor. 2.2]). Let $\Phi \subseteq [n_1] \times [n_2] \times [n_3]$. Let $P$ maximise $H_{\theta}$ on $\mathcal{P}(\Phi)$. Suppose $P_i$ is nonincreasing on $[n_i]$ for each $i \in [3]$. Then $P$ maximises $H_{\theta}$ on $\mathcal{P}(\Phi_{\leq})$, where $\Phi_{\leq}$ is the downward closure of $\Phi$ with respect to $\leq$.

Proof. We know $P$ satisfies (4.8). We will prove $P$ satisfies (4.8) with $\Phi$ replaced by $\Phi_{\leq}$. Then we are done by Lemma 4.11. Let $x \in \Phi_{\leq}$. Then $x \leq y$ for some $y \in \Phi$. Then $(P_1(x_1), P_2(x_2), P_3(x_3)) \geq (P_1(y_1), P_2(y_2), P_3(y_3))$, since each $P_i$ is nonincreasing. Then $h_P(x) \leq h_P(y)$. We conclude $\max_{\Phi_{\leq}} h_P \leq \max_{\Phi} h_P$. On the other hand, $\Phi \subseteq \Phi_{\leq}$. Therefore $\max_{\Phi} h_P \leq \max_{\Phi_{\leq}} h_P$.

Using the support functionals, Strassen managed to fully compute the asymptotic spectrum of several semirings generated by oblique tensors. We will see an example in Section 5.4.2.

4.5 Upper and lower support functionals $\zeta^{\theta}$, $\zeta_{\theta}$

In Section 4.4 we defined the support functionals $\zeta^{\theta} : \{\text{oblique}\} \to \mathbb{R}_{\geq 0}$ and proved that $\zeta^{\theta} \in X(\{\text{oblique}\})$. From the general theory of asymptotic spectra (Chapter 2) we know that $\zeta^{\theta}$ is the restriction of some map $\phi : \{\text{tensors}\} \to \mathbb{R}_{\geq 0}$ in $X(\mathcal{T})$. In other words, we know that $\zeta^{\theta}$ can be extended to an element of $X(\mathcal{T})$. However, the proof of that fact was non-constructive. In this short section we discuss a candidate extension proposed by Strassen, called the upper support functional. We also discuss a companion called the lower support functional.

For arbitrary $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$ the upper support functional and the lower support functional are defined as

\[ \zeta^{\theta}(t) = \min_{g \in G(t)} \max\{ 2^{H_{\theta}(P)} : P \in \mathcal{P}(\mathrm{supp}(g \cdot t)) \} \]

\[ \zeta_{\theta}(t) = \max_{g \in G(t)} \max\{ 2^{H_{\theta}(P)} : P \in \mathcal{P}(\max(\mathrm{supp}(g \cdot t))) \} \]

with $G(t) = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$ and $H_{\theta}(P) = \sum_{i=1}^{3} \theta(i) H(P_i)$. We summarise the known properties of the upper and lower support functional.

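Theorem 4.13 ([Str91]). Let $s \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$ and $t \in \mathbb{F}^{m_1} \otimes \mathbb{F}^{m_2} \otimes \mathbb{F}^{m_3}$. Let $\theta \in \Theta = \mathcal{P}([3])$.

(i) $\zeta^{\theta}(\langle n \rangle) = n$ for $n \in \mathbb{N}$.

(ii) $\zeta^{\theta}(s \oplus t) = \zeta^{\theta}(s) + \zeta^{\theta}(t)$.

(iii) $\zeta^{\theta}(s \otimes t) \leq \zeta^{\theta}(s) \zeta^{\theta}(t)$.

(iv) If $s \geq t$, then $\zeta^{\theta}(s) \geq \zeta^{\theta}(t)$.

Theorem 4.14 ([Str91]). Let $s \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$ and $t \in \mathbb{F}^{m_1} \otimes \mathbb{F}^{m_2} \otimes \mathbb{F}^{m_3}$. Let $\theta \in \Theta$.

(i) $\zeta_{\theta}(\langle n \rangle) = n$ for $n \in \mathbb{N}$.

(ii) $\zeta_{\theta}(s \oplus t) \geq \zeta_{\theta}(s) + \zeta_{\theta}(t)$.

(iii) $\zeta_{\theta}(s \otimes t) \geq \zeta_{\theta}(s) \zeta_{\theta}(t)$.

(iv) If $s \geq t$, then $\zeta_{\theta}(s) \geq \zeta_{\theta}(t)$.

Theorem 4.15 ([Str91]). $\zeta^{\theta}(s \otimes t) \geq \zeta^{\theta}(s) \zeta_{\theta}(t)$ and $\zeta^{\theta}(t) \geq \zeta_{\theta}(t)$ for $\theta \in \Theta$.

Regarding statement (ii) in Theorem 4.14, Bürgisser [Bur90] shows that the lower support functional $\zeta_{\theta}$ is not in general additive under the direct sum when $\theta_i > 0$ for all $i$. See also [Str91, Comment (iii)]. In particular, this implies that the upper support functional $\zeta^{\theta}(t)$ and the lower support functional $\zeta_{\theta}(t)$ are not equal in general, the upper support functional being additive. In fact, to show that the lower support functional is not additive, Bürgisser first shows that when $\mathbb{F}$ is algebraically closed, the generic value of $\zeta_{\theta}$ on $\mathbb{F}^n \otimes \mathbb{F}^n \otimes \mathbb{F}^n$ equals $(1 - \min_i \theta_i) \log_2 n + o(n)$. On the other hand, Tobler [Tob91] shows that the generic value of $\zeta^{\theta}$ on $\mathbb{F}^n \otimes \mathbb{F}^n \otimes \mathbb{F}^n$ equals $\log_2 n$. So even generically, $\zeta^{\theta}$ and $\zeta_{\theta}$ are different on $\mathbb{F}^n \otimes \mathbb{F}^n \otimes \mathbb{F}^n$.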

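For $\theta \in \Theta$ we say $t$ is $\theta$-robust if $\zeta^{\theta}(t) = \zeta_{\theta}(t)$. We say $t$ is robust if $t$ is $\theta$-robust for all $\theta \in \Theta$. Let us try to understand what robust tensors look like. Since $\zeta^{\theta}(t) \geq \zeta_{\theta}(t)$ always holds (Theorem 4.15), a tensor $t$ is $\theta$-robust if and only if

\[ \zeta^{\theta}(t) \leq \zeta_{\theta}(t). \tag{4.11} \]

The set of $\theta$-robust tensors is closed under $\oplus$ and $\otimes$, since

\[ \zeta^{\theta}(s \oplus t) = \zeta^{\theta}(s) + \zeta^{\theta}(t) = \zeta_{\theta}(s) + \zeta_{\theta}(t) \leq \zeta_{\theta}(s \oplus t) \]

and

\[ \zeta^{\theta}(s \otimes t) \leq \zeta^{\theta}(s) \zeta^{\theta}(t) = \zeta_{\theta}(s) \zeta_{\theta}(t) \leq \zeta_{\theta}(s \otimes t). \]

For $X \subseteq [n_1] \times [n_2] \times [n_3]$ we use the notation $H_{\theta}(X) = \max_{P \in \mathcal{P}(X)} H_{\theta}(P)$. Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3} \setminus 0$. Equation (4.11) means that there are $g, h \in G(t)$ and $P \in \mathcal{P}(\max \mathrm{supp}(h \cdot t))$ such that $H_{\theta}(\mathrm{supp}(g \cdot t)) \leq H_{\theta}(P)$. In this case we have $\zeta^{\theta}(t) = \zeta_{\theta}(t) = 2^{H_{\theta}(P)}$. In particular, $t$ is $\theta$-robust if there is a $g \in G(t)$ such that the maximisation $H_{\theta}(\mathrm{supp}(g \cdot t))$ is attained by a $P \in \mathcal{P}(\max(\mathrm{supp}(g \cdot t)))$. This criterion is automatically satisfied for all $\theta$ when $\mathrm{supp}(g \cdot t) = \max(\mathrm{supp}(g \cdot t))$ for some $g \in G(t)$. Suppose $t$ is oblique. Then $\mathrm{supp}(g \cdot t)$ is an antichain for some $g \in G(t)$ and thus $\mathrm{supp}(g \cdot t) = \max \mathrm{supp}(g \cdot t)$. Then $t$ is robust and $\zeta^{\theta}(t) = \zeta_{\theta}(t) = 2^{H_{\theta}(\mathrm{supp}(g \cdot t))}$.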

4.6 Asymptotic slice rank

Slice rank is a variation on tensor rank that was introduced by Terence Tao in [Tao16] to study cap sets. We will look at cap sets in Section 5.4. Here we study the relationship between asymptotic slice rank and the support functionals.

Consider the following characterisation of tensor rank. Let a simple tensor be any tensor of the form $v_1 \otimes v_2 \otimes v_3 \in V_1 \otimes V_2 \otimes V_3$ with $v_i \in V_i$ for $i \in [3]$. Then the rank $R(t)$ of $t \in V_1 \otimes V_2 \otimes V_3$ is the smallest number $r$ such that $t$ can be written as a sum of $r$ simple tensors.

Slice rank is defined similarly, but with simple tensors replaced by slices. For $S \subseteq [3]$ let $V_S = \bigotimes_{i \in S} V_i$. For $j \in [3]$ let $\bar{j} = [3] \setminus \{j\}$. A tensor in $V_1 \otimes V_2 \otimes V_3$ is called a slice if it is of the form $v \otimes w$ with $v \in V_j$ and $w \in V_{\bar{j}}$ for some $j \in [3]$ (under the natural reordering of the tensor legs). Let $t \in V_1 \otimes V_2 \otimes V_3$. The slice rank of $t$, denoted by $\mathrm{SR}(t)$, is the smallest number $r$ such that $t$ can be written as a sum of $r$ slices. For example, the tensor

\[ W = e_1 \otimes e_1 \otimes e_2 + e_1 \otimes e_2 \otimes e_1 + e_2 \otimes e_1 \otimes e_1 \in \mathbb{F}^2 \otimes \mathbb{F}^2 \otimes \mathbb{F}^2 \tag{4.12} \]

has slice rank 2, since we can write $W = e_1 \otimes (e_1 \otimes e_2 + e_2 \otimes e_1) + e_2 \otimes e_1 \otimes e_1$. In fact, the slice rank of any element in $V_1 \otimes V_2 \otimes V_3$ is at most $\min_i \dim V_i$. The tensor rank of $W$, on the other hand, is known to be 3.

Slice rank is clearly monotone under restriction. The slice rank of the diagonal tensor $\langle r \rangle$ equals $r$ [Tao16]. It follows that subrank is at most slice rank:

\[ Q(t) \leq \mathrm{SR}(t). \]

The motivation for the introduction of slice rank in [Tao16] was finding upper bounds on subrank $Q(t)$ and asymptotic subrank $\tilde{Q}(t)$.

The main result of this section is the following theorem. Recall that a tensor $t$ is oblique if the support $\mathrm{supp}(g \cdot t)$ is an antichain for some $g \in G(t)$.

Theorem 4.16. Let $t$ be oblique. Then

\[ \lim_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n} = \min_{\theta \in \mathcal{P}([3])} \zeta^{\theta}(t). \]

Our proof of Theorem 4.16 is based on a proof of Tao and Sawin in [TS16] and discussions of the author with Dion Gijswijt. The explicit connection between asymptotic slice rank and the support functionals is new.

We use Theorem 4.16, before giving its proof, to see that SR is not submultiplicative and not supermultiplicative under the tensor product $\otimes$. In particular, we cannot use Fekete's lemma (Lemma 2.2) to prove that the limit $\lim_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n}$ exists. Thus the existence of the limit is a non-trivial consequence of Theorem 4.16.

Let $W$ be as in (4.12). Then $\mathrm{SR}(W) = 2$. We have $\zeta^{(1/3, 1/3, 1/3)}(W) = 2^{h(1/3)} < 2$. From Theorem 4.16 follows $\mathrm{SR}(W^{\otimes n}) \leq 2^{n(h(1/3) + o(1))}$. We conclude $\mathrm{SR}(W^{\otimes n}) < 2^n$ for $n$ large enough. We conclude SR is not supermultiplicative. Now it is also clear that slice rank is not the same as (border) subrank, since (border) subrank is supermultiplicative.

Next, the tensors $\sum_{i=1}^{n} e_i \otimes e_i \otimes 1$, $\sum_{i=1}^{n} e_i \otimes 1 \otimes e_i$, $\sum_{i=1}^{n} 1 \otimes e_i \otimes e_i$ have slice rank one, while their tensor product equals the matrix multiplication tensor $\langle n, n, n \rangle$, which has slice rank $n^2$ by Theorem 4.16 and Theorem 5.3 in the next chapter applied to the tight tensor $\langle n, n, n \rangle$. We conclude SR is not submultiplicative.

Slice rank and hitting set number

We study the hitting set number of the support of a tensor. Let $\Phi \subseteq [n_1] \times [n_2] \times [n_3]$. A hitting set for $\Phi$ is a 3-tuple of sets $A_1 \subseteq [n_1]$, $A_2 \subseteq [n_2]$, $A_3 \subseteq [n_3]$ such that for every $a \in \Phi$ there is an $i \in [3]$ with $a_i \in A_i$. We may think of $\Phi$ as a 3-partite 3-uniform hypergraph. Then the definition of hitting set says that every edge $a \in \Phi$ is hit by an element of some $A_i$. A hitting set is also called a vertex cover (every edge being covered by some vertex) or a transversal. The size of the hitting set $(A_1, A_2, A_3)$ is $|A_1| + |A_2| + |A_3|$. The hitting set number $\tau(\Phi)$ is the size of the smallest hitting set for $\Phi$. Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$.

Lemma 4.17. Let $g \in G(t) = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$. Then $\mathrm{SR}(t) \leq \tau(\mathrm{supp}(g \cdot t))$.

Proof. This is clear.
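For small supports the hitting set number can be found by exhaustive search. The following sketch is a brute-force illustration, exponential in the number of vertices; the function name `hitting_set_number` and the encoding of a vertex as a pair (leg, value) are assumptions made here, not notation from the text.

```python
from itertools import combinations

def hitting_set_number(Phi):
    """Size of a smallest hitting set of a set of triples, by brute force."""
    Phi = set(Phi)
    # Only vertices (leg i, value a) that occur in some edge can be useful.
    vertices = sorted({(i, x[i]) for x in Phi for i in range(3)})
    for size in range(len(vertices) + 1):
        for chosen in combinations(vertices, size):
            chosen = set(chosen)
            if all(any((i, x[i]) in chosen for i in range(3)) for x in Phi):
                return size

# Support of W = e1(x)e1(x)e2 + e1(x)e2(x)e1 + e2(x)e1(x)e1.
print(hitting_set_number({(1, 1, 2), (1, 2, 1), (2, 1, 1)}))
# Support of the diagonal tensor <3>.
print(hitting_set_number({(1, 1, 1), (2, 2, 2), (3, 3, 3)}))
```

For the support of $W$ no single vertex lies in all three edges, while two vertices suffice, so the search returns 2, in agreement with $\mathrm{SR}(W) = 2$.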

Lemma 4.18. Let $g \in G(t)$. Then $\mathrm{SR}(t) \geq \tau(\max(\mathrm{supp}(g \cdot t)))$.

Proof. It is sufficient to consider $g = e$. Let

\[ t = \sum_{i=1}^{r_1} v^1_i \otimes u^1_i + \sum_{i=1}^{r_2} v^2_i \otimes u^2_i + \sum_{i=1}^{r_3} v^3_i \otimes u^3_i \]

be a slice decomposition. We may assume $v^j_1, \ldots, v^j_{r_j}$ are linearly independent. Let $V_j = \mathrm{Span}\{v^j_1, \ldots, v^j_{r_j}\} \subseteq \mathbb{F}^{n_j}$. Let $W_j \subseteq (\mathbb{F}^{n_j})^*$ be the elements in the dual space that vanish on $V_j$. Let $B_j \subseteq W_j$ be a basis with the following property: with respect to the standard basis, the matrix with the elements of $B_j$ as columns is in reduced row echelon form, i.e. each column is of the form $(* \cdots * \; 1 \; 0 \cdots 0)^T$ and the pivot elements (the 1's) are all in different rows. Let $\bar{S}_j \subseteq [n_j]$ be the indices of the pivot elements. Let $S_j = [n_j] \setminus \bar{S}_j$ be the complement. Then $|S_j| = r_j$. We claim $(S_1, S_2, S_3)$ is a hitting set for $\max(\mathrm{supp}(t))$. Then $r_1 + r_2 + r_3 = |S_1| + |S_2| + |S_3| \geq \tau(\max(\mathrm{supp}(t)))$. Let $x \in \max(\mathrm{supp}(t))$. Suppose $x \in \bar{S}_1 \times \bar{S}_2 \times \bar{S}_3$. For every $j \in [3]$ let $\phi_j \in B_j$ have its pivot element at index $x_j$. Let $\phi = \phi_1 \otimes \phi_2 \otimes \phi_3$. Then $\phi \in W_1 \otimes W_2 \otimes W_3$, so $\phi(t) = 0$. Since $x$ is maximal and each $B_j$ is in reduced row echelon form,

\[ \phi(t) = \sum_{y \leq x} t_y\, \phi(e_{y_1} \otimes e_{y_2} \otimes e_{y_3}) \]

\[ = \sum_{y < x} t_y\, \phi(e_{y_1} \otimes e_{y_2} \otimes e_{y_3}) + t_x\, e_{x_1} \otimes e_{x_2} \otimes e_{x_3} \]

\[ = \sum_{y < x} s_y\, e_{y_1} \otimes e_{y_2} \otimes e_{y_3} + t_x\, e_{x_1} \otimes e_{x_2} \otimes e_{x_3} \]

for some $s_y \in \mathbb{F}$. From $\phi(t) = 0$ follows $t_x = 0$. This contradicts $x \in \mathrm{supp}(t)$, so $x \notin \bar{S}_1 \times \bar{S}_2 \times \bar{S}_3$, i.e. there is a $j \in [3]$ with $x_j \in S_j$.

Asymptotic hitting set number

We now study the asymptotic hitting set number $\tilde{\tau}(\Phi) = \lim_{n \to \infty} \tau(\Phi^{\times n})^{1/n}$.

We will use some basic facts of types and type classes. Let $X$ be a finite set. Let $N \in \mathbb{N}$. An $N$-type on $X$ is a probability distribution $P$ on $X$ with $N \cdot P(x) \in \mathbb{N}$ for all $x \in X$. Let $P$ be an $N$-type on $X$. The type class $T^N_P \subseteq X^N$ is the set of sequences $s = (s_1, \ldots, s_N)$ with $x$ occurring $N \cdot P(x)$ times in $s$ for every $x \in X$, i.e. $|\{i \in [N] : s_i = x\}| = N \cdot P(x)$.

Lemma 4.19. The number of $N$-types on $X$ equals $\binom{N + |X| - 1}{|X| - 1}$. Let $P$ be an $N$-type. The size of the type class $T^N_P$ equals the multinomial coefficient $\binom{N}{NP}$.

Proof. We leave the proof to the reader.

Lemma 4.20. Let $P$ be an $N$-type on $X$. Then

\[ \frac{1}{(N + 1)^{|X|}}\, 2^{N H(P)} \leq \binom{N}{NP} \leq 2^{N H(P)}. \]

Proof. See e.g. [CT12, Theorem 11.1.3].
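Both counting statements are easy to verify numerically for small parameters. In the sketch below a type is represented by its vector of counts $N \cdot P(x)$; the function names and the small relative tolerance used in the floating-point comparisons are assumptions made for this illustration.

```python
from math import comb, factorial, log2

def types(N, k):
    """All N-types on a k-element set, as tuples of counts summing to N."""
    if k == 1:
        yield (N,)
        return
    for first in range(N + 1):
        for rest in types(N - first, k - 1):
            yield (first,) + rest

def multinomial(counts):
    """Size of the type class: N! / (c_1! ... c_k!)."""
    result = factorial(sum(counts))
    for c in counts:
        result //= factorial(c)
    return result

def entropy_of_type(counts):
    """Shannon entropy (bits) of the distribution counts / N."""
    N = sum(counts)
    return -sum((c / N) * log2(c / N) for c in counts if c > 0)

def check(N, k):
    """Verify Lemma 4.19 (number of types) and Lemma 4.20 (size bounds)."""
    all_types = list(types(N, k))
    assert len(all_types) == comb(N + k - 1, k - 1)
    for counts in all_types:
        size = multinomial(counts)
        upper = 2 ** (N * entropy_of_type(counts))
        lower = upper / (N + 1) ** k
        assert lower <= size * (1 + 1e-9) and size <= upper * (1 + 1e-9)
    return len(all_types)

print(check(12, 3))
```

The type classes partition $X^N$, so their sizes add up to $|X|^N$; this gives an additional sanity check on the enumeration.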

Lemma 4.21. $\log_2 \tilde{\tau}(\Phi) \leq \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} H(P_i)$.

Proof. Let $P$ maximise $\max_{P \in \mathcal{P}(\Phi)} \min_i H(P_i)$. Let $n \in \mathbb{N}$. We construct a hitting set $(A_1, A_2, A_3)$ for $\Phi^{\times n}$ as follows. Let $x \in \Phi^{\times n}$. Viewing $x$ as an $n$-tuple of elements in $\Phi$, let $Q \in \mathcal{P}_n(\Phi)$ be the type of $x$ (i.e. the empirical distribution). Let $j \in [3]$ with $H(Q_j) = \min_{i \in [3]} H(Q_i)$. By our choice of $P$ we have

\[ H(Q_j) = \min_{i \in [3]} H(Q_i) \leq \min_{i \in [3]} H(P_i). \]

Viewing $x$ as a 3-tuple $(x_1, x_2, x_3)$, add $x_j$ to $A_j$. We repeat this for all $x \in \Phi^{\times n}$. The final $(A_1, A_2, A_3)$ is a hitting set for $\Phi^{\times n}$ by construction. For each $j \in [3]$,

\[ |A_j| \leq \sum_{Q_j} |T^n_{Q_j}| \leq \sum_{Q_j} 2^{n H(Q_j)}, \]

where the sum is over $Q_j \in \mathcal{P}_n(\Phi_j)$ with $H(Q_j) \leq \min_{i \in [3]} H(P_i)$. Then

\[ |A_j| \leq |\mathcal{P}_n(\Phi_j)|\, 2^{n \min_i H(P_i)} = \mathrm{poly}(n)\, 2^{n \min_i H(P_i)}. \]

We conclude $|A_1| + |A_2| + |A_3| \leq \mathrm{poly}(n)\, 2^{n \min_i H(P_i)}$.

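Lemma 4.22. $\log_2 \tilde{\tau}(\Phi) \geq \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} H(P_i)$.

Proof. Let $P$ maximise $\max_{P \in \mathcal{P}(\Phi)} \min_i H(P_i)$. Let $n \in \mathbb{N}$. Let $(A_1, A_2, A_3)$ be a hitting set for $\Phi^{\times n}$. Let $Q \in \mathcal{P}_n(\Phi)$ be an $n$-type with $\min_i H(Q_i) = \min_i H(P_i) - o(1)$. Let $\Psi = T^n_Q \subseteq \Phi^{\times n}$ be the set of strings with type $Q$. Then $(A_1, A_2, A_3)$ is a hitting set for $\Psi$. Let $\pi_i : \Psi \to \Phi_i^{\times n} : (x_1, x_2, x_3) \mapsto x_i$ and let $\Psi_i = \pi_i(\Psi)$. Then

\[ \Psi = \pi_1^{-1}(A_1) \cup \pi_2^{-1}(A_2) \cup \pi_3^{-1}(A_3). \]

Let $j \in [3]$ with $|\pi_j^{-1}(A_j)| \geq \tfrac{1}{3} |\Psi|$. The fiber $\pi_j^{-1}(a)$ has constant size over $a \in \Psi_j$. Let $c_j = |\pi_j^{-1}(a)|$ be this size. Then

\[ |\Psi| = \sum_{a \in \Psi_j} |\pi_j^{-1}(a)| = \sum_{a \in \Psi_j} c_j = |\Psi_j|\, c_j. \]

And

\[ |\pi_j^{-1}(A_j)| = \sum_{a \in A_j \cap \Psi_j} |\pi_j^{-1}(a)| = |A_j \cap \Psi_j|\, c_j \leq |A_j|\, c_j. \]

Therefore

\[ |A_j| \geq \frac{|\pi_j^{-1}(A_j)|}{c_j} \geq \frac{\tfrac{1}{3} |\Psi|}{c_j} = \tfrac{1}{3} |\Psi_j|. \]

We have $|\Psi_j| \geq 2^{n H(Q_j) - o(n)} \geq 2^{n \min_i H(Q_i) - o(n)} \geq 2^{n \min_i H(P_i) - o(n)}$. We conclude $|A_1| + |A_2| + |A_3| \geq |A_j| \geq \tfrac{1}{3} |\Psi_j| \geq \tfrac{1}{3}\, 2^{n \min_i H(P_i) - o(n)}$.

Lemma 4.23. $\log_2 \tilde{\tau}(\Phi) = \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} H(P_i)$.

Proof. This follows directly from the above lemmas.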
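The formula of Lemma 4.23 can be explored numerically on small supports. The sketch below evaluates $\min_i H(P_i)$ on all $N$-types on $\Phi$ and returns the best value found. In general this grid search only gives a lower bound on the maximum, since it ignores distributions that are not $N$-types; the function names and the choice of $N$ are assumptions made for illustration.

```python
from math import log2
from collections import defaultdict

def min_marginal_entropy(P, k=3):
    """min_i H(P_i) for a distribution P on k-tuples."""
    values = []
    for i in range(k):
        marg = defaultdict(float)
        for x, p in P.items():
            marg[x[i]] += p
        values.append(-sum(q * log2(q) for q in marg.values() if q > 0))
    return min(values)

def compositions(N, k):
    """All tuples of k non-negative integers summing to N."""
    if k == 1:
        yield (N,)
        return
    for first in range(N + 1):
        for rest in compositions(N - first, k - 1):
            yield (first,) + rest

def grid_max_min_entropy(Phi, N):
    """Best value of min_i H(P_i) over all N-types P on Phi."""
    Phi = sorted(set(Phi))
    return max(min_marginal_entropy({x: c / N for x, c in zip(Phi, counts)})
               for counts in compositions(N, len(Phi)))

W_support = [(1, 1, 2), (1, 2, 1), (2, 1, 1)]
value = grid_max_min_entropy(W_support, 30)
print(value, 2 ** value)
```

For the support of $W$ each marginal entropy equals $h(p_x)$ for one of the three point probabilities $p_x$, and at least one of these is at most $1/3$, so the true maximum is $h(1/3)$, attained at the uniform distribution. The grid search with $N$ divisible by 3 recovers this value, which gives $\tilde{\tau}(\mathrm{supp}(W)) = 2^{h(1/3)} < 2$.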

Asymptotic slice rank

We now combine the above lemmas about slice rank and the asymptotic hitting set number to prove Theorem 4.16. First we have the following basic lemma.

Lemma 4.24. $\min_{\theta \in \Theta} \max_{P \in \mathcal{P}(\Phi)} H_{\theta}(P) = \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} H(P_i)$.

Proof. Since $H_{\theta}(P)$ is convex in $\theta$ and concave in $P$, von Neumann's minimax theorem gives $\min_{\theta} \max_{P \in \mathcal{P}(\Phi)} H_{\theta}(P) = \max_{P \in \mathcal{P}(\Phi)} \min_{\theta} H_{\theta}(P)$. Finally, we use that $\min_{\theta} H_{\theta}(P) = \min_i H(P_i)$.

Define $f^{\sim}(t) = \limsup_{n \to \infty} f(t^{\otimes n})^{1/n}$ and $f_{\sim}(t) = \liminf_{n \to \infty} f(t^{\otimes n})^{1/n}$.

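Lemma 4.25. Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$. Then

\[ \max_{g \in G(t)} \max_{P \in \mathcal{P}(\max \mathrm{supp}(g \cdot t))} \min_i 2^{H(P_i)} \leq \mathrm{SR}_{\sim}(t) \leq \mathrm{SR}^{\sim}(t) \leq \min_{\theta} \zeta^{\theta}(t). \]

Proof. By definition $\mathrm{SR}_{\sim}(t) \leq \mathrm{SR}^{\sim}(t)$. From Lemma 4.17 follows

\[ \mathrm{SR}^{\sim}(t) \leq \tilde{\tau}(\mathrm{supp}(g \cdot t)) \]

for any $g \in G(t)$. Lemma 4.23 gives $\tilde{\tau}(\mathrm{supp}(g \cdot t)) = \max_{P \in \mathcal{P}(\mathrm{supp}(g \cdot t))} \min_i 2^{H(P_i)}$. Thus, with the help of Lemma 4.24,

\[ \mathrm{SR}^{\sim}(t) \leq \min_{g \in G(t)} \max_{P \in \mathcal{P}(\mathrm{supp}(g \cdot t))} \min_i 2^{H(P_i)} = \min_{\theta} \zeta^{\theta}(t). \]

From Lemma 4.18 follows

\[ \tilde{\tau}(\max(\mathrm{supp}(g \cdot t))) \leq \mathrm{SR}_{\sim}(t) \]

for any $g \in G(t)$. Lemma 4.23 gives

\[ \max_{g \in G(t)} \max_{P \in \mathcal{P}(\max(\mathrm{supp}(g \cdot t)))} \min_i 2^{H(P_i)} \leq \mathrm{SR}_{\sim}(t). \]

This proves the lemma.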

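Proof of Theorem 4.16. We may assume $\Phi = \mathrm{supp}(t)$ is oblique. Then, with the help of Lemma 4.24 and Lemma 4.25,

\[ \min_{\theta \in \Theta} \zeta^{\theta}(t) = \min_{\theta \in \Theta} \zeta_{\theta}(t) \]

\[ = \min_{\theta \in \Theta} \max_{P \in \mathcal{P}(\max(\Phi))} 2^{H_{\theta}(P)} \]

\[ = \max_{P \in \mathcal{P}(\max(\Phi))} \min_{i \in [3]} 2^{H(P_i)} \]

\[ \leq \max_{g \in G(t)} \max_{P \in \mathcal{P}(\max(\mathrm{supp}(g \cdot t)))} \min_{i \in [3]} 2^{H(P_i)} \]

\[ \leq \mathrm{SR}_{\sim}(t) \]

\[ \leq \mathrm{SR}^{\sim}(t) \]

\[ \leq \min_{\theta \in \Theta} \zeta^{\theta}(t). \]

This proves the claim.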

4.7 Conclusion

The study of asymptotic rank of tensors is motivated by the open problem of finding the exponent of matrix multiplication. Asymptotic subrank has applications in, for example, combinatorics and algebraic property testing. Via the theory of asymptotic spectra, Strassen characterised asymptotic rank and asymptotic subrank in terms of the asymptotic spectrum of tensors. Strassen introduced the gauge points in $X(\mathcal{T})$ and the support functionals in $X(\{\text{oblique}\})$. More precisely, there are the lower support functionals and the upper support functionals. The lower support functionals are not additive and can thus not be universal spectral points. The upper support functionals may be universal spectral points, but this can, however, not be shown with the help of the lower support functionals. Finally, we showed that for oblique tensors the asymptotic slice rank exists and equals the minimum value over the support functionals. In the next chapter we will see a subfamily of the oblique 3-tensors for which the support functionals are powerful enough to compute the asymptotic subrank.

Chapter 5

Tight tensors and combinatorial subrank; cap sets

This chapter is based on joint work with Matthias Christandl and Peter Vrana [CVZ16, CVZ18].

5.1 Introduction

In the previous chapter we discussed the gauge points and the support functionals $\zeta^{\theta}$. The gauge points are in the asymptotic spectrum of all tensors, while the support functionals are in the asymptotic spectrum of oblique tensors.

How "powerful" are the support functionals? We know $\tilde{Q}(t) \leq \zeta^{\theta}(t) \leq \tilde{R}(t)$ for oblique $t$. Thus $\max_{\theta} \zeta^{\theta}(t) \leq \tilde{R}(t)$. In fact, $\max_{\theta} \zeta^{\theta}(t)$ is at most the maximum over the gauge points $\max_S \zeta^{(S)}$, and in turn $\max_S \zeta^{(S)}$ is at most $\tilde{R}(t)$. As remarked earlier, it is not known whether $\max_S \zeta^{(S)}$ equals $\tilde{R}(t)$ in general.

On the other hand, we have $\tilde{Q}(t) \leq \min_{\theta} \zeta^{\theta}(t)$. Do we attain equality here in general: $\tilde{Q}(t) = \min_{\theta} \zeta^{\theta}(t)$? The answer is "yes" for the subsemiring of tight 3-tensors. In this chapter we study tight $k$-tensors.

Tight tensors

Let $I_1, \ldots, I_k$ be finite sets. Let $\Phi \subseteq I_1 \times \cdots \times I_k$. We say $\Phi$ is tight if there are injective maps $u_i : I_i \to \mathbb{Z}$ for $i \in [k]$ such that

\[ \forall \alpha \in \Phi : \quad u_1(\alpha_1) + \cdots + u_k(\alpha_k) = 0. \]

We say $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ is tight if there is a $g \in G(t) = \mathrm{GL}_{n_1} \times \cdots \times \mathrm{GL}_{n_k}$ such that the support $\mathrm{supp}(g \cdot t)$ is tight.

Recall that a tensor is oblique if the support is an antichain in some basis. Clearly, tight tensors are oblique. To summarise the families of tensors that we have defined up to now, we have

\[ \{\text{tight}\} \subseteq \{\text{oblique}\} \subseteq \{\text{robust}\} \subseteq \{\theta\text{-robust}\}. \]

Recall that the families of oblique, robust and $\theta$-robust tensors each form a semiring under $\otimes$ and $\oplus$. Tight tensors have the same property [Str91, Section 5]. Another property is that any subset of a tight set is tight.

Example 5.1. Let $k \geq 3$ be fixed. For any integer $n \geq 1$ and $c \in [n]$ the set

\[ \Phi_n(c) = \{ \alpha \in \{0, \ldots, n - 1\}^k : \alpha_1 + \cdots + \alpha_k = c \} \]

is tight. For any integer $n \geq 2$ and any $c \in [n]$ the set

\[ \Psi_n(c) = \{ \alpha \in \{0, \ldots, n - 1\}^k : \alpha_1 + \cdots + \alpha_k = c \bmod n \} \]

is not tight (cf. Exercise 15.20 in [BCS97]).
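A tightness witness for $\Phi_n(c)$ can be written down explicitly: take $u_i(a) = a$ for $i < k$ and $u_k(a) = a - c$. The sketch below verifies a given witness; it does not decide tightness of an arbitrary set. The function names `phi` and `is_tight_witness`, and the representation of each $u_i$ as a dictionary, are assumptions made for this illustration.

```python
from itertools import product

def phi(n, c, k):
    """The set Phi_n(c): tuples in {0, ..., n-1}^k whose entries sum to c."""
    return [alpha for alpha in product(range(n), repeat=k) if sum(alpha) == c]

def is_tight_witness(Phi, u):
    """Check that the maps u[0], ..., u[k-1] (dicts into Z) witness tightness of Phi.

    Each map must be injective and u_1(a_1) + ... + u_k(a_k) must vanish on Phi.
    """
    injective = all(len(set(ui.values())) == len(ui) for ui in u)
    vanishes = all(sum(ui[a] for ui, a in zip(u, alpha)) == 0 for alpha in Phi)
    return injective and vanishes

n, c, k = 3, 2, 3
# u_i(a) = a on the first k-1 legs, u_k(a) = a - c on the last leg.
witness = [{a: a for a in range(n)} for _ in range(k - 1)]
witness.append({a: a - c for a in range(n)})
print(is_tight_witness(phi(n, c, k), witness))
```

The same witness also works for every subset of $\Phi_n(c)$, in line with the remark that subsets of tight sets are tight.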

Example 5.2. When $\mathbb{F}$ contains a primitive $n$th root of unity $\zeta$, the tensor

\[ t_n = \sum_{\alpha \in \Psi_n(n - 1)} e_{\alpha_1} \otimes \cdots \otimes e_{\alpha_k} \in (\mathbb{F}^n)^{\otimes k}, \]

which has support $\Psi_n(n - 1)$, is tight. Namely, the elements $v_j = \sum_{i=1}^{n} \zeta^{ij} e_i$ for $j \in [n]$ form a basis of $\mathbb{F}^n$. Let $g \in G(t_n)$ be the corresponding basis transformation. Then we have $t_n = \sum_{j=1}^{n} v_j \otimes \cdots \otimes v_j$ and we see that the support $\mathrm{supp}(g \cdot t_n) = \{ \alpha \in [n]^k : \alpha_1 = \cdots = \alpha_k \}$ is tight. (See also [BCS97, Exercise 15.25].) When the characteristic of $\mathbb{F}$ equals $n$, the tensor $t_n$ is also tight, as we will see in Section 5.4.2.

Combinatorial subrank and the Coppersmith–Winograd method

We care about tight tensors because of a remarkable theorem for tight 3-tensors of Strassen (Theorem 5.3 below). To understand the theorem we need the concept of combinatorial asymptotic subrank (cf. [Str91, Section 5]). We say $D \subseteq I_1 \times \cdots \times I_k$ is a diagonal when any two distinct $\alpha, \beta \in D$ are distinct in all $k$ coordinates. In other words, for elements in $D$, the value at one coordinate uniquely determines the value at the other $k - 1$ coordinates. Let $\Phi \subseteq I_1 \times \cdots \times I_k$. We say a diagonal $D \subseteq I_1 \times \cdots \times I_k$ is free for $\Phi$, or simply $D \subseteq \Phi$ is a free diagonal, if $D = \Phi \cap (D_1 \times \cdots \times D_k)$, where $D_i = \{ x_i : (x_1, \ldots, x_k) \in D \}$. Define the (combinatorial) subrank $Q(\Phi)$ as the size of the largest free diagonal $D \subseteq \Phi$. For $\Phi \subseteq I_1 \times \cdots \times I_k$ and $\Psi \subseteq J_1 \times \cdots \times J_k$ we naturally define the product $\Phi \times \Psi \subseteq (I_1 \times J_1) \times \cdots \times (I_k \times J_k)$ by

\[ \Phi \times \Psi = \{ ((\alpha_1, \beta_1), \ldots, (\alpha_k, \beta_k)) : \alpha \in \Phi,\ \beta \in \Psi \}. \]

Define the (combinatorial) asymptotic subrank $\tilde{Q}(\Phi) = \lim_{n \to \infty} Q(\Phi^{\times n})^{1/n}$. Let $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ and let $\Phi$ be the support of $t$ in the standard basis. Then $Q(\Phi) \leq Q(t)$ and $\tilde{Q}(\Phi) \leq \tilde{Q}(t)$. The number $Q(\Phi)$ may be interpreted as the largest number $n$ such that $\langle n \rangle$ can be obtained from $t$ using a restriction that consists of matrices that have at most one nonzero entry in each row and in each column. (This is called M-restriction in [Str87, Section 6], which stands for monomial restriction.) We may also interpret $\Phi$ as a $k$-partite hypergraph. Then $Q(\Phi)$ is the size of the largest induced $k$-partite matching in $\Phi$.

Let $\Phi \subseteq [n_1] \times \cdots \times [n_k]$ and let $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ be any tensor with support equal to $\Phi$. Then the (asymptotic) subranks of $\Phi$ and $t$ are related as follows:

\[ Q(\Phi) \leq Q(t) \quad \text{and} \quad \tilde{Q}(\Phi) \leq \tilde{Q}(t). \]
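For very small supports the combinatorial subrank can be computed by enumerating subsets. The sketch below follows the definition of a free diagonal literally; it is a brute-force illustration with exponential running time, and the function names are assumptions rather than notation from the text.

```python
from itertools import combinations

def is_free_diagonal(D, Phi):
    """True if D is a diagonal and D equals Phi intersected with D_1 x ... x D_k."""
    D = set(D)
    if not D:
        return True
    k = len(next(iter(D)))
    # Diagonal: any two distinct elements differ in every coordinate.
    for a, b in combinations(D, 2):
        if any(x == y for x, y in zip(a, b)):
            return False
    # Free: no further element of Phi lies in the box spanned by D.
    projections = [{a[i] for a in D} for i in range(k)]
    box = {x for x in Phi if all(x[i] in projections[i] for i in range(k))}
    return box == D

def combinatorial_subrank(Phi):
    """Size of the largest free diagonal contained in Phi."""
    Phi = list(set(Phi))
    for size in range(len(Phi), 0, -1):
        for D in combinations(Phi, size):
            if is_free_diagonal(D, Phi):
                return size
    return 0

print(combinatorial_subrank({(1, 1, 1), (2, 2, 2), (3, 3, 3)}))
print(combinatorial_subrank({(1, 1, 2), (1, 2, 1), (2, 1, 1)}))
print(combinatorial_subrank({(1, 1, 1), (2, 2, 2), (1, 2, 2)}))
```

In the third example the pair $(1,1,1), (2,2,2)$ is a diagonal but not a free one, because $(1,2,2)$ lies in the box it spans; this is why freeness matters in the definition of $Q(\Phi)$.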

Strassen proved the following theorem using the method of Coppersmith and Winograd [CW90]. Recall that for $\Phi \subseteq I_1 \times I_2 \times I_3$ we let $\mathcal{P}(\Phi)$ be the set of probability distributions on $\Phi$. For $P \in \mathcal{P}(\Phi)$ let $P_1, P_2, P_3$ be the marginal distributions of $P$ on the 3 components of $I_1 \times I_2 \times I_3$.

Theorem 5.3 ([Str91, Lemma 5.1]). Let $\Phi \subseteq I_1 \times I_2 \times I_3$ be tight. Then

\[ \tilde{Q}(\Phi) = \max_{P \in \mathcal{P}(\Phi)} \min_{i \in [3]} 2^{H(P_i)}. \tag{5.1} \]
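To make the right-hand side of (5.1) concrete, the sketch below evaluates $\min_i 2^{H(P_i)}$ for a given distribution and runs a coarse grid search over distributions on a three-element support. It is an illustration only: the helper names and the grid resolution are assumptions, and the example support $\{(0,0,1),(0,1,0),(1,0,0)\}$ is the set $\Phi_2$ of Section 5.4.1 with $n = 2$.

```python
from math import log2

def marginal_entropies(P, k=3):
    """Shannon entropies H(P_i) of the k marginals of P, a dict {alpha: probability}."""
    entropies = []
    for i in range(k):
        marginal = {}
        for alpha, p in P.items():
            marginal[alpha[i]] = marginal.get(alpha[i], 0.0) + p
        entropies.append(-sum(p * log2(p) for p in marginal.values() if p > 0))
    return entropies

def rhs_value(P, k=3):
    """min_i 2^{H(P_i)} for one fixed distribution P."""
    return min(2 ** h for h in marginal_entropies(P, k))

def grid_max(Phi, steps=30, k=3):
    """Coarse grid search for max_P min_i 2^{H(P_i)} on a three-element support."""
    a, b, c = sorted(Phi)
    best = 0.0
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            P = {a: i / steps, b: j / steps, c: (steps - i - j) / steps}
            best = max(best, rhs_value(P, k))
    return best

SUPPORT = {(0, 0, 1), (0, 1, 0), (1, 0, 0)}
uniform = {alpha: 1 / 3 for alpha in SUPPORT}
print(rhs_value(uniform))   # value at the uniform distribution
print(grid_max(SUPPORT))    # grid maximum, attained at the uniform distribution
```

For this support the uniform distribution has all marginals equal to $(2/3, 1/3)$, so the value is $2^{h(1/3)} = 3/2^{2/3}$, in agreement with the entry $z(2)$ in the table of Section 5.4.1.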

The consequence of Theorem 5.3 is that the support functionals are sufficiently powerful to compute the asymptotic subrank of tight 3-tensors.

Corollary 5.4 ([Str91, Proposition 5.4]). Let $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$ be tight. Then

\[ \tilde{Q}(t) = \min_{\theta \in \mathcal{P}([3])} \zeta^{\theta}(t). \]

Moreover, if $\Phi = \operatorname{supp}(g \cdot t)$ is tight for some $g \in G(t)$, then $\tilde{Q}(t) = \tilde{Q}(\Phi)$.

Remark 5.5. Strassen conjectured in [Str94, Conjecture 5.3] that for the family of tight 3-tensors the support functionals give all spectral points in the asymptotic spectrum $\mathbf{X}(\{\text{tight 3-tensors}\})$. In [Str91] numerous examples are given of subfamilies of tight 3-tensors for which this is the case.

Remark 5.6. Equation (5.1) becomes false when we let $\Phi \subseteq I_1 \times \cdots \times I_k$ with $k \ge 4$ and we let the right-hand side of the equation be $\max_{P \in \mathcal{P}(\Phi)} \min_i 2^{H(P_i)}$; see [CVZ16, Example 1.1.38].

New results in this chapter

This chapter is an investigation of tight tensors, combinatorial asymptotic subrank, and applications. More precisely, this chapter contains the following new results.

68 Chapter 5 Tight tensors and combinatorial subrank cap sets

Higher-order Coppersmith–Winograd method. In Section 5.2 we extend Theorem 5.3 to obtain a lower bound for $\tilde{Q}(\Phi)$ for tight sets $\Phi \subseteq I_1 \times \cdots \times I_k$ with $k \ge 4$. Our lower bound is not known to be optimal in general. We compute examples for which the lower bound is optimal.

Combinatorial degeneration method. In Section 5.3 we further extend the range of application of the Coppersmith–Winograd method via a partial order on supports of tensors called combinatorial degeneration. We prove that if $\Phi \trianglelefteq \Psi$ then $\tilde{Q}(\Phi) \le \tilde{Q}(\Psi)$. Suppose $\Psi$ is not tight, but $\Phi$ is tight; then we may apply the (higher-order) Coppersmith–Winograd method to obtain a lower bound on $\tilde{Q}(\Phi)$ and thus on $\tilde{Q}(\Psi)$.

Cap sets. In Section 5.4 we relate the theory of asymptotic spectra, the Coppersmith–Winograd method and the combinatorial degeneration method to the problem of upper bounding the maximum size of cap sets in $\mathbb{F}_p^n$.

Graph tensors. Graph tensors are generalisations of the matrix multiplication tensor $\langle 2, 2, 2 \rangle$, parametrised by graphs. In Section 5.5 we discuss how one can apply the higher-order Coppersmith–Winograd method to obtain upper bounds on the asymptotic rank of complete graph tensors. We also briefly discuss the surgery method, which gives good upper bounds on the asymptotic rank of graph tensors for sparse graphs like cycle graphs.

5.2 Higher-order CW method

In this section we extend Theorem 5.3 to tight $\Phi \subseteq I_1 \times \cdots \times I_k$ with $k \ge 4$. We introduce some notation. Let $\mathcal{P}(\Phi)$ be the set of probability distributions on $\Phi$. For $P \in \mathcal{P}(\Phi)$ let $P_1, \dots, P_k$ be the marginal distributions of $P$ on the $k$ components of $I_1 \times \cdots \times I_k$. Let $\mathcal{R}(\Phi)$ be the set of all subsets $R \subseteq \Phi^2$ such that $R \not\subseteq \{(x, x) : x \in \Phi\}$ and $R \subseteq \{(x, y) \in \Phi^2 : x_i = y_i\}$ for some $i \in [k]$. For $P \in \mathcal{P}(\Phi)$ and $R \in \mathcal{R}(\Phi)$ let $\mathcal{Q}(R, (P_1, \dots, P_k))$ be the set of probability distributions $Q$ on $R$ whose marginal distributions on the $2k$ components of $R$ satisfy $Q_i = Q_{k+i} = P_i$ for $i \in [k]$.

Let $I_1, \dots, I_k$ be finite subsets of $\mathbb{Z}$. The result of this section is a lower bound on the asymptotic subrank of any $\Phi \subseteq I_1 \times \cdots \times I_k$ satisfying $\sum_{i=1}^k a_i = 0$ for all $a \in \Phi$. For $R \in \mathcal{R}(\Phi)$ let $r(R)$ be the rank over $\mathbb{Q}$ of the matrix with rows $\{x - y : (x, y) \in R\}$.

Theorem 5.7. Let $\Phi \subseteq \mathbb{Z}^k$ be a finite set with $\sum_{i=1}^k a_i = 0$ for all $a \in \Phi$. Then

\[ \log_2 \tilde{Q}(\Phi) \ge \max_{P} \min_{R, Q} \; H(P) - (k-2)\, \frac{H(Q) - H(P)}{r(R)} \]

with $P \in \mathcal{P}(\Phi)$, $R \in \mathcal{R}(\Phi)$ and $Q \in \mathcal{Q}(R, (P_1, \dots, P_k))$.


5.2.1 Construction

We prepare for the proof of Theorem 5.7 by discussing some basic facts.

Average-free sets

Lemma 5.8. Let $k \in \mathbb{N}$. Let $M \in \mathbb{N}$. We say a subset $B \subseteq \mathbb{Z}/M\mathbb{Z}$ is $(k-1)$-average-free if

\[ \forall x_1, \dots, x_k \in B: \quad x_1 + \cdots + x_{k-1} = (k-1) x_k \;\Rightarrow\; x_1 = \cdots = x_k. \]

There is a $(k-1)$-average-free set $B \subseteq \mathbb{Z}/M\mathbb{Z}$ of size $|B| = M^{1 - o(1)}$.

Proof. There is a set $A \subseteq \{1, \dots, \lfloor \frac{M-1}{k-1} \rfloor\}$ of size $|A| = M^{1 - o(1)}$ with

\[ \forall x_1, \dots, x_k \in A: \quad x_1 + \cdots + x_{k-1} = (k-1) x_k \;\Rightarrow\; x_1 = \cdots = x_k, \tag{5.2} \]

see [VC15, Lemma 10]. Let $B = \{a \bmod M : a \in A\} \subseteq \mathbb{Z}/M\mathbb{Z}$. Then $|B| = |A|$. Let $x_1, \dots, x_k \in B$ with $x_1 + \cdots + x_{k-1} = (k-1) x_k$. View $x_1, \dots, x_k$ as elements in $\{1, \dots, \lfloor \frac{M-1}{k-1} \rfloor\}$. Then $x_1 + \cdots + x_{k-1} = (k-1) x_k$ still holds over $\mathbb{Z}$, since both sides are at most $M - 1$. From (5.2) follows $x_1 = \cdots = x_k$ in $\mathbb{Z}$ and hence also in $\mathbb{Z}/M\mathbb{Z}$.
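The defining property of a $(k-1)$-average-free set is easy to test exhaustively for small $M$. The following sketch is an illustration only (the function name `is_average_free` is hypothetical); it does not construct the large sets of [VC15], it only verifies the property.

```python
from itertools import product

def is_average_free(B, M, k):
    """True if x_1 + ... + x_{k-1} = (k-1) x_k in Z/MZ forces x_1 = ... = x_k on B."""
    B = sorted(set(b % M for b in B))
    for xs in product(B, repeat=k):
        lhs = sum(xs[:-1]) % M
        rhs = ((k - 1) * xs[-1]) % M
        if lhs == rhs and len(set(xs)) > 1:
            return False
    return True

# k = 3 is the case of sets without three-term arithmetic progressions.
print(is_average_free({1, 2, 4}, M=9, k=3))   # subset of {1, ..., floor(8/2)} as in the proof
print(is_average_free({0, 1, 2}, M=9, k=3))   # contains the progression 0, 1, 2
```

The first example follows the construction in the proof: a progression-free subset of $\{1, \dots, \lfloor (M-1)/(k-1) \rfloor\}$ stays average-free after reduction modulo $M$.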

Linear combinations of uniform variables

Lemma 5.9. Let $M$ be a prime. Let $u_1, \dots, u_n$ be independently uniformly distributed over $\mathbb{Z}/M\mathbb{Z}$. Let $v_1, \dots, v_m$ be $(\mathbb{Z}/M\mathbb{Z})$-linear combinations of $u_1, \dots, u_n$. Then the vector $v = (v_1, \dots, v_m)$ is uniformly distributed over the range of $v$ in $(\mathbb{Z}/M\mathbb{Z})^m$.

Proof. Let $v_i = \sum_j c_{ij} u_j$ with $c_{ij} \in \mathbb{Z}/M\mathbb{Z}$. Then $v = Cu$ with $u = (u_1, \dots, u_n)$ and $C$ the matrix with entries $C_{ij} = c_{ij}$. Let $y$ be in the image of $C$. Then the cardinality of the preimage $C^{-1}(y)$ equals the cardinality of the kernel of $C$. Indeed, if $Cx = y$, then $C^{-1}(y) = x + \ker(C)$. Since $u$ is uniform, we conclude that $v$ is uniform on the image of $C$.
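Lemma 5.9 can be confirmed by direct enumeration for small parameters. The sketch below is illustrative (the name `image_distribution` and the example matrices are assumptions): it counts how often each value $v = Cu$ occurs as $u$ ranges over $(\mathbb{Z}/M\mathbb{Z})^n$, and every value in the image should occur equally often, namely $|\ker C|$ times.

```python
from collections import Counter
from itertools import product

def image_distribution(C, M):
    """Count how often each v = C u (mod M) occurs as u ranges over (Z/MZ)^n."""
    n = len(C[0])
    counts = Counter()
    for u in product(range(M), repeat=n):
        v = tuple(sum(c * x for c, x in zip(row, u)) % M for row in C)
        counts[v] += 1
    return counts

# Over Z/3Z the second row is twice the first, so C has rank 1 and a kernel of size 3.
counts = image_distribution([[1, 2], [2, 1]], M=3)
print(len(counts), set(counts.values()))
```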

Free diagonals

Lemma 5.10. Let $G$ be a graph with $n$ vertices and $m$ edges. Then $G$ has at least $n - m$ connected components.

Proof. A graph without edges has $n$ connected components. For every edge that we add to the graph, we lose at most one connected component.

Lemma 5.11. Let $I_1, \dots, I_k$ be finite sets. Let $\Psi \subseteq I_1 \times \cdots \times I_k$. Let

\[ C = \{\{a, b\} \subseteq \Psi : a \ne b,\ \exists i \in [k]: a_i = b_i\}. \]

Then $Q(\Psi) \ge |\Psi| - |C|$. Obviously, the statement remains true if we replace $C$ by the larger set $\{(a, b) \in \Psi^2 : a \ne b,\ \exists i \in [k]: a_i = b_i\}$.

Proof. Let $G = (\Psi, C)$ be the graph with vertex set $\Psi$ and edge set $C$. Let $\Gamma \subseteq \Psi$ contain exactly one vertex per connected component of $G$. The vertices in $\Gamma$ are pairwise not adjacent. So $\Gamma$ is a diagonal. Of course $\Gamma \subseteq \Psi \cap (\Gamma_1 \times \cdots \times \Gamma_k)$. Let $a \in \Psi \cap (\Gamma_1 \times \cdots \times \Gamma_k)$. Let $x_1, \dots, x_k \in \Gamma$ with

\[ (x_1)_1 = a_1, \quad (x_2)_2 = a_2, \quad \dots, \quad (x_k)_k = a_k. \]

Then $x_1, \dots, x_k$ are all adjacent to $a$ in $G$, i.e. they are all in the same connected component. Then $x_1 = \cdots = x_k$, since $\Gamma$ contains precisely one vertex per connected component. So $a = x_1 = \cdots = x_k$. So $a \in \Gamma$. We conclude that $\Gamma \supseteq \Psi \cap (\Gamma_1 \times \cdots \times \Gamma_k)$. Finally, $|\Gamma| \ge |\Psi| - |C|$ by Lemma 5.10.
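The proof of Lemma 5.11 is constructive, and the construction can be sketched in a few lines. The code below is an illustration under the assumption that $\Psi$ is given as a set of equal-length tuples; the names are hypothetical, and a simple union-find structure stands in for the connected components of the graph $G$.

```python
from itertools import combinations

def free_diagonal_from_components(Psi):
    """Pick one element per connected component of the 'shares a coordinate' graph on Psi."""
    Psi = sorted(set(Psi))
    k = len(Psi[0])
    parent = {a: a for a in Psi}

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    edges = 0
    for a, b in combinations(Psi, 2):
        if any(a[i] == b[i] for i in range(k)):
            edges += 1
            parent[find(a)] = find(b)
    gamma = sorted({find(a) for a in Psi})
    return gamma, edges

def is_free_diagonal(D, Phi):
    """Check that D is a diagonal with D = Phi ∩ (D_1 x ... x D_k)."""
    D, k = set(D), len(next(iter(Phi)))
    if any(a[i] == b[i] for a, b in combinations(D, 2) for i in range(k)):
        return False
    marginals = [{x[i] for x in D} for i in range(k)]
    return {a for a in Phi if all(a[i] in marginals[i] for i in range(k))} == D

Psi = {(0, 0, 0), (1, 1, 1), (2, 2, 3), (2, 3, 2)}
gamma, edges = free_diagonal_from_components(Psi)
print(gamma, edges, is_free_diagonal(gamma, Psi))
```

In this example the graph has a single edge and three components, so the construction returns a free diagonal of size $3 = |\Psi| - |C|$.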

We now give the proof of Theorem 5.7. We repeat some notation from above. Let $k \ge 3$. Let $\Phi \subseteq \mathbb{Z}^k$ be a finite set. Let $\mathcal{P}(\Phi)$ be the set of probability distributions on $\Phi$. For $P \in \mathcal{P}(\Phi)$ let $P_1, \dots, P_k$ be the marginal distributions of $P$ on the $k$ components of $\mathbb{Z}^k$. Let $\mathcal{R}(\Phi)$ be the set of all subsets $R \subseteq \Phi^2$ such that $R \not\subseteq \{(x, x) : x \in \Phi\}$ and $R \subseteq \{(x, y) \in \Phi^2 : x_i = y_i\}$ for some $i \in [k]$. For $P \in \mathcal{P}(\Phi)$ and $R \in \mathcal{R}(\Phi)$ let $\mathcal{Q}(R, (P_1, \dots, P_k))$ be the set of probability distributions $Q$ on $R$ whose marginal distributions on the $2k$ components of $R$ satisfy $Q_i = Q_{k+i} = P_i$ for $i \in [k]$. For $R \in \mathcal{R}(\Phi)$ let $r(R)$ be the rank over $\mathbb{Q}$ of the matrix with rows

\[ \{x - y : (x, y) \in R\}. \]

For any prime $M$ let $r_M(R)$ be the rank over $\mathbb{Z}/M\mathbb{Z}$ of the same matrix.

Theorem (Theorem 5.7). Let $\Phi \subseteq \mathbb{Z}^k$ be a finite set with $\sum_{i=1}^k a_i = 0$ for all $a \in \Phi$. Then

\[ \log_2 \tilde{Q}(\Phi) \ge \max_{P} \min_{R, Q} \; H(P) - (k-2)\, \frac{H(Q) - H(P)}{r(R)} \]

with $P \in \mathcal{P}(\Phi)$, $R \in \mathcal{R}(\Phi)$ and $Q \in \mathcal{Q}(R, (P_1, \dots, P_k))$.

Proof. Let $P$ be a rational probability distribution on $\Phi$, i.e. $P(a) \in \mathbb{Q}$ for all $a \in \Phi$.

Choice of parameters

This proof involves a variable $N$ that we will let go to infinity and a prime number $M$ that depends on $N$. For the sake of rigor, we first set the dependence of $M$ on $N$ and make sure that $N$ is large enough for $M$ to have good properties.

Let $n \in \mathbb{N}$ such that $P$ is an $n$-type, i.e. $nP(a) \in \mathbb{N}$ for all $a \in \Phi$. Let $N = tn$ be a multiple of $n$. Let

\[ f(N) = \log_2 \Bigl( 2^{|\Phi|^2} \max_{R \in \mathcal{R}(\Phi)} \binom{N + |R| - 1}{|R| - 1} \Bigr) \in o(N). \tag{5.3} \]

Let

\[ g(N) = |\Phi| \log_2(N + 1) \in o(N). \]

By Lemma 4.20,

\[ 2^{N H(P) - g(N)} \le \binom{N}{NP}. \tag{5.4} \]

Let

\[ \mu(N) = \max_{R, Q} \frac{H(Q) - H(P) + (1 + g(N) + f(N)) \frac{1}{N}}{r(R)} \tag{5.5} \]

with $R \in \mathcal{R}(\Phi)$ and $Q \in \mathcal{Q}(R, (P_1, \dots, P_k))$. Let $M$ be a prime with

\[ \lceil 2^{\mu(N) N} \rceil \le M \le 2 \lceil 2^{\mu(N) N} \rceil. \tag{5.6} \]

Such a prime exists by Bertrand's postulate, see e.g. [AZ14]. We can make $M$ arbitrarily large by choosing $N$ large enough. Choose $N = tn$ large enough such that

\[ M > k - 1, \tag{5.7} \]

\[ \forall R \in \mathcal{R}(\Phi): \quad r_M(R) = r(R). \tag{5.8} \]

We will later let $t$, and thus $N$, go to infinity.
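The choice of $M$ in (5.6) only uses Bertrand's postulate, which can be checked numerically for small arguments. The sketch below is illustrative: `bertrand_prime` and `choose_modulus` are hypothetical helper names, trial division is used for simplicity, and the values of $\mu(N)$ and $N$ in the example are made up rather than derived from a particular $\Phi$.

```python
from math import ceil

def is_prime(m):
    """Trial-division primality test (adequate for small m)."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def bertrand_prime(x):
    """Smallest prime M with x <= M <= 2x, for an integer x >= 1."""
    for M in range(x, 2 * x + 1):
        if is_prime(M):
            return M
    raise ValueError("no prime found; x must be a positive integer")

def choose_modulus(mu, N):
    """A prime M with ceil(2^(mu N)) <= M <= 2 ceil(2^(mu N)), as in (5.6)."""
    return bertrand_prime(ceil(2 ** (mu * N)))

# Hypothetical values mu(N) = 0.5 and N = 10, so 2^(mu N) = 32.
print(choose_modulus(0.5, 10))
```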

Restrict to marginal type classes

The set $\Phi^{\otimes N}$ is a finite subset of $(\mathbb{Z}^N)^k$. Let $a \in \Phi^{\otimes N}$. Then we have that $a_i = ((a_i)_1, \dots, (a_i)_N) \in \mathbb{Z}^N$ for $i \in [k]$. We restrict to those $a$ for which $a_i$ is in the type class $T^N_{P_i}$ for all $i \in [k]$. Thus let

\[ \Psi = \Phi^{\otimes N} \cap (T^N_{P_1} \times \cdots \times T^N_{P_k}). \]

We prove a lower bound on the size of $\Psi$. Let $(s_1, \dots, s_N) \in T^N_P$. Then $s_j \in \Phi$ for $j \in [N]$ and $((s_1)_i, \dots, (s_N)_i) \in T^N_{P_i}$ for $i \in [k]$. So

\[ \bigl( ((s_1)_1, \dots, (s_N)_1), \dots, ((s_1)_k, \dots, (s_N)_k) \bigr) \in \Phi^{\otimes N} \cap (T^N_{P_1} \times \cdots \times T^N_{P_k}) = \Psi. \]

Thus $|\Psi| \ge |T^N_P|$. By Lemma 4.19, $|T^N_P| = \binom{N}{NP}$. By Lemma 4.20, $\binom{N}{NP} \ge 2^{N H(P) - g(N)}$. Therefore

\[ |\Psi| \ge 2^{N H(P) - g(N)}. \tag{5.9} \]
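The two-sided estimate on the size of a type class that underlies (5.4) and (5.9) can be checked numerically. The following sketch assumes the standard bounds $2^{NH(P) - g(N)} \le \binom{N}{NP} \le 2^{NH(P)}$ with $g(N) = |\Phi| \log_2(N+1)$; the function names and the example distribution are illustrative.

```python
from fractions import Fraction
from math import factorial, log2

def type_class_size(P, N):
    """Multinomial coefficient N! / prod_a (N P(a))! for an N-type P (list of Fractions)."""
    size = factorial(N)
    for p in P:
        count = p * N
        if count.denominator != 1:
            raise ValueError("P must be an N-type")
        size //= factorial(count.numerator)
    return size

def entropy(P):
    """Shannon entropy of P in bits."""
    return -sum(float(p) * log2(float(p)) for p in P if p > 0)

P = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
for N in (6, 12, 60):
    lower = N * entropy(P) - len(P) * log2(N + 1)
    upper = N * entropy(P)
    print(N, lower, log2(type_class_size(P, N)), upper)
```

For $N = 6$ the type class has $6!/(3!\,2!\,1!) = 60$ elements, and $\log_2 60 \approx 5.9$ lies between the two bounds.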


Hashing

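Let $u_1, \dots, u_{k-1}, v_1, \dots, v_N \in \mathbb{Z}/M\mathbb{Z}$. For $i \in [k]$ let $h_i : \mathbb{Z}^N \to \mathbb{Z}/M\mathbb{Z}$ be defined by

\[ h_i(x) = \begin{cases} u_i + \sum_{j=1}^N x_j v_j & \text{for } 1 \le i \le k-1, \\[4pt] \frac{1}{k-1} \Bigl( u_1 + \cdots + u_{k-1} - \sum_{j=1}^N x_j v_j \Bigr) & \text{for } i = k. \end{cases} \]

Note that $k - 1$ is invertible in $\mathbb{Z}/M\mathbb{Z}$ by (5.7). Let $a \in \Psi$. Then $((a_1)_j, \dots, (a_k)_j) \in \Phi$ for $j \in [N]$. So $\sum_{i=1}^k (a_i)_j = 0$ for every $j \in [N]$. Thus

\[ \sum_{i=1}^k \sum_{j=1}^N (a_i)_j v_j = \sum_{j=1}^N v_j \sum_{i=1}^k (a_i)_j = 0. \]

Therefore

\[ h_1(a_1) + \cdots + h_{k-1}(a_{k-1}) = (k-1)\, h_k(a_k). \]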
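The identity $h_1(a_1) + \cdots + h_{k-1}(a_{k-1}) = (k-1) h_k(a_k)$ can be tested on random inputs. The sketch below is an illustration: the set `PHI`, the prime $M = 101$ and the helper `make_hashes` are assumptions chosen for the example, and `pow(k - 1, -1, M)` (Python 3.8 or later) computes the inverse of $k - 1$ modulo $M$.

```python
import random

def make_hashes(k, N, M, rng):
    """Draw u_1..u_{k-1}, v_1..v_N from Z/MZ and return the map (i, x) -> h_i(x)."""
    u = [rng.randrange(M) for _ in range(k - 1)]
    v = [rng.randrange(M) for _ in range(N)]
    inv = pow(k - 1, -1, M)  # requires M prime with M > k - 1, cf. (5.7)

    def h(i, x):
        dot = sum(xj * vj for xj, vj in zip(x, v)) % M
        if i < k:
            return (u[i - 1] + dot) % M
        return (inv * (sum(u) - dot)) % M

    return h

# A small zero-sum set in Z^3 (every element has coordinate sum 0).
PHI = [(1, -1, 0), (0, 1, -1), (-1, 0, 1), (0, 0, 0)]
k, N, M = 3, 8, 101
rng = random.Random(0)
h = make_hashes(k, N, M, rng)

columns = [rng.choice(PHI) for _ in range(N)]          # an element of Phi^N
a = [tuple(col[i] for col in columns) for i in range(k)]  # a_1, ..., a_k in Z^N
print((h(1, a[0]) + h(2, a[1])) % M == (2 * h(3, a[2])) % M)
```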

Restrict to average-free set

Let $B \subseteq \mathbb{Z}/M\mathbb{Z}$ be a $(k-1)$-average-free set of size

\[ |B| \ge M^{1 - \kappa(M)} \quad\text{with } \kappa(M) \in o(1), \tag{5.10} \]

meaning

\[ \forall x_1, \dots, x_k \in B: \quad x_1 + \cdots + x_{k-1} = (k-1) x_k \;\Rightarrow\; x_1 = \cdots = x_k \tag{5.11} \]

(Lemma 5.8). Let $\Psi' \subseteq \Psi$ be the subset

\[ \Psi' = \{a \in \Psi : \forall i \in [k]\ h_i(a_i) \in B\}. \]

Let $a \in \Psi'$. Then $a \in \Psi$, so

\[ h_1(a_1) + \cdots + h_{k-1}(a_{k-1}) = (k-1)\, h_k(a_k). \]

Since $h_i(a_i) \in B$ for every $i \in [k]$, (5.11) implies

\[ h_1(a_1) = \cdots = h_k(a_k). \]

Probabilistic method

Clearly $Q(\Phi^{\otimes N}) \ge Q(\Psi) \ge Q(\Psi')$. Let

\[ C' = \{(a, b) \in \Psi'^2 : a \ne b,\ \exists i \in [k]: a_i = b_i\}. \]

Let $X = |\Psi'|$ and $Y = |C'|$. By Lemma 5.11,

\[ Q(\Psi') \ge X - Y. \]

Let $u_1, \dots, u_{k-1}, v_1, \dots, v_N$ be independent uniformly random variables over the field $\mathbb{Z}/M\mathbb{Z}$. Then $X$ and $Y$ are random variables. Then

\[ Q(\Psi') \ge \mathbb{E}[X - Y] = \mathbb{E}[X] - \mathbb{E}[Y], \]

where the expectation is over $u_1, \dots, u_{k-1}, v_1, \dots, v_N$. We will prove

\[ \mathbb{E}[X] = |B|\, |\Psi|\, M^{-(k-1)}, \tag{5.12} \]

\[ \mathbb{E}[Y] \le |B| \max_{R, Q} 2^{N H(Q) + f(N)} M^{-(k-1) - r(R)}, \tag{5.13} \]

with $f(N)$ as defined in (5.3) and $R \in \mathcal{R}(\Phi)$, $Q \in \mathcal{Q}(R, (P_1, \dots, P_k))$. Before proving (5.12) and (5.13) we derive the final bound.

Derivation of final bound

From (5.12) and (5.13) follows

\[ \mathbb{E}[X] - \mathbb{E}[Y] \ge |B|\, |\Psi|\, M^{-(k-1)} - |B| \max_{R, Q} 2^{N H(Q) + f(N)} M^{-(k-1) - r(R)}. \]

We factor out $|B|$, $|\Psi|$ and $M^{-(k-1)}$:

\[ \mathbb{E}[X] - \mathbb{E}[Y] \ge |B|\, |\Psi|\, M^{-(k-1)} \Bigl( 1 - \frac{1}{|\Psi|} \max_{R, Q} 2^{N H(Q) + f(N)} M^{-r(R)} \Bigr). \]

From our choice of $\mu(N)$ from (5.5),

\[ \mu(N) = \max_{R, Q} \frac{H(Q) - H(P) + (1 + g(N) + f(N)) \frac{1}{N}}{r(R)}, \]

follows

\[ \max_{R, Q} 2^{N (H(Q) - H(P) - r(R) \mu(N)) + g(N) + f(N)} \le \frac{1}{2}. \tag{5.14} \]

Apply $|B| \ge M^{1 - \kappa(M)}$ from (5.10) and $|\Psi| \ge 2^{N H(P) - g(N)}$ from (5.9) to get

\[ \begin{aligned} \mathbb{E}[X] - \mathbb{E}[Y] &\ge M^{1 - \kappa(M)}\, 2^{N H(P) - g(N)}\, M^{-(k-1)} \cdot \Bigl( 1 - 2^{-N H(P) + g(N)} \max_{R, Q} 2^{N H(Q) + f(N)} M^{-r(R)} \Bigr) \\ &\ge M^{-(k - 2 + \kappa(M))}\, 2^{N H(P) - g(N)} \cdot \Bigl( 1 - \max_{R, Q} 2^{N H(Q) - N H(P) + g(N) + f(N)} M^{-r(R)} \Bigr). \end{aligned} \]

(Here we used (5.14) to see that the second factor is nonnegative.) Apply the bounds $2^{\mu(N) N} \le M \le 2^{\mu(N) N + 2}$ from (5.6) to get

\[ \begin{aligned} \mathbb{E}[X] - \mathbb{E}[Y] &\ge (2^{\mu(N) N + 2})^{-(k - 2 + \kappa(M))}\, 2^{N H(P) - g(N)} \cdot \Bigl( 1 - \max_{R, Q} 2^{N H(Q) - N H(P) + g(N) + f(N)} (2^{\mu(N) N})^{-r(R)} \Bigr) \\ &= 2^{N (H(P) - (k - 2 + \kappa(M)) \mu(N)) - 2(k - 2 + \kappa(M)) - g(N)} \cdot \Bigl( 1 - \max_{R, Q} 2^{N (H(Q) - H(P) - r(R) \mu(N)) + g(N) + f(N)} \Bigr). \end{aligned} \]

Using (5.14) we get

\[ \begin{aligned} \mathbb{E}[X] - \mathbb{E}[Y] &\ge 2^{N (H(P) - (k - 2 + \kappa(M)) \mu(N)) - 2(k - 2 + \kappa(M)) - g(N)} \bigl( 1 - \tfrac{1}{2} \bigr) \\ &= 2^{N (H(P) - (k - 2 + \kappa(M)) \mu(N)) - 2(k - 2 + \kappa(M)) - g(N) - 1}. \end{aligned} \]

Then

\[ \begin{aligned} \frac{1}{N} \log_2 Q(\Phi^{\otimes N}) &\ge \frac{1}{N} \log_2 (\mathbb{E}[X] - \mathbb{E}[Y]) \\ &\ge H(P) - (k - 2 + \kappa(M)) \max_{R, Q} \frac{H(Q) - H(P) + (1 + g(N) + f(N)) \frac{1}{N}}{r(R)} - \frac{2(k - 2 + \kappa(M)) + g(N) + 1}{N}. \end{aligned} \]

We let $t$, and thus $N$, go to infinity and obtain

\[ \log_2 \tilde{Q}(\Phi) \ge H(P) - (k - 2) \max_{R, Q} \frac{H(Q) - H(P)}{r(R)}. \]

This lower bound holds for any rational probability distribution $P$ on $\Phi$, and by continuity for any real probability distribution $P$ on $\Phi$.

It remains to prove (5.12) and (5.13). We do this in the lemmas below.

Lemma 5.12. $\mathbb{E}[X] = |B|\, |\Psi|\, M^{-(k-1)}$.

Proof. Let $a \in \Psi$. Then $h_1(a_1) + \cdots + h_{k-1}(a_{k-1}) = (k-1)\, h_k(a_k)$. The following four statements are equivalent:

$a \in \Psi'$

$\forall i \in [k]: h_i(a_i) \in B$

$\exists b \in B: h_1(a_1) = \cdots = h_k(a_k) = b$

$\exists b \in B: h_1(a_1) = \cdots = h_{k-1}(a_{k-1}) = b$

Therefore

\[ \mathbb{P}[a \in \Psi'] = \sum_{b \in B} \mathbb{P}[h_1(a_1) = \cdots = h_{k-1}(a_{k-1}) = b]. \]

For $b \in B$,

\[ \mathbb{P}[h_1(a_1) = \cdots = h_{k-1}(a_{k-1}) = b] = (M^{-1})^{k-1}. \]

We conclude

\[ \begin{aligned} \mathbb{E}[X] &= \sum_{a \in \Psi} \mathbb{P}[a \in \Psi'] \\ &= \sum_{a \in \Psi} \sum_{b \in B} \mathbb{P}[h_1(a_1) = \cdots = h_{k-1}(a_{k-1}) = b] \\ &= \sum_{a \in \Psi} \sum_{b \in B} (M^{-1})^{k-1} \\ &= |\Psi|\, |B|\, M^{-(k-1)}. \end{aligned} \]

This proves the lemma.

Lemma 5.13. $\mathbb{E}[Y] \le |B| \max_{R, Q} 2^{N H(Q) + f(N)} M^{-(k-1) - r(R)}$.

Proof. Let

\[ C = \{(a, a') \in \Psi^2 : a \ne a',\ \exists i \in [k]: a_i = a'_i\}. \]

Let $(a, a') \in C$. The following statements are equivalent:

\[ (a, a') \in C' \tag{5.15} \]

\[ a, a' \in \Psi' \tag{5.16} \]

\[ \forall i \in [k]: h_i(a_i), h_i(a'_i) \in B \tag{5.17} \]

\[ \exists b \in B: h_1(a_1) = \cdots = h_k(a_k) = h_1(a'_1) = \cdots = h_k(a'_k) = b \tag{5.18} \]

Therefore

\[ \begin{aligned} \mathbb{E}[Y] &= \sum_{(a, a') \in C} \mathbb{P}[(a, a') \in C'] \\ &= \sum_{(a, a') \in C} \sum_{b \in B} \mathbb{P}[h_1(a_1) = \cdots = h_k(a_k) = h_1(a'_1) = \cdots = h_k(a'_k) = b]. \end{aligned} \]

Let $(a, a') \in C$. Then $h_i(a_i)$ and $h_i(a'_i)$ are $(\mathbb{Z}/M\mathbb{Z})$-linear combinations of $u_1, \dots, u_{k-1}, v_1, \dots, v_N$. By Lemma 5.9, the random variable

\[ \bigl( h_1(a_1), \dots, h_k(a_k), h_1(a'_1), \dots, h_k(a'_k) \bigr) \]

is uniformly distributed over the image subspace $V \subseteq (\mathbb{Z}/M\mathbb{Z})^{2k}$. Let $b \in B$. Then $(b, \dots, b) \in V$, since $u_1 = \cdots = u_{k-1} = b$, $v_1 = \cdots = v_N = 0$ is a valid assignment. Therefore

\[ \mathbb{P}[h_1(a_1) = \cdots = h_k(a_k) = h_1(a'_1) = \cdots = h_k(a'_k) = b] = |V|^{-1}. \]

And $|V|$ equals $M$ to the power the rank of the matrix

\[ \begin{pmatrix} 1 & 0 & \cdots & 0 & \frac{1}{k-1} & 1 & 0 & \cdots & 0 & \frac{1}{k-1} \\ 0 & 1 & & 0 & \frac{1}{k-1} & 0 & 1 & & 0 & \frac{1}{k-1} \\ \vdots & & \ddots & & \vdots & \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & \frac{1}{k-1} & 0 & 0 & \cdots & 1 & \frac{1}{k-1} \\ a_1 & a_2 & \cdots & a_{k-1} & -\frac{a_k}{k-1} & a'_1 & a'_2 & \cdots & a'_{k-1} & -\frac{a'_k}{k-1} \end{pmatrix} \tag{5.19} \]

over $\mathbb{Z}/M\mathbb{Z}$, with $a_1, \dots, a_k, a'_1, \dots, a'_k$ thought of as column vectors in $(\mathbb{Z}/M\mathbb{Z})^N$. With column operations we transform (5.19) into

\[ \begin{pmatrix} 0 & 0 & \cdots & 0 & 0 & 1 & 0 & \cdots & 0 & 0 \\ 0 & 0 & \cdots & 0 & 0 & 0 & 1 & & 0 & 0 \\ \vdots & & & & \vdots & \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 0 & 0 & 0 & 0 & \cdots & 1 & 0 \\ a_1 - a'_1 & a_2 - a'_2 & \cdots & a_{k-1} - a'_{k-1} & a_k - a'_k & a'_1 & a'_2 & \cdots & a'_{k-1} & 0 \end{pmatrix}. \tag{5.20} \]

Matrix (5.20) has rank equal to $k - 1$ plus $r_M(a, a') = \operatorname{rk}(A(a, a'))$, where

\[ A(a, a') = \begin{pmatrix} a_1 - a'_1 & a_2 - a'_2 & \cdots & a_k - a'_k \end{pmatrix}. \]

We obtain

\[ \mathbb{E}[Y] \le \sum_{(a, a') \in C} \sum_{b \in B} M^{-(k - 1 + r_M(a, a'))}. \]

Since the summands are independent of $b$, we get

\[ \mathbb{E}[Y] \le |B| \sum_{(a, a') \in C} M^{-(k - 1 + r_M(a, a'))}. \]

Let $(a, a') \in C$. Consider the rows of $A(a, a')$. The $N$ rows are of the form $x_i - y_i$ with $(x_i, y_i) \in \Phi^2$. Let $s = ((x_1, y_1), \dots, (x_N, y_N))$. Let $R = \{(x_1, y_1), \dots, (x_N, y_N)\}$. We have $r_M(a, a') = r_M(R)$ and $r_M(R) = r(R)$ by (5.8). Let $Q$ be the $N$-type with $\operatorname{supp}(Q) = R$ and $s \in T^N_Q$. From $a \ne a'$ follows $R \not\subseteq \{(x, x) : x \in \Phi\}$. From $\exists i \in [k]: a_i = a'_i$ follows $\exists i \in [k]: R \subseteq \{(x, y) : x_i = y_i\}$. From $a, a' \in T^N_{P_1} \times \cdots \times T^N_{P_k}$ follows $Q_i = Q_{k+i} = P_i$ for all $i \in [k]$. We thus have

\[ \mathbb{E}[Y] \le |B| \sum_{R \in \mathcal{R}(\Phi)} \; \sum_{\substack{Q \in \mathcal{Q}(R, (P_1, \dots, P_k)) \\ \operatorname{supp}(Q) = R \\ Q \text{ is } N\text{-type}}} \; \sum_{s \in T^N_Q} M^{-(k - 1 + r(R))}. \]

The number of $N$-types $Q$ with $\operatorname{supp}(Q) = R$ is at most the number of $N$-types on $R$, which is at most $\binom{N + |R| - 1}{|R| - 1}$ (Lemma 4.19). For any $Q \in \mathcal{Q}(R, (P_1, \dots, P_k))$, $|T^N_Q| \le 2^{N H(Q)}$ (Lemma 4.19). Therefore

\[ \mathbb{E}[Y] \le |B| \sum_{R \in \mathcal{R}(\Phi)} \binom{N + |R| - 1}{|R| - 1} \max_{Q \in \mathcal{Q}(R, (P_1, \dots, P_k))} 2^{N H(Q)} M^{-(k - 1 + r(R))}. \]

Also $|\mathcal{R}(\Phi)| \le 2^{|\Phi|^2}$. Therefore

\[ \mathbb{E}[Y] \le |B|\, 2^{|\Phi|^2} \max_{R \in \mathcal{R}(\Phi)} \binom{N + |R| - 1}{|R| - 1} \max_{Q \in \mathcal{Q}(R, (P_1, \dots, P_k))} 2^{N H(Q)} M^{-(k - 1 + r(R))}. \]

We conclude that

\[ \mathbb{E}[Y] \le |B| \max_{R, Q} 2^{N H(Q) + f(N)} M^{-(k-1) - r(R)}. \]

This proves the lemma.

5.2.2 Computational remarks

The following two lemmas are helpful when applying Theorem 5.7. We leave the proof to the reader.

Lemma 5.14. Let $P \in \mathcal{P}(\Phi)$. Let $R, R' \in \mathcal{R}(\Phi)$ with $R \subseteq R'$ and $r(R) = r(R')$. Then

\[ \max_{Q \in \mathcal{Q}(R, (P_1, \dots, P_k))} \frac{H(Q) - H(P)}{r(R)} \le \max_{Q \in \mathcal{Q}(R', (P_1, \dots, P_k))} \frac{H(Q) - H(P)}{r(R')}. \]

Lemma 5.15. Let $R \in \mathcal{R}(\Phi)$. There is an equivalence relation $R' \in \mathcal{R}(\Phi)$ with $R \subseteq R'$ and $r(R) = r(R')$.

5.2.3 Examples: type sets

We discuss some examples. The first example we will use to get good upper bounds on the asymptotic rank of complete graph tensors in Section 5.5. We focus on one family of examples that is parametrised by partitions. Let $\lambda \vdash k$ be an integer partition of $k$ with $d$ parts. Let

\[ \Phi_\lambda = \{a \in \{0, 1, \dots, d-1\}^k : \operatorname{type}(a) = \lambda\}. \]

The set $\Phi_\lambda$ is tight.

Theorem 5.16. $\log_2 \tilde{Q}(\Phi_{(2,2)}) = 1$.

Proof. Let $\Phi = \Phi_{(2,2)}$. Clearly $\tilde{Q}(\Phi) \le 2$. After relabelling, $\sum_{i=1}^k a_i = 0$ for all $a \in \Phi$. We may thus apply Theorem 5.7. Let $P$ be the uniform probability distribution on $\Phi$. Then $H(P) = \log_2 6$.

Let $R \in \mathcal{R}(\Phi)$. We may assume that

\[ R \subseteq \{(1,1,0,0), (1,0,1,0), (1,0,0,1)\}^2 \cup \{(0,0,1,1), (0,1,0,1), (0,1,1,0)\}^2. \]

We may assume $R$ is an equivalence relation (Lemma 5.15). Let $(x, y) \in R$. Let $R' = R \cup \{((1,1,1,1) - x, (1,1,1,1) - y)\}$. Then $R \subseteq R'$, $R' \in \mathcal{R}(\Phi)$ and $r(R) = r(R')$. We may thus assume that if $(x, y) \in R$, then also $((1,1,1,1) - x, (1,1,1,1) - y) \in R$ (Lemma 5.14).

Let $S = \{(1,1,0,0), (1,0,1,0), (1,0,0,1)\}$. By the above observation, it suffices to consider equivalence relations on $S$. There are three types of such equivalence relations.

Type (3): all three elements of $S$ are equivalent. Then $|R| = 18$ and $r(R) = 2$.

Type (2, 1): two elements of $S$ are equivalent and inequivalent to the third element (which is equivalent to itself). Then $|R| = 10$ and $r(R) = 1$.

Type (1, 1, 1): all elements of $S$ are inequivalent. Then $R \subseteq \{(x, x) : x \in \Phi\}$, which is a contradiction.

For type (3) and (2, 1) the uniform probability distribution $Q$ on $R$ has marginals $Q_i = Q_{4+i} = P_i$ for $i \in [4]$. The uniform $Q$ is optimal. Then $H(Q) = \log_2 |R|$. Let $R_{(3)}$ and $R_{(2,1)}$ be equivalence relations of type (3) and (2, 1). Then

\[ \begin{aligned} \log_2 \tilde{Q}(\Phi) &\ge \min \Bigl\{ H(P) - \frac{2}{r(R_{(3)})} \bigl( \log_2 |R_{(3)}| - H(P) \bigr),\; H(P) - \frac{2}{r(R_{(2,1)})} \bigl( \log_2 |R_{(2,1)}| - H(P) \bigr) \Bigr\} \\ &= \min \Bigl\{ \log_2 6 - \tfrac{2}{2} (\log_2 18 - \log_2 6),\; \log_2 6 - \tfrac{2}{1} (\log_2 10 - \log_2 6) \Bigr\} \\ &= \min \bigl\{ 1,\; \log_2 \tfrac{54}{25} \bigr\} = 1. \end{aligned} \]

This proves the theorem.
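The numbers in the proof of Theorem 5.16 can be recomputed directly. The sketch below is an illustration, not the author's computation: it builds the relations of type (3) and (2, 1) from blocks of $S$ together with their complements, computes $|R|$ and $r(R)$, and evaluates $H(P) - (k-2)(\log_2 |R| - H(P))/r(R)$ for $k = 4$. The helper names are hypothetical. The elements are kept in $\{0,1\}^4$; the relabelling to coordinate sum zero rescales all differences $x - y$ by a common factor and so does not change the rank.

```python
from fractions import Fraction
from itertools import product
from math import log2

def rank_over_Q(rows):
    """Rank of an integer matrix over the rationals, by Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    rank = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank

S = [(1, 1, 0, 0), (1, 0, 1, 0), (1, 0, 0, 1)]

def complement(x):
    return tuple(1 - v for v in x)

def relation_from_blocks(blocks):
    """Equivalence relation with the given blocks on S, closed under complementation."""
    R = set()
    for block in blocks:
        for x, y in product(block, repeat=2):
            R.add((x, y))
            R.add((complement(x), complement(y)))
    return R

def r(R):
    return rank_over_Q([[a - b for a, b in zip(x, y)] for x, y in R])

def bound(R, k=4, entropy_P=log2(6)):
    """H(P) - (k-2) (H(Q) - H(P)) / r(R) with Q uniform on R, so H(Q) = log2 |R|."""
    return entropy_P - (k - 2) * (log2(len(R)) - entropy_P) / r(R)

R3 = relation_from_blocks([S])
R21 = relation_from_blocks([S[:2], S[2:]])
print(len(R3), r(R3), bound(R3))
print(len(R21), r(R21), bound(R21))
```

The minimum of the two values is $1$, attained by the relation of type (3); the relation of type (2, 1) gives $\log_2(54/25)$.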

Theorem 5.17. $\log_2 \tilde{Q}(\Phi_{(0^{k-1} 1)}) = h(1/k)$.

Proof. We refer to [CVZ16].

With Srinivasan Arunachalam and Peter Vrana we have the following unpublished result.

Theorem 5.18. $\log_2 \tilde{Q}(\Phi_{(0^{k/2} 1^{k/2})}) = 1$.

5.3 Combinatorial degeneration method

In this section we extend the (higher-order) Coppersmith–Winograd method via a preorder called combinatorial degeneration. Suppose $\Psi \subseteq I_1 \times \cdots \times I_k$ is not tight, but has a tight subset $\Phi \subseteq \Psi$. In the rest of this section we focus on obtaining a lower bound on $\tilde{Q}(\Psi)$ via $\Phi$. This has an application in the context of tri-colored sum-free sets (Section 5.4.2), for example.

Definition 5.19 ([BCS97]). Let $\Phi \subseteq \Psi \subseteq I_1 \times \cdots \times I_k$. We say that $\Phi$ is a combinatorial degeneration of $\Psi$, and write $\Psi \trianglerighteq \Phi$, if there are maps $u_i : I_i \to \mathbb{Z}$ ($i \in [k]$) such that for all $\alpha \in I_1 \times \cdots \times I_k$: if $\alpha \in \Psi \setminus \Phi$, then $\sum_{i=1}^k u_i(\alpha_i) > 0$, and if $\alpha \in \Phi$, then $\sum_{i=1}^k u_i(\alpha_i) = 0$. Note that the maps $u_i$ need not be injective.
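Whether given maps $u_i$ witness a combinatorial degeneration is a finite check. The sketch below is illustrative (the function name and the dictionary encoding of the maps are assumptions); the example uses the sets $\Psi_3$ and $\Phi_3$ that appear in Section 5.4.2, with the identity maps shifted so that the sum vanishes on $\Phi_3$.

```python
from itertools import product

def witnesses_degeneration(Psi, Phi, maps):
    """Check that sum_i u_i(alpha_i) is 0 on Phi and positive on Psi minus Phi."""
    Psi, Phi = set(Psi), set(Phi)
    if not Phi <= Psi:
        return False
    for alpha in Psi:
        total = sum(u[x] for u, x in zip(maps, alpha))
        if alpha in Phi and total != 0:
            return False
        if alpha not in Phi and total <= 0:
            return False
    return True

m = 3
cube = list(product(range(m), repeat=3))
Psi = {a for a in cube if sum(a) % m == m - 1}
Phi = {a for a in cube if sum(a) == m - 1}

identity = {x: x for x in range(m)}
shifted = {x: x - (m - 1) for x in range(m)}   # u_3(x) = x - (m - 1)
print(witnesses_degeneration(Psi, Phi, [identity, identity, shifted]))
```

With unshifted identity maps the sum equals $m - 1$ rather than $0$ on $\Phi_3$, so the check fails; the shift of a single map repairs this without affecting positivity on $\Psi_3 \setminus \Phi_3$.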

Combinatorial degeneration gets its name from the following standard proposition, see e.g. [BCS97, Proposition 15.30].

Proposition 5.20. Let $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$. Let $\Psi = \operatorname{supp}(t)$. Let $\Phi \subseteq \Psi$ such that $\Psi \trianglerighteq \Phi$. Then $t \trianglerighteq t|_\Phi$.

Proposition 5.20 brings us only slightly closer to our goal. Namely, given $t \in \mathbb{F}^{n_1} \otimes \cdots \otimes \mathbb{F}^{n_k}$ with $\Psi = \operatorname{supp}(t)$, and given $\Phi \subseteq \Psi$ such that $\Psi \trianglerighteq \Phi$, it follows directly from Proposition 5.20 that $t \trianglerighteq t|_\Phi$ and thus $\tilde{Q}(t) \ge \tilde{Q}(t|_\Phi)$. This, however, does not give us a lower bound on the combinatorial asymptotic subrank $\tilde{Q}(\Psi)$. The following theorem does. Our theorem extends a result in [KSS16].

Theorem 5.21. Let $\Phi \subseteq \Psi \subseteq I_1 \times \cdots \times I_k$. If $\Psi \trianglerighteq \Phi$, then

\[ \tilde{Q}(\Psi) \ge \tilde{Q}(\Phi). \]

Lemma 5.22. Let $\Phi \subseteq \Psi \subseteq I_1 \times \cdots \times I_k$. If $\Psi \trianglerighteq \Phi$, then $\tilde{Q}(\Psi) \ge Q(\Phi)$.

Proof. Pick maps $u_i : I_i \to \mathbb{Z}$ such that

\[ \sum_{i=1}^k u_i(\alpha_i) = 0 \ \text{ for } \alpha \in \Phi, \qquad \sum_{i=1}^k u_i(\alpha_i) > 0 \ \text{ for } \alpha \in \Psi \setminus \Phi. \]

Let $D$ be a free diagonal in $\Phi$ with $|D| = Q(\Phi)$ and let

\[ w_i = \sum_{x \in D_i} u_i(x). \]

Let $n \in \mathbb{N}$ and define

\[ W_i = \Bigl\{ (x_1, \dots, x_{n|D|}) \in I_i^{\times n|D|} : \sum_{j=1}^{n|D|} u_i(x_j) = n w_i \Bigr\}. \]

Then

\[ \Psi^{\times n|D|} \cap (W_1 \times \cdots \times W_k) = \Phi^{\times n|D|} \cap (W_1 \times \cdots \times W_k). \]

The inclusion $\supseteq$ is clear. To show $\subseteq$, let $(x_1, \dots, x_k) \in \Psi^{\times n|D|} \cap (W_1 \times \cdots \times W_k)$. Write $x_i = (x_{i,1}, x_{i,2}, \dots, x_{i,n|D|})$ and consider the $n|D| \times k$ matrix of evaluations

\[ \begin{pmatrix} u_1(x_{1,1}) & u_2(x_{2,1}) & \cdots & u_k(x_{k,1}) \\ u_1(x_{1,2}) & u_2(x_{2,2}) & \cdots & u_k(x_{k,2}) \\ \vdots & \vdots & & \vdots \\ u_1(x_{1,n|D|}) & u_2(x_{2,n|D|}) & \cdots & u_k(x_{k,n|D|}) \end{pmatrix}. \]

The sum of the $i$th column is $n w_i$ by definition of $W_i$, and $\sum_{i=1}^k n w_i = 0$ because $D \subseteq \Phi$. The row sums are nonnegative by definition of the maps $u_1, \dots, u_k$. We conclude that the row sums are zero. Therefore $(x_1, \dots, x_k)$ is an element of $\Phi^{\times n|D|}$.

Since $D$ is a free diagonal in $\Phi$, $D^{\times n|D|}$ is a free diagonal in $\Phi^{\times n|D|}$, and also $D^{\times n|D|} \cap (W_1 \times \cdots \times W_k)$ is a free diagonal in $\Phi^{\times n|D|} \cap (W_1 \times \cdots \times W_k)$, which in turn is equal to $\Psi^{\times n|D|} \cap (W_1 \times \cdots \times W_k)$. Therefore $D^{\times n|D|} \cap (W_1 \times \cdots \times W_k)$ is also a free diagonal in $\Psi^{\times n|D|}$, i.e.

\[ Q(\Psi^{\times n|D|}) \ge |D^{\times n|D|} \cap (W_1 \times \cdots \times W_k)|. \]

In the set $D^{\times n|D|}$ consider the strings with uniform type, i.e. where all $|D|$ elements of $D$ occur exactly $n$ times. These are clearly in $W_1 \times \cdots \times W_k$ and their number is $\binom{n|D|}{n, \dots, n}$. Therefore

\[ Q(\Psi^{\times n|D|}) \ge \binom{n|D|}{n, \dots, n} = |D|^{n|D| - o(n)}, \]

which implies $\tilde{Q}(\Psi) = \lim_{n \to \infty} Q(\Psi^{\times n|D|})^{1/(n|D|)} \ge |D|$.

Proof of Theorem 5.21. We have $\tilde{Q}(\Psi) = \lim_{n \to \infty} \tilde{Q}(\Psi^{\times n})^{1/n}$. For every $n$ we have $\Psi^{\times n} \trianglerighteq \Phi^{\times n}$, as witnessed by the maps $(x_1, \dots, x_n) \mapsto \sum_{j=1}^n u_i(x_j)$. It follows from Lemma 5.22, applied to $\Phi^{\times n} \subseteq \Psi^{\times n}$, that

\[ \lim_{n \to \infty} \tilde{Q}(\Psi^{\times n})^{1/n} \ge \lim_{n \to \infty} Q(\Phi^{\times n})^{1/n}. \]

The right-hand side is $\tilde{Q}(\Phi)$.

5.4 Cap sets

A subset $A \subseteq (\mathbb{Z}/3\mathbb{Z})^n$ is called a cap set if any line in $A$ is a point, a line being a triple of points of the form $(u, u + v, u + 2v)$. Until recently it was not known whether the maximal size of a cap set in $(\mathbb{Z}/3\mathbb{Z})^n$ grows like $3^{n - o(n)}$ or like $c^{n - o(n)}$ for some $c < 3$. Gijswijt and Ellenberg in [EG17], inspired by the work of Croot, Lev and Pach in [CLP17], settled this question, showing that $c \le 3(207 + 33\sqrt{33})^{1/3}/8 \approx 2.755$. Tao realised in [Tao16] that the cap set question may naturally be phrased as the problem of computing the size of the largest main diagonal in powers of the "cap set tensor" $\sum_\alpha e_{\alpha_1} \otimes e_{\alpha_2} \otimes e_{\alpha_3}$, where the sum is over $\alpha_1, \alpha_2, \alpha_3 \in \mathbb{F}_3$ with $\alpha_1 + \alpha_2 + \alpha_3 = 0$. Here main diagonal refers to a subset $A$ of the basis elements such that restricting the cap set tensor to $A \times A \times A$ gives the tensor $\sum_{v \in A} v \otimes v \otimes v$. We show that the cap set tensor is in the $\mathrm{GL}_3(\mathbb{F}_3)^{\times 3}$ orbit of the "reduced polynomial multiplication tensor", which was studied in [Str91], and we show how recent results follow from this connection, using Theorem 5.21.

5.4.1 Reduced polynomial multiplication

Let $t_n$ be the tensor $\sum_\alpha e_{\alpha_1} \otimes e_{\alpha_2} \otimes e_{\alpha_3}$, where the sum is over $(\alpha_1, \alpha_2, \alpha_3)$ in $\{0, 1, \dots, n-1\}^3$ such that $\alpha_1 + \alpha_2 = \alpha_3$. We call $t_n$ the reduced polynomial multiplication tensor, since $t_n$ is essentially the structure tensor of the algebra $\mathbb{F}[x]/(x^n)$ of univariate polynomials modulo the ideal generated by $x^n$. The support of $t_n$ equals

\[ \{(\alpha_1, \alpha_2, \alpha_3) \in \{0, \dots, n-1\}^3 \mid \alpha_1 + \alpha_2 = \alpha_3\}, \]

which via $\alpha_3 \mapsto n - 1 - \alpha_3$ we may identify with the set

\[ \Phi_n = \{(\alpha_1, \alpha_2, \alpha_3) \in \{0, \dots, n-1\}^3 \mid \alpha_1 + \alpha_2 + \alpha_3 = n - 1\}. \tag{5.21} \]

The support $\Phi_n$ is tight (cf. Example 5.1). Strassen proves in [Str91, Theorem 6.7], using Corollary 5.4, that $\tilde{Q}(t_n) = \tilde{Q}(\Phi_n) = z(n)$, where $z(n)$ is defined as

\[ z(n) = \frac{\gamma^n - 1}{\gamma - 1}\, \gamma^{-2(n-1)/3} \tag{5.22} \]

with $\gamma$ equal to the unique positive real solution of the equation $\frac{1}{\gamma - 1} - \frac{n}{\gamma^n - 1} = \frac{n-1}{3}$.

The following table contains values of $z(n)$ for small $n$. See also [Str91, Table 1].

  n    z(n) rounded    z(n) exact
  2    1.88988         $3/2^{2/3} = 2^{h(1/3)}$
  3    2.75510         $3(207 + 33\sqrt{33})^{1/3}/8$
  4    3.61072
  5    4.46158
  6    5.30973
  7    6.15620
  8    7.00155
  9    7.84612
  10   8.69012
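The values of $z(n)$ can be recomputed from (5.22) by solving the equation for $\gamma$ numerically. The following is a minimal sketch using bisection; it assumes, as can be checked from the equation, that the left-hand side is decreasing in $\gamma$ and that the solution satisfies $\gamma > 1$ for $n \ge 2$. The function name `z` mirrors the notation of the text; the tolerance is an arbitrary choice.

```python
def z(n, tol=1e-13):
    """Evaluate z(n) from (5.22), solving for gamma > 1 by bisection."""
    target = (n - 1) / 3

    def g(gamma):
        return 1 / (gamma - 1) - n / (gamma ** n - 1)

    lo, hi = 1.0 + 1e-6, 2.0
    while g(hi) > target:       # enlarge the bracket until g(hi) <= target
        hi *= 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) > target:
            lo = mid
        else:
            hi = mid
    gamma = (lo + hi) / 2
    return (gamma ** n - 1) / (gamma - 1) * gamma ** (-2 * (n - 1) / 3)

for n in range(2, 11):
    print(n, round(z(n), 5))
```

For $n = 2$ the equation reduces to $1/(\gamma + 1) = 1/3$, so $\gamma = 2$ and $z(2) = 3/2^{2/3}$; for $n = 3$ it reduces to $2\gamma^2 - \gamma - 4 = 0$, so $\gamma = (1 + \sqrt{33})/4$.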

In fact, [Str91, Theorem 6.7] says that the asymptotic spectrum of $t_n$ is completely determined by the support functionals, and that the possible values that the spectral points can take on $t_n$ form the closed interval $[z(n), n]$ (cf. Remark 2.21):

\[ \mathbf{X}(\mathbb{N}[t_n]) = \{\zeta^\theta|_{\mathbb{N}[t_n]} : \theta \in \mathcal{P}([3])\}, \qquad \{\phi(t_n) : \phi \in \mathbf{X}(\mathbb{N}[t_n])\} = [z(n), n]. \]

5.4.2 Cap sets

We turn to cap sets.

Definition 5.23. A three-term progression-free set is a set $A \subseteq (\mathbb{Z}/m\mathbb{Z})^n$ satisfying the following. For all $(x_1, x_2, x_3) \in A^{\times 3}$: there are $u, v \in (\mathbb{Z}/m\mathbb{Z})^n$ such that $(x_1, x_2, x_3) = (u, u + v, u + 2v)$ if and only if $x_1 = x_2 = x_3$. Let $r_3((\mathbb{Z}/m\mathbb{Z})^n)$ be the size of the largest three-term progression-free set in $(\mathbb{Z}/m\mathbb{Z})^n$ and define the regularisation $\tilde{r}_3(\mathbb{Z}/m\mathbb{Z}) = \lim_{n \to \infty} r_3((\mathbb{Z}/m\mathbb{Z})^n)^{1/n}$.

A three-term progression-free set in $(\mathbb{Z}/3\mathbb{Z})^n$ is called a cap or cap set. We next discuss an asymmetric variation on three-term progression-free sets called tri-colored sum-free sets, which are potentially larger. They are interesting, since all known upper bound techniques for the size of three-term progression-free sets turn out to be upper bounds on the size of tri-colored sum-free sets.

Definition 5.24. Let $G$ be an abelian group. Let $\Gamma \subseteq G \times G \times G$. For $i \in [3]$ we define the marginal sets $\Gamma_i = \{x \in G : \exists \alpha \in \Gamma : \alpha_i = x\}$. We say $\Gamma$ is tri-colored sum-free if the following holds. The set $\Gamma$ is a diagonal, and for any $\alpha \in \Gamma_1 \times \Gamma_2 \times \Gamma_3$: $\alpha_1 + \alpha_2 + \alpha_3 = 0$ if and only if $\alpha \in \Gamma$. (Recall that $\Gamma \subseteq I_1 \times I_2 \times I_3$ is a diagonal when any two distinct $\alpha, \beta \in \Gamma$ are distinct in all coordinates.) Let $s_3(G)$ be the size of the largest tri-colored sum-free set in $G \times G \times G$ and define the regularisation

\[ \tilde{s}_3(G) = \lim_{n \to \infty} s_3(G^{\times n})^{1/n}. \]

Equivalently, $\Gamma \subseteq G \times G \times G$ is a tri-colored sum-free set if and only if $\Gamma$ is a free diagonal in $\{\alpha \in G \times G \times G : \alpha_1 + \alpha_2 + \alpha_3 = 0\}$.

If the set $A \subseteq G = (\mathbb{Z}/m\mathbb{Z})^n$ is three-term progression-free, then the set $\Gamma = \{(a, a, -2a) : a \in A\} \subseteq G \times G \times G$ is tri-colored sum-free. Therefore, we have $\tilde{r}_3(\mathbb{Z}/m\mathbb{Z}) \le \tilde{s}_3(\mathbb{Z}/m\mathbb{Z})$.
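The passage from a three-term progression-free set $A$ to the tri-colored sum-free set $\Gamma = \{(a, a, -2a)\}$ can be verified on a small example. The sketch below is illustrative: elements of $(\mathbb{Z}/m\mathbb{Z})^n$ are encoded as tuples, the function names are hypothetical, and the example set $\{0,1\}^2 \subseteq (\mathbb{Z}/3\mathbb{Z})^2$ is a cap set of size 4.

```python
from itertools import product

def add(x, y, m):
    return tuple((a + b) % m for a, b in zip(x, y))

def scale(c, x, m):
    return tuple((c * a) % m for a in x)

def is_progression_free(A, m):
    """Definition 5.23: (u, u+v, u+2v) lies in A^3 only when v = 0."""
    A = set(A)
    for x1, x2 in product(A, repeat=2):
        x3 = add(scale(2, x2, m), scale(-1, x1, m), m)   # u + 2v with u = x1, v = x2 - x1
        if x3 in A and not (x1 == x2 == x3):
            return False
    return True

def is_tricolored_sum_free(Gamma, m):
    """Definition 5.24: Gamma is a diagonal and is cut out by the zero-sum condition."""
    Gamma = set(Gamma)
    for a, b in product(Gamma, repeat=2):
        if a != b and any(a[i] == b[i] for i in range(3)):
            return False
    marginals = [{g[i] for g in Gamma} for i in range(3)]
    for alpha in product(*marginals):
        total = add(add(alpha[0], alpha[1], m), alpha[2], m)
        if all(c == 0 for c in total) != (alpha in Gamma):
            return False
    return True

m = 3
A = [(0, 0), (0, 1), (1, 0), (1, 1)]
Gamma = [(a, a, scale(-2, a, m)) for a in A]
print(is_progression_free(A, m), is_tricolored_sum_free(Gamma, m))
```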

We summarise the recent history of results on cap sets. For clarity we focus on $m = 3$; we refer the reader to the references for the general results. Edel in [Ede04] proved the lower bound $2.21739 \le \tilde{r}_3(\mathbb{Z}/3\mathbb{Z})$. In [EG17] Ellenberg and Gijswijt proved the upper bound

\[ \tilde{r}_3(\mathbb{Z}/3\mathbb{Z}) \le 3(207 + 33\sqrt{33})^{1/3}/8 \approx 2.755. \]

Blasiak et al. [BCC+17] proved that in fact

\[ \tilde{s}_3(\mathbb{Z}/3\mathbb{Z}) \le 3(207 + 33\sqrt{33})^{1/3}/8. \]

This upper bound was shown to be an equality in [KSS16, Nor16, Peb16].

Theorem 5.25. $\tilde{s}_3(\mathbb{Z}/3\mathbb{Z}) = 3(207 + 33\sqrt{33})^{1/3}/8$.

We reprove Theorem 5.25 by proving that $\tilde{s}_3(\mathbb{Z}/m\mathbb{Z})$ equals the asymptotic subrank $z(m)$ of $t_m$ discussed in Section 5.4.1 when $m$ is a prime power. The significance of our proof lies in the explicit connection to the framework of asymptotic spectra, and not in the obtained value, which also for prime powers $m$ was already computed in [BCC+17, KSS16, Nor16, Peb16].

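Proof. We will prove $\tilde{s}_3(\mathbb{Z}/m\mathbb{Z}) = z(m)$ when $m$ is a prime power. By definition, $\tilde{s}_3(\mathbb{Z}/m\mathbb{Z})$ equals the asymptotic subrank of the set

\[ \{\alpha \in \{0, \dots, m-1\}^3 : \alpha_1 + \alpha_2 + \alpha_3 = 0 \bmod m\}, \]

which via $\alpha_3 \mapsto \alpha_3 - (m-1)$ we may identify with the set

\[ \Psi_m = \{\alpha \in \{0, \dots, m-1\}^3 : \alpha_1 + \alpha_2 + \alpha_3 = m - 1 \bmod m\}, \]

and so $\tilde{s}_3(\mathbb{Z}/m\mathbb{Z}) = \tilde{Q}(\Psi_m)$. Let

\[ \Phi_m = \{\alpha \in \{0, \dots, m-1\}^3 : \alpha_1 + \alpha_2 + \alpha_3 = m - 1\}. \]

We know $\tilde{Q}(\Phi_m) = z(m)$ (Section 5.4.1). We will show that $\tilde{Q}(\Phi_m) = \tilde{Q}(\Psi_m)$ when $m$ is a prime power. This proves the theorem.

We prove $\tilde{Q}(\Phi_m) \le \tilde{Q}(\Psi_m)$. There is a combinatorial degeneration $\Phi_m \trianglelefteq \Psi_m$. Indeed, let $u_i : \{0, \dots, m-1\} \to \{0, \dots, m-1\}$ be the identity map. If $\alpha \in \Phi_m$, then $\sum_{i=1}^3 u_i(\alpha_i) = m - 1$, and if $\alpha \in \Psi_m \setminus \Phi_m$, then $\sum_{i=1}^3 u_i(\alpha_i)$ equals $m - 1$ plus a positive multiple of $m$. (Replacing $u_3$ by $u_3 - (m - 1)$ puts these maps in the form required by Definition 5.19.) This means Theorem 5.21 applies and we thus obtain $\tilde{Q}(\Phi_m) \le \tilde{Q}(\Psi_m)$. This proves the claim.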

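We show $\tilde{Q}(\Psi_m) \le \tilde{Q}(\Phi_m)$ when $m$ is a power of the prime $p$. Let $\mathbb{F} = \mathbb{F}_p$. Let $f_m \in \mathbb{F}^m \otimes \mathbb{F}^m \otimes \mathbb{F}^m$ have support $\Psi_m$ with all nonzero coefficients equal to 1. Obviously, $\tilde{Q}(\Psi_m) \le \tilde{Q}(f_m)$. To compute $\tilde{Q}(f_m)$ we show that there is a basis in which the support of $f_m$ equals the tight set $\Phi_m$. Then $\tilde{Q}(f_m) = \tilde{Q}(\Phi_m)$ (Corollary 5.4). This implies the claim. We prepare to give the basis (which is the same basis as used in [BCC+17]). First observe that the rule $x \mapsto \binom{x}{a}$ gives a well-defined map $\mathbb{Z}/m\mathbb{Z} \to \mathbb{Z}/p\mathbb{Z}$, since for $a \in \{0, 1, \dots, m-1\}$, if $x = y \bmod m$, then $\binom{x}{a} = \binom{y}{a} \bmod p$ by Lucas' theorem. Let $(e_x)_x$ be the standard basis of $\mathbb{F}^m$. The elements $(\sum_{x \in \mathbb{Z}/m\mathbb{Z}} \binom{x}{a} e_x)_{a \in \mathbb{Z}/m\mathbb{Z}}$ form a basis of $\mathbb{F}^m$, since the matrix $(\binom{x}{a})_{a,x}$ is upper triangular with ones on the diagonal. We will now rewrite $f_m$ in the basis $((\sum_x \binom{x}{a} e_x)_a, (\sum_y \binom{y}{b} e_y)_b, (\sum_z \binom{z}{c} e_z)_c)$. Observe that $\binom{x}{m-1}$ equals 1 if and only if $x$ equals $m - 1$, and hence

\[ f_m = \sum_{\substack{x, y, z \in \mathbb{Z}/m\mathbb{Z}: \\ x + y + z = m - 1}} e_x \otimes e_y \otimes e_z = \sum_{x, y, z \in \mathbb{Z}/m\mathbb{Z}} \binom{x + y + z}{m - 1} e_x \otimes e_y \otimes e_z. \]

The identity $\binom{x + y + z}{w} = \sum \binom{x}{a} \binom{y}{b} \binom{z}{c}$, with sum over $a, b, c \in \{0, 1, \dots, m-1\}$ such that $a + b + c = w$, is true, and thus

\[ \sum_{x, y, z \in \mathbb{Z}/m\mathbb{Z}} \binom{x + y + z}{m - 1} e_x \otimes e_y \otimes e_z = \sum_{x, y, z \in \mathbb{Z}/m\mathbb{Z}} \; \sum_{\substack{a, b, c \in \{0, 1, \dots, m-1\}: \\ a + b + c = m - 1}} \binom{x}{a} \binom{y}{b} \binom{z}{c} e_x \otimes e_y \otimes e_z. \tag{5.23} \]

We may simply rewrite (5.23) as

\[ \sum_{\substack{a, b, c \in \{0, 1, \dots, m-1\}: \\ a + b + c = m - 1}} \; \sum_{x \in \mathbb{Z}/m\mathbb{Z}} \binom{x}{a} e_x \otimes \sum_{y \in \mathbb{Z}/m\mathbb{Z}} \binom{y}{b} e_y \otimes \sum_{z \in \mathbb{Z}/m\mathbb{Z}} \binom{z}{c} e_z. \]

Therefore, with respect to the basis $((\sum_x \binom{x}{a} e_x)_a, (\sum_y \binom{y}{b} e_y)_b, (\sum_z \binom{z}{c} e_z)_c)$, the support of $f_m$ equals the tight set $\Phi_m$. (And even stronger, $f_m$ is isomorphic to the tensor $\mathbb{F}[x]/(x^m)$ of Section 5.4.1.)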
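The two facts used in this basis change, the well-definedness of $x \mapsto \binom{x}{a}$ modulo $p$ and the expansion (5.23), can be checked by brute force for small prime powers. The sketch below is an illustration with hypothetical function names; it relies on `math.comb` (Python 3.8 or later), which returns 0 when the lower index exceeds the upper one.

```python
from itertools import product
from math import comb

def binomial_is_well_defined(p, e):
    """Check binom(x, a) = binom(x + m, a) mod p for m = p^e and 0 <= a < m."""
    m = p ** e
    return all(comb(x, a) % p == comb(x + m, a) % p
               for x in range(2 * m) for a in range(m))

def expansion_matches_indicator(p, e):
    """Check that the right-hand side of (5.23) reproduces the support Psi_m mod p."""
    m = p ** e
    for x, y, z in product(range(m), repeat=3):
        indicator = 1 if (x + y + z) % m == m - 1 else 0
        expanded = sum(
            comb(x, a) * comb(y, b) * comb(z, m - 1 - a - b)
            for a in range(m) for b in range(m - a)
        ) % p
        if expanded != indicator:
            return False
    return True

for p, e in [(2, 1), (2, 2), (3, 1), (5, 1)]:
    print(p ** e, binomial_is_well_defined(p, e), expansion_matches_indicator(p, e))
```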

Remark 5.26. Why did we reprove the cap set result Theorem 5.25? Our motivation, being interested in the asymptotic spectrum of tensors, was to see if the techniques in the cap set papers are stronger than the Strassen support functionals, i.e. whether they give any new spectral points. Above we have seen that the cap set result itself can be proven with the support functionals. In fact, we show in Section 4.6 that for oblique tensors the asymptotic slice-rank, which was introduced in [Tao16] to give a concise proof of [EG17], equals the minimum value over the support functionals. In Section 6.11 we show that for all complex tensors, asymptotic slice-rank equals the minimum value of the quantum functionals.


5.5 Graph tensors

In this section we briefly discuss the application that motivated us to prove Theorem 5.7 in [CVZ16], namely upper bounding the asymptotic rank of so-called graph tensors. Graph tensors are defined as follows.

Let $G = (V, E)$ be a graph (or hypergraph) with vertex set $V$ and edge set $E$. Let $n \in \mathbb{N}$. Let $(b_i)_{i \in [n]}$ be the standard basis of $\mathbb{F}^n$. We define the graph tensor $T_n(G)$ as

\[ T_n(G) = \sum_{i \in [n]^E} \bigotimes_{v \in V} \Bigl( \bigotimes_{e \in E:\, v \in e} b_{i_e} \Bigr), \]

seen as a $|V|$-tensor. Given a vertex $v \in V$, let $d(v)$ denote the degree of $v$, that is, $d(v)$ equals the number of edges $e \in E$ that contain $v$. Then $T_n(G)$ is naturally in $\bigotimes_{v \in V} (\mathbb{F}^n)^{\otimes d(v)}$. We write $T(G)$ for $T_2(G)$. For example, for the complete graph on four vertices $K_4$, the graph tensor is

\[ T(K_4) = \bigotimes_{e \in E(K_4)} T(e) = \sum_{i \in \{0,1\}^6} (b_{i_1} \otimes b_{i_2} \otimes b_{i_5}) \otimes (b_{i_2} \otimes b_{i_3} \otimes b_{i_6}) \otimes (b_{i_3} \otimes b_{i_4} \otimes b_{i_5}) \otimes (b_{i_1} \otimes b_{i_4} \otimes b_{i_6}), \]

where the middle expression is the tensor product, over the six edges $e$ of $K_4$, of the graph tensor of the four-vertex graph with the single edge $e$. This tensor lives in $(\mathbb{C}^8)^{\otimes 4}$. Let $K_k$ be the complete graph on $k$ vertices. The $2 \times 2$ matrix multiplication tensor $\langle 2, 2, 2 \rangle$ equals the tensor $T(K_3)$. Define the exponent $\omega(T(G)) = \log_2 \tilde{R}(T(G))$. We study the exponent per edge $\tau(T(G)) = \omega(T(G)) / |E(G)|$.
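To make the definition concrete, the following sketch builds $T_n(G)$ as a NumPy array. The input format is our own choice, not the thesis's: each vertex is given as the ordered list of labels of its incident edges, and the $d(v)$ legs at a vertex are merged into one index of size $n^{d(v)}$. The function name `graph_tensor` is hypothetical.

```python
import itertools

import numpy as np


def graph_tensor(incidence, n=2):
    """Graph tensor T_n(G); incidence[v] is the ordered list of edge labels at vertex v."""
    edges = sorted({e for inc in incidence for e in inc})
    shape = tuple(n ** len(inc) for inc in incidence)
    tensor = np.zeros(shape, dtype=int)
    # Sum over all assignments i : E -> [n] of a basis index to every edge.
    for values in itertools.product(range(n), repeat=len(edges)):
        i = dict(zip(edges, values))
        index = []
        for inc in incidence:
            pos = 0
            for e in inc:  # merge the legs b_{i_e} at this vertex into one index
                pos = pos * n + i[e]
            index.append(pos)
        tensor[tuple(index)] += 1
    return tensor


# K_4 with the edge labelling used in the displayed formula for T(K_4).
k4 = graph_tensor([[1, 2, 5], [2, 3, 6], [3, 4, 5], [1, 4, 6]])
print(k4.shape)  # (8, 8, 8, 8)

# K_3 with edges labelled so that the vertices carry (i,j), (j,k), (k,i).
k3 = graph_tensor([[0, 1], [1, 2], [2, 0]])
matmul = np.zeros((4, 4, 4), dtype=int)
for i, j, k in itertools.product(range(2), repeat=3):
    matmul[2 * i + j, 2 * j + k, 2 * k + i] = 1  # e_ij (x) e_jk (x) e_ki
print(np.array_equal(k3, matmul))
```

Flattening `k4` along two vertices versus the other two gives a matrix of rank $2^4$, one factor 2 for each of the four edges crossing the cut.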

Our result is an upper bound on $\tau(T(K_4))$ in terms of the combinatorial asymptotic subrank $\tilde{Q}(\Phi_{(2,2)})$, which we studied in Theorem 5.16.

Theorem 5.27. For any $q \ge 1$,

\[ \tau(T(K_4)) \le \log_q \Bigl( \frac{q + 2}{\tilde{Q}(\Phi_{(2,2)})} \Bigr). \]

Proof. We apply a generalisation of the laser method. See [CVZ16].

Corollary 5.28. Let $k \ge 4$. Then $\tau(T(K_k)) \le 0.772943$.

Proof. In the bound of Theorem 5.27 we plug in the value $\tilde{Q}(\Phi_{(2,2)}) = 2$ from Theorem 5.16. Then we optimise over $q$ to obtain the value 0.772943. By a "covering argument" we can show that $\tau(T(K_k))$ is non-increasing when $k$ increases.

For $k \ge 4$, Corollary 5.28 improves the upper bound $\tau(T(K_k)) \le 0.790955$ that can be derived from the well-known upper bound of Le Gall [LG14] on the exponent of matrix multiplication $\omega = \omega(T(K_3))$.
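The numerical values in Corollary 5.28 can be reproduced with a few lines. The sketch assumes that $q$ ranges over the integers $q \ge 2$ (the value 0.772943 is attained at an integer) and that Le Gall's bound is $\omega \le 2.3728639$; both are our assumptions about what the text abbreviates, and the function name is ours.

```python
from math import log


def tau_upper_bound(q, subrank=2.0):
    """Bound of Theorem 5.27: log_q((q + 2) / subrank)."""
    return log((q + 2) / subrank, q)


# Optimise over integer q >= 2 (assumption: q is an integer parameter).
best_q = min(range(2, 101), key=tau_upper_bound)
best_value = tau_upper_bound(best_q)

# Exponent per edge of K_3 implied by omega <= 2.3728639 (assumed value from [LG14]).
le_gall_per_edge = 2.3728639 / 3

print(best_q, round(best_value, 6), round(le_gall_per_edge, 6))
```

Treating $q$ as a real parameter would give a marginally smaller number, so the quoted value corresponds to the integer optimum.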


A standard "flattening argument" (i.e. using the gauge points from the asymptotic spectrum) yields the lower bound $\tau(T(K_k)) \ge \frac{1}{2} \frac{k}{k-1}$ if $k$ is even, and $\tau(T(K_k)) \ge \frac{1}{2} \frac{k+1}{k}$ if $k$ is odd. As a consequence, if the exponent of matrix multiplication $\omega$ equals 2, then $\tau(T(K_4)) = \tau(T(K_3)) = \frac{2}{3}$. We raise the following question: is there a $k \ge 5$ such that $\tau(T(K_k)) < \frac{2}{3}$?
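The flattening lower bound is easy to tabulate exactly. The helper below (the name `tau_lower_bound` is ours) simply evaluates the two displayed formulas with rational arithmetic.

```python
from fractions import Fraction


def tau_lower_bound(k):
    """Flattening lower bound on tau(T(K_k)) stated above, as an exact fraction."""
    if k % 2 == 0:
        return Fraction(k, 2 * (k - 1))
    return Fraction(k + 1, 2 * k)


for k in range(3, 9):
    print(k, tau_lower_bound(k))
```

For $k = 3$ and $k = 4$ the bound equals $2/3$, while for $k \ge 5$ it drops below $2/3$, so the flattening argument alone does not settle the question raised above.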

Tensor surgery; cycle graphs

For graph tensors given by sparse graphs, good upper bounds on the asymptotic rank can be obtained with an entirely different method called tensor surgery, which we introduced in [CZ18]. As an illustration, let me mention the results we obtained for cycle graphs with tensor surgery. Recall $\omega = \log_2 \tilde{R}(\langle 2, 2, 2 \rangle) = \log_2 \tilde{R}(T(C_3))$. Let $\omega_k = \log_2 \tilde{R}(T(C_k))$. First observe that $\omega_k = k$ for even $k$. For odd $k$, trivially $k - 1 \le \omega_k \le k$. We prove the following.

Theorem 5.29. For $k, \ell$ odd, $\omega_{k+\ell-1} \le \omega_k + \omega_\ell$.

Corollary 5.30. Let $k \ge 5$ be odd. Then $\omega_k \le \omega_{k-2} + \omega_3$, and thus $\omega_k \le \frac{k-1}{2}\,\omega$.

Corollary 5.31. If $\omega = 2$, then $\omega_k = k - 1$ for all odd $k$.

See [CZ18] for the proofs.
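For orientation, here is how the two corollaries follow from Theorem 5.29 by a short induction; this is our own unrolling of the argument, using only $\omega_3 = \omega$ and the trivial bound $k - 1 \le \omega_k$ stated above.

```latex
% Theorem 5.29 with (k, \ell) replaced by (k-2, 3) gives the first inequality;
% iterating it (k-3)/2 times reduces to the triangle C_3.
\[
  \omega_k \;\le\; \omega_{k-2} + \omega_3
          \;\le\; \omega_{k-4} + 2\,\omega_3
          \;\le\; \cdots
          \;\le\; \omega_3 + \tfrac{k-3}{2}\,\omega_3
          \;=\; \tfrac{k-1}{2}\,\omega .
\]
% If \omega = 2, the right-hand side equals k-1, and combined with
% k-1 \le \omega_k this gives \omega_k = k-1 for all odd k.
```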

5.6 Conclusion

Tight tensors are a subfamily of the oblique tensors. For tight 3-tensors, the minimum over the support functionals equals the asymptotic subrank. This is proven via the Coppersmith–Winograd method. The construction is in fact of a very combinatorial nature. In this chapter we studied the combinatorial notion of subrank. We proved that combinatorial subrank is monotone under combinatorial degeneration. We studied the cap set problem via the support functionals. We extended the Coppersmith–Winograd method to higher-order tensors and applied this method to study graph tensors.

Chapter 6

Universal points in the asymptotic spectrum of tensors; entanglement polytopes, moment polytopes

This chapter is based on joint work with Matthias Christandl and Péter Vrana [CVZ18].

6.1 Introduction

In Chapter 4, following Strassen, we introduced the asymptotic spectrum of tensors $\mathbf{X}(\mathcal{T}) = \mathbf{X}(\mathcal{T}, \le)$ for $\mathcal{T}$ the semiring of $k$-tensors over $\mathbb{F}$, for some fixed integer $k$ and field $\mathbb{F}$, with addition given by direct sum $\oplus$, multiplication given by tensor product $\otimes$, and preorder $\le$ given by restriction (or degeneration). The asymptotic spectrum characterises the asymptotic rank $\tilde{R}$ and the asymptotic subrank $\tilde{Q}$. We have seen that the asymptotic rank plays an important role in algebraic complexity theory: the asymptotic rank of the matrix multiplication tensor $\langle 2, 2, 2 \rangle = \sum_{i,j,k \in [2]} e_{ij} \otimes e_{jk} \otimes e_{ki} \in \mathbb{F}^4 \otimes \mathbb{F}^4 \otimes \mathbb{F}^4$ characterises the exponent of the arithmetic complexity of multiplying two $n \times n$ matrices over $\mathbb{F}$, that is, $\tilde{R}(\langle 2, 2, 2 \rangle) = 2^\omega$. We have also seen in Chapter 5 how one may use the asymptotic subrank to upper bound the size of combinatorial objects like, for example, cap sets in $\mathbb{F}_3^n$.

New results in this chapter

So far, the only elements we have seen in $\mathbf{X}(\mathcal{T})$ (i.e. universal spectral points, cf. Section 2.13) are the gauge points (Section 4.3). Besides that, we have seen in Section 4.4 that the Strassen support functionals $\zeta^\theta$ are in $\mathbf{X}(\{\text{oblique}\})$. In this chapter we introduce for the first time an explicit infinite family of universal spectral points (over the complex numbers), the quantum functionals. Our new insight is to use the moment polytope. Given a tensor $t \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$, the moment polytope $\mathbf{P}(t)$ is a convex polytope that carries representation-theoretic information about $t$. The quantum functionals are defined as maximisations over moment polytopes.

Let me immediately put a disclaimer. The quantum functionals do not give a new lower bound on the asymptotic rank of matrix multiplication $\langle 2, 2, 2 \rangle$; namely, the quantum functionals give the same lower bound as the gauge points. Also, the quantum functionals being defined for tensors over the complex numbers only, we do not expect to get new upper bounds on the size of combinatorial objects that are "like cap sets".

So what have we gained? Arguably, we have found the "right" viewpoint on how to construct universal spectral points for tensors. (In fact, after writing our paper [CVZ18], we realised that Strassen had begun a study of moment polytopes in the appendix of the German survey [Str05]. Strassen did not construct new universal spectral points however, not in that publication at least.) If there are more universal spectral points, then our viewpoint may lead the way to finding them. Moreover, whereas no efficient algorithm is known for evaluating the support functionals, the moment polytope viewpoint may open the way to having efficient algorithms for evaluating the quantum functionals.

In Sections 6.2–6.7 we work towards the construction of the quantum functionals and we give a proof that they are universal spectral points. In Sections 6.8–6.10 we compare the quantum functionals and the support functionals, and in Section 6.11 we relate asymptotic slice rank to the quantum functionals.

In this chapter we will focus on 3-tensors, but the theory naturally generalises to $k$-tensors.

6.2 Schur–Weyl duality

For background on representation theory we refer to [Kra84], [Ful97] and [GW09]. Let $S_n$ be the symmetric group on $n$ symbols. Let $S_n$ act on the tensor space $(\mathbb{C}^d)^{\otimes n}$ by permuting the tensor legs,

\[ \pi \cdot v_1 \otimes \cdots \otimes v_n = v_{\pi^{-1}(1)} \otimes \cdots \otimes v_{\pi^{-1}(n)}, \qquad \pi \in S_n. \]

Let $\mathrm{GL}_d$ be the general linear group of $\mathbb{C}^d$. Let $\mathrm{GL}_d$ act on $(\mathbb{C}^d)^{\otimes n}$ via the diagonal embedding $\mathrm{GL}_d \to \mathrm{GL}_d^{\times n} : g \mapsto (g, \ldots, g)$,

\[ g \cdot v_1 \otimes \cdots \otimes v_n = (g v_1) \otimes \cdots \otimes (g v_n), \qquad g \in \mathrm{GL}_d. \]
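As a sanity check of the two actions just defined, the sketch below realises them on NumPy arrays of shape $(d, \ldots, d)$ and verifies numerically that they commute. The function names and the 0-based encoding of permutations (`perm[m]` is the image of leg `m`) are our own conventions.

```python
import numpy as np


def permute_legs(tensor, perm):
    """pi . (v_1 x ... x v_n) = v_{pi^-1(1)} x ... x v_{pi^-1(n)}, with perm[m] = pi(m)."""
    inverse = tuple(int(i) for i in np.argsort(perm))
    return np.transpose(tensor, axes=inverse)


def act_diagonally(tensor, g):
    """Apply the matrix g to every tensor leg."""
    for axis in range(tensor.ndim):
        # Contract g with the given leg and move the new leg back into place.
        tensor = np.moveaxis(np.tensordot(g, tensor, axes=(1, axis)), 0, axis)
    return tensor


rng = np.random.default_rng(0)
d, perm = 3, (1, 2, 0)
g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
t = rng.normal(size=(d, d, d))

lhs = permute_legs(act_diagonally(t, g), perm)
rhs = act_diagonally(permute_legs(t, perm), g)
print(np.allclose(lhs, rhs))
```

A random complex matrix is invertible with probability one, but the check does not actually depend on invertibility.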

The actions of $S_n$ and $\mathrm{GL}_d$ commute, so we have a well-defined action of the product group $S_n \times \mathrm{GL}_d$ on $(\mathbb{C}^d)^{\otimes n}$. Schur–Weyl duality describes the decomposition of the space $(\mathbb{C}^d)^{\otimes n}$ into a direct sum of irreducible $S_n \times \mathrm{GL}_d$ representations. This decomposition is

\[ (\mathbb{C}^d)^{\otimes n} \cong \bigoplus_{\lambda \vdash_d n} [\lambda] \otimes S_\lambda(\mathbb{C}^d) \tag{6.1} \]

with $[\lambda]$ an irreducible $S_n$-representation of type $\lambda$, and $S_\lambda(\mathbb{C}^d)$ an irreducible $\mathrm{GL}_d$-representation of type $\lambda$ when $\ell(\lambda) \le d$ and 0 when $\ell(\lambda) > d$. We use the notation $\lambda \vdash_d n$ for the partitions of $n$ with at most $d$ parts. Let

\[ P_\lambda : (\mathbb{C}^d)^{\otimes n} \to (\mathbb{C}^d)^{\otimes n} \]

be the equivariant projector onto the isotypical component of type $\lambda$, i.e. onto the subspace of $(\mathbb{C}^d)^{\otimes n}$ isomorphic to $[\lambda] \otimes S_\lambda(\mathbb{C}^d)$. The projector $P_\lambda$ is given by the action of the group algebra element

\[ P_\lambda = \Bigl( \frac{\dim[\lambda]}{n!} \Bigr)^{2} \sum_{T \in \mathrm{Tab}(\lambda)} c_T \in \mathbb{C}[S_n], \]

where $\mathrm{Tab}(\lambda)$ is the set of Young tableaux of shape $\lambda$ filled with $[n]$, and with $c_T$ the Young symmetrizer

\[ c_T = \sum_{\sigma \in C(T)} \operatorname{sgn}(\sigma)\, \sigma \sum_{\pi \in R(T)} \pi, \]

where $C(T), R(T) \subseteq S_n$ are the subgroups of permutations inside columns and permutations inside rows, respectively. The element $P_\lambda$ is a minimal central idempotent in $\mathbb{C}[S_n]$, and $\sum_{\lambda \vdash n} P_\lambda = e$.

Back to the decomposition of $(\mathbb{C}^d)^{\otimes n}$. We need a handle on the size of the components in the direct sum decomposition (6.1). For our application it is good to think of $d$ as a constant and $n$ as a large number. The number of summands in the direct sum decomposition (6.1) is upper bounded by a polynomial in $n$,

\[ |\{\lambda \vdash_d n\}| \le (n+1)^d, \]

i.e. there are only few summands compared to the total dimension $d^n$. There are the following well-known bounds on the dimensions of the irreducible representations $[\lambda]$ and $S_\lambda(\mathbb{C}^d)$ that make up the summands:

\[ \frac{n!}{\prod_{\ell=1}^{d} (\lambda_\ell + d - \ell)!} \le \dim[\lambda] \le \frac{n!}{\prod_{\ell=1}^{d} \lambda_\ell!}, \tag{6.2} \]

\[ \dim S_\lambda(\mathbb{C}^d) \le (n+1)^{d(d-1)/2}. \tag{6.3} \]

Let $p \in \mathbb{R}^n$ be a probability vector, i.e. $\sum_{i=1}^n p_i = 1$ and $p_i \ge 0$ for $i \in [n]$. Let $H(p)$ be the Shannon entropy of the probability vector $p$,

\[ H(p) = \sum_{i=1}^{n} p_i \log_2 \frac{1}{p_i}. \]

For $\alpha \in [0, 1]$ let $h(\alpha) = H((\alpha, 1 - \alpha))$ be the binary entropy. For a partition $\lambda = (\lambda_1, \ldots, \lambda_\ell) \vdash n$ let $\bar{\lambda} = \lambda/n = (\lambda_1/n, \ldots, \lambda_\ell/n)$ be the probability vector obtained by normalising $\lambda$.

Let $\lambda \vdash n$. For $N \in \mathbb{N}$ let $N\lambda = (N\lambda_1, N\lambda_2, \ldots, N\lambda_\ell)$ be the stretched partition. We see that, asymptotically in the stretching factor $N$, the dimension of $[N\lambda]$ behaves like a multinomial coefficient, and

\[ 2^{N n H(\bar{\lambda}) - o(N)} \le \dim[N\lambda] \le 2^{N n H(\bar{\lambda})}. \tag{6.4} \]
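The counting bound, the bounds (6.2) and the upper bound in (6.4) (with $N = 1$) can be checked by machine for small $n$ and $d$. The sketch below computes $\dim[\lambda]$ with the standard determinantal form of the hook length formula, $\dim[\lambda] = n! \prod_{i<j}(l_i - l_j) / \prod_i l_i!$ with $l_i = \lambda_i + d - i$; this formula is not stated in the text and is brought in here as an assumption, and all function names are ours.

```python
from math import factorial, log2, prod


def partitions(n, d, largest=None):
    """Yield the partitions of n with at most d parts, zero-padded to length d."""
    if largest is None:
        largest = n
    if d == 0:
        if n == 0:
            yield ()
        return
    for first in range(min(n, largest), -1, -1):
        for rest in partitions(n - first, d - 1, first):
            yield (first,) + rest


def dim_symmetric_irrep(lam):
    """dim[lambda] via n! * prod_{i<j}(l_i - l_j) / prod_i l_i!, l_i = lambda_i + d - i."""
    d, n = len(lam), sum(lam)
    l = [lam[i] + d - 1 - i for i in range(d)]
    diffs = prod(l[i] - l[j] for i in range(d) for j in range(i + 1, d))
    return factorial(n) * diffs // prod(factorial(x) for x in l)


def entropy(p):
    return sum(x * log2(1 / x) for x in p if x > 0)


def check_bounds(n, d):
    lams = list(partitions(n, d))
    assert len(lams) <= (n + 1) ** d  # number of summands in (6.1)
    for lam in lams:
        dim = dim_symmetric_irrep(lam)
        shifted = prod(factorial(lam[i] + d - 1 - i) for i in range(d))
        assert dim * shifted >= factorial(n)  # lower bound in (6.2)
        assert dim * prod(factorial(x) for x in lam) <= factorial(n)  # upper bound in (6.2)
        h = entropy([x / n for x in lam])
        assert dim <= 2 ** (n * h) * (1 + 1e-9)  # upper bound in (6.4), N = 1
    return len(lams)


print(check_bounds(8, 3))
```

The lower bound in (6.4) is asymptotic in $N$ and is therefore not tested here.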

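6.3 Kronecker and Littlewood–Richardson coefficients $g_{\lambda\mu\nu}$, $c^{\lambda}_{\mu\nu}$

Let $\mu, \nu \vdash n$. Let $S_n \to S_n \times S_n : \pi \mapsto (\pi, \pi)$ be the diagonal embedding. Consider the decomposition of the tensor product $[\mu] \otimes [\nu]$ restricted along the diagonal embedding,

\[ [\mu] \otimes [\nu]\, {\downarrow}^{S_n \times S_n}_{S_n} \cong \bigoplus_{\lambda \vdash n} \mathrm{Hom}_{S_n}([\lambda], [\mu] \otimes [\nu]) \otimes [\lambda]. \]

Define the Kronecker coefficient

\[ g_{\lambda\mu\nu} = \dim \mathrm{Hom}_{S_n}([\lambda], [\mu] \otimes [\nu]), \]

i.e. $g_{\lambda\mu\nu}$ is the multiplicity of $[\lambda]$ in $[\mu] \otimes [\nu]$.

Let $\lambda$ be a partition with at most $a + b$ parts. Let $\mathrm{GL}_a \times \mathrm{GL}_b \to \mathrm{GL}_{a+b} : (A, B) \mapsto A \oplus B$ be the block-diagonal embedding. Consider the decomposition of the representation $S_\lambda(\mathbb{C}^{a+b})$ restricted along the block-diagonal embedding,

\[ S_\lambda(\mathbb{C}^{a+b})\, {\downarrow}^{\mathrm{GL}_{a+b}}_{\mathrm{GL}_a \times \mathrm{GL}_b} \cong \bigoplus_{\mu \vdash_a,\, \nu \vdash_b} H^{\lambda}_{\mu\nu} \otimes S_\mu(\mathbb{C}^a) \otimes S_\nu(\mathbb{C}^b) \]

with

\[ H^{\lambda}_{\mu\nu} = \mathrm{Hom}_{\mathrm{GL}_a \times \mathrm{GL}_b}\bigl(S_\mu(\mathbb{C}^a) \otimes S_\nu(\mathbb{C}^b),\, S_\lambda(\mathbb{C}^{a+b})\bigr). \]

Define the Littlewood–Richardson coefficient $c^{\lambda}_{\mu\nu} = \dim H^{\lambda}_{\mu\nu}$.

For partitions $\lambda$ and $\lambda'$, define $\lambda + \lambda'$ elementwise. The Kronecker and the Littlewood–Richardson coefficients have the following semigroup property (see e.g. [CHM07]).

Lemma 6.1. Let $\lambda, \mu, \nu, \alpha, \beta, \gamma$ be partitions.

(i) If $g_{\lambda\mu\nu} > 0$ and $g_{\alpha\beta\gamma} > 0$, then $g_{\lambda+\alpha,\, \mu+\beta,\, \nu+\gamma} > 0$.

(ii) If $c^{\lambda}_{\mu\nu} > 0$ and $c^{\alpha}_{\beta\gamma} > 0$, then $c^{\lambda+\alpha}_{\mu+\beta,\, \nu+\gamma} > 0$.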


6.4 Entropy inequalities

The semigroup properties imply the following lemma. Of this lemma, the first statement can be found in a paper by Christandl and Mitchison [CM06], while we do not know of any source that explicitly states the second statement. For the convenience of the reader we give the proofs of both statements.

Lemma 6.2. Let $\lambda, \mu, \nu$ be partitions.

(i) If $g_{\lambda\mu\nu} > 0$, then $H(\bar{\lambda}) \le H(\bar{\mu}) + H(\bar{\nu})$.

(ii) If $c^{\lambda}_{\mu\nu} > 0$, then $H(\bar{\lambda}) \le \frac{|\mu|}{|\mu|+|\nu|} H(\bar{\mu}) + \frac{|\nu|}{|\mu|+|\nu|} H(\bar{\nu}) + h\bigl(\frac{|\mu|}{|\mu|+|\nu|}\bigr)$.

Proof. (i) Let $g_{\lambda\mu\nu} > 0$. Suppose $\lambda, \mu, \nu \vdash n$. Let $N \in \mathbb{N}$. Then Lemma 6.1 implies $g_{N\lambda, N\mu, N\nu} > 0$. This means $\mathrm{Hom}_{S_{nN}}([N\lambda], [N\mu] \otimes [N\nu]) \ne 0$, which implies $\dim[N\lambda] \le \dim[N\mu] \dim[N\nu]$. From (6.4) we have the dimension bounds

\[ 2^{N n H(\bar{\lambda}) - o(N)} \le \dim[N\lambda], \]

\[ \dim[N\mu] \le 2^{N n H(\bar{\mu})}, \]

\[ \dim[N\nu] \le 2^{N n H(\bar{\nu})}. \]

Thus $N n H(\bar{\lambda}) - o(N) \le N n H(\bar{\mu}) + N n H(\bar{\nu})$. Divide by $Nn$ and let $N$ go to infinity to get $H(\bar{\lambda}) \le H(\bar{\mu}) + H(\bar{\nu})$.

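(ii) We restrict the decomposition

\[ (\mathbb{C}^{a+b})^{\otimes n} \cong \bigoplus_{\lambda \vdash_{a+b} n} [\lambda] \otimes S_\lambda(\mathbb{C}^{a+b}) \]

along the block-diagonal embedding to get

\[ (\mathbb{C}^{a+b})^{\otimes n}\, {\downarrow}^{\mathrm{GL}_{a+b}}_{\mathrm{GL}_a \times \mathrm{GL}_b} \cong \bigoplus_{\lambda \vdash_{a+b} n} [\lambda] \otimes S_\lambda(\mathbb{C}^{a+b})\, {\downarrow}^{\mathrm{GL}_{a+b}}_{\mathrm{GL}_a \times \mathrm{GL}_b} \]

\[ \cong \bigoplus_{\lambda \vdash_{a+b} n} [\lambda] \otimes \bigoplus_{\mu \vdash_a,\, \nu \vdash_b} \mathbb{C}^{c^{\lambda}_{\mu\nu}} \otimes S_\mu(\mathbb{C}^a) \otimes S_\nu(\mathbb{C}^b) \]

\[ \cong \bigoplus_{\mu \vdash_a,\, \nu \vdash_b} \Bigl( \bigoplus_{\lambda \vdash_{a+b} n} [\lambda] \otimes \mathbb{C}^{c^{\lambda}_{\mu\nu}} \Bigr) \otimes S_\mu(\mathbb{C}^a) \otimes S_\nu(\mathbb{C}^b). \]

On the other hand,

\[ (\mathbb{C}^{a+b})^{\otimes n}{\downarrow} \cong (\mathbb{C}^a \oplus \mathbb{C}^b)^{\otimes n}{\downarrow} \cong (\mathbb{C}^a)^{\otimes n} \oplus \bigl((\mathbb{C}^a)^{\otimes n-1} \otimes \mathbb{C}^b\bigr) \oplus \cdots \oplus (\mathbb{C}^b)^{\otimes n}{\downarrow} \]

\[ \cong \bigoplus_{k=0}^{n} \mathbb{C}^{\binom{n}{k}} \otimes \bigoplus_{\mu \vdash_a k} \bigl([\mu] \otimes S_\mu(\mathbb{C}^a)\bigr) \otimes \bigoplus_{\nu \vdash_b n-k} \bigl([\nu] \otimes S_\nu(\mathbb{C}^b)\bigr) \]

\[ \cong \bigoplus_{k=0}^{n} \; \bigoplus_{\mu \vdash_a k,\, \nu \vdash_b n-k} \bigl( \mathbb{C}^{\binom{n}{k}} \otimes [\mu] \otimes [\nu] \bigr) \otimes S_\mu(\mathbb{C}^a) \otimes S_\nu(\mathbb{C}^b). \]

Suppose $c^{\lambda}_{\mu\nu} > 0$. Comparing the above expressions gives the inequality $\dim[\lambda] \le \binom{n}{|\mu|} \dim[\mu] \dim[\nu]$. By the semigroup property (Lemma 6.1) we have $c^{N\lambda}_{N\mu, N\nu} > 0$ for all $N \in \mathbb{N}$. Thus $\dim[N\lambda] \le \binom{Nn}{N|\mu|} \dim[N\mu] \dim[N\nu]$ for all $N \in \mathbb{N}$. Then from (6.4) follows

\[ 2^{N n H(\bar{\lambda}) - o(N)} \le 2^{N n\, h(|\mu|/n)}\, 2^{N |\mu| H(\bar{\mu})}\, 2^{N |\nu| H(\bar{\nu})}. \]

We conclude $H(\bar{\lambda}) \le h\bigl(\frac{|\mu|}{n}\bigr) + \frac{|\mu|}{n} H(\bar{\mu}) + \frac{|\nu|}{n} H(\bar{\nu})$.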

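Let $x = (x^{(1)}, x^{(2)}, x^{(3)})$ be a triple of probability vectors $x^{(i)} \in \mathbb{R}^{n_i}$. Let $\theta \in \Theta$ be a weighting. Let $H_\theta(x)$ be the $\theta$-weighted average of the Shannon entropies of the probability vectors $x^{(1)}$, $x^{(2)}$ and $x^{(3)}$,

\[ H_\theta(x) = \theta(1) H(x^{(1)}) + \theta(2) H(x^{(2)}) + \theta(3) H(x^{(3)}). \]

(Note that this notation is slightly different from the notation used in Chapter 4.) We will use the notation $\lambda \vdash_3 n$ to say that $\lambda$ is a triple of partitions of $n$, i.e. $\lambda$ equals $(\lambda^{(1)}, \lambda^{(2)}, \lambda^{(3)})$ where each $\lambda^{(i)}$ is a partition of $n$. We write $\bar{\lambda}$ for the normalised triple $(\bar{\lambda}^{(1)}, \bar{\lambda}^{(2)}, \bar{\lambda}^{(3)})$.

Lemma 6.3. Let $\lambda, \mu, \nu$ be three triples of partitions.

(i) If $g_{\lambda^{(i)} \mu^{(i)} \nu^{(i)}} > 0$ for all $i$, then $2^{H_\theta(\bar{\lambda})} \le 2^{H_\theta(\bar{\mu})}\, 2^{H_\theta(\bar{\nu})}$.

(ii) If $\mu \vdash_3 m$, $\nu \vdash_3 n - m$ and $c^{\lambda^{(i)}}_{\mu^{(i)} \nu^{(i)}} > 0$ for all $i$, then $2^{H_\theta(\bar{\lambda})} \le 2^{H_\theta(\bar{\mu})} + 2^{H_\theta(\bar{\nu})}$.

Proof. (i) Suppose $g_{\lambda^{(i)} \mu^{(i)} \nu^{(i)}} > 0$ for all $i$. Then $H(\bar{\lambda}^{(i)}) \le H(\bar{\mu}^{(i)}) + H(\bar{\nu}^{(i)})$ for all $i$ by Lemma 6.2. Thus $\sum_i \theta(i) H(\bar{\lambda}^{(i)}) \le \sum_i \theta(i) H(\bar{\mu}^{(i)}) + \sum_i \theta(i) H(\bar{\nu}^{(i)})$. Then $H_\theta(\bar{\lambda}) \le H_\theta(\bar{\mu}) + H_\theta(\bar{\nu})$. We conclude $2^{H_\theta(\bar{\lambda})} \le 2^{H_\theta(\bar{\mu})}\, 2^{H_\theta(\bar{\nu})}$.

(ii) Suppose $c^{\lambda^{(i)}}_{\mu^{(i)} \nu^{(i)}} > 0$ for all $i$. Then $H(\bar{\lambda}^{(i)}) \le \frac{m}{n} H(\bar{\mu}^{(i)}) + \frac{n-m}{n} H(\bar{\nu}^{(i)}) + h\bigl(\frac{m}{n}\bigr)$ by Lemma 6.2. We take the $\theta$-weighted average to get $H_\theta(\bar{\lambda}) \le \frac{m}{n} H_\theta(\bar{\mu}) + \frac{n-m}{n} H_\theta(\bar{\nu}) + h\bigl(\frac{m}{n}\bigr)$. We conclude $2^{H_\theta(\bar{\lambda})} \le 2^{H_\theta(\bar{\mu})} + 2^{H_\theta(\bar{\nu})}$ by Lemma 4.9(iv).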

6.5 Hilbert spaces and density operators

Endow the vector space $\mathbb{C}^n$ with a hermitian inner product (one may take the standard hermitian inner product $\langle u, v \rangle = \sum_{i=1}^n \overline{u_i} v_i$ for $u, v \in \mathbb{C}^n$, where $\overline{\,\cdot\,}$ denotes taking the complex conjugate), so that it is a Hilbert space.

Let $(V_1, \langle \cdot, \cdot \rangle)$ and $(V_2, \langle \cdot, \cdot \rangle)$ be Hilbert spaces. On $V_1 \oplus V_2$ we define the inner product by $\langle u_1 \oplus u_2, v_1 \oplus v_2 \rangle = \langle u_1, v_1 \rangle + \langle u_2, v_2 \rangle$. On $V_1 \otimes V_2$ we define the inner product by $\langle u_1 \otimes u_2, v_1 \otimes v_2 \rangle = \langle u_1, v_1 \rangle \langle u_2, v_2 \rangle$ and extending linearly.

Let $V$ be a Hilbert space. A positive semidefinite hermitian operator $\rho : V \to V$ with trace one is called a density operator. The sequence of eigenvalues of a density operator $\rho$ is a probability vector. Let $\operatorname{spec}(\rho) = (p_1, \ldots, p_n)$ be the sequence of eigenvalues of $\rho$, ordered non-increasingly, $p_1 \ge \cdots \ge p_n$.

Let $V_1$ and $V_2$ be Hilbert spaces. Given a density operator $\rho$ on $V_1 \otimes V_2$, the reduced density operator $\rho_1 = \operatorname{tr}_2 \rho$ is uniquely defined by the property that $\operatorname{tr}(\rho_1 X_1) = \operatorname{tr}(\rho (X_1 \otimes \mathrm{Id}_{V_2}))$ for all operators $X_1$ on $V_1$. The operator $\rho_1$ is again a density operator. The operation $\operatorname{tr}_2$ is called the partial trace over $V_2$. Explicitly, $\rho_1$ is given by $\langle e_i, \rho_1(e_j) \rangle = \sum_\ell \langle e_i \otimes f_\ell, \rho(e_j \otimes f_\ell) \rangle$, where the $e_i$ are some basis of $V_1$ and the $f_\ell$ are some basis of $V_2$ (the statement is independent of basis choice).

Let $V_i$ be a Hilbert space and consider the tensor product $V_1 \otimes V_2 \otimes V_3$. Associate with $t \in V_1 \otimes V_2 \otimes V_3$ the dual element $t^* = \langle t, \cdot \rangle \in (V_1 \otimes V_2 \otimes V_3)^*$. Then

\[ \rho^t = t t^* / \langle t, t \rangle = t \langle t, \cdot \rangle / \langle t, t \rangle \]

is a density operator on $V_1 \otimes V_2 \otimes V_3$. Viewing $\rho^t$ as a density operator on the regrouped space $V_1 \otimes (V_2 \otimes V_3)$, we may take the partial trace of $\rho^t$ over $V_2 \otimes V_3$ as described above. We denote the resulting density operator by $\rho^t_1 = \operatorname{tr}_{23} \rho^t$. We similarly define $\rho^t_2 = \operatorname{tr}_{13} \rho^t$ and $\rho^t_3 = \operatorname{tr}_{12} \rho^t$.
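For a tensor given in coordinates with respect to the standard (orthonormal) bases, the reduced density operators are obtained by flattening: if $M_i$ is the matrix of $t$ with leg $i$ as rows and the other two legs as columns, then $\rho^t_i = M_i M_i^\dagger / \langle t, t \rangle$. The sketch below uses this to compute the ordered marginal spectra; the function name and the two example tensors are our own choices.

```python
import numpy as np


def marginal_spectra(t):
    """Non-increasingly ordered spectra of the three reduced density operators of t."""
    t = np.asarray(t, dtype=complex)
    t = t / np.linalg.norm(t)  # normalise so that <t, t> = 1
    spectra = []
    for i in range(3):
        # Flatten: leg i indexes the rows, the two remaining legs the columns.
        m = np.moveaxis(t, i, 0).reshape(t.shape[i], -1)
        rho_i = m @ m.conj().T  # partial trace over the other two legs
        spectra.append(np.linalg.eigvalsh(rho_i)[::-1])
    return spectra


# Unit tensor e_1 (x) e_1 (x) e_1 + e_2 (x) e_2 (x) e_2.
unit = np.zeros((2, 2, 2))
unit[0, 0, 0] = unit[1, 1, 1] = 1

# W tensor e_1 (x) e_1 (x) e_2 + e_1 (x) e_2 (x) e_1 + e_2 (x) e_1 (x) e_1.
w = np.zeros((2, 2, 2))
w[0, 0, 1] = w[0, 1, 0] = w[1, 0, 0] = 1

print(marginal_spectra(unit)[0])  # uniform spectrum
print(marginal_spectra(w)[0])     # spectrum (2/3, 1/3)
```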

6.6 Moment polytopes $\mathbf{P}(t)$

We give a brief introduction to moment polytopes. We refer to [Nes84, Bri87, Fra02, Wal14] for more information. We begin with the general setting and then specialise to orbit closures in tensor spaces.

6.6.1 General setting

Let $G$ be a connected reductive algebraic group. (We refer to Kraft [Kra84] and Humphreys [Hum75] for an introduction to algebraic groups.) Fix a maximal torus $T \subseteq G$ and a Borel subgroup $T \subseteq B \subseteq G$. We have the character group $X(T)$, the Weyl group $W$, the root system $\Phi \subseteq X(T)$ and the system of positive roots $\Phi_+ \subseteq \Phi$. For $\lambda, \mu \in X(T)$ we set $\lambda \preccurlyeq \mu$ if $\mu - \lambda$ is a sum of positive roots. Let $V$ be a rational $G$-representation. The restriction of the action of $G$ to $T$ gives a decomposition

\[ V = \bigoplus_{\lambda \in X(T)} V_\lambda, \qquad V_\lambda = \{ v \in V : \forall t \in T\ \ t \cdot v = \lambda(t) v \}. \]

This decomposition is called the weight decomposition of $V$. The $\lambda \in X(T)$ with $V_\lambda \ne 0$ are called the weights of $V$ with respect to $T$. The $V_\lambda$ are the weight spaces of $V$. For $v \in V$, let $v_\lambda$ be the component of $v$ in $V_\lambda$. Let $\operatorname{supp}(v) = \{\lambda : v_\lambda \ne 0\}$.

Let $E$ be the real vector space $E = X(T) \otimes \mathbb{R}$. The Weyl group $W$ acts on $X(T)$ and thus on $E$. We enlarge $\preccurlyeq$ to a partial order on $E$ as follows. For $x, y \in E$, let $x \preccurlyeq y$ if $y - x$ is a nonnegative linear combination of positive roots. Let $D \subseteq E$ be the positive Weyl chamber. For every $x \in E$ the orbit $W \cdot x$ intersects the positive Weyl chamber $D$ in exactly one point, which we denote by $\operatorname{dom}(x)$.

Let $V$ be a finite-dimensional rational $G$-module. Let $\chi \in X(T) \cap D$ be a dominant character. We denote the $\chi$-isotypical component of $V$ by $V_{(\chi)}$. Let $Z \subseteq V$ be a Zariski closed set. We denote the coordinate ring of $Z$ by $\mathbb{C}[Z]$. We denote the degree $d$ part of $\mathbb{C}[Z]$ by $\mathbb{C}[Z]_d$. If $Z$ is $G$-stable, then $\mathbb{C}[Z]_d$ is a $G$-module.

Definition 6.4. Let $V$ be a rational $G$-module and $Z \subseteq V$ a nontrivial irreducible closed $G$-stable cone. The moment polytope of $Z$, denoted by

\[ \mathbf{P}(Z), \]

is defined as the Euclidean closure in $E$ of the set

\[ R(Z) = \{ \chi/d : (\mathbb{C}[Z]_d)_{(\chi^*)} \ne 0 \} \]

of normalised characters $\chi/d$ for which the $\chi^*$-isotypical component $(\mathbb{C}[Z]_d)_{(\chi^*)}$ is not zero.

Theorem 6.5 (Mumford–Ness [Nes84], Brion [Bri87], Franz [Fra02]). The moment polytope is indeed a convex polytope, and it is equal to the image of the so-called moment map intersected with the positive Weyl chamber,

\[ \mathbf{P}(Z) = \mu(Z \setminus 0) \cap D. \]

Let $Z = \overline{G \cdot v}$ be the orbit closure (in the Zariski topology) of a vector $v \in V \setminus 0$, and suppose $\overline{G \cdot v}$ is a cone.

Lemma 6.6 (See e.g. [Str05]). Suppose $\overline{G \cdot v}$ is a cone. Then

\[ R(\overline{G \cdot v}) = \{ \chi/d : (\mathbb{C}[\overline{G \cdot v}]_d)_{(\chi^*)} \ne 0 \} = \{ \chi/d : (\operatorname{lin}(G \cdot v^{\otimes d}))_{(\chi)} \ne 0 \}. \]

6.6.2 Tensor spaces

We specialise to 3-tensors. Let $V = V_1 \otimes V_2 \otimes V_3$ with $V_i = \mathbb{C}^{n_i}$. Let

\[ G = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}, \]

\[ T = T_1 \times T_2 \times T_3, \]

with $T_i$ the diagonal matrices in $\mathrm{GL}_{n_i}$. The weight decomposition of $V$ is the decomposition with respect to the standard basis elements $e_{x_1} \otimes e_{x_2} \otimes e_{x_3}$, where $x \in [n_1] \times [n_2] \times [n_3]$. The support $\operatorname{supp}(v)$ is the support of $v$ with respect to the standard basis.

In the current setting there is a beautiful rephrasing of Theorem 6.5 in terms of ordered spectra of reduced density matrices. Recall from Section 6.5 that for $v \in V \setminus 0$ we have a density matrix $\rho^v$ and reduced density matrices $\rho^v_i$, of which we may take the non-increasingly ordered spectra $\operatorname{spec}(\rho^v_i)$.

Theorem 6.7 (Walter–Doran–Gross–Christandl [WDGC13]). Let $Z \subseteq V$ be a nontrivial irreducible closed $G$-stable cone. Then

\[ \mathbf{P}(Z) = \{ (\operatorname{spec} \rho^z_1, \operatorname{spec} \rho^z_2, \operatorname{spec} \rho^z_3) : z \in Z \setminus 0 \}. \]

Let $v \in V \setminus 0$. We consider the moment polytope of the orbit closure $Z = \overline{G \cdot v}$. In this setting, Lemma 6.6 specialises to the following.

Lemma 6.8 (See e.g. [Str05]).

\[ R(\overline{G \cdot v}) = \{ \chi/d : (\mathbb{C}[\overline{G \cdot v}]_d)_{(\chi^*)} \ne 0 \} = \{ \chi/d : (\operatorname{lin}(G \cdot v^{\otimes d}))_{(\chi)} \ne 0 \} = \{ \chi/d : P_\chi v^{\otimes d} \ne 0 \}, \]

where $P_\chi = P_{\chi^{(1)}} \otimes P_{\chi^{(2)}} \otimes P_{\chi^{(3)}}$ with $P_{\chi^{(i)}} : V_i^{\otimes d} \to V_i^{\otimes d}$ the projector onto the isotypical component of type $\chi^{(i)}$ discussed in Section 6.2.

On the other hand, Theorem 6.7 immediately gives a description of the moment polytope $\mathbf{P}(\overline{G \cdot v})$ in terms of ordered spectra of reduced density matrices.

Theorem 6.9. Let $v \in V \setminus 0$. Then

\[ \mathbf{P}(\overline{G \cdot v}) = \{ (\operatorname{spec} \rho^u_1, \operatorname{spec} \rho^u_2, \operatorname{spec} \rho^u_3) : u \in \overline{G \cdot v} \setminus 0 \}. \]

Summarising, we have two descriptions of the moment polytope: a representation-theoretic or invariant-theoretic description (Lemma 6.8) and a quantum marginal spectra description (Theorem 6.9). These two descriptions are the key to proving the properties of the quantum functionals that we need.

6.7 Quantum functionals $F^\theta(t)$

We will now define the quantum functionals and prove that they are universal spectral points.

Let $p = (p_1, \ldots, p_n) \in \mathbb{R}^n$ be a probability vector, i.e. $\sum_{i=1}^n p_i = 1$ and $p_i \ge 0$ for all $i \in [n]$. Recall that $H(p)$ denotes the Shannon entropy of the probability vector $p$, $H(p) = \sum_{i=1}^n p_i \log_2 (1/p_i)$. Let $x = (x^{(1)}, x^{(2)}, x^{(3)})$ be a triple of probability vectors $x^{(i)} \in \mathbb{R}^{n_i}$. Let $\theta \in \Theta$ be a weighting. Recall that $H_\theta(x)$ denotes the $\theta$-weighted average of the Shannon entropies of the three probability vectors $x^{(1)}, x^{(2)}, x^{(3)}$,

\[ H_\theta(x) = \theta(1) H(x^{(1)}) + \theta(2) H(x^{(2)}) + \theta(3) H(x^{(3)}). \]

Let $V = \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$. Let $G = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$. Let $v \in V \setminus 0$. We use the notation $\mathbf{P}(v) = \mathbf{P}(\overline{G \cdot v})$ for the moment polytope of the orbit closure of $v$.

Definition 6.10. For $\theta \in \Theta$ and $v \in V \setminus 0$ let

\[ F^\theta(v) = \max \{ 2^{H_\theta(x)} : x \in \mathbf{P}(v) \}. \]

Let $F^\theta(0) = 0$. We call the functions $F^\theta$ the quantum functionals. The name quantum functional comes from the fact that the moment polytope $\mathbf{P}(t)$ consists of triples of quantum marginal entropies.
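Evaluating $F^\theta$ exactly requires knowing the whole polytope $\mathbf{P}(v)$, but any single point of $\mathbf{P}(v)$ gives a lower bound. By Theorem 6.9 the triple of marginal spectra of $v$ itself is such a point. The sketch below evaluates $2^{H_\theta(x)}$ at a given triple $x$; the function names are ours, and the example triples are the marginal spectra of the unit tensor of size 2 and of the W tensor.

```python
from math import log2


def entropy(p):
    """Shannon entropy H(p) in bits, with the convention 0 * log(1/0) = 0."""
    return sum(x * log2(1 / x) for x in p if x > 0)


def weighted_entropy(theta, x):
    """H_theta(x) = theta(1) H(x^(1)) + theta(2) H(x^(2)) + theta(3) H(x^(3))."""
    return sum(w * entropy(p) for w, p in zip(theta, x))


def quantum_functional_lower_bound(theta, x):
    """2^{H_theta(x)}; a lower bound on F^theta(v) whenever x lies in P(v)."""
    return 2 ** weighted_entropy(theta, x)


uniform = (1 / 3, 1 / 3, 1 / 3)
unit_spectra = ((0.5, 0.5),) * 3      # marginal spectra of the unit tensor of size 2
w_spectra = ((2 / 3, 1 / 3),) * 3     # marginal spectra of the W tensor

print(quantum_functional_lower_bound(uniform, unit_spectra))
print(quantum_functional_lower_bound(uniform, w_spectra))
```

These numbers are only lower bounds in general, since the maximum in Definition 6.10 may be attained elsewhere in the orbit closure.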

Theorem 6.11. Let $\mathcal{T}$ be the semiring of 3-tensors over $\mathbb{C}$. Let $\le$ be the restriction preorder. For $\theta \in \Theta$,

\[ F^\theta \in \mathbf{X}(\mathcal{T}, \le). \]

In other words, $F^\theta$ is a semiring homomorphism $\mathcal{T} \to \mathbb{R}_{\ge 0}$ which is monotone under restriction $\le$. In fact, $F^\theta$ is monotone under degeneration $\trianglelefteq$.

Remark 6.12. The results in this chapter generalise to $k$-tensors over $\mathbb{C}$. In our paper [CVZ18] we discuss this general situation in detail and make a distinction between upper quantum functionals and lower quantum functionals.

Let $p \in \mathbb{R}^n$ and $q \in \mathbb{R}^m$ be probability vectors. The tensor product $p \otimes q \in \mathbb{R}^{nm}$, defined by

\[ p \otimes q = (p_i q_j : i \in [n], j \in [m]), \]

is a probability vector. The direct sum $p \oplus q \in \mathbb{R}^{n+m}$ is defined by

\[ p \oplus q = (p_1, \ldots, p_n, q_1, \ldots, q_m). \]

For $\alpha \in [0, 1]$ the direct sum $\alpha p \oplus (1 - \alpha) q$ is a probability vector.

Let $x = (x^{(1)}, x^{(2)}, x^{(3)})$ and $y = (y^{(1)}, y^{(2)}, y^{(3)})$ be triples of probability vectors. We define the tensor product $x \otimes y$ elementwise,

\[ x \otimes y = (x^{(1)} \otimes y^{(1)}, x^{(2)} \otimes y^{(2)}, x^{(3)} \otimes y^{(3)}). \]

We define the direct sum $x \oplus y$ elementwise,

\[ x \oplus y = (x^{(1)} \oplus y^{(1)}, x^{(2)} \oplus y^{(2)}, x^{(3)} \oplus y^{(3)}). \]

For $x \otimes y$ and $x \oplus y$ to be in the moment polytope we will need to reorder the components non-increasingly. For a triple of probability vectors $x = (x^{(1)}, x^{(2)}, x^{(3)})$ let

\[ \operatorname{dom}(x) \]

be the triple of probability vectors obtained from $x$ by reordering the components of each $x^{(i)}$ such that they become non-increasing. Let $\operatorname{dom}(S) = \{\operatorname{dom}(x) : x \in S\}$.

For $v \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$ we will use the notation $G(v) = \mathrm{GL}_{n_1} \times \mathrm{GL}_{n_2} \times \mathrm{GL}_{n_3}$ to denote the group that naturally corresponds to the space that $v$ lives in. We will use the notation $\mathbf{P}(v) = \mathbf{P}(\overline{G(v) \cdot v})$ for the moment polytope of the orbit closure of $v$.

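Theorem 6.13. Let $s \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$ and $t \in \mathbb{C}^{m_1} \otimes \mathbb{C}^{m_2} \otimes \mathbb{C}^{m_3}$.

(i) $\operatorname{dom}\bigl(\mathbf{P}(s) \otimes \mathbf{P}(t)\bigr) \subseteq \mathbf{P}(s \otimes t)$.

(ii) $\forall \alpha \in [0, 1]$: $\operatorname{dom}\bigl(\alpha \mathbf{P}(s) \oplus (1 - \alpha) \mathbf{P}(t)\bigr) \subseteq \mathbf{P}(s \oplus t)$.

(iii) If $s, t \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3} \setminus 0$ and $s \in \overline{G(t) \cdot t}$, then $\mathbf{P}(s) \subseteq \mathbf{P}(t)$.

(iv) $\mathbf{P}(s \oplus 0) = \mathbf{P}(s) \oplus 0$.

(v) $\mathbf{P}(\langle 1 \rangle) = \{((1), (1), (1))\}$ with $\langle 1 \rangle = e_1 \otimes e_1 \otimes e_1 \in \mathbb{C}^1 \otimes \mathbb{C}^1 \otimes \mathbb{C}^1$.

Proof. To prove statements (i) and (ii), let $x \in \mathbf{P}(s)$ and $y \in \mathbf{P}(t)$. Then there are elements $a \in \overline{G(s) \cdot s}$ and $b \in \overline{G(t) \cdot t}$ with ordered marginal spectra $x$ and $y$,

\[ x = (\operatorname{spec} \rho^a_1, \operatorname{spec} \rho^a_2, \operatorname{spec} \rho^a_3), \]

\[ y = (\operatorname{spec} \rho^b_1, \operatorname{spec} \rho^b_2, \operatorname{spec} \rho^b_3). \]

We prove statement (i). We have $a \otimes b \in \overline{G(s \otimes t) \cdot s \otimes t}$. Thus

\[ \operatorname{dom}(x \otimes y) = (\operatorname{spec} \rho^{a \otimes b}_1, \operatorname{spec} \rho^{a \otimes b}_2, \operatorname{spec} \rho^{a \otimes b}_3) \in \mathbf{P}(s \otimes t). \]

We conclude $\operatorname{dom}(\mathbf{P}(s) \otimes \mathbf{P}(t)) \subseteq \mathbf{P}(s \otimes t)$. We prove statement (ii). Let $\alpha \in [0, 1]$. Define the tensor $u(\alpha) \in \mathbb{C}^{n_1+m_1} \otimes \mathbb{C}^{n_2+m_2} \otimes \mathbb{C}^{n_3+m_3}$ by

\[ u(\alpha) = \frac{\sqrt{\alpha}}{\sqrt{\langle a, a \rangle}}\, a \oplus \frac{\sqrt{1 - \alpha}}{\sqrt{\langle b, b \rangle}}\, b. \]

Then $u(\alpha) \in \overline{G(s \oplus t) \cdot s \oplus t}$. We have $\rho^{u(\alpha)}_i = \alpha \rho^a_i \oplus (1 - \alpha) \rho^b_i$. From the observation

\[ \operatorname{spec}(\alpha \rho^a_i \oplus (1 - \alpha) \rho^b_i) = \operatorname{dom}(\alpha x^{(i)} \oplus (1 - \alpha) y^{(i)}) \]

follows $\operatorname{dom}(\alpha x \oplus (1 - \alpha) y) \in \mathbf{P}(\overline{G(s \oplus t) \cdot s \oplus t})$. We conclude

\[ \operatorname{dom}(\alpha \mathbf{P}(s) \oplus (1 - \alpha) \mathbf{P}(t)) \subseteq \mathbf{P}(s \oplus t). \]

We have thus proven statements (i) and (ii).

We prove statement (iii). Let $G = G(t) = G(s)$. Let $s \in \overline{G \cdot t}$. Then $\overline{G \cdot s} \subseteq \overline{G \cdot t}$, so we have a $G$-equivariant restriction map $\mathbb{C}[\overline{G \cdot t}] \twoheadrightarrow \mathbb{C}[\overline{G \cdot s}]$ on the coordinate rings. Let $\chi/d \in R(\overline{G \cdot s})$ with $(\mathbb{C}[\overline{G \cdot s}]_d)_{(\chi^*)} \ne 0$. Then also $(\mathbb{C}[\overline{G \cdot t}]_d)_{(\chi^*)} \ne 0$ by Schur's lemma. Thus $\chi/d \in R(\overline{G \cdot t}) \subseteq \mathbf{P}(\overline{G \cdot t})$. We conclude $\mathbf{P}(s) \subseteq \mathbf{P}(t)$.

We prove statement (iv). Let $\chi/d \in R(\overline{G(s \oplus 0) \cdot (s \oplus 0)})$ with $P_\chi (s \oplus 0)^{\otimes d} \ne 0$. Recall from Section 6.2 that $P_\chi$ is given by the action of an element in the group algebra $\mathbb{C}[S_d]$, which we also denoted by $P_\chi$. From this viewpoint we see that also $P_\chi s^{\otimes d} \ne 0$. So $\chi/d \in R(\overline{G(s) \cdot s})$.

Statement (v) is a direct observation.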

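Corollary 6.14.

(i) $F^\theta(s) F^\theta(t) \le F^\theta(s \otimes t)$.

(ii) $F^\theta(s) + F^\theta(t) \le F^\theta(s \oplus t)$.

(iii) If $s \trianglelefteq t$, then $F^\theta(s) \le F^\theta(t)$.

(iv) $F^\theta(\langle 1 \rangle) = 1$.

Proof. (i) Let $x \in \mathbf{P}(s)$ and $y \in \mathbf{P}(t)$. Then $\operatorname{dom}(x \otimes y) \in \mathbf{P}(s \otimes t)$ by Theorem 6.13. It is a basic fact that $H_\theta(x) + H_\theta(y) = H_\theta(x \otimes y)$ (Lemma 4.9), and reordering components does not change the entropy, so $2^{H_\theta(x)}\, 2^{H_\theta(y)} = 2^{H_\theta(x \otimes y)}$. We conclude $F^\theta(s) F^\theta(t) \le F^\theta(s \otimes t)$.

(ii) Let $x \in \mathbf{P}(s)$ and $y \in \mathbf{P}(t)$. Then by Theorem 6.13, for all $\alpha \in [0, 1]$,

\[ \operatorname{dom}(\alpha x \oplus (1 - \alpha) y) \in \mathbf{P}(s \oplus t). \]

It is a basic fact that $\alpha H_\theta(x) + (1 - \alpha) H_\theta(y) + h(\alpha) = H_\theta(\alpha x \oplus (1 - \alpha) y)$ (Lemma 4.9). Thus for any $\alpha \in [0, 1]$ we have $2^{\alpha H_\theta(x) + (1 - \alpha) H_\theta(y) + h(\alpha)} \le F^\theta(s \oplus t)$. Using Lemma 4.9(iv) we conclude $F^\theta(s) + F^\theta(t) \le F^\theta(s \oplus t)$.

(iii) This follows from statements (iii) and (iv) of Theorem 6.13, since by definition degeneration $s \trianglelefteq t$ means $s \oplus 0 \in \overline{G(t \oplus 0) \cdot (t \oplus 0)}$.

(iv) This follows from statement (v) of Theorem 6.13.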
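The entropy facts quoted from Lemma 4.9 in this proof are easy to confirm numerically. Lemma 4.9 itself is not reproduced in this section, so the third check below encodes what we take item (iv) to say, namely that $\max_\alpha 2^{\alpha a + (1-\alpha) b + h(\alpha)} = 2^a + 2^b$, attained at $\alpha = 2^a / (2^a + 2^b)$; that reading is an assumption, and the example vectors are arbitrary.

```python
from math import isclose, log2


def entropy(p):
    return sum(x * log2(1 / x) for x in p if x > 0)


def binary_entropy(alpha):
    return entropy((alpha, 1 - alpha))


p = (0.5, 0.3, 0.2)
q = (0.7, 0.3)
alpha = 0.4

tensor = [pi * qj for pi in p for qj in q]                         # p (x) q
direct = [alpha * pi for pi in p] + [(1 - alpha) * qj for qj in q]  # alpha p (+) (1 - alpha) q

# H(p (x) q) = H(p) + H(q)
additive = isclose(entropy(tensor), entropy(p) + entropy(q))

# H(alpha p (+) (1 - alpha) q) = alpha H(p) + (1 - alpha) H(q) + h(alpha)
mixing = isclose(
    entropy(direct),
    alpha * entropy(p) + (1 - alpha) * entropy(q) + binary_entropy(alpha),
)

# The optimal alpha turns the exponent into log2(2^a + 2^b).
a, b = entropy(p), entropy(q)
alpha_star = 2 ** a / (2 ** a + 2 ** b)
best = 2 ** (alpha_star * a + (1 - alpha_star) * b + binary_entropy(alpha_star))
maximal = isclose(best, 2 ** a + 2 ** b)

print(additive, mixing, maximal)
```

The same identities hold for the weighted entropy $H_\theta$ because it is a convex combination of the three marginal entropies.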


Theorem 6.15. Write $R(s)$ for $R(\overline{G(s) \cdot s})$.

(i) $R(s \otimes t) \subseteq \{ \lambda/N : \exists\, \mu/N \in R(s),\ \nu/N \in R(t) : g_{\lambda^{(i)} \mu^{(i)} \nu^{(i)}} > 0 \text{ for all } i \}$.

(ii) $R(s \oplus t) \subseteq \{ \lambda/N : \exists\, \mu/m \in R(s),\ \nu/(N - m) \in R(t) : c^{\lambda^{(i)}}_{\mu^{(i)} \nu^{(i)}} > 0 \text{ for all } i \}$.

Proof. (i) Let $s \in V_1 \otimes V_2 \otimes V_3$ and let $t \in W_1 \otimes W_2 \otimes W_3$. Let $\lambda/N \in R(s \otimes t)$ with $P_\lambda (s \otimes t)^{\otimes N} \ne 0$. Let $\pi$ be the natural reordering map

\[ \pi : \bigl((V_1 \otimes W_1) \otimes (V_2 \otimes W_2) \otimes (V_3 \otimes W_3)\bigr)^{\otimes N} \to (V_1 \otimes V_2 \otimes V_3)^{\otimes N} \otimes (W_1 \otimes W_2 \otimes W_3)^{\otimes N}. \]

Then

\[ (s \otimes t)^{\otimes N} = \sum_{\mu, \nu} \pi^{-1} (P_\mu \otimes P_\nu) \pi\, (s \otimes t)^{\otimes N}. \]

Let $\mu, \nu \vdash_3 N$ with $P_\lambda \pi^{-1} (P_\mu \otimes P_\nu) \pi (s \otimes t)^{\otimes N} \ne 0$. Then $P_\mu s^{\otimes N} \ne 0$ and $P_\nu t^{\otimes N} \ne 0$, i.e. $\mu/N \in R(s)$ and $\nu/N \in R(t)$. Moreover, $P_\lambda \pi^{-1} (P_\mu \otimes P_\nu) \pi \ne 0$, which means the Kronecker coefficients $g_{\lambda^{(i)} \mu^{(i)} \nu^{(i)}}$ are nonzero.

(ii) Let $\lambda/N \in R(s \oplus t)$ with $P_\lambda (s \oplus t)^{\otimes N} \ne 0$. Let us expand $(s \oplus t)^{\otimes N}$ as

\[ (s \oplus t)^{\otimes N} = s^{\otimes N} \oplus (s^{\otimes N-1} \otimes t) \oplus \cdots \oplus t^{\otimes N}. \]

Then $P_\lambda$ does not vanish on some summand, which we may assume to be of the form $s^{\otimes m} \otimes t^{\otimes N-m}$. Let $\pi$ be the natural projection

\[ \pi : \bigl((V_1 \oplus W_1) \otimes (V_2 \oplus W_2) \otimes (V_3 \oplus W_3)\bigr)^{\otimes N} \to (V_1 \otimes V_2 \otimes V_3)^{\otimes m} \otimes (W_1 \otimes W_2 \otimes W_3)^{\otimes N-m}. \]

Let $\mu, \nu$ with $P_\lambda \pi^{-1} (P_\mu \otimes P_\nu) \pi (s \oplus t)^{\otimes N} \ne 0$. Then $P_\mu s^{\otimes m} \ne 0$ and $P_\nu t^{\otimes N-m} \ne 0$. Moreover, $P_\lambda \pi^{-1} (P_\mu \otimes P_\nu) \pi \ne 0$. Therefore, the Littlewood–Richardson coefficients $c^{\lambda^{(i)}}_{\mu^{(i)} \nu^{(i)}}$ are nonzero.

Corollary 6.16.

(i) $F^\theta(s \otimes t) \le F^\theta(s) F^\theta(t)$.

(ii) $F^\theta(s \oplus t) \le F^\theta(s) + F^\theta(t)$.

Proof. (i) Let $\lambda/N \in R(s \otimes t)$. By Theorem 6.15 there is a $\mu/N \in R(s)$ and a $\nu/N \in R(t)$ such that the Kronecker coefficient $g_{\lambda^{(i)} \mu^{(i)} \nu^{(i)}}$ is nonzero for every $i$. Then $2^{H_\theta(\bar{\mu})} \le F^\theta(s)$ and $2^{H_\theta(\bar{\nu})} \le F^\theta(t)$ by definition of $F^\theta$. The Kronecker coefficients being nonzero implies

\[ 2^{H_\theta(\bar{\lambda})} \le 2^{H_\theta(\bar{\mu})}\, 2^{H_\theta(\bar{\nu})} \]

by Lemma 6.3. We conclude $F^\theta(s \otimes t) \le F^\theta(s) F^\theta(t)$.

(ii) Let $\lambda/N \in R(s \oplus t)$. Then by Theorem 6.15 there are $\mu/m \in R(s)$ and $\nu/(N - m) \in R(t)$ such that the Littlewood–Richardson coefficient $c^{\lambda^{(i)}}_{\mu^{(i)} \nu^{(i)}}$ is nonzero for every $i$. This means

\[ 2^{H_\theta(\bar{\lambda})} \le 2^{H_\theta(\bar{\mu})} + 2^{H_\theta(\bar{\nu})} \]

by Lemma 6.3. We conclude $F^\theta(s \oplus t) \le F^\theta(s) + F^\theta(t)$.

Proof of Theorem 6.11. Corollary 6.14 and Corollary 6.16 together prove Theorem 6.11.

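6.8 Outer approximation

In this section we discuss an outer approximation of $\mathbf{P}(t)$. We will use this outer approximation to show that the quantum functionals are at most the support functionals.

Let $\preccurlyeq$ be the dominance order, i.e. majorization order, on triples of probability vectors. For any set $S \subseteq \mathbb{R}^{n_1} \times \mathbb{R}^{n_2} \times \mathbb{R}^{n_3}$ of triples of probability vectors, let $S^{\preccurlyeq}$ denote the upward closure with respect to $\preccurlyeq$,

\[ S^{\preccurlyeq} = \{ y \in \mathbb{R}^{n_1} \times \mathbb{R}^{n_2} \times \mathbb{R}^{n_3} : \exists x \in S,\ x \preccurlyeq y \}. \]

Let $\operatorname{conv}(S)$ denote the convex hull of $S$ in $\mathbb{R}^{n_1} \times \mathbb{R}^{n_2} \times \mathbb{R}^{n_3}$. Recall that for $x \in S$ we defined $\operatorname{dom}(x)$ as the triple of probability vectors obtained from $x = (x^{(1)}, x^{(2)}, x^{(3)})$ by reordering the components of each $x^{(i)}$ such that they become non-increasing, and $\operatorname{dom}(S) = \{\operatorname{dom}(x) : x \in S\}$.

Theorem 6.17 (Strassen [Str05]). Let $v \in V \setminus 0$. Then

\[ \mathbf{P}(v) \subseteq (\operatorname{dom} \operatorname{conv} \operatorname{supp} v)^{\preccurlyeq}. \tag{6.5} \]

Proof. We give the proof for the convenience of the reader. Let $\chi/d \in R(\overline{G \cdot v})$. Then $(\operatorname{lin}(G \cdot v^{\otimes d}))_{(\chi)} \ne 0$. Let $M_\chi \subseteq \operatorname{lin}(G \cdot v^{\otimes d})$ be a simple $G$-submodule with highest weight $\chi$. Let $N \subseteq V^{\otimes d}$ be the $G$-module complement, $N \oplus M_\chi = V^{\otimes d}$. Then $v^{\otimes d}$ is not in $N$. Let $v = \bigoplus_{\gamma \in \operatorname{supp} v} v_\gamma$ be the weight decomposition. Then $v^{\otimes d}$ is a sum of tensor products of the $v_\gamma$. At least one summand is not in $N$, say of weight $\eta = \sum_\gamma d_\gamma \gamma$ with $\sum_\gamma d_\gamma = d$. The projection $V^{\otimes d} \to M_\chi$ along $N$ maps this summand onto a nonzero weight vector of weight $\eta$. So $\eta$ is a weight of $M_\chi$. Then also $\operatorname{dom}(\eta)$ is a weight of $M_\chi$. Since $\chi$ is the highest weight of $M_\chi$, $\operatorname{dom}(\eta) \preccurlyeq \chi$. Then $\operatorname{dom}(\eta/d) \preccurlyeq \chi/d$. We have $\eta/d = \sum_\gamma \frac{d_\gamma}{d} \gamma \in \operatorname{conv} \operatorname{supp} v$. We conclude $R(\overline{G \cdot v}) \subseteq (\operatorname{dom} \operatorname{conv} \operatorname{supp} v)^{\preccurlyeq}$ and thus $\mathbf{P}(\overline{G \cdot v}) \subseteq (\operatorname{dom} \operatorname{conv} \operatorname{supp} v)^{\preccurlyeq}$.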


69 Inner approximation for free tensors

In this section we discuss an inner approximation for the moment polytope of afree tensor We will use this inner approximation in the next section to prove thatthe quantum functionals coincide with the support functionals when restricted tofree tensors We will prove that not all tensors are free

We say a set Φ sube [n1] times [n2] times [n3] is free if every two different elementsof Φ differ in at least two coordinates in other words if the elements of Φ haveHamming distance at least two We say v isin V = Cn1 otimes Cn2 otimes Cn3 is free if forsome g isin G(v) = GLn1 timesGLn2 timesGLn3 the support supp(g middot v) sube [n1]times [n2]times [n3]is free (Free is called schlicht in [Str05])

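Theorem 6.18 (Strassen [Str05]). Let $v \in V \setminus 0$ with $\operatorname{supp}(v)$ free. Then

\[ \operatorname{dom} \operatorname{conv} \operatorname{supp} v \subseteq P(v). \]

Proof. We refer to [Str05].

Corollary 6.19. Let $v \in V \setminus 0$ with $\operatorname{supp}(v)$ free. Then

\[ P(v)^{\preccurlyeq} = (\operatorname{dom} \operatorname{conv} \operatorname{supp} v)^{\preccurlyeq}. \]

Proof. By Theorem 6.18, $\operatorname{dom} \operatorname{conv} \operatorname{supp} v \subseteq P(v)$. We take the upward closure on both sides to get $(\operatorname{dom} \operatorname{conv} \operatorname{supp} v)^{\preccurlyeq} \subseteq P(v)^{\preccurlyeq}$. On the other hand, from Theorem 6.17 follows $P(v)^{\preccurlyeq} \subseteq (\operatorname{dom} \operatorname{conv} \operatorname{supp} v)^{\preccurlyeq}$.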

Remark 6.20. Recall that $v \in V$ is oblique if the support $\operatorname{supp}(g \cdot v)$ is an antichain for some $g \in G(v)$ (Section 4.4). Such antichains are free, so oblique tensors are free. Thus $\{\text{tight}\} \subseteq \{\text{oblique}\} \subseteq \{\text{free}\}$. Like the tight tensors and oblique tensors, free tensors form a semigroup under $\otimes$ and $\oplus$.

Proposition 6.21. For $n \geq 5$ there exists a tensor that is not free in $\mathbb{C}^n \otimes \mathbb{C}^n \otimes \mathbb{C}^n$.

Proof. We first upper bound the maximal size of a free support. Let $\Phi \subseteq [n] \times [n] \times [n]$ be free. Any two distinct elements in $\Phi$ are still distinct if we forget the third coordinate of each. Therefore $|\Phi| = |\{(\alpha_1, \alpha_2) : \alpha \in \Phi\}| \leq n^2$. (This is a special case of the Singleton bound [Sin64] from coding theory. This upper bound is tight, since $\Phi = \{(a, b, c) : a, b, c \in [n],\ c = a + b \bmod n\}$ is free and has size $n^2$.) Second, we apply the following observation of Bürgisser [Bur90, page 3]. Let

\[ Z_n = \{ t \in \mathbb{C}^n \otimes \mathbb{C}^n \otimes \mathbb{C}^n : \exists g \in G(t),\ |\operatorname{supp}(g \cdot t)| < n^3 - 3n^2 \}. \]

Let $Y_n = \mathbb{C}^n \otimes \mathbb{C}^n \otimes \mathbb{C}^n \setminus Z_n$. Then the set $Y_n$ is Zariski open and nonempty. Now let $n \geq 5$ and let $t \in Y_n$. Then for all $g \in G(t)$ we have $|\operatorname{supp}(g \cdot t)| \geq n^3 - 3n^2 > n^2$. We conclude $t$ is not free.
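The two counting facts used in this proof can be checked for small $n$. The sketch below is illustrative only: it indexes coordinates by $0, \ldots, n-1$ rather than $[n]$, and the helper names `cyclic_support` and `generic_bound_exceeds_free_bound` are assumptions of this example, not notation from the text.

```python
from itertools import combinations


def is_free(support):
    """True if every two distinct triples differ in at least two coordinates."""
    return all(
        sum(1 for x, y in zip(a, b) if x != y) >= 2
        for a, b in combinations(set(support), 2)
    )


def cyclic_support(n):
    """The set {(a, b, c) : c = a + b mod n}, with indices taken in range(n)."""
    return {(a, b, (a + b) % n) for a in range(n) for b in range(n)}


def generic_bound_exceeds_free_bound(n):
    """Compare the generic lower bound n^3 - 3n^2 with the free upper bound n^2."""
    return n**3 - 3 * n**2 > n**2


for n in range(1, 7):
    phi = cyclic_support(n)
    print(n, is_free(phi), len(phi) == n * n, generic_bound_exceeds_free_bound(n))
```

The last column turns true exactly from $n = 5$ onwards, which is where the hypothesis $n \geq 5$ enters.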


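6.10 Quantum functionals versus support functionals

We discussed the support functionals $\zeta^\theta \in X(\text{oblique 3-tensors over } \mathbb{F})$ in Chapter 4. We recall their definition over $\mathbb{C}$. Let $V = \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$. For $\theta \in \Theta = \mathcal{P}([3])$ and $t \in V \setminus 0$ with $\operatorname{supp}(t)$ oblique,

\[ \zeta^\theta(t) = \max\{ 2^{H_\theta(P)} : P \in \mathcal{P}(\operatorname{supp}(t)) \}. \]

We also discussed an extension of $\zeta^\theta$ to all 3-tensors over $\mathbb{C}$, the upper support functional

\[ \zeta^\theta(t) = \min_{g \in G(t)} \max\{ 2^{H_\theta(P)} : P \in \mathcal{P}(\operatorname{supp}(g \cdot t)) \}. \]

We know $\zeta^\theta(s \otimes t) \leq \zeta^\theta(s)\zeta^\theta(t)$, $\zeta^\theta(s \oplus t) = \zeta^\theta(s) + \zeta^\theta(t)$, $\zeta^\theta(\langle 1 \rangle) = 1$, and $s \leqslant t \Rightarrow \zeta^\theta(s) \leq \zeta^\theta(t)$ for any $s, t \in V$.

The set $\operatorname{conv} \operatorname{supp}(g \cdot t)$ is the set of marginals of probability distributions on $\operatorname{supp}(g \cdot t)$. Thus $\operatorname{dom} \operatorname{conv} \operatorname{supp}(g \cdot t)$ is the set of ordered marginals of probability distributions on $\operatorname{supp}(g \cdot t)$. Therefore

\[ \zeta^\theta(t) = \min_{g \in G(t)} \max_{x \in S(g \cdot t)} 2^{H_\theta(x)} \]

with $S(w) = \operatorname{dom} \operatorname{conv} \operatorname{supp} w$. Let $X \subseteq \mathbb{R}^{n_1} \times \mathbb{R}^{n_2} \times \mathbb{R}^{n_3}$ be a set of triples of probability vectors. From Schur-concavity of the Shannon entropy function follows $\max_{x \in X} 2^{H_\theta(x)} = \max_{x \in X^{\preccurlyeq}} 2^{H_\theta(x)}$. Also $H_\theta(x) = H_\theta(\operatorname{dom} x)$.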

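Theorem 6.22. $\zeta^\theta(t) \geq F^\theta(t)$.

Proof. Let $g \in G(t)$ such that

\[ \max_{x \in S} 2^{H_\theta(x)} = \zeta^\theta(t) \]

with $S = \operatorname{dom} \operatorname{conv} \operatorname{supp}(g \cdot t)$. We have

\[ \max_{x \in S} 2^{H_\theta(x)} = \max_{x \in S^{\preccurlyeq}} 2^{H_\theta(x)}. \]

By Theorem 6.17, $P(t) \subseteq S^{\preccurlyeq}$. We conclude $F^\theta(t) \leq \zeta^\theta(t)$.

Theorem 6.23. Let $t \in V$ be free. Then $\zeta^\theta(t) = F^\theta(t)$.

Proof. We know from Theorem 6.22 that $\zeta^\theta(t) \geq F^\theta(t)$. We prove $\zeta^\theta(t) \leq F^\theta(t)$. Let $g \in G(t)$ such that $\operatorname{supp}(g \cdot t)$ is free. Let $S = \operatorname{dom} \operatorname{conv} \operatorname{supp}(g \cdot t)$. Then $\zeta^\theta(t) \leq \max_{x \in S} 2^{H_\theta(x)} = \max_{x \in S^{\preccurlyeq}} 2^{H_\theta(x)}$. By Theorem 6.18 we have $S^{\preccurlyeq} = P(t)^{\preccurlyeq}$. We conclude $\zeta^\theta(t) \leq F^\theta(t)$.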


We can show that the regularised upper support functional equals the quantum functional. As a consequence, the quantum functional is at least the lower support functional $\zeta_\theta$, which was discussed in Chapter 4.

Theorem 6.24. $\lim_{n \to \infty} \zeta^\theta(t^{\otimes n})^{1/n} = F^\theta(t)$.

Proof. We refer the reader to [CVZ18].

Corollary 6.25. $F^\theta(v) \geq \zeta_\theta(v)$.

Proof. By Theorem 6.24, $F^\theta(v) = \lim_{n \to \infty} \zeta^\theta(v^{\otimes n})^{1/n}$. We know $\zeta^\theta(v) \geq \zeta_\theta(v)$ by Theorem 4.15, and thus $\lim_{n \to \infty} \zeta^\theta(v^{\otimes n})^{1/n} \geq \lim_{n \to \infty} \zeta_\theta(v^{\otimes n})^{1/n}$. The lower support functional $\zeta_\theta$ is supermultiplicative under $\otimes$ (Theorem 4.14), so

\[ \lim_{n \to \infty} \zeta_\theta(v^{\otimes n})^{1/n} \geq \zeta_\theta(v). \]

Combining these three relations proves the corollary.

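6.11 Asymptotic slice rank

We proved in Section 4.6 that for oblique $t \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$ the asymptotic slice rank $\lim_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n}$ exists and equals $\min_{\theta \in \Theta} \zeta^\theta(t)$ with $\Theta = \mathcal{P}([3])$. In this section we prove the analogous statement for the quantum functionals.

Theorem 6.26. Let $t \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3}$. Then

\[ \lim_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n} = \min_{\theta \in \Theta} F^\theta(t). \]

We work towards the proof of Theorem 6.26. Let $t \in \mathbb{C}^{n_1} \otimes \mathbb{C}^{n_2} \otimes \mathbb{C}^{n_3} \setminus 0$. Let $E^\theta(t) = \log_2 F^\theta(t)$.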

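Lemma 6.27. For any $\varepsilon > 0$ there is an $n_0 \in \mathbb{N}$ such that for all $n \geq n_0$ there is a $\lambda/n \in R(t)$ with $\min_{i \in [3]} H(\lambda^{(i)}/n) \geq \min_{\theta \in \Theta} E^\theta(t) - \varepsilon$.

Proof. By definition,

\[ \min_{\theta \in \Theta} E^\theta(t) = \min_{\theta \in \Theta} \max_{x \in P(t)} \sum_{j \in [3]} \theta(j) H(x^{(j)}). \]

By Von Neumann's minimax theorem, the right-hand side equals

\[ \max_{x \in P(t)} \min_{\theta \in \Theta} \sum_{j \in [3]} \theta(j) H(x^{(j)}), \]

which equals

\[ \max_{x \in P(t)} \min_{j \in [3]} H(x^{(j)}). \]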


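Let $\varepsilon > 0$. Let $\mu/m \in R(t)$ with $\min_{j \in [3]} H(\mu^{(j)}/m) \geq \min_{\theta \in \Theta} E^\theta(t) - \varepsilon/2$. We will use two facts. We have $(P_{(1)} \otimes P_{(1)} \otimes P_{(1)}) t = t \neq 0$. The triples of partitions $\lambda$ with $P_\lambda t^{\otimes n} \neq 0$ for some $n$ form a semigroup. Let $n \in \mathbb{N}$. We can write $n = qm + r$ with $q, r \in \mathbb{N}$, $0 \leq r < m$. Let $\lambda^{(j)} = q\mu^{(j)} + (r)$. Then by the semigroup property $P_\lambda t^{\otimes n} \neq 0$, i.e. $\lambda/n \in R(t)$. We have $\frac{1}{n}(q\mu^{(j)} + (r)) = \frac{qm}{n} \cdot \frac{\mu^{(j)}}{m} + \frac{r}{n} \cdot (1)$, with $(1) = (1, 0, \ldots, 0)$. By concavity of Shannon entropy,

\[
\begin{aligned}
H\bigl(\tfrac{1}{n}(q\mu^{(j)} + (r))\bigr) &= H\bigl(\tfrac{qm}{n} \cdot \tfrac{\mu^{(j)}}{m} + \tfrac{r}{n} \cdot (1)\bigr) \\
&\geq \tfrac{qm}{n} H(\mu^{(j)}/m) \\
&\geq \bigl(1 - \tfrac{m}{n}\bigr) H(\mu^{(j)}/m).
\end{aligned}
\]

When $n$ is large enough, $(1 - \frac{m}{n}) H(\mu^{(j)}/m)$ is at least $H(\mu^{(j)}/m) - \varepsilon/2$. Let $n_0 \in \mathbb{N}$ such that this is the case for all $j \in [3]$.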

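Lemma 6.28. Let $\lambda/n \in R(t)$. Then $\mathrm{SR}(t^{\otimes n}) \geq \min_{i \in [3]} \dim[\lambda^{(i)}]$.

Proof. We have the restriction $t^{\otimes n} \geqslant P_\lambda t^{\otimes n} \neq 0$. Choose rank-one projections $A_j$ in the vector spaces $S_{\lambda^{(j)}}(\mathbb{C}^{n_j})$ with

\[ s = \bigl( (\mathrm{id}_{[\lambda^{(1)}]} \otimes A_1) \otimes (\mathrm{id}_{[\lambda^{(2)}]} \otimes A_2) \otimes (\mathrm{id}_{[\lambda^{(3)}]} \otimes A_3) \bigr) P_\lambda t^{\otimes n} \neq 0. \]

The tensor $s$ is invariant under $S_n$ acting diagonally on $(\mathbb{C}^{n_1})^{\otimes n} \otimes (\mathbb{C}^{n_2})^{\otimes n} \otimes (\mathbb{C}^{n_3})^{\otimes n}$. Thus the marginal spectra $\operatorname{spec} \rho_i^s$ are uniform. This implies $s$ is semistable. From [BCC+17, Theorem 4.6] follows that $\mathrm{SR}(s)$ equals $\min_{i \in [3]} \dim[\lambda^{(i)}]$.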

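Lemma 6.29. $\liminf_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n} \geq \min_{\theta \in \Theta} F^\theta(t)$.

Proof. Let $\varepsilon > 0$. For $n$ large enough, choose $\lambda/n \in R(t)$ as in Lemma 6.27. By Lemma 6.28, $\mathrm{SR}(t^{\otimes n}) \geq \min_{i \in [3]} \dim[\lambda^{(i)}]$. The right-hand side we lower bound by

\[ \min_{i \in [3]} \dim[\lambda^{(i)}] \geq \min_{i \in [3]} 2^{n H(\lambda^{(i)}/n)} 2^{-o(n)} \geq 2^{n(\min_{\theta \in \Theta} E^\theta(t) - \varepsilon)} 2^{-o(n)}. \]

Then $\liminf_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n} \geq 2^{\min_{\theta \in \Theta} E^\theta(t) - \varepsilon}$.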

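Lemma 6.30. $\limsup_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n} \leq F^\theta(t)$.

Proof. Let $n \in \mathbb{N}$. Define $s_1, s_2, s_3 \in (\mathbb{C}^{n_1})^{\otimes n} \otimes (\mathbb{C}^{n_2})^{\otimes n} \otimes (\mathbb{C}^{n_3})^{\otimes n}$ by

\[
\begin{aligned}
s_1 &= \Bigl( \sum_{\substack{\lambda^{(1)} \vdash n \\ H(\lambda^{(1)}/n) \leq E^\theta(t)}} P_{\lambda^{(1)}} \otimes \mathrm{Id} \otimes \mathrm{Id} \Bigr) t^{\otimes n}, \\
s_2 &= \Bigl( \sum_{\substack{\lambda^{(2)} \vdash n \\ H(\lambda^{(2)}/n) \leq E^\theta(t)}} \mathrm{Id} \otimes P_{\lambda^{(2)}} \otimes \mathrm{Id} \Bigr) (t^{\otimes n} - s_1), \\
s_3 &= \Bigl( \sum_{\substack{\lambda^{(3)} \vdash n \\ H(\lambda^{(3)}/n) \leq E^\theta(t)}} \mathrm{Id} \otimes \mathrm{Id} \otimes P_{\lambda^{(3)}} \Bigr) (t^{\otimes n} - s_1 - s_2).
\end{aligned}
\]

Then $t^{\otimes n} = s_1 + s_2 + s_3$. The slice rank of an element in the image of $P_{\lambda^{(1)}} \otimes \mathrm{Id} \otimes \mathrm{Id}$ is at most $\dim([\lambda^{(1)}] \otimes S_{\lambda^{(1)}}(\mathbb{C}^{n_1}))$, which is at most $2^{n H(\lambda^{(1)}/n) + o(n)}$ (Section 6.2). Similarly for $\mathrm{Id} \otimes P_{\lambda^{(2)}} \otimes \mathrm{Id}$ and $\mathrm{Id} \otimes \mathrm{Id} \otimes P_{\lambda^{(3)}}$. The tensor $s_1$ is in the image of the sum $\sum_{\lambda^{(1)}} P_{\lambda^{(1)}} \otimes \mathrm{Id} \otimes \mathrm{Id}$ over $\lambda^{(1)} \vdash n$ with at most $n_1$ parts. There are at most $(n+1)^{n_1}$ such partitions. Thus $\mathrm{SR}(s_1) \leq (n+1)^{n_1} 2^{n E^\theta(t) + o(n)}$. Similarly for $s_2$ and $s_3$. Therefore

\[ \limsup_{n \to \infty} \mathrm{SR}(t^{\otimes n})^{1/n} \leq \limsup_{n \to \infty} \bigl( 3 (n+1)^{\max_{i \in [3]} n_i} \, 2^{n E^\theta(t) + o(n)} \bigr)^{1/n}. \tag{6.6} \]

The right-hand side of (6.6) equals $F^\theta(t)$.

Proof of Theorem 6.26. Lemma 6.29 and Lemma 6.30 together prove Theorem 6.26.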

6.12 Conclusion

In this chapter we constructed the first infinite family of spectral points for 3-tensors over $\mathbb{C}$, the quantum functionals. For 30 years the only explicit spectral points known were the gauge points. The constructions in this chapter naturally generalise to higher-order tensors, for which we refer to our paper [CVZ18]. We do not know whether the quantum functionals are all spectral points for 3-tensors over $\mathbb{C}$. Finally, we showed that for complex tensors the asymptotic slice rank exists and equals the minimum value over the quantum functionals.

Chapter 7

Algebraic branching programs: approximation and nondeterminism

This chapter is based on joint work with Karl Bringmann and Christian Ikenmeyer [BIZ17].

7.1 Introduction

The study of asymptotic tensor rank in previous chapters was originally motivated by the study of the complexity of matrix multiplication in the algebraic circuit model, an algebraic model of computation. In this chapter we will study several other algebraic models of computation and algebraic complexity classes.

Formulas, the class $\mathrm{VP}_e$ and the determinant

An (arithmetic) formula is a rooted binary tree whose leaves are each labeled with a variable or a field constant, and whose root and intermediate vertices are labeled with either $+$ (addition) or $\times$ (multiplication). In the natural way, via recursion over the tree structure, a formula computes a multivariate polynomial $f$. The formula size of a multivariate polynomial $f$ is the smallest number of vertices required for any formula to compute $f$. Here is an example of a formula of size 7 computing the polynomial $(3 + x)(3 + y)$:

    3   x   3   y
     \ /     \ /
      +       +
       \     /
          ×
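The recursive definitions of formula size and of the computed polynomial can be made concrete. The sketch below is illustrative: the tuple encoding of a formula and the helper names `leaf`, `add`, `mul`, `size` and `evaluate` are assumptions of this example, and the polynomial is evaluated at a point rather than expanded symbolically.

```python
def leaf(label):
    """A leaf labeled with a variable name (str) or a constant (number)."""
    return ("leaf", label)


def add(left, right):
    return ("+", left, right)


def mul(left, right):
    return ("*", left, right)


def size(node):
    """Number of vertices of the formula tree."""
    if node[0] == "leaf":
        return 1
    return 1 + size(node[1]) + size(node[2])


def evaluate(node, point):
    """Value of the computed polynomial at `point`, a dict variable -> value."""
    if node[0] == "leaf":
        label = node[1]
        return point[label] if isinstance(label, str) else label
    left, right = evaluate(node[1], point), evaluate(node[2], point)
    return left + right if node[0] == "+" else left * right


# The formula for (3 + x)(3 + y) drawn above: 4 leaves, 2 additions, 1 product.
example = mul(add(leaf(3), leaf("x")), add(leaf(3), leaf("y")))
print(size(example))
print(evaluate(example, {"x": 1, "y": 2}))
```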

A sequence of multivariate polynomials $(f_n)_{n \in \mathbb{N}}$ is called a family. Valiant in his seminal paper [Val79] introduced the complexity class $\mathrm{VP}_e$, which is defined as the set of all families whose formula size is polynomially bounded. (We say a sequence $(a_n)_n \in \mathbb{N}^{\mathbb{N}}$ of natural numbers is polynomially bounded if there exists a univariate polynomial $q$ such that $a_n \leq q(n)$ for all $n$.) For example, the family $((x_1)^n + (x_2)^n + \cdots + (x_n)^n)_n$ is in $\mathrm{VP}_e$, because the formula size of this family grows quadratically.

The smallest known formulas for the determinant family $\det_n$ have size $n^{O(\log n)}$. This follows from Berkowitz' algorithm [Ber84], which gives an algebraic circuit of depth $O(\log^2 n)$, and thus by expanding we get an algebraic formula of depth $O(\log^2 n)$, whose size is then trivially bounded by $2^{O(\log^2 n)} = n^{O(\log n)}$. It is a major open question in algebraic complexity theory whether formulas of polynomially bounded size exist for $\det_n$. This question can be phrased in terms of complexity classes as asking whether or not the inclusion $\mathrm{VP}_e \subseteq \mathrm{VP}_s$ is strict. (We will define $\mathrm{VP}_s$ shortly.)

Motivated by this question, we study the closure class $\overline{\mathrm{VP}_e}$ of families of polynomials that can be approximated arbitrarily closely by families in $\mathrm{VP}_e$ (see Section 7.2.4 for the formal definition). Over the field $\mathbb{R}$ or $\mathbb{C}$ one can think of $\overline{\mathrm{VP}_e}$ as the set of families whose border formula size is polynomially bounded. The border formula size of a polynomial $f$ is the smallest number $c$ such that there exists a sequence $g_i$ of polynomials with formula size at most $c$ and $\lim_{i \to \infty} g_i = f$.

Continuous lower bounds

In algebraic complexity theory, problem instances correspond to vectors $v \in \mathbb{F}^n$. A complexity lower bound often takes the form of a function $f : \mathbb{F}^n \to \mathbb{F}$ that is zero on the vectors of "low complexity" and nonzero on $v$. We refer to Grochow [Gro13] for a discussion of settings where complexity lower bounds are obtained in this way (e.g. [NW97, Raz09, LO15, GKKS13, LMR13, BI13]). Over the complex numbers we can in fact assume that these functions $f$ are continuous [Gro13] (and even so-called highest-weight vector polynomials). If $C$ and $D$ are algebraic complexity classes with $C \subseteq D$ (for example $C = \mathrm{VP}_e$ and $D = \mathrm{VP}_s$), then a proof of separation $D \not\subseteq C$ in this continuous manner implies the stronger separation $D \not\subseteq \overline{C}$. In our case it is thus natural to aim for the separation $\mathrm{VP}_s \not\subseteq \overline{\mathrm{VP}_e}$ instead of the slightly weaker $\mathrm{VP}_s \not\subseteq \mathrm{VP}_e$, which provides further motivation for studying $\overline{\mathrm{VP}_e}$. This is exactly analogous to the geometric complexity theory approach of Mulmuley and Sohoni (see e.g. [MS01, MS08] and the exposition [BLMW11, Sec. 9]), which aims to prove the separation $\mathrm{VNP} \not\subseteq \overline{\mathrm{VP}_s}$ to attack Valiant's famous conjecture $\mathrm{VP}_s \neq \mathrm{VNP}$ [Val79]. (Here $\mathrm{VNP}$ is the class of p-definable families, see Section 7.2.4.)

New results in this chapter

We prove two new results in this chapter.

Algebraic branching programs of width 2. An algebraic branching program (abp) is a directed acyclic graph with a source vertex $s$ and a sink vertex $t$ that has affine linear forms over the base field $\mathbb{F}$ as edge labels. Moreover, we require that each vertex is labeled with an integer (its layer) and that edges in the abp only point from vertices in layer $i$ to vertices in layer $i + 1$. The width of an abp is the cardinality of its largest layer. The size of an abp is the number of its vertices. The value of an abp is the sum of the values of all $s$–$t$-paths, where the value of an $s$–$t$-path is the product of its edge labels. We say that an abp computes its value. The class $\mathrm{VP}_s$ coincides with the class of families of polynomials that can be computed by abps of polynomially bounded size, see e.g. [Sap16].

For $k \in \mathbb{N}$ we introduce the class $\mathrm{VP}_k$ as the class of families of polynomials computable by width-$k$ abps of polynomially bounded size. It is well-known (see Lemma 7.2) that $\mathrm{VP}_k \subseteq \mathrm{VP}_e$ for all $k \geq 1$. In 1992 Ben-Or and Cleve [BOC92] showed that $\mathrm{VP}_k = \mathrm{VP}_e$ for all $k \geq 3$. In 2011 Allender and Wang [AW16] showed that width-2 abps cannot compute every polynomial, so in particular we have a strict inclusion $\mathrm{VP}_2 \subsetneq \mathrm{VP}_3$.

We prove that the closure of $\mathrm{VP}_2$ and the closure of $\mathrm{VP}_e$ are equal,

\[ \overline{\mathrm{VP}_2} = \overline{\mathrm{VP}_e}, \tag{7.1} \]

when $\operatorname{char}(\mathbb{F}) \neq 2$. From (7.1) and the result of Allender and Wang follows directly that the inclusion $\mathrm{VP}_2 \subsetneq \overline{\mathrm{VP}_2}$ is strict. We have thus separated a complexity class from its approximation closure.

VNP via affine linear forms. Every algebraic complexity class has a nondeterministic closure (see Section 7.2.5 for the definition). The nondeterministic closure of $\mathrm{VP}$ is called $\mathrm{VNP}$, and the nondeterministic closure of $\mathrm{VP}_e$ is called $\mathrm{VNP}_e$. In 1980 Valiant [Val80] proved $\mathrm{VNP}_e = \mathrm{VNP}$. The nondeterministic closures of $\mathrm{VP}_1$ and $\mathrm{VP}_2$ we call $\mathrm{VNP}_1$ and $\mathrm{VNP}_2$. Using interpolation techniques we can deduce $\mathrm{VNP}_2 = \mathrm{VNP}$ from (7.1), provided the field is infinite. Using more sophisticated techniques we prove

\[ \mathrm{VNP}_1 = \mathrm{VNP}. \tag{7.2} \]

From (7.2) easily follows $\mathrm{VP}_1 \subsetneq \mathrm{VNP}_1$. Also, from [AW16] we get $\mathrm{VP}_2 \subsetneq \mathrm{VNP}_2$. We have thus separated complexity classes from their nondeterministic closures.

Further related work

An excellent exposition on the history of small-width computation can be found in [AW16], along with an explicit polynomial that cannot be computed by width-2 abps, namely $x_1x_2 + x_3x_4 + \cdots + x_{15}x_{16}$. Saha, Saptharishi and Saxena in [SSS09, Cor. 14] showed that $x_1x_2 + x_3x_4 + x_5x_6$ cannot be computed by width-2 abps that correspond to the iterated matrix multiplication of upper triangular matrices. Bürgisser in [Bur04] studied approximations in the model of general algebraic circuits, finding general upper bounds on the error degree. For most algebraic complexity classes $C$ the relation between $C$ and $\overline{C}$ has not been an active object of study. As pointed out recently by Forbes [For16], Nisan's result [Nis91] implies that $C = \overline{C}$ for $C$ being the class of size-$k$ algebraic branching programs on noncommuting variables. A structured study of $\overline{\mathrm{VP}}$ and $\overline{\mathrm{VP}_s}$ was started in [GMQ16]. Much work on lower bounds for algebraic approximation algorithms has been done in the area of bilinear complexity, dating back to [BCRL79, Str83, Lic84] and more recently e.g. [Lan06, LO15, HIL13, Zui17, LM16a].

This chapter is organised as follows. In Section 7.2 we discuss definitions and basic results. In Section 7.3 we prove that the approximation closure of $\mathrm{VP}_2$ equals the approximation closure of $\mathrm{VP}_e$, i.e. $\overline{\mathrm{VP}_2} = \overline{\mathrm{VP}_e}$. In Section 7.4 we prove that the nondeterminism closure of $\mathrm{VP}_1$ equals $\mathrm{VNP}$.

7.2 Definitions and basic results

We briefly recall the definition of circuits, formulas and branching programs, and we recall the definition of the corresponding complexity classes. Then we discuss some straightforward relationships among these classes and review the proof of a theorem by Ben-Or and Cleve, which inspired our work. Finally, we discuss the approximation closure and the nondeterminism closure for algebraic complexity classes.

7.2.1 Computational models

Let $x_1, x_2, \ldots$ be formal variables. By $\mathbb{F}[x]$ we mean the ring of polynomials over $\mathbb{F}$ with variables $x_1, x_2, \ldots, x_k$ with $k$ large enough.

A circuit is a directed acyclic graph $G$ with one or more source vertices and one sink vertex. Each source vertex is labelled by a variable $x_i$ or a constant $c \in \mathbb{F}$. The other vertices are labelled by either $+$ or $\times$ and have in-degree 2 (that is, fan-in 2). Each vertex computes an element in $\mathbb{F}[x]$ by recursion over the graph. The element computed by the sink is the element computed by the circuit. The size of a circuit is the number of vertices.

A formula is a circuit whose graph is a tree.

An algebraic branching program (abp) is a directed acyclic graph with a source vertex $s$ and a sink vertex $t$ that has affine linear forms $\alpha x_i + \beta$, $\alpha, \beta \in \mathbb{F}$, as edge labels. Moreover, we require that each vertex is labeled with an integer (its layer) and that edges in the abp only point from vertices in layer $i$ to vertices in layer $i + 1$. The width of an abp is the cardinality of its largest layer. The size of an abp is the number of its vertices. The value of an abp is the sum of the values of all $s$–$t$-paths, where the value of an $s$–$t$-path is the product of its edge labels. We say that an abp computes its value.

For example, the following abp has depth 5, width 3, and computes the polynomial $x_1x_2 + x_2 + 2x_1 - 1$.

[Figure: a width-3 abp with edge labels $x_1$, $2$, $x_1$, $x_2$, $-1$.]

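An abp $G$ corresponds naturally to an iterated product of matrices: for any two consecutive layers $L_i, L_{i+1}$ in $G$, let $M_i$ be the matrix $(e_{v,w})_{v \in L_i, w \in L_{i+1}}$ with $e_{v,w}$ the label of the edge from $v$ to $w$ (or 0 if there is no edge from $v$ to $w$). Then the value of $G$ equals the product $M_k \cdots M_2 M_1$.

For example, the above abp corresponds to the following iterated matrix product:

\[
\begin{pmatrix} 1 & 1 & 1 \end{pmatrix}
\begin{pmatrix} -1 & 0 & 0 \\ 0 & x_2 & 0 \\ 0 & 0 & x_1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ x_1 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix}
\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}.
\]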
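The correspondence between path sums and iterated matrix products can be checked numerically on this example. The following is a minimal sketch with illustrative helper names (`example_matrices`, `value_by_matrix_product`, `value_by_path_sum`); it evaluates the edge labels at integer points instead of working with polynomials, and it multiplies the four factors in the order in which they are written in the example, treating the row index of each factor as the layer closer to $s$.

```python
from itertools import product


def example_matrices(x1, x2):
    """The four factors of the example product, evaluated at the point (x1, x2)."""
    return [
        [[1, 1, 1]],                            # edges leaving the source s
        [[-1, 0, 0], [0, x2, 0], [0, 0, x1]],
        [[1, 0, 0], [x1, 1, 0], [0, 0, 2]],
        [[1], [1], [1]],                        # edges entering the sink t
    ]


def matmul(A, B):
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def value_by_matrix_product(mats):
    result = mats[0]
    for M in mats[1:]:
        result = matmul(result, M)
    return result[0][0]


def value_by_path_sum(mats):
    """Sum over all s-t paths of the product of the edge labels on the path."""
    widths = [len(M[0]) for M in mats[:-1]]     # sizes of the intermediate layers
    total = 0
    for inner in product(*[range(w) for w in widths]):
        vertices = (0,) + inner + (0,)
        weight = 1
        for M, v, w in zip(mats, vertices, vertices[1:]):
            weight *= M[v][w]
        total += weight
    return total


x1, x2 = 2, 5
mats = example_matrices(x1, x2)
print(value_by_matrix_product(mats), value_by_path_sum(mats), x1 * x2 + x2 + 2 * x1 - 1)
```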

7.2.2 Complexity classes $\mathrm{VP}$, $\mathrm{VP}_e$, $\mathrm{VP}_k$

The circuit size of a polynomial $f$ is the size of the smallest circuit computing $f$. The formula size of a polynomial $f$ is the size of the smallest formula computing $f$.

A family is a sequence $(f_n)_{n \in \mathbb{N}}$ of multivariate polynomials over $\mathbb{F}$. A class is a set of families. The class $\mathrm{VP}$ consists of all families $(f_n)$ with circuit size, degree and number of variables in $\mathrm{poly}(n)$. The class $\mathrm{VP}_e$ consists of all families $(f_n)$ with formula size in $\mathrm{poly}(n)$. (The origin of the subscript $e$ in $\mathrm{VP}_e$ is the term "arithmetic expression".) Clearly $\mathrm{VP}_e \subseteq \mathrm{VP}$.

We introduce classes defined by abps. Let $k \geq 1$. The class $\mathrm{VP}_k$ consists of all families computed by polynomial-size width-$k$ abps with edges labelled by affine linear forms $\sum_i \alpha_i x_i + \beta$ with coefficients $\alpha_i, \beta \in \mathbb{F}$.

We note that the above classes depend on the choice of the ground field $\mathbb{F}$.

In our paper [BIZ17] we make a distinction between three different types of edge labels for abps. The class $\mathrm{VP}_k$ in this chapter corresponds to the class $\mathrm{VP}_k^{g}$ in [BIZ17].


7.2.3 The theorem of Ben-Or and Cleve

This subsection is about the relations among $\mathrm{VP}_k$ and $\mathrm{VP}_e$.

Lemma 7.1. $\mathrm{VP}_k \subseteq \mathrm{VP}_\ell$ when $k \leq \ell$.

Proof. This is clearly true.

Lemma 7.2. $\mathrm{VP}_k \subseteq \mathrm{VP}_e$ for any $k$.

Proof. For the simple proof we refer to [BIZ17].

Ben-Or and Cleve [BOC92] showed that for $k \geq 3$ the classes $\mathrm{VP}_k$ and $\mathrm{VP}_e$ are in fact equal.

Theorem 7.3 (Ben-Or and Cleve [BOC92]). For $k \geq 3$, $\mathrm{VP}_k = \mathrm{VP}_e$.

We will review the construction of Ben-Or and Cleve here, because we will use it to prove Theorem 7.8 and Theorem 7.15. The following depth-reduction lemma for formulas by Brent is a crucial ingredient.

Lemma 7.4 (Brent [Bre74]). Let $f$ be an $n$-variate degree-$d$ polynomial computed by a formula of size $s$. Then $f$ can also be computed by a formula of size $\mathrm{poly}(s, n, d)$ and depth $O(\log s)$.

Proof. See the survey of Saptharishi [Sap16, Lemma 5.5] for a modern proof.

Proof of Theorem 7.3. Lemma 7.2 says $\mathrm{VP}_k \subseteq \mathrm{VP}_e$. We will prove the inclusion $\mathrm{VP}_e \subseteq \mathrm{VP}_3$, from which follows $\mathrm{VP}_e \subseteq \mathrm{VP}_k$ by Lemma 7.1, and thus $\mathrm{VP}_k = \mathrm{VP}_e$. For a polynomial $h$ define the matrix

\[ M(h) = \begin{pmatrix} 1 & 0 & 0 \\ h & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \]

which as part of an abp looks like

[Figure: partial width-3 abp for $M(h)$, with one edge labelled $h$.]

We call the following matrices primitive:

• $M(h)$ with $h$ any variable or any constant in $\mathbb{F}$,

• the $3 \times 3$ permutation matrices, denoted by $M_\pi$ with $\pi \in S_3$,

• the diagonal matrices $M_{a,b,c} = \operatorname{diag}(a, b, c)$ with $a, b, c \in \mathbb{F}$.

The entries of the primitives are variables or constants in $\mathbb{F}$, making them suitable to use in the construction of a width-3 abp.

Let $(f_n) \in \mathrm{VP}_e$. Then $f_n$ can be computed by a formula of size $s(n) \in \mathrm{poly}(n)$. By Brent's depth-reduction theorem for formulas (Lemma 7.4), $f_n$ can be computed by a formula of size $\mathrm{poly}(n)$ and depth $d(n) \in O(\log s(n))$.

We will construct a sequence of primitives $A_1, \ldots, A_{m(n)}$ such that

\[ A_1 \cdots A_{m(n)} = \begin{pmatrix} 1 & 0 & 0 \\ f_n & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \]

with $m(n) \in O(4^{d(n)}) = \mathrm{poly}(n)$. Then

\[ f_n(x) = \begin{pmatrix} 1 & 1 & 1 \end{pmatrix} M_{-1,1,0} \, A_1 \cdots A_{m(n)} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \]

so $f_n(x)$ can be computed by a width-3 abp of length $\mathrm{poly}(n)$, proving the theorem.

To explain the construction, let $h$ be a polynomial and consider a formula computing $h$ of depth $d$. The goal is to construct (recursively on the formula structure) primitives $A_1, \ldots, A_m$ such that

\[ A_1 \cdots A_m = \begin{pmatrix} 1 & 0 & 0 \\ h & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \]

with $m \in O(4^d)$.

Suppose $h$ is a variable or a constant. Then $M(h)$ is itself a primitive matrix.

Suppose $h = f + g$ is a sum of two polynomials $f, g$, and suppose $M(f)$ and $M(g)$ can be written as a product of primitives. Then $M(f + g)$ equals a product of primitives, because $M(f + g) = M(f)M(g)$. This can easily be verified directly, or by noting that in the corresponding partial abps the top-bottom paths ($u_i$-$v_j$ paths) have the same value:

[Figure: the partial abp from $u_1, u_2, u_3$ to $v_1, v_2, v_3$ with edge labels $f$ and $g$ is equivalent ($\sim$) to the partial abp with the single edge label $f + g$.]

Suppose $h = fg$ is a product of two polynomials $f, g$, and suppose $M(f)$ and $M(g)$ can be written as a product of primitives. Then $M(fg)$ equals a product of primitives, because

\[ M(f \cdot g) = M_{(23)} \bigl( M_{1,-1,1} \, M_{(123)} \, M(g) \, M_{(132)} \, M(f) \bigr)^2 M_{(23)} \]

(here $(23) \in S_3$ denotes the transposition $1 \mapsto 1$, $2 \mapsto 3$, $3 \mapsto 2$, and $(123) \in S_3$ denotes the cyclic shift $1 \mapsto 2$, $2 \mapsto 3$, $3 \mapsto 1$), as can be verified either directly or by checking that in the corresponding partial abps the top-bottom paths ($u_i$-$v_j$ paths) have the same value:

[Figure: the partial abp from $u_1, u_2, u_3$ to $v_1, v_2, v_3$ with edge labels $f$, $-1$, $g$, $f$, $g$, $-1$ is equivalent ($\sim$) to the partial abp with the single edge label $f \cdot g$.]

This completes the construction.

The length $m$ of the construction is $m(h) = 1$ for $h$ a variable or constant, and recursively $m(f + g) = m(f) + m(g)$, $m(f \cdot g) = 2(m(f) + m(g))$, so $m \in O(4^d)$, where $d$ is the depth of the formula computing $h$.
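The two matrix identities behind this construction can be verified mechanically. The sketch below is illustrative and substitutes integers for the polynomials $f$ and $g$. It assumes one particular convention for permutation matrices, namely that $M_\pi$ sends the $i$-th standard basis vector to the $\pi(i)$-th one; this convention is an assumption of the sketch, chosen because the product identity holds under it. All helper names are hypothetical.

```python
def matmul(A, B):
    return [
        [sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]


def chain(*mats):
    """Product of the given 3x3 matrices, in the order written."""
    result = mats[0]
    for M in mats[1:]:
        result = matmul(result, M)
    return result


def M(h):
    return [[1, 0, 0], [h, 1, 0], [0, 0, 1]]


def perm(pi):
    """Permutation matrix sending basis vector e_i to e_{pi(i)} (assumed convention)."""
    P = [[0] * 3 for _ in range(3)]
    for i, j in pi.items():
        P[j - 1][i - 1] = 1
    return P


def diag(a, b, c):
    return [[a, 0, 0], [0, b, 0], [0, 0, c]]


def M_times(f, g):
    """Right-hand side of the product identity for M(f * g)."""
    t23 = perm({1: 1, 2: 3, 3: 2})
    c123 = perm({1: 2, 2: 3, 3: 1})
    c132 = perm({1: 3, 3: 2, 2: 1})
    inner = chain(diag(1, -1, 1), c123, M(g), c132, M(f))
    return chain(t23, inner, inner, t23)


def read_out(A):
    """(1 1 1) * diag(-1, 1, 0) * A * (1 1 1)^T, the final read-out of the abp."""
    B = matmul(diag(-1, 1, 0), A)
    return sum(sum(row) for row in B)


print(chain(M(3), M(5)) == M(8))
print(M_times(3, 5) == M(15))
print(read_out(M(7)))
```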

The above result of Ben-Or and Cleve (Theorem 7.3) raises the intriguing question whether the inclusion $\mathrm{VP}_2 \subseteq \mathrm{VP}_e$ is strict. Allender and Wang [AW16] show that the inclusion is indeed strict; in fact, they show that some polynomials cannot be computed by any width-2 abp.

Theorem 7.5 (Allender and Wang [AW16]). The polynomial

\[ x_1x_2 + x_3x_4 + \cdots + x_{15}x_{16} \]

cannot be computed by any width-2 abp. Therefore, we have the separation of classes $\mathrm{VP}_2 \subsetneq \mathrm{VP}_3 = \mathrm{VP}_e$.


7.2.4 Approximation closure $\overline{C}$

We define the norm of a complex multivariate polynomial as the sum of the absolute values of its coefficients. This defines a topology on the polynomial ring $\mathbb{C}[x_1, \ldots, x_m]$. Given a complexity measure $L$, say abp size or formula size, there is a natural notion of approximate complexity that is called border complexity. Namely, a polynomial $f \in \mathbb{C}[x]$ has border complexity $L^{\mathrm{top}}$ at most $c$ if there is a sequence of polynomials $g_1, g_2, \ldots$ in $\mathbb{C}[x]$ converging to $f$ such that each $g_i$ satisfies $L(g_i) \leq c$. It turns out that for reasonable classes over the field of complex numbers $\mathbb{C}$, this topological notion of approximation is equivalent to what we call algebraic approximation (see e.g. [Bur04]). Namely, a polynomial $f \in \mathbb{C}[x]$ satisfies $L^{\mathrm{alg}}(f) \leq c$ iff there are polynomials $f_1, \ldots, f_e \in \mathbb{C}[x]$ such that the polynomial

\[ h = f + \varepsilon f_1 + \varepsilon^2 f_2 + \cdots + \varepsilon^e f_e \in \mathbb{C}[\varepsilon, x] \]

has complexity $L_{\mathbb{C}(\varepsilon)}(h) \leq c$, where $\varepsilon$ is a formal variable and $L_{\mathbb{C}(\varepsilon)}(h)$ denotes the complexity of $h$ over the field extension $\mathbb{C}(\varepsilon)$. This algebraic notion of approximation makes sense over any base field, and we will use it in the statements and proofs of this chapter.

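Definition 7.6. Let $C(\mathbb{F})$ be a class over the field $\mathbb{F}$. We define the approximation closure $\overline{C}(\mathbb{F})$ as follows: a family $(f_n)$ over $\mathbb{F}$ is in $\overline{C}(\mathbb{F})$ if there are polynomials $f_{n,i}(x) \in \mathbb{F}[x]$ and a function $e : \mathbb{N} \to \mathbb{N}$ such that the family $(g_n)$ defined by

\[ g_n(x) = f_n(x) + \varepsilon f_{n,1}(x) + \varepsilon^2 f_{n,2}(x) + \cdots + \varepsilon^{e(n)} f_{n,e(n)}(x) \]

is in $C(\mathbb{F}(\varepsilon))$. We define the poly-approximation closure $\overline{C}^{\mathrm{poly}}(\mathbb{F})$ similarly, but with the additional requirement that $e(n) \in \mathrm{poly}(n)$. We call $e(n)$ the error degree.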

7.2.5 Nondeterminism closure $N(C)$

We introduce the nondeterminism closure for algebraic complexity classes.

Definition 7.7. Let $C$ be a class. The class $N(C)$ consists of families $(f_n)$ with the following property: there is a family $(g_n) \in C$ and $p(n), q(n) \in \mathrm{poly}(n)$ such that

\[ f_n(x) = \sum_{b \in \{0,1\}^{p(n)}} g_{q(n)}(b, x), \]

where $x$ and $b$ denote sequences of variables $x_1, x_2, \ldots$ and $b_1, b_2, \ldots, b_{p(n)}$. We say that $f(x)$ is a hypercube sum over $g$ and that $b_1, b_2, \ldots, b_{p(n)}$ are the hypercube variables. For any subscript $x$ we will use the notation $\mathrm{VNP}_x$ to denote $N(\mathrm{VP}_x)$. We remark that the map $C \mapsto N(C)$ trivially satisfies all properties of being a Kuratowski closure operator, i.e. $N(\emptyset) = \emptyset$, $C \subseteq N(C)$, $N(C \cup D) = N(C) \cup N(D)$ and $N(N(C)) = N(C)$.
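A hypercube sum can be illustrated on a toy example. In the sketch below, `hypercube_sum` and `g_example` are hypothetical names, and `g_example` is chosen only so that the sum has a closed form: summing $\prod_i (b_i x_i + 1 - b_i)$ over all $b \in \{0,1\}^p$ gives $\prod_i (1 + x_i)$. The sketch evaluates at a point and makes no claim about the complexity class of `g_example`.

```python
from itertools import product


def hypercube_sum(g, p, x):
    """Sum of g(b, x) over all assignments b in {0, 1}^p."""
    return sum(g(b, x) for b in product((0, 1), repeat=p))


def g_example(b, x):
    """Product over i of (b_i * x_i + 1 - b_i); each factor is x_i or 1."""
    value = 1
    for bi, xi in zip(b, x):
        value *= bi * xi + (1 - bi)
    return value


x = (2, 3, 5)
# The hypercube sum should agree with (1 + 2)(1 + 3)(1 + 5).
print(hypercube_sum(g_example, len(x), x), (1 + 2) * (1 + 3) * (1 + 5))
```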


7.3 Approximation closure of $\mathrm{VP}_2$

We show that every polynomial can be approximated by a width-2 abp. Even better, we show that every polynomial can be approximated by a width-2 abp of size polynomial in the formula size and with error degree polynomial in the formula size. This is the main result of the current chapter.

Theorem 7.8. $\mathrm{VP}_e \subseteq \overline{\mathrm{VP}_2}^{\mathrm{poly}}$ when $\operatorname{char}(\mathbb{F}) \neq 2$.

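Proof. For a polynomial $h$ define the matrix $M(h) = \begin{pmatrix} 1 & 0 \\ h & 1 \end{pmatrix}$. We call the following matrices primitives:

• $M(h)$ with $h$ any variable or constant in $\mathbb{F}$,

• $\begin{pmatrix} \frac{1}{2\varepsilon} & 0 \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} -2\varepsilon & 0 \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} -1 & \varepsilon \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} -1 & -\varepsilon \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$.

The entries of the primitives are variables or constants in the base field $\mathbb{F}(\varepsilon)$, making them suitable to use in a width-2 abp over the base field $\mathbb{F}(\varepsilon)$.

Let $(f_n) \in \mathrm{VP}_e$, so $f_n(x)$ can be computed by a formula of size $s(n) \in \mathrm{poly}(n)$. By Brent's depth reduction theorem for formulas (Lemma 7.4), $f_n$ can be computed by a formula of size $\mathrm{poly}(n)$ and depth $d(n) \in O(\log s(n))$.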

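We will construct a sequence of primitives $A_1, \ldots, A_{m(n)}$ such that

\[
A_1 \cdots A_{m(n)} =
\begin{pmatrix} 1 & 0 \\ f_n & 1 \end{pmatrix}
+ \varepsilon \begin{pmatrix} f_{n,1,1,1} & f_{n,1,1,2} \\ f_{n,1,2,1} & f_{n,1,2,2} \end{pmatrix}
+ \varepsilon^2 \begin{pmatrix} f_{n,2,1,1} & f_{n,2,1,2} \\ f_{n,2,2,1} & f_{n,2,2,2} \end{pmatrix}
+ \cdots
+ \varepsilon^e \begin{pmatrix} f_{n,e,1,1} & f_{n,e,1,2} \\ f_{n,e,2,1} & f_{n,e,2,2} \end{pmatrix}
\]

for some $f_{n,i,j,k} \in \mathbb{F}[x]$, with $m(n), e(n) \in O(8^{d(n)}) = \mathrm{poly}(n)$. Then

\[ \begin{pmatrix} 1 & 1 \end{pmatrix} \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} A_1 \cdots A_{m(n)} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = f_n(x) + O(\varepsilon), \]

so $f_n(x)$ can be approximated by a width-2 abp of length $\mathrm{poly}(n)$ and with error degree $\mathrm{poly}(n)$, proving the theorem.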

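We begin with the construction. Let $h$ be a polynomial and consider a formula computing $h$ of depth $d$. The goal is to construct, recursively on the tree structure of the formula, a sequence of primitives $A_1, \ldots, A_m$ such that for some $h_{i,j,k} \in \mathbb{F}[x]$

\[
A_1 \cdots A_m =
\begin{pmatrix} 1 & 0 \\ h & 1 \end{pmatrix}
+ \varepsilon \begin{pmatrix} 0 & 0 \\ h_{1,2,1} & 0 \end{pmatrix}
+ \varepsilon^2 \begin{pmatrix} h_{2,1,1} & h_{2,1,2} \\ h_{2,2,1} & h_{2,2,2} \end{pmatrix}
+ \cdots
+ \varepsilon^e \begin{pmatrix} h_{e,1,1} & h_{e,1,2} \\ h_{e,2,1} & h_{e,2,2} \end{pmatrix}
\tag{7.3}
\]

with $m, e \in O(8^d)$. Notice the particular first-degree error pattern in (7.3), which our recursion will rely on.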

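Suppose $h$ is a variable or a constant. Then $M(h)$ is itself a primitive satisfying (7.3).

Suppose $h = f + g$ is a sum of two polynomials $f, g$, and suppose that

\[ F = \begin{pmatrix} 1 & 0 \\ f & 1 \end{pmatrix} + \varepsilon \begin{pmatrix} 0 & 0 \\ f' & 0 \end{pmatrix} + O(\varepsilon^2), \tag{7.4} \]

\[ G = \begin{pmatrix} 1 & 0 \\ g & 1 \end{pmatrix} + \varepsilon \begin{pmatrix} 0 & 0 \\ g' & 0 \end{pmatrix} + O(\varepsilon^2) \tag{7.5} \]

are products of primitives for some $f', g' \in \mathbb{F}[x]$. Then

\[ G \cdot F = \begin{pmatrix} 1 & 0 \\ f + g & 1 \end{pmatrix} + \varepsilon \begin{pmatrix} 0 & 0 \\ f' + g' & 0 \end{pmatrix} + O(\varepsilon^2) \]

is a product of primitives satisfying (7.3).

Suppose $h = fg$ is a product of two polynomials, and suppose that $F$ and $G$ are of the form (7.4) and (7.5) and are products of primitives. We will construct $M((f + g)^2)$, $M(-f^2)$, $M(-g^2)$ approximately, in such a way that when we use the identity $(f + g)^2 - f^2 - g^2 = 2fg$, the error terms cancel properly. Define the expressions $\mathrm{sq}_+(A)$ and $\mathrm{sq}_-(A)$ by

\[ \mathrm{sq}_\pm(A) = \begin{pmatrix} -\varepsilon & 0 \\ 0 & 1 \end{pmatrix} \cdot A \cdot \begin{pmatrix} -1 & \pm\varepsilon \\ 0 & 1 \end{pmatrix} \cdot A \cdot \begin{pmatrix} \frac{1}{\varepsilon} & 0 \\ 0 & 1 \end{pmatrix}. \]

Then

\[ \mathrm{sq}_\pm(F) = \begin{pmatrix} 1 \mp \varepsilon f & 0 \\ \pm f^2 + O(\varepsilon) & 1 \pm \varepsilon f \end{pmatrix} + O(\varepsilon^2). \]

We have

\[
\mathrm{sq}_-(G) \cdot \mathrm{sq}_-(F) \cdot \mathrm{sq}_+(G \cdot F)
= \begin{pmatrix} 1 + \varepsilon g & 0 \\ -g^2 + O(\varepsilon) & 1 - \varepsilon g \end{pmatrix}
\cdot \begin{pmatrix} 1 + \varepsilon f & 0 \\ -f^2 + O(\varepsilon) & 1 - \varepsilon f \end{pmatrix}
\cdot \begin{pmatrix} 1 - \varepsilon(f + g) & 0 \\ (f + g)^2 + O(\varepsilon) & 1 + \varepsilon(f + g) \end{pmatrix}
+ O(\varepsilon^2),
\]

which simplifies to

\[ \mathrm{sq}_-(G) \cdot \mathrm{sq}_-(F) \cdot \mathrm{sq}_+(G \cdot F) = \begin{pmatrix} 1 & 0 \\ 2fg + O(\varepsilon) & 1 \end{pmatrix} + O(\varepsilon^2). \]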

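We conclude

\[
\begin{aligned}
&\begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix} \cdot \mathrm{sq}_-(G) \cdot \mathrm{sq}_-(F) \cdot \mathrm{sq}_+(G \cdot F) \cdot \begin{pmatrix} \frac{1}{2} & 0 \\ 0 & 1 \end{pmatrix} \\
&\quad = \begin{pmatrix} -2\varepsilon & 0 \\ 0 & 1 \end{pmatrix} \cdot G \cdot \begin{pmatrix} -1 & -\varepsilon \\ 0 & 1 \end{pmatrix} \cdot G \cdot \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \cdot F \cdot \begin{pmatrix} -1 & -\varepsilon \\ 0 & 1 \end{pmatrix} \cdot F \\
&\qquad \cdot \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \cdot G \cdot F \cdot \begin{pmatrix} -1 & \varepsilon \\ 0 & 1 \end{pmatrix} \cdot G \cdot F \cdot \begin{pmatrix} \frac{1}{2\varepsilon} & 0 \\ 0 & 1 \end{pmatrix} \\
&\quad = \begin{pmatrix} 1 & 0 \\ fg + O(\varepsilon) & 1 \end{pmatrix} + O(\varepsilon^2).
\end{aligned}
\]

This completes the construction.

The length $m$ of the construction is $m(h) = 1$ for $h$ a variable or constant, and recursively $m(f + g) = m(f) + m(g)$, $m(f \cdot g) = 4(m(f) + m(g)) + 7$. We conclude $m \in O(8^d)$. The error degree $e$ of the construction satisfies the same recursion, so $e \in O(8^d)$.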
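The product step can be sanity-checked numerically. The sketch below is illustrative only: it substitutes rational numbers for $f$, $g$ and $\varepsilon$, takes $F = M(f)$ and $G = M(g)$ exactly (so without higher-order error terms), and multiplies out the thirteen factors of the expanded product. The helper names `D`, `U` and `approx_product` are hypothetical. The lower-left entry should approach $fg$ as $\varepsilon$ shrinks.

```python
from fractions import Fraction


def matmul(A, B):
    return [
        [sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


def chain(*mats):
    """Product of the given 2x2 matrices, in the order written."""
    result = mats[0]
    for M_ in mats[1:]:
        result = matmul(result, M_)
    return result


def M(h):
    return [[Fraction(1), Fraction(0)], [Fraction(h), Fraction(1)]]


def D(a):
    """Diagonal primitive diag(a, 1)."""
    return [[Fraction(a), Fraction(0)], [Fraction(0), Fraction(1)]]


def U(s):
    """Primitive with rows (-1, s) and (0, 1)."""
    return [[Fraction(-1), Fraction(s)], [Fraction(0), Fraction(1)]]


def approx_product(f, g, eps):
    """Expanded product from the text with F = M(f) and G = M(g)."""
    F, G = M(f), M(g)
    GF = matmul(G, F)
    return chain(
        D(-2 * eps), G, U(-eps), G, D(-1), F, U(-eps), F,
        D(-1), GF, U(eps), GF, D(1 / (2 * eps)),
    )


eps = Fraction(1, 10**6)
result = approx_product(3, 5, eps)
print(float(result[1][0]))  # close to 3 * 5 for small eps
```

Seven of the thirteen factors are fixed primitives, which is the source of the additive constant in the recursion for $m(f \cdot g)$.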

Remark 7.9. The construction in the above proof of Theorem 7.8 is different from the construction in our paper [BIZ17]. The recursion in the above proof is simpler, while the construction in [BIZ17] has a better error degree and has a special form which relates it to a family of polynomials called continuants.

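Corollary 7.10. $\overline{\mathrm{VP}_2} = \overline{\mathrm{VP}_e}$ and $\overline{\mathrm{VP}_2}^{\mathrm{poly}} = \overline{\mathrm{VP}_e}^{\mathrm{poly}}$ when $\operatorname{char}(\mathbb{F}) \neq 2$.

Proof. We have $\mathrm{VP}_2 \subseteq \mathrm{VP}_e$ by Lemma 7.2. Taking closures on both sides, we obtain $\overline{\mathrm{VP}_2} \subseteq \overline{\mathrm{VP}_e}$ and $\overline{\mathrm{VP}_2}^{\mathrm{poly}} \subseteq \overline{\mathrm{VP}_e}^{\mathrm{poly}}$. When $\operatorname{char}(\mathbb{F}) \neq 2$, $\mathrm{VP}_e \subseteq \overline{\mathrm{VP}_2}^{\mathrm{poly}}$ (Theorem 7.8). By taking closures follows $\overline{\mathrm{VP}_e} \subseteq \overline{\mathrm{VP}_2}$ and $\overline{\mathrm{VP}_e}^{\mathrm{poly}} \subseteq \overline{\mathrm{VP}_2}^{\mathrm{poly}}$.

Corollary 7.11. $\overline{\mathrm{VP}_2}^{\mathrm{poly}} = \mathrm{VP}_e$ when $\operatorname{char}(\mathbb{F}) \neq 2$ and $\mathbb{F}$ is infinite.

Proof. By Corollary 7.10, $\overline{\mathrm{VP}_2}^{\mathrm{poly}} = \overline{\mathrm{VP}_e}^{\mathrm{poly}}$. We prove $\overline{\mathrm{VP}_e}^{\mathrm{poly}} = \mathrm{VP}_e$ in Lemma 7.12 below.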

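Lemma 7.12. $\overline{\mathrm{VP}_e}^{\mathrm{poly}} = \mathrm{VP}_e$ when $\operatorname{char}(\mathbb{F}) \neq 2$ and $\mathbb{F}$ is infinite.

Proof. The inclusion $\mathrm{VP}_e \subseteq \overline{\mathrm{VP}_e}^{\mathrm{poly}}$ is trivially true. We prove the other direction. Let $(f_n) \in \overline{\mathrm{VP}_e}^{\mathrm{poly}}$. Then there are polynomials $f_{n,i}(x) \in \mathbb{F}[x]$ and $e(n) \in \mathrm{poly}(n)$ such that

\[ f_n(x) + \varepsilon f_{n,1}(x) + \varepsilon^2 f_{n,2}(x) + \cdots + \varepsilon^{e(n)} f_{n,e(n)}(x) \]

is computed by a poly-size formula $\Gamma$ over $\mathbb{F}(\varepsilon)$. Let $\alpha_0, \alpha_1, \ldots, \alpha_{e(n)}$ be distinct elements in $\mathbb{F}$ such that replacing $\varepsilon$ by $\alpha_j$ in $\Gamma$ is a valid substitution, i.e. not causing division by zero. These $\alpha_j$ exist since our field is infinite by assumption. View

\[ g_n(\varepsilon) = f_n(x) + \varepsilon f_{n,1}(x) + \varepsilon^2 f_{n,2}(x) + \cdots + \varepsilon^{e(n)} f_{n,e(n)}(x) \]

as a polynomial in $\varepsilon$. The polynomial $g_n(\varepsilon)$ has degree at most $e(n)$, so we can write $g_n(\varepsilon)$ as follows (Lagrange interpolation on $e(n) + 1$ points):

\[ g_n(\varepsilon) = \sum_{j=0}^{e(n)} g_n(\alpha_j) \prod_{\substack{0 \leq m \leq e(n) \\ m \neq j}} \frac{\varepsilon - \alpha_m}{\alpha_j - \alpha_m}. \tag{7.6} \]

Clearly $f_n(x) = g_n(0)$. However, replacing $\varepsilon$ by 0 in $\Gamma$ is not a valid substitution in general. From (7.6) we see directly how to write $g_n(0)$ as a linear combination of the values $g_n(\alpha_j)$, namely

\[ g_n(0) = \sum_{j=0}^{e(n)} g_n(\alpha_j) \prod_{\substack{0 \leq m \leq e(n) \\ m \neq j}} \frac{-\alpha_m}{\alpha_j - \alpha_m}, \]

that is,

\[ g_n(0) = \sum_{j=0}^{e(n)} \beta_j \, g_n(\alpha_j) \quad \text{with} \quad \beta_j = \prod_{\substack{0 \leq m \leq e(n) \\ m \neq j}} \frac{\alpha_m}{\alpha_m - \alpha_j}. \]

The value $g_n(\alpha_j)$ is computed by the formula $\Gamma$ with $\varepsilon$ replaced by $\alpha_j$, which we denote by $\Gamma|_{\varepsilon = \alpha_j}$. Thus $f_n(x)$ is computed by the poly-size formula $\sum_{j=0}^{e(n)} \beta_j \, \Gamma|_{\varepsilon = \alpha_j}$. We conclude $(f_n) \in \mathrm{VP}_e$.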
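The interpolation step can be illustrated with exact rational arithmetic. In the sketch below, `interpolation_weights` computes the coefficients $\beta_j$ and `value_at_zero` forms $\sum_j \beta_j g(\alpha_j)$; both names are illustrative. The example function divides by $\varepsilon$, so it cannot be evaluated at $\varepsilon = 0$ directly, which mimics a formula over $\mathbb{F}(\varepsilon)$ in which substituting 0 is not valid.

```python
from fractions import Fraction


def interpolation_weights(alphas):
    """beta_j = product over m != j of alpha_m / (alpha_m - alpha_j)."""
    weights = []
    for j, aj in enumerate(alphas):
        w = Fraction(1)
        for m, am in enumerate(alphas):
            if m != j:
                w *= Fraction(am) / (Fraction(am) - Fraction(aj))
        weights.append(w)
    return weights


def value_at_zero(g, alphas):
    """Recover g(0) from the values of g at the distinct nonzero points alphas."""
    betas = interpolation_weights(alphas)
    return sum(beta * g(Fraction(alpha)) for beta, alpha in zip(betas, alphas))


def g(eps):
    """Equals 7 + 3*eps + 5*eps^2, but is written with a division by eps."""
    return (7 * eps + 3 * eps**2 + 5 * eps**3) / eps


# Degree 2 in eps, so three interpolation points suffice.
print(value_at_zero(g, [1, 2, 3]))
```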

Remark 7.13. The statement of Lemma 7.12 also holds with $\mathrm{VP}_e$ replaced with $\mathrm{VP}_s$ or with $\mathrm{VP}$, by a similar proof.

7.4 Nondeterminism closure of $\mathrm{VP}_1$

Recall the definition of $\mathrm{VNP}_x = N(\mathrm{VP}_x)$ from Definition 7.7. Valiant proved the following characterisation of $\mathrm{VNP}$ in his seminal work [Val80]. See also [BCS97, Thm. 21.26], [Bur00, Thm. 2.13] and [MP08, Thm. 2].

Theorem 7.14 (Valiant [Val80]). $\mathrm{VNP}_e = \mathrm{VNP}$.

We strengthen Valiant's characterisation of $\mathrm{VNP}$ from $\mathrm{VNP}_e$ to $\mathrm{VNP}_1$.

Theorem 7.15. $\mathrm{VNP}_1 = \mathrm{VNP}$ when $\operatorname{char}(\mathbb{F}) \neq 2$.

120 Chapter 7 Algebraic branching programs

The idea of the proof is ldquoto simulate in VNP1rdquo the primitives that we used inthe proof of VPe sube VP3 (Theorem 73)

Proof of Theorem 7.15. Clearly $\mathrm{VNP}_1 \subseteq \mathrm{VNP}$ by Lemma 7.2 and taking the nondeterminism closure $\mathrm{N}$. We will prove that $\mathrm{VNP} \subseteq \mathrm{VNP}_1$. Recall that in the proof of $\mathrm{VP}_e \subseteq \mathrm{VP}_3$ (Theorem 7.3) we defined for any polynomial $h$ the matrix

$$M(h) = \begin{pmatrix} 1 & 0 & 0 \\ h & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

and we called the following matrices primitives:

• $M(h)$ with $h$ any variable or any constant in $\mathbb{F}$;

• the $3 \times 3$ permutation matrices, denoted by $M_\pi$ for $\pi \in S_3$;

• the diagonal matrices $M_{a,b,c} = \operatorname{diag}(a, b, c)$ with $a, b, c \in \mathbb{F}$.

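In the proof of $\mathrm{VP}_e \subseteq \mathrm{VP}_3$ we constructed, for any family $(f_n) \in \mathrm{VP}_e$, a sequence of primitive matrices $A_{n,1}, \ldots, A_{n,t(n)}$ with $t(n) \in \mathrm{poly}(n)$ such that

$$f_n(x) = \begin{pmatrix} 1 & 1 & 1 \end{pmatrix} M_{-1,1,0}\, A_1 \cdots A_m \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}. \qquad (7.7)$$

We will show $\mathrm{VP}_e \subseteq \mathrm{VNP}_1$ by constructing a hypercube sum over a width-1 abp that evaluates the right-hand side of (7.7). This implies $\mathrm{VNP}_e \subseteq \mathrm{VNP}_1$ by taking the $\mathrm{N}$-closure. Then, by Valiant’s Theorem 7.14, $\mathrm{VNP} \subseteq \mathrm{VNP}_1$.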

Let $f(x)$ be a polynomial and let $A_1, \ldots, A_k$ be primitive matrices such that $f(x)$ is computed as

$$f(x) = \begin{pmatrix} 1 & 1 & 1 \end{pmatrix} A_k \cdots A_1 \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}.$$

View this expression as a width-3 abp $G$, with vertex layers labeled as shown in the left-hand diagram in Fig. 7.1. Assume for simplicity that all edges between layers are present, possibly with label 0. The sum of the values of every $s$–$t$ path in $G$ equals $f(x)$:

$$f(x) = \sum_{j \in [3]^k} A_k[j_k, j_{k-1}] \cdots A_1[j_2, j_1]. \qquad (7.8)$$
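The path-sum view can be made concrete with a small computation. The sketch below is illustrative only: it assumes the layers are indexed $0, \ldots, k$, so that a path chooses one vertex `j[i]` per layer and the edge from layer $i-1$ to layer $i$ carries the label $A_i[j_i, j_{i-1}]$, and it specialises the variable $x$ to an integer. The function names and the example matrices are hypothetical.

```python
from itertools import product

def mat_vec_value(matrices):
    """Evaluate (1 1 1) A_k ... A_1 (1 1 1)^T for matrices = [A_1, ..., A_k]."""
    vec = [1, 1, 1]
    for A in matrices:  # A_1 is applied first, A_k last
        vec = [sum(A[r][c] * vec[c] for c in range(3)) for r in range(3)]
    return sum(vec)

def path_sum(matrices):
    """Sum the products of edge labels over all s-t paths.

    A path picks one vertex j[i] in each of the layers 0..k; the edge from
    vertex j[i-1] in layer i-1 to vertex j[i] in layer i has label
    A_i[j[i]][j[i-1]].
    """
    k = len(matrices)
    total = 0
    for j in product(range(3), repeat=k + 1):
        weight = 1
        for i in range(1, k + 1):
            weight *= matrices[i - 1][j[i]][j[i - 1]]
        total += weight
    return total

def M(h):
    return [[1, 0, 0], [h, 1, 0], [0, 0, 1]]

x = 5                                        # the variable x, specialised
swap12 = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]   # a permutation matrix
diag = [[2, 0, 0], [0, -1, 0], [0, 0, 3]]    # diag(2, -1, 3)
example = [M(x), swap12, diag, M(4)]
print(mat_vec_value(example), path_sum(example))  # the two values agree
```

Enumerating paths is exponential in $k$ and serves only to illustrate the identity; the matrix product evaluates the same quantity in time linear in $k$.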

We introduce some hypercube variables. To every vertex of $G$ except $s$ and $t$ we associate a bit; the bits in the $i$th layer we call $b_1[i], b_2[i], b_3[i]$. To an $s$–$t$ path in $G$ we associate an assignment of the $b_j[i]$ by setting the bits of vertices visited by the path to 1 and the others to 0. For example, in the right-hand diagram in Fig. 7.1 we show an $s$–$t$ path with the corresponding assignment of the bits $b_j[i]$. The assignments of the $b_j[i]$ corresponding to $s$–$t$ paths are precisely the assignments such that for every $i \in [k]$ exactly one of $b_1[i], b_2[i], b_3[i]$ equals 1.

[Figure 7.1. Left: the abp $G$ with source $s$, layers $0, 1, 2, \ldots, k-1, k$, sink $t$, and the matrices $A_1, A_2, \ldots, A_k$ between consecutive layers. Right: an $s$–$t$ path with the layer bit assignments (1 0 0), (0 1 0), (0 1 0), (0 0 1), (0 1 0).]

Figure 7.1: Illustration of the layer labelling and the path labelling used in the proof of Theorem 7.15.

Let

$$V(b_1, b_2, b_3) = \prod_{i \in [k]} \bigl(b_1[i] + b_2[i] + b_3[i]\bigr) \prod_{\substack{s,t \in [3] \\ s \neq t}} \bigl(1 - b_s[i]\, b_t[i]\bigr). \qquad (7.9)$$

Then the assignments of the $b_j[i]$ corresponding to $s$–$t$ paths are precisely the assignments such that $V(b_1, b_2, b_3) = 1$. Otherwise, $V(b_1, b_2, b_3) = 0$.
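That $V$ is the indicator of the path assignments can be confirmed by brute force. The following short sketch, with the hypothetical helper names `V` and `is_path`, enumerates every 0/1 assignment of the bits of two layers and compares $V$ with the condition that each layer has exactly one bit set.

```python
from itertools import product

def V(bits):
    """bits is a sequence of layers, each a triple (b1, b2, b3) of 0/1 values."""
    value = 1
    for layer in bits:
        value *= sum(layer)
        for s in range(3):
            for t in range(3):
                if s != t:
                    value *= 1 - layer[s] * layer[t]
    return value

def is_path(bits):
    """True if every layer has exactly one bit equal to 1."""
    return all(sum(layer) == 1 for layer in bits)

layers = 2
triples = list(product((0, 1), repeat=3))
assignments = list(product(triples, repeat=layers))
assert all(V(a) == (1 if is_path(a) else 0) for a in assignments)
print(sum(V(a) for a in assignments))  # 9, one for each of the 3^2 paths
```

The factor $b_1[i] + b_2[i] + b_3[i]$ rules out the all-zero layer, and the factors $1 - b_s[i] b_t[i]$ rule out layers with two or more bits set.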

We will write $f(x)$ as a hypercube sum by replacing each $A_i[j_i, j_{i-1}]$ in (7.8) by a product of affine linear forms $S_i(A_i)$ with variables $b$ and $x$:

$$\sum_{b} V(b_1, b_2, b_3)\, S_k(A_k) \cdots S_1(A_1).$$

Define the expression $\mathrm{Eq}(\alpha, \beta) = (1 - \alpha - \beta)(1 - \alpha - \beta)$ for $\alpha, \beta \in \{0, 1\}$. The expression $\mathrm{Eq}(\alpha, \beta)$ evaluates to 1 if $\alpha$ equals $\beta$ and evaluates to 0 otherwise.

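• For any variable or constant $x$ define

$$S_i(M(x)) = \bigl(1 + (x - 1)(b_1[i-1] - b_1[i])\bigr) \cdot \bigl(1 - (1 - b_2[i])\, b_2[i-1]\bigr) \cdot \mathrm{Eq}\bigl(b_3[i-1], b_3[i]\bigr).$$

On path assignments, the second and third factors allow only the steps $1 \to 1$, $1 \to 2$, $2 \to 2$ and $3 \to 3$ from layer $i-1$ to layer $i$, and the first factor equals $x$ on the step $1 \to 2$ and 1 on the others, matching the entries of $M(x)$.

• For any permutation $\pi \in S_3$ define

$$S_i(M_\pi) = \mathrm{Eq}\bigl(b_1[i-1], b_{\pi(1)}[i]\bigr) \cdot \mathrm{Eq}\bigl(b_2[i-1], b_{\pi(2)}[i]\bigr) \cdot \mathrm{Eq}\bigl(b_3[i-1], b_{\pi(3)}[i]\bigr).$$

• For any constants $a, b, c \in \mathbb{F}$ define

$$S_i(M_{a,b,c}) = \bigl(a \cdot b_1[i-1] + b \cdot b_2[i-1] + c \cdot b_3[i-1]\bigr) \cdot \mathrm{Eq}\bigl(b_1[i-1], b_1[i]\bigr) \cdot \mathrm{Eq}\bigl(b_2[i-1], b_2[i]\bigr) \cdot \mathrm{Eq}\bigl(b_3[i-1], b_3[i]\bigr).$$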

One verifies that

$$f(x) = \sum_{b} V(b_1, b_2, b_3)\, S_k(A_k) \cdots S_1(A_1).$$
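The verification can be carried out mechanically on a small instance. The sketch below rests on several assumptions that are not fixed by the text: the layers are indexed $0, \ldots, k$ and $V$ ranges over all of them; the first factor of $S_i(M(x))$ is taken as $1 + (x-1)(b_1[i-1] - b_1[i])$, the orientation for which the step from vertex 1 to vertex 2 picks up the weight $x$; $M_\pi$ is taken to have a 1 in row $\pi(j)$ of column $j$; and $x$ is specialised to integers. All function names are illustrative.

```python
from itertools import product

def eq(a, b):
    # Eq(a, b) = (1 - a - b)^2: equals 1 if a == b and 0 otherwise, for bits.
    return (1 - a - b) * (1 - a - b)

def V(bits):
    # 1 if every layer has exactly one bit set, 0 otherwise.
    value = 1
    for layer in bits:
        value *= sum(layer)
        for s, t in product(range(3), repeat=2):
            if s != t:
                value *= 1 - layer[s] * layer[t]
    return value

def S_M(h):
    # Primitive M(h) = [[1, 0, 0], [h, 1, 0], [0, 0, 1]].
    return lambda prev, cur: ((1 + (h - 1) * (prev[0] - cur[0]))
                              * (1 - (1 - cur[1]) * prev[1])
                              * eq(prev[2], cur[2]))

def S_perm(pi):
    # Permutation matrix with a 1 in row pi[j], column j (0-based indices).
    return lambda prev, cur: (eq(prev[0], cur[pi[0]])
                              * eq(prev[1], cur[pi[1]])
                              * eq(prev[2], cur[pi[2]]))

def S_diag(a, b, c):
    # Diagonal matrix diag(a, b, c).
    return lambda prev, cur: ((a * prev[0] + b * prev[1] + c * prev[2])
                              * eq(prev[0], cur[0])
                              * eq(prev[1], cur[1])
                              * eq(prev[2], cur[2]))

def hypercube_sum(S_list):
    # Sum over all 0/1 assignments of the 3 * (k + 1) bits of layers 0..k.
    k = len(S_list)
    triples = list(product((0, 1), repeat=3))
    total = 0
    for bits in product(triples, repeat=k + 1):
        term = V(bits)
        for i in range(1, k + 1):
            term *= S_list[i - 1](bits[i - 1], bits[i])
        total += term
    return total

def matrix_value(matrices):
    # (1 1 1) A_k ... A_1 (1 1 1)^T for matrices = [A_1, ..., A_k].
    vec = [1, 1, 1]
    for A in matrices:
        vec = [sum(A[r][c] * vec[c] for c in range(3)) for r in range(3)]
    return sum(vec)

def check(x):
    pi = (1, 0, 2)  # swaps the first two vertices
    matrices = [
        [[1, 0, 0], [x, 1, 0], [0, 0, 1]],
        [[1 if pi[c] == r else 0 for c in range(3)] for r in range(3)],
        [[2, 0, 0], [0, -1, 0], [0, 0, 3]],
    ]
    S_list = [S_M(x), S_perm(pi), S_diag(2, -1, 3)]
    return hypercube_sum(S_list), matrix_value(matrices)

for x in (-3, 0, 1, 5):
    lhs, rhs = check(x)
    assert lhs == rhs
```

For this instance both sides equal $2x + 4$. The check covers one example of each primitive type and is a sanity test, not a proof.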

Some of the factors in the expressions for the $S_i(A_i)$ are not affine linear. As a final step we apply the equality

$$1 + xy = \frac{1}{2} \sum_{c \in \{0,1\}} (x + 1 - 2c)(y + 1 - 2c)$$

to write these factors as products of affine linear forms, introducing new hypercube variables.
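The equality follows by expanding the two summands. The short derivation below is included as a sanity check; the division by 2 is where the assumption $\operatorname{char}(\mathbb{F}) \neq 2$ is used.

```latex
\begin{align*}
\frac{1}{2} \sum_{c \in \{0,1\}} (x + 1 - 2c)(y + 1 - 2c)
  &= \frac{1}{2} \Bigl[ (x + 1)(y + 1) + (x - 1)(y - 1) \Bigr] \\
  &= \frac{1}{2} \Bigl[ (xy + x + y + 1) + (xy - x - y + 1) \Bigr] \\
  &= 1 + xy .
\end{align*}
```

Each summand is a product of two affine linear forms in $x$ and $y$, and the new variable $c$ is summed over $\{0, 1\}$ like the other hypercube variables.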

7.5 Conclusion

We finish with an overview of inclusions, equalities and separations among the classes $\mathrm{VP}_k$, $\mathrm{VP}_e$, $\mathrm{VP}$ and their approximation and nondeterminism closures (when $\operatorname{char}(\mathbb{F}) \neq 2$); see Fig. 7.2. The figure relies on the following two simple lemmas, proofs of which can be found in our paper [BIZ17].

Lemma 7.16 ([BIZ17, Prop. 5.10]). $\mathrm{VP}_1 = \overline{\mathrm{VP}_1}$.

Lemma 7.17 ([BIZ17, Prop. 5.11]). $\mathrm{VP}_1 \subsetneq \mathrm{VNP}_1$ when $\operatorname{char}(\mathbb{F}) \neq 2$.

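[Figure 7.2. Diagram relating the classes $\mathrm{VP}_1$, $\mathrm{VP}_2$, $\mathrm{VP}_e$, $\mathrm{VP}$, their approximation closures, and $\mathrm{VNP}_1$, $\mathrm{VNP}_2$, $\mathrm{VNP}_e$, $\mathrm{VNP}$ by equalities, inclusions and strict inclusions, annotated with Lemma 7.16, Lemma 7.17, Corollary 7.10, Theorem 7.15, [AW16], [Val79] and [Val80].]

Figure 7.2: Overview of relations among the algebraic complexity classes $\mathrm{VP}_k$, $\mathrm{VP}_e$, $\mathrm{VP}$ and their approximation and nondeterminism closures (when $\operatorname{char}(\mathbb{F})$ is not 2). The relations without reference are either by definition or follow logically from the other relations.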

Bibliography

[AJRS13] Elizabeth S Allman Peter D Jarvis John A Rhodes andJeremy G Sumner Tensor rank invariants inequalities andapplications SIAM J Matrix Anal Appl 34(3)1014ndash1045 2013doi101137120899066 p 14

[Alo98] Noga Alon The Shannon capacity of a union Combinatorica18(3)301ndash310 1998 doi101007PL00009824 p 37

[ASU13] Noga Alon Amir Shpilka and Christopher Umans On sunflowersand matrix multiplication Comput Complexity 22(2)219ndash243Jun 2013 doi101007s00037-013-0060-1 p 48

[AW16] Eric Allender and Fengming Wang. On the power of algebraic branching programs of width two. Comput. Complexity, 25(1):217–253, 2016. doi:10.1007/s00037-015-0114-7. p. 17, 109, 114, 123.

[AZ14] Martin Aigner and Gunter M Ziegler Proofs from The BookSpringer-Verlag Berlin fifth edition 2014doi101007978-3-662-44205-0 p 71

[BC18] Boris Bukh and Christopher Cox On a fractional version ofHaemersrsquo bound arXiv 2018 arXiv180200476 p 41 42

[BCC+17] Jonah Blasiak Thomas Church Henry Cohn Joshua A GrochowEric Naslund William F Sawin and Chris Umans On cap setsand the group-theoretic approach to matrix multiplication DiscreteAnal 2017 arXiv160506702 doi1019086da1245 p 4883 84 104

[BCPZ16] Harry Buhrman Matthias Christandl Christopher Perry andJeroen Zuiddam Clean quantum and classical communicationprotocols Phys Rev Lett 117230503 Dec 2016doi101103PhysRevLett117230503 p 1

[BCRL79] Dario Bini Milvio Capovani Francesco Romani and Grazia LottiO(n27799) complexity for ntimes n approximate matrix multiplicationInf Process Lett 8(5)234ndash235 1979doi1010160020-0190(79)90113-3 p 3 110

[BCS97] Peter Bürgisser, Michael Clausen, and M. Amin Shokrollahi. Algebraic complexity theory, volume 315 of Grundlehren Math. Wiss. Springer-Verlag, Berlin, 1997. doi:10.1007/978-3-662-03338-8. p. 4, 6, 48, 50, 66, 79, 119.

[BCSX10] Arnab Bhattacharyya Victor Chen Madhu Sudan and Ning XieTesting Linear-Invariant Non-linear Properties A Short Reportpages 260ndash268 Springer Berlin Heidelberg Berlin Heidelberg2010 doi101007978-3-642-16367-8_18 p 48

[BCZ17a] Markus Blaser Matthias Christandl and Jeroen Zuiddam Theborder support rank of two-by-two matrix multiplication is sevenarXiv 2017 arXiv170509652 p 1 15

[BCZ17b] Harry Buhrman Matthias Christandl and Jeroen ZuiddamNondeterministic Quantum Communication Complexity the CyclicEquality Game and Iterated Matrix Multiplication In Christos HPapadimitriou editor 8th Innovations in Theoretical ComputerScience Conference (ITCS 2017) pages 241ndash2418 2017arXiv160303757 doi104230LIPIcsITCS201724 p 115

[Ber84] Stuart J Berkowitz On computing the determinant in smallparallel time using a small number of processors Inform ProcessLett 18(3)147ndash150 1984 doi1010160020-0190(84)90018-8p 108

[BI13] Peter Burgisser and Christian Ikenmeyer Explicit lower bounds viageometric complexity theory Proceedings 45th Annual ACMSymposium on Theory of Computing 2013 pages 141ndash150 2013doi10114524886082488627 p 108

[Bin80] Dario Bini Relations between exact and approximate bilinearalgorithms Applications Calcolo 17(1)87ndash97 1980doi101007BF02575865 p 3

[BIZ17] Karl Bringmann, Christian Ikenmeyer, and Jeroen Zuiddam. On Algebraic Branching Programs of Small Width. In Ryan O’Donnell, editor, 32nd Computational Complexity Conference (CCC 2017), pages 20:1–20:31, 2017. doi:10.4230/LIPIcs.CCC.2017.20. p. 1, 107, 111, 112, 118, 122.

[Bla13] Anna Blasiak A graph-theoretic approach to network coding PhDthesis Cornell University 2013 URL httpsecommonscornelledubitstreamhandle181334147ab675pdf p 42

[BLMW11] Peter Burgisser Joseph M Landsberg Laurent Manivel and JerzyWeyman An overview of mathematical issues arising in thegeometric complexity theory approach to VP 6= VNP SIAM JComput 40(4)1179ndash1209 2011 doi101137090765328 p 108

[BOC92] Michael Ben-Or and Richard Cleve Computing algebraic formulasusing a constant number of registers SIAM J Comput21(1)54ndash58 1992 doi1011370221006 p 17 109 112

[BPR+00] Charles H Bennett Sandu Popescu Daniel Rohrlich John ASmolin and Ashish V Thapliyal Exact and asymptotic measuresof multipartite pure-state entanglement Phys Rev A63(1)012307 2000 doi101103PhysRevA63012307 p 48

[Bre74] Richard P Brent The parallel evaluation of general arithmeticexpressions J ACM 21(2)201ndash206 April 1974doi101145321812321815 p 112

[Bri87] Michel Brion Sur lrsquoimage de lrsquoapplication moment In Seminairedrsquoalgebre Paul Dubreil et Marie-Paule Malliavin (Paris 1986)volume 1296 of Lecture Notes in Math pages 177ndash192 SpringerBerlin 1987 doi101007BFb0078526 p 9 93 94

[BS83] Eberhard Becker and Niels and Schwartz Zum Darstellungssatzvon Kadison-Dubois Arch Math (Basel) 40(5)421ndash428 1983doi101007BF01192806 p 7 12 33

[Bur90] Peter Burgisser Degenerationsordnung und Tragerfunktionalbilinearer Abbildungen PhD thesis Universitat Konstanz 1990httpnbn-resolvingdeurnnbndebsz352-opus-20311p 57 101

[Bur00] Peter Bürgisser. Completeness and reduction in algebraic complexity theory, volume 7 of Algorithms and Computation in Mathematics. Springer-Verlag, Berlin, 2000. doi:10.1007/978-3-662-04179-6. p. 119.

[Bur04] Peter Burgisser The complexity of factors of multivariatepolynomials Found Comput Math 4(4)369ndash396 2004doi101007s10208-002-0059-5 p 110 115

[BX15] Arnab Bhattacharyya and Ning Xie Lower bounds for testingtriangle-freeness in boolean functions Comput Complexity24(1)65ndash101 2015 doi101007s00037-014-0092-1 p 48

[BZ17] Jop Briet and Jeroen Zuiddam On the orthogonal rank of Cayleygraphs and impossibility of quantum round elimination QuantumInf Comput 17(1amp2) 2017 URL httpwwwrintonpresscomxxqic17qic-17-120106-0116pdfarXiv160806113 p 2

[CHM07] Matthias Christandl Aram W Harrow and Graeme MitchisonNonzero Kronecker coefficients and what they tell us about spectraComm Math Phys 270(3)575ndash585 2007doi101007s00220-006-0157-3 p 90

[CJZ18] Matthias Christandl Asger Kjaeligrulff Jensen and Jeroen ZuiddamTensor rank is not multiplicative under the tensor product LinearAlgebra Appl 543125ndash139 2018doi101016jlaa201712020 p 2 15

[CKSV16] Suryajith Chillara Mrinal Kumar Ramprasad Saptharishi andV Vinay The chasm at depth four and tensor rank Old resultsnew insights arXiv 2016 arXiv160604200 p 15

[CLP17] Ernie Croot, Vsevolod F. Lev, and Péter Pál Pach. Progression-free sets in $\mathbb{Z}_4^n$ are exponentially small. Ann. of Math. (2), 185(1):331–337, 2017. doi:10.4007/annals.2017.185.1.7. p. 48, 81.

[CM06] Matthias Christandl and Graeme Mitchison The spectra ofquantum states and the Kronecker coefficients of the symmetricgroup Comm Math Phys 261(3)789ndash797 2006doi101007s00220-005-1435-1 p 91

[CMR+14] Toby Cubitt Laura Mancinska David E Roberson SimoneSeverini Dan Stahlke and Andreas Winter Bounds onentanglement-assisted source-channel coding via the Lovasz thetanumber and its variants IEEE Trans Inform Theory60(11)7330ndash7344 2014 arXiv13107120doi101109TIT20142349502 p 42

[CT12] Thomas M Cover and Joy A Thomas Elements of informationtheory John Wiley amp Sons 2012 p 60

[CU13] Henry Cohn and Christopher Umans Fast matrix multiplicationusing coherent configurations In Proceedings of the Twenty-FourthAnnual ACM-SIAM Symposium on Discrete Algorithms pages1074ndash1086 SIAM 2013 p 15

[CVZ16] Matthias Christandl Peter Vrana and Jeroen ZuiddamAsymptotic tensor rank of graph tensors beyond matrixmultiplication arXiv 2016 arXiv160907476 p 2 65 67 7985

[CVZ18] Matthias Christandl Peter Vrana and Jeroen Zuiddam Universalpoints in the asymptotic spectrum of tensors In Proceedings of 50thAnnual ACM SIGACT Symposium on the Theory of Computing(STOCrsquo18) ACM New York 2018 arXiv170907851doi10114531887453188766 p 2 47 65 87 88 96 103 105

[CW82] Don Coppersmith and Shmuel Winograd On the asymptoticcomplexity of matrix multiplication SIAM J Comput11(3)472ndash492 1982 doi1011370211038 p 3

[CW87] Don Coppersmith and Shmuel Winograd Matrix multiplication viaarithmetic progressions In Proceedings of the nineteenth annualACM symposium on Theory of computing pages 1ndash6 ACM 1987p 3

[CW90] Don Coppersmith and Shmuel Winograd Matrix multiplication viaarithmetic progressions J Symbolic Comput 9(3)251ndash280 1990doi101016S0747-7171(08)80013-2 p 4 6 8 10 48 67

[CZ18] Matthias Christandl and Jeroen Zuiddam Tensor surgery andtensor rank Comput Complexity Mar 2018doi101007s00037-018-0164-8 p 2 86

[Dra15] Jan Draisma Multilinear Algebra and Applications (lecture notes)2015 URL httpsmathsitesunibechjdraismapublicationsmlapplpdfp 15

[DVC00] Wolfgang Dur Guivre Vidal and Juan Ignacio Cirac Three qubitscan be entangled in two inequivalent ways Phys Rev A (3)62(6)062314 12 2000 doi101103PhysRevA62062314 p 48

[Ede04] Yves Edel Extensions of generalized product caps Des CodesCryptogr 31(1)5ndash14 2004 doi101023A1027365901231p 48 83

[EG17] Jordan S. Ellenberg and Dion Gijswijt. On large subsets of $\mathbb{F}_q^n$ with no three-term arithmetic progression. Ann. of Math. (2), 185(1):339–343, 2017. doi:10.4007/annals.2017.185.1.8. p. 10, 48, 81, 83, 84.

[FK14] Hu Fu and Robert Kleinberg Improved lower bounds for testingtriangle-freeness in boolean functions via fast matrix multiplicationIn Approximation Randomization and CombinatorialOptimization Algorithms and Techniques (APPROXRANDOM2014) pages 669ndash676 2014doi104230LIPIcsAPPROX-RANDOM2014669 p 48

[For16] Michael Forbes Some concrete questions on the border complexityof polynomials Presentation given at the Workshop on AlgebraicComplexity Theory WACT 2016 in Tel Avivhttpswwwyoutubecomwatchv=1HMogQIHT6Q 2016 p 110

[Fra02] Matthias Franz Moment polytopes of projective G-varieties andtensor products of symmetric group representations J Lie Theory12(2)539ndash549 2002 URLhttpemisamsorgjournalsJLTvol12_no216htmlp 93 94

[Fri17] Tobias Fritz Resource convertibility and ordered commutativemonoids Math Structures Comput Sci 27(6)850ndash938 2017doi101017S0960129515000444 p 37

[Ful97] William Fulton Young tableaux volume 35 of LondonMathematical Society Student Texts Cambridge University PressCambridge 1997 With applications to representation theory andgeometry p 88

[GKKS13] Ankit Gupta Pritish Kamath Neeraj Kayal and RamprasadSaptharishi Approaching the chasm at depth four In 2013 IEEEConference on Computational ComplexitymdashCCC 2013 pages 65ndash73IEEE Computer Soc Los Alamitos CA 2013doi101109CCC201316 p 108

[GMQ16] Joshua A. Grochow, Ketan D. Mulmuley, and Youming Qiao. Boundaries of VP and VNP. In Ioannis Chatzigiannakis, Michael Mitzenmacher, Yuval Rabani, and Davide Sangiorgi, editors, 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016), volume 55, pages 34:1–34:14, 2016. arXiv:1605.02815, doi:10.4230/LIPIcs.ICALP.2016.34. p. 110.

[Gro13] Joshua A Grochow Unifying and generalizing known lower boundsvia geometric complexity theory arXiv 2013 arXiv13046333p 108

[GW09] Roe Goodman and Nolan R Wallach Symmetry representationsand invariants volume 255 of Graduate Texts in MathematicsSpringer Dordrecht 2009 doi101007978-0-387-79852-3p 88

[Hae79] Willem Haemers On some problems of Lovasz concerning theShannon capacity of a graph IEEE Trans Inform Theory25(2)231ndash232 1979 doi101109TIT19791056027 p 37 4042

[Has90] Johan Hastad Tensor rank is NP-complete J Algorithms11(4)644ndash654 1990 doi1010160196-6774(90)90014-6 p 47

[HHHH09] Ryszard Horodecki Pawe l Horodecki Micha l Horodecki and KarolHorodecki Quantum entanglement Rev Modern Phys81(2)865ndash942 2009 doi101103RevModPhys81865 p 48

[HIL13] Jonathan D Hauenstein Christian Ikenmeyer and Joseph MLandsberg Equations for lower bounds on border rank ExpMath 22(4)372ndash383 2013 doi101080105864582013825892p 15 110

[Hum75] James E Humphreys Linear algebraic groups Springer-VerlagNew York-Heidelberg 1975 Graduate Texts in Mathematics No21 p 93

[HX17] Ishay Haviv and Ning Xie Sunflowers and testing triangle-freenessof functions Comput Complexity 26(2)497ndash530 Jun 2017doi101007s00037-016-0138-7 p 48

[Ike13] Christian Ikenmeyer Geometric complexity theory tensor rankand LittlewoodndashRichardson coefficients PhD thesis UniversitatPaderborn 2013 p 14

[Kar72] Richard M Karp Reducibility among combinatorial problems InComplexity of computer computations (Proc Sympos IBM ThomasJ Watson Res Center Yorktown Heights NY 1972) pages85ndash103 Plenum New York 1972 p 36

[Knu94] Donald E Knuth The sandwich theorem Electron J Combin 11994 URL httpwwwcombinatoricsorgVolume_1Abstractsv1i1a1htmlp 41

[Kra84] Hanspeter Kraft Geometrische Methoden in der InvariantentheorieSpringer 1984 doi101007978-3-663-10143-7 p 50 88 93

[KS08] Tali Kaufman and Madhu Sudan Algebraic property testing Therole of invariance In Proceedings of the Fortieth Annual ACMSymposium on Theory of Computing STOC rsquo08 pages 403ndash412New York NY USA 2008 ACMdoi10114513743761374434 p 48

[KSS16] Robert Kleinberg William F Sawin and David E Speyer Thegrowth rate of tri-colored sum-free sets arXiv 2016arXiv160700047 p 48 79 83

[Lan06] Joseph M Landsberg The border rank of the multiplication of2times 2 matrices is seven J Amer Math Soc 19(2)447ndash459 2006doi101090S0894-0347-05-00506-0 p 110

[LG14] Francois Le Gall Powers of tensors and fast matrix multiplicationIn ISSAC 2014mdashProceedings of the 39th International Symposiumon Symbolic and Algebraic Computation pages 296ndash303 ACM NewYork 2014 doi10114526086282608664 p 4 6 8 48 85

[Lic84] Thomas Lickteig A note on border rank Inf Process Lett18(3)173ndash178 1984 doi1010160020-0190(84)90023-1p 110

[LM16a] Joseph M Landsberg and Mateusz Micha lek A 2n2 minus log(n)minus 1lower bound for the border rank of matrix multiplication arXiv2016 arXiv160807486 p 110

[LM16b] Joseph M Landsberg and Mateusz Micha lek Abelian tensorsJ Math Pures Appl 2016 doi101016jmatpur201611004p 14

[LMR13] Joseph M Landsberg Laurent Manivel and Nicolas RessayreHypersurfaces with degenerate duals and the geometric complexitytheory program Comment Math Helv 88(2)469ndash484 2013doi104171CMH292 p 108

[LO15] Joseph M. Landsberg and Giorgio Ottaviani. New lower bounds for the border rank of matrix multiplication. Theory Comput., 11:285–298, 2015. arXiv:1112.6007, doi:10.4086/toc.2015.v011a011. p. 108, 110.

[Lov79] Laszlo Lovasz On the Shannon capacity of a graph IEEE TransInform Theory 25(1)1ndash7 1979 doi101109TIT19791055985p 13 35 41

[Mar08] Murray Marshall Positive polynomials and sums of squaresvolume 146 of Mathematical Surveys and Monographs AmericanMathematical Society Providence RI 2008doi101090surv146 p 34

[MP71] Robert J McEliece and Edward C Posner Hide and seek datastorage and entropy The Annals of Mathematical Statistics42(5)1706ndash1716 1971 doi101214aoms1177693169 p 41

[MP08] Guillaume Malod and Natacha Portier. Characterizing Valiant’s algebraic complexity classes. J. Complexity, 24(1):16–38, 2008. doi:10.1016/j.jco.2006.09.006. p. 119.

[MS01] Ketan D Mulmuley and Milind Sohoni Geometric complexitytheory I An approach to the P vs NP and related problemsSIAM J Comput 31(2)496ndash526 2001doi101137S009753970038715X p 14 108

[MS08] Ketan D Mulmuley and Milind Sohoni Geometric complexitytheory II Towards explicit obstructions for embeddings amongclass varieties SIAM J Comput 38(3)1175ndash1206 2008doi101137080718115 p 108

[Nes84] Linda Ness A stratification of the null cone via the moment mapAmer J Math 106(6)1281ndash1329 1984 With an appendix byDavid Mumford doi1023072374395 p 9 93 94

[Nis91] Noam Nisan Lower bounds for non-commutative computation InProceedings of the twenty-third annual ACM symposium on Theoryof computing pages 410ndash418 ACM 1991doi101145103418103462 p 110

[Nor16] Sergey Norin A distribution on triples with maximum entropymarginal arXiv 2016 arXiv160800243 p 83

[NW97] Noam Nisan and Avi Wigderson Lower bounds on arithmeticcircuits via partial derivatives Comput Complexity 6(3)217ndash234199697 doi101007BF01294256 p 108

[Pan78] Victor Ya Pan Strassenrsquos algorithm is not optimal Trilineartechnique of aggregating uniting and canceling for constructingfast algorithms for matrix operations In 19th Annual Symposiumon Foundations of Computer Science (Ann Arbor Mich 1978)pages 166ndash176 IEEE Long Beach Calif 1978 p 3

[Pan80] Victor Ya Pan New fast algorithms for matrix operations SIAMJ Comput 9(2)321ndash342 1980 doi1011370209027 p 3

[Pan81] Victor Ya Pan New combinations of methods for the accelerationof matrix multiplication Comput Math Appl 7(1)73ndash125 1981doi1010160898-1221(81)90009-2 p 3

[Pan84] Victor Ya Pan How to multiply matrices faster volume 179 ofLecture Notes in Computer Science Springer-Verlag Berlin 1984doi1010073-540-13866-8 p 3

[Pan18] Victor Ya Pan Fast feasible and unfeasible matrix multiplicationarXiv 2018 arXiv180404102 p 6

[PD01] Alexander Prestel and Charles N Delzell Positive polynomialsSpringer Monographs in Mathematics Springer-Verlag Berlin2001 From Hilbertrsquos 17th problem to real algebradoi101007978-3-662-04648-7 p 34

[Peb16] Luke Pebody Proof of a conjecture of Kleinberg-Sawin-SpeyerarXiv 2016 arXiv160805740 p 83

[PS98] George Polya and Gabor Szego Problems and theorems inanalysis I Classics in Mathematics Springer-Verlag Berlin 1998Series integral calculus theory of functions Translated from theGerman by Dorothee Aeppli Reprint of the 1978 Englishtranslation doi101007978-3-642-61905-2 p 21

[Raz09] Ran Raz Multi-linear formulas for permanent and determinant areof super-polynomial size J ACM 56(2)Art 8 17 2009doi10114515027931502797 p 108

[Raz13] Ran Raz Tensor-rank and lower bounds for arithmetic formulasJ ACM 60(6)Art 40 15 2013 doi1011452535928 p 14

[Rom82] Francesco Romani Some properties of disjoint sums of tensorsrelated to matrix multiplication SIAM J Comput 11(2)263ndash2671982 doi1011370211020 p 3

[Sap16] Ramprasad Saptharishi A survey of lower bounds in arithmeticcircuit complexity 302 2016 Online survey URLhttpsgithubcomdasarpmarlowerbounds-survey p 6 17109 112

[Sch81] Arnold Schonhage Partial and total matrix multiplication SIAMJ Comput 10(3)434ndash455 1981 p 3

[Sch03] Alexander Schrijver Combinatorial optimization polyhedra andefficiency volume 24 Springer Science amp Business Media 2003p 37 41

[Sha56] Claude E Shannon The zero error capacity of a noisy channelInstitute of Radio Engineers Transactions on Information TheoryIT-2(September)8ndash19 1956 doi101109TIT19561056798p 13 35

[Sha09] Asaf Shapira Greenrsquos conjecture and testing linear-invariantproperties In Proceedings of the Forty-first Annual ACMSymposium on Theory of Computing STOC rsquo09 pages 159ndash166New York NY USA 2009 ACMdoi10114515364141536438 p 48

[Shi16] Yaroslav Shitov How hard is the tensor rank arXiv 2016arXiv161101559 p 47

[Sin64] Richard C Singleton Maximum distance q-nary codes IEEETrans Information Theory IT-10116ndash118 1964doi101109TIT19641053661 p 101

[SOK14] Adam Sawicki Micha l Oszmaniec and Marek Kus Convexity ofmomentum map Morse index and quantum entanglement RevMath Phys 26(3)1450004 39 2014doi101142S0129055X14500044 p 9

[SSS09] Chandan Saha Ramprasad Saptharishi and Nitin Saxena Thepower of depth 2 circuits over algebras In IARCS AnnualConference on Foundations of Software Technology and TheoreticalComputer Science volume 4 pages 371ndash382 2009arXiv09042058 doi104230LIPIcsFSTTCS20092333p 109

[Sto10] Andrew James Stothers On the complexity of matrix multiplicationPhD thesis University of Edinburgh 2010httphdlhandlenet18424734 p 4 6 8 48

[Str69] Volker Strassen Gaussian elimination is not optimal NumerMath 13(4)354ndash356 1969 doi101007BF02165411 p 3 5

[Str83] Volker Strassen Rank and optimal computation of generic tensorsLinear Algebra Appl 5253645ndash685 1983doi1010160024-3795(83)80041-X p 110

[Str86] Volker Strassen The asymptotic spectrum of tensors and theexponent of matrix multiplication In Proceedings of the 27thAnnual Symposium on Foundations of Computer Science SFCS rsquo86pages 49ndash54 Washington DC USA 1986 IEEE Computer Societydoi101109SFCS198652 p 4 7

[Str87] Volker Strassen Relative bilinear complexity and matrixmultiplication J Reine Angew Math 375376406ndash443 1987doi101515crll1987375-376406 p 3 4 49 67

[Str88] Volker Strassen The asymptotic spectrum of tensors J ReineAngew Math 384102ndash152 1988doi101515crll1988384102 p 4 7 12 19 26 27 28 2930 32 33 49 50 51

[Str91] Volker Strassen Degeneration and complexity of bilinear mapssome asymptotic spectra J Reine Angew Math 413127ndash1801991 doi101515crll1991413127 p 3 4 10 48 49 5255 56 57 66 67 81 82

[Str94] Volker Strassen Algebra and complexity In First EuropeanCongress of Mathematics Vol II (Paris 1992) volume 120 ofProgr Math pages 429ndash446 Birkhauser Basel 1994doi101007s10107-008-0221-1 p 67

[Str05] Volker Strassen Komplexitat und Geometrie bilinearerAbbildungen Jahresber Deutsch Math-Verein 107(1)3ndash31 2005p 4 88 94 95 100 101

[Tao08] Terence Tao Structure and randomness pages from year one of amathematical blog American Mathematical Soc 2008 p 48

[Tao16] Terence Tao A symmetric formulation of theCrootndashLevndashPachndashEllenbergndashGijswijt capset boundhttpsterrytaowordpresscom 2016 p 48 58 81 84

[Tob91] Verena Tobler Spezialisierung und Degeneration von TensorenPhD thesis Universitat Konstanz 1991httpnbn-resolvingdeurnnbndebsz352-opus-20324p 57

[TS16] Terence Tao and Will Sawin Notes on the ldquoslice rankrdquo of tensorshttpsterrytaowordpresscom 2016 p 48 58

[Val79] Leslie G. Valiant. Completeness classes in algebra. In Conference Record of the Eleventh Annual ACM Symposium on Theory of Computing (Atlanta, Ga., 1979), pages 249–261. ACM, New York, 1979. doi:10.1145/800135.804419. p. 107, 108, 123.

[Val80] Leslie G. Valiant. Reducibility by algebraic projections. University of Edinburgh, Department of Computer Science, 1980. Internal Report. p. 109, 119, 123.

[VC15] Peter Vrana and Matthias Christandl Asymptotic entanglementtransformation between W and GHZ states J Math Phys56(2)022204 12 2015 arXiv13103244doi10106314908106 p 69

[VDDMV02] F Verstraete J Dehaene B De Moor and H Verschelde Fourqubits can be entangled in nine different ways Phys Rev A (3)65(5 part A)052112 5 2002 doi101103PhysRevA65052112p 48

[Wal14] Michael Walter Multipartite quantum states and their marginalsPhD thesis ETH Zurich 2014 arXiv14106820 p 93

[WDGC13] Michael Walter Brent Doran David Gross and MatthiasChristandl Entanglement polytopes multiparticle entanglementfrom single-particle information Science 340(6137)1205ndash12082013 arXiv12080365 doi101126science1232957 p 8 995

[Wil12] Virginia Vassilevska Williams Multiplying matrices faster thanCoppersmith-Winograd Extended abstract InSTOCrsquo12mdashProceedings of the 2012 ACM Symposium on Theory ofComputing pages 887ndash898 ACM New York 2012doi10114522139772214056 p 4 6 8 48

[Zui17] Jeroen Zuiddam A note on the gap between rank and border rankLinear Algebra Appl 52533ndash44 2017doi101016jlaa201703015 p 2 14 110

[Zui18] Jeroen Zuiddam The asymptotic spectrum of graphs and theShannon capacity arXiv 2018 arXiv180700169 p 35

Glossary

〈n〉 ntimes middot middot middot times n diagonal tensor 47

〈a b c〉 matrix multiplication tensor 48

G lowastH or-product 42

GH strong graph product and-product 35

α(G) stability number 35

χ(G) clique cover number 40

Kk complete graph on k vertices 36

F θ(t) quantum functional 96

G(t) GLn1 times middot middot middot timesGLnk for t isin Fn1 otimes middot middot middot otimes Fnk 52

H(P ) Shannon entropy of probability distribution P 52

h(p) binary entropy of probability p isin [0 1] 53

τ(Φ) hitting set number 59

˜τ(Φ) asymptotic hitting set number 60

ω matrix multiplication exponent 47

P moment polytope 94

P(X) the set of probability distributions on X 52

R rank 27

˜R asymptotic rank 27

R(t) border rank 50

R(G) rank of a graph clique cover number 40

R(t) tensor rank 47

SR(t) slice rank 58

Q subrank 27

˜Q asymptotic subrank 27

Q(t) border subrank 50

Q(Φ) combinatorial subrank 10

Q(G) subrank of a graph stability number 40

supp(t) support 52

Θ(G) Shannon capacity 35

ϑ(G) Lovasz theta number 41

G tH disjoint union 36

W (t) Sn1 times middot middot middot times Snk for t isin Fn1 otimes middot middot middot otimes Fnk 53

X(S6) asymptotic spectrum of semiring S with Strassen preorder 6 25

ζ(S)(t) gauge point 51

ζθ(t) support functional 52

Samenvatting

Algebraısche complexiteit asymptotische spectra enverstrengelingspolytopen

Het is welbekend dat de rang van een matrix multiplicatief is onder het Krone-ckerproduct additief onder de directe som genormaliseerd op identiteitsmatricesen niet-stijgend onder vermenigvuldiging van links en van rechts met matricesMatrixrang is zelfs de enige reele parameter met deze vier eigenschappen In 1986initieerde Strassen de studie van de uitbreiding naar tensoren vind alle afbeel-dingen van k-tensoren naar de reele getallen die multiplicatief zijn onder hettensor Kroneckerproduct additief onder de directe som genormaliseerd op ldquoiden-titeitstensorenrdquo en niet-stijgend onder het toepassen van lineaire afbeeldingen opde k tensorfactoren Strassen noemde de verzameling van deze afbeeldingen hetldquoasymptotische spectrum van k-tensorenrdquo Hij bewees als we het asymptotischespectrum begrijpen dan begrijpen we de asymptotische relaties tussen tensorswaaronder de asymptotische subrang en de asymptotische rang In het bijzonderals we het asymptotische spectrum kennen dan kennen we de aritmetische com-plexiteit van matrixvermenigvuldiging een centraal probleem in de algebraıschecomplexiteitstheorie

Een van de hoofdresultaten in dit proefschrift is de eerste expliciete construc-tie van een oneindige familie van elementen in het asymptotische spectrum vancomplexe k-tensoren genaamd de quantumfunctionalen Onze constructie is geba-seerd op informatietheorie en momentpolytopen ook wel verstrengelingspolytopengenoemd Daarnaast bestuderen we onder andere de relatie tussen de recentgeıntroduceerde slice rang en de quantumfunctionalen en we bewijzen dat deldquoasymptotischerdquo slice rang gelijk is aan het minimum over de quantumfunctionalenNaast het bestuderen van de bovengenoemde tensorparameters geven we eenuitbreiding van de CoppersmithndashWinograd-methode (voor het verkrijgen vanondergrenzen op de asymptotische combinatorische subrang) naar hogere-orde

tensoren dwz tensoren van orde minstens 4 We passen deze uitbreiding toeom nieuwe bovengrenzen te krijgen op de asymptotische tensorrang van complete-graaftensoren via de lasermethode (Gezamenlijk werk met Christandl en VranaQIP 2018 STOC 2018)

Als een nieuwe toepassing van de abstracte theorie van asymptotische spectraintroduceren we het asymptotische spectrum van grafen in de grafentheorie Ana-loog aan de situatie voor tensoren geldt als we het asymptotisch spectrum vangrafen begrijpen dan begrijpen we de Shannoncapaciteit een graafparameter diede zero-error-communicatiecomplexiteit van communicatiekanalen karakteriseertMet andere woorden we bewijzen een nieuwe dualiteitsstelling voor de Shannon-capaciteit Voorbeelden van elementen in het asymptotische spectrum van grafenzijn het thetagetal van Lovasz en de fractionele Haemersgrenzen

Finally, we study an algebraic model of computation called algebraic branching programs. An algebraic branching program (abp) is the trace of a product of matrices whose entries are polynomials of degree at most 1. The maximum size of the matrices is called the width of the abp. In 1992 Ben-Or and Cleve proved that every polynomial can be computed by a width-3 abp with a number of matrices that is polynomial in the formula size of the polynomial. In contrast, Allender and Wang proved in 2011 that some polynomials cannot be computed by width-2 abps. We prove that every polynomial can be approximated by a width-2 abp with a number of matrices that is polynomial in the formula size of the polynomial, where approximation is meant in the sense of degeneration. (Joint work with Ikenmeyer and Bringmann; CCC 2017, JACM 2018.)

Summary

Algebraic complexity, asymptotic spectra and entanglement polytopes

Matrix rank is well known to be multiplicative under the Kronecker product, additive under the direct sum, normalised on identity matrices, and non-increasing under multiplying from the left and from the right by any matrices. In fact, matrix rank is the only real matrix parameter with these four properties. In 1986 Strassen proposed to study the extension to tensors: find all maps from k-tensors to the reals that are multiplicative under the tensor Kronecker product, additive under the direct sum, normalised on “identity tensors”, and non-increasing under acting with linear maps on the k tensor factors. Strassen called the collection of these maps the “asymptotic spectrum of k-tensors”. He proved that understanding the asymptotic spectrum implies understanding the asymptotic relations among tensors, including the asymptotic subrank and the asymptotic rank. In particular, knowing the asymptotic spectrum means knowing the arithmetic complexity of matrix multiplication, a central problem in algebraic complexity theory.
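
The four properties of matrix rank named above can be checked on small examples. The following is a minimal Python sketch, not code from the dissertation: the helper names (`rank`, `kron`, `direct_sum`, `matmul`, `identity`) and the example matrices are illustrative assumptions, and checking a few instances is of course not a proof.

```python
from fractions import Fraction


def rank(matrix):
    """Rank over the rationals, by Gaussian elimination with exact arithmetic."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    found = 0  # number of pivots found so far
    n_cols = len(rows[0]) if rows else 0
    for col in range(n_cols):
        # Look for a nonzero entry in this column at or below row `found`.
        pivot = next((r for r in range(found, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(found + 1, len(rows)):
            factor = rows[r][col] / rows[found][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[found])]
        found += 1
    return found


def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def direct_sum(a, b):
    """Block-diagonal matrix with blocks a and b."""
    cols_a, cols_b = len(a[0]), len(b[0])
    return [row + [0] * cols_b for row in a] + [[0] * cols_a + row for row in b]


def matmul(a, b):
    """Ordinary matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]


# Illustrative example matrices (assumptions, chosen by hand).
A = [[1, 2], [2, 4]]          # rank 1
B = [[1, 0, 1], [0, 1, 1]]    # rank 2
P = [[1, 1], [0, 0]]          # arbitrary left multiplier
Q = [[1, 0], [1, 1]]          # arbitrary right multiplier

checks = {
    "multiplicative": rank(kron(A, B)) == rank(A) * rank(B),
    "additive": rank(direct_sum(A, B)) == rank(A) + rank(B),
    "normalised": all(rank(identity(n)) == n for n in range(1, 5)),
    "non-increasing": rank(matmul(matmul(P, A), Q)) <= rank(A),
}
print(checks)
```

Exact `Fraction` arithmetic is used so that the rank computation does not depend on floating-point tolerances.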

One of the main results in this dissertation is the first explicit construction of an infinite family of elements in the asymptotic spectrum of complex k-tensors, called the quantum functionals. Our construction is based on information theory and moment polytopes, i.e. entanglement polytopes. Moreover, among other things, we study the relation of the recently introduced slice rank to the quantum functionals and find that “asymptotic” slice rank equals the minimum over the quantum functionals. Besides studying the above tensor parameters, we extend the Coppersmith–Winograd method (for obtaining asymptotic combinatorial subrank lower bounds) to higher-order tensors, i.e. order at least 4. We apply this generalisation to obtain new upper bounds on the asymptotic tensor rank of complete graph tensors via the laser method. (Joint work with Christandl and Vrana; QIP 2018, STOC 2018.)

In graph theory, as a new instantiation of the abstract theory of asymptotic spectra, we introduce the asymptotic spectrum of graphs. Analogous to the situation for tensors, understanding the asymptotic spectrum of graphs means understanding the Shannon capacity, a graph parameter capturing the zero-error communication complexity of communication channels. In other words, we prove a new duality theorem for Shannon capacity. Some known elements in the asymptotic spectrum of graphs are the Lovász theta number and the fractional Haemers bounds.
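
To make the Shannon capacity concrete, the following Python sketch (an illustration under stated assumptions, not code from the dissertation) computes the independence number of the 5-cycle C5 and of its strong product with itself, using a simple branching search. All function names are hypothetical. The Shannon capacity is the limit of the n-th root of the independence number of the n-th strong power, so the two values computed here give the lower bound √5 for C5; Lovász showed with the theta number that this bound is tight.

```python
from itertools import product


def cycle_graph(n):
    """Adjacency sets of the cycle on vertices 0..n-1."""
    return {v: {(v - 1) % n, (v + 1) % n} for v in range(n)}


def strong_product(g, h):
    """Strong product: distinct pairs are adjacent when each coordinate
    is equal or adjacent in the corresponding factor."""
    vertices = list(product(g, h))
    adj = {v: set() for v in vertices}
    for (a, b), (c, d) in product(vertices, repeat=2):
        if (a, b) == (c, d):
            continue
        if (a == c or c in g[a]) and (b == d or d in h[b]):
            adj[(a, b)].add((c, d))
    return adj


def independence_number(adj):
    """Size of a largest independent set, by branching on a vertex
    with the most neighbours among the remaining candidates."""
    def solve(candidates):
        if not candidates:
            return 0
        v = max(candidates, key=lambda u: len(adj[u] & candidates))
        if not (adj[v] & candidates):
            # No edges left among the candidates: take all of them.
            return len(candidates)
        without_v = solve(candidates - {v})
        with_v = 1 + solve(candidates - {v} - adj[v])
        return max(without_v, with_v)

    return solve(frozenset(adj))


c5 = cycle_graph(5)
c5_squared = strong_product(c5, c5)
print(independence_number(c5), independence_number(c5_squared))  # prints: 2 5
```

The search is exponential in general and is only meant for very small graphs such as the 25-vertex product used here.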

Finally, we study an algebraic model of computation called algebraic branching programs. An algebraic branching program (abp) is the trace of a product of matrices with affine linear forms as matrix entries. The maximum size of the matrices is called the width of the abp. In 1992 Ben-Or and Cleve proved that width-3 abps can compute any polynomial efficiently in the formula size. On the other hand, in 2011 Allender and Wang proved that some polynomials cannot be computed by any width-2 abp. We prove that any polynomial can be efficiently approximated by a width-2 abp, where approximation is defined in the sense of degeneration. (Joint work with Ikenmeyer and Bringmann; CCC 2017, JACM 2018.)
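
The definition of an abp as the trace of a product of matrices with affine linear forms as entries can be sketched in a few lines of Python. This is a toy illustration only: the polynomial representation (a dictionary from exponent tuples to coefficients), the helper names and the example matrices are assumptions made here, not constructions from the dissertation. The example is a width-2 abp with two matrices that computes the polynomial xy + y + 2.

```python
from collections import defaultdict

VARS = ("x", "y")  # variable order for exponent tuples (illustrative choice)


def affine(const=0, **coeffs):
    """Affine linear form const + sum(coeffs[v] * v) as {exponents: coefficient}."""
    p = {}
    if const:
        p[(0,) * len(VARS)] = const
    for name, c in coeffs.items():
        if c:
            p[tuple(int(v == name) for v in VARS)] = c
    return p


def add(p, q):
    out = defaultdict(int)
    for mono, c in list(p.items()) + list(q.items()):
        out[mono] += c
    return {m: c for m, c in out.items() if c}


def mul(p, q):
    out = defaultdict(int)
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            out[tuple(a + b for a, b in zip(m1, m2))] += c1 * c2
    return {m: c for m, c in out.items() if c}


def mat_mul(a, b):
    """Product of two matrices whose entries are polynomials."""
    result = []
    for i in range(len(a)):
        row = []
        for j in range(len(b[0])):
            entry = {}
            for t in range(len(b)):
                entry = add(entry, mul(a[i][t], b[t][j]))
            row.append(entry)
        result.append(row)
    return result


def abp_value(matrices):
    """Trace of the product of square matrices of affine linear forms."""
    prod = matrices[0]
    for m in matrices[1:]:
        prod = mat_mul(prod, m)
    total = {}
    for i in range(len(prod)):
        total = add(total, prod[i][i])
    return total


one, zero = affine(1), affine(0)
M1 = [[affine(1, x=1), one], [zero, one]]   # [[x + 1, 1], [0, 1]]
M2 = [[affine(y=1), zero], [one, one]]      # [[y, 0], [1, 1]]

# Keys are exponent tuples (deg_x, deg_y); the result encodes xy + y + 2.
print(sorted(abp_value([M1, M2]).items()))
```

Here the width is 2 (the matrix size) and the number of matrices is 2; the results quoted above concern how many matrices are needed at a fixed width.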

Titles in the ILLC Dissertation Series

ILLC DS-2009-01: Jakub Szymanik
Quantifiers in TIME and SPACE: Computational Complexity of Generalized Quantifiers in Natural Language

ILLC DS-2009-02: Hartmut Fitz
Neural Syntax

ILLC DS-2009-03: Brian Thomas Semmes
A Game for the Borel Functions

ILLC DS-2009-04: Sara L. Uckelman
Modalities in Medieval Logic

ILLC DS-2009-05: Andreas Witzel
Knowledge and Games: Theory and Implementation

ILLC DS-2009-06: Chantal Bax
Subjectivity after Wittgenstein: Wittgenstein's embodied and embedded subject and the debate about the death of man

ILLC DS-2009-07: Kata Balogh
Theme with Variations: A Context-based Analysis of Focus

ILLC DS-2009-08: Tomohiro Hoshi
Epistemic Dynamics and Protocol Information

ILLC DS-2009-09: Olivia Ladinig
Temporal expectations and their violations

ILLC DS-2009-10: Tikitu de Jager
“Now that you mention it, I wonder”: Awareness, Attention, Assumption

ILLC DS-2009-11: Michael Franke
Signal to Act: Game Theory in Pragmatics

ILLC DS-2009-12: Joel Uckelman
More Than the Sum of Its Parts: Compact Preference Representation Over Combinatorial Domains

ILLC DS-2009-13: Stefan Bold
Cardinals as Ultrapowers: A Canonical Measure Analysis under the Axiom of Determinacy

ILLC DS-2010-01: Reut Tsarfaty
Relational-Realizational Parsing

ILLC DS-2010-02: Jonathan Zvesper
Playing with Information

ILLC DS-2010-03: Cedric Degremont
The Temporal Mind: Observations on the logic of belief change in interactive systems

ILLC DS-2010-04: Daisuke Ikegami
Games in Set Theory and Logic

ILLC DS-2010-05: Jarmo Kontinen
Coherence and Complexity in Fragments of Dependence Logic

ILLC DS-2010-06: Yanjing Wang
Epistemic Modelling and Protocol Dynamics

ILLC DS-2010-07: Marc Staudacher
Use theories of meaning between conventions and social norms

ILLC DS-2010-08: Amelie Gheerbrant
Fixed-Point Logics on Trees

ILLC DS-2010-09: Gaelle Fontaine
Modal Fixpoint Logic: Some Model Theoretic Questions

ILLC DS-2010-10: Jacob Vosmaer
Logic, Algebra and Topology: Investigations into canonical extensions, duality theory and point-free topology

ILLC DS-2010-11: Nina Gierasimczuk
Knowing One's Limits: Logical Analysis of Inductive Inference

ILLC DS-2010-12: Martin Mose Bentzen
Stit, Iit, and Deontic Logic for Action Types

ILLC DS-2011-01: Wouter M. Koolen
Combining Strategies Efficiently: High-Quality Decisions from Conflicting Advice

ILLC DS-2011-02: Fernando Raymundo Velazquez-Quesada
Small steps in dynamics of information

ILLC DS-2011-03: Marijn Koolen
The Meaning of Structure: the Value of Link Evidence for Information Retrieval

ILLC DS-2011-04: Junte Zhang
System Evaluation of Archival Description and Access

ILLC DS-2011-05: Lauri Keskinen
Characterizing All Models in Infinite Cardinalities

ILLC DS-2011-06: Rianne Kaptein
Effective Focused Retrieval by Exploiting Query Context and Document Structure

ILLC DS-2011-07: Jop Briet
Grothendieck Inequalities, Nonlocal Games and Optimization

ILLC DS-2011-08: Stefan Minica
Dynamic Logic of Questions

ILLC DS-2011-09: Raul Andres Leal
Modalities Through the Looking Glass: A study on coalgebraic modal logic and their applications

ILLC DS-2011-10: Lena Kurzen
Complexity in Interaction

ILLC DS-2011-11: Gideon Borensztajn
The neural basis of structure in language

ILLC DS-2012-01: Federico Sangati
Decomposing and Regenerating Syntactic Trees

ILLC DS-2012-02: Markos Mylonakis
Learning the Latent Structure of Translation

ILLC DS-2012-03: Edgar Jose Andrade Lotero
Models of Language: Towards a practice-based account of information in natural language

ILLC DS-2012-04: Yurii Khomskii
Regularity Properties and Definability in the Real Number Continuum: idealized forcing, polarized partitions, Hausdorff gaps and mad families in the projective hierarchy

ILLC DS-2012-05: David García Soriano
Query-Efficient Computation in Property Testing and Learning Theory

ILLC DS-2012-06: Dimitris Gakis
Contextual Metaphilosophy - The Case of Wittgenstein

ILLC DS-2012-07: Pietro Galliani
The Dynamics of Imperfect Information

ILLC DS-2012-08: Umberto Grandi
Binary Aggregation with Integrity Constraints

ILLC DS-2012-09: Wesley Halcrow Holliday
Knowing What Follows: Epistemic Closure and Epistemic Logic

ILLC DS-2012-10: Jeremy Meyers
Locations, Bodies, and Sets: A model theoretic investigation into nominalistic mereologies

ILLC DS-2012-11: Floor Sietsma
Logics of Communication and Knowledge

ILLC DS-2012-12: Joris Dormans
Engineering emergence: applied theory for game design

ILLC DS-2013-01: Simon Pauw
Size Matters: Grounding Quantifiers in Spatial Perception

ILLC DS-2013-02: Virginie Fiutek
Playing with Knowledge and Belief

ILLC DS-2013-03: Giannicola Scarpa
Quantum entanglement in non-local games, graph parameters and zero-error information theory

ILLC DS-2014-01: Machiel Keestra
Sculpting the Space of Actions: Explaining Human Action by Integrating Intentions and Mechanisms

ILLC DS-2014-02: Thomas Icard
The Algorithmic Mind: A Study of Inference in Action

ILLC DS-2014-03: Harald A. Bastiaanse
Very, Many, Small, Penguins

ILLC DS-2014-04: Ben Rodenhauser
A Matter of Trust: Dynamic Attitudes in Epistemic Logic

ILLC DS-2015-01: María Inés Crespo
Affecting Meaning: Subjectivity and evaluativity in gradable adjectives

ILLC DS-2015-02: Mathias Winther Madsen
The Kid, the Clerk, and the Gambler - Critical Studies in Statistics and Cognitive Science

ILLC DS-2015-03: Shengyang Zhong
Orthogonality and Quantum Geometry: Towards a Relational Reconstruction of Quantum Theory

ILLC DS-2015-04: Sumit Sourabh
Correspondence and Canonicity in Non-Classical Logic

ILLC DS-2015-05: Facundo Carreiro
Fragments of Fixpoint Logics: Automata and Expressiveness

ILLC DS-2016-01: Ivano A. Ciardelli
Questions in Logic

ILLC DS-2016-02: Zoe Christoff
Dynamic Logics of Networks: Information Flow and the Spread of Opinion

ILLC DS-2016-03: Fleur Leonie Bouwer
What do we need to hear a beat? The influence of attention, musical abilities, and accents on the perception of metrical rhythm

ILLC DS-2016-04: Johannes Marti
Interpreting Linguistic Behavior with Possible World Models

ILLC DS-2016-05: Phong Le
Learning Vector Representations for Sentences - The Recursive Deep Learning Approach

ILLC DS-2016-06: Gideon Maillette de Buy Wenniger
Aligning the Foundations of Hierarchical Statistical Machine Translation

ILLC DS-2016-07: Andreas van Cranenburgh
Rich Statistical Parsing and Literary Language

ILLC DS-2016-08: Florian Speelman
Position-based Quantum Cryptography and Catalytic Computation

ILLC DS-2016-09: Teresa Piovesan
Quantum entanglement: insights via graph parameters and conic optimization

ILLC DS-2016-10: Paula Henk
Nonstandard Provability for Peano Arithmetic: A Modal Perspective

ILLC DS-2017-01: Paolo Galeazzi
Play Without Regret

ILLC DS-2017-02: Riccardo Pinosio
The Logic of Kant's Temporal Continuum

ILLC DS-2017-03: Matthijs Westera
Exhaustivity and intonation: a unified theory

ILLC DS-2017-04: Giovanni Cina
Categories for the working modal logician

ILLC DS-2017-05: Shane Noah Steinert-Threlkeld
Communication and Computation: New Questions About Compositionality

ILLC DS-2017-06: Peter Hawke
The Problem of Epistemic Relevance

ILLC DS-2017-07: Aybuke Ozgun
Evidence in Epistemic Logic: A Topological Perspective

ILLC DS-2017-08: Raquel Garrido Alhama
Computational Modelling of Artificial Language Learning: Retention, Recognition & Recurrence

ILLC DS-2017-09: Milos Stanojevic
Permutation Forests for Modeling Word Order in Machine Translation

ILLC DS-2018-01: Berit Janssen
Retained or Lost in Transmission? Analyzing and Predicting Stability in Dutch Folk Songs

ILLC DS-2018-02: Hugo Huurdeman
Supporting the Complex Dynamics of the Information Seeking Process

ILLC DS-2018-03: Corina Koolen
Reading beyond the female: The relationship between perception of author gender and literary quality

ILLC DS-2018-04: Jelle Bruineberg
Anticipating Affordances: Intentionality in self-organizing brain-body-environment systems

ILLC DS-2018-05: Joachim Daiber
Typologically Robust Statistical Machine Translation: Understanding and Exploiting Differences and Similarities Between Languages in Machine Translation

ILLC DS-2018-06: Thomas Brochhagen
Signaling under Uncertainty

ILLC DS-2018-07: Julian Schloder
Assertion and Rejection

ILLC DS-2018-08: Srinivasan Arunachalam
Quantum Algorithms and Learning Theory

ILLC DS-2018-09: Hugo de Holanda Cunha Nobrega
Games for functions: Baire classes, Weihrauch degrees, transfinite computations, and ranks

ILLC DS-2018-10: Chenwei Shi
Reason to Believe

ILLC DS-2018-11: Malvin Gattinger
New Directions in Model Checking Dynamic Epistemic Logic

ILLC DS-2018-12: Julia Ilin
Filtration Revisited: Lattices of Stable Non-Classical Logics

  • Acknowledgements
  • Introduction
    • Matrix multiplication
    • The asymptotic spectrum of tensors
    • Higher-order CW method
    • Abstract asymptotic spectra
    • The asymptotic spectrum of graphs
    • Tensor degeneration
    • Combinatorial degeneration
    • Algebraic branching program degeneration
    • Organisation
  • The theory of asymptotic spectra
    • Introduction
    • Semirings and preorders
    • Strassen preorders
    • Asymptotic preorders
    • Maximal Strassen preorders
    • The asymptotic spectrum
    • The representation theorem
    • Abstract rank and subrank
    • Topological aspects
    • Uniqueness
    • Subsemirings
    • Subsemirings generated by one element
    • Universal spectral points
    • Conclusion
  • The asymptotic spectrum of graphs; Shannon capacity
    • Introduction
    • The asymptotic spectrum of graphs
      • The semiring of graph isomorphism classes
      • Strassen preorder via graph homomorphisms
      • The asymptotic spectrum of graphs
      • Shannon capacity
    • Universal spectral points
      • Lovász theta number
      • Fractional graph parameters
    • Conclusion
  • The asymptotic spectrum of tensors; matrix multiplication
    • Introduction
    • The asymptotic spectrum of tensors
      • The semiring of tensor equivalence classes
      • Strassen preorder via restriction
      • The asymptotic spectrum of tensors
      • Asymptotic rank and asymptotic subrank
    • Gauge points
    • Support functionals
    • Upper and lower support functionals
    • Asymptotic slice rank
    • Conclusion
  • Tight tensors and combinatorial subrank; cap sets
    • Introduction
    • Higher-order Coppersmith–Winograd method
      • Construction
      • Computational remarks
      • Examples: type sets
    • Combinatorial degeneration method
    • Cap sets
      • Reduced polynomial multiplication
      • Cap sets
    • Graph tensors
    • Conclusion
  • Universal points in the asymptotic spectrum of tensors; entanglement polytopes, moment polytopes
    • Introduction
    • Schur–Weyl duality
    • Kronecker and Littlewood–Richardson coefficients
    • Entropy inequalities
    • Hilbert spaces and density operators
    • Moment polytopes
      • General setting
      • Tensor spaces
    • Quantum functionals
    • Outer approximation
    • Inner approximation for free tensors
    • Quantum functionals versus support functionals
    • Asymptotic slice rank
    • Conclusion
  • Algebraic branching programs; approximation and nondeterminism
    • Introduction
    • Definitions and basic results
      • Computational models
      • Complexity classes
      • The theorem of Ben-Or and Cleve
      • Approximation closure
      • Nondeterminism closure
    • Approximation closure of VP2
    • Nondeterminism closure of VP1
    • Conclusion
  • Bibliography
  • Glossary
  • Samenvatting
  • Summary
Page 94: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 95: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 96: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 97: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 98: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 99: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 100: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 101: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 102: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 103: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 104: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 105: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 106: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 107: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 108: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 109: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 110: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 111: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 112: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 113: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 114: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 115: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 116: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 117: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 118: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 119: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 120: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 121: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 122: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 123: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 124: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 125: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 126: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 127: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 128: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 129: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 130: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 131: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 132: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 133: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 134: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 135: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 136: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 137: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 138: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 139: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 140: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 141: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 142: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 143: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 144: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 145: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 146: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 147: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 148: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 149: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 150: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 151: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 152: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 153: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 154: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 155: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 156: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 157: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch
Page 158: UvA-DARE (Digital Academic Repository)as the \webklas" on the Riemann hypothesis organised by Jan van de Craats and Roland van der Veen, and the close vicinity of the UvA to the Dutch