
An Introduction to Proof Complexity, Part I.

Pavel Pudlák

Mathematical Institute, Academy of Sciences, Prague, and Charles University, Prague

Computability in Europe 2009, Heidelberg


computational complexity | proof complexity

complexity classes | theories

elementary operations | basic axioms

computations with bounded resources | induction restricted to some formulas

the P vs. NP problem | the $\Theta_P$ vs. $\Theta_{NP}$ problem

circuits | propositional proofs


Overview

Part I.

A lower bound for the propositional Pigeon-Hole Principle.

Theories and complexity classes.

Conditional and relativized separation of theories.

Part II.

Propositional proof systems.

Feasible interpolation.

Unprovability of circuit lower bounds.

Total search problems. (not presented at CiE)

Feasible incompleteness. (not presented at CiE)


Literature:

J. Krajíček: Bounded Arithmetic, Propositional Logic and Complexity Theory, 1995

P. Clote and E. Kranakis: Boolean Functions and Computational Models, Chapter 5, 2002

S. Cook and P. Nguyen: Foundations of Proof Complexity, to appear, 2009?

P. Pudlák: Logical Foundations of Mathematics and Complexity Theory, Chapter 6 (manuscript)

J. Krajíček: lecture notes available on his web page


A lower bound for the propositional Pigeon-Hole Principle

Paradigm: Exponential lower bound on the size of bounded depth circuits computing the parity function.

Theorem (Furst-Saxe-Sipser, Ajtai, Yao, Håstad)

Every bounded depth circuit computing parity has exponential size, i.e., $\forall d\, \exists \varepsilon_d\, \forall n$ every boolean circuit of depth $d$ computing the parity function of $n$ bits has size $\ge 2^{n^{\varepsilon_d}}$.

$\varepsilon_d$ = 1/10d

Switching Lemma:

1. random restrictions, ∧∨ ↦ ∨∧, reduction in the depth of the circuit

2. the parity function is reduced to a parity function on the remaining variables

3. if the circuit is small, we eventually get a k-CNF or k-DNF with k < n, which cannot compute the parity function.


A Frege Proof System

propositional variables $p_1, p_2, \ldots$

any complete finite set of connectives; we shall use the connectives ¬, ∨, and ∧

any complete finite set of rules

Example. [Hilbert and Ackermann]

Connectives ¬, ∨.

Axiom schemas

1 ¬(A ∨ A) ∨ A

2 ¬A ∨ (A ∨ B)

3 ¬(A ∨ B) ∨ (B ∨ A)

4 ¬(¬A ∨ B) ∨ (¬(C ∨ A) ∨ (C ∨ B))

Rule

From A and ¬A ∨ B derive B.

Every two Frege proof systems polynomially simulate each other.

depth of a formula = number of alternations of ¬, ∨, and ∧.

We shall abbreviate long disjunctions and conjunctions by $\bigvee$ and $\bigwedge$.

depth d proofs: proofs that use only formulas of depth at most d

bounded depth = constant depth


The tautology $PHP_n$

The depth 3 tautology expressing the principle that there is no 1-1 mapping $F : [n+1] \to [n]$:

$$\neg \bigwedge_{i=1}^{n+1} \bigvee_{j=1}^{n} p_{ij} \;\lor\; \bigvee_{j=1}^{n} \bigvee_{1 \le i < i' \le n+1} (p_{ij} \land p_{i'j}).$$

Meaning: $p_{ij}$ is true iff $F(i) = j$.

The clause that F is a function is missing.

A weaker version: bijective PHP.

Test question: Why PHP? It is simple, but not trivial.

Parity Principle: There is no partition of [2n + 1] into two-element blocks.
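As a sanity check on the formula, here is a minimal Python sketch that evaluates $PHP_n$ under every total truth assignment for very small n. The function names `php_holds` and `is_tautology` are illustrative assumptions, not part of the talk, and the brute-force search is only feasible for tiny n; it says nothing about proof size.

```python
from itertools import product


def php_holds(n, assignment):
    """Evaluate PHP_n under a total assignment {(i, j): bool},
    with pigeons i in 1..n+1 and holes j in 1..n."""
    pigeons = range(1, n + 2)
    holes = range(1, n + 1)
    # Inner part of the left disjunct: every pigeon i is sent to some hole j.
    every_pigeon_placed = all(
        any(assignment[i, j] for j in holes) for i in pigeons
    )
    # Right disjunct: some hole j receives two distinct pigeons i < i2.
    collision = any(
        assignment[i, j] and assignment[i2, j]
        for j in holes
        for i in pigeons
        for i2 in pigeons
        if i < i2
    )
    return (not every_pigeon_placed) or collision


def is_tautology(n):
    """Brute-force check over all 2^(n(n+1)) assignments (tiny n only)."""
    variables = [(i, j) for i in range(1, n + 2) for j in range(1, n + 1)]
    return all(
        php_holds(n, dict(zip(variables, values)))
        for values in product([False, True], repeat=len(variables))
    )
```

For example, `is_tautology(3)` runs through 4096 assignments of the 12 variables $p_{ij}$.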

Theorem (Ajtai, Krajíček-Pudlák-Woods, Beame-Impagliazzo-Pitassi)

Every bounded depth proof of $PHP_n$ has exponential size, i.e., $\forall d\, \exists \varepsilon_d\, \forall n$ every depth $d$ Frege proof of the bijective $PHP_n$ has size $\ge 2^{n^{\varepsilon_d}}$.

$\varepsilon_d$ = 1/6d


Basic Intuition: Truth assignments are 1-1 mappings [n + 1] → [n] (i.e., matchings).

Let |D| = n + 1, |R| = n, D ∩ R = ∅.

A matching decision tree is a finite tree T with queries:

What is the matching element for i (or j)?

for a given i ∈ D or j ∈ R.

Formally:

Definition

A matching decision tree is a finite tree T such that

all nodes different from leaves are labeled by elements of D ∪ R;

leaves are labeled by 0 (false) and 1 (true);

the edges going out of a node v labeled by i ∈ D are labeled by all j ∈ R that do not appear on the path from the root to v;

the edges going out of a node u labeled by j ∈ R are labeled by all i ∈ D that do not appear on the path from the root to u.

Intuition: T defines a boolean function on all matchings F : D → R.
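The definition can be made concrete with a small Python sketch. The encoding is an assumption made for illustration only: a leaf is the integer 0 or 1, an inner node is a pair `((side, element), children)` with `side` equal to `"D"` or `"R"`, and `children` maps each edge label to a subtree. The names `evaluate`, `is_matching_tree` and `EXAMPLE` are hypothetical.

```python
def evaluate(tree, matching):
    """Follow the branch selected by `matching`, a set of pairs (i, j).
    Returns the leaf label, or None if a queried element is unmatched."""
    to_range = dict(matching)                    # i -> j
    to_domain = {j: i for i, j in matching}      # j -> i
    while not isinstance(tree, int):
        (side, element), children = tree
        answer = (to_range if side == "D" else to_domain).get(element)
        if answer is None:
            return None
        tree = children[answer]
    return tree


def is_matching_tree(tree, D, R, used_D=frozenset(), used_R=frozenset()):
    """Check the edge-labeling conditions of the definition."""
    if isinstance(tree, int):
        return tree in (0, 1)
    (side, element), children = tree
    if side == "D":
        # Edges must be labeled by exactly the j in R not yet on the path.
        if set(children) != set(R) - used_R:
            return False
        return all(
            is_matching_tree(sub, D, R, used_D | {element}, used_R | {j})
            for j, sub in children.items()
        )
    # Node labeled by j in R: edges are the i in D not yet on the path.
    if set(children) != set(D) - used_D:
        return False
    return all(
        is_matching_tree(sub, D, R, used_D | {i}, used_R | {element})
        for i, sub in children.items()
    )


# n = 2: D = {1, 2, 3}, R = {1, 2}; first query pigeon 1, then hole 2.
EXAMPLE = (("D", 1), {1: (("R", 2), {2: 1, 3: 0}), 2: 0})
```

Since no total 1-1 mapping from D to R exists, `evaluate` is applied to partial matchings and returns `None` when the matching does not answer a query on its branch.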


The idea of the lower bound. Given a short proof of bounded depth, transform it into a sequence of shallow decision trees that represent the formulas. Then

1 axioms are represented by trees that accept all assignments,

2 logical rules preserve this property, but

3 $PHP_n$ is represented by the tree that does not accept any assignment, which is a contradiction.

Ajtai's proof (of a weaker lower bound):

finite combinatorial constructions

↓

nonstandard model with a 1-1 function F : [n + 1] → [n]

↓

lower bound on Frege proofs.


A branch b in a matching tree defines a partial matching $\sigma_b \subseteq D \times R$.

A partial matching σ defines a partial truth assignment $a_\sigma$ as follows:

$p_{ij} = 1$, if (i, j) ∈ σ

$p_{ij} = 0$, if (i, j) ∉ σ and (i ∈ Dom(σ) or j ∈ Rng(σ))

$p_{ij} = *$ otherwise.

Definition

A matching tree T represents a formula φ if, for every branch b of T, $\varphi|_{\sigma_b} \equiv 0$ or $\equiv 1$, and this value equals the label of the leaf of b.
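The three cases of the partial truth assignment $a_\sigma$ translate directly into code. The following is a minimal sketch; the function name `partial_assignment` and the use of the string `"*"` for an unset variable are assumptions made for illustration.

```python
def partial_assignment(sigma, D, R):
    """Return a_sigma as a dict {(i, j): 1, 0 or "*"} for a partial
    matching sigma given as a set of pairs (i, j)."""
    dom = {i for i, _ in sigma}
    rng = {j for _, j in sigma}
    a = {}
    for i in D:
        for j in R:
            if (i, j) in sigma:
                a[i, j] = 1          # i is matched to j
            elif i in dom or j in rng:
                a[i, j] = 0          # i or j is already matched elsewhere
            else:
                a[i, j] = "*"        # both i and j are still free
    return a
```

For D = {1, 2, 3}, R = {1, 2} and σ = {(1, 2)}, only $p_{21}$ and $p_{31}$ stay unset.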


In order to preserve the rules, trees must be shallow.

Some formulas can only be represented by trees of depth n (the maximal depth); e.g., $\bigoplus_{ij} p_{ij}$.

Therefore, we have to apply random restrictions to make the formulas representable by shallow trees.

Random restriction $R_q$, 0 < q < 1: pick uniformly at random a partial matching ρ ⊆ D × R that leaves $\lceil qn \rceil$ elements of R unmatched, i.e., $|\rho| = n - \lceil qn \rceil$.

Random restrictions are applied to formulas; this is a syntactical operation defined inductively:

$p_{ij}|_\rho := p_{ij}$ if $a_\rho(p_{ij}) = *$, and $p_{ij}|_\rho := a_\rho(p_{ij})$ otherwise, where $a_\rho$ is the partial truth assignment defined by the partial matching ρ as above;

1 ∧ φ := φ, 0 ∧ φ := 0;

1 ∨ φ := 1, 0 ∨ φ := φ.

Test question: Why do we do it syntactically? Because the semantics is trivial: all formulas in a proof are tautologies!
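The inductive definition of the restriction can be sketched in Python as follows. The formula encoding (constants 0 and 1, `("var", i, j)`, `("not", φ)`, `("and", [...])`, `("or", [...])`) and the name `restrict` are assumptions made for illustration; the simplification of negated constants (¬0 := 1, ¬1 := 0) is also an added assumption, since the rules above only mention ∧ and ∨.

```python
def restrict(phi, a):
    """Apply a partial assignment `a` ({(i, j): 0, 1 or "*"}) to a formula
    and simplify syntactically with the rules for constants."""
    if phi in (0, 1):
        return phi
    op = phi[0]
    if op == "var":
        value = a.get((phi[1], phi[2]), "*")
        return phi if value == "*" else value
    if op == "not":
        sub = restrict(phi[1], a)
        return 1 - sub if sub in (0, 1) else ("not", sub)
    # op is "and" or "or": 0 absorbs a conjunction, 1 absorbs a disjunction.
    absorbing = 0 if op == "and" else 1
    parts = []
    for sub in phi[1]:
        r = restrict(sub, a)
        if r == absorbing:
            return absorbing          # 0 ∧ φ := 0,  1 ∨ φ := 1
        if r != 1 - absorbing:
            parts.append(r)           # drop neutral constants: 1 ∧ φ := φ, 0 ∨ φ := φ
    if not parts:
        return 1 - absorbing
    return parts[0] if len(parts) == 1 else (op, parts)
```

The function never evaluates a formula semantically; it only rewrites it, which is the point of the test question above.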


k-CNF: a conjunction of disjunctions of size ≤ k

k-DNF: a disjunction of conjunctions of size ≤ k

k-Tree: a tree of depth ≤ k

A k-Tree can be represented by a k-CNF and by a k-DNF.


Depth reduction by random restrictions

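We want:

$\bigvee_1 \bigwedge_2 \cdots \bigwedge_{d-2} \bigvee_{d-1} \bigwedge_d$

↓

$\bigvee_1 \bigwedge_2 \cdots \bigwedge_{d-2} \bigwedge_{d-1} \bigvee_d$

1. original:

$\bigvee_1 \bigwedge_2 \cdots \bigwedge_{d-2}$ s-DNF

↓

$\bigvee_1 \bigwedge_2 \cdots \bigwedge_{d-2}$ t-CNF

2. new way:

$\bigvee_1 \bigwedge_2 \cdots \bigvee_{d-2} \bigwedge_{d-1}$ s-Tree

↓

$\bigvee_1 \bigwedge_2 \cdots \bigvee_{d-2}$ t-Tree

2. new way in reality:

$\bigvee_1 \bigwedge_2 \cdots \bigwedge_{d-2} \bigvee_{d-1}$ s-Tree

↓

$\bigvee_1 \bigwedge_2 \cdots \bigwedge_{d-2} \bigvee_{d-1}$ s-DNF

↓

$\bigvee_1 \bigwedge_2 \cdots \bigwedge_{d-2}$ s-DNF

↓ (switching lemma)

$\bigvee_1 \bigwedge_2 \cdots \bigwedge_{d-2}$ t-Tree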


Lemma (Matching Switching Lemma)

Let $q^4 n^3 \le 1/10$ and let φ be an s-DNF. Then for random $\rho \in R_q$, the probability that $\varphi|_\rho$ can be represented by a t-Tree is at least

$1 - (9 q^4 n^3 s)^t$.

[In Håstad's Lemma the formula is $1 - (5qs)^t$.]

The Lemma is applied with $q = n^{-\alpha}$, $s = n^{\beta}$, $t = n^{\gamma}$, $0 < \alpha, \beta, \gamma < 1$, and $9 q^4 n^3 s \le$ constant $< 1$.

Recall: $\rho \in R_q$ are partial matchings of size $(1-q)n$.
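To get a feeling for the parameters, the bound can be evaluated numerically. The sketch below is only arithmetic on the formula of the Lemma; the function name `switching_bound` and the sample values n = 10^6, α = 0.9, β = γ = 0.1 are assumptions chosen for illustration, not values from the talk.

```python
def switching_bound(n, alpha, beta, gamma):
    """Return (9 q^4 n^3 s, 1 - (9 q^4 n^3 s)^t) for
    q = n^-alpha, s = n^beta, t = n^gamma."""
    q, s, t = n ** -alpha, n ** beta, n ** gamma
    # Hypothesis of the Lemma.
    if q ** 4 * n ** 3 > 1 / 10:
        raise ValueError("hypothesis q^4 n^3 <= 1/10 fails")
    base = 9 * q ** 4 * n ** 3 * s
    return base, 1 - base ** t


base, success = switching_bound(10 ** 6, alpha=0.9, beta=0.1, gamma=0.1)
```

With these sample values $q^4 n^3 s = n^{-0.5}$, so the base of the failure probability is about 0.009 and the lower bound on the success probability is very close to 1. Note that the hypothesis $q^4 n^3 \le 1/10$ forces α to be at least roughly 3/4 for large n.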


The proof.

1. Håstad's method of conditional probabilities: very difficult to use here.

2. Razborov's counting method.

Choose a suitable method that transforms an s-DNF into a tree T. Say that $\rho \in R_q$ is bad if the tree T obtained from $\varphi|_\rho$ has depth > t.

$$\text{probability} \le \frac{\#\,\text{bad } \rho}{\#\,\text{all } \rho}$$

To upper bound the probability one constructs a 1-1 mapping from bad ρ's to a set X such that $|X| \ll \#\,\text{all } \rho$'s.

Alternatively: One shows that to determine a bad ρ we need substantially fewer bits than to determine a general ρ.


Given a bad ρ, extend it to ρ′ := ρ ∪ σ, where σ is a partial matching computed by the first branch b of T of length > t. Thus

$|\rho| = (1-q)n$ and $|\rho'| > (1-q)n + t$

$$\#\,\text{all } \rho = \binom{n+1}{qn} \binom{n}{qn} ((1-q)n)!$$

ρ′ is an element of a set of size

$$\le \binom{n+1}{qn-t} \binom{n}{qn-t} ((1-q)n+t)!$$

Thus if ρ ↦ ρ′ were 1-1

$$\text{probability} \le \frac{\#\,\text{bad } \rho}{\#\,\text{all } \rho} \le \frac{\binom{n+1}{qn-t} \binom{n}{qn-t} ((1-q)n+t)!}{\binom{n+1}{qn} \binom{n}{qn} ((1-q)n)!}$$

Notice that

$$\binom{n+1}{qn-t} \ll \binom{n+1}{qn} \quad\text{and}\quad \binom{n}{qn-t} \ll \binom{n}{qn}.$$

ρ ↦ ρ′ is not 1-1, but the number of ρ's mapped to a given ρ′ is small. Formally, we define ρ ↦ (ρ′, I), where I is some information that suffices to recover ρ from ρ′. ...
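The ratio in the last inequality can be computed exactly for concrete numbers. The sketch below uses the counts exactly as displayed above; the name `naive_ratio`, the parameter `m` standing for qn, and the sample values n = 10000, m = 10, t = 3 are assumptions for illustration. It covers only the idealized case in which ρ ↦ ρ′ is 1-1 and ignores the extra information I.

```python
from fractions import Fraction
from math import comb, perm


def naive_ratio(n, m, t):
    """Exact value of
    C(n+1, m-t) C(n, m-t) ((n-m)+t)!  /  (C(n+1, m) C(n, m) (n-m)!)
    where m plays the role of qn and n - m the role of (1-q)n."""
    k = n - m
    # ((1-q)n + t)! / ((1-q)n)! is the falling factorial perm(k + t, t).
    numerator = comb(n + 1, m - t) * comb(n, m - t) * perm(k + t, t)
    denominator = comb(n + 1, m) * comb(n, m)
    return Fraction(numerator, denominator)


ratio = naive_ratio(10_000, 10, 3)
```

Heuristically each binomial quotient contributes about $(qn/((1-q)n))^t$ and the factorial quotient about $((1-q)n)^t$, so the ratio behaves like $(q^2 n/(1-q))^t$; this is small only when q is well below $n^{-1/2}$.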


Nonstandard semantics for matching trees

Lemma

Let $M \models Tr(\mathbb{N})$ be a countable nonstandard model, $a > \mathbb{N}$ a nonstandard element and t ∈ M such that $a/t > \mathbb{N}$. Then there exist 1-1 and onto mappings F : [a] → [a − 1] such that for every internal set X ⊆ [a] (Y ⊆ [a − 1]) of size ≤ t, F ∩ (X × [a − 1]) (F ∩ ([a] × Y)) is also internal.

In fact, for every partial matching U of size ≤ b there exists such an F that extends U.

1 we can evaluate internal matching decision trees of depth ≤ t on such mappings;

2 T is satisfied by all such mappings iff all leaves of T are labeled by 1.


An application

Theorem (Ajtai, 1988)

Let $M \models Tr(\mathbb{N})$ be a nonstandard model, $a > \mathbb{N}$ a nonstandard element. Let $M_a$ be M restricted to the interval [0, a]. Then it is possible to extend $M_a$ to $M_a[F]$ so that

1 F is a 1-1 mapping from [a] to [a − 1] and

2 induction holds for all formulas in $M_a[F]$.

The same holds true for $M^*_a = \bigcup_{n \in \mathbb{N}} M_{a^n}$.

Using exponential lower bounds we can improve it to $M_{2^{n^{o(1)}}}$.

Corollary

PHP for F is not provable in $I\Delta_0[F]$.

($I\Delta_0[F]$ is the theory with induction for bounded formulas in the language of arithmetic extended by the function symbol F.)


Problem (notoriously open)

Does the schema $I\Delta_0$ prove the schema $PHP\Delta_0$?

By the Corollary, it does not do it uniformly.

Idea of the proof of the Theorem

By contradiction, suppose that PHP for F is provable from the diagram of $M_a$ and induction. Let P be such a proof. W.l.o.g. assume that F occurs only in atomic formulas of the form F(x) = y, where x, y are variables.

Arguing in M,

1 replace ∃x ≤ t φ and ∀x ≤ t φ by $\bigvee_{x \le t} \varphi$ and $\bigwedge_{x \le t} \varphi$,

2 replace each application of the induction axioms by iterated modus ponens,

3 replace atomic formulas not containing F by the truth constants,

4 replace atomic formulas F(b) = c by propositional variables $p_{bc}$.

Thus we obtain a bounded depth Frege proof of $PHP_{a-1}$ of polynomial size. This is a contradiction, because $M \models Tr(\mathbb{N})$.

q.e.d.

Questions?
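Steps 1, 3 and 4 of the translation in the proof idea above can be sketched in Python for ordinary (standard) numbers. The formula encoding and the name `translate` are assumptions made for illustration: bounds are given as integers rather than terms, F-free atoms are passed as Python predicates on the variable environment, and step 2 (unwinding induction into iterated modus ponens) is not covered.

```python
def translate(phi, env=None):
    """Translate a bounded formula into a propositional formula.

    phi is one of:
      ("F", x, y)                      atom F(x) = y, x and y variable names
      ("atom", pred)                   F-free atom, pred(env) -> bool
      ("not", phi), ("and", [..]), ("or", [..])
      ("exists", var, bound, phi), ("forall", var, bound, phi)
    """
    env = env or {}
    op = phi[0]
    if op == "F":
        _, x, y = phi
        return ("p", env[x], env[y])            # step 4: variable p_{bc}
    if op == "atom":
        return 1 if phi[1](env) else 0          # step 3: truth constant
    if op == "not":
        return ("not", translate(phi[1], env))
    if op in ("and", "or"):
        return (op, [translate(sub, env) for sub in phi[1]])
    if op in ("exists", "forall"):
        _, var, bound, body = phi
        # Step 1: a bounded quantifier becomes a big disjunction/conjunction.
        parts = [translate(body, {**env, var: v}) for v in range(bound + 1)]
        return ("or" if op == "exists" else "and", parts)
    raise ValueError(f"unknown connective: {op!r}")


# "Every x <= 2 has some y <= 1 with F(x) = y".
total = ("forall", "x", 2, ("exists", "y", 1, ("F", "x", "y")))
```

The size of the output is polynomial in the bounds, and its depth is governed by the number of quantifier alternations, which is why a fixed proof in the theory yields bounded depth propositional proofs.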


Theories and complexity classes

C complexity class ↔ $\Theta_C$ first order theory

suitable language L

class of formulas Γ that characterizes the class C

basic axioms (fixing the interpretation of primitive notions)

induction for formulas of Γ

Examples.

1. C = ARITH, $\Theta_{ARITH}$ is Peano Arithmetic. Basic axioms: Robinson's Arithmetic.

2. C = PR, $\Theta_{PR}$ is Primitive Recursive Arithmetic.

3. C = $\Delta_0$, $\Theta_{\Delta_0}$ is $I\Delta_0$, i.e., Peano Arithmetic with induction restricted to $\Delta_0$ formulas (= formulas with bounded quantifiers ∀x ≤ t and ∃x ≤ t, t a term).

Linear Time Hierarchy = $\Delta_0$, thus also $\Theta_{\text{Linear Time Hierarchy}} \equiv I\Delta_0$.


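4. PH is the Polynomial (Time) Hierarchy, $\bigcup_k \Sigma^p_k$.

$\Theta_{PH} \equiv I\Delta_0[x^{\log x}] \equiv I\Delta_0 + \forall x \exists y\,(y = x^{\log x})$

Buss's Bounded Arithmetic

Language: $0, S, +, \times, \lfloor x/2 \rfloor, \log x, 2^{\log x \cdot \log y}, \le$

Classes of bounded formulas $\Sigma^b_k$ (they define the classes of sets $\Sigma^p_k$).

Basic axioms BASIC

$T^k_2 := BASIC$ + induction for $\Sigma^b_k$

For k ≥ 1, $\Theta_{\Sigma^p_k} \equiv T^k_2$.

In particular, $\Theta_{NP} \equiv T^1_2$.

But $\Sigma^b_0$ does not define all sets in P and $T^0_2$ is not $\Theta_P$.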


How to define $\Theta_P$?

1. Cook: theory PV with a function symbol for every polynomial time algorithm (using the schema of recursion on notation).

2. Jeřábek: extend Buss's language by $\lfloor x/2^y \rfloor$ and add a few axioms to BASIC. Then

$T^0_2 \equiv \Theta_P$

in spite of the fact that (probably) the extended $\Sigma^b_0$ still does not define all sets in P.

$\Theta_P$ versus $\Theta_{NP}$

Theorem (Krajíček-Pudlák-Takeuti 1991)

If $\Theta_P \equiv \Theta_{NP}$, then $\Sigma^p_2 = \Pi^p_2$.

If $\Theta_P \vdash P = NP$, then $\Theta_P \equiv \Theta_{NP}$.

Problem (central in proof complexity)

Prove $\Theta_P \not\equiv \Theta_{NP}$.


Witnessing functions

FP the class of polynomial time computable functions.

Theorem (Buss 1984)

1. Suppose $\Theta_P \vdash \forall x \exists y\, \varphi(x, y)$, where $\varphi \in \Sigma^b_1$. Then there exists f ∈ FP such that

$\mathbb{N} \models \forall x\, \varphi(x, f(x))$. (1)

2. If f ∈ FP then there exists a formula φ(x, y) such that $\Theta_P \vdash \forall x \exists y\, \varphi(x, y)$ and (1) holds.

We say that FP are witnessing functions for $\Theta_P$.

Thus, e.g., $T^0_2$ is associated with FP.

Cook's approach to C ↔ $\Theta_C$

Say that a function f is in a complexity class C if the bit graph of f is in C.

A theory T is associated with a complexity class C if the functions (with the bit graph) in C are witnessing functions of T.


Second order theories

1. $w \in \{0,1\}^n$ ↔ binary representation of a number $m < 2^n$

2. $w \in \{0,1\}^n$ ↔ $X \subseteq [0, n-1]$

As usual, induction for sets is the same; the schema of comprehension depends on the complexity class.

This gives more flexibility to define theories associated with various complexity classes.


The witnessing theorem for conditional separations of $\Theta_P$ from $\Theta_{NP}$

$\Theta_{NP} = \Theta_P + \forall a \exists y\,(y = \max_{x \le a} f(a, x))$,

where f is a polynomial time computable function such that f(a, x) ≤ a. The parameter a will be omitted from f.

$f(b) = \max_x f(x) \Leftrightarrow \forall y \le a\,(f(b) \ge f(y))$

Theorem

If $\Theta_P = \Theta_{NP}$ then there exist n and polynomial time computable functions $g_1, \ldots, g_n$ such that for all $y_1, \ldots, y_n \le a$

$g_1(a) \le a \land g_2(a, y_1) \le a \land \cdots \land g_n(a, y_1, \ldots, y_{n-1}) \le a$

and

$f(g_1(a)) \ge f(y_1) \lor f(g_2(a, y_1)) \ge f(y_2) \lor \cdots \lor f(g_n(a, y_1, \ldots, y_{n-1})) \ge f(y_n)$.

= interactive computation of $\max_x f(x)$ using counterexamples.

If $\Sigma^p_2 \ne \Pi^p_2$, then such a computation is not possible, hence $\Theta_P \ne \Theta_{NP}$.
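The shape of the interactive computation can be illustrated by a toy protocol in Python. This is only a sketch of the interaction pattern: the names `student_teacher_max` and `honest_teacher`, the brute-force teacher, and the sample function are assumptions for illustration. The student strategy shown ($g_1(a) = 0$, $g_{k+1}(a, y_1, \ldots, y_k) = y_k$) needs a number of rounds that may grow with a, whereas the Theorem speaks about a fixed n that does not depend on a.

```python
def honest_teacher(f, a, candidate):
    """Return some y <= a with f(y) > f(candidate), or None if the
    candidate already maximizes f on [0, a]."""
    for y in range(a + 1):
        if f(y) > f(candidate):
            return y
    return None


def student_teacher_max(f, a, teacher=honest_teacher):
    """Student proposes a candidate, teacher answers with a counterexample.
    Trivial strategy: always adopt the last counterexample.
    Returns (argmax found, number of candidates proposed)."""
    candidate = 0            # plays the role of g_1(a)
    rounds = 1
    while True:
        y = teacher(f, a, candidate)
        if y is None:
            return candidate, rounds
        candidate = y        # plays the role of g_{k+1}(a, y_1, ..., y_k) = y_k
        rounds += 1


def sample_f(x):
    # Hypothetical f with f(x) <= a for a = 10.
    return (7 * x) % 11


result = student_teacher_max(sample_f, 10)
```

Each round corresponds to one disjunct $f(g_k(a, y_1, \ldots, y_{k-1})) \ge f(y_k)$: the interaction stops at the first disjunct that the teacher cannot refute.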


$f(a, g_1(a)) \ge f(a, y_1) \lor f(a, g_2(a, y_1)) \ge f(a, y_2) \lor \cdots \lor f(a, g_n(a, y_1, \ldots, y_{n-1})) \ge f(a, y_n)$

is obtained from Herbrand's theorem applied to

$\forall a \exists x \forall y\; f(a, x) \ge f(a, y)$.

Theorem (Herbrand's Theorem for prefix ∀∃∀)

$\forall a \exists x \forall y\; \varphi(a, x, y)$

is provable in the predicate calculus iff

$\varphi(a, t_1(a), y_1) \lor \varphi(a, t_2(a, y_1), y_2) \lor \cdots \lor \varphi(a, t_n(a, y_1, \ldots, y_{n-1}), y_n)$

is provable in the propositional calculus for some terms $t_1, \ldots, t_n$ containing only the variables displayed.


Relativized separations

In complexity theory we have, e.g., oracles A, B such that $P^A = NP^A$ and $P^B \ne NP^B$ [Baker-Gill-Solovay 1975].

Definition

Let Γ be a set of formulas defined by a syntactical condition that ignores atomic formulas. Let T be a theory axiomatized by a schema (of induction, PHP, etc.) for the class of formulas Γ. Then the relativized theory T[R] is defined by extending the class of formulas Γ to Γ[R], R a new predicate symbol, and extending the schema of T to the schema for all formulas in Γ[R]. T[R] does not contain any specific axioms about R.

Theorem

$PHP\Delta_0[R] \not\equiv I\Delta_0[R]$.


Theorem

$\Theta_P[R] \not\equiv \Theta_{NP}[R]$.

Proof: The proof of $\Theta_P \equiv \Theta_{NP} \Rightarrow \Sigma^p_2 = \Pi^p_2$ relativizes. Use an oracle A such that $\Sigma^{p,A}_2 \ne \Pi^{p,A}_2$, whose existence is a consequence of Håstad's lower bounds.

Theorem

For all k ≥ 0,

1. If $\Theta_{\Sigma^b_k} \equiv \Theta_{\Sigma^b_{k+1}}$, then $\Sigma^p_{k+2} = \Pi^p_{k+2}$.

2. $\Theta_{\Sigma^b_k}[R] \not\equiv \Theta_{\Sigma^b_{k+1}}[R]$.

Note: The oracles separating $\Sigma^p_{k+2}$ from $\Pi^p_{k+2}$ are constructed using lower bounds on bounded depth boolean circuits [Håstad 1989]. These bounds were proved using random restrictions.
