
IPEC 2011 - Kernelization

A tutorial on Kernelization

Hans L. Bodlaender

On this talk

1. Kernelization: what, why, how?

2. Connection with fixed parameter tractability: problems without kernels

3. Problems without polynomial kernels

4. Conclusions

• Thanks to many people for inspiration, collaboration and contribution!

What to do with hard problems?

• Many combinatorial problems are hard, e.g., NP-complete
  – Arising in many contexts

• Approaches to deal with them:
  – Approximations
  – Special cases
  – Exact algorithms
    • ILP techniques, branch and bound, SAT-solvers, …

Useful approach: preprocessing

• Before running a slow exact algorithm, preprocess / simplify the data:

• Transform to (hopefully smaller) equivalent instance

[Diagram: Input I → Preprocess → equivalent smaller input I’ → Solve → output for I’ → Undo preprocessing → output for the original input]

On preprocessing

• Relatively fast step

• Attempt to obtain equivalent instance:
  – Answer does not change
  – Size may decrease

• Slow algorithm (like ILP-solver) uses hopefully less time on reduced instance

Kernelization: central question

• What can we prove about the size of a reduced instance, assuming polynomial time preprocessing?

What we cannot expect

Proposition: If P ≠ NP, then an NP-hard problem Q has no polynomial time preprocessing algorithm A that always reduces an input to a smaller equivalent input.

Proof: If A exists, then P=NP: repeat the algorithm till we have an instance of size O(1) and solve.

• So … instead we investigate reduced instance size as function of a parameter of the inputs.

Parameterized problem

• Subset of Σ* × N:
  – (“real input”, “parameter”)
  – Decision problem
  – Part of the input is an integer, called the parameter

• We express quality of preprocessing as function of the parameter

Classic first example: Vertex Cover

• Vertex cover:
  – Input: Graph G=(V,E), integer k
  – Question: Is there a vertex cover of size at most k in G, i.e., a set W ⊆ V such that for all {v,w} ∈ E, v ∈ W or w ∈ W?
  – Parameter: k

Simplification rules

Input: Graph G and integer k

• Rule 1: remove vertex with no neighbors.

• Rule 2: if a vertex v has k+1 or more neighbors, then
  – {v must be in each vertex cover of size at most k}
  – Remove v and all incident edges;
  – Set k = k – 1.

• Rule 3: if k < 0, then say no.

Counting rule

• Rule 4: if earlier rules do not apply, and we have more than k² edges, then say no.
  – Each vertex has degree at most k, so it can cover at most k edges. So, if G has more than k² edges, there is no vertex cover of size at most k.

• Algorithm: execute rules while possible.

• Output of algorithm:
  – Sometimes no (certainly no solution exists).
  – Sometimes an equivalent instance with at most k² edges (and hence O(k²) vertices).
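
The four rules can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk: the function name `vertex_cover_kernel` and the edge-set representation are assumptions of the sketch, and the graph is assumed to be simple. Because only edges are stored, Rule 1 (isolated vertices) is implicit.

```python
from collections import Counter


def vertex_cover_kernel(edges, k):
    """Apply Rules 1-4 to a simple graph given as an iterable of edges.

    Returns (reduced_edges, reduced_k), or None when the answer is "no".
    """
    # Storing only edges makes Rule 1 implicit: isolated vertices never appear.
    edges = {frozenset(e) for e in edges}
    while True:
        if k < 0:                      # Rule 3
            return None
        degree = Counter(v for e in edges for v in e)
        high = next((v for v, d in degree.items() if d > k), None)
        if high is None:
            break
        # Rule 2: a vertex with at least k+1 neighbors must be in the cover.
        edges = {e for e in edges if high not in e}
        k -= 1
    if len(edges) > k * k:             # Rule 4 (counting rule)
        return None
    return edges, k
```

For a star with five leaves and k = 1, Rule 2 removes the center and the sketch returns an empty edge set with k = 0; for a triangle with k = 1 it returns None.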

“Kernel” for Vertex Cover

Theorem: There is a polynomial time algorithm that, given an input (G,k) of Vertex Cover, either decides on this input or gives an equivalent instance with O(k²) vertices and edges.

• Instead of deciding, we can also transform to trivial no instance (e.g., graph with one edge and k=0).

• We say:
  – Vertex Cover has a kernel with O(k²) vertices and edges.

Kernel (definition)

• A kernelization algorithm or kernel for a parameterized problem Q is an algorithm A that maps inputs for Q to inputs for Q, such that
  – A uses polynomial time;
  – For all (x,k): (x,k) ∈ Q if and only if A(x,k) ∈ Q;
  – There is a function f such that for all (x’, k’) = A(x,k):
    • k’ ≤ f(k);
    • |x’| ≤ f(k).

Size and parameter of new instance bounded by function of old parameter

Research questions

• For parameterized problems Q:
  – Does Q have a kernel?
  – If so, how small (function f) can this kernel be?

• Linear kernels?

• Polynomial kernels?

• Any kernels?

Motivation for kernels

• Analysis of preprocessing.

• Kernels give new preprocessing steps.

• First step for FPT algorithms.

Compare

• Approximation algorithm = upper bound and lower bound heuristic + a proof of its quality.

• Kernel = preprocessing heuristic + a proof of its quality.

Overview of problem behavior

• O(1) size kernels: problems in P. Ex.: Eulerian Graph
  – NP-completeness (variable parameter)

• Polynomial kernels. Shown with algorithm. Ex.: Vertex Cover
  – compositionality, ppt-transformations, cross-composition

• Kernels, but not polynomial sized. Shown (usually) with FPT-algorithm. Ex.: Long Path
  – W[1]-hardness

• XP: No kernel, polynomial if parameter is bounded. Ex.: Independent Set
  – NP-completeness (fixed parameter)

• Bad. Example: Graph Coloring is NP-complete for 3 colors

How do we make kernelization algorithms

• General method:
  – Invent safe rules.
    • Safe rules change an instance to an equivalent instance.
  – Rules should modify instances to equivalent instances that
    • are smaller, or
    • give more structural insight.
  – Have a
    • counting rule, or a
    • counting argument.

Designing the algorithm

Repeat until we have a (small enough) kernel:

• Invent safe rules.

• Analyse instances: if no safe rule applies, is the instance size bounded? If not, why not? Can we find a rule that avoids such situations?

Example: Weighted marbles

• Instance: sequence of marbles and an integer k. Each marble has a positive integer cost and a color.

• Question: can we remove marbles of total cost at most k such that for each color, all marbles with that color are consecutive?

• Parameter: k.

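[Figure: a sequence of marbles with costs 3 4 1 3 2 2 6; removing one marble of cost 3 and one of cost 2 leaves 4 1 3 2 6. The colors are not recoverable from this transcript.]

Solution of cost 5 = 3 + 2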

Rule 1

• If we have two consecutive marbles of the same color, replace them by one marble with the sum of their weights.

[Figure: two consecutive marbles of the same color with weights 5 and 7 are replaced by one marble of weight 12.]

What we have now:

• Two successive marbles have a different color.

• But, we can have many color changes, even in a solution of cost 1.

[Figure: example sequence of marbles with weights 3 4 1 3 2 2 6 1.]

Good colors

• A color is good, if there is only one marble with this color. Otherwise, the color is bad.

[Figure: example sequence of marbles with weights 3 4 1 3 2 2 6 1.]

Rule 2

• Suppose two successive marbles both have a good color. Give the second the color of the first.

[Figure: the sequence 3 4 1 3 2 2 6 1 before and after applying Rule 2 three times.]

Rule 2 does not make the instance smaller, but it makes it simpler: fewer colors! I.e., it increases our structural insight!

Algorithm

• While Rule 1 or Rule 2 is possible: apply the rule.

• Afterwards:
  – No 2 successive marbles of the same color.
  – No 2 successive marbles with a good color.

• The number of marbles is at most twice (+1) the number of marbles with a bad color.

• Can we bound the number of bad colored marbles?

Rule 3: counting rule

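• If there are at least 4k+1 bad colored marbles, say no (or transform to an O(1) size no-instance).
  – Safeness: By deleting one marble, the number of bad colored marbles can decrease by at most 4 (assuming rule 1): the deleted marble itself, one marble because its two neighbors merge, and up to two more because a color becomes good. E.g., deleting the second marble of A B A B leaves A B after rule 1, so all four bad colored marbles disappear at once. Each deletion costs at least 1, so at most k marbles can be deleted.

• Applying rules 1, 2, 3 while possible gives an instance with O(k) marbles.

• Is this a kernel for the problem?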

Rule 4

• If a marble has weight > k+1, give it weight k+1.
  – Safeness: marble is never removed.

• Kernelization algorithm:
  – While Rules 1 – 4 are possible, apply them.

• Polynomial time. Gives equivalent instance with O(k log k) bits and O(k) marbles.

• Theorem: Weighted marbles problem has kernel of size O(k log k).
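
The marble rules can be tried out with a small Python sketch. Everything named here (`kernelize_marbles`, `solvable`, the `(weight, color)` pair representation) is an assumption of this illustration, not part of the talk. The counting rule uses the threshold 4k, a choice of the sketch: a single deletion followed by Rule 1 can remove up to four bad colored marbles (delete the second marble of A B A B). The brute-force `solvable` is only there to compare answers on tiny instances.

```python
from collections import Counter
from itertools import combinations


def kernelize_marbles(marbles, k):
    """marbles: list of (weight, color) pairs; k: budget (assumed >= 0).

    Returns (reduced_marbles, k), or None when the counting rule says "no".
    """
    ms = [(min(w, k + 1), c) for w, c in marbles]            # Rule 4
    changed = True
    while changed:
        changed = False
        count = Counter(c for _, c in ms)
        merged = []
        for w, c in ms:
            if merged:
                pw, pc = merged[-1]
                same_color = pc == c                          # Rule 1
                both_good = count[pc] == 1 and count[c] == 1  # Rule 2
                if same_color or both_good:
                    # Merge into the previous marble; keep the weight capped.
                    merged[-1] = (min(pw + w, k + 1), pc)
                    changed = True
                    continue
            merged.append((w, c))
        ms = merged
    count = Counter(c for _, c in ms)
    bad = sum(1 for _, c in ms if count[c] > 1)
    if bad > 4 * k:                                           # counting rule
        return None
    return ms, k


def solvable(marbles, k):
    """Brute force over all deletion sets; only for tiny instances."""
    n = len(marbles)
    for r in range(n + 1):
        for removed in combinations(range(n), r):
            if sum(marbles[i][0] for i in removed) > k:
                continue
            rest = [c for i, (_, c) in enumerate(marbles) if i not in removed]
            # Colors at which a new run starts; each may start only one run.
            runs = [c for i, c in enumerate(rest) if i == 0 or rest[i - 1] != c]
            if len(runs) == len(set(runs)):
                return True
    return False
```

Applying Rule 2 and then Rule 1 is folded into one merge step here: two successive good marbles become one marble carrying the first color and the capped sum of the weights.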

Many recent results

• Kernelization algorithms usually have the form:
  – Rules. Often with nontrivial correctness proofs.
  – Counting argument. Often nontrivial combinatorics.

• General techniques: meta-algorithms, crown reductions, protrusions, …

• Sometimes, no (small) kernel (seems to) exist: can we show this?

Connection with Fixed Parameter Tractability

• A parameterized problem P is Fixed Parameter Tractable (FPT) if there is an algorithm solving P that on inputs (x,k) uses time
  – f(k) · |x|^c
  – for a constant c
  – and some (computable) function f.

Three variants of FPT

• Non-uniform:
  – For constant c: for every k, there is an algorithm that runs in O(n^c) time.

• Uniform:
  – For constant c, for a function f: there exists an algorithm that runs in f(k)·n^c time.

• Strongly uniform:
  – For constant c, for a computable function f: there exists an algorithm that runs in f(k)·n^c time.

Relation between variants

• Uniform is a proper subset of non-uniform.
  – Example 1: {(x,k) | k ∈ X} for some undecidable set of integers X is in non-uniform but not in uniform FPT.
  – Example 2: if w is a graph parameter that does not increase by taking minors, then Robertson-Seymour theory tells that {(G,k) | w(G) ≤ k} is in non-uniform FPT.

• Strongly uniform is a proper subset of uniform.
  – Proof by Downey and Fellows.

A useful theorem with a curious proof

Theorem (Folklore) A decidable parameterized problem P belongs to (uniform) FPT, if and only if it has a kernel.

Proof ⇐: If P has a kernel, then we have an FPT-algorithm:

• Given input (x,k),
• Apply kernelization and obtain (x’, k’).
• Now, use any algorithm to solve (x’, k’).
  – Answer is the same.
  – Running time poly(|x|) + g(f(k)).

⇒: …

A useful theorem with a curious proof (II)

Theorem (Folklore) A decidable parameterized problem P belongs to (uniform) FPT, if and only if it has a kernel.

Proof continued ⇒: If P has an algorithm A that uses f(k)·n^c time:

• Suppose we have input (x, k) with |x| = n.
• Run A for n^(c+1) steps.
• If A halts we have the answer (transform to O(1) size yes- or no-instance).
• If A does not halt, just output the original instance (x, k): we have n^(c+1) ≤ f(k)·n^c, so n ≤ f(k).
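
The "run A with a step budget" argument can be mimicked in Python with a generator that yields once per step. This is a toy model under stated assumptions: `kernel_from_fpt` and `vc_branch` are hypothetical names, one "step" is one node of a search tree (so the toy algorithm takes at most 2^(k+1) steps and c = 0), and the instance size n is the number of edges.

```python
def kernel_from_fpt(algorithm, x, k, c):
    """Run `algorithm` for n^(c+1) steps; decide, or return the instance."""
    n = len(x)
    run = algorithm(x, k)
    for _ in range(n ** (c + 1)):
        try:
            next(run)                      # advance A by one step
        except StopIteration as done:
            # A halted: the answer stands in for an O(1) size yes/no-instance.
            return ("trivial", bool(done.value))
    # A did not halt within the budget, so n is bounded by a function of k.
    return ("original", (x, k))


def vc_branch(edges, k):
    """Toy FPT algorithm for Vertex Cover: one yield per search-tree node."""
    yield
    if not edges:
        return True
    if k == 0:
        return False
    u, v = edges[0]
    for pick in (u, v):                    # one endpoint must be in the cover
        rest = [e for e in edges if pick not in e]
        if (yield from vc_branch(rest, k - 1)):
            return True
    return False
```

On a star with ten leaves and k = 1 the budget suffices and the answer is decided; on a matching with three edges and k = 3 the budget of three steps runs out and the instance itself is returned, which is fine because its size is then bounded in terms of k.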

Variants

Theorem (Folklore) A decidable parameterized problem P belongs to strongly uniform FPT, if and only if it has a kernel of size bounded by a computable function.

• Same proof.

• Problems in non-uniform FPT do not need to have a kernel.

• Practical consideration on variants: it does not matter if you use uniform or strongly uniform, as long as you don’t make mistakes…

Implications of the theorem

• Positive:
  – Technique to obtain FPT-algorithms:
    • Make small kernel.
    • Algorithm on resulting small instance.

• Negative:
  – If we have evidence that there exists no FPT-algorithm, we also have evidence that there exists no kernel.

Negative results

• Downey-Fellows introduce complexity classes of parameterized problems that are unlikely to have FPT algorithms, e.g. W[1].

• Hardness is shown with “parameterized variant of many-one reductions”.

Theorem If W[1] = FPT, then the Exponential Time Hypothesis is not valid.

Corollary A parameterized problem that is W[1]-hard has no kernel, unless the ETH does not hold.

Many W[1]-hard problems

• Many problems are W[1]-hard, e.g.: Clique, Independent Set, Dominating Set, …

• Canonical W[1]-complete problem:
  – Input: Boolean formula F in conjunctive normal form.
  – Question: Can we satisfy F by setting at most k variables to true?
  – Parameter: k.

• No kernels for these, unless W[1] = FPT and hence the Exponential Time Hypothesis fails.

Problems with large kernels

• For many problems in FPT, we do not know small kernels.

• Consider: Long Path
  – Given: Graph G=(V,E), integer k.
  – Question: Does G have a simple path of length at least k?
  – Parameter: k.

• Is in FPT, but all known kernels have size exponential in k…

Does Long Path have a kernel of polynomial size? Maybe not…

• Suppose we have a polynomial kernel, say with k^c bits size.

[Figure: the kernel maps an instance with parameter k to an instance with parameter k’ whose size is bounded by k^c.]

Long path continued

• Now, suppose we have a series of inputs to Long Path, all with the same parameter: (G_1,k), (G_2,k), …, (G_r,k).

[Figure: r instances, each with parameter k.]

Take the disjoint union

• G_1 ∪ G_2 ∪ … ∪ G_r has a simple path of length k, if and only if there exists a graph G_i that has a simple path of length k.

[Figure: the disjoint union of the r instances, with parameter k.]

And now, apply the kernel to the union

[Figure: the kernel maps the disjoint union (parameter k) to one instance with parameter k’ and size bounded by k^c.]

What happened?

• We have many (say r = k^(2c)) instances of Long Path, and transform them to one instance of size < k^c.

• Intuition: this cannot be possible without solving some of the instances, as we have fewer bits left than we had instances to start with…

• Theory (next) formalizes this idea.

(Or-)Compositionality

• A parameterized problem Q is or-compositional, if there is an algorithm that
  – Receives as input a series of inputs to Q, all with the same parameter: (I_1,k), …, (I_r,k);
  – Uses polynomial time;
  – Outputs one input (I’,k’) to Q;
  – k’ bounded by polynomial in k;
  – (I’,k’) ∈ Q if and only if there exists at least one j with (I_j,k) ∈ Q.

Or-composition

[Figure: t instances (x_1,k), (x_2,k), …, (x_t,k) of Q, each of size at most n, are composed in poly(t·n + k) time into one instance (x*, k*) of Q with k* bounded by poly(k).]

Compositionality gives lower bounds for kernels

Theorem (B, Downey, Fellows, Hermelin + Fortnow, Santhanam, 2008) Let P be a parameterized problem that is
  – Or-compositional, and
  – “Unparameterized form” is NP-complete.

Then P has no polynomial kernel unless NP ⊆ coNP/poly.

• Variant for and-compositionality is still an open problem…

IPEC 2011 - Kernelization48

• Input: t instances of Longest Path.

• Take disjoint union, output as (G’, k).

• G’ has a path of length k some Gi has a path of length k.

• Output parameter trivially bounded in poly(k).

,k,k ,k,k ,k,k ,k,k ,k,k

,k,k

Long Path does not admit a polynomial kernel unless

NP coNP/poly⊆

Long Path does not admit a polynomial kernel unless

NP coNP/poly⊆

Application to Long Path
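
The disjoint-union composition for Long Path is short enough to write down. The sketch below is an illustration only: the names `compose_long_path` and `has_path` are hypothetical, graphs are edge lists, path length is counted in edges, and `has_path` is an exponential brute-force check meant for tiny examples.

```python
def compose_long_path(instances):
    """Or-composition by disjoint union.

    instances: list of (edges, k) pairs that all share the same k.
    Vertex v of the i-th graph is renamed (i, v) so the graphs stay disjoint.
    """
    ks = {k for _, k in instances}
    assert len(ks) == 1, "all instances must have the same parameter"
    union = [((i, u), (i, v))
             for i, (edges, _) in enumerate(instances)
             for u, v in edges]
    return union, ks.pop()


def has_path(edges, k):
    """Brute force: is there a simple path with k edges? (k >= 1 assumed)"""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def extend(v, seen, remaining):
        if remaining == 0:
            return True
        return any(extend(w, seen | {w}, remaining - 1)
                   for w in adj[v] if w not in seen)

    return any(extend(v, {v}, k) for v in adj)
```

The output parameter equals the input parameter, and the union has a path with k edges exactly when one of the input graphs has one, since a path cannot cross between components.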

Additional techniques (1)

• Polynomial parameter transformations (several authors): transform an argument that problem X does not have a polynomial kernel to an argument that problem Y does not have a polynomial kernel.

• Chen et al. (2009): no kernels of size k^c · n^(1-ε) (unless NP ⊆ coNP/poly).

• Cross-compositions (B, Jansen, Kratsch, 2010): composition of instances of problem X into instances of problem Y.
  – Composition of 2^n instances suffices.

Additional techniques (2)

• Dell and van Melkebeek (2010): extend technique to precise lower bounds, e.g.: essentially Ω(k²) bits for a kernel for Vertex Cover, i.e., no kernel with O(k^(2-ε)) bits (unless NP ⊆ coNP/poly).
  – New results by Dell and Marx, 2011.

• Weak composition (Hermelin and Wu, 2011): polynomial lower bounds for several problems; super quasi polynomial lower bounds.

Nonstandard parameters

• Many results on “objective parameter”, e.g. size of solution:
  – Minimum size of vertex cover.
  – Maximum length of path.
  – …

• But one can also use other parameters:
  – E.g., treewidth of input graph.

• Rich structure!

Two problems

Treewidth parameterized by vertex cover

• Input: Graph G=(V,E), integer k, vertex cover X of G.

• Parameter: |X|.

• Question: Does G have treewidth at most k?

Vertex cover parameterized by treewidth

• Input: Graph G=(V,E), integer k, tree decomposition of G.

• Parameter: Width of tree decomposition.

• Question: Does G have a vertex cover of size at most k?

Negative result

Theorem (B et al. 2010) Vertex cover parameterized by treewidth has no polynomial kernel, unless NP ⊆ coNP/poly.

Vertex cover REFINEMENT parameterized by treewidth

• Input: Graph G=(V,E), integer k, tree decomposition of G, vertex cover X of G of size at most k+1.

• Parameter: width of tree decomposition.

• Question: Does G have a vertex cover of size at most k?

• Refinement version is:
  – Or-compositional,
  – NP-complete,
  – So has no polykernel unless NP ⊆ coNP/poly.

• And hence problem itself has no polykernel unless NP ⊆ coNP/poly.

Positive result

Theorem (B, Jansen, Kratsch, 2011) Treewidth parameterized by vertex cover has a kernel with O(|X|³) vertices.

• Algorithm combines old preprocessing ideas!

• Heuristics for treewidth preprocessing:
  – Folklore.
  – B, Koster, vd Eijkhof, 2001: Preprocessing heuristics for treewidth.
  – Taken from (B, 1993, “Linear time algorithm”): vertices with many common neighbors.

Rules for treewidth

• Rule 1: If k > |X|, decide yes.
  – G has treewidth at most (vertex cover size)+1.

• Rule 2: If non-adjacent v and w in X have at least k+2 common neighbors, add the edge {v,w}.
  – Taken from B, 1993 (linear time fpt algorithm for treewidth).

• Rule 3 (simplicial vertex rule): If the neighbors of v form a clique (v is simplicial):
  – If v has degree at most k, then remove v.
  – If v has degree more than k, then decide no.
    • G has a clique of size k+2: treewidth(G)>k.

Counting

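• If no rule applies:
  – Associate each vertex z in V-X to a non-edge {v,w} with v, w neighbors of z. Such a non-edge exists, because z is not simplicial, and v and w are in X, because X is a vertex cover.

• We can associate at most k+1 vertices to a pair in X.

• So, |V-X| = O(k·|X|²) = O(|X|³), since k ≤ |X| after Rule 1.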

• QED
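
A compact Python sketch of Rules 1 to 3 is given below. It is an illustration under assumptions, not the authors' implementation: the name `treewidth_vc_kernel` and the adjacency-dictionary representation are hypothetical, every vertex is assumed to appear as a key, and the simplicial rule is applied to all vertices (a vertex removed from the graph is also dropped from the cover).

```python
from itertools import combinations


def treewidth_vc_kernel(adj, cover, k):
    """adj: dict vertex -> set of neighbors; cover: a vertex cover of the graph.

    Returns "yes", "no", or the reduced instance (adj, cover, k).
    """
    adj = {v: set(ns) for v, ns in adj.items()}     # work on a copy
    cover = set(cover)
    if k > len(cover):                              # Rule 1
        return "yes"
    changed = True
    while changed:
        changed = False
        # Rule 2: non-adjacent cover vertices with >= k+2 common neighbors.
        for v, w in combinations(list(cover), 2):
            if w not in adj[v] and len(adj[v] & adj[w]) >= k + 2:
                adj[v].add(w)
                adj[w].add(v)
                changed = True
        # Rule 3: simplicial vertices.
        for v in list(adj):
            nbrs = adj[v]
            if all(b in adj[a] for a, b in combinations(nbrs, 2)):
                if len(nbrs) > k:
                    return "no"                     # clique of size k+2
                for u in nbrs:
                    adj[u].discard(v)
                del adj[v]
                cover.discard(v)
                changed = True
    return adj, cover, k
```

If the returned graph is empty, every vertex was eliminated as a simplicial vertex of degree at most k, so the original answer is yes.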

Remark

• We saw a cubic vertex kernel for Treewidth parameterized by vertex cover.

• Polynomial kernel for parameterization of treewidth by feedback vertex set.
  – More and more complicated rules.
  – Some rules generalize the almost simplicial vertex rule from B, Koster, vd Eijkhof 2001.

• Interaction between preprocessing heuristics and kernelization.

Disjoint cycles

• Disjoint cycles
  – Given: Graph G=(V,E), integer k.
  – Question: Does G contain k vertex disjoint cycles?
  – Parameter: k.

• NP-complete, FPT, but does it have a polynomial kernel?

• Resembles Feedback Vertex Set, but behaves differently!
  – Feedback vertex set
    • Given: Graph G, integer k.
    • Question: Is there a set of k vertices W such that G-W has no cycle?
    • Parameter: k.
  – FVS has O(k²) kernel (Thomassé)

PPT-transformation

• A polynomial-parameter-time transformation (ppt-transformation) from P to Q is an algorithm
  – which takes an instance (x,k) of P as input,
  – uses time polynomial in |x| + k,
  – outputs an instance (x’, k’) of Q with
    • (x,k) ∈ P ⇔ (x’, k’) ∈ Q,
    • k’ is polynomial in k.

Theorem: If P has a ppt-transformation to Q, P is NP-complete, Q is in NP, and P has no polynomial kernel, then Q has no polynomial kernel.

Proof

Theorem: If P has a ppt-transformation to Q, P is NP-complete, Q is in NP, and P has no polynomial kernel, then Q has no polynomial kernel.

Proof: Suppose Q has a polynomial kernel. Build a polynomial kernel for P as follows:
  – Take input (x,k) for P.
  – Transform (x,k) to input (y,l) for Q with ppt-transformation.
  – Use kernel on (y,l): gives equivalent (y’,l’) for Q with polynomial size bound on |y’|.
  – NP-completeness gives transformation from Q to P: applying it to (y’,l’) gives equivalent (x’,k’) with |x’| polynomially bounded in |y’|+l’, which is polynomially bounded in l and hence in k.

Intermediate problem: Disjoint Factors

• Disjoint Factors
  – Given: Integer k, string s on alphabet {1, 2, … , k}.
  – Question: Can we find disjoint substrings s_1, s_2, … , s_k in s such that s_i starts and ends with i?
  – Parameter: k

• Disjoint Factors is NP-complete.

• Solvable with Dynamic Programming in 2^k·|s| time.

• Next: compositionality.

14324141324142312412
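
One way to reach the dynamic programming bound mentioned above is to track, for each position, which letters already have a factor. The sketch below is a possible formulation, not necessarily the one intended in the talk; the name `disjoint_factors` is hypothetical, and it assumes that a factor has length at least 2. It relies on the observation that a factor starting at position i can always be shortened to end at the next occurrence of the same letter.

```python
def disjoint_factors(s, k):
    """s: sequence of letters from 1..k. True iff every letter gets a factor.

    Assumption: factors have length at least 2 and are pairwise disjoint.
    """
    n = len(s)
    nxt = [None] * n            # nxt[i]: next position with the same letter
    last = {}
    for i in range(n - 1, -1, -1):
        nxt[i] = last.get(s[i])
        last[s[i]] = i
    full = (1 << k) - 1
    # reach[i]: bitmasks of letters already served when standing at position i
    reach = [set() for _ in range(n + 1)]
    reach[0].add(0)
    for i in range(n):
        bit = 1 << (s[i] - 1)
        for done in reach[i]:
            reach[i + 1].add(done)                    # skip position i
            if nxt[i] is not None and not done & bit:
                reach[nxt[i] + 1].add(done | bit)     # take the shortest factor
    return full in reach[n]
```

There are at most (|s|+1)·2^k states with two transitions each. On the example string 14324141324142312412 with k = 4 the answer is yes, e.g. with the factors 4324, 141, 3241423 and 2412.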

Disjoint Factors is compositional: proof by example

• Number of instances r can be bounded by 2^k, otherwise we can solve them all in polynomial time.

• Take log r new characters, and build new string, like (example for r=4):
  – b a s_1 a s_2 a b a s_3 a s_4 a b
  – New characters “eat” all but one instance, in which we must then find the other factors.

[Figure: the string b a s_1 a s_2 a b a s_3 a s_4 a b with the factors of the new characters marked.]

Corollary: Disjoint Factors has no polynomial kernel unless NP ⊆ coNP/poly.

PPT-transformation from Disjoint Factors to Disjoint Cycles

[Figure: the string 14324141324142312412 together with vertices 1, 2, 3, 4.]

Disjoint Cycles does not admit a polynomial kernel unless NP ⊆ coNP/poly.

Overview of problem behavior

• O(1) size kernels: problems in P. Ex.: Eulerian Graph
  – NP-completeness (variable parameter)

• Polynomial kernels. Shown with algorithm. Ex.: Vertex Cover
  – compositionality, ppt-transformations, cross-composition

• Kernels, but not polynomial sized. Shown (usually) with FPT-algorithm. Ex.: Long Path
  – W[1]-hardness

• XP: No kernel, polynomial if parameter is bounded. Ex.: Independent Set
  – NP-completeness (fixed parameter)

• Bad. Example: Graph Coloring is NP-complete for 3 colors

Conclusions

• Kernelization:
  – Gives preprocessing algorithms with a provable guarantee on the size of resulting instances.
  – Generates new rules for preprocessing.
  – Reveals new complexity structure in combinatorial problems.

• Lively research area:
  – Many kernelization algorithms, including meta-results and kernel races.
  – New lower bound techniques.
  – Many nice open problems …