Academic Year 2014/2015

Bibliography

• A description of complexity theory, and more generally of the class of NP-complete problems, can be found in:
   • M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, W. H. Freeman, 1st ed. (1979)
   • Cormen, Leiserson, Rivest and Stein, Introduzione agli Algoritmi e Strutture Dati, McGraw-Hill (Chapter 34)
   • A compendium of NP optimization problems, available at: http://www.csc.kth.se/~viggo/problemlist/

NP-Completeness

• Polynomial-time algorithms: on inputs of size n, the running time is O(n^k) for some constant k
• Can all problems be solved in polynomial time? No
• Turing's halting problem: given a description of an arbitrary computer program and a fixed (finite) input, decide whether the program finishes running or continues to run forever. No algorithm can solve it, in any amount of time
• Problems can be tractable or intractable
• If you can establish that a problem is NP-complete, you provide good evidence for its intractability
• You can then spend your time developing an approximation algorithm rather than searching for a fast exact algorithm

What is a Problem?

• We define a problem Q to be a binary relation on a set I of problem instances and a set S of problem solutions: Q ⊆ I × S
• Shortest-path problem: find a shortest path between two given vertices in an unweighted, undirected graph G = (V, E)
   • instance: a triple consisting of a graph and two vertices
   • solution: a sequence of vertices in the graph (a path)
• In general, a given problem instance may have more than one solution

What is a Decision Problem?

• The theory of NP-completeness restricts attention to decision problems: those having a yes/no solution
• Decision problem GCP (graph coloring problem): "Is a given graph G = (V, E) k-colorable?"
• Decision problems: Q: I → {0, 1}
   • Example: if i = <G, k> is an instance of the graph coloring problem, then COL(i) = 1 (yes) if there exists a k-coloring of G, and COL(i) = 0 (no) otherwise
• Optimization problems: those where some value must be minimized or maximized

Mario Pavone, AA. 2014/2015 – Computazione Naturale – CdL Magistrale in Informatica –DMI – UniCt – [email protected]

Review: P and NP

• Summary so far:
   • P = problems that can be solved in polynomial time
   • NP = problems for which a solution can be verified in polynomial time
   • Unknown whether P = NP (most suspect not)
• The Hamiltonian-cycle problem is in NP:
   • No polynomial-time algorithm for solving it is known
   • A proposed solution is easy to verify in polynomial time


Hamiltonian Cycle Problem

• A Hamiltonian cycle of an undirected graph G = (V, E) is a simple cycle that contains each vertex in V
   • "Does a graph G have a Hamiltonian cycle?"
• Brute-force algorithm: list all permutations of the vertices and check each permutation to see whether it is a Hamiltonian cycle
   • there are n! possible permutations of the n = |V| vertices
• Verification: given a candidate cycle, CHECK whether it is Hamiltonian
   • i.e., whether it is a permutation of the vertices of V and each pair of consecutive vertices along the cycle is joined by an edge of the graph
• This verification can be implemented to run in O(n^2) time
   • So whether a given cycle is a Hamiltonian cycle of the graph can be verified in polynomial time
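The verification step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: vertices are numbered 0..n-1, the graph is given as an edge list, n is at least 3, and the function name `is_hamiltonian_cycle` is invented for this example.

```python
def is_hamiltonian_cycle(n, edges, cycle):
    """Check whether `cycle` (a list of vertices) is a Hamiltonian cycle of
    the undirected graph with vertices 0..n-1 and edge list `edges`.
    Assumes n >= 3."""
    # The certificate must be a permutation of the vertex set.
    if sorted(cycle) != list(range(n)):
        return False
    edge_set = {frozenset(e) for e in edges}
    # Every consecutive pair, including last -> first, must be an edge.
    return all(
        frozenset((cycle[i], cycle[(i + 1) % n])) in edge_set
        for i in range(n)
    )


square = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(is_hamiltonian_cycle(4, square, [0, 1, 2, 3]))  # True
print(is_hamiltonian_cycle(4, square, [0, 2, 1, 3]))  # False: (0, 2) is not an edge
```

The check does a sort plus n set lookups, which is comfortably within the O(n^2) bound quoted on the slide. Finding a cycle is the hard part; checking one is cheap.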


Hamiltonian Cycle Problem

• Hamiltonian Cycle (input: a graph G)
   • Does G have a Hamiltonian cycle?

Solution:

0, 1, 2, 11, 10, 9, 8, 7, 6, 5, 14, 15, 16, 17, 18, 19, 12, 13, 3, 4

Such a solution can be verified in polynomial time!


Reduction

• The crux of NP-completeness is reducibility
   • Informally, a problem P can be reduced to another problem Q if any instance of P can be "easily rephrased" as an instance of Q, the solution to which provides a solution to the instance of P
   • This rephrasing is called a transformation
• Intuitively: if P reduces to Q, then P is "no harder to solve" than Q
   • If P is polynomial-time reducible to Q, we denote this P ≤p Q


Reducibility

• An example:
   • P: Given a set of Booleans, is at least one TRUE?
   • Q: Given a set of integers, is their sum positive?
   • Transformation: (x1, x2, …, xn) → (y1, y2, …, yn), where yi = 1 if xi = TRUE and yi = 0 if xi = FALSE
• Another example:
   • Solving linear equations is reducible to solving quadratic equations
   • How can we easily use a quadratic-equation solver to solve linear equations?
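The first example on this slide (Booleans to integers) can be written out as a tiny Python sketch. The function names are made up for illustration; the point is only that the solver for P does no work of its own beyond transforming the instance and calling the solver for Q.

```python
def sum_is_positive(ints):
    """Solver for Q: is the sum of the integers positive?"""
    return sum(ints) > 0


def transform(bools):
    """Map an instance of P to an instance of Q: TRUE -> 1, FALSE -> 0."""
    return [1 if x else 0 for x in bools]


def at_least_one_true(bools):
    """Solve P by transforming the instance and asking the solver for Q."""
    return sum_is_positive(transform(bools))


print(at_least_one_true([False, True, False]))  # True
print(at_least_one_true([False, False]))        # False
```

The transformation runs in time linear in the number of Booleans, so this is a polynomial-time reduction in the sense of P ≤p Q.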

NP-Completeness

• If P ≤p Q, then P is not more than a polynomial factor harder than Q
• A problem P is NP-Complete if
   • P ∈ NP, and
   • P' ≤p P for every P' ∈ NP
• If all problems Q ∈ NP are reducible to P, then we say that P is NP-Hard
• Equivalently, a problem P is NP-Complete if it is in the class NP and is NP-Hard
• If P ≤p Q, P is NP-Complete, and Q ∈ NP, then Q is also NP-Complete (without Q ∈ NP we can only conclude that Q is NP-Hard)

Examples of NP-complete problems

• Given one NP-Complete problem, we can prove many interesting problems NP-Complete:
   • Graph coloring (used, e.g., to model register allocation)
   • Hamiltonian cycle
   • Hamiltonian path
   • Knapsack problem
   • Traveling salesman
   • Job scheduling with penalties
   • Many, many more

Why Prove NP-Completeness?

• Though nobody has proven that P ≠ NP, if you prove a problem NP-Complete, most people accept that it is probably intractable
• Therefore it can be important to prove that a problem is NP-Complete
   • You no longer need to look for an efficient exact algorithm
   • You can instead work on approximation algorithms

How Do I Prove NP-Completeness?

• What steps do we have to take to prove that a problem P is NP-Complete?
   • Pick a known NP-Complete problem Q
   • Reduce Q to P
      • Describe a transformation that maps instances of Q to instances of P, such that the answer for the transformed instance of P is "yes" exactly when the answer for the original instance of Q is "yes"
      • Prove the transformation works
      • Prove it runs in polynomial time
   • Oh yeah, prove P ∈ NP

SAT (Satisfiability)

• Variables: u1, u2, u3, ..., uk
• A literal is a variable ui or the negation of a variable ¬ui
• If u is set to true then ¬u is false, and if u is set to false then ¬u is true
• A clause is a set of literals. A clause is true if at least one of the literals in the clause is true
• The input to SAT is a collection of clauses
• Stephen Cook proved in 1971 that SAT is NP-complete

SAT (Satisfiability)

• The output is the answer to: "Is there an assignment of true/false to the variables so that every clause is satisfied (satisfied means the clause is true)?"
• If the answer is yes, such an assignment of the variables is called a satisfying truth assignment
• SAT is in NP: the certificate is a true/false value for each variable in a satisfying assignment
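To see why the certificate check is polynomial, here is a minimal Python sketch. The encoding is an assumption made for this example, not something the slides prescribe: the integer i stands for the literal ui and -i for ¬ui, and an assignment is a dictionary from variable index to a Boolean.

```python
def satisfies(clauses, assignment):
    """Return True if `assignment` makes every clause true.
    A clause is a list of nonzero ints: i means u_i, -i means NOT u_i."""
    def literal_value(lit):
        value = assignment[abs(lit)]
        return value if lit > 0 else not value

    # Every clause needs at least one true literal.
    return all(any(literal_value(lit) for lit in clause) for clause in clauses)


# (u1 OR NOT u2) AND (NOT u1 OR NOT u2)
clauses = [[1, -2], [-1, -2]]
print(satisfies(clauses, {1: True, 2: False}))  # True
print(satisfies(clauses, {1: True, 2: True}))   # False
```

The check touches each literal of each clause once, so its running time is linear in the size of the input.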

Hamiltonian Cycle ⇒ TSP

• The well-known traveling salesman problem:
   • Optimization variant: a salesman must travel to n cities, visiting each city exactly once and finishing where he begins. How can he minimize the travel time?
   • Model as a complete graph with cost c(i, j) to go from city i to city j
• How would we turn this into a decision problem?
   • A: ask whether there exists a tour with cost < k

Hamiltonian Cycle ⇒ TSP

• The steps to prove TSP is NP-Complete:
   • Prove that TSP ∈ NP (argue this)
   • Reduce the undirected Hamiltonian cycle problem to the TSP
      • So if we had a polynomial-time TSP solver, we could use it to solve the Hamiltonian cycle problem in polynomial time
      • How can we transform an instance of the Hamiltonian cycle problem to an instance of the TSP?
      • Can we do this in polynomial time?
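One standard way to answer these two questions is sketched below in Python. It is an illustration, not necessarily the construction used in the lecture: give cost 1 to every pair of cities joined by an edge of G and cost 2 to every other pair, then ask whether a tour of cost at most n exists (equivalently, of cost strictly less than n + 1). The brute-force `tsp_decision` is only a stand-in for a hypothetical TSP solver, and all names are invented for this example.

```python
from itertools import permutations


def hc_to_tsp(n, edges):
    """Transform a Hamiltonian-cycle instance (vertices 0..n-1, n >= 3) into
    a TSP decision instance: a cost matrix and a bound k = n."""
    edge_set = {frozenset(e) for e in edges}
    cost = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                cost[i][j] = 1 if frozenset((i, j)) in edge_set else 2
    return cost, n


def tsp_decision(cost, k):
    """Exponential-time stand-in for a TSP solver: is there a tour of cost <= k?"""
    n = len(cost)
    for rest in permutations(range(1, n)):
        tour = (0,) + rest
        total = sum(cost[tour[i]][tour[(i + 1) % n]] for i in range(n))
        if total <= k:
            return True
    return False


def has_hamiltonian_cycle(n, edges):
    cost, k = hc_to_tsp(n, edges)
    return tsp_decision(cost, k)


square = [(0, 1), (1, 2), (2, 3), (3, 0)]
star = [(0, 1), (0, 2), (0, 3)]
print(has_hamiltonian_cycle(4, square))  # True
print(has_hamiltonian_cycle(4, star))    # False
```

A tour has exactly n legs, so its cost is n only when every leg is an edge of G, that is, when the tour is a Hamiltonian cycle. Building the cost matrix takes O(n^2) time, so the transformation itself is polynomial.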

The TSP

• Random asides:
   • TSPs (and variants) have enormous practical importance
      • E.g., for shipping and freighting companies
      • Lots of research into good approximation algorithms
   • Recently made famous as a DNA computing problem

Combinatorial Landscapes

• The notion of a landscape is one of the few existing concepts that help us understand the behaviour of search algorithms and heuristics and characterize the difficulty of a combinatorial problem.

Search Space

• Given a combinatorial problem P, a search space associated with a mathematical formulation of P is defined by a pair (S, f)
   • where S is a finite set of configurations (or nodes or points) and
   • f is a cost function that associates a real number with each configuration of S
• The two most common quantities of interest for this structure are the minimum and the maximum cost; looking for a configuration that attains one of them gives a combinatorial optimization problem
• Combinatorial optimization problems are often hard to solve, since such problems may have huge and complex search landscapes

Example: K-SAT

• An instance of the K-SAT problem consists of a set V of variables and a collection C of clauses over V such that each clause c ∈ C has |c| = K
• The problem is to find a satisfying truth assignment for C
• The search space for 2-SAT with |V| = 2 is (S, f), where
   • S = { (T,T), (T,F), (F,T), (F,F) } and
   • the cost function for 2-SAT simply counts the number of satisfied clauses of the formula F:

f_sat(s) = #SatisfiedClauses(F, s), s ∈ S

SEARCH SPACE of K=2-SAT

Consider F = (A ∨ ¬B) ∧ (¬A ∨ ¬B)

A   B   f_sat(F, s)
T   T   1
T   F   2
F   T   1
F   F   2
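The table can be reproduced by enumerating the whole search space. The following Python sketch assumes a simple encoding chosen for this example: each literal is a pair (variable index, is_positive), with A as variable 0 and B as variable 1.

```python
from itertools import product

# F = (A OR NOT B) AND (NOT A OR NOT B)
F = [[(0, True), (1, False)], [(0, False), (1, False)]]


def f_sat(formula, s):
    """Number of clauses of `formula` satisfied by configuration `s`.
    A literal (var, positive) is true when s[var] == positive."""
    return sum(
        any(s[var] == positive for var, positive in clause)
        for clause in formula
    )


# S = all truth assignments to (A, B), in the same order as the table.
S = list(product([True, False], repeat=2))
for s in S:
    print(s, f_sat(F, s))
```

The two configurations with cost 2 (B set to false) satisfy both clauses, so they are the global maxima of this small landscape.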

Example and Relevance of Landscape

• The search landscape for the K-SAT problem is an N-dimensional hypercube, with N = number of variables = |V|
• Combinatorial optimization problems are often hard to solve, since such problems may have huge and complex search landscapes

K-SAT: HYPERCUBES

Search Landscape

A one-dimensional landscape:

[Figure: "Linear Pricing for varying C and N. w = 10, k ~ U[0.0, 0.7]". Horizontal axis: Price. Vertical axis: Profit per consumer per article. Curves: N=10, C=10; N=10, C=100; N=100, C=10; N=100, C=100.]

Department of Computer Science, University of San Francisco

Search Landscape

A two-dimensional landscape:

[Figure: surface plot of a two-dimensional landscape]

(beyond 2 dimensions, they're tough to draw)


Search landscapes

• Landscapes turn out to be a very useful metaphor for local search algorithms.
• Lets us visualize 'climbing' up a hill (or descending a valley).
• Gives us a way of differentiating easy problems from hard problems.
   • Easy: few peaks, smooth surfaces, no ridges/plateaus
   • Hard: many peaks, jagged or discontinuous surfaces, plateaus.


Hill-climbing search

• The simplest form of local search is hill-climbing search.
• Very simple: at any point, look at your "successors" (neighbors) and move in the direction of the greatest positive change.
• Very similar to greedy search
   • Pick the choice that myopically looks best.
• Very little memory required.
• Will get stuck in local optima.
• Plateaus can cause the algorithm to wander aimlessly.
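A minimal Python sketch of steepest-ascent hill climbing follows. The one-dimensional `heights` landscape and the helper names are invented for this example; the landscape was chosen so that one starting point gets stuck on a local peak.

```python
def hill_climb(start, fitness, neighbors):
    """Steepest-ascent hill climbing: move to the best neighbor until no
    neighbor is strictly better than the current point."""
    current = start
    while True:
        best = max(neighbors(current), key=fitness)
        if fitness(best) <= fitness(current):
            return current  # local optimum (or plateau)
        current = best


# A toy landscape: position i has height heights[i].
heights = [1, 3, 2, 5, 9, 4]


def fitness(i):
    return heights[i]


def neighbors(i):
    # Left and right positions, when they exist.
    return [j for j in (i - 1, i + 1) if 0 <= j < len(heights)]


print(hill_climb(0, fitness, neighbors))  # 1: stuck on the local peak of height 3
print(hill_climb(2, fitness, neighbors))  # 4: reaches the global peak of height 9
```

Only the current point is stored, which is the "very little memory" property. The first run shows the local-optimum problem: from position 0 the search never sees the higher peak at position 4.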


Local search in Calculus

• Find roots of an equation f(x) = 0, f differentiable.
• Guess an x1, find f(x1), f′(x1)
• Use the tangent line to f at x1 (slope = f′(x1)) to pick x2.
• Repeat: x_{n+1} = x_n − f(x_n) / f′(x_n)
• This is a hill-climbing search.
• Works great on smooth functions.
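The iteration described here is Newton's method, where each new guess is the current one minus f divided by f′, both evaluated at the current guess. A short Python sketch follows; the tolerance, the iteration cap and the example function x^2 − 2 are choices made for this illustration.

```python
def newton(f, df, x, tol=1e-12, max_iter=50):
    """Newton's method: repeatedly follow the tangent line to its root.
    Assumes df(x) is never zero along the way."""
    for _ in range(max_iter):
        step = f(x) / df(x)
        x -= step
        if abs(step) < tol:
            break
    return x


# Root of f(x) = x^2 - 2, starting from the guess x1 = 1.
root = newton(lambda x: x * x - 2, lambda x: 2 * x, x=1.0)
print(root)  # close to the square root of 2
```

Like hill climbing, the method only uses local information (the value and slope at the current point), which is why it can fail on functions that are not smooth or when the starting guess is poor.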


Improving hill-climbing

• Hill-climbing can be appealing
   • Simple to code
   • Requires little memory
   • We may not have a better approach.
• How to make it better?
• Stochastic hill-climbing: pick randomly from uphill moves
   • Weight probability by degree of slope
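Stochastic hill-climbing can be sketched as below. This is one possible reading of "weight probability by degree of slope", in which each uphill neighbor is chosen with probability proportional to its fitness gain; the bit-string example, the step limit and the helper names are assumptions made for illustration.

```python
import random


def stochastic_hill_climb(start, fitness, neighbors, rng, max_steps=1000):
    """Pick a random uphill neighbor at each step, with probability
    proportional to the fitness improvement it offers."""
    current = start
    for _ in range(max_steps):
        base = fitness(current)
        uphill = [n for n in neighbors(current) if fitness(n) > base]
        if not uphill:
            return current  # no improving move: local optimum
        gains = [fitness(n) - base for n in uphill]
        current = rng.choices(uphill, weights=gains, k=1)[0]
    return current


def flip_neighbors(bits):
    """All bit tuples at Hamming distance 1 from `bits`."""
    return [bits[:i] + (1 - bits[i],) + bits[i + 1:] for i in range(len(bits))]


def one_max(bits):
    """Toy fitness: number of ones."""
    return sum(bits)


rng = random.Random(42)
result = stochastic_hill_climb((0,) * 8, one_max, flip_neighbors, rng)
print(result)  # (1, 1, 1, 1, 1, 1, 1, 1)
```

On this smooth toy landscape every uphill move gains exactly 1, so the random choice only changes the order in which bits are flipped. On rugged landscapes, different runs can end on different peaks, which is what makes restarts worthwhile.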


Complexity of the Landscape PSP


Energy level   No. of conformations
 0             36,098,079
-1             31,656,934
-2             12,473,446
-3              2,934,974
-4                517,984
-5                 77,080
-6                 10,364
-7                  1,194
-8                     96
-9                      4

Total          83,779,155

Fitness Landscape

Components of a fitness landscape:
• Set S of admissible solutions
• Fitness function that assigns a real value to each solution
• Distance measure that defines the distance between any two solutions in S

Example distance measures:
• Hamming distance for binary strings (number of mismatched bits)
• Euclidean distance for continuous vectors

Easy Problems and Hard Problems:
• Easy: few peaks; smooth surfaces, no ridges/plateaus
• Hard: many peaks; jagged or discontinuous surfaces, plateaus
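The two example distance measures are easy to state in code. The Python sketch below uses function names chosen for this illustration and assumes both arguments have the same length.

```python
import math


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def euclidean(u, v):
    """Straight-line distance between two equal-length real vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


print(hamming("10110", "11100"))          # 2
print(euclidean((0.0, 0.0), (3.0, 4.0)))  # 5.0
```

The choice of distance fixes the neighborhood structure: with Hamming distance, the neighbors of a bit string are the strings that differ from it in exactly one bit.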

Fitness Distance Correlation (FDC): Intuition

Loose definition
• FDC measures correlation between fitness and distance to global optimum.

Simple functions
• Being closer to a global optimum leads to higher fitness.
• Climbing directly to the optimum is rather easy.

Difficult functions
• Being closer to a global optimum may lead to lower fitness.
• Climbing directly to the optimum is not that easy.
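As a rough illustration, FDC can be computed as the Pearson correlation between fitness and Hamming distance to a known global optimum. The Python sketch below enumerates all 6-bit strings; the OneMax and trap functions are standard toy examples picked for this illustration, not taken from the slides. For a maximization problem, an easy function gives a correlation near -1 (fitness rises as distance falls), while a deceptive one gives a positive value.

```python
from itertools import product
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def fdc(points, fitness, optimum):
    """Correlation between fitness and Hamming distance to `optimum`."""
    fits = [fitness(p) for p in points]
    dists = [sum(a != b for a, b in zip(p, optimum)) for p in points]
    return pearson(fits, dists)


def trap(bits):
    """Deceptive toy function: the all-ones string is the optimum, but
    everywhere else fitness grows as ones are removed."""
    n, u = len(bits), sum(bits)
    return n if u == n else n - 1 - u


points = list(product([0, 1], repeat=6))
optimum = (1,) * 6
print(fdc(points, sum, optimum))   # close to -1: OneMax is easy
print(fdc(points, trap, optimum))  # positive: the trap misleads the climber
```

In practice the full space cannot be enumerated and the global optimum may be unknown, so FDC is usually estimated from a sample of solutions and the best solution found.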

Martin Pelikan, NK Landscapes, Problem Difficulty, and Hybrid EAs

Correlation Length: Intuition

Loose definition
• Correlation length measures ruggedness of the landscape.

Simple functions
• Solutions close to each other have similar fitness values.
• Fitness differences increase slowly with distance.

Difficult functions
• Solutions close to each other have different fitness values.
• Fitness differences increase fast with distance.
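One common way to estimate ruggedness is to record fitness along a random walk and measure the lag-1 autocorrelation ρ(1) of that series; the correlation length is then often taken as −1 / ln|ρ(1)|. The Python sketch below follows that recipe under assumptions made for this illustration: one-bit-flip moves on 20-bit strings, 5000 steps, OneMax as the smooth landscape, and independent random fitness values as the rugged one.

```python
import math
import random


def autocorrelation(series, lag=1):
    """Sample autocorrelation of `series` at the given lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series) / n
    cov = sum((series[i] - mean) * (series[i + lag] - mean)
              for i in range(n - lag)) / (n - lag)
    return cov / var


def random_walk_fitness(fitness, n_bits, steps, rng):
    """Fitness values seen along a random walk that flips one bit per step."""
    x = [rng.randint(0, 1) for _ in range(n_bits)]
    series = []
    for _ in range(steps):
        series.append(fitness(tuple(x)))
        i = rng.randrange(n_bits)
        x[i] = 1 - x[i]
    return series


rng = random.Random(0)
table = {}


def rugged(bits):
    """Uncorrelated landscape: an independent random value per configuration."""
    if bits not in table:
        table[bits] = rng.random()
    return table[bits]


rho_smooth = autocorrelation(random_walk_fitness(sum, 20, 5000, rng))
rho_rugged = autocorrelation(random_walk_fitness(rugged, 20, 5000, rng))
print(rho_smooth, rho_rugged)
print("correlation length (smooth):", -1 / math.log(abs(rho_smooth)))
```

Neighboring solutions on the smooth landscape have nearly the same fitness, so ρ(1) is high and the correlation length is long. On the random landscape ρ(1) is close to zero, meaning fitness is already decorrelated after a single step.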
