1.1
Chapter 1: Introduction
What is the course all about?
Problems, instances and algorithms
Running time vs. computational complexity
General description of the theory of NP-completeness
Problem samples
1.2
What is this Course About?
Particular Topics: NP-completeness
NP-hardness
Heuristics
PSPACE-completeness
The polynomial hierarchy, etc.
Generally: Computational complexity
Intractability
“The inherent computational complexity of problems”
1.3
Problems, Instances and Algorithms
A problem is a general question to be answered that consists of:
Some number of parameters (a generic instance)
A statement of what properties a solution possesses (page 4)
A problem instance is a collection of specific values for all of a problem's parameters.
An algorithm is a general, step-by-step procedure for solving a specific problem, e.g., a computer program.
An algorithm is said to solve a problem if that algorithm can be applied to any instance of the problem and is guaranteed to always produce a solution for that instance.
1.4
The Traveling Salesman Optimization Problem
TRAVELING SALESMAN OPTIMIZATION
INSTANCE: Set C of m cities, distance d(ci, cj) ∈ Z+ for each pair of cities ci, cj ∈ C.
GOAL: Find a tour of C (i.e., a permutation <c_π(1), c_π(2), …, c_π(m)> of C) having minimum total length.
A specific instance of the TSP optimization problem:
C = {c1, c2, c3, c4}
D(c1,c2) = 10
D(c1,c3) = 5
D(c1,c4) = 9
D(c2,c3) = 6
D(c2,c4) = 9
D(c3,c4) = 3
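As a rough illustration of what "solving" this instance means, the sketch below finds a minimum-length tour by exhaustive search. The names DIST, d, tour_length and best_tour are made up for this example, and the code assumes distances are symmetric and that a tour returns to its starting city.

```python
from itertools import permutations

# Distances from the instance above, stored on unordered pairs (assumed symmetric).
DIST = {
    frozenset({"c1", "c2"}): 10,
    frozenset({"c1", "c3"}): 5,
    frozenset({"c1", "c4"}): 9,
    frozenset({"c2", "c3"}): 6,
    frozenset({"c2", "c4"}): 9,
    frozenset({"c3", "c4"}): 3,
}


def d(a, b):
    """Distance between two distinct cities."""
    return DIST[frozenset({a, b})]


def tour_length(tour):
    """Total length of the closed tour, including the edge back to the start."""
    m = len(tour)
    return sum(d(tour[i], tour[(i + 1) % m]) for i in range(m))


def best_tour(cities):
    """Exhaustive search: fix the first city and try every ordering of the rest."""
    first, rest = cities[0], cities[1:]
    candidates = ((first,) + p for p in permutations(rest))
    return min(candidates, key=tour_length)


cities = ("c1", "c2", "c3", "c4")
tour = best_tour(cities)
print(tour, tour_length(tour))  # a shortest tour has length 27
```

The search examines (m-1)! orderings, so it is only workable for very small m.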
1.5
TSP Optimization Instance
One TSP instance:
C = {c1, c2, c3, c4}
D(c1,c2) = 10
D(c1,c3) = 5
D(c1,c4) = 9
D(c2,c3) = 6
D(c2,c4) = 9
D(c3,c4) = 3
1.6
Problems, Instances and Algorithms
Note that our definition of a problem is somewhat general, and contains many useless problems:
INSTANCE: Positive integer B.
QUESTION: Compute the largest prime number less than 1000.
The question to be asked is usually in terms of the instance parameters.
Let Π denote a problem. Then the parameters for Π define a multi-dimensional “data space” (or collection) of instances referred to as D_Π.
Each point in this space represents one specific instance.
1.7
Optimization vs. Decision Problems
Many problems of interest are optimization problems (minimization, maximization).
Although not as interesting on the surface, the theory will focus on decision problems, which are problems that have yes or no answers.
A decision problem consists of two parts:
A list of parameters (i.e., a generic instance); this defines a set D of instances.
A yes/no question asked in terms of the parameters; this specifies a subset of yes instances Y, which is a subset of D.
Why decision problems?
No loss of generality; results extend to optimization problems.
As a matter of convenience; it is easier to transform/reduce decision problems than optimization problems.
Simple formal counterpart: a formal language.
Unreasonably large output does not affect running time or complexity.
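One way to see why restricting attention to decision problems loses little: given any procedure that answers "is there a solution of cost B or less?", the optimal cost can be recovered by binary search on B. The sketch below is illustrative only. The name minimize_with_decision and the toy procedure toy_decide are hypothetical, and the search assumes the answer is monotone in B (a "yes" for B implies a "yes" for every larger B).

```python
def minimize_with_decision(decide, lo, hi):
    """Return the smallest B in [lo, hi] for which decide(B) is True.

    Assumes decide is monotone in B and that decide(hi) is True.
    """
    while lo < hi:
        mid = (lo + hi) // 2
        if decide(mid):
            hi = mid        # a solution of cost <= mid exists; look lower
        else:
            lo = mid + 1    # no solution of cost <= mid; look higher
    return lo


# Toy stand-in for a decision procedure on some instance whose
# (hypothetical) optimal cost is 27.
calls = []


def toy_decide(B):
    calls.append(B)
    return B >= 27


optimum = minimize_with_decision(toy_decide, 1, 100)
print(optimum, len(calls))
```

The number of calls to the decision procedure grows only with the logarithm of the range of B, so a polynomial time decision procedure would give a polynomial time way to compute the optimal value.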
1.8
The Traveling Salesman Decision Problem (TSP)
TRAVELING SALESMAN
INSTANCE: Set C of m cities, distance d(ci, cj) ∈ Z+ for each pair of cities ci, cj ∈ C, and a positive integer B.
QUESTION: Is there a tour of C having length B or less, i.e., a permutation <c_π(1), c_π(2), …, c_π(m)> of C such that:
$$\sum_{i=1}^{m-1} d(c_{\pi(i)}, c_{\pi(i+1)}) + d(c_{\pi(m)}, c_{\pi(1)}) \le B \; ?$$
*See the book's appendix for a list of over 300 well-known/studied problems.
1.9
TSP Instance
One TSP instance:
C = {c1, c2, c3, c4}
D(c1,c2) = 10
D(c1,c3) = 5
D(c1,c4) = 9
D(c2,c3) = 6
D(c2,c4) = 9
D(c3,c4) = 3
B = 27
The instance description, such as on the previous page, is sometimes referred to as a generic instance.
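For the decision version, the answer is just yes or no. A minimal sketch, assuming symmetric distances and using hypothetical names (D, d, tsp_decision), checks the instance with B = 27 by trying tours until one fits within the bound:

```python
from itertools import permutations

# Distances of the instance, listed once per unordered pair (assumed symmetric).
D = {
    ("c1", "c2"): 10,
    ("c1", "c3"): 5,
    ("c1", "c4"): 9,
    ("c2", "c3"): 6,
    ("c2", "c4"): 9,
    ("c3", "c4"): 3,
}


def d(a, b):
    """Look up a distance regardless of the order of the two cities."""
    return D[(a, b)] if (a, b) in D else D[(b, a)]


def tsp_decision(cities, B):
    """Return True if some tour of all cities has total length B or less."""
    first, rest = cities[0], cities[1:]
    for p in permutations(rest):
        tour = (first,) + p
        # Sum of consecutive legs plus the closing leg back to the first city.
        length = sum(d(tour[i], tour[i + 1]) for i in range(len(tour) - 1))
        length += d(tour[-1], tour[0])
        if length <= B:
            return True   # stop at the first tour that meets the bound
    return False


cities = ("c1", "c2", "c3", "c4")
print(tsp_decision(cities, 27))  # yes instance
print(tsp_decision(cities, 26))  # no instance
```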
1.10
Running Time vs. Complexity
We will distinguish between the running time of a specific algorithm vs. the computational complexity of a particular problem.
Example: Matrix Multiplication
INSTANCE: Two n x n matrices A and B
SOLUTION: One n x n matrix C = A x B
Running times of specific algorithms:
Simple row/column algorithm - O(n^3)
Strassen’s algorithm - O(n^2.81)
Somebody else’s algorithm - O(n^2.43)
Statement on the inherent computational complexity of matrix multiplication:
Any algorithm for matrix multiplication requires Ω(n^2) operations in the worst case, i.e., O(n^2) is the best any algorithm could possibly do (this is an information theoretic argument).
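For reference, the simple row/column algorithm can be sketched as three nested loops. The function name mat_mul is made up for this example, and matrices are assumed to be square lists of lists.

```python
def mat_mul(A, B):
    """Multiply two n x n matrices with the simple row/column rule.

    Three nested loops of n iterations each give O(n^3) scalar multiplications.
    """
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):          # row of A
        for j in range(n):      # column of B
            for k in range(n):  # position along the row/column
                C[i][j] += A[i][k] * B[k][j]
    return C


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(mat_mul(A, B))  # [[19, 22], [43, 50]]
```

The result has n^2 entries, and each must at least be written out, which is the intuition behind the n^2 lower bound.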
1.11
Running Time vs. Complexity
Example: Sorting
INSTANCE: List of n integers.
SOLUTION: The list of integers in non-decreasing order.
Running times of specific algorithms:
Real dumb algorithm - O(n^3)
Bubble sort - O(n^2)
Merge sort - O(n log n)
Statement on the inherent computational complexity of sorting:
Any comparison-based sorting algorithm requires Ω(n log n) operations in the worst case, i.e., O(n log n) is the best any comparison-based algorithm could possibly do.
Is this just lower bound theory?
Yes, in a sense, but we are not concerned with specific running times, but rather polynomial vs. exponential.
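As an illustration of an algorithm that meets the n log n bound, here is a small merge sort sketch that also counts comparisons. The function name merge_sort and the one-element list used as a counter are choices made for this example only.

```python
import math


def merge_sort(xs, counter):
    """Return a sorted copy of xs; counter[0] accumulates element comparisons."""
    if len(xs) <= 1:
        return list(xs)
    mid = len(xs) // 2
    left = merge_sort(xs[:mid], counter)
    right = merge_sort(xs[mid:], counter)
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        counter[0] += 1                 # one comparison per loop iteration
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])             # at most one of these is non-empty
    merged.extend(right[j:])
    return merged


data = [5, 2, 9, 1, 5, 6, 3, 8]
counter = [0]
result = merge_sort(data, counter)
bound = len(data) * math.ceil(math.log2(len(data)))
print(result, counter[0], bound)
```

The comparison count stays within n⌈log2 n⌉, in line with the O(n log n) running time quoted for merge sort.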
1.12
The Satisfiability Problem (SAT)
SATISFIABILITY
INSTANCE: Set U of variables and a collection C of clauses over U.
QUESTION: Is there a satisfying truth assignment for C?
Example #1:
U = {u1, u2}
C = {{u1, ¬u2}, {¬u1, u2}}
Answer is “yes” - satisfiable by making both variables T
Example #2:
U = {u1, u2}
C = {{u1, u2}, {u1, ¬u2}, {¬u1}}
Answer is “no”
1.13
Satisfiability, Cont.
What would be a simple algorithm for SAT?
Build a truth table.
Running time would be (at least) O(n·2^m), where m is the number of variables and n is the length of the expression.
See pages 7 and 8 from the book.
Is a more efficient algorithm possible? Probably…
How about one with polynomial running time?
Come see me if you find one! A live white turkey and a Stanford job await…
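The truth-table idea can be sketched as follows. This is an illustration rather than anything from the slides: the function name satisfiable and the encoding of a literal as a (variable, is_positive) pair are made up here. The two clause sets are written with negated literals, which is an assumption chosen so that the answers match the "yes" and "no" stated for the two SAT examples.

```python
from itertools import product


def satisfiable(variables, clauses):
    """Truth-table search over all 2^m assignments.

    A clause is a set of (variable, is_positive) literals and is satisfied
    when at least one literal agrees with the assignment.
    Returns a satisfying assignment as a dict, or None if there is none.
    """
    for values in product([True, False], repeat=len(variables)):
        t = dict(zip(variables, values))
        if all(any(t[v] == positive for v, positive in clause)
               for clause in clauses):
            return t
    return None


U = ["u1", "u2"]

# Assumed form of example #1: (u1 or not u2) and (not u1 or u2)
C1 = [{("u1", True), ("u2", False)}, {("u1", False), ("u2", True)}]

# Assumed form of example #2: (u1 or u2) and (u1 or not u2) and (not u1)
C2 = [{("u1", True), ("u2", True)},
      {("u1", True), ("u2", False)},
      {("u1", False)}]

print(satisfiable(U, C1))  # an assignment with both variables True
print(satisfiable(U, C2))  # None: no assignment works
```

Each of the 2^m rows costs time proportional to the length of the clause collection, which is where the n·2^m estimate comes from.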
1.14
General Points
We are interested in the “border” between exponential and polynomial.
Given a problem, is there a polynomial time algorithm for it, or are all algorithms for it exponential in running time?
We are not interested in what the specific polynomial or exponential is, “per se.”
Although the theory can be modified/refined to consider these.
=> Simplistically and inaccurately speaking, saying that a problem is “NP-complete” or “NP-hard” is essentially saying that there is no (deterministic) polynomial time algorithm for that problem.
1.15
General Points, Cont.
Polynomial time does not necessarily imply practical.
O(n^1000)
O(n^2) could be 10,000,000n^2
NP-complete/NP-hard/intractable does not necessarily imply that there aren’t useful, practical algorithms.
Our measures are worst-case, and average case may not be all that bad, e.g., quicksort is O(n^2) worst case, but O(n log n) on average.
In theory, an algorithm could have worst-case running time O(2^n) because of one case, and O(n^2) average.
Simplex algorithm for linear programming
Branch-and-bound algorithm for knapsack problem.
2^O(√n) isn’t all that bad.
1.16
General Points, Cont.
Proving a problem is NP-complete or NP-hard is just the beginning:
Heuristic development and analysis (the problem doesn’t go away)
Special cases of the problem may be solvable in polynomial time
Sub-exponential time algorithms may exist.
1.17
General Description of the Theory
We will describe a class of (decision) problems called NP.
NP consists of those decision problems that can be solved in Non-deterministic Polynomial time.
Holy cow! What is that, and how could it possibly be important?
Why decision problems?
Simplicity
Convenience
No loss of generality in doing so
[Diagram: the class NP]
1.18
General Description of the Theory
We will define a subset of NP called P.
P consists of those problems from NP that can (also) be solved in (deterministic) polynomial time.
Why is deterministic in parentheses?
A very big, important question is P = NP? I.e., can all problems in NP be solved in (deterministic) polynomial time?
The answer to this question appears to be no, i.e., there exist problems in NP for which there is no known (deterministic) polynomial time algorithm.
[Diagram: P drawn as a region inside NP]
1.19
General Description of the Theory
This last point will lead us to define another subset of problems in NP called NP-complete.
The diagram on this slide implies several relationships:
P and NP-complete are proper subsets of NP
P and NP-complete do not intersect
Note that neither of these has been shown to be true; however, both are widely believed to be true.
Henceforth, “polynomial time” will be used as short for “deterministic polynomial time.”
[Diagram: P and NP-complete drawn as disjoint regions inside NP]
1.20
Facts about NP-complete Problems
Suppose Π is an NP-complete problem.
There are no known polynomial time algorithms for Π.
All known algorithms require exponential time, e.g., exhaustive search.
If Π ∈ P then P = NP.
If any NP-complete problem can be solved in polynomial time, then so can all problems in NP.
It is not known for certain whether Π requires exponential time or not.
All NP-complete problems appear to require exponential time, but only because no polynomial time algorithm has been found for any of them.
Given a problem Π, we would like to know whether Π ∈ P or Π is NP-complete.
[Diagram: P and NP-complete drawn as disjoint regions inside NP]
1.21
Facts about NP-complete Problems
The second observation suggests why showing that a problem is NP-complete is important: since NP contains many very practical problems for which people have tried (and failed) to find polynomial time algorithms, it is highly unlikely that any NP-complete problem can be solved in polynomial time.
1.22
More Sample Problems
DIVISIBILITY BY 2
INSTANCE: Integer k.
QUESTION: Is k even?
CLIQUE
INSTANCE: A Graph G = (V, E) and a positive integer J <= |V|.
QUESTION: Does G contain a clique of size J or more?
GRAPH K-COLORABILITY
INSTANCE: A Graph G = (V, E) and a positive integer K <= |V|.
QUESTION: Is the graph G K-colorable?
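To make the three sample problems concrete, the sketch below gives a direct test for DIVISIBILITY BY 2 and exhaustive-search procedures for CLIQUE and GRAPH K-COLORABILITY. The function names and the small test graph (a triangle with one extra vertex attached) are made up for this illustration; the exhaustive searches are not meant as practical algorithms.

```python
from itertools import combinations, product


def is_even(k):
    """DIVISIBILITY BY 2: a single remainder test."""
    return k % 2 == 0


def has_clique(vertices, edges, J):
    """CLIQUE by exhaustive search over all J-subsets of the vertices.

    A clique of size J or more contains one of size exactly J,
    so checking subsets of size J is enough.
    """
    E = {frozenset(e) for e in edges}
    return any(
        all(frozenset(pair) in E for pair in combinations(S, 2))
        for S in combinations(vertices, J)
    )


def is_k_colorable(vertices, edges, K):
    """GRAPH K-COLORABILITY by trying all K^|V| assignments of colors."""
    for colors in product(range(K), repeat=len(vertices)):
        c = dict(zip(vertices, colors))
        if all(c[u] != c[v] for u, v in edges):
            return True
    return False


# Hypothetical graph: triangle 1-2-3 with vertex 4 attached to vertex 3.
V = [1, 2, 3, 4]
E = [(1, 2), (2, 3), (1, 3), (3, 4)]

print(is_even(10), has_clique(V, E, 3), is_k_colorable(V, E, 2))
```

The first test takes constant work on k, while the other two examine a number of candidates that grows exponentially with the size of the graph in the worst case.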