PDF of Eric’s handwritten notes are here.
Overview:
How can we show that a problem is intractable or computationally difficult in the sense that it can’t be solved efficiently? To be precise, by efficient we mean that the running time is polynomial in the input size. We are not able to show that a problem cannot be solved in polynomial-time, but instead we show that if we can solve this particular problem in polynomial-time then it will have dramatic implications. In particular, we will be able solve a wide-range of other problems in polynomial-time, and these other problems include many noteworthy problems that have resisted the efforts of many people to crack them.
Following [Dasgupta-Papadimitriou-Vazirani]’s textbook we will define the class NP in terms of search problems. This is a somewhat more intuitive approach, however the traditional approach uses decision problems. The distinction is not important but you should be aware in case you look at other textbooks or you’ve seen these topics in an earlier course.
The class NP are all problems where we can efficiently verify solutions. In retrospect, we may have denoted the class as EV (to stand for efficient verification), but these concepts were devised in terms of automata and NP stands for non-deterministic polynomial-time; we’ll discuss this more in a bit. Let’s first define the notion of a search problem.
Definition: Search Problem:
- Input: Instance
- Output: Solution
if one exists; NO if no solution exists
- Requirements: Given
and solution
then in polynomial time we can check that
is a solution to
Roughly speaking, a search problem is one where we can efficiently verify solutions. We don’t need to find or construct a solution, simply if we are given a solution (who knows how it’s found) then we can efficiently check that it is in fact a solution. To show problem is a search problem, we need to show an algorithm that takes input
and solution
and checks that
is, in fact, a solution to
, and the running time of the algorithm
must be polynomial in the input size
.
We can now define the class NP as the class of all search problems. Note this says nothing about constructing the solution, just that we can verify solutions efficiently. If in addition we can construct solutions efficiently for all inputs then the problem is in the class P.
- NP = Class of all search problems
- P = All search problems that can be solved in polynomial time
Hence the question whether P =? NP is roughly asking whether constructing a solution as hard/easy as verifying a solution. An analogous question from mathematics: given a proof of a theorem is checking that proof as difficult (within polynomial factors) as constructing a proof?
We saw the SAT problem earlier in the graph algorithms section. Let’s review the definition since SAT plays a central role in the theory of NP-completeness.
Definition: SAT Problem:
- Input: Boolean formula
in CNF with
variables and
clauses
- Output: Satisfying assignment if one exists; NO if satisfying SAT assignment
SAT NP: Given
and an assignment
, for each clause in
time we can check that there is
satisfied literal. Thus it takes a total of
time to check that
satisfies
.
Here are a few additional problems that we will discuss in this lecture.
In the coloring problem we have k colors and we want to color the vertices so that adjacent vertices get different colors (i.e., no monochromatic edges).
k-Coloring Problem:
- Input:
and integer
- Output: k-coloring if one exists; NO if no such k-coloring
Note, k-Coloring NP: Given
and
. In
time to verify that
is a proper coloring.
time to check that
uses no more than
colors.
Here again is the knapsack problem that we saw in the dynamic programming section.
Knapsack Problem:
- Input:
objects: integer weights
; values
; integer
as weight limit
- Output: Solution
of objects with weight no more than
and the max value of the objects
Is Knapsack NP? Given the input and a solution
, we need to verify that the solution
is of maximum value. But how can we do that? We can run the dynamic programming algorithm to find the maximum value which is possible to attain, but that algorithm takes time
. The input size is of size poly(
). Hence we don’t know that Knapsack is in NP, in fact, if it is in NP then P=NP.
The above formulation is the optimization version, the following is the search version which adds an extra input parameter for the “goal”.
Knapsack-Search Problem:
- Input: n objects: integer weights
; values
; integer
as weight limit and integer
(Goal)
- Output: Solution
of objects with weight
and total value
Note, Knapsack-Search NP: Given the input and
, we can verify
in poly
time by checking whether total weight
and total value >=
.
Knapsack (optimization) can be reduced to Knapsack-Search:
If we can find a solution for the search version in polynomial time, then we can find the solution to the original Knapsack (optimization) problem in polynomial time using Knapsack-Search Problem as a subroutine and applying binary search over . Hence, we’ve shown that: Knapsack
Knapsack-Search.
Recall, the minimum spanning tree (MST) problem.
MST Problem:
- Input:
with
for every
- Output: Tree
of min weight
Note, MST NP: Given
and
. Run DFS to check
is a tree in
time. Then, run Kruskal’s or Prim’s algorithm to get a MST
and check that
.
NP stands for non-deterministic polynomial time these are problems that can be solved in poly-time on a non-deterministic machine. (Such machines are allowed to “guess” at each step, you can view it as an automata and we’re simply asking if there exists a path to the accepting state.)
To summarize the status, we’ve shown:
- P contains: MST
- NP contains: MST, Knapsack, SAT, K-Coloring, TSP
According to the definition of NP and P at the beginning of the lecture, it is easy to see that P NP. What will happen if P = NP?
- P = NP: If we can verify solutions efficiently, then we can construct solutions efficiently. In other words, verifying a proof is not harder than constructing a proof.
- P
NP: There are some search problem cannot be solved in poly time. These problems are NP-Complete problems, they are the hardest problems in NP in the sense that they are the most difficult search problems to solve.
If P NP, then there is no polynomial time algorithm for some problems in NP, these are the “hardest” problems in the class NP and hence are called NP-complete problems. The problems Knapsack-search, SAT, k-Colorings, TSP are all NP-complete, and there a ton of other NP-complete problems. If we can solve one of these NP-Complete problems efficiently, then we can solve all of the rest efficiently.
To show SAT is NP-Complete, we need to prove:
- SAT
NP (shown above)
- For every problem A
NP, there is a reduction from A to SAT (A
SAT): Given a poly-time algorithm for SAT we can use it to get a poly-time algorithm for every problem A
NP. Thus, SAT is at least as hard as every other problem in NP.
Here we formally the notion of a reduction alluded to in the second part.
Definition: Reduction:
Consider a pair of problems A and B, e.g., A = MST, B = SAT or A = 2SAT, B = SCC.
(or
) means A reduces to B:
If we can solve B in poly-time then we can use that algorithm for B as a black-box to solve A in poly-time.
Take input I from A:
- define input f(I) for B
- run black-box algorithm for B on f(I)
- Given solution S for f(I) produce solution h(S) and given No for f(I) then output No for I.
Thus to reduce A to B, we need:
- define f and h
- prove that if S is a solution for f(I) then h(s) is a solution for I
- prove that if there is no solution for f(I) then there is no solution for I
To prove NP-completeness:
Now we can use reductions to show the second part of the proof of NP-completeness for SAT. That is, for any problem A in NP, we need to find a reduction from A to SAT.
Suppose we know SAT is NP-Complete somehow, which gives us for every problem A in NP, we have A SAT.
And now we want to show Colorings is NP-Complete. Since we already know Colorings is NP, the remaining thing is to show for every problem A in NP, we have A Colorings. If we can show SAT
Colorings, then we have A
SAT
Colorings, and hence A
Colorings.
In general, if we want to prove a problem such as Colorings is NP-Complete, we need to show:
- Colorings
NP
- Find a known NP-Complete problem B and show that B
Colorings.