NP: Graph problems

PDF of Eric’s handwritten notes are here.

In this last lecture we took for granted that SAT is NP-Complete, and then we used this fact to show that 3-SAT is NP-Complete.  Now we’ll build on the NP-completeness of 3-SAT to prove that a few graph problems, namely, Independent Set, Clique, and Vertex Cover, are NP-Complete.

The first problem we’ll consider is the independent set problem.  Let’s first recall the definition of an independent set.

Definition: Independent Set:

For an undirected graph G = (V, E), S \subset V is an independent set if for all x, y \in S, we have (x,y) \notin E.
In other words, no edge is contained in S.

Small independent sets are trivial to find, e.g., the empty set or a single vertex.  The challenge is to find a large independent set.  For the Max-Independent-Set problem in which we find an independent set of maximum size, we can’t show that’s in NP, unless P=NP, because we can’t verify that a set is of maximum size.  Hence we have an extra input parameter g and we look for an independent set of size \geq g.

Definition: Independent-set Problem:

  • Input: Undirected graph G = (V,E) and goal g.
  • Output: Independent Set S \subset V where |S| \geq g if one exists;
    NO otherwise.

Claim: Independent-set problem is NP-Complete.

Proof:

a) Independent-set \in NP: Given input G, g and solution S, we
consider all pairs x, y \in S and check that (x, y) \notin E.  This takes O(n^2) time to consider all pairs in S and O(n) time for each pair to check that it is not an edge.  Finally, in O(n) time we check that |S| \geq g.

b) 3-SAT \rightarrow Independent-set

Assume there is a poly-time algorithm for the independent-set problem, and we’ll use this algorithm as a black-box to create an algorithm for the 3-SAT problem.

Consider 3-SAT input f with n variables x_1, x_2, \dots, x_n
and m clauses C_1, C_2, \dots, C_m, where each clause has size \leq 3.

We will create a graph G based on f.  Formula f has m clauses so we will set g = m.

The vertices of G will correspond to literals of f, though there will potentially be many vertices corresponding to the same literal.  If all clauses are of size exactly 3 then we will have 3m vertices in G.  Let us define it precisely.

For a clause C = (a_1 \vee a_2 \vee a_3) for literals a_1, a_2, a_3, create 3 vertices corresponding to a_1, a_2, a_3 and put all 3 edges between them to make them a triangle. In general, if clause C has size i\leq 3 then add i vertices corresponding to the literals in C and add the complete graph between these i vertices.

Note, any independent set of G can include at most one of the i vertices for each clause C.  And hence to obtain an independent set of size \geq g where g=m it will need to include exactly one per clause “gadget”.

For each variable x_i, add an edge between all vertices corresponding to x_i and all vertices corresponding to \overline{x_i}. These edges ensure that an independent-set S does not include both x_i and \overline{x_i}, and hence S corresponds to a valid assignment for f.

Combining these two properties, we know that to get |S| = g = m, we need exactly one vertex per clause, and hence a solution to the independent-set problem on the graph G corresponds to a satisfying assignment of f.  Let us formalize this.

Claim: f has a satisfying assignment \leftrightarrow G has an independent set of size \geq g.

Proof:

(\Rightarrow): Consider a satisfying assignment for f. For each clause C, at least one of the literals in C is satisfied. Choose one of these satisfied literals and include its corresponding vertex in S. Hence, |S| = m = g.
By construction, S includes exactly one vertex per clause so no edges within a clause’s gadget are covered by S.  Moreover, since we started from an assignment for the variables x_1,\dots,x_n then S does not include opposing literals, so no edges between opposite literals are covered. Hence, S is an independent set.

(\Leftarrow): Take an independent set S of size \geq g. For each vertex in S set the corresponding literal to T, we will prove this gives a satisfying assignment.   Since there are edges between vertices corresponding to opposite literals, S does not contain vertices corresponding to opposite literals and so this assignment from S will not set both literals x_i and \overline{x_i} to T; hence, it is a valid assignment.  Moreover, since |S|\geq g and since there is a complete graph for each clause’s gadget, the set S has exactly one vertex per clause’s gadget.  This gives at least one satisfied literal per clause and hence we have a valid assignment that satisfies f.

Next we’ll consider the clique problem.  Here is the definition of a clique.

Definition: Clique:

Clique: For an undirected graph G = (V, E), S \subset V is a clique if for all x, y \in S, we have (x,y) \in E.
In other words, S is a fully connected subgraph.

The challenge is to find a large clique.

Definition: Clique Problem:

  • Input: Undirected graph G = (V,E) and goal g.
  • Output: Clique S \subset V where |S| \geq g if one exists;
    NO otherwise.

Claim: The Clique problem is NP-Complete.

Proof:

a) Clique \in NP: Given input G, g and solution S, check for all pairs x, y \in S that (x, y) \in E; this takes O(n^2) time.  Then, check |S| \geq g which takes O(n) time.

b) Independent Set \rightarrow Clique

Key idea: Clique is opposite of independent set.
For a graph G = (V, E), denote its “opposite” graph by: \overline{G} = (V, \overline{E}) where \overline{E} = \{(x,y): (x,y) \notin E\}. In other words, (x, y) \notin E \leftrightarrow (x, y) \in \overline{E}.
Observe that: S is an independent set in G \leftrightarrow \overline{S} is a clique in \overline{G}.

Now we can show Independent set \rightarrow Clique:
Given G = (V,E) and g as input for the independent set problem, let \overline{G} and g be the input to the clique problem.
If we get a solution S for Clique then return the same S as the solution to the original independent set problem.  If we get NO, then we return NO for the independent set problem.

Definition: Vertex Cover:

For an undirected graph G = (V, E), S \subset V is a vertex cover if it covers every edge in the following sense: for all (x,y) \in E, either x \in S and/or y \in S.

The set S=V is a trivial vertex cover, and hence the challenge is to find a small vertex cover.

Definition: Vertex Cover Problem:

  • Input: Undirected graph G = (V,E) and goal b
  • Output: VC S \subset V where |S| \leq b if one exists; NO otherwise

Claim: The Vertex Cover problem is NP-Complete.

Proof:

a) Vertex Cover \in NP: Given input G, g and solution S, for every edge (u,v)\in E of G, check that at least one of $u$ and $v$ are in S; this takes O(m) time.  We then check that |S| \leq b which takes O(n) time.

b) Independent Set \rightarrow Vertex Cover.

Claim: For any undirected graph G=(V,E), S is a vertex cover \leftrightarrow \overline{S} is an independent set.

Proof:

(\Rightarrow): Consider a vertex cover S where |S|\leq b. For each (x,y) \in Ex \in S and/or y \in S. Hence, at most one of x, y is in \overline{S}. So no edge has both endpoints in \overline{S}, which means \overline{S} is an independent set.

(\Leftarrow): Take an independent set \overline{S}. For (x,y) \in E, at most one of x,y is in \overline{S}. So at least one of x,y is in S, which means S covers every edge, and hence S is a vertex cover.

Independent Set problem \rightarrow Vertex Cover problem:
Consider input G, g for independent set problem.  Let G, b = |V|-g be the input for the vertex cover problem.  We know that G has an independent set of size \geq g \leftrightarrow G has a vertex cover of size \leq b.

If we get solution S for vertex cover, then wen return \overline{S} = V-S as the solution to the original independent set problem.
If we get NO solution for VC, then we return NO for IS.