Chapter 7 Summary

A final example of a reduction

The subset-sum problem is the problem of determining whether or not a set S of positive integers contains a subset that sums to a target value t. This can be expressed as a language:

SUBSET-SUM = {<S,t> | S contains a subset whose sum is t }

This language is in NP, because a non-deterministic Turing machine can guess the subset, compute the sum of the elements in the subset, and confirm that the subset sum equals t all in polynomial time.

To show that SUBSET-SUM is in NPC, we reduce 3SAT to SUBSET-SUM. The reduction takes a 3cnf-formula φ as its input and constructs a set of integers S and a target value t from the formula.

The table below shows the structure of the set S and how t is constructed. The table is constructed from a 3cnf-formula φ that has l variables and k clauses.

	1	2	3	4	⋯	l	c₁	c₂	⋯	c_k
y₁	1	0	0	0	⋯	0	1	0	⋯	0
z₁	1	0	0	0	⋯	0	0	0	⋯	0
y₂	0	1	0	0	⋯	0	0	1	⋯	0
z₂	0	1	0	0	⋯	0	1	0	⋯	0
y₃	0	0	1	0	⋯	0	1	1	⋯	0
z₃	0	0	1	0	⋯	0	0	0	⋯	1
⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮
y_l	0	0	0	0	⋯	1	0	0	⋯	0
z_l	0	0	0	0	⋯	1	0	0	⋯	0
g₁	0	0	0	0	⋯	0	1	0	⋯	0
h₁	0	0	0	0	⋯	0	1	0	⋯	0
g₂	0	0	0	0	⋯	0	0	1	⋯	0
h₂	0	0	0	0	⋯	0	0	1	⋯	0
⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮
g_k	0	0	0	0	⋯	0	0	0	⋯	1
h_k	0	0	0	0	⋯	0	0	0	⋯	1
t	1	1	1	1	⋯	1	3	3	⋯	3

The table contains two rows for each variable x_i in φ. The row labeled y_i corresponds to x_i appearing in its non-negated form, while the row z_i corresponds to x_i appearing in its negated form x_i. The table also has a column labeled c_j for each clause in φ. If the variable x_i appears as x_i in clause c_j, we put a 1 in the table in row y_i, column c_j. If the variable x_i appears as x_i in clause c_j, we put a 1 in row z_i, column c_j of the table.

The number t is carefully constructed to express all of the relevant constraints in our system. The first thing to note about the columns of the table and the number t is that no column contains enough 1s in it to trigger a carry operation when we sum down that column.

The next thing to note is that the columns whose sum is 1 express the constraint that each variable in the formula has to either be true or false. By forcing the first set of columns to sum to 1, we are essentially forcing ourselves to pick a subset of the rows in such a way that none of the first l columns in the subset sum contain anything other than exactly one 1 per column.

The second group of columns, which must all sum to 3, express the constraint that each clause must have at least one term that evaluates to true. To get the sum in a particular column to reach 3, we can start by picking numbers from the g and h rows for our subset. These numbers will allow a given column to sum to at most 2, so we will have to pick at least one additional row that has a 1 in that column to reach the sum of three. Picking a row that has a 1 in the proper column corresponds to forcing the variable whose row we picked to evaluate to true.

Summary of the main ideas in chapter 7

Definition A language L is in the class P if there exists a polynomial time decider for L.

Definition A decidable language L is in the class NP if there exists a polynomial time nondeterministic Turing machine that decides L.

Definition A decidable language L is in the class NP if the language is polynomial time verifiable. A verifier for a language is a Turing machine program V which, when given a pair <w,c> where w and c are both strings, can use c to determine in polynomial time whether or not w is in L.

Definition A Turing decidable language L is NP-complete if

L ∈ NP
For all other languages A ∈ NP we have A ≤_P L

Cook-Levin Theorem SAT is NP-complete.

Theorem If A and B are both in NP, A is NP-complete, and A ≤_P B, then B is NP-complete.

Examples of other NP-complete languages

SAT ≤_P 3SAT

3SAT ≤_P CLIQUE

CLIQUE ≤_P VERTEX-COVER

3SAT ≤_P ≠SAT

≠SAT ≤_P MAX-CUT

3SAT ≤_P 3COLOR

3SAT ≤_P SUBSET-SUM

A universal decider for languages in NP

If A is in NP there exists a polynomial time verifier V for A. The verifier works on pairs <w,c>, where w is the input that may or may not be in A and c is a certificate chosen from some certificate space C(w).

Here is a Turing machine that can decide A with the help of the verifier V:

On input w:

For every c ∈ C(w):

Run V on <w,c>.
If V accepts, stop and accept.

Stop and reject.

This machine is a decider, because the certificate space C(w) is finite and V runs in polynomial time.

This machine does not decide A in polynomial time, because although C(w) is finite, for all NP-complete languages C(w) will have a size that is not polynomial in |w|.

Here are some examples of NP-complete languages and the sizes of their certificate spaces:

Language	Typical c	Size of C(w)
3SAT	A list of T,F assignments for the n variables in φ	2ⁿ
CLIQUE	A list of k vertices in the clique	\|V\| choose k
3COLOR	A list of color assignments for the vertices in V	3^\|V\|
SUBSET-SUM	The numbers in the subset of S	2^\|S\|

The most dangerous theorem in all of computer science

Definition A Turing decidable language B is NP-complete if

B ∈ NP
For all other languages A ∈ NP we have A ≤_P B

Theorem If A and B are Turing decidable languages with B in P and A ≤_P B, then A is in P.

Theorem If B is NP-complete and B ∈ P, then P = NP.

Proof Just combine the definition and the previous theorem.