A final example of a reduction

The subset-sum problem is the problem of determining whether or not a set S of positive integers contains a subset that sums to a target value t. This can be expressed as a language:

SUBSET-SUM = {<S,t> | S contains a subset whose sum is t }

This language is in NP, because a non-deterministic Turing machine can guess the subset, compute the sum of the elements in the subset, and confirm that the subset sum equals t all in polynomial time.

To show that SUBSET-SUM is in NPC, we reduce 3SAT to SUBSET-SUM. The reduction takes a 3cnf-formula φ as its input and constructs a set of integers S and a target value t from the formula.

The table below shows the structure of the set S and how t is constructed. The table is constructed from a 3cnf-formula φ that has l variables and k clauses.

1234lc1c2ck
y110000100
z110000000
y201000010
z201000100
y300100110
z300100001
yl00001000
zl00001000
g100000100
h100000100
g200000010
h200000010
gk00000001
hk00000001
t11111333

The table contains two rows for each variable xi in φ. The row labeled yi corresponds to xi appearing in its non-negated form, while the row zi corresponds to xi appearing in its negated form xi. The table also has a column labeled cj for each clause in φ. If the variable xi appears as xi in clause cj, we put a 1 in the table in row yi, column cj. If the variable xi appears as xi in clause cj, we put a 1 in row zi, column cj of the table.

The number t is carefully constructed to express all of the relevant constraints in our system. The first thing to note about the columns of the table and the number t is that no column contains enough 1s in it to trigger a carry operation when we sum down that column.

The next thing to note is that the columns whose sum is 1 express the constraint that each variable in the formula has to either be true or false. By forcing the first set of columns to sum to 1, we are essentially forcing ourselves to pick a subset of the rows in such a way that none of the first l columns in the subset sum contain anything other than exactly one 1 per column.

The second group of columns, which must all sum to 3, express the constraint that each clause must have at least one term that evaluates to true. To get the sum in a particular column to reach 3, we can start by picking numbers from the g and h rows for our subset. These numbers will allow a given column to sum to at most 2, so we will have to pick at least one additional row that has a 1 in that column to reach the sum of three. Picking a row that has a 1 in the proper column corresponds to forcing the variable whose row we picked to evaluate to true.

Summary of the main ideas in chapter 7

Definition A language L is in the class P if there exists a polynomial time decider for L.

Definition A decidable language L is in the class NP if there exists a polynomial time nondeterministic Turing machine that decides L.

Definition A decidable language L is in the class NP if the language is polynomial time verifiable. A verifier for a language is a Turing machine program V which, when given a pair <w,c> where w and c are both strings, can use c to determine in polynomial time whether or not w is in L.

Definition A Turing decidable language L is NP-complete if

  1. LNP
  2. For all other languages ANP we have AP L

Cook-Levin Theorem SAT is NP-complete.

Theorem If A and B are both in NP, A is NP-complete, and A ≤P B, then B is NP-complete.

Examples of other NP-complete languages

SATP 3SAT

3SATP CLIQUE

CLIQUEP VERTEX-COVER

3SATPSAT

SATP MAX-CUT

3SATP 3COLOR

3SATP SUBSET-SUM

A universal decider for languages in NP

If A is in NP there exists a polynomial time verifier V for A. The verifier works on pairs <w,c>, where w is the input that may or may not be in A and c is a certificate chosen from some certificate space C(w).

Here is a Turing machine that can decide A with the help of the verifier V:

On input w:

  1. For every cC(w):
    1. Run V on <w,c>.
    2. If V accepts, stop and accept.
  2. Stop and reject.

This machine is a decider, because the certificate space C(w) is finite and V runs in polynomial time.

This machine does not decide A in polynomial time, because although C(w) is finite, for all NP-complete languages C(w) will have a size that is not polynomial in |w|.

Here are some examples of NP-complete languages and the sizes of their certificate spaces:

LanguageTypical cSize of C(w)
3SATA list of T,F assignments for the n variables in φ2n
CLIQUEA list of k vertices in the clique|V| choose k
3COLORA list of color assignments for the vertices in V3|V|
SUBSET-SUMThe numbers in the subset of S2|S|

The most dangerous theorem in all of computer science

Definition A Turing decidable language B is NP-complete if

  1. BNP
  2. For all other languages ANP we have AP B

Theorem If A and B are Turing decidable languages with B in P and A ≤P B, then A is in P.

Theorem If B is NP-complete and BP, then P = NP.

Proof Just combine the definition and the previous theorem.