Friday, April 4, 2008

* Probabilistically Checkable Proofs (PCP)
* PCP and Inapproximability

PROBABILISTICALLY CHECKABLE PROOFS

Probabilistically Checkable Proofs are a refinement of Interactive
Proof systems in which we explicitly account for two resources used
by the verifier -- queries and randomness.  Note that in an IP
system, we could have assumed (by increasing the number of rounds)
that the prover always responds with one bit.  This is what we will
assume from now on.  Furthermore, we will assume that the verifier
includes in each of its queries the entire message exchange thus far.
So we can assume that the prover is nonadaptive, i.e., a fixed
function from queries to one-bit answers; this function is the proof.
Note that the number of queries is simply the number of rounds.

PCP(r(n),q(n)) is the class of languages L for which a (P,V) system
exists that satisfies the following conditions:

(1) V is a probabilistic polynomial-time verifier.

(2) The number of random bits used by V is O(r(n)).

(3) The number of queries is O(q(n)).

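(4) w is in L => there is a prover P such that Pr[V accepts] = 1.

(5) w is not in L => for every prover P', Pr[V accepts] < c.

Here c < 1 is a constant.  It can be made arbitrarily close to 0 by
running V several times with independent random bits and accepting
only if every run accepts; a constant number of runs changes r(n) and
q(n) by only a constant factor.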
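
As a rough illustration of how the error bound c is driven down, the
sketch below counts how many independent runs of V are needed to go
from error c to a target error.  The function name and its parameters
are hypothetical, introduced only for this example; t runs cost t
times the random bits and t times the queries.

```python
import math

def repetitions_needed(c: float, target: float) -> int:
    """Smallest t >= 1 with c**t <= target, for 0 < c < 1 and 0 < target < 1.

    Running the verifier t times independently, and accepting only if
    every run accepts, turns a soundness error of c into c**t.
    """
    t = max(1, math.ceil(math.log(target) / math.log(c)))
    # Correct for floating-point rounding near exact powers of c.
    while c ** t > target:
        t += 1
    while t > 1 and c ** (t - 1) <= target:
        t -= 1
    return t

if __name__ == "__main__":
    t = repetitions_needed(0.5, 0.01)
    print(t, "runs;", "random bits and queries both grow by a factor of", t)
```

Since t depends only on c and the target, not on n, the O(r(n)) and
O(q(n)) bounds are unaffected.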

MAIN PCP THEOREM: NP = PCP(log(n), 1).

That is, for any satisfiable SAT formula, there is a polynomial-size
proof of satisfiability whose correctness a poly-time verifier can
check with high probability by looking at only a constant number of
bits of the proof!

APPROXIMATION ALGORITHMS

An algorithm A is an a(n)-approximation (with a(n) >= 1) for a
minimization problem P if for every instance I of P, we have:

cost(A(I)) <= OPT(I)*a(|I|).

Similarly, for a maximization problem (with a(n) <= 1), we require
cost(A(I)) >= OPT(I)*a(|I|).

Since NP-complete problems do not seem to be solvable in polynomial
time, can we hope to solve the optimization versions approximately?
This has been a focus of attention over the last couple of decades.
A number of problems have been resolved, but several more remain open.

Consider MAX-3SAT.  We would like to maximize the number of clauses
satisfied.

Theorem: There exists a polynomial-time 7/8-approximation for
MAX-3SAT.

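Proof: Set each variable to true independently with prob 1/2.  A
clause on three distinct variables is unsatisfied only if all three
of its literals are false, so the probability that it is satisfied is
7/8.  By linearity of expectation, the expected number of clauses
satisfied is 7m/8, where m is the number of clauses.  Since OPT <= m,
this is at least 7/8 of the optimum.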

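How do we derandomize it?  For i from 1 to n, with x_1, ..., x_{i-1}
already fixed, compute:

E[# clauses satisfied | earlier choices, x_i = 1]  and
E[# clauses satisfied | earlier choices, x_i = 0]

Their average is the expectation given the earlier choices alone,
which is at least 7m/8 by induction.  So one of them has to be at
least 7m/8.  Pick that assignment for x_i.  Continue.  Once x_n is
fixed, the expectation is simply the number of clauses satisfied,
which is therefore at least 7m/8.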

End Proof
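
The derandomization above (the method of conditional expectations)
can be sketched as follows.  This is a minimal illustration, not a
tuned implementation: the clause encoding (tuples of nonzero integers,
where v stands for x_v and -v for its negation) and all function names
are assumptions made for this example, and each clause is assumed to
mention distinct variables.

```python
from fractions import Fraction

def clause_sat_prob(clause, assignment):
    """Probability the clause is satisfied when the variables in
    `assignment` are fixed and all others are independent fair coins."""
    unset = 0
    for lit in clause:
        var = abs(lit)
        if var in assignment:
            if assignment[var] == (lit > 0):
                return Fraction(1)  # an already-fixed literal is true
        else:
            unset += 1
    # The clause fails only if every still-random literal comes out false.
    return 1 - Fraction(1, 2 ** unset)

def expected_satisfied(clauses, assignment):
    """Conditional expectation of the number of satisfied clauses."""
    return sum(clause_sat_prob(c, assignment) for c in clauses)

def derandomized_max3sat(clauses, num_vars):
    """Fix x_1, ..., x_n one at a time, never letting the conditional
    expectation drop."""
    assignment = {}
    for var in range(1, num_vars + 1):
        assignment[var] = max(
            (True, False),
            key=lambda val: expected_satisfied(clauses, {**assignment, var: val}),
        )
    return assignment

def count_satisfied(clauses, assignment):
    return sum(any(assignment[abs(l)] == (l > 0) for l in c) for c in clauses)

if __name__ == "__main__":
    clauses = [(1, 2, 3), (-1, 2, 4), (1, -3, -4), (-2, 3, 4)]
    a = derandomized_max3sat(clauses, 4)
    print(a, count_satisfied(clauses, a), "of", len(clauses))
```

Each conditional expectation is a sum over clauses, so the whole
procedure takes time polynomial in n and m.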

Theorem: There is a PTAS for MAX-3SAT iff P = NP.

Proof: If P = NP, then one can solve MAX-3SAT optimally.  For the
other direction, suppose we have a PTAS for MAX-3SAT.  We will show
that P = NP.  Consider a language L in NP.  Since NP = PCP(log(n), 1),
there exists a prover-verifier system for L in which V uses c log(n)
random bits and k queries, for constants c and k, and accepts any x
not in L with probability at most 1/2.

Given string x, we will construct a 3SAT formula f_x such that: 

-- if x is in L, then f_x is satisfiable

-- if x is not in L, then at most a 1-eps fraction of the clauses can
be satisfied, for eps = 1/((k-2)2^{k+1}).

If we had a PTAS for MAX-3SAT, then we could distinguish between the
two cases in polynomial time, thus establishing that L is in P,
implying that P = NP.

The PCP protocol determines a computation tree T with two kinds of
branches -- random branches and oracle branches.  There are only
2^{c log(n)} = n^c possible random bit strings.  For each such random
bit string y, let T_y be the subtree of T obtained by fixing the
random bit string choice y.  Thus, T_y has only oracle branches.
There are at most 2^k leaves in this tree.  Some of these lead to
accept, some to reject.  We can write a DNF formula, over variables
representing the bits of the proof, corresponding to the accepting
branches.  This can be converted to CNF form: take one clause of k
literals for each rejecting branch, stating that the oracle answers
do not follow that branch.  This gives at most 2^k clauses.

After this, we can convert the CNF formula to a 3SAT formula by
splitting each clause of k literals into k-2 clauses of three literals
using new variables.  This has at most (k-2)2^k clauses.  Putting
together these formulas for every random bit string gives a formula
f_x with at most (k-2)2^k n^c clauses.  If x is in L, then there
exists a P such that V always accepts x.  So for each random string y,
the oracle answers given by P satisfy the DNF (and hence the CNF as
well).  So every clause is satisfied.
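
The conversion for a single random string can be sketched as below.
This is a simplified illustration under stated assumptions: the
queries are taken to be nonadaptive (a fixed list of proof positions),
proof bits are variables 1..N, new variables are numbered from
first_aux upward, and every function name is hypothetical.

```python
from itertools import product

def local_cnf(positions, accepts):
    """One clause per rejecting answer tuple; the clause is false
    exactly when the proof bits at `positions` equal that tuple."""
    clauses = []
    for answers in product((False, True), repeat=len(positions)):
        if not accepts(answers):
            clauses.append([-p if a else p for p, a in zip(positions, answers)])
    return clauses

def split_to_3cnf(clause, next_aux):
    """Split a clause of k > 3 literals into k-2 clauses of 3 literals,
    chained through k-3 fresh variables.  Returns (clauses, next unused
    variable number)."""
    if len(clause) <= 3:
        return [list(clause)], next_aux
    out = [[clause[0], clause[1], next_aux]]
    for lit in clause[2:-2]:
        out.append([-next_aux, lit, next_aux + 1])
        next_aux += 1
    out.append([-next_aux, clause[-2], clause[-1]])
    return out, next_aux + 1

def verifier_to_3cnf(positions, accepts, first_aux):
    """3CNF for one random string: at most (k-2)*2^k clauses."""
    result, aux = [], first_aux
    for wide in local_cnf(positions, accepts):
        pieces, aux = split_to_3cnf(wide, aux)
        result.extend(pieces)
    return result, aux

if __name__ == "__main__":
    # Toy check: accept iff an even number of the 4 queried bits are 1.
    cnf, next_free = verifier_to_3cnf([1, 2, 3, 4], lambda a: sum(a) % 2 == 0, 5)
    print(len(cnf), "clauses, variables used up to", next_free - 1)
```

The full formula f_x is the union of these clause sets over all n^c
random strings, with disjoint new variables for each.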

Suppose x is not in L.  Then, for any prover P', at most half of the
random bit choices lead to accept, so at least half lead to reject.
So any assignment has to leave at least one clause unsatisfied for at
least half the random strings, i.e., at least n^c/2 clauses are
unsatisfied.  Since the formula has at most (k-2)2^k n^c clauses, the
fraction of unsatisfied clauses is at least
(n^c/2)/((k-2)2^k n^c) = 1/((k-2)2^{k+1}) = eps.

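If MAX-3SAT has an approximation algorithm with ratio > 1-eps, then
if x is in L, it would return an assignment satisfying a > 1-eps
fraction of the clauses.  Otherwise, no assignment satisfies more than
a 1-eps fraction.  Comparing the algorithm's output against 1-eps
therefore decides whether x is in L.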

End Proof

Theorem: For any constant alpha > 0, there is no alpha-approximation
for MAX-CLIQUE unless P = NP.

Proof: If P = NP, then one can solve MAX-CLIQUE optimally.  For the
other direction, suppose we have an alpha-approximation for
MAX-CLIQUE.  We will show that P = NP.  Consider a language L in NP.
Since NP = PCP(log(n), 1), there exists a prover-verifier system for L
in which V uses c log(n) random bits and O(1) queries, and accepts any
x not in L with probability less than alpha.

Given string x, we will construct a graph G such that:

-- if x is in L, then G has a clique of size n^c.

-- if x is not in L, then the largest clique has size < alpha*n^c.

Consider a random string y and a sequence of oracle answers a.  These
completely determine the oracle query sequence q.  Let the vertex set
of G be

{(y,q,a): V accepts on the computation path given by y, q and a}.

We have an edge between (y,q,a) and (y',q',a') if they are consistent,
i.e., every query that appears in both q and q' receives the same
answer in a and a'.  Note that (y,q,a) is consistent with (y,q',a')
iff q = q' and a = a': for the same y, two distinct paths ask the same
query at the point where they first diverge and receive different
answers.  Further note that the answers of any fixed prover are
consistent.

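If x is in L, there is a P (and hence a proof pi) such that V accepts
for all random strings.  For each random string y, exactly one (q,a)
pair occurs under pi, and it leads to acceptance.  These n^c vertices
are pairwise consistent, so they form a clique of size n^c.

If x is not in L, there is no P' (and hence no proof pi) such that V
accepts for at least an alpha fraction of the random strings.  If
there were a clique of size >= alpha*n^c, its vertices would have
distinct random strings and pairwise consistent answers, which would
yield a prover that achieves an acceptance probability of at least
alpha, a contradiction.  An alpha-approximation algorithm for
MAX-CLIQUE would therefore distinguish the two cases.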

End Proof
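
The graph construction in this proof can be tried out on a toy
verifier.  The sketch below is illustrative only: it assumes
nonadaptive queries, the toy verifier (check that two adjacent bits
on a cycle differ) and all function names are invented for this
example, and the brute-force clique search is usable only on tiny
graphs.

```python
from itertools import combinations, product

def build_graph(num_random, queries, accepts):
    """Vertices: accepting (y, answers) pairs.  Edges: consistent pairs."""
    vertices = [
        (y, answers)
        for y in range(num_random)
        for answers in product((0, 1), repeat=len(queries(y)))
        if accepts(y, answers)
    ]

    def consistent(u, v):
        (y1, a1), (y2, a2) = u, v
        if y1 == y2:
            return False  # two distinct views of the same y always conflict
        view1 = dict(zip(queries(y1), a1))
        view2 = dict(zip(queries(y2), a2))
        return all(view1[p] == view2[p] for p in view1.keys() & view2.keys())

    edges = {frozenset((u, v)) for u, v in combinations(vertices, 2) if consistent(u, v)}
    return vertices, edges

def max_clique_size(vertices, edges):
    """Exhaustive search, largest size first (toy instances only)."""
    for size in range(len(vertices), 0, -1):
        for subset in combinations(vertices, size):
            if all(frozenset(pair) in edges for pair in combinations(subset, 2)):
                return size
    return 0

def best_proof_value(num_random, proof_len, queries, accepts):
    """Max over all proofs of the number of accepting random strings."""
    return max(
        sum(accepts(y, tuple(proof[p] for p in queries(y))) for y in range(num_random))
        for proof in product((0, 1), repeat=proof_len)
    )

def cycle_verifier(n):
    """Toy verifier: random string y queries bits y and y+1 (mod n)
    and accepts iff they differ."""
    return (lambda y: (y, (y + 1) % n)), (lambda y, ans: ans[0] != ans[1])

if __name__ == "__main__":
    for n in (4, 3):
        queries, accepts = cycle_verifier(n)
        vertices, edges = build_graph(n, queries, accepts)
        print(n, max_clique_size(vertices, edges), best_proof_value(n, n, queries, accepts))
```

On both toy instances the largest clique has the same size as the
number of random strings accepted under the best proof, which is the
correspondence the proof relies on.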