Tuesday, April 8, 2008

* NP is in PCP(poly, 1)

SATISFIABLE QUADRATIC EQUATIONS

We will work with the NP-complete language of satisfiable quadratic
equations (call it SATQUAD).  In a SATQUAD instance, we are given a
system of m quadratic equations:

Sum_{i,j=1}^n c_{i,j}^k x_i x_j = c^k

k = 1,...,m

over the field Z_2 in the n variables x_1,...,x_n.  A system is
satisfiable iff there is an assignment to the {x_i} satisfying all the
m equations.
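
To make the definition concrete, here is a minimal Python sketch of a
SATQUAD instance and a brute-force satisfiability check.  The names
satisfies and brute_force_solve, the nested-list layout of the
coefficients, and the toy instance are assumptions made for
illustration; they are not part of the lecture.

```python
from itertools import product

def satisfies(coeffs, rhs, a):
    """Return True iff assignment a satisfies every equation over Z_2.

    coeffs[k][i][j] plays the role of c_{i,j}^k and rhs[k] of c^k
    (indices are 0-based here, unlike in the notes).
    """
    n = len(a)
    for c, b in zip(coeffs, rhs):
        lhs = sum(c[i][j] * a[i] * a[j]
                  for i in range(n) for j in range(n)) % 2
        if lhs != b:
            return False
    return True

def brute_force_solve(coeffs, rhs, n):
    """Try all 2^n assignments; return a satisfying one, or None."""
    for a in product((0, 1), repeat=n):
        if satisfies(coeffs, rhs, a):
            return a
    return None

# Toy instance with n = 2 and m = 2:
#   x_1 x_2 = 1   and   x_1 x_1 = 1   (note x_i x_i = x_i over Z_2).
coeffs = [[[0, 1], [0, 0]],
          [[1, 0], [0, 0]]]
rhs = [1, 1]
solution = brute_force_solve(coeffs, rhs, 2)
```

The brute-force search takes exponential time and is only meant to
show what "satisfiable" means; checking one given assignment is the
polynomial-time step that puts SATQUAD in NP.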

SATQUAD is in NP.  It is also easy to reduce 3SAT to SATQUAD, so
SATQUAD is NP-complete.

PCP FOR SATQUAD

Before presenting the proof, we recall that we can assume that the
prover is non-adaptive.  So the prover is nothing but a fixed proof
string that does not change with the queries.  This will be crucially
used throughout the proof below.  We will lay out the "proof" as the
honest prover would write it.

The verifier wants to determine whether there exists a setting of
{x_i} that satisfies the above m equations.  The (honest) prover will
choose (a_1,...,a_n) that satisfies the equations and then write a
proof p indexed by binary vectors v of length n^2, with the entry of p
at index v = (v_11, v_12, ..., v_1n, ..., v_n1, ..., v_nn) equal to:

p(v) = Sum_{i,j=1}^n a_i a_j v_{ij}

So the proof p is a function from {0,1}^{n^2} to {0,1}.  The verifier
will check three things.  

-- Linearity test: p encodes a linear function, i.e., 

p(v) = Sum_{i,j} l_{ij} v_{ij}

for some coefficients l_{ij} in {0,1}.

-- Consistency test: the coefficients of the linear function encoded
by the proof string are consistent.

l_{ij} = l_{ii} l_{jj} for all i,j

-- Satisfiability test: The assignment a_i = l_{ii} defined by the
proof is indeed a satisfying assignment.

What the verifier really cares about is the last one.  But that is
made much easier if the prover passes the first two tests.
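
For tiny n, the honest prover's table can be written out in full.  The
sketch below is illustrative only: the function name honest_proof, the
dictionary representation, and the row-major flattening of v are
assumptions, not something fixed by the lecture.

```python
from itertools import product

def honest_proof(a):
    """Return p as a dict mapping each v in {0,1}^{n^2} to a bit.

    v is flattened row-major, so v[i*n + j] plays the role of v_{ij}
    (0-based indices).
    """
    n = len(a)
    table = {}
    for v in product((0, 1), repeat=n * n):
        table[v] = sum(a[i] * a[j] * v[i * n + j]
                       for i in range(n) for j in range(n)) % 2
    return table

# n = 2, so the proof has 2^(n^2) = 16 entries.
p = honest_proof((1, 0))
```

With a = (1, 0) the only nonzero product a_i a_j is a_1 a_1, so p(v) is
just v_11.  The table has 2^(n^2) entries, which is why the proof is
exponentially long while the verifier reads only a constant number of
its bits.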

LINEARITY TEST

A function f: {0,1}^N -> {0,1} is linear if there exists an r in
{0,1}^N such that f(x) is the inner product of x and r.  That is,

f(x_1,...,x_N) = Sum_{i = 1}^N r_i x_i.

Two functions f and g from {0,1}^N -> {0,1} have distance d if they
disagree on a d fraction of the points of {0,1}^N.  The following test
checks whether p is close to a linear function (for our proof p, N =
n^2).

-- Choose random v1 and v2 in {0,1}^N.

-- Query p(v1), p(v2), and p(v1+v2).

-- Accept iff p(v1) + p(v2) = p(v1 + v2).

Clearly, if p is linear, then the prover always passes the test.

Theorem: If p has distance at least e from every linear function, then
the linearity test rejects with probability at least e.
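
The three-query test above can be sketched in Python as follows.  The
function names and the two example tables are assumptions chosen for
illustration; rejection_probability enumerates all pairs (v1, v2)
instead of sampling, which is feasible only for very small N.

```python
import random
from itertools import product

def xor_vec(u, v):
    """Coordinate-wise sum over Z_2."""
    return tuple(x ^ y for x, y in zip(u, v))

def linearity_test(p, N, rng):
    """One run of the test: three queries to p, accept iff they add up."""
    v1 = tuple(rng.randrange(2) for _ in range(N))
    v2 = tuple(rng.randrange(2) for _ in range(N))
    return (p[v1] ^ p[v2]) == p[xor_vec(v1, v2)]

def rejection_probability(p, N):
    """Exact rejection probability, by enumerating all pairs (v1, v2)."""
    points = list(product((0, 1), repeat=N))
    bad = sum((p[v1] ^ p[v2]) != p[xor_vec(v1, v2)]
              for v1 in points for v2 in points)
    return bad / len(points) ** 2

N = 4
r = (1, 0, 1, 1)
# A linear table: inner product with r.
linear = {v: sum(ri * vi for ri, vi in zip(r, v)) % 2
          for v in product((0, 1), repeat=N)}
# A table that is not linear: the AND of the first two coordinates.
nonlinear = {v: v[0] & v[1] for v in product((0, 1), repeat=N)}

rng = random.Random(0)
always_passes = all(linearity_test(linear, N, rng) for _ in range(200))
reject_linear = rejection_probability(linear, N)
reject_nonlinear = rejection_probability(nonlinear, N)
```

For the AND table the test rejects exactly when a_1 b_2 + b_1 a_2 = 1
(writing v1 = a, v2 = b), which happens for 6 of the 16 settings of
those four bits, so reject_nonlinear is 3/8.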

CONSISTENCY TEST

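Suppose p passes the linearity test with high probability.  Then it is
within distance e of a linear function, for some small e.  If e < 1/4,
there is a unique linear function f within distance e from p, because
two distinct linear functions disagree on exactly half of the points.
Since f is linear, it is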

f(v) = Sum_{i,j=1}^n l_{ij} v_{ij}

for some l_{ij}s.  We want to check that the l_{ij}s are consistent.
View these as a matrix.

M = (l_ij)

Let l = (l_11, l_22, ..., l_nn) be the row vector of diagonal entries.
Consistency is equivalent to checking that l^T l = M.

Claim: Let M and M' be two unequal nxn matrices over Z_2.  Then, for x
and y chosen uniformly at random from {0,1}^n,

Pr_{x,y} [xMy^T = xM'y^T] <= 3/4.

Proof: xMy^T - xM'y^T = x(M-M')y^T and M" = M - M' is a nonzero matrix.
The probability that M"y^T = 0 is at most 1/2, since some row of M" is
nonzero and its inner product with a random y is 1 with probability
exactly 1/2.  If M"y^T != 0, the probability that xM"y^T = 0 is
exactly 1/2.  So the desired probability is at most 3/4.

End Proof
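
As a sanity check of the claim (not a replacement for the proof), the
sketch below enumerates every pair of distinct 2x2 matrices over Z_2
and every (x, y).  The helper names bilinear and agreement_fraction
are assumptions for illustration.

```python
from itertools import product

def bilinear(x, M, y):
    """Compute x M y^T over Z_2."""
    n = len(x)
    return sum(x[i] * M[i][j] * y[j]
               for i in range(n) for j in range(n)) % 2

def agreement_fraction(M, M2):
    """Fraction of pairs (x, y) on which xMy^T and xM2y^T agree."""
    n = len(M)
    vecs = list(product((0, 1), repeat=n))
    agree = sum(bilinear(x, M, y) == bilinear(x, M2, y)
                for x in vecs for y in vecs)
    return agree / len(vecs) ** 2

n = 2
# All 16 matrices over Z_2, each as a tuple of rows.
matrices = [tuple(tuple(bits[i * n:(i + 1) * n]) for i in range(n))
            for bits in product((0, 1), repeat=n * n)]
worst = max(agreement_fraction(A, B)
            for A in matrices for B in matrices if A != B)
```

Here worst comes out to exactly 3/4; the bound is attained when M - M'
has rank 1, so the constant in the claim cannot be improved in general.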

Now the consistency check is to verify whether:

xMy^T = x(l^T l)y^T.

How do we compute xMy^T?  

xMy^T = Sum_{ij} l_{ij} x_i y_j.

So if we set v_ij = x_i y_j, we can query f(v).

How do we compute x(l^T l)y^T = (xl^T)(ly^T)?

xl^T = f(v) with v_ii = x_i and v_ij = 0 for i != j.  Similarly, we
can compute ly^T.

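But we do not have access to f, only to p.  But p is close to f.  So
we can compute f(v) by choosing a random r and computing f(v) = p(r) +
p(r+v).  Each of r and r+v is uniformly distributed, so each query
hits a point where p and f differ with probability at most e.  Hence
with probability at least 1 - 2e, the value computed is indeed the
desired value.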
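
Self-correction and the consistency check can be sketched together as
below.  All function names, the row-major flattening of v, and the two
example proofs are assumptions for illustration; the "bad" proof is an
exactly linear function with l_12 = 1 and all other coefficients 0,
which is linear but not consistent.

```python
import random
from itertools import product

def honest_proof(a):
    """Table of p(v) = Sum a_i a_j v_ij, with v flattened row-major."""
    n = len(a)
    return {v: sum(a[i] * a[j] * v[i * n + j]
                   for i in range(n) for j in range(n)) % 2
            for v in product((0, 1), repeat=n * n)}

def self_correct(p, v, rng):
    """Estimate f(v) as p(r) + p(r + v) for a uniformly random r."""
    r = tuple(rng.randrange(2) for _ in range(len(v)))
    r_plus_v = tuple(ri ^ vi for ri, vi in zip(r, v))
    return p[r] ^ p[r_plus_v]

def consistency_test(p, n, rng):
    """Accept iff x M y^T = (x l^T)(l y^T) for random x, y."""
    x = [rng.randrange(2) for _ in range(n)]
    y = [rng.randrange(2) for _ in range(n)]
    # v_ij = x_i y_j gives f(v) = x M y^T.
    outer = tuple(x[i] & y[j] for i in range(n) for j in range(n))
    # v_ii = x_i (resp. y_i), zero off the diagonal: f(v) = x l^T (resp. l y^T).
    diag_x = tuple(x[i] if i == j else 0
                   for i in range(n) for j in range(n))
    diag_y = tuple(y[i] if i == j else 0
                   for i in range(n) for j in range(n))
    lhs = self_correct(p, outer, rng)
    rhs = self_correct(p, diag_x, rng) & self_correct(p, diag_y, rng)
    return lhs == rhs

rng = random.Random(0)
n = 2
good = honest_proof((1, 1))
bad = {v: v[1] for v in product((0, 1), repeat=n * n)}  # f(v) = v_12
good_accepts = sum(consistency_test(good, n, rng) for _ in range(400))
bad_rejects = sum(not consistency_test(bad, n, rng) for _ in range(400))
```

For the bad proof, M has a single 1 in position (1,2) while l = 0, so
the test rejects whenever x_1 y_2 = 1, i.e. with probability 1/4 per
run.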

SATISFIABILITY TEST

Assume p is close to a linear function f which is consistent.  The
proof p encodes an assignment a_i = l_ii, i = 1,...,n.  We want to
check whether this assignment indeed satisfies the equations.

c^k + Sum_ij c_ij^k x_i x_j = 0, k = 1 to m

Let y_k be the value of the left-hand side of the k-th equation under
this assignment, and let y = (y_1,...,y_m).  The assignment is
satisfying iff y = 0.  We will test whether the inner product y*r is 0
for a random r in {0,1}^m.  If so, accept; otherwise reject.

Choosing a random r is the same as selecting a random subset S of the
equations and looking at the sum

Sum_{k in S} y_k = Sum_{k in S} (c^k + Sum_ij c_ij^k a_i a_j)

= (Sum_{k in S} c^k) + (Sum_{k in S} Sum_ij c_ij^k a_i a_j)

= (Sum_{k in S} c^k) + (Sum_ij (Sum_{k in S} c_ij^k) a_i a_j)

Set v_ij = Sum_{k in S} c_ij^k, query f(v) (using self-correction),
and accept iff f(v) = (Sum_{k in S} c^k).
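
A minimal sketch of this test, under the same illustrative assumptions
as before (hypothetical function names, coefficients stored as
coeffs[k][i][j], v flattened row-major), run on a toy system with
equations x_1 x_2 = 1 and x_1 x_1 = 1:

```python
import random
from itertools import product

def honest_proof(a):
    """Table of p(v) = Sum a_i a_j v_ij, with v flattened row-major."""
    n = len(a)
    return {v: sum(a[i] * a[j] * v[i * n + j]
                   for i in range(n) for j in range(n)) % 2
            for v in product((0, 1), repeat=n * n)}

def self_correct(p, v, rng):
    """Estimate f(v) as p(r) + p(r + v) for a uniformly random r."""
    r = tuple(rng.randrange(2) for _ in range(len(v)))
    r_plus_v = tuple(ri ^ vi for ri, vi in zip(r, v))
    return p[r] ^ p[r_plus_v]

def satisfiability_test(p, coeffs, rhs, n, rng):
    """Pick a random subset S of equations and check their sum."""
    m = len(coeffs)
    S = [k for k in range(m) if rng.randrange(2)]
    # v_ij = Sum_{k in S} c_ij^k over Z_2.
    v = tuple(sum(coeffs[k][i][j] for k in S) % 2
              for i in range(n) for j in range(n))
    target = sum(rhs[k] for k in S) % 2
    return self_correct(p, v, rng) == target

coeffs = [[[0, 1], [0, 0]],
          [[1, 0], [0, 0]]]
rhs = [1, 1]
rng = random.Random(0)
sat_proof = honest_proof((1, 1))     # (1, 1) satisfies both equations
unsat_proof = honest_proof((1, 0))   # (1, 0) violates the first one
sat_accepts = sum(satisfiability_test(sat_proof, coeffs, rhs, 2, rng)
                  for _ in range(300))
unsat_rejects = sum(not satisfiability_test(unsat_proof, coeffs, rhs, 2, rng)
                    for _ in range(300))
```

Both example proofs are exactly linear, so self-correction never errs
here.  The proof for (1, 0) is rejected exactly when the first equation
is in S, which happens with probability 1/2.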

Suppose the system is not satisfiable.  Then y != 0, so y*r = 1 with
probability 1/2 over the choice of r.  Hence, if the test computes
f(v) correctly, the prover fails the test with probability at least
1/2.  The probability that self-correction computes f(v) incorrectly
is at most 2e (since p is at most distance e from f and self-correction
makes two queries).  So the overall probability of acceptance is at
most 2e + 1/2.