DOKK Library

Satisfying Degree-$d$ Equations over $GF[2]^n$

Author: Johan Håstad

License: CC-BY-3.0
                         THEORY OF COMPUTING, Volume 9 (27), 2013, pp. 845–862
                                       www.theoryofcomputing.org

                   SPECIAL ISSUE: ANALYSIS OF BOOLEAN FUNCTIONS



    Satisfying Degree-d Equations over GF[2]^n
                                                    Johan Håstad∗
                  Received June 20, 2012; Revised July 15, 2013; Published November 27, 2013




       Abstract: We study the problem where we are given a system of polynomial equations
       defined by multivariate polynomials over GF[2] of fixed, constant, degree d > 1 and the aim
       is to satisfy the maximum number of equations. A random assignment approximates this
       number within a factor 2^{−d} and we prove that for any ε > 0, it is NP-hard to obtain a ratio
       2^{−d} + ε. When considering instances that are perfectly satisfiable we give a polynomial-time
       algorithm that finds an assignment that satisfies a fraction 2^{1−d} − 2^{1−2d} of the constraints and
       we prove that it is NP-hard to do better by an arbitrarily small constant. The hardness results
       are proved in the form of inapproximability results for MAX-CSPs where the predicate
       in question has the desired form and we give some immediate results on approximation
       resistance of some predicates.

ACM Classification: F.2.2
AMS Classification: 68Q17
Key words and phrases: polynomials, probabilistically checkable proofs, constraint satisfaction, approximation

1     Introduction
The study of polynomial equations is a basic question of mathematics. In this paper we study a problem
we call MAX-d-EQ where we are given a system of m equations of degree d in n variables over GF[2].
As we consider the case of d constant, all polynomials are given in the dense representation, i. e., by a list
of monomials. Many problems can be coded as polynomial equations and in particular it is easy to code
     A conference version of this paper appeared in the proceedings of APPROX 2011 [6].
    ∗ Supported by ERC grant 226 203.



© 2013 Johan Håstad
Licensed under a Creative Commons Attribution License (CC-BY)                        DOI: 10.4086/toc.2013.v009a027

3-SAT as equations of degree 3 and thus determining whether we can simultaneously satisfy all equations
is NP-complete. It is hence natural to study the question of satisfying the maximum number of equations
and our interest turns to approximation algorithms. We say that an algorithm is a C-approximation
algorithm if it always returns a solution which satisfies at least C · OPT equations where OPT is the
number of equations satisfied by the optimal solution.
     For the problem under study, a random assignment satisfies each equation with probability at least
2^{−d} and hence obtaining an approximation within this factor must be considered folklore. When it comes
to inapproximability results, it was established early on by Håstad et al. [7] that for d = 2 (and implicitly
for any d ≥ 2) it is NP-hard to get a constant better than 1/2. Given the importance of the problem it is,
however, natural to try to determine the exact approximability of the problem and this is the purpose of
this paper.
     The result of [5] proves that the optimal approximability constant for linear equations (d = 1) is 1/2.
This approximability is obtained by simply picking a random assignment independently of the equations
at hand. To prove tightness it is established that for any ε > 0, it is NP-hard to approximate the answer
better than within a factor 1/2 + ε. This is proved by constructing a suitable Probabilistically Checkable
Proof (PCP). It turns out that these results extend almost immediately to the higher degree case giving the
optimal constant 2^{−d} for degree-d equations. We proceed to study the case when all equations can be
simultaneously satisfied.
     In the case of linear equations, it follows by Gaussian elimination that once it is possible to satisfy all
equations one can efficiently find such a solution. The situation for higher degree equations turns out to
be more interesting. Any implied affine condition can be used to eliminate a variable but this turns out
to be the limit of what can be achieved. To be more precise, from a characterization of the low-weight
codewords of Reed-Muller codes by Kasami and Tokura [8], it follows that any equation satisfied by a
fraction lower than 2^{1−d} − 2^{1−2d} must imply an affine condition. This number turns out to be the sharp
threshold of approximability for satisfiable instances of systems of degree-d equations.
     The upper bound is obtained by using implied affine conditions to eliminate variables and then
essentially choosing an assignment at random. To make this algorithm deterministic polynomial time we
use a pseudo-random generator of Viola [11].
     The lower bound is proved by constructing a PCP very much inspired by [5] and indeed nothing in
the current paper relies on material not known at the time of that paper. In particular, we prove standard
NP-hardness results and do not use any sophisticated results in harmonic analysis.
     It is interesting to note that in the lower bound construction we only use polynomials each of which
depends on a constant number of variables (in fact only 6d variables) while our algorithms work even if
each polynomial depends on all variables.
     As a by-product of our proofs we make some observations in the area of Maximum Constraint
Satisfaction Problems (MAX-CSPs). The problem MAX-P is given by a predicate P of arity k and an
instance is given by a sequence of k-tuples of literals. The task is to find an assignment such that the
maximum number of the resulting k-tuples of bits satisfy P. Let r(P) be the probability that a uniformly
random assignment satisfies P. Note that r(P) is the approximation ratio achieved by the algorithm
that simply picks a uniformly random assignment independent of the instance under consideration. A
predicate for which you cannot do significantly better (this is formally defined in the next section) in
polynomial time, unless P = NP, is said to be approximation resistant. This property is equivalent to, for


any ε > 0, it being NP-hard to distinguish instances of MAX-P where you can satisfy a fraction 1 − ε of
the constraints from those where you can satisfy only a fraction r(P) + ε.
    Given this formulation it is natural to formulate a stronger notion. Namely, that it is NP-hard to
distinguish instances of MAX-P where all constraints can be satisfied simultaneously from those where
only a fraction r(P) + ε of the constraints can be satisfied simultaneously. In this case P is called
approximation resistant on satisfiable instances.
    It is technically easier to prove approximation resistance while the extension to satisfiable instances
might be more complicated, unknown, or simply false. One example to distinguish the two notions is
parity of at least three variables which is known to be approximation resistant [5] but which is easy for
satisfiable instances due to Gaussian elimination.
    In the current paper we extend this to give an example of a predicate which is approximation resistant
on satisfiable instances but which has a different, and still non-trivial, approximation ratio for satisfiable
instances.
    An outline of the paper is as follows. In Section 2 we give some preliminaries and the rather easy
result for imperfect completeness is given in Section 3. The most technically interesting part of the
paper is given in Section 4 where we study systems of equations where all equations can be satisfied
simultaneously. We give the results on MAX-CSPs in Section 5 and end with some final remarks in
Section 6.
    This is the full version of the conference paper [6].


2    Preliminaries
We are interested in polynomials over GF[2]. Most polynomials we use are of degree d, but also
polynomials of degree one, which we call “affine forms,” play a special role. We are not interested in the
polynomials as formal polynomials, but rather as functions mapping GF[2]^n to GF[2] and hence we freely
use that x_i² = x_i and thus any monomial in our polynomials can be taken to be multi-linear. We start with
the following standard result which, for completeness, we prove.
Theorem 2.1. Any multivariate polynomial P of degree d that is nonzero takes the value 1 for at least a
fraction 2^{−d} of the inputs.
Proof. The proof is by induction over n and d, with the base case of d = 1 which is true as each linear
polynomial is unbiased.
    For the induction step, suppose P(x) = P0 (x) + x1 P1 (x) and let us consider what happens for the two
possible values of x1 . If both P0 and P0 + P1 are non-zero we are done by induction. If not, as P1 is of
degree at most d − 1, the polynomial of P0 and P0 + P1 that is non-zero is of degree at most d − 1. Hence
this polynomial takes the value 1 for at least a fraction 2^{1−d} of its inputs. As the set of inputs of this
polynomial constitutes half of the inputs of P, the result follows also in this case.
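For small parameters, Theorem 2.1 can be verified by exhaustive search. The sketch below (function name and the set-of-monomials representation are our own illustration, not from the paper) enumerates every nonzero multilinear polynomial of degree at most d in n variables over GF(2) and checks the 2^{−d} bound.

```python
from itertools import combinations, product

def ones_fraction_ok(n, d):
    """Exhaustively check Theorem 2.1: every nonzero multilinear
    polynomial of degree <= d in n variables over GF(2) takes the
    value 1 on at least a 2^(-d) fraction of the 2^n inputs."""
    # All multilinear monomials of degree <= d, as frozensets of indices;
    # the empty frozenset is the constant-1 monomial.
    monomials = [frozenset(c) for k in range(d + 1)
                 for c in combinations(range(n), k)]
    points = list(product((0, 1), repeat=n))
    for mask in range(1, 2 ** len(monomials)):   # every nonzero polynomial
        poly = [m for i, m in enumerate(monomials) if (mask >> i) & 1]
        ones = sum(sum(all(x[v] for v in mon) for mon in poly) % 2
                   for x in points)
        if ones * 2 ** d < 2 ** n:                # fraction of ones < 2^(-d)?
            return False
    return True

print(ones_fraction_ok(3, 2))
```

The tight case P(x) = x_1 ⋯ x_d shows up as equality: it is 1 on exactly a 2^{−d} fraction of the inputs.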
    It is not difficult to see that this result is tight by considering P(x) = ∏_{i=1}^{d} x_i, or, more generally,
products of d linearly independent affine forms. It is important for us that these are the only cases of
tightness. This follows from a characterization by Kasami and Tokura [8] of all polynomials that are
non-zero for at most a fraction 2^{1−d} of the inputs. A consequence of their characterization is the following
theorem.


Theorem 2.2. Let P be a degree-d polynomial over GF[2] which factors as

                                           P(x) = Q(x) ∏_{i=1}^{r} A_i(x)

where the A_i are linearly independent affine forms and Q does not contain any affine factor. Then the fraction
of points on which P(x) = 1 is at least

                                        2^{−r} (2^{1−(d−r)} − 2^{1−2(d−r)}) ,

if d ≠ r, and 2^{−d} if d = r.
    Given that we do not need their full characterization and the proof of the part that we need is shorter
than the proof of the full characterization, we give the proof of Theorem 2.2 in an appendix.
    We make use of the Fourier transform and, as we are dealing with polynomials over GF[2], we let the
inputs come from {0,1}^n. For any α ⊆ [n] we have the character χ_α defined by

                                            χ_α(x) = (−1)^{∑_{i∈α} x_i}

and the Fourier expansion of a real-valued function f is given by

                                           f(x) = ∑_{α⊆[n]} f̂_α χ_α(x) .

Suppose that R ≤ L and we are given a projection π mapping [L] to [R]. We define a related operator
π2 acting on sets as follows. For β ⊆ [L] we let π2(β) be the set of elements in [R] with an odd number
of preimages under π that belong to β. The reason this is a useful definition for us is that if we have
x ∈ {0,1}^R and define y ∈ {0,1}^L by setting y_i = x_{π(i)} then χ_β(y) = χ_{π2(β)}(x).
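The operator π2 and the identity χ_β(y) = χ_{π2(β)}(x) are easy to verify numerically. The following sketch (function names and the list representation of π are our own illustration, not from the paper) checks the identity exhaustively for one small projection.

```python
from itertools import product
from collections import Counter

def pi2(beta, proj):
    """pi2(beta): elements of the range with an odd number of
    preimages under proj that lie in beta."""
    counts = Counter(proj[j] for j in beta)
    return frozenset(i for i, c in counts.items() if c % 2 == 1)

def chi(alpha, x):
    """The character chi_alpha(x) = (-1)^(sum of x_i over i in alpha)."""
    return (-1) ** (sum(x[i] for i in alpha) % 2)

# Verify chi_beta(y) == chi_{pi2(beta)}(x) whenever y_j = x_{proj(j)},
# for every x and every beta, for one small projection [L] -> [R].
R, L = 2, 4
proj = [0, 1, 0, 1]                 # an arbitrary projection, illustration only
for x in product((0, 1), repeat=R):
    y = tuple(x[proj[j]] for j in range(L))
    for bits in product((0, 1), repeat=L):
        beta = frozenset(j for j in range(L) if bits[j])
        assert chi(beta, y) == chi(pi2(beta, proj), x)
```

The check works because coordinates of [R] hit an even number of times by β contribute nothing to the exponent mod 2.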
    As is standard we use the long code introduced by Bellare et al. [4]. If v ∈ [L] then the corresponding
long code is a function A : {0,1}^L → {−1,1} where A(x) = (−1)^{x_v}. We require our tables to be folded,
which means that they only contain values for inputs with x_1 = 1. The value when x_1 = 0 is defined to be
−A(x̄). This ensures that the function is unbiased and hence that the Fourier coefficient corresponding to
the empty set is 0.
    We are interested in Maximum Constraint Satisfaction Problems (MAX-CSPs) given by a predicate
P of some constant arity k. An instance is given by a set of k-tuples of literals and the task is to find an
assignment that maximizes the number of the resulting k-tuples of bits that satisfy the predicate P. As
mentioned in the introduction we let r(P) be the probability that a random assignment satisfies P. We
have the following two definitions mentioned briefly in the introduction.
Definition 2.3. A predicate P is approximation resistant if, for any ε > 0, it is NP-hard to approximate
MAX-P within r(P) + ε.
Definition 2.4. A predicate P is approximation resistant on satisfiable instances if, for any ε > 0, it is
NP-hard to distinguish instances of MAX-P where all constraints can be satisfied simultaneously from
those where only a fraction r(P) + ε of the constraints can be satisfied simultaneously.
     We define P^L to be the predicate of arity 3k defined on (x_j^i), 1 ≤ i ≤ 3 and 1 ≤ j ≤ k, by setting
y_j = x_j^1 + x_j^2 + x_j^3 and defining P^L(x) to be true iff P(y) is true. Note that if P is given by a polynomial
equation of degree d then so is P^L. By slightly abusing language, in this situation we do not distinguish
the predicate P from the polynomial defining the predicate.


3    The case of imperfect completeness
We start with the algorithm.
Theorem 3.1. Given a system of m polynomial equations of degree d over GF[2], it is possible, in
polynomial time, to find an assignment that satisfies at least m·2^{−d} equations.
Proof. In fact, by Theorem 2.1, a random assignment satisfies each equation with probability at least 2^{−d}
and thus just picking a random assignment gives a randomized algorithm fulfilling the claim of the theorem
in expectation.
    To get a deterministic algorithm we use the method of conditional expectations. The idea is to assign
values to the variables in order and we choose values for the variables such that the expected number
of satisfied equations, if the remaining variables are set randomly, never drops below m·2^{−d}. When all
variables are set, any equation is satisfied with probability either 0 or 1 and hence at least m·2^{−d} equations
are satisfied.
    To make this procedure efficient we, at each step, use the lower estimate that at least a fraction 2^{−d′} of
the inputs satisfy any nontrivial equation that is currently of degree d′. For trivial equations we naturally
give the score 1 if they are satisfied and 0 if they are falsified. Looking at the proof of Theorem 2.1 we
can set the variables one by one making sure that this lower bound estimate never decreases. The value
of this lower bound is initially at least m·2^{−d} and at the end it is exactly the number of satisfied equations. The
theorem follows.
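The derandomization in the proof can be sketched concretely. The following Python is an illustration under our own conventions (a GF(2) polynomial is a set of monomials, each a frozenset of variable indices, with XOR semantics; all names are ours): it fixes variables one by one so that the summed lower estimate, 2^{−d′} per nontrivial equation, never decreases.

```python
def set_var(poly, i, b):
    """Substitute x_i = b into a GF(2) polynomial given as a set of
    monomials (frozensets of variable indices), with XOR semantics."""
    out = set()
    for mon in poly:
        if i in mon:
            if b == 1:
                out ^= {mon - {i}}   # x_i = 1: the monomial loses x_i
            # b = 0: the monomial vanishes
        else:
            out ^= {mon}
    return out

def score(poly):
    """Lower estimate on Pr[poly(x) = 1] over the unset variables:
    1 for the constant 1, 0 for the constant 0, else 2^(-degree)."""
    if not poly:
        return 0.0
    if poly == {frozenset()}:
        return 1.0
    return 2.0 ** -max(len(m) for m in poly)

def greedy_assignment(polys, n):
    """Method of conditional expectations: fix x_0, ..., x_{n-1} one by
    one, always choosing the value that keeps the summed estimate highest."""
    cur = [set(p) for p in polys]
    assignment = []
    for i in range(n):
        options = []
        for b in (0, 1):
            trial = [set_var(p, i, b) for p in cur]
            options.append((sum(score(p) for p in trial), b, trial))
        s, b, cur = max(options)          # the better branch never drops the estimate
        assignment.append(b)
    satisfied = sum(1 for p in cur if p == {frozenset()})
    return assignment, satisfied
```

The guarantee follows the proof of Theorem 2.1: when x_i is fixed, either both branches keep degree at most d′, or one branch is zero and the other has degree at most d′ − 1 with twice the estimate, so the better branch never decreases the total.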

    The lower bound follows rather immediately from known results.
Theorem 3.2. For any ε > 0 it is NP-hard to approximate MAX-d-EQ within 2^{−d} + ε.
Proof. In [5] it is proved that, for any δ > 0, it is NP-hard to distinguish systems of linear equations
where a fraction 1 − δ of the equations can be satisfied from those where only a fraction 1/2 + δ can be
satisfied. Suppose we are given an instance of this problem with m equations which, possibly by adding
one to both sides of the equation, can be assumed to be of the form

                                       A_i(x) = 1 ,       1 ≤ i ≤ m.                                      (3.1)
Taking all d-wise products of equations from (3.1) we end up with m^d equations, each of the form

                                         ∏_{j=1}^{d} A_{i_j}(x) = 1 ,

which clearly is a polynomial equation of degree at most d. This system has the same optimal solution as
the linear system and if an assignment satisfies τm linear equations, then it satisfies τ^d m^d degree-d equations. As it is
NP-hard to determine whether τ = 1 − δ or τ = 1/2 + δ, it is NP-hard to approximate the optimal value
for the degree-d system with a ratio better than

                                       ( (1/2 + δ) / (1 − δ) )^d

and by making δ sufficiently small this can be made smaller than 2^{−d} + ε. The theorem follows.
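The counting step of this reduction is easy to check numerically: an assignment satisfying a fraction τ of the linear equations satisfies exactly the fraction τ^d of the m^d product equations. The sketch below (representation and names are our own illustration) verifies this on a toy system.

```python
from itertools import product

def affine_eval(form, x):
    """Evaluate an affine form given as (variables, constant):
    the sum of x_i over the variables, plus the constant, mod 2."""
    vars_, c = form
    return (sum(x[i] for i in vars_) + c) % 2

def product_system_fraction(forms, d, x):
    """Fraction of the m^d product equations prod_{j=1}^d A_{i_j}(x) = 1
    (one for each tuple (i_1, ..., i_d)) that x satisfies."""
    m = len(forms)
    vals = [affine_eval(f, x) for f in forms]
    good = sum(1 for t in product(range(m), repeat=d)
               if all(vals[i] for i in t))
    return good / m ** d

# Toy system: x0 = 1, x1 = 1, x0 + x1 + 1 = 1.
forms = [({0}, 0), ({1}, 0), ({0, 1}, 1)]
# x = (1, 0) satisfies 1 of 3 linear equations, hence (1/3)^2 of the products.
assert abs(product_system_fraction(forms, 2, (1, 0)) - (1 / 3) ** 2) < 1e-12
```

A product equation holds iff every factor holds, so the satisfied tuples are exactly the d-tuples drawn from the τm satisfied linear equations.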


    We remark that, by appealing to the results by Moshkovitz and Raz [9], we can even obtain results
for non-constant values of ε.


4     Completely satisfiable systems
When studying systems where it is possible to simultaneously satisfy all equations the situation changes;
let us start with the positive results.

4.1   Finding a good assignment
Suppose we have an equation of the form P(x) = 1 and that this equation implies the affine condition
A(x) = 1. Then, as the system is satisfiable, we can use this equation to eliminate one variable from the
system, preserving the degrees of all equations. This is done by taking some variable xi that appears in A
and replacing it by xi + A(x) + 1 (note that this function does not depend on xi as the two occurrences of
this variable cancel). This substitution preserves the satisfiability of the system and the process stops only
when none of the current equations implies an affine condition.
     Using Theorem 2.2 we see that when this process ends each equation is satisfied by at least a fraction
2^{1−d} − 2^{1−2d} of the inputs and thus the following theorem should not come as a surprise.
Theorem 4.1. There is a polynomial time algorithm that, given a system of m simultaneously satisfiable
equations of degree d over GF[2], finds an assignment that satisfies at least ⌈(2^{1−d} − 2^{1−2d})m⌉ equations.
Proof. There are two points in the outlined argument that require closer inspection. The first is the
question of how to actually determine whether a polynomial equation implies an affine condition and the
second is to make sure that once the process of finding implied affine conditions has ended we can indeed
deterministically find a solution that satisfies the expected number of equations. Let us first address the
issue of determining whether a given equation implies an affine condition.
    Suppose P(x) = 1 implies A(x) = 1 for some unknown affine function A. Let us assume that x1
appears in A with a nonzero coefficient. We may write

                                           P(x) = P0 (x) + P1 (x)x1

where neither P0 nor P1 depends on x1 . We want to prove that P(x) = A(x)P1 (x) and towards this consider

                                       Q(x) = P(x) + A(x)P1 (x) .                                        (4.1)

As x_1 appears with coefficient one in A it follows that Q does not depend on x_1; let us assume that Q
is not identically 0. Choose any values for x_2, x_3, . . . , x_n to make Q(x) = 1 and set x_1 to make A(x) = 0. It
follows from (4.1) that P(x) = 1 and thus we have found a counterexample to the assumed implication.
We can hence conclude that Q ≡ 0 and we have

                                             P(x) = A(x)P1 (x) .

    We claim furthermore that this procedure is entirely efficient. Namely given P and the identity of one
variable occurring in A, P1 is uniquely defined. Once P1 is determined the rest of the coefficients of A can


be found by solving a linear system of equations. As there are only n candidates for a variable in A and
solving a linear system of equations can be done in polynomial time we conclude that the entire process
of finding implied affine conditions can be done in polynomial time.
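As a sanity check of what "implied affine condition" means, one can search for such a condition by brute force. The sketch below is exponential in n and for illustration only (unlike the paper's polynomial-time procedure); the set-of-monomials representation of GF(2) polynomials and all names are our own assumptions.

```python
from itertools import product

def eval_poly(poly, x):
    """Evaluate a GF(2) polynomial given as a set of monomials
    (frozensets of variable indices; the empty frozenset is the constant 1)."""
    return sum(all(x[v] for v in mon) for mon in poly) % 2

def implied_affine(poly, n):
    """Brute force over all affine forms A (exponential in n): return
    (coeffs, const) such that P(x) = 1 implies A(x) = 1, or None.
    Assumes the equation P(x) = 1 is satisfiable."""
    ones = [x for x in product((0, 1), repeat=n) if eval_poly(poly, x)]
    for c in (0, 1):
        for coeffs in product((0, 1), repeat=n):
            if not any(coeffs):
                continue                      # skip constant "forms"
            if all((sum(a * xi for a, xi in zip(coeffs, x)) + c) % 2 == 1
                   for x in ones):
                return coeffs, c
    return None

# x0*x1 = 1 forces an affine condition (x0 = 1, and also x1 = 1) ...
assert implied_affine({frozenset({0, 1})}, 2) is not None
# ... while x0*x1 + x0 + x1 = 1 (true unless x0 = x1 = 0) implies none:
# a non-constant affine form is 1 on exactly half the points.
assert implied_affine({frozenset({0, 1}), frozenset({0}), frozenset({1})}, 2) is None
```

Once such a condition A(x) = 1 is found, one variable x_i appearing in A is eliminated by substituting x_i + A(x) + 1 for it, exactly as described above.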
     Once this process halts we need to find an assignment that satisfies the expected number of equations.
This could possibly be done in a manner similar to how it was done in the proof of Theorem 3.1, but
for the general case we do not know how to find, in deterministic polynomial time, a suitable bound for
the probability that an equation is satisfied. It is true that the bound of Theorem 2.2 can be calculated
deterministically, but as we are not able to prove it by induction, setting only one variable at a time, it is
not clear how to use it to guide an algorithm. We turn to a different approach and we need the following
result by Viola [11].
Theorem 4.2 ([11]). For any constant d and ε > 0 there exists a polynomial time pseudo-random
generator G with seed length O(d log n + d·2^d log(1/ε)) and output in {0,1}^n such that

                                   | Pr[P(x) = 0] − Pr[P(G(y)) = 0]| ≤ ε

for any polynomial P of degree d. Here x is a uniformly random string in {0, 1}n and y is a uniformly
random seed.
    This theorem implies that if we are willing to sacrifice a small ε it is enough to consider the candidate
assignments G(y) for all possible seeds y and to take the assignment from this set that satisfies the
maximum number of equations. Let us be more precise.
    We know that the expected number of satisfied equations when we consider a uniformly random
input is (2^{1−d} − 2^{1−2d})m and, by Theorem 4.2, the same number, when we consider a random output
from G, is at least (2^{1−d} − 2^{1−2d} − ε)m. Setting ε = 2^{−2d} m^{−1} and observing that the maximum number
of equations satisfied is an integer we see that this number is at least

                                     ⌈(2^{1−d} − 2^{1−2d})m − 2^{−2d}⌉ .

Finally, as (2^{1−d} − 2^{1−2d})m is an integer multiple of 2^{1−2d}, we have that

                         ⌈(2^{1−d} − 2^{1−2d})m − 2^{−2d}⌉ = ⌈(2^{1−d} − 2^{1−2d})m⌉ ,

and the proof is complete.

    Note that the generator of Viola is only needed to derandomize the algorithm; had we been content
with a randomized algorithm we could simply replace the last step by random sampling, obtaining a
much more efficient algorithm.
    We turn to establishing that Theorem 4.1 is essentially tight by supplying a matching lower bound in
the next section.

4.2   Inapproximability results
In this section we establish the following lower bound result.
Theorem 4.3. For any ε > 0 it is NP-hard to distinguish satisfiable instances of MAX-d-EQ from those
where the optimal solution satisfies a fraction 2^{1−d} − 2^{1−2d} + ε of the equations.


Proof. Consider the predicate, Q, on 6d variables given by

                            ∏_{i=1}^{d} L_i(x) + ∏_{i=d+1}^{2d} L_i(x) = 1 ,                            (4.2)

where L_i(x) = x_{3i−2} + x_{3i−1} + x_{3i}, i. e., each L_i is the sum of three independent variables. Note that Q is
of the form P^L according to the definition given in Section 2 where P is defined by the equation

                             ∏_{i=1}^{d} y_i + ∏_{i=d+1}^{2d} y_i = 1 .                             (4.3)

Theorem 4.3 now follows from Theorem 4.4 below as the probability that a random assignment satisfies
P is exactly 2^{1−d} − 2^{1−2d}.
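The value r(P) = 2^{1−d} − 2^{1−2d} for the predicate in (4.3) follows because the two products are independent indicator bits, each 1 with probability 2^{−d}, and their XOR is 1 with probability 2·2^{−d}(1 − 2^{−d}). A short exhaustive computation (function name is ours, for illustration) confirms this:

```python
from itertools import product

def r_P(d):
    """Exact probability that a uniform y in {0,1}^(2d) satisfies
    prod_{i<=d} y_i + prod_{i>d} y_i = 1 over GF(2)."""
    count = 0
    for y in product((0, 1), repeat=2 * d):
        a = int(all(y[:d]))          # first product of d bits
        b = int(all(y[d:]))          # second product of d bits
        count += (a + b) % 2         # the equation holds iff a XOR b = 1
    return count / 4 ** d

# Matches 2^{1-d} - 2^{1-2d} exactly (all quantities are dyadic rationals).
for d in (1, 2, 3, 4):
    assert r_P(d) == 2 ** (1 - d) - 2 ** (1 - 2 * d)
```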

Theorem 4.4. The predicate Q defined by (4.2) is approximation resistant on satisfiable instances.

Proof. Below we give a general PCP where the acceptance criterion is given by a predicate of the type P^L.
The aim of the PCP is to prove the existence of a good labeling for a projective label cover problem;
let us start by defining this problem. For comparison we note that this is the same starting point as in [5]
but we formulate it in more modern terms.
     We are given a bipartite graph with vertices U and V. Each vertex u ∈ U should be given a label
ℓ(u) ∈ [L] and each vertex v ∈ V should be given a label ℓ(v) ∈ [R]. For each edge (u, v) there is a mapping
π_{u,v} and a labeling satisfies this edge iff π_{u,v}(ℓ(u)) = ℓ(v).
     As stated in [5] (and based on [1] and [10]) it is known that for any constant ε > 0 there are constant
values for L and R such that it is NP-hard to determine whether the optimal labeling satisfies all constraints
or only a fraction ε of the constraints. Using [9] one can extend this to non-constant size domains, but let
us ignore this point.
     As is standard, we transform the label cover instance into a PCP by long-coding a good assignment,
and for each vertex u we have a table gu (y) for y ∈ {0, 1}L , and similarly we have a table fv (x) for
x ∈ {0, 1}R for each v ∈ V . As mentioned in the preliminaries we assume that these tables are folded and
hence each table is unbiased.
     The PCP can use an arbitrary predicate of the form P^L of arity 3k (P is thus of arity k) and is also
parametrized by a probability distribution D on k-bit strings.

                                                        Test T_{P,D}

   1. Pick uniformly at random an edge (u, v) which comes with a projection constraint π_{u,v} : [L] → [R].

   2. Pick x^{(i)} ∈ {0,1}^R and y^{(i)} ∈ {0,1}^L uniformly at random, 1 ≤ i ≤ k.

   3. For each j ∈ [L] pick an element μ^{(j)} from the distribution D and construct z^{(i)} by setting

                                  z_j^{(i)} = x^{(i)}_{π_{u,v}(j)} + y_j^{(i)} + μ_i^{(j)}   (mod 2) .



   4. Read the 3k bits¹ corresponding to f_v(x^{(i)}), g_u(y^{(i)}), and g_u(z^{(i)}). Set

                                      w_i = XOR(f_v(x^{(i)}), g_u(y^{(i)}), g_u(z^{(i)}))

       for 1 ≤ i ≤ k and accept iff the k-bit string w satisfies P.

   We first have the easy completeness. Let E_D denote the expected value operator when the input is
chosen with distribution D.
Lemma 4.5. If there is a labeling that satisfies all the constraints in the underlying label cover instance
then there is a proof for T_{P,D} that makes the verifier accept with probability at least E_D[P(x)].

Proof. As indicated above, given a satisfying labeling for the label cover problem we use the long code
to construct a good proof for the PCP. In other words, if the node v is given the label ℓ(v) then f_v is set
to be the long code for ℓ(v) and similarly g_u is the long code of the label of u. It follows, more or less by
definition, that for such a proof the string w equals μ^{(ℓ(u))} and as this string was chosen according to the
distribution D, the lemma follows.
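The completeness calculation is mechanical enough to simulate. The sketch below (all names are ours) plays test T_{P,D} against honest long-code proofs; it uses the 0/1-bit convention with XOR written as addition mod 2, rather than the paper's ±1 tables, and ignores folding, which plays no role for completeness. With a satisfying labeling, w always comes out equal to μ^{(ℓ(u))}, hence distributed according to D.

```python
import random

def long_code(label, x):
    """Honest long code of `label` in the 0/1 convention: the bit x[label]."""
    return x[label]

def run_test(label_u, label_v, proj, sample_D, k, L, R):
    """One run of test T_{P,D} against honest long-code proofs;
    returns (w, mus) where mus[j] is mu^{(j)} in {0,1}^k."""
    xs = [[random.randint(0, 1) for _ in range(R)] for _ in range(k)]
    ys = [[random.randint(0, 1) for _ in range(L)] for _ in range(k)]
    mus = [sample_D() for _ in range(L)]
    zs = [[(xs[i][proj[j]] + ys[i][j] + mus[j][i]) % 2 for j in range(L)]
          for i in range(k)]
    w = [(long_code(label_v, xs[i]) + long_code(label_u, ys[i])
          + long_code(label_u, zs[i])) % 2 for i in range(k)]
    return w, mus

# Completeness: when proj(label_u) == label_v, the x and y terms cancel
# inside z, leaving w = mu^{(label_u)} on every run.
random.seed(0)
L, R, k = 6, 3, 4                       # toy sizes, illustration only
proj = [0, 1, 2, 0, 1, 2]
lu, lv = 4, proj[4]
for _ in range(200):
    w, mus = run_test(lu, lv, proj,
                      lambda: [random.randint(0, 1) for _ in range(k)], k, L, R)
    assert w == mus[lu]
```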

   Let us turn to soundness. Let m_D be the maximum absolute value of a Fourier coefficient of the
distribution D that corresponds to a non-constant character. In other words, we let

                                         m_D = max_{α ≠ 0^k} |E_D[χ_α(x)]| .                                      (4.4)

Note that m_D < 1 unless D is completely supported on an affine subspace of {0,1}^k.
Lemma 4.6. If the verifier in test T_{P,D} accepts with probability at least r(P) + ε then there is a labeling
in the label cover problem that satisfies at least a fraction c_k(1 − m_D)ε² of the constraints of this problem.
Here c_k > 0 is a constant depending only on k.
    Before we prove this lemma let us give some high-level intuition about the mechanisms behind the
proof. That the exclusive-or of several (at least three) bits is easy to analyze using the Fourier transform
was clear already in [5]. This is the reason why we work with P^L. The other important property needed is
that the entire construction is not linear. This was achieved by adding random noise in [5] while here we
have the distribution D with the key property that its support is not contained in an affine subspace.

Proof of Lemma 4.6. For notational convenience let us drop the subscripts on f, g and π. Expand the
predicate P by its multi-linear expansion. Since the constant term, P̂_∅, equals r(P), we conclude that given
the assumption of the lemma there are sets S1, S2 and S3, at least one of which is non-empty, such that

                 E[ ∏_{i∈S1} f(x^{(i)}) ∏_{i∈S2} g(y^{(i)}) ∏_{i∈S3} g(z^{(i)}) ] ≥ c_k ε ,                 (4.5)

for some constant c_k depending only on k.
    First note that if S2 ≠ S3 the expectation in (4.5) is zero as for any i in the symmetric difference we
get a factor g(y^{(i)}) or g(z^{(i)}) that is independent of the other factors and, as g is folded, the expectation of
   ¹We interpret −1 as the bit 1 and 1 as the bit 0.




such a term is 0. Note here that y^{(i)} and z^{(i)} are in fact symmetric and in particular if the value of y_j^{(i)} is
not constrained, then z_j^{(i)} is uniformly chosen and independent of all other variables.
    To get a non-zero value we also need S1 = S3 as otherwise negating x^{(i)} in the symmetric difference
we get canceling terms. Thus we need to study

                               E[ ∏_{i∈S} f(x^{(i)}) g(y^{(i)}) g(z^{(i)}) ] .                               (4.6)

Expanding each function by the Fourier transform we get the expectation

        E[ ∏_{i∈S} ( ∑_{α^i, β^i, γ^i} f̂_{α^i} ĝ_{β^i} ĝ_{γ^i} χ_{α^i}(x^{(i)}) χ_{β^i}(y^{(i)}) χ_{γ^i}(z^{(i)}) ) ] .        (4.7)

If we mentally expand this product of sums and look at the expectation of each term we see that terms
with γ^i ≠ β^i give contribution 0. This follows as if j ∈ β^i \ γ^i then y_j^{(i)} is independent of all other occurring
variables. By symmetry this argument applies to z_j^{(i)} if j ∈ γ^i \ β^i.
    If π2(β^i) ≠ α^i then, for any j in the symmetric difference, negating x_j^{(i)} we negate the resulting term
and we can conclude that also in this case the expectation is 0. To analyze the remaining terms, let μ_i
denote the vector (μ_i^{(j)})_{j=1}^{L}; then

    χ_{π2(β^i)}(x^{(i)}) χ_{β^i}(y^{(i)}) χ_{β^i}(z^{(i)}) = χ_{π2(β^i)}(x^{(i)}) χ_{β^i}(y^{(i)}) χ_{β^i}(x^{(i)}∘π + y^{(i)} + μ_i) = χ_{β^i}(μ_i) ,

where (x^{(i)}∘π)_j = x^{(i)}_{π(j)},

and thus (4.7) reduces to

                           E[ ∏_{i∈S} ( ∑_{β^i} f̂_{π2(β^i)} ĝ²_{β^i} χ_{β^i}(μ_i) ) ] .                           (4.8)

We have
$$\prod_{i\in S} \chi_{\beta^i}(\mu_i) = \prod_{j\in\cup_i \beta^i} (-1)^{\sum_i \mu_i^{(j)}} \tag{4.9}$$


where the sum in the exponent is over the set of i such that j ∈ β^i. As µ^(j) is chosen independently for different values of j and each factor in the product corresponds to a non-trivial character, the expectation of (4.9) is bounded, in absolute value, by
$$m_D^{|\cup_i \beta^i|} \le m_D^{\sum_{i\in S} |\beta^i|/k}\,,$$

and hence we can conclude from (4.5) that
$$\mathrm{E}_{u,v}\left[\prod_{i\in S}\ \sum_{\beta^i} \hat f_{\pi_2(\beta^i)}\, \hat g^2_{\beta^i}\, m_D^{|\beta^i|/k}\right] \ge c_k \varepsilon\,. \tag{4.10}$$



As S is nonempty and any factor is bounded from above by one we conclude that
$$\mathrm{E}_{u,v}\left[\sum_{\beta} \hat f_{\pi_2(\beta)}\, \hat g^2_{\beta}\, m_D^{|\beta|/k}\right] \ge c_k \varepsilon\,. \tag{4.11}$$

The Cauchy-Schwarz inequality implies that
$$\sum_{\beta} \hat f_{\pi_2(\beta)}\, \hat g^2_{\beta}\, m_D^{|\beta|/k} \le \left(\sum_{\beta} \hat g^2_{\beta}\right)^{1/2} \left(\sum_{\beta} \hat f^2_{\pi_2(\beta)}\, \hat g^2_{\beta}\, m_D^{2|\beta|/k}\right)^{1/2} \tag{4.12}$$
$$\le \left(\sum_{\beta} \hat f^2_{\pi_2(\beta)}\, \hat g^2_{\beta}\, m_D^{2|\beta|/k}\right)^{1/2}. \tag{4.13}$$

Using (4.11), and E[X^2] ≥ E[X]^2, we conclude that
$$\mathrm{E}_{u,v}\left[\sum_{\beta} \hat f^2_{\pi_2(\beta)}\, \hat g^2_{\beta}\, m_D^{2|\beta|/k}\right] \ge c_k^2 \varepsilon^2\,. \tag{4.14}$$

    We now extract a probabilistic labeling using the standard procedure. For each u we choose a set β with probability ĝ²_β and return a random element in β. Similarly for each v we choose a set α with probability f̂²_α and return a random element in α. The expected fraction of satisfied constraints under this strategy is clearly at least
$$\mathrm{E}_{u,v}\left[\sum_{\beta} \hat f^2_{\pi_2(\beta)}\, \hat g^2_{\beta}\, \frac{1}{|\beta|}\right]. \tag{4.15}$$
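In code, this standard rounding step might look as follows (a sketch with hypothetical names; `fourier_sq` maps each set to its squared Fourier coefficient, and we assume folding so that the empty set carries no weight):

```python
import random

def sample_label(fourier_sq):
    """Choose a set with probability equal to its squared Fourier
    coefficient, then return a uniformly random element of it."""
    sets = list(fourier_sq)
    weights = [fourier_sq[s] for s in sets]
    beta = random.choices(sets, weights=weights)[0]
    return random.choice(sorted(beta))

random.seed(0)
# toy Fourier weights for one vertex (squared coefficients sum to 1)
weights = {frozenset({1, 2}): 0.75, frozenset({3}): 0.25}
assert sample_label(weights) in {1, 2, 3}
```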

We have that e^{m_D−1} ≥ m_D for any 0 ≤ m_D ≤ 1 and te^{−t} ≤ e^{−1} for all t ≥ 0, and using these two inequalities we have
$$\frac{1}{x} \ge \frac{2e(1-m_D)}{k}\, e^{2x(m_D-1)/k} \ge \frac{2e(1-m_D)}{k}\, m_D^{2x/k}$$
for any x ≥ 1. Applying this last inequality with x = |β|, inserting this into (4.15), and comparing to (4.14) we see that the expectation of (4.15) is at least
$$\frac{2e(1-m_D)\, c_k^2}{k}\, \varepsilon^2\,.$$
Finally, adjusting the value of ck , this completes the proof of Lemma 4.6.
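The two elementary inequalities in the chain above are easy to confirm numerically; a small sanity check (parameter ranges chosen arbitrarily, not part of the proof):

```python
import math
import random

# Checks 1/x >= (2e(1-m)/k) e^{2x(m-1)/k} >= (2e(1-m)/k) m^{2x/k},
# which combines t e^{-t} <= e^{-1} (with t = 2x(1-m)/k) and e^{m-1} >= m.
random.seed(0)
for _ in range(10000):
    m = random.random()          # plays the role of m_D in [0, 1]
    k = random.randint(1, 25)
    x = random.uniform(1, 100)   # plays the role of |beta| >= 1
    mid = 2 * math.e * (1 - m) / k * math.exp(2 * x * (m - 1) / k)
    low = 2 * math.e * (1 - m) / k * m ** (2 * x / k)
    assert 1 / x >= mid - 1e-12 and mid >= low - 1e-12
```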

    We now prove Theorem 4.4 using the standard translation of PCPs, where the verifier uses O(log n) bits of randomness and has acceptance criterion P, to a lower bound for M AX -P. In view of Lemma 4.5 and Lemma 4.6 all we have to do is to supply a distribution D, supported on inputs such that P(y) = 1 (where P is defined by (4.3)), with mD < 1.
    Let us describe D by a sampling procedure. First flip a bit b; if b = 0 set y_i = 1 for 1 ≤ i ≤ d while (y_i)_{i=d+1}^{2d} is picked uniformly from the 2^d − 1 different d-bit strings that are not all-one. If b = 1 then the roles of the two halves are interchanged.


    Clearly this distribution is fully supported on strings accepted by P. Let α ⊆ [2d] be a nonempty set defining a character. If α is contained in one of the two halves we observe that the distribution on this half is obtained by picking a string from the uniform distribution with probability (1/2)(1 + (2^d − 1)^{−1}) and otherwise picking the all-one string. It follows that in this case
$$\left|\,\mathrm{E}_{\mu\sim D}\left[(-1)^{\sum_{i\in\alpha}\mu_i}\right]\right| = \frac{1}{2}\left(1 - (2^d-1)^{-1}\right) < \frac{1}{2}\,.$$
     If, on the other hand, α contains inputs from both halves then by conditioning on which half gets the all-one assignment it follows that
$$\left|\,\mathrm{E}_{\mu\sim D}\left[(-1)^{\sum_{i\in\alpha}\mu_i}\right]\right| \le (2^d-1)^{-1} < \frac{1}{2}\,.$$
We conclude that mD ≤ 1/2.
   This completes the proof of Theorem 4.4.
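For d = 2 the claim mD ≤ 1/2 can be checked exactly by enumeration. A small sketch (assuming, as the sampling procedure suggests, that P accepts exactly the strings in which at least one half is all-one):

```python
from itertools import combinations, product

d = 2
ones = (1,) * d
non_ones = [s for s in product((0, 1), repeat=d) if s != ones]
probs = {}
for half in non_ones:                      # b = 0: first half all-one
    mu = ones + half
    probs[mu] = probs.get(mu, 0.0) + 0.5 / len(non_ones)
for half in non_ones:                      # b = 1: second half all-one
    mu = half + ones
    probs[mu] = probs.get(mu, 0.0) + 0.5 / len(non_ones)

# D is supported on accepted strings (our reading of P from (4.3)).
assert all(mu[:d] == ones or mu[d:] == ones for mu in probs)

# m_D = max over nonempty alpha of |E[(-1)^{sum_{i in alpha} mu_i}]|
m_D = max(
    abs(sum(p * (-1) ** sum(mu[i] for i in alpha) for mu, p in probs.items()))
    for r in range(1, 2 * d + 1)
    for alpha in combinations(range(2 * d), r)
)
assert m_D <= 0.5 + 1e-12
```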


5    Consequences for M AX -CSP S
Let us draw some conclusions from Lemma 4.5 and Lemma 4.6 relating to M AX -CSPs.

Theorem 5.1. For any non-trivial predicate P that accepts at least one input, the predicate PL is
approximation resistant.

Proof. Let α ∈ {0, 1}^k be an input accepted by P. For an arbitrary δ > 0, define a distribution D by setting µ_i = α_i with probability 1 − δ and otherwise µ_i = 1 + α_i, independently for each i. The theorem now follows from Lemma 4.5 and Lemma 4.6 as E_D[P(x)] ≥ 1 − kδ and mD ≤ 1 − δ, and δ can be arbitrarily small.
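The quantities in the proof can be checked exactly on a small example; here P is a sample predicate of our own choosing (OR on k = 3 bits) with accepted input α = (1, 1, 1):

```python
from itertools import combinations, product

k, delta = 3, 0.1
alpha = (1, 1, 1)
P = lambda mu: int(any(mu))   # sample predicate accepting alpha

# D: mu_i = alpha_i with prob 1 - delta, complemented otherwise, independently
probs = {mu: 1.0 for mu in product((0, 1), repeat=k)}
for mu in probs:
    for i in range(k):
        probs[mu] *= (1 - delta) if mu[i] == alpha[i] else delta

# E_D[P] >= 1 - k*delta
assert sum(p for mu, p in probs.items() if P(mu)) >= 1 - k * delta
# m_D <= 1 - delta
m_D = max(
    abs(sum(p * (-1) ** sum(mu[i] for i in a) for mu, p in probs.items()))
    for r in range(1, k + 1)
    for a in combinations(range(k), r)
)
assert m_D <= 1 - delta + 1e-12
```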

    It is not difficult to see that for any P, PL supports a measure that is pairwise independent. This implies that the results of Austrin and Mossel [3] would have been sufficient to give approximation resistance, but that conclusion relies on the unique games conjecture. In our case we get NP-hardness, which is an advantage, and it is also possible to get a general theorem with perfect completeness.

Theorem 5.2. For any predicate P such that P^{−1}(1) is not contained in a (k − 1)-dimensional affine subspace of {0, 1}^k, the predicate PL is approximation resistant for satisfiable instances.

Proof. Given the above discussion we need only define a suitable distribution D, and we let it be the uniform distribution on inputs accepted by P. The PCP is now given by T_{P,D} and by the assumption of the theorem we have mD < 1.

    Note that Theorem 4.4 is a special case of Theorem 5.2 where we spelled out the details of the proof
more carefully.
    It is tempting to guess that for any P that does imply an affine condition, so that Theorem 5.2 does not apply, PL would not be approximation resistant on satisfiable instances. This does not seem to be obviously true, so let us outline the problems.


    We can use the implied affine conditions to eliminate some variables as we did in the proof of
Theorem 4.1. The final stage when we have no more implied affine constraints is, however, more difficult
to control. The resulting constraints are given by affine constraints in conjunction with the original P.
By the assumption on perfect satisfiability we can conclude that each equation is still satisfiable but not
much more. In particular we have no immediate estimate on the expected number of equations that will
be satisfied if we give uniformly random values to the remaining free variables.
    If, however, our predicate is of limited degree when viewed as a polynomial we have more information
on the result. Clearly during the process of eliminating affine constraints, the degrees of the equations do
not increase, and in fact, they decrease when we remove the known affine factor within each polynomial.
We get the following conclusion.

Theorem 5.3. Suppose predicate P of arity k is given by a polynomial of degree d that contains r linearly independent affine factors. Then if P accepts less than a fraction 2^{1−(d−r)} − 2^{1−2(d−r)} of the inputs, PL is approximation resistant but not approximation resistant on satisfiable instances, unless NP = P.

Proof. The predicate is approximation resistant by Theorem 5.1. On perfectly satisfiable instances we
can run the algorithm of Theorem 4.1, and as we remove affine constraints the resulting degree is at most
d − r.

   The simplest example of a predicate for which this theorem applies is the predicate, P, given by the equation
$$x_1(x_2 x_3 + x_4 x_5) = 1\,,$$
which has d = 3 and r(P) = 3/16. For this instantiation of P, PL is approximation resistant but not approximation resistant for satisfiable instances. To get a hardness result for satisfiable constraints we can use Theorem 4.4 for the predicate
$$x_2 x_3 + x_4 x_5 = 1\,,$$
which is approximation resistant with factor 3/8 on satisfiable instances. We get a matching algorithm as the affine factor can be removed and the equations that remain are of degree 2.
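Both acceptance fractions quoted here are easy to confirm by brute force; a quick sketch:

```python
from itertools import product

def density(pred, k):
    """Fraction of points in {0,1}^k accepted by pred."""
    points = list(product((0, 1), repeat=k))
    return sum(pred(x) for x in points) / len(points)

# x1*(x2*x3 + x4*x5) = 1 over GF(2): accepted fraction 3/16
p_full = density(lambda x: x[0] * ((x[1] * x[2] + x[3] * x[4]) % 2) == 1, 5)
# x2*x3 + x4*x5 = 1: accepted fraction 3/8
p_quad = density(lambda x: (x[0] * x[1] + x[2] * x[3]) % 2 == 1, 4)
assert p_full == 3 / 16
assert p_quad == 3 / 8
# 3/16 is below the Theorem 5.3 threshold with d = 3, r = 1
assert p_full < 2 ** (1 - 2) - 2 ** (1 - 4)
```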
    Let us finally point out that all our approximation resistance results establish the stronger property of “uselessness” introduced by Austrin and Håstad [2]. This follows as we are able to bound arbitrary non-trivial characters and not only the characters appearing in the multivariate expressions of the considered predicates.


6    Final words
The current paper gives optimal approximability results for satisfying the maximum number of low-
degree equations over GF[2]. The methods used in the proofs are more or less standard and thus the main
contribution of this paper is to obtain tight results for a natural problem. There is a provable difference
between perfectly satisfiable and almost-perfectly satisfiable systems in that we can satisfy strictly more
equations in the former case. The difference is not as dramatic as in the linear case, but still striking.
    For the case of M AX -CSPs we obtained a few approximation resistance results for, admittedly, non-standard predicates. We feel, however, that the examples give a modest but nonempty contribution


towards understanding the difference between predicates that are approximation resistant and those that have this property also on satisfiable instances. Our example of an approximation resistant predicate which has another, nontrivial, approximation constant on satisfiable instances is the first of its kind. Although not surprising, this result gives another piece in the puzzle of understanding M AX -CSPs.
     Our problem is not a CSP in the traditional sense as the only restriction we put on our polynomials is that they are of bounded degree, and in particular each constraint can depend on all variables. This implies that an inapproximability result for our basic problem does not even imply the PCP theorem and thus could potentially be proved by simpler methods, not relying on the PCP theorem. Indeed the results of Håstad, Phillips and Safra [7] mentioned in the introduction do not rely on the PCP theorem. It is conceivable that it is possible to prove the optimal bounds (potentially even with stronger error terms) using such methods.
     A natural extension of the current work is to study what happens over fields other than GF[2]. This
is clearly an interesting problem which deserves to be studied. We have not made a serious attempt to
extend the results of the current paper but feel that such an extension would require some work as we
have made use of many properties that are not true over larger fields. As a simple example note that P(x)
might be a non-constant polynomial of degree two and still P(x) = 0 might have no solution modulo 3.
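As a concrete instance of this last remark (our example): x^2 + 1 is non-constant of degree two, yet x^2 + 1 = 0 has no solution modulo 3, since the squares modulo 3 are only 0 and 1:

```python
# squares mod 3 are {0, 1}, so x^2 + 1 mod 3 lies in {1, 2} and is never 0
assert {x * x % 3 for x in range(3)} == {0, 1}
assert all((x * x + 1) % 3 != 0 for x in range(3))
```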

Acknowledgement I thank Parikshit Gopalan for alerting me to the paper [8] and providing me with
an electronic version of that paper. I am also grateful to Srikanth Srinivasan and Shachar Lovett who
independently pointed out the fact that the generator by Viola can be used to derandomize the main
algorithm. I also thank Dieter van Melkebeek for alerting me to the fact that I initially used the generator
in a strange way and pointing out the best way to get the derandomization. Finally I am grateful to three
referees for a careful reading of the manuscript.


A    Proof of Theorem 2.2
The goal of the current section is to prove Theorem 2.2. We remind the reader that this result follows
from the characterization by Kasami and Tokura [8] of all codewords of the Reed-Muller code that have
weight at most twice the minimal weight.
    First note that the bound of Theorem 2.2 is sharp as it is obtained by
$$x_\alpha (x_\beta + x_\gamma)$$
where α, β, and γ are disjoint multi-indices, the size of α is r, and the sizes of β and γ are both d − r.
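The density of this extremal example is easy to compute by brute force; a quick check at d = 4, r = 1 (so |α| = 1 and |β| = |γ| = 3), against the value 2^{−r}(2^{1−(d−r)} − 2^{1−2(d−r)}) that we read off from the construction:

```python
from itertools import product

# x_alpha * (x_beta + x_gamma) with alpha = {1}, beta = {2,3,4}, gamma = {5,6,7}
def sharp(x):
    x_alpha = x[0]
    x_beta = x[1] * x[2] * x[3]
    x_gamma = x[4] * x[5] * x[6]
    return x_alpha * ((x_beta + x_gamma) % 2)

points = list(product((0, 1), repeat=7))
dens = sum(sharp(x) for x in points) / len(points)
d, r = 4, 1
# alpha all-one (prob 2^-r), then exactly one of x_beta, x_gamma equals one
assert dens == 2 ** (-r) * (2 ** (1 - (d - r)) - 2 ** (1 - 2 * (d - r)))
```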

Proof. We prove the statement by induction and we establish d = 2 as the base case below to avoid
degenerate cases. The following easy observations are useful for us.

    • As d − r cannot equal 1, it follows that for any degree-d polynomial that is not a product of affine
      factors, the fraction of inputs on which it takes the value one is at least (3/2)·2^{−d}.

    • The lower bound to be proved never exceeds 2^{1−d}, and thus this bound is always sufficient (but not
      always possible).


The statement for general d and r follows from the case of d − r and 0 and hence we may focus on the
case of r = 0 (but of course use arbitrary r in the inductive statements).
    A fact that needs to be taken into account is, as we are using that x_i^2 = x_i, that the factorization is not unique. In particular if A and A′ are two affine factors of a polynomial P then another factor is 1 + A + A′ as
$$(1 + A + A')A = A'A\,.$$
Thus if we have several affine factors we can construct new affine factors by taking the sum of an even number of such factors added with the constant 1, or the sum of an odd number of such factors. Let us point out that the statement of Theorem 2.2 that Q does not contain any affine factor is intended to mean that there is no factorization of P that contains the given affine factors and one more linearly independent affine factor.
     Another useful fact is that an affine function A(x) is a factor of a polynomial P iff A(x) = 0 implies
P(x) = 0 or equivalently if P(y) = 1 implies A(y) = 1. The non-obvious direction of this statement was
established during the proof of Theorem 4.1 when identifying implied affine constraints.
     For the reader worried about the non-unique factorization, let us point out the following fact that we
leave to the reader to verify. The number of affine factors is the co-dimension of the affine hull of all
points such that P(x) = 1. The factors that may appear in the factorization are the affine equations defining
this affine hull. In the full factorization we may take any full set of linearly independent equations.
     Let us address the case of d = 2, which can be established by the normal form of degree-two
polynomials, but let us follow a different path to prepare for the general proof. As stated above, the
interesting case is when P has no affine factor and let us write P(x) = P0 (x) + x1 P1 (x).
     Consider setting x1 to its two values; as P does not contain an affine factor, neither of the induced polynomials can be identically 0. Furthermore if neither of these settings results in a polynomial with an affine factor we are done by induction. Let us finally assume that x1 = 0 results in an affine factor, which we, by an affine change of coordinates, can assume is x2. Thus we can assume that
                                         P(x) = x2 A2 (x) + x1 A1 (x) ,
for two affine functions A1 and A2 where we also can assume that xi does not appear in Ai (x) as it can be
replaced by 1 giving the same result. The main case is that the collection of x1 , x2 , A1 (x), and A2 (x) form
linearly independent affine functions and in this case the fraction of inputs for which P is one is exactly
3/8 which is the claimed bound. We have a number of cases to consider when the four functions are not
linearly independent.
   1. A1 (x) ≡ 1. If A2 (x) = x1 , then x1 is a factor of P while if A2 (x) = 1 + x1 then P is one with
      probability 3/4. Finally if A2 (x) is linearly independent of x1 this probability is 1/2.
   2. A1 (x) = x2 makes x2 a factor of P.
   3. A1 (x) = 1 + x2 makes Pr[P(x) = 1] = 1/2 unless we have A2 (x) ≡ 1 when this probability is 3/4.
   4. A1(x) is linearly independent of x2. In this case, by an affine change of variables, we can assume
      that A1(x) = x3. Now since A2(x) does not contain x2, is linearly dependent on x1 and x3, and is not
      a factor of x1x3 (ruling out also 1 + x1 + x3), A2(x) must equal one of the functions 1, 1 + x1 or
      1 + x3. In each of these cases it is easy to check that Pr[P(x) = 1] is 1/2.
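These small case computations are easy to verify by brute force; a quick sketch:

```python
from itertools import product

def density(p, k):
    """Fraction of x in {0,1}^k with p(x) = 1 over GF(2)."""
    points = list(product((0, 1), repeat=k))
    return sum(p(x) % 2 for x in points) / len(points)

# main case: x1, x2, A1 = x3, A2 = x4 linearly independent -> 3/8
assert density(lambda x: x[1] * x[3] + x[0] * x[2], 4) == 3 / 8
# case 1 with A1 == 1 and A2 = 1 + x1: P = x2(1 + x1) + x1 -> 3/4
assert density(lambda x: x[1] * (1 + x[0]) + x[0], 2) == 3 / 4
# case 3 with A1 = 1 + x2 and A2 == 1: P = x2 + x1(1 + x2) -> 3/4
assert density(lambda x: x[1] + x[0] * (1 + x[1]), 2) == 3 / 4
```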


This finishes the case d = 2 and we turn to the general case. Not surprisingly, also here we end up analyzing a number of cases.
    As in the case d = 2 we can assume that P has no affine factors but that substituting x1 = 0 gives a polynomial with at least one affine factor. Picking a full set of linearly independent factors and making an affine transformation we can assume that

                                        P(x) = x2 xβ P2 (x) + x1 P1 (x)

where β is a possibly empty multi-index and P2 has no affine factors. First let us consider affine factors in P1.
     Let ∏_{i=1}^r A_i(x) be the affine factors that appear in P1. We have two cases depending on whether the Ai that might appear in the factorization, together with the affine forms that appear in the first product (i. e., the coordinate functions given by x2 and the elements of β), are linearly independent.
     Suppose these functions are not linearly independent and hence that we can choose the factorization such that A1(x) only depends on x2 and the variables in β. Let us look at the point x0 where x2 and all elements of β equal 1. If A1(x0) = 1 then A1(x) is a factor of P and this is a contradiction of the assumptions. If, on the other hand, A1(x0) = 0 then the sets of points where x2 x_β P2(x) = 1 and x1 P1(x) = 1 are disjoint and, as each of these sets is of density at least 2^{−d}, we get Pr[P(x) = 1] ≥ 2^{1−d} and the theorem follows also in this case.
     Now assume that any affine form appearing in the factorization of P1 is linearly independent of x2 and the variables in β. Then, by an affine transformation, we can assume that
$$P(x) = x_2 x_\beta P_2(x) + x_1 x_\gamma P_1'(x)\,, \tag{A.1}$$
where β and γ are disjoint multi-indices and no more affine factors can be pulled out of P2 or P1′.
    Let us analyze what happens for the four possible simultaneous assignments of values to x1 and x2. When both are 0 we get a function that is identically 0, which is not good for us, but in the other cases we get non-zero polynomials of degree d − 1 and we now analyze the structure of these polynomials.
    When x1 = 0 and x2 = 1 we get x_β P2(x), when x1 = 1 and x2 = 0 we get x_γ P1′(x), and in the final case we have
$$W(x) = x_\beta P_2(x) + x_\gamma P_1'(x)\,.$$
Note that W is not identically 0, as (x1 + x2) would then have been an affine factor of P, and furthermore x_β P2, x_γ P1′, and W are all of degree at most d − 1.
    Suppose first that neither P2 nor P1′ is the constant one. Then, using the bound that a degree-(d − 1) polynomial that is not a pure product of affine factors is one with probability at least (3/2)·2^{1−d}, we have that Pr[P(x) = 1] is at least
$$\frac14\left(\frac32\, 2^{1-d} + \frac32\, 2^{1-d} + 2^{1-d}\right) = 2^{1-d}$$
proving the bound in this case. By symmetry we may thus assume that P2 ≡ 1. If γ = ∅ or W does not contain any affine factor, then Pr[P(x) = 1] is at least
$$\frac14\left(2^{1-d} + 2^{1-d} + 2^{2-d} - 2^{3-2d}\right)\,,$$


which exactly equals the claimed bound. Thus we can assume that γ is non-empty and W contains an affine factor A(x), and remember that
$$W(x) = x_\beta + x_\gamma P_1'(x)\,. \tag{A.2}$$

    Suppose A only depends on variables in β. If it is fixed to 1 by setting all variables in β to one, then A(x) is a factor of x_β and hence also of W(x) + x_β = x_γ P1′(x), and hence also of P(x), contradicting assumptions. If A(x) is forced to 0 by this assignment then, setting any variable in γ (remember it is non-empty and disjoint from β) to 0, we get W(x) = 0 while x_β = 1 and x_γ P1′(x) = 0, contradicting (A.2). Thus we can assume that A(x) depends on some variable outside β.
    If we can fix some variable in γ to 0, the variables of β to one and A to zero we get a contradiction
to (A.2) as xβ = 1 while W (x) = 0 and xγ = 0. If the size of γ is at least 2 or W contains at least two
different affine factors we claim that this must be possible. Suppose first that the size of γ is at least 2.
    As A(x) does not only depend on variables in β we can fix these variables to one, and then pick a
suitable variable in γ to fix to 0 without fixing the value of A(x). We can then fix additional variables to
make A(x) = 0 obtaining the desired contradiction.
    Now suppose that W contains at least 2 affine factors and let A′ be a factor distinct from A. In this case, one of A, A′ and 1 + A + A′ is a factor of W and does not depend on the first variable of γ (and depends on some variable outside β). It follows again that we can make x_β = 1, x_γ = 0 and W(x) = 0, again obtaining a contradiction.
    The only remaining case is when γ is of size one and W has one affine factor. In this case, by induction, Pr[P(x) = 1] is at least
$$\frac14\left(2^{1-d} + 2\cdot\frac12\left(2^{3-d} - 2^{5-2d}\right)\right).$$

For d at least 4 this is at least 2^{1−d} and we are done. Finally note that in the case d = 3, P1′ as well as the co-factor of A in W are of degree at most one, and hence if they do not contain an additional factor they must be the constant 1, and the theorem follows also in this case.
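For completeness, the arithmetic behind the final estimate reads:

```latex
\frac14\left(2^{1-d} + 2^{3-d} - 2^{5-2d}\right) \ge 2^{1-d}
\iff 2^{1-d} + 2^{3-d} - 2^{5-2d} \ge 2^{3-d}
\iff 2^{1-d} \ge 2^{5-2d}
\iff d \ge 4,
```

with equality precisely when d = 4.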



References
 [1] S ANJEEV A RORA , C ARSTEN L UND , R AJEEV M OTWANI , M ADHU S UDAN , AND M ARIO
     S ZEGEDY: Proof verification and the hardness of approximation problems. J. ACM, 45(3):501–555,
     1998. Preliminary version in FOCS’92. See also at ECCC. [doi:10.1145/278298.278306] 852

 [2] P ER AUSTRIN AND J OHAN H ÅSTAD: On the usefulness of predicates. ACM Trans. Computation
     Theory, 5(1):1:1–1:24, 2013. Preliminary version in CCC’12. [doi:10.1145/2462896.2462897] 857

 [3] P ER AUSTRIN AND E LCHANAN M OSSEL: Approximation resistant predicates from pairwise
     independence. Comput. Complexity, 18(2):249–271, 2009. Preliminary version in CCC’08. See
     also at ECCC. [doi:10.1007/s00037-009-0272-6] 856


 [4] M IHIR B ELLARE , O DED G OLDREICH , AND M ADHU S UDAN: Free bits, PCPs, and
     nonapproximability—towards tight results. SIAM J. Comput., 27(3):804–915, 1998. Prelimi-
     nary version in FOCS’95. See also at ECCC. [doi:10.1137/S0097539796302531] 848
 [5] J OHAN H ÅSTAD: Some optimal inapproximability results. J. ACM, 48(4):798–859, 2001. Prelimi-
     nary version in STOC’97. [doi:10.1145/502090.502098] 846, 847, 849, 852, 853
 [6] J OHAN H ÅSTAD: Satisfying degree-d equations over GF[2]n . In Proc. 14th Internat. Workshop on
     Approximation Algorithms for Combinatorial Optimization Problems (APPROX’11), pp. 242–253.
     Springer, 2011. [doi:10.1007/978-3-642-22935-0_21] 845, 847
 [7] J OHAN H ÅSTAD , S TEVEN P HILLIPS , AND S HMUEL S AFRA: A well-characterized approximation
     problem. Information Processing Letters, 47(6):301–305, 1993. Preliminary version in ISTCS’93.
     [doi:10.1016/0020-0190(93)90076-L] 846, 858
 [8] TADAO K ASAMI AND N OBUKI T OKURA: On the weight structure of Reed-Muller codes. IEEE
     Trans. Inform. Theory, 16(6):752–759, 1970. [doi:10.1109/TIT.1970.1054545] 846, 847, 858
 [9] DANA M OSHKOVITZ AND R AN R AZ: Two-query PCP with subconstant error. J. ACM, 57(5):29:1–
     29:29, 2010. Preliminary version in FOCS’08. See also at ECCC. [doi:10.1145/1754399.1754402]
     850, 852
[10] R AN R AZ: A parallel repetition theorem. SIAM J. Comput., 27(3):763–803, 1998. Preliminary
     version in STOC’95. [doi:10.1137/S0097539795280895] 852
[11] E MANUELE V IOLA: The sum of D small-bias generators fools polynomials of degree D. Comput.
     Complexity, 18(2):209–217, 2009. Preliminary version in CCC’08. [doi:10.1007/s00037-009-0273-
     5] 846, 851

AUTHOR
     Johan Håstad
     Professor
     KTH - Royal Institute of Technology
     johanh kth se
     http://www.csc.kth.se/~johanh

ABOUT THE AUTHOR
     J OHAN H ÅSTAD received his Bachelor of Science from Stockholm University in 1981, his
         Master of Science from Uppsala University in 1984, and his Ph. D. from MIT in 1986
         under the supervision of Shafi Goldwasser. Johan was appointed Associate Professor at
         the Royal Institute of Technology in Stockholm, Sweden in 1988 and advanced to the
         level of Professor in 1992. He was elected a member of the Swedish Royal Academy
         of Sciences in 2001. He has research interests within several subareas of Theory of
         Algorithms and Complexity theory but has mainly focused on the approximability of
         NP-hard optimization problems.

