                          T HEORY OF C OMPUTING, Volume 13 (10), 2017, pp. 1–19
                                       www.theoryofcomputing.org




On the Hardness of Approximating Balanced
           Homogeneous 3-Lin
                             Johan Håstad∗                     Rajsekar Manokaran†
                 Received November 17, 2014; Revised May 18, 2017; Published October 7, 2017




       Abstract: We consider systems of homogeneous linear equations modulo 2 with three
       variables in each equation and study balanced assignments as solutions to such equations. We
       prove that it is hard to distinguish systems where there is a balanced assignment that satisfies
       a fraction 1 − ε of the equations from systems where the best balanced assignment satisfies a
       fraction 1/2 + ε of the equations, assuming that NP is not contained in quasipolynomial time.
       This improves on a similar result by Holmerin and Khot who relied on the assumption that
       NP is not contained in subexponential time. The key for the improvement is to replace long
       codes used by Holmerin and Khot by the low-degree long code.

ACM Classification: F.2.2, F.1.3
AMS Classification: 68Q17, 68Q25
Key words and phrases: hardness of approximation, inapproximability, probabilistically checkable
proofs, constraint satisfaction problems, bisection, approximation resistance

1     Introduction
Many natural problems can be formulated in terms of constraint satisfaction problems (CSPs) where we
have a large number of constraints on a large number of variables, but each constraint involves only a
constant number of variables. To determine if there is an assignment that satisfies all the constraints is,
except in a few special cases, NP-complete, and we turn to the question of satisfying as many constraints
as possible. Clearly, finding the maximum fraction of constraints satisfiable is NP-hard and hence one
    ∗ KTH - Royal Institute of Technology. Work supported by ERC Grant 226-203.
    † IIT Madras. Work supported by ERC Grant 226-203.



© 2017 Johan Håstad and Rajsekar Manokaran
Licensed under a Creative Commons Attribution License (CC-BY)                  DOI: 10.4086/toc.2017.v013a010

turns to approximation algorithms. An algorithm is an α-approximation algorithm if its output is always
within a factor α of the optimal value.
     A huge effort has been spent on finding the best value of α achievable by a polynomial time algorithm
for different problems. A key question here is whether an efficient algorithm beats the value of α obtained
by simply picking an assignment at random. Problems where this is not possible are called approximation
resistant, and a surprising number of problems have this property (see for instance [12, 8]). These results
establish NP-hardness of the discussed problems and if we settle for Unique Games hardness, Austrin
and Mossel [4] gave very general conditions for approximation resistance that apply to a vast majority
of all constraint predicates [3]. Going even further, again in the case of Unique Games hardness, Khot et
al. [16] gave necessary and sufficient conditions for “strong approximation resistance,” a slightly stronger
notion; thus we have, in broad terms, a fairly good understanding of the complexity of approximating
CSPs. There are two special cases where the picture is less clear and more information is needed. One is
that of perfect completeness (i. e., when we are guaranteed that there is an assignment that satisfies all
conditions) and the other is when the problem also comes with global constraints and this is the case of
interest in the current paper.
     The simplest and possibly the most natural CSPs with global constraints are Max-Bisection and
Min-Bisection where an assignment bisecting the vertices of a graph while maximizing or minimizing
the cut edges is required. It is not difficult to see that Max-Bisection is at least as hard as the Max-Cut
problem wherein any partition of the vertices is allowed. Thus, it is Unique Games hard to approximate
this problem within the Goemans-Williamson constant (approximately 0.8786) [10, 15]. The algorithm
by Austrin et al. [2] comes very close to obtaining this approximation ratio (it obtains a factor 0.8776)
and it is a tantalizing open problem to understand whether the problem is strictly harder than Max-Cut.
     When it comes to Min-Bisection, the situation is even more open. The best algorithm, by Räcke [18],
achieves an approximation ratio that is logarithmic in the instance size. On the other hand, the best
inapproximability, due to Khot [14], rules out a (1 + ε)-approximation unless NP ⊆ DTIME(exp(n^{ε′}))
where ε′ → 0 as ε → 0.
     In this paper we study a problem given by a system of linear equations modulo 2 with three variables
in each equation where a solution is required to be balanced, i. e., to give values 0 and 1 to (almost) the
same number of variables. The basic variant of the problem, called Max-3-LIN, without the condition of
balance between the number of 0s and 1s is known to be NP-hard [12] to approximate within a factor of
1/2 + ε for any ε > 0. As the good solution in the completeness case in the reduction of [12] is balanced
the same hardness applies to the problem requiring a balanced assignment. In this paper, however, we
study the homogeneous case, denoted by Max-Hom-Bal-3-LIN, where all the right hand sides are 0. In
this situation allowing a general assignment makes the problem trivial as the identically 0 assignment
satisfies all equations. The reason for studying this problem is at least two-fold. The fact that the balance
condition is essential highlights this global condition which in the general case just happens to be satisfied.
Secondly, having hardness for this simpler problem can give a more powerful starting point to use in
further reductions.
Max-Hom-Bal-3-LIN has previously been studied by Holmerin and Khot [13] who show that it is
hard to distinguish systems where a fraction 1 − ε of the constraints can be satisfied, from systems where
the best balanced assignment satisfies a fraction 1/2 + ε of the constraints under a complexity assumption:
a polynomial time distinguishing algorithm implies that NP ⊆ DTIME(exp(nδ )) for all δ > 0. We note


that an inapproximability of homogeneous linear equations, albeit with a stronger guarantee on the
structure of the system, forms the core of the inapproximability of Min-Bisection [14].


1.1   Our results and techniques used
We improve the complexity assumption in the above, proving the following statement.

Theorem 1.1 (Main). For every ε > 0, a language L ∈ NP can be reduced to Max-Hom-Bal-3-LIN so
that a w ∈ L maps to an instance, I, with value at least 1 − ε, while if w ∉ L, the instance I output has
value at most 1/2 + ε. Further, the reduction is deterministic, and the running-time (and the size of I) is
at most exp((log |w|)^{O(log 1/ε)}).


  As an immediate corollary, we see that even quasipolynomial time algorithms fail to approximate
Max-Hom-Bal-3-LIN better than a random assignment.

Corollary 1.2. For any ε > 0, there is no algorithm that runs in time exp((log n)^{O(1)}) and distinguishes
Max-Hom-Bal-3-LIN instances with value at least 1 − ε from instances with value at most 1/2 + ε unless
NP ⊆ DTIME(exp((log n)^{O(1)})).

     Our proof of inapproximability follows rather closely that of Holmerin and Khot [13] and is obtained
by reducing graph 3-colorability to Max-Hom-Bal-3-LIN. The reduction uses a PCP where the verifier
reads three bits from a proof and verifies that their exclusive-or is 0. The verifier is a composition of
an outer verifier with an inner verifier. The former starts with a 3-colorability question and expects
to find an encoding of a coloring as a low-degree polynomial. As noted by Holmerin and Khot [13],
low-degree polynomials satisfy certain crucial properties that allow the inner verifier of Håstad [12] to
output homogeneous linear equations.
     Our main new ingredient is the use of the low-degree long code, introduced under the name “short
code” by Barak et al. [6]. To adapt this to our inapproximability proof, we follow the approach of Dinur
and Guruswami [9] and Guruswami et al. [11] who used it to prove lower bounds for some maximum
constraint satisfaction problems and hypergraph colorability.
     The technical problems encountered in the adaptation are not substantial as soon as one realizes
that the line-point test for low-degree testing is a linear test in the bits in the proof. We believe that our
result reinforces the proposition that the low-degree long code is versatile and may prove useful in many
situations to get improved asymptotic bounds.
     As already noted in [13], lower bounds for Max-Hom-Bal-3-LIN imply, through a simple gadget
reduction, results for Max-Bisection. The result is that for any ε > 0 it is hard to approximate the latter
problem within 15/16 + ε. The assumption needed is the same as for Corollary 1.2. This compares
favorably to the factor 16/17 + ε for which the problem is known to be NP-hard, but, of course, falls
short of the Goemans-Williamson constant for which Unique Games hardness is known.
     The next section introduces our notation and some standard facts used in the rest of the paper.
Section 3 describes the outer verifier and its analysis. Section 4 describes the inner verifier and the
proof of the main theorem. For completeness we give most details of the reduction to Max-Bisection in
Section 5.


2    Preliminaries
The focus of this article is the problem Max-Hom-Bal-3-LIN, defined below.

Definition 2.1 (Max-Hom-Bal-3-LIN). An instance I of Max-Hom-Bal-3-LIN is specified by a set of
variables, X, and a distribution C over 3-tuples of X. An assignment, λ , to I is a map X → F2 . Such an
assignment is said to be balanced if the sets λ −1 (0) and λ −1 (1) differ in cardinality by at most 1. The
value of an assignment is:

                          val(λ; I) := Pr_{(x_1,x_2,x_3)←C} {λ(x_1) + λ(x_2) + λ(x_3) = 0} .

Max-Hom-Bal-3-LIN asks us to identify a balanced assignment with maximum value.
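For concreteness, Definition 2.1 can be sketched in a few lines of Python. The list representation of the distribution C (triples drawn uniformly) and the names `is_balanced` and `value` are our own illustrative choices, not part of the problem definition.

```python
from fractions import Fraction

def is_balanced(assignment):
    """True if the sets λ^{-1}(0) and λ^{-1}(1) differ in cardinality by at most 1."""
    ones = sum(assignment.values())
    return abs(len(assignment) - 2 * ones) <= 1

def value(assignment, triples):
    """Fraction of triples (x1, x2, x3) with λ(x1) + λ(x2) + λ(x3) = 0 over F_2.
    The distribution C is represented as a list of triples sampled uniformly."""
    satisfied = sum(assignment[a] ^ assignment[b] ^ assignment[c] == 0
                    for a, b, c in triples)
    return Fraction(satisfied, len(triples))

# a small homogeneous system on four variables
lam = {0: 0, 1: 1, 2: 0, 3: 1}
system = [(0, 1, 3), (0, 1, 2), (1, 2, 3)]
assert is_balanced(lam)
assert value(lam, system) == Fraction(2, 3)
```

Note that the identically-zero assignment would satisfy every equation here, which is exactly why the balance condition is what makes the homogeneous problem nontrivial.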

    We are interested in the size of an instance but as polynomial blow-up does not change the statement
of the results it does not matter if we measure the size in the number of variables or the number of
constraints. In view of this, for simplicity, we define the size to be the number of variables.
    The Label Cover (LC) problem is a canonical starting point in many optimal inapproximability results.
An intermediate step in our reduction produces LC instances with a special structure which is crucial in
the subsequent parts of the proof. We call these instances of the Certified LC problem; the definition
of the problem and its properties are given below. The key property that we need is that the constraints σ_u
given below are low-degree polynomials and that the mappings πuv are affine mappings. This is crucial
for our ability to use the low-degree long code to code assignments.

Definition 2.2 (Certified LC). An instance of Certified LC problem, henceforth CLC, is a tuple L =
(U,V, E, Π, Σ) where:

    – U and V are finite sets, E is a distribution over U ×V , and

    – there exist L, R ∈ Z such that:

               Π = { π_{uv} : F_2^L → F_2^R | (u, v) ∈ supp(E) } ,        and Σ = { σ_u : F_2^L → F_2 | u ∈ U } .


The integers L and R are called the left and right bit-length respectively, while Π and Σ are called the
projection and certification constraints. An assignment to the instance is a pair of functions (Λ_L, Λ_R)
where Λ_L : U → F_2^L satisfies σ_u(Λ_L(u)) = 0 for every u ∈ U, and Λ_R : V → F_2^R. The value of an
assignment Λ = (Λ_L, Λ_R) is defined as:

                                val_L(Λ) := Pr_{(u,v)←E} {Λ_R(v) = π_{uv}(Λ_L(u))} .

We use the notation Prv|u {E} to denote the probability of event E when choosing v as a random neighbor
of u and we have a similar notation Ev|u [X] for expectation.

    We construct CLC instances that satisfy two crucial properties, mixing and smoothness, that are
defined below.


Definition 2.3 (Smoothness). A CLC instance is said to be ξ-smooth if for every u ∈ U and every two
distinct a, b ∈ F_2^L,
                                   Pr_{v|u} {π_{uv}(a) = π_{uv}(b)} ≤ ξ .

Definition 2.4 (Mixing). A CLC instance is said to be η-mixing if for every ψ : V → [0, 1],

                                          E_u | E_{v|u}[ψ(v)] − E_v[ψ(v)] | ≤ η .


   Our result invokes the analysis of two different low-degree testers, whose setup and analysis are
described next.

2.1    Testing low-degree functions
For a field F and a positive integer m, F^m denotes the m-dimensional vector space over F. We consider
polynomials with coefficients in F; the degree of a polynomial p : F^m → F is the maximum over the
degrees of its monomials, where the degree of the monomial ∏_i x_i^{d_i} is ∑_i d_i. The set of all polynomials of
degree at most d is a vector space over F and is denoted by P_d(F^m), or simply P_d if the domain of the
polynomials is clear from the context. The agreement between two functions f, g : F^m → F is the fraction
of inputs where they are equal; this quantity is denoted by α(f, g). The agreement of a function f with
P_d, written α(f, P_d), is the maximum of α(f, p) over p ∈ P_d. The well-known fact that distinct low-degree
polynomials have small pairwise agreement is essential for us, and in coding-theory terms this is a statement about the
minimum distance of a Reed-Muller code. It is also known as the Schwartz-Zippel lemma and we use it
in the following form.

Lemma 2.5. Any two distinct p, q ∈ P_d(F^m) satisfy α(p, q) ≤ d/|F|.
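The bound of the lemma can be checked numerically on a small example. The sketch below works over the prime field F_7 purely for convenience of arithmetic (the paper itself uses fields of order 2^t); the dictionary encoding of polynomials and the function names are ours.

```python
import itertools

P = 7  # a small prime field F_7 (illustrative; the paper works over fields of order 2^t)

def evaluate(poly, point):
    """Evaluate a polynomial given as {exponent-tuple: coefficient} at a point of F_P^m."""
    total = 0
    for exps, coeff in poly.items():
        term = coeff
        for x, e in zip(point, exps):
            term = term * pow(x, e, P) % P
        total = (total + term) % P
    return total

def agreement(f, g, m):
    """α(f, g): the fraction of points of F_P^m where f and g agree."""
    points = list(itertools.product(range(P), repeat=m))
    return sum(evaluate(f, pt) == evaluate(g, pt) for pt in points) / len(points)

# two distinct degree-2 polynomials in two variables: x^2 + 3y and x^2 + xy
f = {(2, 0): 1, (0, 1): 3}
g = {(2, 0): 1, (1, 1): 1}
# Lemma 2.5: their agreement is at most d/|F| = 2/7
assert agreement(f, g, 2) <= 2 / 7
```

Here f − g = y(3 − x) vanishes on 13 of the 49 points, comfortably within the bound.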

     The first low-degree tester we describe is designed for large fields. It was developed in the context of
the PCP theorem and uses geometric structures such as lines. We present it here as a CLC to suit our needs.
A line passing through two distinct points u, v ∈ F^m has a parametric representation ℓ : F → F^m given by
ℓ(x) = u + x · (v − u). Similarly, for a positive integer k and k + 1 points u_0, . . . , u_k, we have the function
p(x_1, . . . , x_k) = u_0 + ∑_i x_i(u_i − u_0) whose image, {p(x) | x_i ∈ F}, is a flat, denoted by S(u_0, . . . , u_k). When
the elements (u_i − u_0) are all linearly independent, the set is a k-flat. Note that different choices of u_i
might yield the same flat, and we fix an arbitrary canonical parametric representation of each flat. The
space of all k-flats is denoted by S_k, or S_k(F^m) when the vector space needs to be emphasized.
     Fix a positive integer t and let F be the field of order q = 2^t. The line-point test is the CLC with
V = F^m, U = S_1(F^m), and σ_u ≡ 0 for every u ∈ U. The assignments to V are expected to be the evaluations
of a polynomial p ∈ P_d and thus we set R = t. The restriction of a degree-d polynomial p to a line ℓ is p ∘ ℓ,
a univariate polynomial of degree at most d. The assignment to the line ℓ ∈ U is to be the representation of
this polynomial. We set L = t(d + 1) and treat an assignment as a polynomial p_ℓ : F → F. For an x ∈ F and
ℓ(x) ∈ F^m = V, the constraint π_{ℓ,ℓ(x)} maps p_ℓ to p_ℓ(x). Finally, the distribution E picks ℓ ← U and x ← F
uniformly at random and produces (ℓ, ℓ(x)). We denote the CLC by L^{LP}_{m,d}(F). The following analysis of
this test is due to Arora and Sudan [1].
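The key structural fact behind the test, namely that the restriction of a degree-d polynomial to a line is a univariate polynomial of degree at most d, can be illustrated directly. The sketch below uses a small prime field for simplicity (the paper takes q = 2^t) and recovers the honest line table by Lagrange interpolation through d + 1 sample points; all names and the concrete polynomial are our own choices.

```python
import random

Q, M, D = 11, 2, 3  # a small prime field F_11, dimension 2, degree 3 (illustrative)

def p(point):
    """A fixed polynomial of degree D = 3 on F_Q^2: p(x, y) = x^2 y + 2x + 5."""
    x, y = point
    return (x * x * y + 2 * x + 5) % Q

def line(u, v):
    """Parametric line t -> u + t*(v - u) through u and v in F_Q^M."""
    return lambda t: tuple((ui + t * (vi - ui)) % Q for ui, vi in zip(u, v))

def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % Q
    return out

def line_table(ell):
    """The honest line table: coefficients of the univariate polynomial p ∘ ℓ,
    recovered by Lagrange interpolation through D + 1 sample points."""
    xs = list(range(D + 1))
    ys = [p(ell(t)) for t in xs]
    coeffs = [0] * (D + 1)
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        basis, denom = [1], 1
        for j, xj in enumerate(xs):
            if j != i:
                basis = poly_mul(basis, [(-xj) % Q, 1])
                denom = denom * (xi - xj) % Q
        scale = yi * pow(denom, Q - 2, Q) % Q  # Fermat inverse in F_Q
        for k, b in enumerate(basis):
            coeffs[k] = (coeffs[k] + scale * b) % Q
    return coeffs

def eval_uni(coeffs, t):
    return sum(c * pow(t, i, Q) for i, c in enumerate(coeffs)) % Q

# the verifier's check: the line table agrees with the point table at ℓ(t) for every t
random.seed(0)
u = tuple(random.randrange(Q) for _ in range(M))
v = tuple(random.randrange(Q) for _ in range(M))
ell = line(u, v)
table = line_table(ell)
assert all(eval_uni(table, t) == p(ell(t)) for t in range(Q))
```

For an honest proof the test accepts with probability 1; the content of Theorem 2.6 below is the converse direction.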


Theorem 2.6 (Theorem 16, [1]; reformulation¹). There exist constants c > 0 and γ_0 > 0 such that, if
q > d^c and γ < γ_0, the following holds. For every assignment Λ_R : V → F to L^{LP}_{m,d}(F), if (Λ_L, Λ_R)
has value at least 1 − γ for some Λ_L, then there is a p ∈ P_d such that α(Λ_R, p) ≥ 7/12.

   Points on a line are pairwise independent and hence the line-point test yields a mixing CLC.

Lemma 2.7. The instance L^{LP}_{m,d}(F) is √(1/q)-mixing.

Proof. Fix a function ψ : V → [0, 1]. Now,
                                              
                                  E_{ℓ←S_1} [ E_{t←F} [ψ(ℓ(t))] ] = E_{u←F^m} [ψ(u)] ,

                     and,   E_{ℓ←S_1} [ ( E_{t←F} [ψ(ℓ(t))] )^2 ] = E_{ℓ,t_1,t_2} [ψ(ℓ(t_1)) ψ(ℓ(t_2))] .

For fixed t_1 ≠ t_2, conditioned on x = ℓ(t_1), the point ℓ(t_2) on a random line ℓ takes every value in V
other than ℓ(t_1) with equal probability, and thus in this case

                               E_ℓ [ψ(ℓ(t_1)) ψ(ℓ(t_2))] = ψ(x) · (q^m E[ψ] − ψ(x)) / (q^m − 1) .

Since t_1 = t_2 with probability 1/q,

                   E_{ℓ←S_1} [ ( E_{t←F} [ψ(ℓ(t))] )^2 ] ≤ 1/q + ((q − 1)/q) (E[ψ])^2 ≤ 1/q + (E[ψ])^2 .

An application of Jensen’s inequality yields the result.                                                  □


2.2    Fourier analysis over P_d(F_2^L)
The second low-degree testing result is used in analyzing maps from P_d(F_2^L) to R. We review standard
facts about the Fourier transform of such maps towards stating the testing result in this context. In what
follows, F = F_2 and the vector space is F^L unless stated otherwise. Every function f : F^L → F can be
written as a multilinear polynomial of degree at most L and thus the space of maps F^L → F is equal to
P_L. For two maps f, g, we have the natural inner product ⟨f, g⟩ := ∑_x f(x)g(x) where the operations are
over F. As is well known, the dual code of P_d under this inner product is P_{L−d−1}. For a β ∈ P_L, define
the character map χ_β : P_L → R by:
                                                  χ_β(g) := (−1)^{⟨β,g⟩} .

If β_1 = β_2 + p where p ∈ P_{L−d−1} then, as P_{L−d−1} is the dual of P_d, χ_{β_1}(g) = χ_{β_2}(g) for any g ∈ P_d. In
standard terminology, β_1 and β_2 are in this case said to belong to the same coset of P_{L−d−1}.
    The Hamming weight of β, denoted by wt(β), is the number of x such that β(x) = 1. From each
coset of P_{L−d−1} in P_L, we pick a β of least Hamming weight (breaking ties arbitrarily) and denote the
set of these representatives by X_d.

   ¹ The number 7/12 can be replaced by any number between 1/2 and 2/3. The constant γ_0 is such that setting
ρ ≥ (1 − γ_0) in Theorem 16 satisfies 2ρ/3 ≥ 7/12.

A standard fact about the Fourier transform states that a function λ : P_d → R can be written as:

                                       λ(g) = ∑_{β ∈ X_d} λ̂(β) χ_β(g) ,                                  (2.1)

where the Fourier coefficients λ̂(β) are real numbers. The characters form an orthonormal basis, as for
every β_1, β_2 ∈ X_d,

                     E_{g←P_d} [χ_{β_1}(g) χ_{β_2}(g)] = 1 if β_1 = β_2, and 0 otherwise.                 (2.2)
     An affine form in L variables is a polynomial a_0 + ∑_i a_i x_i where each a_i is in F_2. A list of affine forms
π^{(1)}, . . . , π^{(R)} : F^L → F can be viewed as an affine map π : F^L → F^R. In this situation, for f ∈ P_d(F^R),
the composition f ∘ π = f(π(x)) also belongs to P_d(F^L). Let π^{−1}(y) be the set of all x such that π(x) = y,
and for a β : F^L → F define the projected map π_2(β) : F^R → F as

                                          π_2(β)(y) := ∑_{x ∈ π^{−1}(y)} β(x) .                            (2.3)

It is not difficult to see that the identity χ_β(f ∘ π) = χ_{π_2(β)}(f) holds.
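The identity follows by grouping the sum ∑_x β(x) f(π(x)) according to the fibers of π, and it can be verified by brute force on a small example. The affine map, the choice of β, and the exhaustive loop below are arbitrary illustrative choices of ours.

```python
import itertools

L, R = 3, 2
A = [[1, 0, 1], [0, 1, 1]]  # linear part of an affine map π : F_2^3 -> F_2^2
shift = [1, 0]              # affine shift

def pi(x):
    return tuple((sum(row[j] * x[j] for j in range(L)) + s) % 2
                 for row, s in zip(A, shift))

cube_L = list(itertools.product((0, 1), repeat=L))
cube_R = list(itertools.product((0, 1), repeat=R))

def chi(beta, g, cube):
    """χ_β(g) = (-1)^{<β, g>} with <β, g> = Σ_x β(x) g(x) over F_2."""
    return (-1) ** (sum(beta[x] * g[x] for x in cube) % 2)

def pi2(beta):
    """The projected map of Equation (2.3): (π_2 β)(y) = Σ_{x ∈ π^{-1}(y)} β(x) over F_2."""
    out = {y: 0 for y in cube_R}
    for x in cube_L:
        out[pi(x)] ^= beta[x]
    return out

# check the identity χ_β(f ∘ π) = χ_{π_2(β)}(f) for one β and every f on F_2^R
beta = {x: x[0] ^ (x[1] & x[2]) for x in cube_L}
for f_vals in itertools.product((0, 1), repeat=len(cube_R)):
    f = dict(zip(cube_R, f_vals))
    f_pi = {x: f[pi(x)] for x in cube_L}
    assert chi(beta, f_pi, cube_L) == chi(pi2(beta), f, cube_R)
```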
     Let L_d denote the space of all polynomials which are the product of exactly d linearly independent
affine forms. The result of Bhattacharyya et al. [7] on testing functions in P_{L−d−1}(F_2^L) shows the
following.

Theorem 2.8 (Theorem 1, [7]; stated here as in Proposition 14, [9]). There is a ρ_0 < 1 such that for every
β ∈ X_d,
                                | E_{ζ←L_d} χ_β(ζ) | ≤ max( 1 − wt(β)/2^d , ρ_0 ) .

3    Outer verifier
In this section we show that the values of CLC instances satisfying all the necessary structural properties
are still hard to approximate. The proof is via a reduction from the 3-coloring problem on graphs and is
essentially due to Holmerin and Khot [13]. Our statement (see Theorem 3.2) allows for a more flexible
setting of the parameters controlling the size of the reduction and proves a few additional properties that
are relevant to its use in Section 4.
     A graph is said to be θ -far from being 3-colorable if every 3-coloring of the vertices has at least a
θ -fraction of the edges between vertices of the same color. The work of Petrank [17] (see also [5]) shows
the following inapproximability of the 3-coloring problem.

Theorem 3.1 (Theorem 3.3, [17]). There is a number θ > 0 such that 3-colorable graphs are NP-hard
to distinguish from graphs that are θ -far from being 3-colorable.

    In what follows, we reserve the symbol n for the number of vertices in the graph we reduce from.
A parameter, say m, that depends on the size of the graph is a function Z+ → Z+ of n, but we omit the
argument writing m instead of m(n).

                        T HEORY OF C OMPUTING, Volume 13 (10), 2017, pp. 1–19                                  7
                                 J OHAN H ÅSTAD AND R AJSEKAR M ONOKARAN

Theorem 3.2. There exist constants c and d_0 such that, for parameters m, d, and t satisfying

                                (m choose d) ≥ n;   d ≥ d_0;   and   q = 2^t ≥ d^c,                        (3.1)

and for any δ > 0, there is a reduction from a graph G on n vertices to a CLC L = (U,V, E, Π, Σ) such
that the statements below are true.

   1. If G is 3-colorable, then val(L) = 1.

   2. If G is θ -far from being 3-colorable, then val(L) ≤ δ (here θ is as in Theorem 3.1).
   3. The instance L is O(log(1/δ)/√q)-mixing and d/q-smooth; further, the marginal distribution of
      v ∈ V obtained from E is uniform.

   4. The left and right bit-lengths (L and R) are at most O(log(1/δ) t d^3).

   5. Each projection constraint π_{uv} : F_2^L → F_2^R in Π is a list of R affine maps F_2^L → F_2.

   6. Each certification constraint σ_u in Σ is decomposable into a list of polynomials p_1, . . . , p_k, each
      belonging to P_2(F_2^L), such that σ_u(x) = 0 if and only if p_i(x) = 0 for every i.

   7. The reduction is deterministic and requires at most (q^m n)^{O(log(1/δ))} steps.

     The rest of this section proves this theorem. The reduction first embeds the coloring problem in the
line-point CLC. We set up some notation before presenting the reduction. For parameters m, d, and t as
in Theorem 3.2, F denotes the field of order 2^t. We need a representation of each element of F by a vector
of t bits and use a natural representation such that field addition becomes exclusive-or of vectors and
multiplication by a fixed field element is a linear transformation. The zero of the field is represented by 0^t
and we assume that the identity is represented by the unit vector e_t with a one in the last component. We
denote these two elements by 0 and 1, respectively. Finally we let 2 denote the field element represented
by e_{t−1}. These three elements are used as colors and a 3-coloring of G is viewed as φ : [n] → {0, 1, 2}.
     A set s ⊆ [m] is also viewed as a point s ∈ F^m where s_k = 1 if k ∈ s and s_k = 0 otherwise; we
typeset the symbol in bold face to emphasize the latter view. Let (S_1(F^m), F^m, E_0, Π_0, Σ_0) denote the line-
point CLC, L^{LP}_{m,d}(F). Each element in U of the CLC we construct extends a line ℓ = S(u_0, u_1) ∈ S_1 into a
flat S(u_0, u_1, s_i, s_j) for some choice of s_i, s_j ∈ F^m. An assignment to this flat is a polynomial in P_d(F^3) that
describes the restriction of a polynomial in P_d(F^m) along the parametric representation of the flat. Note
that different choices of u_0 and u_1 may lead to the same flat and thus we fix some canonical parametric
representation of the flat S(u_0, u_1, s_i, s_j). In particular, given an assignment p ∈ P_d(F^3), p(0, 1, 0) and
p(0, 0, 1) are the values at s_i and s_j respectively.

Procedure 3.3 (Reduction to CLC).
Input: A graph G = ([n], EG ); and parameters m, d and t.
Output: A CLC instance L = (U,V, E, Π, Σ).
  1. Pick a collection F = {s_1, . . . , s_n} of n distinct elements of ([m] choose d).



  2. Set V = F^m, and

                 U = ∪_{(i, j)∈E_G} { S(u_0, u_1, s_i, s_j) | S(u_0, u_1) ∈ S_1(F^m) and S(u_0, u_1, s_i, s_j) a 3-flat } .

      An assignment to v ∈ V is an element of F; thus R = t. Assignments to U are from P_d(F^3) and thus
      L = (d+3 choose 3) t.
  3. Set E to be the distribution that
      (a) samples an edge (i, j) ∈ EG uniformly at random;
      (b) samples (` = S(u0 , u1 ), v) from E0 ;
      (c) picks a random 3-flat u containing ũ = S(u_0, u_1, s_i, s_j); and
      (d) produces (u, v).
  4. Let (x_1, x_2, x_3) map to the point v in the canonical representation of the flat u picked above. Set
     π_{uv} : p ↦ p(x_1, x_2, x_3) for p ∈ P_d(F^3).
  5. Finally, σu maps a polynomial p to zero if and only if p(0, 1, 0) and p(0, 0, 1) are valid and distinct
     colors.

    Note that the collection F exists as long as Equation (3.1) is satisfied. We fix a CLC L obtained from
a graph G and proceed with the analysis.

Lemma 3.4. If G is 3-colorable, then val(L) = 1.

Proof. For a set s ⊆ [m], we write x^s for the monomial ∏_{j∈s} x_j. A 3-coloring of the graph, φ : [n] →
{0, 1, 2}, corresponds to the polynomial p = ∑_{i∈[n]} φ(i) x^{s_i} where the s_i are from the collection F. The
evaluation of this polynomial is the intended assignment to the above instance. If G is 3-colorable, then
setting φ to be a valid coloring assures that the restriction of p to the flat u, p ∘ u, satisfies the constraint
σ_u. Further, since each s_i has exactly d elements, p has degree d. Clearly the assignment v ↦ p(v) for
v ∈ V, and u ↦ p ∘ u for the flats u ∈ U, satisfies the projection constraints π_{uv}.                      □

     Thinking of the set s_i as its characteristic vector, we note that p(s_i) = φ(i). This is mostly a curiosity,
as the value at a specific point is not a “robust” property; it is much more interesting what values p takes
at random points.
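The observation p(s_i) = φ(i) can be sketched concretely. Since distinct d-sets satisfy s_j ⊆ s_i only when s_j = s_i, exactly one monomial survives at each characteristic vector; and because we only evaluate at 0/1 points, field addition in F_{2^t} reduces to XOR of the bit representations of the colors. The concrete sizes and names below are our own illustrative choices.

```python
import itertools

M, D, N = 5, 2, 4  # illustrative sizes; the theorem requires (m choose d) >= n

# a collection F of n distinct d-subsets of [m], as in Procedure 3.3
sets = list(itertools.combinations(range(M), D))[:N]
phi = [0, 1, 2, 1]  # a coloring of the n vertices by the three "color" field elements

def char_vec(s):
    """The characteristic vector of a set s ⊆ [m], viewed as a 0/1 point."""
    return tuple(1 if j in s else 0 for j in range(M))

def p(point):
    """p(x) = Σ_i φ(i) x^{s_i} with x^s = ∏_{j∈s} x_j, evaluated at a 0/1 point.
    Field addition in F_{2^t} is XOR of bit representations, so the surviving
    monomials are summed by XOR."""
    total = 0
    for s, color in zip(sets, phi):
        if all(point[j] for j in s):  # x^s = 1 iff s ⊆ supp(point)
            total ^= color
    return total

# exactly one monomial survives at each characteristic vector, so p(s_i) = φ(i)
assert all(p(char_vec(s)) == c for s, c in zip(sets, phi))
```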
     As a complement to Lemma 3.4 giving completeness, we have the following soundness statement
which is the analogue of item 2 in Theorem 3.2.

Lemma 3.5. There exist γ_0 > 0 and d_0 such that for all γ < γ_0, d > d_0, and G that is γ-far from being
3-colorable, val(L) ≤ 1 − γ^2.

Proof. Suppose Λ = (Λ_L, Λ_R) is an assignment whose value is larger than 1 − γ^2. Then a fraction 1 − γ
of the edges (i, j) ∈ E_G are valuable, by which we mean that the probability of satisfying the projection,
conditioned on picking one of these edges, is at least 1 − γ.
    The distribution E conditioned on an edge (i, j) ∈ E_G is exactly E_0 of L^{LP}_{m,d}(F). Conditioning on a
valuable edge (i, j) and applying Theorem 2.6, we see that Λ_R has agreement at least 7/12 with some p ∈ P_d.
Further, using Lemma 2.5, any other p′ ∈ P_d agrees with p on at most a d/q-fraction of points and thus agrees with Λ_R on at most


a fraction 5/12 + d/q. Thus, for a sufficiently large d_0 (and hence d and q ≥ d^c), the choice of p with
agreement 7/12 with Λ_R is unique. We claim, and prove below, that whenever (i, j) is valuable, p(s_i)
and p(s_j) are valid and distinct colors. This implies that {p(s_i)}_i contradicts G being γ-far from being
3-colorable. To see that p(s_i) and p(s_j) are valid and distinct, suppose for contradiction that they are not.
    As Λ_L(u)(s_i) and Λ_L(u)(s_j) are valid and distinct colors, the restriction of p to u and Λ_L(u) are two
distinct degree-d polynomials, and hence they agree on a fraction at most d/q of the points.
    Let ℓ be the line through s_i and s_j and consider the distribution of u and v. It is easy to see that the
probability that v is on ℓ is O(q^{1−m}) and otherwise v is uniformly distributed on u outside ℓ. As each
point in the space outside ℓ is equally likely to belong to u, the expected agreement of p and
Λ_R on u outside ℓ is at least 7/12 − O(q^{1−m}). As ℓ is a fraction 1/q^2 of u, it follows that the probability
that Λ_L(u) agrees with Λ_R at v is at most 5/12 + O(q^{1−m}) + d/q + 1/q^2. This contradicts the assumption
that (i, j) is valuable, and thus we can conclude that p(s_i) and p(s_j) are valid and distinct colors and the
proof is complete.                                                                                         □

3.1      Gap amplification
To amplify the guarantee from Lemma 3.5, we apply the parallel repetition theorem. We let U^{⊕k} be the
k-fold direct product of U and use the notations V^{⊕k} and E^{⊕k} similarly. Given a CLC L and a positive
integer k, parallel repetition produces an instance L^{⊕k} = (U_k, V_k, E_k, Σ_k, Π_k) obtained by setting:

      – U_k = U^{⊕k}; V_k = V^{⊕k}; E_k the k-wise product distribution E^{⊕k};

      – π_{(u_1,...,u_k)(v_1,...,v_k)} = (π_{u_1 v_1}, . . . , π_{u_k v_k}); and σ_{u_1,...,u_k}(p_1, . . . , p_k) = 0 if and only if σ_{u_i}(p_i) = 0 for all i.
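The product construction above is purely mechanical and can be sketched in a few lines. The dictionary representation of a CLC (edges listing supp(E), a projection function per edge, a certification predicate per left vertex) and all names are our own illustrative choices.

```python
import itertools

def repeat(edges, proj, cert, k):
    """k-fold parallel repetition of a CLC: vertices become k-tuples, the new
    projection applies the k original ones coordinatewise, and the product
    certification returns 0 (accept) iff every coordinate's σ returns 0."""
    edges_k = list(itertools.product(edges, repeat=k))
    def proj_k(edge_tuple):
        fns = [proj[e] for e in edge_tuple]
        return lambda xs: tuple(f(x) for f, x in zip(fns, xs))
    def cert_k(u_tuple):
        return lambda xs: 0 if all(cert[u](x) == 0 for u, x in zip(u_tuple, xs)) else 1
    return edges_k, proj_k, cert_k

# a toy CLC with one edge: π maps x to x mod 2, σ accepts everything
edges = [("u", "v")]
proj = {("u", "v"): lambda x: x % 2}
cert = {"u": lambda x: 0}
edges2, proj2, cert2 = repeat(edges, proj, cert, 2)
assert proj2(edges2[0])((3, 4)) == (1, 0)
assert cert2(("u", "u"))((7, 7)) == 0
```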
   The CLC problem, akin to Label Cover, falls in the category of 2-prover 1-round projection games
and thus we have the following guarantee on val(L⊕k ) from the work of Rao [19].
Theorem 3.6 (Theorem 4, [19]; restated for CLCs). There is an α > 0 such that if a CLC instance L has
value at most 1 − γ, then ∀k ∈ Z+, we have val(L⊕k) ≤ (1 − γ/2)^{αγk}. In particular, if

                                    k ≥ 2 log(1/δ) / (αγ²),

then val(L⊕k) ≤ δ.
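The "in particular" part follows from the standard estimate 1 − x ≤ e^{−x} (a one-line check, with log denoting the natural logarithm):

```latex
\Bigl(1-\frac{\gamma}{2}\Bigr)^{\alpha\gamma k}
  \;\le\; e^{-(\gamma/2)\,\alpha\gamma k}
  \;=\; e^{-\alpha\gamma^{2}k/2}
  \;\le\; e^{-\log(1/\delta)}
  \;=\; \delta .
```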
    Further, as shown by Holmerin and Khot [13], the smoothness is preserved while mixing deteriorates
linearly with k.
Theorem 3.7 (Theorem B.1, [13]). If L is ξ -smooth, then so is L⊕k for any positive integer k.
Theorem 3.8 (Theorem B.5, [13]). If L is η-mixing, then L⊕k is kη-mixing for any k ∈ Z+ .
      We now finish the proof of the main theorem of this section.

Proof of Theorem 3.2. Set d0 as in Lemma 3.5 and fix parameters m, d, and t satisfying the premises and
fix a δ > 0. Set γ = max{γ0, θ} (γ0 as in Lemma 3.5, and θ as in Theorem 3.1) and set

                                    k = 2 log(1/δ) / (αγ⁴)


with α as in Theorem 3.6. The reduction transforms G to L by running Procedure 3.3 and produces L⊕k .
We show that L⊕k satisfies all the properties stated.
    Indeed, if G is 3-colorable, then Lemma 3.4 gives an assignment (ΛL, ΛR) that satisfies the constraints
in L. The assignment (ΛL^{(k)}, ΛR^{(k)}), where

                         ΛL^{(k)}(u1, . . . , uk) = (ΛL(u1), . . . , ΛL(uk))

and similarly ΛR^{(k)}(v1, . . . , vk) = (ΛR(v1), . . . , ΛR(vk)), satisfies all the constraints in L⊕k. Thus we have
item 1.
    To establish item 2 we note that, by Lemma 3.5, if G is θ-far from 3-colorable then val(L) ≤ 1 − γ²
and hence by Theorem 3.6 and the choice of k we get the desired conclusion, and we proceed to consider
item 3.
    The projection constraints of L are evaluations of the polynomials in Pd(F^3) at specific points. Thus,
smoothness of L follows immediately from Lemma 2.5, and applying Theorem 3.7 proves it for L⊕k.
For each choice of an edge (i, j) in the distribution E, the tests are a copy of LLP_{m,d}(F) over the same set
V = V0. Thus, mixing follows from Lemma 2.7 and Theorem 3.8. The claim on the marginal distribution
follows from uniformity of points on a random line. This proves item 3.
    An assignment to an element in V⊕k is a list of k field elements and hence R = kt. Similarly, an
element in Pd(F^3) is identified by specifying (d+3 choose 3) elements and hence L = (d+3 choose 3)·tk bits encode an
assignment to U⊕k. Thus, we have item 4.
    Recall that the elements of F are represented by vectors in F_2^t where field additions correspond to
vector additions while multiplication by an element is multiplication by an invertible matrix. Thus, if the
polynomial is written as a vector in F_2^L, the evaluation at a fixed point is a multiplication by a matrix in
F_2^{R×L}. This proves item 5.
    Given a p ∈ Pd(F^3), verifying the constraint σu for u ∈ U requires verifying that p(0, 1, 0) and
p(0, 0, 1) are valid and distinct colors. A color y ∈ F_2^t is valid if all but the last 2 bits are 0 and if the last
two bits are not both 1. The former constraints are linear and the latter is a degree-2 constraint, and thus
validity may be encoded as a list of t − 1 polynomial equations in the coefficients of p, all of degree at most two.
We have that x, y ∈ F_2^t, both coding correct colors, are distinct iff (1 + x_t + y_t) · (1 + x_{t−1} + y_{t−1}) = 0.
Each of these bits is an affine form in the bits representing the polynomial p and hence we have a list
of 2(t − 1) + 1 polynomials (2(t − 2) affine polynomials and 3 quadratic polynomials) that encode the
coloring constraints. Concatenating such lists obtained for different ui yields the necessary list for a
u ∈ U⊕k, hence proving item 6.
    Finally, the size of the set U is O(n² · q^{2m}) and hence the size of L⊕k is at most (q^{2m} n²)^{O(k)}. This
proves item 7, concluding the proof.                                                                        □


4    Inner verifier
In this section, we prove the main theorem by reducing from CLC to Max-Hom-Bal-3-LIN. The reduction
replaces each element of U and V with the low-degree long code (a.k.a. the short code) introduced in
the work of Barak et al. [6] and uses folding ideas from the work of Dinur and Guruswami [9]. We now
describe these concepts, and follow it up with our reduction and the analysis.


    We have a parameter s, to be determined later, governing our construction. The low-degree long code
encoding of an element a ∈ F_2^L is the map SC_s^{(a)} : Ps(F_2^L) → F_2 defined as:

                                    SC_s^{(a)}(g) ≜ g(a) .                                      (4.1)
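For intuition, a toy version of this encoding can be computed explicitly. The representation below (multilinear monomials as index sets, polynomials as sets of monomials) is our own illustrative choice, not the paper's formalism:

```python
from itertools import chain, combinations

# Toy low-degree long code over F_2 for tiny parameters: L variables and
# degree bound s.  The code word of a point a in F_2^L is the evaluation
# table g -> g(a) over all multilinear polynomials g of degree at most s.
L, s = 3, 2

# monomials of degree <= s, represented as frozensets of variable indices
monomials = [frozenset(c) for d in range(s + 1)
             for c in combinations(range(L), d)]

def eval_poly(poly, a):
    """Evaluate a polynomial (a collection of monomials) at a, over F_2."""
    return sum(all(a[i] for i in m) for m in poly) % 2

def short_code(a):
    """One table entry per degree-<= s polynomial; exponentially smaller
    than the full long code, which indexes over all functions of a."""
    polys = chain.from_iterable(
        combinations(monomials, r) for r in range(len(monomials) + 1))
    return {frozenset(p): eval_poly(p, a) for p in polys}

table = short_code((1, 0, 1))
```

With L = 3 and s = 2 there are 7 monomials, so the table has 2^7 = 128 entries, one per degree-≤2 polynomial.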
    For each u ∈ U, the constraint σu can be decomposed to a list of polynomials p1, . . . , pk ∈ P2(F_2^L) as
in item 6 of Theorem 3.2. Define the subspaces²

            Iu = { p ∈ Ps(F_2^L) | p = ∑ pi qi ; qi ∈ Ps−2(F_2^L) }        for each u ∈ U.

Let P(u) denote the cosets of Iu in Ps(F_2^L). Note that if a ∈ F_2^L satisfies σu(a) = 0, then SC_s^{(a)}(f) =
SC_s^{(a)}(f′) whenever f and f′ belong to the same coset. The reduction uses this to replace each vertex u by a table
indexed by P(u) instead of a complete low-degree code. This is viewed as the complete code folded so
that elements belonging to a fixed coset are implicitly mapped to a single element in P(u).
Procedure 4.1 (Reduction to Max-Hom-Bal-3-LIN).
Input: A CLC L = (U, V, E, Π, Σ); and a parameter s, where the functions in Π are affine and the
constraints in Σ are polynomials of degree at most 2.
Output: A Max-Hom-Bal-3-LIN instance I = (X, C) where

                    X = { (u, g) | u ∈ U, g ∈ P(u) } ∪ ( V × Ps(F_2^R) )

and C is the distribution that
  1. samples a constraint (u, v) ← E ;
  2. samples g ← Ps(F_2^L) and h ← Ps(F_2^R) at random;
  3. samples w = 2^{3s/4} independent ζ1, . . . , ζw ← Ls(F_2^L) ; sets ζ = ζ1 + · · · + ζw ; and
  4. produces the tuple ((u, g), (v, h), (u, g + ζ + h ◦ πuv)).
   The reduction is well-defined for inputs L guaranteed by Theorem 3.2 as item 5 ensures that h ◦ π is
degree-s if h is degree-s. We analyze the value of the instance in relation to the value of L.
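The shape of the produced equations can be illustrated with function tables over a tiny stand-in domain. The real construction draws g, h, and ζ from low-degree polynomials and π from the affine projections of L; the domains and the projection below are hypothetical:

```python
import random

# Toy rendition of a single sampled constraint, with g, h, and the noise
# zeta represented as full evaluation tables over tiny stand-in domains.
random.seed(0)
domain_L = range(8)                      # stand-in for F_2^L
domain_R = range(4)                      # stand-in for F_2^R
pi = {x: x % 4 for x in domain_L}        # illustrative projection

g = {x: random.randrange(2) for x in domain_L}
h = {y: random.randrange(2) for y in domain_R}
zeta = {x: random.randrange(2) for x in domain_L}

# the third queried table: g + zeta + h o pi, pointwise over F_2
third = {x: (g[x] + zeta[x] + h[pi[x]]) % 2 for x in domain_L}

# At every point a, the three queried values satisfy the homogeneous
# relation g(a) + h(pi(a)) + (g + zeta + h o pi)(a) = zeta(a) (mod 2),
# so the produced equation x1 + x2 + x3 = 0 holds wherever zeta vanishes.
for a in domain_L:
    assert (g[a] + h[pi[a]] + third[a]) % 2 == zeta[a]
```

Note that the equation is homogeneous by construction: the right-hand side is always 0, and the noise ζ only determines at which points the intended assignment fails it.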
Lemma 4.2 (Completeness). If L has an assignment (ΛL, ΛR) whose value is 1, then there is a balanced
assignment to I with value at least 1 − 2^{−s/4}.
Proof. Consider the assignment: (u, g) → SC_s^{(ΛL(u))}(g); (v, h) → SC_s^{(ΛR(v))}(h). Since Λ satisfies the
certification constraints, and hence p(ΛL(u)) = 0 for any p ∈ Iu, the assignment is well-defined. Further,
(u, g) and (u, 1 + g), as well as (v, h) and (v, 1 + h), take distinct values and hence the assignment is
balanced. Under this assignment, (u, g + ζ + h ◦ πuv) gets the value

            (g + ζ + h ◦ πuv)(ΛL(u)) = g(ΛL(u)) + ζ(ΛL(u)) + h ◦ πuv(ΛL(u)) .

Now, h ◦ πuv(ΛL(u)) = h(ΛR(v)) whenever the constraint πuv is satisfied. Further, each ζi ∈ Ls(F_2^L) is
non-zero on a fraction 2^{−s} of the inputs, and thus ζ(ΛL(u)) is non-zero on at most a fraction w2^{−s} of
inputs. Hence, the constraints are satisfied with probability at least 1 − w2^{−s} = 1 − 2^{−s/4}.          □
    ² For the values i such that pi is linear we could allow qi to be of degree s − 1, but as this does not greatly improve the result
and just causes notational problems, we ignore this point.



    Next, we analyze a generic assignment λ to I taking values 0 and 1, assuming val(L) is small. For
each u ∈ U, set λu : Ps(F_2^L) → {+1, −1} as λu(f) = (−1)^{λ((u, f))} and similarly λv : Ps(F_2^R) → {+1, −1}
for each v ∈ V.
    We show that val(λ; I) is close to the quantity Pr_{v,h}{λv(h) = 1}. In the proof of the main theorem,
we show that L can be amended (by making many copies of each v) so that any balanced assignment λ
satisfies Pr_{v,h}{λv(h) = 1} ≈ 1/2. This proves our main inapproximability result in light of Theorem 3.2
and Theorem 3.1.
Theorem 4.3 (Soundness). For every η, ξ > 0, and every integer s, if L is η-mixing and ξ-smooth, then
every assignment λ to the instance I output by Procedure 4.1 satisfies

    val(λ; I) ≤ max( 1/2 , Pr_{(·,v)←E, h←Ps(F_2^R)}{λv(h) = 1} ) + exp(−2^{s/4}) + η + 2^s ξ + 2^{s/2}√(val(L)) .

Proof. Fix an assignment λ = {λu}_{u∈U} ∪ {λv}_{v∈V}. Writing out the Fourier expansion, we have:

    val(λ; I) = 1/2 + 1/2 · E_{u,v,g,h,ζ}[ λu(g) λv(h) λu(g + ζ + h ◦ πuv) ]
              = 1/2 + 1/2 · E_{u,v}[ ∑_{α,β,γ} λ̂u(α) λ̂v(β) λ̂u(γ) · E_{g,h,ζ}[ χα(g) χβ(h) χγ(g + h ◦ πuv + ζ) ] ]
              = 1/2 + 1/2 · E_{u,v}[ ∑_α (λ̂u(α))² λ̂v(π2(α)) E_ζ[χα(ζ)] ] .                     (4.2)

The last equality follows as, if α ≠ γ, then taking the expectation over g yields 0 and, similarly, looking at the
expectation over h, only terms with π2(α) = β are nonzero.
    We recall that α ∈ Xs(F_2^L) and bound the expectation in Equation (4.2) by a case distinction on
wt(α).

Case 1: wt(α) ≥ 2^{s/2}. The contribution of these terms is bounded by applying Theorem 2.8. Indeed, for
such an α, we have |E_{ζi←Ls}[χα(ζi)]| ≤ 1 − 2^{−s/2}. Since ζ is a sum of 2^{3s/4} such independent terms, we
have:

    | ∑_{α: wt(α)≥2^{s/2}} (λ̂u(α))² λ̂v(π2(α)) E_ζ[χα(ζ)] | ≤ ∑_{α: wt(α)≥2^{s/2}} (λ̂u(α))² (1 − 2^{−s/2})^{2^{3s/4}}    (4.3)
                                                            ≤ (1 − 2^{−s/2})^{2^{3s/4}} ≤ exp(−2^{s/4}) .                  (4.4)
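The final inequality of Equation (4.4) is again the standard estimate 1 − x ≤ e^{−x}:

```latex
\bigl(1-2^{-s/2}\bigr)^{2^{3s/4}}
  \;\le\; \exp\!\bigl(-2^{-s/2}\cdot 2^{3s/4}\bigr)
  \;=\; \exp\!\bigl(-2^{s/4}\bigr).
```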

Case 2: wt(α) = 0. Here we use the mixing property. Since L is η-mixing,

    E_{u,v}[ (λ̂u(∅))² λ̂v(∅) ] ≤ E_u[ max(0, E_{v|u}[λ̂v(∅)]) ]
                                ≤ η + max(0, E_v[λ̂v(∅)]) = η + max(0, E_{v,h}[λv(h)])          (4.5)




Case 3: 0 < wt(α) < 2^{s/2}; wt(π2(α)) = 0. Here we use smoothness. A necessary condition for π2(α) =
∅ is that at least two distinct elements of α map to the same element. Since L is ξ-smooth, for a fixed
u and α, the probability over the choice of v that any two elements of α map to the same element is
bounded by wt(α)²ξ ≤ 2^s ξ.

Case 4: 0 < wt(α) < 2^{s/2}; wt(π2(α)) ≠ 0. Here we use a standard list-decoding argument as done by
Dinur and Guruswami [9], where they prove the following lemma.

Lemma 4.4 (see Lemma 13, [9]).

    E_{u,v}[ ∑_{wt(α)<2^{s/2}, π2(α)≠∅} (λ̂u(α))² (λ̂v(π2(α)))² ] ≤ 2^{s/2} val(L).


In view of the above lemma, and using Cauchy-Schwarz, we see that:

    | E_{u,v}[ ∑_{wt(α)<2^{s/2}, π2(α)≠∅} (λ̂u(α))² λ̂v(π2(α)) E_ζ[χα(ζ)] ] |
        ≤ √( E_{u,v}[ ∑_{wt(α)<2^{s/2}, π2(α)≠∅} (λ̂u(α))² (λ̂v(π2(α)))² ] )
        ≤ √( 2^{s/2} val(L) ) .                                                                 (4.6)

    These exhaust all the cases and substituting these bounds in Equation (4.2), we have:

    val(λ; I) = 1/2 + 1/2 · E_{u,v}[ ∑_α (λ̂u(α))² λ̂v(π2(α)) E_ζ[χα(ζ)] ]
              ≤ 1/2 + 1/2 · ( exp(−2^{s/4}) + η + max(0, E_{v,h}[λv(h)]) + 2^s ξ + √(2^s val(L)) )
              ≤ max( 1/2 , Pr_{(·,v)←E, h←Ps(F_2^R)}{λv(h) = 1} ) + exp(−2^{s/4}) + η + 2^s ξ + √(2^s val(L)) .    □

    Now, we prove the main theorem.

Proof of Theorem 1.1. Fix a language L ∈ NP. We write |w| to denote the size of an instance w ∈ L.
Applying Theorem 3.1, we have a graph G which is 3-colorable if w ∈ L and θ-far from being 3-colorable
if w ∉ L. Further, n = |G| = |w|^{O(1)}.
    For an ε > 0, set s = 4 log(1/ε) so that Lemma 4.2 applied to an L with value 1 shows that val(I) ≥
1 − ε. Set d = Ω(log(n)/s), m = d 3s , and q = 2^t = d^c so that the pre-conditions (Equation (3.1)) of
Theorem 3.2 are satisfied and apply this theorem with δ = ε⁷. We note that while ε and hence s are
constants, d and hence q are ω(1) and hence both the mixing η and the smoothness ξ are o(1).
    Running the reduction promised in Theorem 3.2 yields a CLC L which, in turn, yields a Max-Hom-
Bal-3-LIN instance I through Procedure 4.1. Set Γ = |I|/(ε²|V|) and construct L′ by making Γ identical


copies of each v ∈ V in L. Let I′ be the output of Procedure 4.1 run on L′. If w ∈ L, then L and hence L′
have value 1. Thus, from Lemma 4.2, we know that val(I′) ≥ 1 − ε.
    We claim that if λ is a balanced assignment to I′, then

                        Pr_{(·,v)←E, h←Ps(F_2^R)} {λv(h) = 1} ≤ 1/2 + ε²


as the variables (u, ·) account only for an ε² fraction of all variables. Now, applying Theorem 4.3 yields
that val(I′) ≤ 1/2 + ε asymptotically. Indeed, both the terms η and 2^s ξ are o(1) (as a function of n) while
exp(−2^{s/4}) and 2^{s/2}√(val(L)) are O(1) as a function of n but o(ε) as a function of ε.
    Finally, we verify the claim on the size of the reduction. From item 7 of Theorem 3.2, we know that
|L| ≤ (q^m n)^{O(log(1/δ))} ≤ exp(O(log n log(1/ε))). The length of the low-degree long code is at most 2^{L^{O(s)}}
and thus |I| ≤ exp((log n)^{O(log(1/ε))}). Therefore,

            |I′| ≤ |L| · (|I|/ε²) · exp(L^{O(s)}) ≤ exp( (log n)^{O(log(1/ε))} ).              □


5    Hardness for Max-Bisection
Let us just observe that, as also pointed out in [13], we can use the gadget of [20] to get hardness for
Max-Bisection.

Theorem 5.1. For any ε > 0, there is no algorithm that runs in time exp((log n)^{O(1)}) and approximates
Max-Bisection within 15/16 + ε unless NP ⊆ DTIME( exp( (log n)^{O(1)} ) ).

Proof. Let us describe the properties needed of the gadget-reduction of [20]. It introduces a global
variable z and for each equation of the form x1 + x2 + x3 = 0 it introduces 8 auxiliary variables (yi)_{i=0}^{7}
which are local to this equation and outputs 18 equations of the form a + b = 1. The constructed equations
have the property that if z = 0 and the equation x1 + x2 + x3 = 0 is satisfied, then there is a setting of the
auxiliary variables satisfying 16 equations. On the other hand, if z = 0 and x1 + x2 + x3 = 1, then the best
setting of the auxiliary variables satisfies 14 equations. Note that if we complement all variables then this
preserves any sum of two variables and thus it is possible to focus on the case when z = 0.
    Now we start with an instance of Max-Hom-Bal-3-LIN constructed to prove Corollary 1.2. Suppose
that it has N variables and contains M equations. If we straightforwardly apply the gadget construction to
each equation we get a new system of equations, each of the form a + b = 1, such that this system
contains 1 + N + 8M variables and 18M equations.
     Clearly the condition a + b = 1 is a cut condition and we need to make sure to address the bisection
property. In the constructed instance the new auxiliary variables are in majority but we change this by
simply making copies of the original variables.
    Let T be a parameter, to be determined, and let us make T copies (xi^j)_{j=0}^{T−1} of each original variable
xi. Now for each original equation xi1 + xi2 + xi3 = 0 we create T³ gadgets, all using the same auxiliary
variables while using all different combinations of copies of the original variables. This way we get an
instance with TN + 8M + 1 variables and 18T³M equations. Let us analyze this reduction.
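The bookkeeping for the copied construction is simple arithmetic; the sketch below just recomputes the counts stated above for illustrative values of N, M, and T:

```python
# Recomputing the counts of the copied construction: T copies of each of
# the N original variables, T^3 gadgets per original equation (18
# cut-equations each, sharing that equation's 8 auxiliaries), plus the
# global variable z.
def copied_instance_size(N, M, T):
    variables = T * N + 8 * M + 1
    equations = 18 * T**3 * M
    return variables, equations

vars_count, eqs_count = copied_instance_size(N=100, M=300, T=10)
```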


    If there is a balanced solution xi = ai to the original set of equations that satisfies (1 − ε)M equations
then we make a solution to the new system by setting z = 0, xi^j = ai for all j and giving the auxiliary
variables their optimal values. This satisfies at least 16(1 − ε)T³M equations. The solution is not, however,
completely balanced as we have no control over the balance of the auxiliary variables. However, by
changing the value of (all copies of) at most 8M/T original variables we can make the solution balanced.
Taking the least frequently occurring variables this may yield another 16T³M · 16M/(NT) falsified
equations, and we conclude that the system has a balanced assignment satisfying

                                    16 ( 1 − ε − 16M/(NT) ) T³M

equations. Provided that T ≥ 16M/(Nε) this number is at least 16(1 − 2ε)T³M.
    Now suppose we have an assignment that satisfies at least (15 + δ)T³M of the produced cut-equations.
First we claim that we can modify the assignment such that, for all but one value of i, all copies of xi take the
same value, maintaining at least as many satisfied equations.
    Indeed, suppose that both xi and xi′ have copies taking both values. Suppose each xi^j has a neighbors
taking the value 0 and b neighbors taking the value 1. Note that, by construction, these numbers do not
depend on j. Let the similar numbers for xi′ be a′ and b′. Now suppose, without loss of generality, that
a + b′ − (a′ + b) ≥ 0 and consider what happens if we change the value of a copy of xi from 0 to 1
and a copy of xi′ from 1 to 0. For notational simplicity let us assume that these copies are xi^0 and xi′^0.
    Suppose there are t equations containing both xi^0 and xi′^0. Then the number of satisfied equations
containing either variable before the switch is b + a′ − t and after the switch a + b′ + t. As t ≥ 0 we see
that we satisfy at least as many equations after the switch. By repeatedly doing similar switches we arrive
at a solution with the same number of ones, that satisfies at least as many equations, and such that all
but at most one of the original variables have all their copies take the same value.
    Let us consider the assignment to the original system given by this unanimous value (taking either
value for the "mixed-up" variable).
    First we claim that this assignment is balanced within relative error O(M/(NT)). This follows as
the assignment to all the variables was balanced and the fraction of auxiliary variables is O(M/(NT)).
Furthermore, it still satisfies (15 + δ)T³M equations and hence the just defined assignment to the original
variables must, by the property of the gadget, satisfy at least (1/2 + δ)M original equations. Setting
T = ω(M/(δN)), and adjusting the values of ε and δ, Theorem 4.3 is sufficient to establish the quality
of the reduction.
    The blow-up in size is only polynomial and hence we have established Theorem 5.1.                       □


6    Conclusion
Our paper proves tight inapproximability of Max-Hom-Bal-3-LIN assuming that NP-hard languages do
not have quasipolynomial time algorithms. A standard NP-hardness of approximation still eludes us.
    A situation where our understanding is even more limited is Min-Bisection where further progress in
the form of a good algorithm or a stronger hardness result is badly needed.


Acknowledgment. We are grateful to three referees for a very careful reading of a preliminary version
of the paper and in particular for pointing out a difference in two probability distributions claimed to be
the same. We also thank one referee for reminding us of the reduction to Max-Bisection given in [13].


References
 [1] S ANJEEV A RORA AND M ADHU S UDAN: Improved low-degree testing and its applications. Combi-
     natorica, 23(3):365–426, 2003. Preliminary version in STOC’97. [doi:10.1007/s00493-003-0025-0]
     5, 6

 [2] P ER AUSTRIN , S IAVOSH B ENABBAS , AND KONSTANTINOS G EORGIOU: Better balance by being
     biased: A 0.8776-approximation for max bisection. ACM Trans. Algorithms, 13(1):2:1–2:27, 2016.
     Preliminary version in SODA’13. [doi:10.1145/2907052, arXiv:1205.0458] 2

 [3] P ER AUSTRIN AND J OHAN H ÅSTAD: Randomly supported independence and resistance. SIAM J.
     Comput., 40(1):1–27, 2011. Preliminary version in STOC’09. [doi:10.1137/100783534] 2

 [4] P ER AUSTRIN AND E LCHANAN M OSSEL: Approximation resistant predicates from pairwise
     independence. Comput. Complexity, 18(2):249–271, 2009. Preliminary version in CCC’08.
     [doi:10.1007/s00037-009-0272-6, arXiv:0802.2300] 2

 [5] P ER AUSTRIN , RYAN O’D ONNELL , L I -YANG TAN , AND J OHN W RIGHT: New NP-hardness
     results for 3-coloring and 2-to-1 label cover. ACM Trans. Comput. Theory, 6(1):2:1–2:20, 2014.
     Preliminary version in APPROX’12. [doi:10.1145/2537800, arXiv:1210.5648] 7

 [6] B OAZ BARAK , PARIKSHIT G OPALAN , J OHAN H ÅSTAD , R AGHU M EKA , P RASAD R AGHAVEN -
     DRA , AND DAVID S TEURER : Making the long code shorter. SIAM J. Comput., 44(5):1287–1324,
     2015. Preliminary version in FOCS’12. [doi:10.1137/130929394, arXiv:1111.0405] 3, 11

 [7] A RNAB B HATTACHARYYA , S WASTIK KOPPARTY, G RANT S CHOENEBECK , M ADHU S UDAN ,
     AND DAVID Z UCKERMAN : Optimal testing of Reed-Muller codes. In Proc. 51st FOCS, pp. 488–
     497. IEEE Comp. Soc. Press, 2010. Another version appears as a chapter in Property Testing, O.
     Goldreich, ed., LNCS vol. 6390, 2010, pp. 269–275. [doi:10.1109/FOCS.2010.54, arXiv:0910.0641]
     7

 [8] S IU O N C HAN: Approximation resistance from pairwise independent subgroups. J. ACM,
     63(3):27:1–27:32, 2016. Preliminary version in STOC’13. [doi:10.1145/2873054] 2

 [9] I RIT D INUR AND V ENKATESAN G URUSWAMI: PCPs via low-degree long code and hardness for
     constrained hypergraph coloring. Israel J. Math., 209(2):611–649, 2015. Preliminary version in
     FOCS’13. [doi:10.1007/s11856-015-1231-3] 3, 7, 11, 14

[10] M ICHEL X. G OEMANS AND DAVID P. W ILLIAMSON: Improved approximation algorithms for
     maximum cut and satisfiability problems using semidefinite programming. J. ACM, 42(6):1115–
     1145, 1995. Preliminary version in STOC’94. [doi:10.1145/227683.227684] 2


[11] V ENKATESAN G URUSWAMI , P RAHLADH H ARSHA , J OHAN H ÅSTAD , S RIKANTH S RINI -
     VASAN , AND G IRISH VARMA : Super-polylogarithmic hypergraph coloring hardness via low-
     degree long codes. SIAM J. Comput., 46(1):132–159, 2017. Preliminary version in STOC’14.
     [doi:10.1137/140995520, arXiv:1311.7407] 3

[12] J OHAN H ÅSTAD: Some optimal inapproximability results. J. ACM, 48(4):798–859, 2001. Prelimi-
     nary version in STOC’97. [doi:10.1145/502090.502098] 2, 3

[13] J ONAS H OLMERIN AND S UBHASH K HOT: A new PCP outer verifier with applications to homo-
     geneous linear equations and max-bisection. In Proc. 36th STOC, pp. 11–20. ACM Press, 2004.
     [doi:10.1145/1007352.1007362] 2, 3, 7, 10, 15, 17

[14] S UBHASH K HOT: Ruling out PTAS for graph min-bisection, dense k-subgraph, and bipar-
     tite clique. SIAM J. Comput., 36(4):1025–1071, 2006. Preliminary version in FOCS’04.
     [doi:10.1137/S0097539705447037] 2, 3

[15] S UBHASH K HOT, G UY K INDLER , E LCHANAN M OSSEL , AND RYAN O’D ONNELL: Optimal
     inapproximability results for MAX-CUT and other 2-variable CSPs? SIAM J. Comput., 37(1):319–
     357, 2007. Preliminary version in FOCS’04. [doi:10.1137/S0097539705447372] 2

[16] S UBHASH K HOT, M ADHUR T ULSIANI , AND P RATIK W ORAH: A characterization of
     strong approximation resistance. In Proc. 46th STOC, pp. 634–643. ACM Press, 2014.
     [doi:10.1145/2591796.2591817, arXiv:1305.5500] 2

[17] E REZ P ETRANK: The hardness of approximation: Gap location. Comput. Complexity, 4(2):133–157,
     1994. Preliminary version in ISTCS’93. [doi:10.1007/BF01202286] 7

[18] H ARALD R ÄCKE: Optimal hierarchical decompositions for congestion minimization in networks.
     In Proc. 40th STOC, pp. 255–264. ACM Press, 2008. [doi:10.1145/1374376.1374415] 2

[19] A NUP R AO: Parallel repetition in projection games and a concentration bound. SIAM J. Comput.,
     40(6):1871–1891, 2011. Preliminary version in STOC’08. [doi:10.1137/080734042] 10

[20] L UCA T REVISAN , G REGORY B. S ORKIN , M ADHU S UDAN , AND DAVID P. W ILLIAMSON:
     Gadgets, approximation, and linear programming. SIAM J. Comput., 29(6):2074–2097, 2000.
     Preliminary version in FOCS’96. [doi:10.1137/S0097539797328847] 15


AUTHORS

      Johan Håstad
      Professor
      KTH Royal Institute of Technology, Sweden.
      johanh kth se
      http://www.csc.kth.se/~johanh




    Rajsekar Manokaran
    Assistant professor
    Indian Institute of Technology, Madras, India.
    rajsekar gmail com
    http://www.cse.iitm.ac.in/~rajsekar


ABOUT THE AUTHORS

    J OHAN H ÅSTAD received his Bachelor of Science from Stockholm University in 1981, his
        Master of Science from Uppsala University in 1984, and his Ph. D. from MIT in 1986
        under the supervision of Shafi Goldwasser. Johan was appointed Associate Professor
        at the Royal Institute of Technology in Stockholm, Sweden in 1988 and advanced
        to the level of Professor in 1992. He was elected a member of the Swedish Royal
        Academy of Sciences in 2001. He has research interests within several subareas of
        Theory of Algorithms and Complexity theory but has recently mainly focused on the
        inapproximability of NP-hard optimization problems.


    R AJSEKAR M ANOKARAN graduated from Princeton in 2012; his advisor was Sanjeev Arora.
       The subject of his thesis was the inapproximability of constraint satisfaction problems
       using convex relaxations. He was appointed as an assistant professor at the Indian
       Institute of Technology in Madras, India in 2013.



