                        THEORY OF COMPUTING, Volume 11 (10), 2015, pp. 257–283
                                       www.theoryofcomputing.org

                          SPECIAL ISSUE: APPROX-RANDOM 2013



    On the NP-Hardness of Approximating
   Ordering-Constraint Satisfaction Problems
               Per Austrin∗                 Rajsekar Manokaran†               Cenny Wenner‡
                 Received October 2, 2013; Revised November 9, 2014; Published June 18, 2015




       Abstract: We show improved NP-hardness of approximating Ordering-Constraint Satis-
       faction Problems (OCSPs). For the two most well-studied OCSPs, MAXIMUM ACYCLIC
       SUBGRAPH and MAXIMUM BETWEENNESS, we prove NP-hard approximation factors of
       14/15 + ε and 1/2 + ε, respectively. When it is hard to approximate an OCSP by a constant
       better than taking a uniform random ordering, the OCSP is said to be approximation
       resistant. We show that the MAXIMUM NON-BETWEENNESS problem is approximation
       resistant and that there are width-m approximation-resistant OCSPs accepting only a fraction
       1/⌊m/2⌋! of assignments. These results provide the first examples of approximation-resistant
       OCSPs subject only to P ≠ NP.

ACM Classification: F.2.2
AMS Classification: 68Q17, 68W25
Key words and phrases: acyclic subgraph, APPROX, approximation, approximation resistance, be-
tweenness, constraint satisfaction, CSPs, feedback arc set, hypercontractivity, NP-completeness, orderings,
PCP, probabilistically checkable proofs



    An extended abstract of this paper appeared in the 16th International Workshop on Approximation Algorithms for
Combinatorial Optimization Problems (APPROX 2013) [4].
  ∗ Supported by Swedish Research Council Grant 621-2012-4546.
  † Supported by ERC Advanced Grant 226203.
  ‡ Supported by ERC Advanced Grant 226203.



© 2015 Per Austrin, Rajsekar Manokaran, and Cenny Wenner
     Licensed under a Creative Commons Attribution License (CC-BY)                     DOI: 10.4086/toc.2015.v011a010

1    Introduction

We study the NP-hardness of approximating a rich class of optimization problems known as
Ordering-Constraint Satisfaction Problems (OCSPs). An instance of an OCSP is given by a set X of
variables and a set C of local ordering constraints, where each constraint is specified by a set of
permitted permutations on a subset of the variables. The objective is to find an ordering of X
maximizing the fraction of constraints satisfied by the induced local permutations.
     A simple example of an OCSP is the problem MAXIMUM ACYCLIC SUBGRAPH (MAS) where one
is given a directed graph G = (V, A) with the task of finding an acyclic subgraph of G containing as
many arcs as possible. Phrased as an OCSP, V is the set of variables and each arc u → v is a
constraint "u ≺ v" dictating that u should precede v. The maximum fraction of constraints simulta-
neously satisfiable by an ordering of V is then precisely the fraction of arcs in a largest acyclic
subgraph. Since each constraint involves exactly two variables, MAS is an OCSP of width two.

[Figure 1: An MAS instance with value 5/6.]

Another example of an OCSP is the MAXIMUM BETWEENNESS (MAX BTW) problem [12]. In this
width-three OCSP, a constraint on a triplet of variables (x, y, z) is satisfied by the local orderings
x ≺ z ≺ y and y ≺ z ≺ x; in other words, z has to be between x and y, giving rise to the name of
the problem.
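As a concrete illustration, the MAS and Betweenness objectives described above can be evaluated directly. The following is a minimal sketch (the helper names and the toy instance are ours, not from the paper; the brute-force `opt` is only feasible for tiny instances):

```python
from itertools import permutations

def mas_value(ordering, arcs):
    """Fraction of arcs (u, v) with u placed before v in the ordering."""
    pos = {v: i for i, v in enumerate(ordering)}
    return sum(pos[u] < pos[v] for u, v in arcs) / len(arcs)

def btw_value(ordering, triples):
    """Fraction of Betweenness constraints (x, y, z) with z strictly
    between x and y (in either direction)."""
    pos = {v: i for i, v in enumerate(ordering)}
    ok = 0
    for x, y, z in triples:
        lo, hi = min(pos[x], pos[y]), max(pos[x], pos[y])
        ok += lo < pos[z] < hi
    return ok / len(triples)

def opt(variables, constraints, value_fn):
    """Brute-force optimum over all orderings (exponential; toy sizes only)."""
    return max(value_fn(list(p), constraints) for p in permutations(variables))
```

For instance, a directed triangle has MAS value 2/3: every linear order of its three vertices leaves exactly one back arc.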
     Determining the optimal value of an MAS instance is NP-hard and one turns to approximations.
An algorithm is called a c-approximation if, when applied to an instance I, the algorithm is guaranteed
to produce an ordering O satisfying a fraction of constraints within a factor c of the optimum, i. e.,
val(O; I) ≥ c · val(I). As in the case of classical constraint satisfaction, every OCSP admits a naive
approximation algorithm which picks an ordering of X uniformly at random without even looking at the
constraints. For example, this algorithm yields a 1/2-approximation in expectation for MAS.
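The random-ordering baselines quoted in this paper (1/2 for MAS, and later 1/3 for Betweenness and 2/3 for Non-Betweenness) follow by symmetry: a uniform random ordering induces a uniform local pattern, so a width-m constraint is satisfied with probability (#accepted patterns)/m!. A small sketch enumerating the patterns (helper names are ours):

```python
from itertools import permutations
from math import factorial

def random_ordering_value(accepts, m):
    """Expected fraction of a width-m constraint type satisfied by a
    uniform random ordering: accepted local patterns divided by m!."""
    sat = sum(accepts(p) for p in permutations(range(1, m + 1)))
    return sat / factorial(m)

mas  = lambda p: p[0] < p[1]                              # "u precedes v"
btw  = lambda p: p[0] < p[2] < p[1] or p[1] < p[2] < p[0]  # z between x and y
nbtw = lambda p: not btw(p)
```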
     Surprisingly, there is evidence that this mindless procedure achieves the best polynomial-time approx-
imation constant: assuming the Unique Games Conjecture (UGC) [17], MAS is hard to approximate
within 1/2 + ε for every ε > 0 [14, 13]. An OCSP is called approximation resistant if it exhibits this
behavior, i. e., if it is NP-hard to approximate within a constant better than the random-ordering algorithm.
In fact, the results of Guruswami et al. [13] are more general and contrasting with classical constraint
satisfaction: assuming the UGC, they prove that every OCSP of bounded width is approximation resistant.
     In many cases, such as for VERTEX COVER, MAX CUT, and, as we just mentioned, for all OCSPs,
the UGC implies optimal NP-hard inapproximability constants which are not known without the conjec-
ture. For instance, the problems MAS and MAX BTW have, until the present work, only been known
to be NP-hard to approximate within 65/66 + ε [19] and 47/48 + ε [9], respectively, far from matching the
respective random-assignment thresholds of 1/2 and 1/3. In fact, while the UGC implies that all OCSPs
are approximation resistant, there were no results proving NP-hard approximation resistance of an OCSP
prior to this work. In contrast, there is a significant body of work on NP-hard approximation resistance
of classical Constraint Satisfaction Problems (CSPs) [16, 21, 11, 6]. Furthermore, the UGC is still
very much open and recent algorithmic advances have given rise to subexponential algorithms (as the
completeness-error ε goes to 0) for Unique Games [1, 5] calling the conjecture into question. Several
recent works have also been aimed at bypassing the UGC for natural problems by providing comparable
results without assuming the conjecture [15, 6].


         Problem          Approx. factor            UG-inapprox.     NP-inapprox.      This work
         MAS              1/2 + Ω(1/ log n) [8]     1/2 + ε [14]     65/66 + ε [19]    14/15 + ε
         MAX BTW          1/3                       1/3 + ε [7]      47/48 + ε [9]     1/2 + ε
         MAX NBTW         2/3                       2/3 + ε [7]      -                 2/3 + ε
         m-OCSP           1/m!                      1/m! + ε [13]    -                 1/⌊m/2⌋! + ε


                                  Table 1: Prior results and improvements.


1.1   Results
In this work we obtain improved NP-hardness of approximating various OCSPs. While a complete
characterization such as in the UG regime still eludes us, our results improve the knowledge of what we
believe are four important flavors of OCSPs; see Table 1 for a summary of the present state of affairs.
    We address the two most studied OCSPs: MAS and MAX BTW. For MAS, we show a factor
(14/15 + ε)-inapproximability, improving the factor from 65/66 + ε [19]. For MAX BTW, we show a
factor (1/2 + ε)-inapproximability, improving from 47/48 + ε [9].
Theorem 1.1. For every ε > 0, it is NP-hard to distinguish MAS instances with value at least 15/18 − ε
from instances with value at most 14/18 + ε.
Theorem 1.2. For every ε > 0, it is NP-hard to distinguish MAX BTW instances with value at least
1 − ε from instances with value at most 1/2 + ε.
    The above two results are inferior to what is known assuming the UGC and in particular do not
prove approximation resistance. We introduce the MAXIMUM NON-BETWEENNESS (MAX NBTW)
problem, whose predicate is the complement of the predicate of MAX BTW. This predicate accepts four of the
six permutations on three elements and thus a random ordering satisfies two thirds of the constraints in
expectation. We show that this is optimal up to lower-order terms.
Theorem 1.3. For every ε > 0, it is NP-hard to distinguish MAX NBTW instances with value at least
1 − ε from instances with value at most 2/3 + ε.
    Finally, we address the approximability of a generic width-m OCSP as a function of the width m.
In the CSP world, the generic version is called m-CSP and we call the ordering version m-OCSP. We
devise a simple predicate, "2t-Same Order" (2t-SO), on m = 2t variables that is satisfied only if the first t
elements are relatively ordered exactly as the last t elements. A random ordering satisfies only a fraction
1/t! of the constraints and we prove that this is essentially optimal, implying a (1/⌊m/2⌋! + ε)-factor
inapproximability of m-OCSP.
Theorem 1.4. For every ε > 0 and integer m ≥ 2, it is NP-hard to distinguish m-OCSP instances with
value at least 1 − ε from instances with value at most 1/⌊m/2⌋! + ε.

1.2   Proof overviews
With the exception of MAS, our results follow a route which is by now standard in inapproximability:
starting from the optimization problem LABEL COVER (LC), we give reductions using dictatorship-test


gadgets, also known as long-code tests. We describe these reductions in the context of MAX NBTW to
highlight the new techniques in this paper.
     The reduction produces an instance I of MAX NBTW from an instance L of LC such that val(I) >
1 − ε if val(L) = 1 while val(I) < 2/3 + ε if val(L) ≤ η for some η = η(ε). By the PCP Theorem
 and the Parallel Repetition Theorem [3, 2, 20], it is NP-hard to distinguish between val(L) = 1 and
 val(L) ≤ η for every constant η > 0 and thus we obtain the result in Theorem 1.3. The core component
in this paradigm is the design of a dictatorship test: a MAX NBTW instance on [q]^L ∪ [q]^R, for an integer q
and label sets L and R. Let π be a map R → L. Each constraint is a tuple (~x, ~y, ~z) where ~x ∈ [q]^L while
~y, ~z ∈ [q]^R. The distribution of tuples is obtained as follows. First, pick ~x and ~y uniformly at random from
[q]^L and [q]^R, respectively. Set z_j = y_j + x_{π(j)} mod q. Finally, add noise by independently replacing each coordinate
x_i, y_j, and z_j with a uniformly random element from [q] with probability γ.
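The sampling procedure just described can be sketched directly. This is an illustrative toy (the parameters q, the table sizes, the projection, and γ below are placeholder choices of ours, not values fixed by the paper); tables are flattened to index lists:

```python
import random

def sample_constraint(q, L, R, pi, gamma, rng=random):
    """Sample one test constraint (x, y, z): x is a table indexed by L,
    y and z are tables indexed by R, with z[j] = y[j] + x[pi[j]] mod q
    before each coordinate is gamma-noised to a uniform value."""
    x = [rng.randrange(q) for _ in range(L)]
    y = [rng.randrange(q) for _ in range(R)]
    z = [(y[j] + x[pi[j]]) % q for j in range(R)]
    noise = lambda v: [rng.randrange(q) if rng.random() < gamma else t for t in v]
    return noise(x), noise(y), noise(z)
```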
     This test instance has canonical assignments that satisfy almost all the constraints. These are obtained
by picking an arbitrary j ∈ R and partitioning the variables into q sets S_0, . . . , S_{q−1} where

                               S_t = {~x ∈ [q]^L | x_{π(j)} = t} ∪ {~y ∈ [q]^R | y_j = t} .

If a constraint (~x, ~y, ~z) is such that ~x ∈ S_t and ~y ∈ S_u, then ~z ∉ S_v for any v ∈ {t + 1, . . . , u − 1} except with
probability O(γ). This is because (a + b) mod q is never strictly between a and b. Further, the probability
that any two of ~x, ~y, and ~z fall in the same set S_i is simply the probability that any two of x_{π(j)}, y_j, and z_j
are assigned the same value, which is at most O(1/q). Thus, ordering the variables such that S_0 ≺ S_1 ≺
· · · ≺ S_{q−1}, with an arbitrary ordering of the variables within a set, satisfies a fraction 1 − O(1/q) − O(γ) of the
constraints.
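The arithmetic fact used above admits a short proof (either a + b < q, so (a + b) mod q = a + b ≥ max(a, b), or (a + b) mod q = a + b − q < min(a, b)) and can also be checked exhaustively for small q:

```python
def never_strictly_between(q):
    """Check exhaustively that c = (a + b) mod q never lies strictly
    between a and b, for all a, b in {0, ..., q-1}."""
    for a in range(q):
        for b in range(q):
            c = (a + b) % q
            if min(a, b) < c < max(a, b):
                return False
    return True
```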
     The proof of Theorem 1.3 requires a partial converse of the above: every ordering that satisfies more
than a fraction 2/3 + ε of the constraints is more-or-less an ordering that depends on a few coordinates j
as above. This proof involves three steps. First, we show that there is a Γ = Γ(q, γ, β) such that every
ordering O of [q]^L and [q]^R can be broken into Γ sets S_0, . . . , S_{Γ−1} such that one achieves expected value at
least val(O) − β for arbitrarily small β by ordering the sets S_0 ≺ · · · ≺ S_{Γ−1} and within each set ordering
elements arbitrarily. The proof of this "bucketing" uses hypercontractivity of noised functions on a
finite domain. We note that a related bucketing argument is used in proving inapproximability of OCSPs
assuming the UGC [14, 13]. Somewhat similarly to our analysis, they "round" table values to q buckets
of equal size and proceed to analyze indicator functions for ranges of order assignments. In particular,
the UG-based analysis only needs to consider multiple queries to, and bucketing of, individual tables.
For our construction, we have to consider the simultaneous bucketing of two tables. Merely “rounding”
the tables to a set [q] does not suffice. Rather, we consider having q equal-sized buckets for each table
and analyze functions which do not differentiate values within buckets. More precisely, we introduce a
“bucketed payoff” which, given a tuple of bucket indices, samples a random element from the respective
indexed buckets of the tables and evaluates the predicate for the resulting tuple. With this setup, we can
analyze functions of bounded range, yet the bucketing preserves arbitrarily well the value of an instance
even when the tables have influential coordinates.
     Similarly to [14, 13], the bucketing argument allows an OCSP to be analyzed as if it were a CSP
on a finite domain, enabling us to use powerful techniques developed in this setting. In particular, we
show that unless val(L) > η for some η, the distribution of constraints (~x, ~y, ~z) can be regarded,
up to a negligible error in the payoff, as obtained by sampling ~x independently; in other words, ~x is


“decoupled” from (~y,~z). We note that the marginal distribution of the tuple (~y,~z) is already symmetric
with respect to swaps: P (~y = y,~z = z) = P (~y = z,~z = y). In order to prove approximation resistance, we
combine three of these dictatorship tests: the jth variant has ~x as the jth component of the triple. We
show that the combined instance is symmetric with respect to every swap up to an error O(ε) unless
val(L) > η. This implies that the instance has value at most 2/3 + O(ε) hence proving approximation
resistance of M AX NBTW.
     For MAX BTW and MAX 2t-SO, we do not require the final symmetrization and instead use a
dictatorship test based on a different distribution. Finally, the reduction to MAS is a simple gadget
reduction from MAX NBTW. For hardness results on width-two predicates, such gadget reductions
presently dominate the scene for classical CSPs, and the same holds for MAS. As an
example, the best-known NP-hardness of approximating MAX CUT within 16/17 + ε is via a gadget
reduction from MAX 3-LIN-2 [16, 23]. The previous best approximation hardness for MAS was also
via a gadget reduction from MAX 3-LIN-2 [19], although with the significantly weaker hardness constant
65/66 + ε. By reducing from a problem more similar to MAS, namely MAX NBTW, we improve the
approximation hardness to 14/15 + ε. The gadget in question is simple and is given in Section 4.


Organization. Section 2 sets up the notation used in the rest of the article. Section 3 gives a general
hardness result based on test distributions, although the proofs are deferred to Section 5. In Section 4, we
introduce distributions tailored to the studied predicates and invoke the construction from Section 3 to
prove three of the four approximation-hardness results; the fourth result is proven in the same section via
a gadget reduction.


Acknowledgments. We would like to thank Johan Håstad for suggesting the topic and for numerous
helpful discussions regarding the same, as well as the anonymous referees, and the editors of the
publishing issue, who assisted in improving the legibility of the work.


2     Preliminaries
We denote by [n] the integer interval {0, . . . , n − 1} and by [n]_1 the interval {1, . . . , n}. Given a tuple of
reals x ∈ R^m, we write ~σ(x) ∈ S_m for the natural-order permutation on {1, . . . , m} induced by x. We also
denote by x_{−i} the tuple (x_1, . . . , x_{i−1}, x_{i+1}, . . . , x_m). For a distribution D over Ω_1 × · · · × Ω_m, we use D_{≤t}
and D_{>t} to denote its projections to the coordinates up to t and to the remaining coordinates, respectively.
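The natural-order permutation ~σ(x) assigns to each coordinate its rank. A minimal sketch for tuples of distinct reals (the function name is ours; ties are handled by the averaging extension defined in Section 2.1):

```python
def sigma(x):
    """Natural-order permutation induced by a tuple of distinct reals:
    sigma[i] is the 1-based rank of x[i] among the entries."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    s = [0] * len(x)
    for rank, i in enumerate(order, start=1):
        s[i] = rank
    return tuple(s)
```

For example, `sigma((0.3, 0.1, 0.2))` gives `(3, 1, 2)`: the first entry is largest, the second smallest.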


2.1    Ordering-constraint satisfaction problems
We are concerned with predicates P : S_m → [0, 1] on the symmetric group S_m. Such a predicate specifies a
width-m OCSP written OCSP(P). An instance I of the OCSP(P) problem is a tuple (X, C) where X is the
set of variables and C is a distribution over ordered m-tuples of X referred to as the constraints.
    An assignment to I is an injective map O : X → Z called an ordering of X. For a tuple ~c = (v_1, . . . , v_m),
O|_~c denotes the tuple (O(v_1), . . . , O(v_m)). An ordering O is said to satisfy the constraint ~c when P(~σ(O|_~c)) =
1. The value of an ordering is the probability that a random constraint ~c ← C is satisfied by O, and the


value val(I) of an instance is the maximum value of any ordering. Thus,

                   val(I) := max_{O : X→Z} val(O; I) := max_{O : X→Z} E_{~c←C} [P(~σ(O|_~c))] .

To simplify analysis, we extend the predicate to Z^m and permit assignments which are not necessarily
injective as follows: define P : Z^m → [0, 1] by setting P(a_1, . . . , a_m) = E_~σ [P(~σ)] where ~σ is drawn
uniformly at random over all permutations in S_m such that σ_i < σ_j whenever a_i < a_j. As an example, for
m = 3,

                        P(42, 13, 13) = (1/2) P(3, 1, 2) + (1/2) P(3, 2, 1) .

Note that the value of an instance is invariant under this extension, as there is a complete ordering attaining at
least the value of any non-injective assignment.
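The averaging extension above can be computed by enumerating the consistent permutations. A small sketch (function names are ours), checked against the worked example with the Betweenness predicate:

```python
from itertools import permutations

def extend(P, a):
    """Extension of a predicate P on S_m to Z^m: average P over all
    permutations s with s[i] < s[j] whenever a[i] < a[j]."""
    m = len(a)
    consistent = [s for s in permutations(range(1, m + 1))
                  if all(s[i] < s[j] for i in range(m) for j in range(m)
                         if a[i] < a[j])]
    return sum(P(s) for s in consistent) / len(consistent)

BTW = {(1, 3, 2), (3, 1, 2)}          # the betweenness predicate as a set
P_btw = lambda s: float(s in BTW)
```

For a = (42, 13, 13), the consistent permutations are (3, 1, 2) and (3, 2, 1); only the former is in BTW, so the extension evaluates to 1/2.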
    We define the predicates and problems studied in this work. MAS is exactly OCSP({(1, 2)}). The
betweenness predicate BTW is the set {(1, 3, 2), (3, 1, 2)} and NBTW is S_3 \ BTW. We define
MAX BTW as OCSP(BTW) and MAX NBTW as OCSP(NBTW). Finally, we introduce 2t-SO as the subset of
S_2t such that the induced ordering on the first t elements coincides with the ordering of the last t elements,

                  2t-SO := {π ∈ S_2t | ~σ(π(1), . . . , π(t)) = ~σ(π(t + 1), . . . , π(2t))} .

This predicate has (2t)!/t! satisfying orderings and will be used in proving the inapproximability of the
generic m-OCSP with constraints of width at most m.
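The predicate sizes quoted in the text (|NBTW| = 4 and |2t-SO| = (2t)!/t!) can be verified by enumeration; a sketch with our own helper names:

```python
from itertools import permutations
from math import factorial

def sigma(x):
    """1-based rank tuple of a sequence of distinct values."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    s = [0] * len(x)
    for r, i in enumerate(order, 1):
        s[i] = r
    return tuple(s)

BTW  = {(1, 3, 2), (3, 1, 2)}
NBTW = set(permutations((1, 2, 3))) - BTW

def same_order(pi, t):
    """2t-SO membership: first t and last t entries induce the same order."""
    return sigma(pi[:t]) == sigma(pi[t:])

def so_count(t):
    """Number of permutations in S_2t satisfying 2t-SO."""
    return sum(same_order(p, t) for p in permutations(range(1, 2 * t + 1)))
```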

2.2    Label cover and inapproximability
The problem LABEL COVER (LC) is a common starting point for strong inapproximability results. An LC
instance L = (U, V, E, L, R, Π) consists of a bipartite graph (U ∪ V, E) associating with every edge {u, v}
a projection π_{v,u} : R → L, with the goal of labeling the vertices, λ : U ∪ V → L ∪ R, so as to maximize the
fraction of projections for which π_{v,u}(λ(v)) = λ(u). For simplicity, we shall additionally assume that
every projection has degree exactly d = d(ε). That is, for every LC instance, edge e ∈ E, and label i ∈ L,
there are exactly d choices of j ∈ R for which π_e(j) = i.

Theorem 2.1. For every ε > 0, there exist fixed label sets L and R such that it is NP-hard to distinguish
LC instances of value one from instances of value at most ε. Additionally, one can assume that there
exists d = d(ε) such that every projection of every instance has degree d.

Proof. Follows from an easy modification [24] of the classical LC construction [2, 20].

2.3    Primer on real analysis
We refer to a finite domain Ω along with a distribution µ as a probability space. Given a probability
space (Ω, µ), the nth tensor power is (Ω^n, µ^⊗n) where µ^⊗n(ω_1, . . . , ω_n) = µ(ω_1) · · · µ(ω_n). The ℓ_p norm
of f : Ω → R w. r. t. µ is denoted by ||f||_{µ,p} and is defined as (E_{x∼µ}[|f(x)|^p])^{1/p} for real p ≥ 1 and as
max_{x∈supp(µ)} |f(x)| for p = ∞. When clear from the context, we omit the distribution µ. The following
noise operator and its properties play a pivotal role in our analysis.


Definition 2.2. Let (Ω, µ) be a probability space and f : Ω^n → R be a function on the nth tensor power.
For a parameter ρ ∈ [0, 1], the noise operator T_ρ takes f to T_ρ f : Ω^n → R defined by

                                        T_ρ f(x) := E [f(y) | x] ,

where, independently for each i, y_i = x_i with probability ρ and otherwise y_i is a uniformly random sample from Ω.
   The noise operator preserves the mass E [ f ] of a function while spreading it in the space. The
quantitative bound on this is referred to as hypercontractivity.
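For small domains the noise operator can be computed exactly, and the mass preservation just mentioned checked numerically. A sketch under the uniform measure (matching the uniform resampling in Definition 2.2; function names are ours):

```python
from itertools import product

def noise_operator(f, omega, n, rho):
    """Return T_rho f exactly, as a dict over omega^n, for the uniform
    measure: each coordinate is kept with probability rho and otherwise
    resampled uniformly from omega."""
    k = len(omega)

    def kernel(xi, yi):
        # per-coordinate transition probability x_i -> y_i
        return rho * (yi == xi) + (1 - rho) / k

    pts = list(product(omega, repeat=n))
    Tf = {}
    for x in pts:
        total = 0.0
        for y in pts:
            p = 1.0
            for xi, yi in zip(x, y):
                p *= kernel(xi, yi)
            total += f[y] * p
        Tf[x] = total
    return Tf
```

Since each per-coordinate kernel is doubly stochastic, averaging T_ρ f over all inputs recovers E[f] exactly, and ρ = 1 leaves f unchanged.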
Theorem 2.3 ([25]; [18, Theorems 3.16, 3.17]). Let (Ω, µ) be a probability space in which the minimum
nonzero probability of any atom is α ≤ 1/2. Then, for every q > 2 and every f : Ω^n → R,

                                   ||T_{ρ(q,α)} f||_q ≤ ||f||_2 ,

where for α < 1/2 we set

     A = (1 − α)/α ;    1/q′ = 1 − 1/q ;    and    ρ(q, α) = ( (A^{1/q} − A^{−1/q}) / (A^{1/q′} − A^{−1/q′}) )^{1/2} .

For α = 1/2, we set ρ(q, α) = (q − 1)^{−1/2}.
    For a fixed probability space, the above theorem says that for every γ > 0 there is a q > 2 such that
||T_{1−γ} f||_q ≤ ||f||_2. For our application, we need the easy corollary that a form of the reverse direction also holds:
for every γ > 0, there exists a q = 2 + δ > 2 such that hypercontractivity into the ℓ_2-norm holds for T_{1−γ}.
Lemma 2.4. Let (Ω, µ) be a probability space in which the minimum nonzero probability of any atom is
α ≤ 1/2. Then, for every f : Ω^n → R and every small enough γ > 0,

                                   ||T_{1−γ} f||_{2+δ} ≤ ||f||_2

for any

          0 < δ ≤ δ(γ, α) = 2 · (log((1 − γ)^{−2}) − 1) / log(A)       with    A = (1 − α)/α > 1 .

Further, δ(γ, 1/2) = γ(2 − γ)(1 − γ)^{−2}.
Proof. The estimate for δ(γ, 1/2) follows immediately from the above theorem assuming γ < 1/2. In
the case when α < 1/2, solving

               ρ^2 := (1 − γ)^2 = (A^{1/q} − A^{−1/q}) (A^{1−1/q} − A^{1/q−1})^{−1}

for q gives, for γ < 1 − A^{−1/2},

     δ = q − 2 = 2 log(A) / log( (1 + ρ^2 A) / (1 + ρ^2 /A) ) − 2 ≥ 2 · (log((1 − γ)^{−2}) − 1) / log(A) .



Efron-Stein Decompositions. Our proofs make use of the Efron-Stein decomposition which has useful
properties akin to Fourier decompositions.

Definition 2.5. Let f : Ω^(1) × · · · × Ω^(n) → R and let µ be a measure on ∏ Ω^(t). The Efron-Stein decomposition
of f with respect to µ is the family {f_S}_{S⊆[n]} where

                       f_S(x) := ∑_{T ⊆ S} (−1)^{|S\T|} E [f(x′) | x′_T = x_T] .
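The inclusion-exclusion formula above can be computed directly on a small domain; by Möbius inversion the pieces sum back to f, and f_∅ is the constant E[f]. A sketch for the uniform measure on a product of small finite sets (the paper allows general product measures; function names are ours):

```python
from itertools import product, combinations

def efron_stein(f, omega, n):
    """Efron-Stein decomposition {S: f_S} of f, given as a dict over
    omega^n, with respect to the uniform product measure."""
    pts = list(product(omega, repeat=n))

    def cond_exp(x, T):
        # E[f(x') | x'_T = x_T] under the uniform measure
        match = [y for y in pts if all(y[t] == x[t] for t in T)]
        return sum(f[y] for y in match) / len(match)

    subsets = [c for r in range(n + 1) for c in combinations(range(n), r)]
    return {S: {x: sum((-1) ** (len(S) - len(T)) * cond_exp(x, T)
                       for T in subsets if set(T) <= set(S))
                for x in pts}
            for S in subsets}
```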


Lemma 2.6 (Efron and Stein [10]; Mossel [18]). Assuming the spaces {Ω^(t)}_t are independent under µ, the Efron-Stein
decomposition {f_S}_S of f : ∏ Ω^(t) → R satisfies:

    • f_S(x) depends only on x_S ;

    • for all S, T ⊆ [n] such that S \ T ≠ ∅, and all x_T ∈ ∏_{t∈T} Ω^(t), we have E[f_S(x′) | x′_T = x_T] = 0.

    We use the standard notions of influence and noisy influence as in previous work.

Definition 2.7. Let f : Ω^n → R be a function on a probability space. The influence of the ith
coordinate, 1 ≤ i ≤ n, is
                            Inf_i(f) := E_{{Ω_j}_{j≠i}} [Var_{Ω_i}(f)] .

Additionally, given a noise parameter γ, the noisy influence of the ith coordinate is

                            Inf_i^{(1−γ)}(f) := E_{{Ω_j}_{j≠i}} [Var_{Ω_i}(T_{1−γ} f)] .

    Influences have simple expressions in terms of Efron-Stein decompositions:

            Inf_i(f) = ∑_{S∋i} E[f_S^2]        and        Inf_i^{(1−γ)}(f) = ∑_{S∋i} (1 − γ)^{2|S|} E[f_S^2] .
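The first definition, expected conditional variance, is easy to evaluate on toy Boolean functions; for a dictator function the influential coordinate has influence Var of a uniform bit, namely 1/4, and all others have influence 0. A self-contained sketch (names are ours):

```python
from itertools import product

def influence_direct(f, omega, n, i):
    """Inf_i(f) from the definition: expectation over the other coordinates
    of the variance of f along coordinate i (uniform measure)."""
    pts = list(product(omega, repeat=n))
    k = len(omega)
    total = 0.0
    for x in pts:
        # variance of f over coordinate i, the other coordinates fixed as in x
        vals = [f[x[:i] + (w,) + x[i + 1:]] for w in omega]
        mean = sum(vals) / k
        total += sum((v - mean) ** 2 for v in vals) / k
    return total / len(pts)
```

For 3-bit majority each coordinate is pivotal exactly when the other two disagree (probability 1/2), giving influence 1/2 · 1/4 = 1/8.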

    The following well-known bounds on noisy influences are handy in the analysis.

Lemma 2.8. For every γ > 0, natural numbers i and n such that 1 ≤ i ≤ n, and every f : Ω^n → R,

            Inf_i^{(1−γ)}(f) ≤ Var(f)        and        ∑_i Inf_i^{(1−γ)}(f) ≤ Var(f)/γ .

Proof. The former inequality is immediate from the expression of influences in terms of Efron-Stein
decompositions together with Parseval's identity for the decomposition: ∑_S E[f_S^2] = Var(f).
Expanding the LHS of the second inequality,

     ∑_i Inf_i^{(1−γ)}(f) = ∑_{k≥1} k (1 − γ)^{2k} ∑_{S : |S|=k} E[f_S^2] ≤ Var(f) · max_{k≥1} k (1 − γ)^{2k}

                          ≤ Var(f) · ∑_{k≥1} (1 − γ)^{2k} ≤ Var(f) · 1/(1 − (1 − γ)^2) ≤ Var(f)/γ .



      Let the total (noisy) influence of a function f : Ω^n → R be defined as

          TotInf(f) := ∑_{i∈[n]_1} Inf_i(f)        and        TotInf^{(1−γ)}(f) := ∑_{i∈[n]_1} Inf_i^{(1−γ)}(f) .

    Finally, we introduce a notion of cross influence between functions subject to a projection. This
notion is implicitly prevalent in modern LC reductions, either for noised functions or for the analytically
similar degree-bounded functions. Given finite sets Ω, L, R; an injective function π : L → R; and functions
f : Ω^L → R and g : Ω^R → R, define

                  CrInf_π^{(1−γ)}(f, g) := ∑_{(i,j)∈π} Inf_i^{(1−γ)}(f) Inf_j^{(1−γ)}(g) .

Note that our definition differs somewhat from the cross-influence, denoted "XInf," used by Samorodnit-
sky and Trevisan [22].


3      A general hardness result
In this section, we prove a general inapproximability result for OCSPs which, similarly to results for
classic CSPs, permits us to deduce hardness of approximation based on the existence of certain simple
distributions. The proof is via a scheme of reductions from LC to OCSPs. For a width-m predicate
P, we instantiate this scheme with a distribution D over Q_1^t × Q_2^{m−t}, for some parameter 1 ≤ t < m and
where Q_1 and Q_2 are finite integer sets. As an example of a classical distribution, Q_1 and Q_2 could both be the
set {−1, 1}; t = 1 and m = 3; and D could choose the first two arguments independently, setting the third
to be their product. To get appropriate properties for ordering problems, we may however need Q_1 and
Q_2 to be larger ranges than the classical bits, and the sizes of the tables in the resulting OCSP instance will
grow correspondingly with these ranges.
    When the predicate permits test distributions satisfying certain properties, akin to pairwise
independence, the hardness claims in this section follow and, together with the hardness of LC and
other properties of the distributions, yield the desired inapproximability results.
    The reduction itself is composed of pieces known as dictatorship tests, described in the next
section. Section 3.2 uses this test to construct the overall reduction and establishes the properties of this
reduction. For concrete examples of distributions D, we refer the reader to Section 4.

3.1     Dictatorship test
The dictatorship test uses the following test distribution (Distribution 1), parametrized by γ and π, and
denoted T_π^{(γ)}(D). The reader should think of the support of the argument distribution D as essentially
being contained in the predicate P.
     Recall our convention of extending P to a predicate P : Z^m → [0, 1]. For a pair of functions (f, g),
let (f, g) ◦ (X, Y) denote the tuple (f(~x^(1)), . . . , f(~x^(t)), g(~y^(t+1)), . . . , g(~y^(m))). Then, the acceptance
probability on T_π^{(γ)}(D) of a pair of functions (f, g), where f : Q_1^L → Z and g : Q_2^R → Z, is

          Acc_{f,g}(T_π^{(γ)}(D)) := E_{(X,Y)←T_π^{(γ)}(D)} [P((f, g) ◦ (X, Y))] .          (3.1)


                       T HEORY OF C OMPUTING, Volume 11 (10), 2015, pp. 257–283                                                           265
                         P ER AUSTRIN , R AJSEKAR M ANOKARAN , AND C ENNY W ENNER

                                Distribution 1: T_π^(γ)(D)

Parameters:
    • distribution D over Q1 × · · · × Q1 × Q2 × · · · × Q2 (t factors of Q1 followed by m − t factors of Q2);
    • noise parameter γ > 0;
    • projection map π : R → L.

Output: Distribution T_π^(γ)(D) over

        (~x^(1), . . . , ~x^(t), ~y^(t+1), . . . , ~y^(m)) ∈ Q1^L × · · · × Q1^L × Q2^R × · · · × Q2^R .

    1. Pick a random |L| × t matrix X over Q1 by letting each row be an independent sample from D_{≤t}.
    2. Pick a random |R| × (m − t) matrix Y := (~y^(t+1), . . . , ~y^(m)) over Q2 by letting the ith row be a sample from D_{>t} conditioned on X_{π(i)} = (x^(1)_{π(i)}, . . . , x^(t)_{π(i)}).
    3. For each entry of X and Y, independently with probability γ, replace the entry with a fresh sample from Q1 or Q2, respectively.
    4. Output (X, Y).


This definition is motivated by the overall reduction described in the next section. The distribution is
designed so that functions ( f , g) that are dictated by a single coordinate have a high acceptance probability,
justifying the name of the test.
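As a concrete illustration, the sampling procedure of Distribution 1 can be sketched in a few lines. This is a simplified sketch under our own assumptions: D is represented as a uniform list of support tuples (a general D would carry weights), π as a list mapping R-indices to L-indices, and the noise step resamples uniformly, which matches the case of uniform marginals.

```python
import random

def sample_test_distribution(D_support, t, m, Q1, Q2, pi, L, R, gamma):
    """Sample (X, Y) from T_pi^(gamma)(D): X is |L| x t over Q1, Y is |R| x (m-t) over Q2."""
    # Step 1: each row of X is an independent sample from the first t coordinates of D.
    X = [list(random.choice(D_support)[:t]) for _ in range(L)]
    # Step 2: row i of Y is drawn from the last m-t coordinates of D,
    # conditioned on the first t coordinates equalling row pi(i) of X.
    Y = []
    for i in range(R):
        matches = [s[t:] for s in D_support if list(s[:t]) == X[pi[i]]]
        Y.append(list(random.choice(matches)))
    # Step 3: gamma-noise -- resample each entry uniformly with probability gamma.
    for row in X:
        for j in range(t):
            if random.random() < gamma:
                row[j] = random.choice(Q1)
    for row in Y:
        for j in range(m - t):
            if random.random() < gamma:
                row[j] = random.choice(Q2)
    return X, Y
```

With γ = 0, every row of Y is consistent with the projected row of X, which is precisely what a dictator pair (f, g) as in Lemma 3.1 exploits.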

Lemma 3.1. Let f : Q1^L → Z and g : Q2^R → Z be defined by g(y) = y_i and f(x) = x_{π(i)} for some i ∈ R. Then,

        Acc_{f,g}(T_π^(γ)(D)) ≥ E_{~x∼D}[P(~x)] − γm .


Proof. The vector (f(~x^(1)), . . . , f(~x^(t)), g(~y^(t+1)), . . . , g(~y^(m))) simply equals the π(i)th row of X followed by the ith row of Y. With probability (1 − γ)^m ≥ 1 − γm none of these m entries is hit by noise, in which case the vector is a sample from D and is hence accepted by P with probability E_{~x∼D}[P(~x)]; the claim follows.

    We prove a partial converse of the above: unless f and g have influential coordinates i and j such that
π( j) = i, the distribution D can be replaced by a product of two distributions with a negligible loss in the
acceptance probability. We define this product distribution below and postpone the analysis to Section 5.2
in order to complete the description of the reduction.

Definition 3.2. Given the base distribution D, the decoupled distribution D⊥ is obtained by taking
independent samples x from D≤t and y from D>t .


                                Reduction 2: R_{D,γ}^{(P)}

Parameters: distribution D over Q1^t × Q2^{m−t} and noise parameter γ > 0.

Input: a Label Cover instance L = (U, V, E, L, R, Π).

Output: a weighted OCSP(P) instance I = (X, C) where X = (U × Q1^L) ∪ (V × Q2^R). The distribution
    of constraints in C is obtained by sampling a random edge e = (u, v) ∈ E with projection π_e and
    (X, Y) from T_{π_e}^(γ)(D); the constraint is the predicate P applied to the tuple

        ((u, ~x^(1)), . . . , (u, ~x^(t)), (v, ~y^(t+1)), . . . , (v, ~y^(m))) .


3.2     Reduction from Label Cover
An assignment to I is seen as a collection of functions, {f_u}_{u∈U} ∪ {g_v}_{v∈V}, where f_u : Q1^L → Z, g_v : Q2^R → Z and, e. g., f_u(x) is identified with the variable (u, x). The value of an assignment is

        E_{e=(u,v)∈E; (X,Y)←T_{π_e}^(γ)(D)}[P((f_u, g_v) ∘ (X, Y))] = E_{e=(u,v)∈E}[Acc_{f_u,g_v}(T_{π_e}^(γ)(D))] .

Lemma 3.1 now implies that if L is satisfiable, then the value of the output instance is also high.

Lemma 3.3. If λ is a labeling of L satisfying a fraction c of its constraints, then the ordering assignment f_u(x) = x_{λ(u)}, g_v(y) = y_{λ(v)} satisfies at least a fraction

        c · (E_{x∼D}[P(x)] − γm)

of the constraints of R_{D,γ}^{(P)}(L). In particular, there is an ordering of R_{D,γ}^{(P)}(L) which does not depend on D and attains a value

        val(L) · (E_{x∼D}[P(x)] − γm) .

    On the other hand, we also extend the decoupling property of the dictatorship test to the output instance when val(L) is small. This is the technical core of the paper and is proven in Section 5, more specifically via Lemma 5.9.

Theorem 3.4. Suppose that D over Q1^t × Q2^{m−t} satisfies the following properties:

    • D has uniform marginals;
    • for every i > t, D_i is independent of D_{≤t}.

Then, for every ε > 0 and γ > 0 there exists ε_LC > 0 such that if val(L) ≤ ε_LC, then for every assignment A = {f_u}_{u∈U} ∪ {g_v}_{v∈V} to I it holds that

        val(A; R_{D,γ}^{(P)}(L)) ≤ val(A; R_{D⊥,γ}^{(P)}(L)) + ε .


In particular,

        val(R_{D,γ}^{(P)}(L)) ≤ val(R_{D⊥,γ}^{(P)}(L)) + ε .


4     Applications of the general result
In this section, we prove the inapproximability of Max BTW, Max NBTW, and Max 2t-SO, using the general hardness result of Section 3. We also present a gadget reduction from Max NBTW to MAS.

4.1    Hardness of Maximum Betweenness
For q ∈ Z≥1, define the distribution D over {−1, q} × [q] × [q] by picking x1 ∼ {−1, q} and y2 ∼ [q] uniformly, and setting

        y3 = y2 + 1 mod q    if x1 = q,
        y3 = y2 − 1 mod q    if x1 = −1.

The way to think of this distribution is that x1 always takes on either the smallest or the largest value, which in turn dictates whether y3 should be greater or smaller than y2 in order to satisfy BTW. A sample drawn as above satisfies the predicate except in the unlikely event that the modular arithmetic wraps around, i. e., when y2 = q − 1 (for x1 = q) or y2 = 0 (for x1 = −1).
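To make the distribution concrete, here is a small enumeration of its support together with the betweenness check. The helper names are ours, and BTW is taken, per the discussion above, to require the third argument to lie strictly between the other two:

```python
def btw(a, b, c):
    # BTW(x1, y2, y3): the third argument lies strictly between the other two.
    return (a < c < b) or (b < c < a)

def support_btw_distribution(q):
    # Support of D: x1 uniform in {-1, q}, y2 uniform in [q] = {0, ..., q-1},
    # and y3 = y2 + 1 mod q if x1 = q, else y3 = y2 - 1 mod q.
    triples = []
    for x1 in (-1, q):
        for y2 in range(q):
            y3 = (y2 + 1) % q if x1 == q else (y2 - 1) % q
            triples.append((x1, y2, y3))
    return triples

q = 5
trips = support_btw_distribution(q)
satisfied = sum(btw(*t) for t in trips)
# Exactly one sample per branch of x1 wraps around, so the satisfied
# fraction is exactly 1 - 1/q, matching Item 4 of Proposition 4.1.
assert satisfied == len(trips) - 2
```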

Proposition 4.1. Let (x1, y2, y3) ∼ D. Then the following holds:

    1. D has uniform marginals.

    2. y2 and y3 are each independent of x1.

    3. (y2, y3) has the same distribution as (y3, y2).

    4. E_{(x1,y2,y3)∼D}[BTW(x1, y2, y3)] ≥ 1 − 1/q.

    Let γ > 0 be a noise parameter and D⊥ be the decoupled distribution of D, which draws the first coordinate independently of the remaining two. Consider applying Reduction 2 to a given LC instance L with test distributions D and D⊥, obtaining Max BTW instances

        I = R_{D,γ}^{BTW}(L)    and    I⊥ = R_{D⊥,γ}^{BTW}(L) .


Lemma 4.2 (Completeness). If val(L) = 1, then val(I) ≥ 1 − 1/q − 3γ.

Proof. This is an immediate corollary of Lemma 3.3 and Proposition 4.1.

Lemma 4.3 (Soundness). For every ε > 0, γ > 0, q, there is an εLC > 0 such that if val(L) ≤ εLC , then
val(I) ≤ 1/2 + ε.

Proof. We note that the first two items of Proposition 4.1 assert that D satisfies the conditions of Theorem 3.4, so it suffices to show val(I⊥) ≤ 1/2. Let {f_u : {−1, q}^L → Z}_{u∈U}, {g_v : [q]^R → Z}_{v∈V} be


an arbitrary assignment to I⊥. Fix an LC edge {u, v} with projection π and consider the mean value of the constraints produced for this edge by the construction:

        E_{(~x^(1),~y^(2),~y^(3))←T_π^(γ)(D⊥)}[BTW(f_u(~x^(1)), g_v(~y^(2)), g_v(~y^(3)))] .    (4.1)


    As noted in Proposition 4.1, (~y^(2), ~y^(3)) has the same distribution as (~y^(3), ~y^(2)) when drawn from D. Consequently, when drawing arguments from the decoupled test distribution, the probability of a specific outcome (~x^(1), ~y^(2), ~y^(3)) equals the probability of (~x^(1), ~y^(3), ~y^(2)). For strict orderings, possibly after breaking ties, at most one of the two can satisfy the predicate BTW. Thus, the expression in (4.1), and in effect val(I⊥), is bounded by 1/2.
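The pairing at the heart of this bound, that mirrored betweenness constraints cannot both be satisfied, is easy to verify exhaustively (the helper btw, requiring the third argument to lie strictly between the other two, is our own naming):

```python
from itertools import permutations

def btw(a, b, c):
    # BTW: the third argument lies strictly between the other two.
    return (a < c < b) or (b < c < a)

# Swapping the last two arguments asks for a different element to be the
# middle one, so for any strict ordering at most one of the two holds.
for a, b, c in permutations((1, 2, 3)):
    assert btw(a, b, c) + btw(a, c, b) <= 1
```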

      Theorem 1.2 is now an immediate corollary of Lemmas 4.2 and 4.3, taking q = ⌈2/ε⌉ and γ = ε/6.

4.2     Hardness of Maximum Non-Betweenness
For an implicit parameter q, define a distribution D over [q]^3 by picking x1, y2 ∼ [q] independently and uniformly and setting y3 = x1 + y2 mod q.

Proposition 4.4. The distribution D satisfies the following:

    1. D is pairwise independent with uniform marginals;

    2. E_{(x1,y2,y3)∼D}[NBTW(x1, y2, y3)] ≥ 1 − 3/q.
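Both items can be checked by enumerating the support. This sketch (our own helper names) treats ties as "not between"; the paper's extension of predicates to non-unique values loses at most 3/q over the three possible tie events, which is where Item 2's slack comes from:

```python
from collections import Counter
from itertools import combinations

def nbtw(a, b, c):
    # NBTW(x1, y2, y3): the third argument does NOT lie strictly between the others.
    return not (a < c < b or b < c < a)

def support_nbtw_distribution(q):
    # y3 = x1 + y2 (mod q), with x1, y2 independent and uniform in [q].
    return [(x1, y2, (x1 + y2) % q) for x1 in range(q) for y2 in range(q)]

q = 7
trips = support_nbtw_distribution(q)
# Pairwise independence: every pair of coordinates is uniform over [q]^2,
# i.e., each of the q^2 pair-values appears exactly once among the q^2 triples.
for i, j in combinations(range(3), 2):
    counts = Counter((t[i], t[j]) for t in trips)
    assert len(counts) == q * q and set(counts.values()) == {1}
# NBTW holds with probability at least 1 - 3/q (ties counted as satisfied here).
assert sum(nbtw(*t) for t in trips) / len(trips) >= 1 - 3 / q
```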

      A straightforward application of the general inapproximability result with t = 1 shows that x1 is decoupled from y2 and y3 unless val(L) is large. Furthermore, pairwise independence implies that the decoupled distribution is simply the uniform distribution over [q]^3. However, this does not suffice to prove approximation resistance, and in fact the value could be greater than 2/3. To see this, note that if {f_u}_{u∈U}, {g_v}_{v∈V} is an ordering of the instance from the reduction, then the first coordinate of every NBTW constraint is a variable of the form f_u(·) while the rest are g_v(·). Thus, ordering the f_u(·) variables in the middle and randomly ordering the g_v(·) variables on both sides satisfies a fraction 3/4 of the constraints.
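The 3/4 figure can be confirmed by a small exhaustive computation; placing f at position 0 and modeling the two g-values as uniform over 2k symmetric positions is our own simplification:

```python
from itertools import product

# f sits at position 0; the g-values occupy random positions on both sides,
# modeled as uniform over {-k, ..., -1, 1, ..., k} (hypothetical discretization).
k = 10
side = [v for v in range(-k, k + 1) if v != 0]
total = len(side) ** 2
# NBTW(f, y2, y3) fails only when y3 falls strictly between f = 0 and y2:
# y2, y3 must share a sign and y3 must be closer to 0, giving ~1/4 of pairs.
fails = sum(1 for y2, y3 in product(side, repeat=2) if 0 < y3 < y2 or y2 < y3 < 0)
assert fails == k * (k - 1)        # failing fraction tends to 1/4 as k grows
assert 1 - fails / total > 0.74    # satisfied fraction approaches 3/4
```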
      To remedy this issue and prove approximation resistance, we permute D by swapping the last coordinate with each of the remaining coordinates and overlay the instances obtained by the reduction using these respective distributions. More specifically, for 1 ≤ j ≤ 3, define D^(j) as the distribution over [q]^3 obtained by first sampling from D and then swapping the jth and the third coordinates, i. e., the jth coordinate is the sum of the other two, which are picked independently at random. Similarly, define NBTW^(j) as the ordering predicate which is true if the jth argument does not lie between the other two; e. g., NBTW^(3) = NBTW.
      As in the previous section, take an LC instance L and consider applying Reduction 2 to L with the distributions {D^(j)}_{j=1}^{3}, and write

        I^(j) = R_{D^(j),γ}^{NBTW^(j)}(L) .

Similarly, let

        I⊥^(j) = R_{D⊥^(j),γ}^{NBTW^(j)}(L)



be the corresponding decoupled instances.
    As the distributions D^(j) are over the same domain [q]^3, the instances I^(1), I^(2), I^(3) are over the same variables. We define a new instance I over the same variables as the “sum”

        (1/3) · ∑_{j∈[3]} I^(j) ,

defined by taking all constraints in I^(1), I^(2), I^(3) with multiplicities and normalizing their weights by 1/3. In other words, up to noise, we have an acceptance probability of

        (2/3) · E[NBTW(f_u(x), g_v(y), g_v(x^π +_q y))] + (1/3) · E[NBTW(g_v(y), g_v(x^π −_q y), f_u(x))] ,

where x^π denotes the projection of x along π, and +_q, −_q denote addition and subtraction modulo q.

Lemma 4.5 (Completeness). If val(L) = 1, then val(I) ≥ 1 − 3/q − 3γ.

Proof. This is an immediate corollary of Lemma 3.3 and Proposition 4.4.

Lemma 4.6 (Soundness). For every ε > 0, γ > 0, q, there is an εLC > 0 such that if val(L) ≤ εLC , then
val(I) ≤ 2/3 + ε.

Proof. Again our goal is to use Theorem 3.4, and we start by bounding val(I⊥). To do this, note that the decoupled distributions {D⊥^(j)}_j are in fact all equal to the uniform distribution over [q]^3 and in particular do not depend on j. This means that the distribution of the variable tuples to which NBTW^(j) is applied in I⊥^(j) is independent of j; e. g., if I⊥^(1) contains the constraint NBTW^(1)(z1, z2, z3) with weight w, then I⊥^(2) contains the constraint NBTW^(2)(z1, z2, z3) with the same weight. In other words, I⊥ can be thought of as having constraints of the form

        E_j[NBTW^(j)(z1, z2, z3)] .

It is readily verified that

        E_j[NBTW^(j)(a, b, c)] ≤ 2/3

for every a, b, c.
    Getting back to the main task of bounding val(I), fix an arbitrary assignment

        A = {f_u : [q]^L → Z}_{u∈U} ∪ {g_v : [q]^R → Z}_{v∈V}

to I. By Theorem 3.4, val(A; I^(j)) ≤ val(A; I⊥^(j)) + ε for j ∈ [3]. It follows that val(A; I) ≤ val(A; I⊥) + ε and therefore, since A was arbitrary, it holds that val(I) ≤ val(I⊥) + ε ≤ 2/3 + ε, as desired.





4.3     Hardness of Maximum Acyclic Subgraph
The inapproximability of MAS is via a simple gadget reduction from Max NBTW. We claim the following properties of the directed graph shown in Figure 2, defined formally as follows.

Definition 4.7. Define the MAS gadget H from a constraint NBTW(x, y, z) with auxiliary variables {a, b} as the directed graph H = (V, A) where V = {x, y, z, a, b} and A consists of the walk

        b → x → a → z → b → y → a .

[Figure 2: The gadget reducing NBTW to MAS.]

Lemma 4.8. Consider an ordering O of x, y, z. Then,

    1. if NBTW(O(x), O(y), O(z)) = 1, then max_{O′} val(O′; H) = 5/6, where the max is over all extensions O′ : V → Z of O to V;

    2. if NBTW(O(x), O(y), O(z)) = 0, then max_{O′} val(O′; H) = 4/6, where the max is over all extensions O′ of O to V.

Proof. To find the value of the gadget H, we individually consider the optimal placement of a and b relative to x, y, and z. Each auxiliary variable appears in three edges: a appears in (x, a), (y, a), and (a, z), while b appears in (z, b), (b, x), and (b, y).
      From this, we gather that two out of the three respective edges can always be satisfied by placing a after x, y, z, and similarly placing b before x, y, z. We also see that all three edges involving a can be satisfied if and only if z comes after both x and y. Similarly, satisfying all three edges involving b is possible if and only if z is ordered before both x and y. From this, one concludes that if NBTW(x, y, z) = 1, i. e., if z comes first or last, then we can satisfy five out of the six edges, whereas if z is the middle element of O, we can satisfy only four out of the six edges.
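Lemma 4.8 is small enough to confirm by brute force over all orderings of the five gadget vertices (function names are ours; the edge list follows the walk in Definition 4.7):

```python
from itertools import permutations

# The six arcs of the gadget walk b -> x -> a -> z -> b -> y -> a.
EDGES = [("b", "x"), ("x", "a"), ("a", "z"), ("z", "b"), ("b", "y"), ("y", "a")]

def best_extension(order_xyz):
    # Max number of forward arcs over all total orders extending order_xyz on x, y, z.
    best = 0
    for perm in permutations(["x", "y", "z", "a", "b"]):
        pos = {v: i for i, v in enumerate(perm)}
        if sorted(["x", "y", "z"], key=pos.get) != list(order_xyz):
            continue  # perm does not extend the given order on x, y, z
        best = max(best, sum(pos[u] < pos[v] for u, v in EDGES))
    return best

# z first or last (NBTW satisfied): 5 of 6 arcs; z in the middle: only 4 of 6.
for order in permutations(["x", "y", "z"]):
    assert best_extension(order) == (4 if order[1] == "z" else 5)
```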

      The proof of Theorem 1.1 is now a routine application of the MAS gadget.

Proof of Theorem 1.1. Given an instance I of Max NBTW, construct a directed graph G by replacing each constraint NBTW(x, y, z) of I with a MAS gadget H, identifying x, y, z with the vertices x, y, z of H and using two new vertices a, b for each constraint of I. By Lemma 4.8, it follows that

        val(G) = (5/6) · val(I) + (4/6) · (1 − val(I)) .

By Theorem 1.3, it is NP-hard to distinguish between val(I) ≥ 1 − ε and val(I) ≤ 2/3 + ε for every ε > 0, implying that it is NP-hard to distinguish val(G) ≥ 5/6 − ε/6 from val(G) ≤ 7/9 + ε/6, providing a hardness gap of

        (7/9)/(5/6) + ε′ = 14/15 + ε′ .


4.4     Hardness of Maximum 2t-Same Order
We establish the hardness of Max 2t-SO, Theorem 1.4, via the approximation resistance of the relatively sparse predicate 2t-SO. The proof is similar to that of Max BTW in Section 4.1. Let q1 < q2 be integer parameters and define the base distribution D over [q1]^t × [q2]^t as follows: draw x1, . . . , x_t independently and uniformly at random from [q1], draw z uniformly at random from [q2], and for 1 ≤ j ≤ t set y_{t+j} = x_j + z mod q2. The distribution of (x1, . . . , x_t, y_{t+1}, . . . , y_{2t}) defines D. For a permutation ~σ ∈ S_t, let 1_σ(·) be the ordering predicate which is 1 on ~σ and 0 on all other inputs.

Proposition 4.9. D satisfies the following properties.

    1. D has uniform marginals.

    2. For every i > t, D_i is independent of D_{≤t}.

    3. For every ~σ ∈ S_t, E[1_σ(x1, . . . , x_t)] = E[1_σ(y_{t+1}, . . . , y_m)] = 1/t!.

    4. E_{(x1,...,x_t,y_{t+1},...,y_m)∼D}[2t-SO(x1, . . . , x_t, y_{t+1}, . . . , y_m)] ≥ 1 − t²/(2q1) − q1/q2.

Proof. The first three properties are immediate from the construction, recalling the extension of predicates to non-unique values. For the last property, note that 2t-SO(x1, . . . , x_t, y_{t+1}, . . . , y_m) = 1 if x1, . . . , x_t are distinct and z < q2 − q1. The former event occurs with probability at least 1 − t²/(2q1) and the latter, independently, with probability at least 1 − q1/q2; a union bound implies the claim.
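The key observation of the last step, that adding z without wraparound preserves the order pattern, can be checked exhaustively for small parameters (function names and the small values of t, q1, q2 are ours):

```python
from itertools import product

def same_order(xs, ys):
    # 2t-SO: the last t arguments are ordered the same way as the first t.
    rank = lambda vs: sorted(range(len(vs)), key=lambda i: vs[i])
    return rank(xs) == rank(ys)

t, q1, q2 = 3, 4, 20
for xs in product(range(q1), repeat=t):
    for z in range(q2 - q1):           # z < q2 - q1: no wraparound mod q2
        ys = tuple((x + z) % q2 for x in xs)
        if len(set(xs)) == t:          # x1, ..., xt distinct
            assert same_order(xs, ys)  # adding z preserves the order pattern
```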

      As in the proof of Theorem 1.2, let γ be some noise parameter, take an LC instance L, and let

        I = R_{D,γ}^{2t-SO}(L)    and    I⊥ = R_{D⊥,γ}^{2t-SO}(L)


be the respective instances produced by Reduction 2 using the base distribution D and the decoupled
version D⊥ .
    The following lemma is an immediate corollary of Lemma 3.3 and Item 4 of Proposition 4.9.

Lemma 4.10 (Completeness). If val(L) = 1, then val(I) ≥ 1 − t²/(2q1) − q1/q2 − 2tγ.

      For the soundness, we have the following.

Lemma 4.11 (Soundness). For every ε > 0, γ > 0, and 1 ≤ q1 ≤ q2 , there is an εLC > 0 such that if
val(L) ≤ εLC , then val(I) ≤ 1/t! + ε.

Proof. As in the proof of Lemma 4.3, it suffices to prove that val(I⊥) ≤ 1/t!. Let {f_u : [q1]^L → Z}_{u∈U}, {g_v : [q2]^R → Z}_{v∈V} be an arbitrary assignment to I⊥. Set m = 2t. Fix an arbitrary edge {u, v} of L with


projection π. The value of the constraints corresponding to {u, v} satisfied by the assignment is

        E_{(X,Y)←T_π^(γ)(D⊥)}[2t-SO(f_u(~x^(1)), . . . , f_u(~x^(t)), g_v(~y^(t+1)), . . . , g_v(~y^(2t)))]
            = ∑_{~σ∈S_t} E_{(X,Y)←T_π^(γ)(D⊥)}[1_~σ(f_u(~x^(1)), . . . , f_u(~x^(t))) · 1_~σ(g_v(~y^(t+1)), . . . , g_v(~y^(2t)))]
            = ∑_{~σ∈S_t} E_{T_π^(γ)(D⊥)}[1_~σ(f_u(~x^(1)), . . . , f_u(~x^(t)))] · E_{T_π^(γ)(D⊥)}[1_~σ(g_v(~y^(t+1)), . . . , g_v(~y^(2t)))]
            = 1/t! ,

where the penultimate step uses the independence of X and Y in the decoupled distribution, and the final
step uses Item 3 of Proposition 4.9.

      Theorem 1.4 is a corollary of Lemmas 4.10 and 4.11, taking, e. g.,

        q1 = ⌈2t²/ε⌉ ,    q2 = ⌈2q1/ε⌉ ,    and    γ = ε/(8t) .
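Assuming these parameter choices, the completeness bound of Lemma 4.10 is at least 1 − ε: the three loss terms are bounded by ε/4, ε/2, and ε/4, respectively. A quick numeric check:

```python
import math

def completeness_bound(t, eps):
    # Bound of Lemma 4.10 under the stated parameter choices:
    # t^2/(2 q1) <= eps/4, q1/q2 <= eps/2, 2 t gamma = eps/4.
    q1 = math.ceil(2 * t * t / eps)
    q2 = math.ceil(2 * q1 / eps)
    gamma = eps / (8 * t)
    return 1 - t * t / (2 * q1) - q1 / q2 - 2 * t * gamma

for t in (1, 2, 3, 5, 10):
    for eps in (0.5, 0.1, 0.01):
        assert completeness_bound(t, eps) >= 1 - eps - 1e-9
```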


5      Analysis of the reduction
In this section we prove Theorem 3.4, which bounds the value of the instance generated by the reduction in terms of the decoupled distribution. Throughout, we fix an LC instance L, a predicate P, an OCSP instance I obtained by the procedure R_{D,γ}^{(P)} for a distribution D and noise parameter γ, and finally an assignment A = {f_u}_{u∈U} ∪ {g_v}_{v∈V}.
     The proof involves three major steps. First, we show that the assignment functions, which are Z-valued, can be approximated by functions with finite ranges via bucketing (see Section 5.1). This approximation makes the analyzed instance value amenable to tools developed in the context of finite-domain CSPs [24, 6], which are used in Section 5.2 to prove the decoupling property of the dictatorship test. Finally, in Section 5.3, the decoupling is applied to the reduction and related to the value of the reduced-from LC instance.


5.1     Bucketing
For an integer Γ, we approximate the possibly non-injective function f_u : Q1^L → Z by partitioning its range into Γ multisets. Set q1 = |Q1| and partition the cardinality-q1^L multiset range of f_u into sets B_1^(f_u), . . . , B_Γ^(f_u) of respective size q1^L/Γ such that if x ∈ B_i^(f_u) and y ∈ B_j^(f_u) for some i < j, then x ≤ y. Note that this is possible as long as the parameter Γ divides q1^L, which is the case in this treatise. Let F_u : Q1^L → [Γ] specify the mapping of points to the bucket containing them, and F_u^(a) : Q1^L → {0, 1} the indicator of points assigned to B_a^(f_u). Partition g_v : Q2^R → Z similarly into buckets {B_a^(g_v)}, obtaining G_v : Q2^R → [Γ] and G_v^(a) : Q2^R → {0, 1}.
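A minimal sketch of this bucketing step (our own function names; it returns the bucket-index function F and assumes Γ divides the domain size):

```python
def bucket(f_values, num_buckets):
    """Split the multiset of f's values into num_buckets equal-size,
    order-respecting buckets; return the bucket index of each domain point.
    Ties between equal values are broken by domain index."""
    n = len(f_values)
    assert n % num_buckets == 0  # the parameter Gamma must divide the domain size
    size = n // num_buckets
    order = sorted(range(n), key=lambda i: f_values[i])  # domain points by value
    F = [0] * n
    for rank, i in enumerate(order):
        F[i] = rank // size  # equal-size buckets, monotone in the value of f
    return F
```

For example, bucketing the values [5, -2, 9, 0, 7, 3] into Γ = 3 buckets groups {-2, 0}, {3, 5}, {9, 7} in increasing order.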


    We show that a bucketed version approximates well the acceptance probability of the dictatorship test for any edge e = (u, v) from Section 3,

        Acc_{f,g}(T_π^(γ)(D)) := E_{(X,Y)←T_π^(γ)(D)}[P((f, g) ∘ (X, Y))] .

Fix an edge e = (u, v) and put f = f_u, g = g_v. As before, we concisely denote by (X, Y) and (f, g) ∘ (X, Y) the query tuple and assignment tuple,

        (~x^(1), . . . , ~x^(t), ~y^(t+1), . . . , ~y^(m))    and    (f(~x^(1)), . . . , f(~x^(t)), g(~y^(t+1)), . . . , g(~y^(m))) .

Define the bucketed payoff function ℘^(f,g) : [Γ]^m → [0, 1] with respect to f and g as

        ℘^(f,g)(a1, . . . , a_m) = E_{~x^(i)←B_{a_i}^(f), i≤t; ~y^(j)←B_{a_j}^(g), j>t}[P(~x^(1), . . . , ~x^(t), ~y^(t+1), . . . , ~y^(m))]    (5.1)

and the bucketed acceptance probability,

        BAcc_{f,g}(T_π^(γ)(D)) = E_{(X,Y)←T_π^(γ)(D)}[℘^(f,g)((F, G) ∘ (X, Y))] .    (5.2)

In other words, bucketing corresponds to generating a tuple ~a = (F, G) ∘ (X, Y) and replacing each query with a uniformly random point from the bucket indexed by the corresponding coordinate a_i. We show that the above is close to the true acceptance probability Acc_{f,g}(T_π^(γ)(D)).
Theorem 5.1 (Bucketing preserves value). For every predicate P, noise parameter γ > 0, projection π : R → L, distribution D with uniform marginals, pair of functions f : Q1^L → Z and g : Q2^R → Z, and bucketing parameter Γ,

        |Acc_{f,g}(T_π^(γ)(D)) − BAcc_{f,g}(T_π^(γ)(D))| ≤ m² Γ^{−δ}

for some δ = δ(γ, Q) > 0 with Q ≥ max{|Q1|, |Q2|}.
    To prove this, we show that for every choice of f and g, including f = g, there are only few overlapping pairs of buckets, and the probability of hitting any particular pair is small. For each a ∈ [Γ], let R_a^(f) be the smallest interval in Z containing B_a^(f); define R_a^(g) similarly for g.

Lemma 5.2 (Few buckets overlap). For every integer Γ there are at most 2Γ choices of pairs (a, b) ∈ [Γ] × [Γ] such that R_a^(f) ∩ R_b^(g) ≠ ∅.
Proof. Construct the bipartite intersection graph G_I = (U_I ∪ V_I, E_I) where the vertex sets are disjoint copies of [Γ], and there is an edge (a, b) ∈ U_I × V_I iff R_a^(f) ∩ R_b^(g) ≠ ∅. By construction, the intervals {R_a^(f)}_a are disjoint, and similarly {R_b^(g)}_b. Consequently, G_I contains no pair of distinct edges (u, v), (u′, v′) such that u < u′ and v > v′; such a non-crossing bipartite graph is acyclic, so G_I is a forest. We conclude that the number of edges, and hence of intersections, is at most |U_I ∪ V_I| = 2Γ.
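The bound of Lemma 5.2 can be sanity-checked on random families of disjoint intervals (helper names are ours):

```python
import random

def overlap_pairs(f_intervals, g_intervals):
    # Count pairs of closed intervals, one from each family, that intersect.
    return sum(1 for (l1, r1) in f_intervals for (l2, r2) in g_intervals
               if max(l1, l2) <= min(r1, r2))

def random_disjoint_intervals(num, lo, hi, rng):
    # num disjoint intervals: sample 2*num distinct endpoints, pair consecutive ones.
    pts = sorted(rng.sample(range(lo, hi), 2 * num))
    return [(pts[2 * i], pts[2 * i + 1]) for i in range(num)]

rng = random.Random(1)
G = 8  # plays the role of the bucketing parameter Gamma
for _ in range(200):
    F = random_disjoint_intervals(G, 0, 1000, rng)
    H = random_disjoint_intervals(G, 0, 1000, rng)
    assert overlap_pairs(F, H) <= 2 * G  # Lemma 5.2's bound
```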


   Next, we prove a bound on the probability that a fixed pair of the m queries falls in a specific pair of buckets. For a distribution D over Q1^L × Q2^R, define D^(γ) as the distribution that samples from D and, for each of the |L| + |R| coordinates independently with probability γ, replaces it with a fresh sample from the corresponding marginal of D. The distribution D^(γ) is representative of the projection of T_π^(γ)(D) to two specific queries, and we show that, with good probability, the noise prevents the two queries from landing in a fixed pair of overlapping buckets.

Lemma 5.3 (Bucket collisions are unlikely). Let D be a distribution over Q1^L × Q2^R whose marginals are uniform over Q1^L and Q2^R, and let D^(γ) be as defined above. For every integer Γ and every pair of functions F : Q1^L → {0, 1} and G : Q2^R → {0, 1} such that E[F(~x)] = E[G(~y)] = 1/Γ,

        E_{(~x,~y)∼D^(γ)}[F(~x)G(~y)] ≤ Γ^{−(1+δ)}

for some δ = δ(γ, Q) > 0 where Q ≥ max{|Q1|, |Q2|}.

Proof. Without loss of generality, let |Q1| = max{|Q1|, |Q2|}. Set q = 2 + δ′ > 2 as in Lemma 2.4, 1/q′ = 1 − 1/q, and define

        H(~x) := E_{~y|~x}[T_{1−γ} G(~y)] .

Then,

        E_{(~x,~y)∼D^(γ)}[F(~x)G(~y)] = E_~x[T_{1−γ}F(~x) · H(~x)] ≤ ‖T_{1−γ}F‖_q ‖H‖_{q′}
            ≤ ‖F‖_2 ‖H‖_{q′} ≤ ‖F‖_2 ‖T_{1−γ}G‖_{q′} ≤ ‖F‖_2 ‖G‖_{q′}
            = Γ^{−(1/2 + 1/q′)} = Γ^{−(1 + δ′/(2(2+δ′)))} ,

using Hölder's inequality, the hypercontractivity from Lemma 2.4, and Jensen's inequality.


   Note that the above lemma also applies to pairs of queries to the same function, by setting F = G. To
complete the proof of Theorem 5.1, we apply the lemma to every distinct pair of the m queries made in
T_π^{(γ)}(D) and to each of the at most 2Γ pairs of overlapping buckets.


Proof of Theorem 5.1. Note that the bucketed payoff ℘^{(f,g)}((F, G) ∘ (X, Y)) equals the true payoff
P((f, g) ∘ (X, Y)) except possibly when at least two elements of (F, G) ∘ (X, Y) fall in an overlapping
pair of buckets. Fix a pair of inputs, say x^{(i)} and y^{(j)}; the argument is identical when choosing two inputs
from X or two from Y. Let a = F(x^{(i)}) and b = G(y^{(j)}). By Lemma 5.2, there are at most 2Γ possible
values (a, b) such that the buckets indexed by a and b overlap. By Lemma 5.3, the probability that
F(x^{(i)}) = a and G(y^{(j)}) = b is at most Γ^{−1−δ}. By a union bound, the two outputs F(x^{(i)}) and G(y^{(j)})
consequently fall in overlapping buckets with probability at most 2Γ^{−δ}. As there are at most
(m choose 2) ≤ m²/2 pairs of outputs, the proof is complete.

                     T HEORY OF C OMPUTING, Volume 11 (10), 2015, pp. 257–283                               275
                      P ER AUSTRIN , R AJSEKAR M ANOKARAN , AND C ENNY W ENNER

5.2   Soundness of the dictatorship test
We now reap the benefits of bucketing and prove the decoupling property of the dictatorship test alluded
to in Section 3.
Theorem 5.4. For every predicate P and distribution D satisfying the conditions of Theorem 3.4, and
any noise rate γ > 0, projection π : R → L, and bucketing parameter Γ, the following holds. For any
functions f : Q_1^L → Z, g : Q_2^R → Z with bucketing functions F : Q_1^L → [Γ], G : Q_2^R → [Γ],

    | BAcc_{f,g}(T_π^{(γ)}(D)) − BAcc_{f,g}(T_π^{(γ)}(D^⊥)) | ≤ γ^{−1/2} m^{1/2} 4^m Γ^m ( ∑_{a,b∈[Γ]} CrInf_π^{(1−γ)}( F^{(a)}, G^{(b)} ) )^{1/2}.

    Recall that the decoupled version D^⊥ of a base distribution D is obtained by combining two inde-
pendent samples of D: one for the first t coordinates and one for the remaining coordinates. A similar
claim for the true acceptance probabilities of the dictatorship test is now a simple corollary of the above
theorem together with Theorem 5.1. This will be used later in extending the decoupling property to our
general inapproximability reduction.
Lemma 5.5. For every predicate P and distribution D satisfying the conditions of Theorem 3.4, and any
noise rate γ > 0, projection π : R → L, and bucketing parameter Γ, the following holds. For any functions
f : Q_1^L → Z, g : Q_2^R → Z with bucketing functions F : Q_1^L → [Γ], G : Q_2^R → [Γ],

    | Acc_{f,g}(T_π^{(γ)}(D)) − Acc_{f,g}(T_π^{(γ)}(D^⊥)) |
        ≤ γ^{−1/2} m^{1/2} 4^m Γ^m ( ∑_{a,b∈[Γ]} CrInf_π^{(1−γ)}( F^{(a)}, G^{(b)} ) )^{1/2} + 2Γ^{−δ} m².        (5.3)

     The proof of the theorem is via an invariance-style theorem and uses techniques found in the works of
Mossel [18], Samorodnitsky and Trevisan [22], and Wenner [24]. Frankly, it mostly involves introducing
new notation in order to apply existing machinery. The notion of lifted functions may be the most alien: a
large-side table g : [q_2]^R → ℝ may equivalently be seen as a function g′_π : (Ω′)^L → ℝ, where Ω′ = [q_2]^d
collects the values of all d coordinates in R projecting to the same coordinate in L; g′_π is called the lifted
analogue of g with respect to the projection π.
     The first lemma proved in this section essentially says that if a product of functions is influential, then
at least one of the factors is influential. The second says that if the lifted analogue of a function g
is influential for some coordinate i ∈ L, then g is influential for some coordinate projecting to i. These
lemmas will be used, after massaging the expression, to decouple the small-side table from the lifted
analogue of the large-side table as a function of their cross influence.
Lemma 5.6 (Mossel [18, Lemma 6.5]). Let f_1, …, f_t : Ω^n → [0, 1] be arbitrary functions. Then, for any
j ∈ [n],

    Inf_j( ∏_{r=1}^t f_r ) ≤ t ∑_{r=1}^t Inf_j( f_r ).
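As a sanity check, the inequality of Lemma 5.6 can be verified by brute force on a toy Boolean domain. The following sketch is our own illustration, not part of the paper's machinery; `influence` computes Inf_j exactly on the uniform cube {0,1}^n:

```python
from itertools import product

def influence(f, n, j):
    """Inf_j(f): expectation over the other coordinates of the variance of f
    as coordinate j is re-randomized, on the uniform cube {0, 1}^n."""
    total = 0.0
    for x in product((0, 1), repeat=n):
        # Values of f with coordinate j set to 0 and 1, other coords from x.
        vals = [f(x[:j] + (b,) + x[j + 1:]) for b in (0, 1)]
        mean = sum(vals) / 2
        total += sum((v - mean) ** 2 for v in vals) / 2
    return total / 2 ** n

# Two [0, 1]-valued functions on {0,1}^3 and their product (t = 2).
f1 = lambda x: (x[0] + x[1]) / 2
f2 = lambda x: x[0] * x[2]
prod_f = lambda x: f1(x) * f2(x)

for j in range(3):
    # Lemma 5.6: Inf_j(f1 * f2) <= t * (Inf_j(f1) + Inf_j(f2)) with t = 2.
    assert influence(prod_f, 3, j) <= 2 * (influence(f1, 3, j) + influence(f2, 3, j)) + 1e-12
```

For instance, a dictator f(x) = x_0 on one bit has Inf_0(f) = 1/4, the variance of a uniform bit.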
Lemma 5.7. Consider a function g : Ω^R → ℝ, a map π : R → L, and an arbitrary bijection
ς : R ↔ {(ℓ, t) : ℓ ∈ L, t ∈ [|π^{−1}(ℓ)|]} such that ς(j) = (i, t) for some t if and only if π(j) = i. Introduce
Ω′_ℓ := Ω^{|π^{−1}(ℓ)|} and define g′ : ∏_{ℓ∈L} Ω′_ℓ → ℝ by g′(~z) = g(~y) where y_j = z_{ς(j)}. Then, for every i ∈ L,

    Inf_i( g′ ) ≤ ∑_{j∈π^{−1}(i)} Inf_j( g ).




Proof. Using the definitions and the law of total conditional variance,

    Inf_i( g′ ) = E_{Ω′_{−i}}[ Var_{Ω′_i}( g′ ) ] = E_{~y_{R∖π^{−1}(i)}}[ Var_{~y_{π^{−1}(i)}}( g ) ]
               ≤ ∑_{j∈π^{−1}(i)} E_{Ω_{−j}}[ Var_{Ω_j}( g ) ] = ∑_{j∈π^{−1}(i)} Inf_j( g ).


Theorem 5.8 (Wenner [24, Theorem 3.21]). Consider functions { f^{(r)} ∈ L^∞(Ω_r^n) }_{r∈[m]} on a probability
space P = ( ∏_{i=1}^m Ω_i, P )^{⊗n}, a set M ⊊ [m], and the collection C of minimal¹ sets C ⊆ [m], C ⊄ M, such that
the spaces { Ω_i }_{i∈C} are dependent. Then,

    | E[ ∏_{r=1}^m f^{(r)} ] − ∏_{r∉M} E[ f^{(r)} ] · E[ ∏_{r∈M} f^{(r)} ] |
        ≤ 4^m max_{C∈C} min_{r′∈C} ( TotInf( f^{(r′)} ) ∑_ℓ ∏_{r∈C∖{r′}} Inf_ℓ( f^{(r)} ) )^{1/2} ∏_{r∉C} ‖ f^{(r)} ‖_∞.


Proof of Theorem 5.4
                                                                       (γ)
Proof. We massage the expression BAcc_{f,g}(T_π^{(γ)}(D)) into a form suitable for applying Theorem 5.8. Recall
that F^{(a)} denotes the indicator of “F(x) = a” and similarly G^{(b)} of “G(y) = b.” In terms of these indicators,
BAcc_{f,g}(T_π^{(γ)}(D)) equals

    ∑_{~a∈[Γ]^t, ~b∈[Γ]^{m−t}} ℘(~a, ~b) E_{T_π^{(γ)}(D)}[ ∏_{r=1}^t F^{(a_r)}(x^{(r)}) ∏_{r=t+1}^m G^{(b_{r−t})}(y^{(r)}) ].


Consequently, | BAcc_{f,g}(T_π^{(γ)}(D)) − BAcc_{f,g}(T_π^{(γ)}(D^⊥)) | may be bounded from above by

    ∑_{~a,~b} ℘(~a, ~b) | E_{T_π^{(γ)}(D)}[ ∏_{r=1}^t F^{(a_r)}(x^{(r)}) ∏_{r=t+1}^m G^{(b_{r−t})}(y^{(r)}) ]
                        − E_{T_π^{(γ)}(D^⊥)}[ ∏_{r=1}^t F^{(a_r)}(x^{(r)}) ∏_{r=t+1}^m G^{(b_{r−t})}(y^{(r)}) ] .        (5.4)

    We note that 0 ≤ ℘ ≤ 1 and proceed to bound the difference within the summation for a fixed (~a, ~b). To
this end, we make a slight change of notation, as discussed previously. The new notation may seem
cumbersome; the high-level picture is that we group the first set of functions into a single function
receiving a single argument over a larger domain, and redefine the latter functions to take arguments
indexed by L instead of R. Also recall that we intend to reduce from LC instances with projection degrees
exactly d for some d = d(ε_LC).
    Define m′ = m − t + 1, Ω_1 = [q_1]^t, and Ω_2 = ⋯ = Ω_{m′} = [q_2]^d. Let ς be a bijection L × [d] ↔ R such that
π(ς(i, j′)) = i; i. e., for each i ∈ L, we group together the d coordinates in R mapping to i. Introduce the
distribution

    Ω_1^L × ⋯ × Ω_{m′}^L ∋ (~w, z^{(2)}, …, z^{(m′)}) ∼ R(D)
   ¹ Minimal sets of dependent outcome spaces in the sense that if C ∈ C, then the outcome spaces of every strict subset of C
are independent.



                                                                            (γ)
which samples (x(1) , . . . , x(t) , y(t+1) , . . . , y(m) ) from Tπ (D), setting

                                                       (r)                 (r)             (r)
                                          wi,r = xi          and         zi = (yς −1 (i, j0 ) )d−1
                                                                                               j0 =0 .

Let

    W(~w) := ∏_{r=1}^t T_{1−γ} F^{(a_r)}( x^{(r)} ),

where x_i^{(r)} = w_{i,r}; similarly, for 2 ≤ r ≤ m′, let the lifted function H^{(r)} : Ω_r^L → ℝ be defined by

    H^{(r)}(z) = T_{1−γ} G^{(b_{r−1})}( y ),

where y_{ς(i, j′)} = z_{i, j′}. In the new notation, the difference within the summation in (5.4) is

    | E_{R(D)}[ W(~w) ∏_{r=2}^{m′} H^{(r)}( z^{(r)} ) ] − E_{R(D^⊥)}[ W(~w) ∏_{r=2}^{m′} H^{(r)}( z^{(r)} ) ] |.        (5.5)


     We note that R is a product distribution, R = µ^{⊗L} for some µ, and that for any 2 ≤ r ≤ m′, Ω_r is independent
of Ω_1 due to the premise that D_r is independent of D_{≤t}. Choosing M = {2, …, m′}, every minimal index set C of
dependent spaces in µ not contained in M contains 1 and at least two elements of M; i. e., C ∈ C implies
1, e, e′ ∈ C for some e ≠ e′ ∈ {2, …, m′}.
     Applying Theorem 5.8 and choosing r′ ≠ 1 bounds the difference (5.5) by

    4^m ( max_{C∈C} min_{1≠e∈C} TotInf( H^{(e)} ) ∑_i Inf_i( W ) ∏_{t∈C∖{1,e}} Inf_i( H^{(t)} ) )^{1/2} ∏_{r∉C} ‖ H^{(r)} ‖_∞.        (5.6)


   As the functions { G^{(b)} } take values in [0, 1], so do the { H^{(r)} }, and consequently the influences and
infinity norms in (5.6) are upper-bounded by one on account of Lemma 2.8, yielding at most

    (5.6) ≤ 4^m ( max_{e≠1} TotInf( H^{(e)} ) · max_{e≠1} ∑_i Inf_i( W ) Inf_i( H^{(e)} ) )^{1/2}.        (5.7)


      We recall that W = ∏_{r=1}^t T_{1−γ} F^{(a_r)} and hence, by Lemma 5.6,

    Inf_i( W ) ≤ t ∑_{r=1}^t Inf_i( T_{1−γ} F^{(a_r)} ) = t ∑_{r=1}^t Inf_i^{(1−γ)}( F^{(a_r)} ).

Similarly, Lemma 5.7 implies that

    Inf_i( H^{(e)} ) ≤ ∑_{j∈π^{−1}(i)} Inf_j( T_{1−γ} G^{(b_{e−1})} ) = ∑_{j∈π^{−1}(i)} Inf_j^{(1−γ)}( G^{(b_{e−1})} ).



Returning to Equation (5.7), we have the bound

    (5.7) ≤ 4^m ( t max_{e>t} TotInf^{(1−γ)}( G^{(b_e)} ) · max_{e>t} ∑_i ∑_{r=1}^t Inf_i^{(1−γ)}( F^{(a_r)} ) ∑_{j∈π^{−1}(i)} Inf_j^{(1−γ)}( G^{(b_e)} ) )^{1/2}.        (5.8)


Using t ≤ m; max_e ⋯ ≤ ∑_e ⋯ for non-negative expressions; bounding the total noisy influence by γ^{−1}
from Lemma 2.8; and identifying the inner sum as a cross influence, we establish the desired bound on
the considered difference:

    (5.5) ≤ (5.8) ≤ 4^m ( t γ^{−1} ∑_{r,e} CrInf_π^{(1−γ)}( F^{(a_r)}, G^{(b_e)} ) )^{1/2} ≤ γ^{−1/2} m^{1/2} 4^m ( ∑_{r,r′} CrInf_π^{(1−γ)}( F^{(a_r)}, G^{(b_{r′})} ) )^{1/2}.

Finally, as we noted before, 0 ≤ ℘ ≤ 1 and there are at most Γ^m terms in the summation, so (5.4) is at most

    γ^{−1/2} m^{1/2} 4^m Γ^m ( ∑_{a,b∈[Γ]} CrInf_π^{(1−γ)}( F^{(a)}, G^{(b)} ) )^{1/2}.


5.3   Soundness of the reduction
With the soundness for the dictatorship test in place, proving the soundness of the reduction (Theorem 3.4)
is a relatively standard task of constructing noisy-influence decoding strategies.
     The proof follows immediately from the more general estimate given in the following lemma by
taking

    Γ = ⌈ (4m²/ε)^{1/δ} ⌉    and then    ε_LC = ( εγ^{3/2} / (m^{1/2} 4^m Γ^{m+1}) )².
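These parameter choices can be evaluated numerically; the following sketch is our own illustration, with variable names mirroring the symbols above:

```python
from math import ceil

def reduction_parameters(m, eps, delta, gamma):
    """Evaluate Gamma = ceil((4 m^2 / eps)^(1/delta)) and
    eps_LC = (eps * gamma^(3/2) / (m^(1/2) * 4^m * Gamma^(m+1)))^2."""
    Gamma = ceil((4 * m ** 2 / eps) ** (1 / delta))
    eps_lc = (eps * gamma ** 1.5 / (m ** 0.5 * 4 ** m * Gamma ** (m + 1))) ** 2
    return Gamma, eps_lc

Gamma, eps_lc = reduction_parameters(m=3, eps=0.25, delta=0.5, gamma=0.01)
# With this Gamma, the bucketing error 2 * Gamma^(-delta) * m^2 from
# Theorem 5.1 is at most eps / 2.
assert 2 * Gamma ** -0.5 * 3 ** 2 <= 0.25 / 2 + 1e-9
```

The choice of Γ is exactly what makes the bucketing error term 2Γ^{−δ}m² at most ε/2, as the assertion checks for one sample setting.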

Lemma 5.9. Given an LC instance L = (U, V, E, L, R, Π), a collection of functions f_u : Q_1^L → Z for
u ∈ U and g_v : Q_2^R → Z for v ∈ V, and Γ, γ, δ as in this section,

    E_{e=(u,v)∼E}[ | Acc_{f_u,g_v}(T_{π_e}^{(γ)}(D)) − Acc_{f_u,g_v}(T_{π_e}^{(γ)}(D^⊥)) | ] ≤ γ^{−3/2} m^{1/2} 4^m Γ^{m+1} val(L)^{1/2} + 2Γ^{−δ} m².


Proof. For a function f : Q_1^L → Z, define a distribution Ψ(f) over L as follows. First pick a ∼ [Γ]
uniformly; then pick ℓ ∈ L with probability γ · Inf_ℓ^{(1−γ)}( F^{(a)} ) and otherwise an arbitrary label. Note that,
by Lemma 2.8,

    ∑_{ℓ∈L} Inf_ℓ^{(1−γ)}( F^{(a)} ) ≤ 1/γ,

and so picking ℓ ∈ L with the given probabilities is possible. Define Ψ(g) over R for g : Q_2^R → Z similarly.
Now define a labeling of L by, for each u ∈ U (and v ∈ V), sampling a label from Ψ(f_u) (resp. Ψ(g_v)),
independently.
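In concrete terms, the label distribution Ψ(f) can be sketched as follows; the finite influence tables, the fallback rule, and the function name are our own hypothetical illustration:

```python
def decoding_distribution(bucket_influences, gamma, labels):
    """Label distribution Psi(f): pick a bucket a in [Gamma] uniformly, then
    pick label l with probability gamma * Inf_l^{(1-gamma)}(F^{(a)}), and
    assign the remaining mass to an arbitrary label (here: the first one).

    `bucket_influences` maps each bucket to {label: noisy influence}; by
    Lemma 2.8, the influences within a bucket sum to at most 1/gamma, so
    the remaining mass is non-negative.
    """
    Gamma = len(bucket_influences)
    probs = {l: 0.0 for l in labels}
    for infl in bucket_influences.values():
        assigned = 0.0
        for l, v in infl.items():
            probs[l] += gamma * v / Gamma
            assigned += gamma * v
        probs[labels[0]] += (1.0 - assigned) / Gamma  # arbitrary fallback label
    return probs
```

For instance, with a single bucket of influences {x: 4, y: 2} and γ = 0.1, the labels x and y receive mass 0.4 and 0.2 from their influences, and the remaining 0.4 falls back on x.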


   For an edge e = (u, v) ∈ E, the probability that e is satisfied by the labeling equals
P( π_e(Ψ(f_u)) = Ψ(g_v) ), which can be lower-bounded by

    ∑_{i,j : π_e(j)=i} γ² E_{a,b∈[Γ]}[ Inf_i^{(1−γ)}( F_u^{(a)} ) Inf_j^{(1−γ)}( G_v^{(b)} ) ] = (γ/Γ)² ∑_{a,b} CrInf_{π_e}^{(1−γ)}( F_u^{(a)}, G_v^{(b)} ).

Taking the expectation over all edges of L, we get that the fraction of satisfied constraints satisfies

    (γ/Γ)² E_{e=(u,v)}[ ∑_{a,b} CrInf_{π_e}^{(1−γ)}( F_u^{(a)}, G_v^{(b)} ) ] ≤ val(L),

and by concavity of the square-root function, this implies that

    E_{e=(u,v)}[ ( ∑_{a,b} CrInf_{π_e}^{(1−γ)}( F_u^{(a)}, G_v^{(b)} ) )^{1/2} ] ≤ Γ γ^{−1} val(L)^{1/2}.

Plugging this bound on the total cross influence into the soundness for the test, Lemma 5.5, we obtain

    E_{e=(u,v)}[ | Acc_{f_u,g_v}(T_{π_e}^{(γ)}(D)) − Acc_{f_u,g_v}(T_{π_e}^{(γ)}(D^⊥)) | ] ≤ γ^{−3/2} m^{1/2} 4^m Γ^{m+1} val(L)^{1/2} + 2Γ^{−δ} m²,

as desired.


References
 [1] Sanjeev Arora, Boaz Barak, and David Steurer: Subexponential algorithms for Unique
     Games and related problems. In Proc. 51st FOCS, pp. 563–572. IEEE Comp. Soc. Press, 2010.
     [doi:10.1109/FOCS.2010.59] 258

 [2] Sanjeev Arora, Carsten Lund, Rajeev Motwani, Madhu Sudan, and Mario Szegedy:
     Proof verification and the hardness of approximation problems. J. ACM, 45(3):501–555,
     1998. Versions in ECCC and FOCS’92. [doi:10.1145/278298.278306] 260, 262

 [3] Sanjeev Arora and Shmuel Safra: Probabilistic checking of proofs: A new characterization
     of NP. J. ACM, 45(1):70–122, 1998. Preliminary version in FOCS’92. [doi:10.1145/273865.273901]
     260

 [4] Per Austrin, Rajsekar Manokaran, and Cenny Wenner: On the NP-hardness of
     approximating Ordering Constraint Satisfaction Problems. In Proc. 16th Internat. Workshop on
     Approximation Algorithms for Combinatorial Optimization Problems (APPROX’13), pp. 26–41,
     2013. [doi:10.1007/978-3-642-40328-6_3] 257

 [5] Boaz Barak, Fernando G. S. L. Brandão, Aram Wettroth Harrow, Jonathan A.
     Kelner, David Steurer, and Yuan Zhou: Hypercontractivity, sum-of-squares
     proofs, and their applications. In Proc. 44th STOC, pp. 307–326. ACM Press, 2012.
     [doi:10.1145/2213977.2214006, arXiv:1205.4484] 258


 [6] Siu On Chan: Approximation resistance from pairwise independent subgroups. In Proc. 45th
     STOC, pp. 447–456. ACM Press, 2013. Version in ECCC. [doi:10.1145/2488608.2488665] 258,
     273

 [7] Moses Charikar, Venkatesan Guruswami, and Rajsekar Manokaran: Every per-
     mutation CSP of arity 3 is approximation resistant. In Proc. 24th IEEE Conf. on Computational
     Complexity (CCC’09), pp. 62–73, 2009. [doi:10.1109/CCC.2009.29] 259

 [8] Moses Charikar, Konstantin Makarychev, and Yury Makarychev: On the advantage
     over random for Maximum Acyclic Subgraph. In Proc. 48th FOCS, pp. 625–633. IEEE Comp. Soc.
     Press, 2007. Version in ECCC. [doi:10.1109/FOCS.2007.65] 259

 [9] Benny Chor and Madhu Sudan: A geometric approach to betweenness. SIAM J. Discrete
     Math., 11(4):511–523, 1998. Preliminary version in ESA’95. [doi:10.1137/S0895480195296221]
     258, 259

[10] Bradley Efron and Charles Stein: The jackknife estimate of variance. Annals of Statistics,
     9(3):586–596, 1981. [doi:10.1214/aos/1176345462] 264

[11] Lars Engebretsen and Jonas Holmerin: More efficient queries in PCPs for NP and
     improved approximation hardness of maximum CSP. Random Struct. Algorithms, 33(4):497–514,
     2008. Preliminary version in STACS’05. [doi:10.1002/rsa.20226] 258

[12] Michael R. Garey and David S. Johnson: Computers and Intractability: A Guide to the
     Theory of NP-Completeness. W. H. Freeman & Co., 1979. 258

[13] Venkatesan Guruswami, Johan Håstad, Rajsekar Manokaran, Prasad Raghaven-
     dra, and Moses Charikar: Beating the random ordering is hard: Every ordering CSP
     is approximation resistant. SIAM J. Comput., 40(3):878–914, 2011. Version in ECCC.
     [doi:10.1137/090756144] 258, 259, 260

[14] Venkatesan Guruswami, Rajsekar Manokaran, and Prasad Raghavendra: Beating
     the random ordering is hard: Inapproximability of Maximum Acyclic Subgraph. In Proc. 49th
     FOCS, pp. 573–582. IEEE Comp. Soc. Press, 2008. [doi:10.1109/FOCS.2008.51] 258, 259, 260

[15] Venkatesan Guruswami, Prasad Raghavendra, Rishi Saket, and Yi Wu: Bypassing
     UGC from some optimal geometric inapproximability results. In Proc. 23rd Ann. ACM-SIAM
     Symp. on Discrete Algorithms (SODA’12), pp. 699–717. ACM Press, 2012. Available on ACM DL.
     Preliminary version in ECCC. 258

[16] Johan Håstad: Some optimal inapproximability results. J. ACM, 48(4):798–859, 2001. Prelimi-
     nary version in STOC’97. [doi:10.1145/502090.502098] 258, 261

[17] Subhash Khot: On the power of unique 2-prover 1-round games. In Proc. 34th STOC, pp.
     767–775. ACM Press, 2002. Also in CCC’02. [doi:10.1145/509907.510017] 258


[18] Elchanan Mossel: Gaussian bounds for noise correlation of functions. Geometric and Func-
     tional Analysis, 19(6):1713–1756, 2010. Preliminary version in FOCS’08. [doi:10.1007/s00039-
     010-0047-x] 263, 264, 276

[19] Alantha Newman: The Maximum Acyclic Subgraph Problem and degree-3 graphs. In Proc.
     4th Internat. Workshop on Approximation Algorithms for Combinatorial Optimization Problems
     (APPROX’01), pp. 147–158. Springer, 2001. [doi:10.1007/3-540-44666-4_18] 258, 259, 261

[20] Ran Raz: A parallel repetition theorem. SIAM J. Comput., 27(3):763–803, 1998. Preliminary
     version in STOC’95. [doi:10.1137/S0097539795280895] 260, 262

[21] Alex Samorodnitsky and Luca Trevisan: A PCP characterization of NP with op-
     timal amortized query complexity. In Proc. 32nd STOC, pp. 191–199. ACM Press, 2000.
     [doi:10.1145/335305.335329] 258

[22] Alex Samorodnitsky and Luca Trevisan: Gowers uniformity, influence of variables,
     and PCPs. SIAM J. Comput., 39(1):323–360, 2009. Preliminary version in STOC’06.
     [doi:10.1137/070681612] 265, 276

[23] Luca Trevisan, Gregory B. Sorkin, Madhu Sudan, and David P. Williamson:
     Gadgets, approximation, and linear programming. SIAM J. Comput., 29(6):2074–2097, 2000.
     Preliminary version in FOCS’96. [doi:10.1137/S0097539797328847] 261

[24] Cenny Wenner: Circumventing d-to-1 for approximation resistance of satisfiable predicates
     strictly containing parity of width at least four. Theory of Computing, 9(23):703–757, 2013.
     Preliminary version in APPROX’12. [doi:10.4086/toc.2013.v009a023] 262, 273, 276, 277

[25] Paweł Wolff: Hypercontractivity of simple random variables. Studia Mathematica, 180(3):219–
     236, 2007. [doi:10.4064/sm180-3-3] 263


AUTHORS

     Per Austrin
     associate professor
     KTH – Royal Institute of Technology, Stockholm, Sweden
     austrin kth se
     http://www.csc.kth.se/~austrin


     Rajsekar Manokaran
     postdoc
     KTH – Royal Institute of Technology, Stockholm, Sweden
     rajsekar csc kth se
     http://www.csc.kth.se/~rajsekar




    Cenny Wenner
    Ph. D. student
    KTH – Royal Institute of Technology, Stockholm, Sweden
    cenny cwenner net
    http://www.cwenner.net


ABOUT THE AUTHORS

     Per Austrin graduated from KTH – Royal Institute of Technology in Sweden in 2008. His
         advisor was Johan Håstad. Following this, he did post-docs at Institut Mittag-Leffler,
         New York University, the University of Toronto, and Aalto University. In his spare time
         he enjoys writing tepid biographic blurbs.


     Rajsekar Manokaran received his Ph. D. from Princeton University under the supervi-
        sion of Sanjeev Arora. He is currently a postdoctoral researcher with Johan Håstad at
        KTH in Stockholm.


     Cenny Wenner is a Ph. D. student at Stockholm University and KTH – Royal Institute of
        Technology, advised by Johan Håstad and Viggo Kann. His research area is hardness of
        approximation, with a particular interest in extending PCP machinery to problems
        which seem to present fundamental technical obstacles to it. His spare
        time is occupied by activities such as playing Go and Mahjong, programming, volunteering,
        and daydreaming about making a difference in the world when he grows up.



