Inverse Conjecture for the Gowers Norm is False

Authors Shachar Lovett, Roy Meshulam, Alex Samorodnitsky,
Plaintext
                             T HEORY OF C OMPUTING, Volume 7 (2011), pp. 131–145
                                          www.theoryofcomputing.org




                             Inverse Conjecture for the
                               Gowers Norm is False
              Shachar Lovett∗                   Roy Meshulam†                    Alex Samorodnitsky‡
                     Received: October 8, 2008; revised: April 29, 2011; published: July 12, 2011.




         Abstract: Let p be a fixed prime number and N be a large integer. The “Inverse Conjecture
         for the Gowers norm” states that if the “d-th Gowers norm” of a function f : FNp → F p
         is non-negligible, that is, larger than a constant independent of N, then f is non-trivially
         correlated to a degree-(d − 1) polynomial. The conjecture is known to hold for d = 2, 3
         and for any prime p. In this paper we show the conjecture to be false for p = 2 and d = 4,
         by presenting an explicit function whose 4-th Gowers norm is non-negligible, but whose
         correlation to any polynomial of degree 3 is exponentially small. Essentially the same result
         (with different correlation bounds) was independently obtained by Green and Tao (2009).

ACM Classification: F.2.2
AMS Classification: 05E99
Key words and phrases: Inverse Gowers conjecture, additive combinatorics, Gowers norm

1      Introduction
We consider multivariate functions over finite fields. The main question of interest here is whether these
functions can be non-trivially approximated by a low-degree polynomial. Fix a prime p. Let F = F p
denote the finite field with p elements. Let ω = e2πi/p be the primitive p-th root of unity. Denote by e(x)
the exponential function taking x ∈ F to ω x ∈ C. For two functions f , g : FN → F, let

                                               h f , gi := Ex [e( f (x) − g(x))] .
     ∗ Supported by DMS grant 0835373.
     † Supported by ISF and BSF grants
     ‡ Supported by ISF grant 039-7682 and by BSF grant 2006377.



    2011 Shachar Lovett, Roy Meshulam, and Alex Samorodnitsky
    Licensed under a Creative Commons Attribution License                                DOI: 10.4086/toc.2010.v007a009
                    S HACHAR L OVETT, ROY M ESHULAM , AND A LEX S AMORODNITSKY

    A function f is non-trivially approximable by a degree-d polynomial if

                                                  |h f , gi| > ε

for some polynomial g(x) of degree at most d over F. More precisely, in this paper we are looking at
a sequence fN of functions and of approximating low-degree polynomials gN in N variables, and let N
grow to infinity, where the remaining parameters, that is the field size p, the degree d and the offset ε are
fixed, independent of N.
Definition 1.1. Fix a finite field F = F p . A sequence of functions { fN : FN → F} is non-trivially
approximable by degree-d polynomials if there exists a sequence of degree-d polynomials {gN : FN → F}
and an offset ε > 0 such that for all N,
                                             |h fN , gN i| > ε .
    A counting argument shows that a generic function cannot be approximated by a polynomial of
low degree. The problem of showing a specific given function to have no non-trivial approximation by
low-degree polynomials has been extensively investigated, since solutions to this problem have many
applications in complexity (cf. discussion and references in [1, 3, 16], as well as an excellent survey by
Viola on correlation bounds [15]).
    This paper studies a technical tool that measures distance from low-degree polynomials. This is the
Gowers norm, introduced in [4]. For a function f : FN → F and a vector y ∈ FN , we take fy to be the
directional derivative of f in direction y by setting

                                           fy (x) := f (x + y) − f (x) .

For a k-tuple of vectors y1 , . . . , yk ∈ Fn we take the iterated derivative in these directions to be
                                                                          
                                              fy1 ,...,yk := fy1 ,...,yk−1 y .
                                                                     k

It is easy to see that this definition does not depend on the ordering of y1 , . . . , yk . The “k-th Gowers norm”
k f kU k of f is
                                                                                         k
                                     k f kU k = (Ex,y1 ,...,yk [e ( fy1 ,...,yk (x))])1/2 .
More accurately, as shown in [4], this is indeed a norm of the associated complex-valued function e( f )
(for k ≥ 2).
     It is easy to see that k f kU d+1 is 1 iff f is a polynomial of degree at most d. This is just another way of
saying that all order-(d + 1) iterative derivatives of f are zero if and only if f is a polynomial of degree
at most d. It is also possible to see that |h f , gi| > ε for g of degree at most d, implies k f kU d+1 > ε [5].
That is to say, if f is non-trivially close to a degree-d polynomial, this can be detected via an appropriate
Gowers norm.
     This discussion naturally leads to the inverse conjecture [5, 9, 11], that is, if the (d + 1)-th Gowers
norm of f is non-trivial, then f is non-trivially approximable by a degree-d polynomial. This conjecture
is referred to as the Inverse Conjecture for the Gowers Norm (ICGN). This conjecture is easily seen to
hold for d = 1 and has been proved also for d = 2 [5, 9]. It is of interest to prove this conjecture for
higher values of d. As the conjecture remained open, special cases of the inverse conjecture were also
studied (i. e., the d-vs-(d − 1) conjecture [3], and the inverse conjecture for low-degree polynomials [6]).

                         T HEORY OF C OMPUTING, Volume 7 (2011), pp. 131–145                                 132
                        I NVERSE C ONJECTURE FOR THE G OWERS NORM IS FALSE

    Our main result is that ICGN is false. This note is aimed at providing a self-contained proof for the
case of p = 2. The result for general p can be found in the conference version [7], where the bounds
obtained are much weaker than the ones presented in this note.
    From now on we consider functions f : FN2 → F2 . For x ∈ FN2 let x(i) denote the i-th coordinate of x.
Let S4 denote the symmetric polynomial of degree 4 in N variables,

                                 S4 (x) =       ∑          x(i)x( j)x(k)x(`) .
                                            1≤i< j<k<`≤N

We prove two results, whose combination shows that ICGN is false over F2 .

Theorem 1.2. There exists an absolute constant c > 0 such that

                                                kS4 kU 4 ≥ c .

Theorem 1.3. There exists an absolute constant α < 1 such that for any cubic polynomial g(x) over F2

                                               hS4 , gi ≤ α N .

1.1   Related work
Our results have a large overlap with a recent work of Green and Tao [6].
    The paper of Green and Tao has two parts. In the first part ICGN is shown to be true when f is itself
a polynomial of degree less than p. In the second part, the conjecture is shown to be false in general. In
particular, the symmetric polynomial S4 is shown to be a counterexample for p = 2 and d = 4.
    The calculation of the 4-th Gowers norm of S4 in this paper as well as in [6] follows by a direct
calculation of the bias of the 4-th derivative polynomial of S4 , where the analysis in [6] is somewhat
simpler and cleaner. In any case, this is the simpler and more direct part of the proof in both papers.
    The proof of non-approximability of S4 by lower-degree polynomials in [6] uses a modification of a
Ramsey-type argument due to Alon and Beigel [1]. Very briefly, this argument shows that if a function
over F2 has a non-trivial correlation with a multilinear polynomial of degree d, then its restriction to
a subcube of smaller dimension has a non-trivial correlation with a symmetric polynomial of degree
d. The problem of inapproximability by symmetric polynomials turns out to be easier to analyze. This
analysis gives weaker bounds for non-inapproximability of S4 , in that it shows hS4 , gi < log−c (N) for
any degree-3 polynomial g and for an absolute constant c > 0. On the other hand, this argument is more
robust than our inapproximability argument, as it can be readily extended to the case of a general prime p.

1.2   The case of a general prime field
Let F = F p . Let Sd denote the symmetric polynomial of degree d in N variables,

                                       Sd (x) =      ∑         ∏ x(i) .
                                                  S⊂[N],|S|=d i∈S

    The polynomial S p2 over F p provides a counterexample for ICGN over F p . One can prove the
following:

                        T HEORY OF C OMPUTING, Volume 7 (2011), pp. 131–145                            133
                     S HACHAR L OVETT, ROY M ESHULAM , AND A LEX S AMORODNITSKY

   1. For any d ≥ 2p, there exists an absolute constant εd > 0, such that for any N

                                                     kSd kU d ≥ εd .


   2. For any polynomial g of degree at most p2 − 1,
                                                          2
                                                                −1
                                           S p2 , g ≤ log(p ) N     = oN (1)

      where log(t) denotes the t-fold iteration of the logarithm function.

   The proof of the first claim entails calculating and working with the iterated derivatives of Sd . It is
similar in the outline to the argument for S4 over F2 , but requires some additional technicalities. The
proof of the second claim follows by an appropriate adaptation of the argument of Alon and Beigel.


1.3   Subsequent work
Subsequent to this work, Bergelson, Tao, and Ziegler [2, 13, 14] proved a refined version of ICGN. Let
F : FNp → C be a function. The derivative of F in direction y ∈ FNp is defined as Fy (x) = F(x + y)F(x),
and iterated derivatives are defined analogously. Note that when F takes values which are p-roots of unity,
i. e., when F(x) = e f (x)2πi/p for some f : FNp → F p , then this definition of derivatives coincides with our
previous definition, i. e., Fy (x) = e fy (x)2πi/p .
      A function F : FNp → C is said to be a non-classical polynomial of degree d if for any y1 , . . . , yd+1 ∈ FNp
we have Fy1 ,...,yd+1 ≡ 1. Clearly, if f : FNp → F p is a degree d polynomial, then F = e f (x)2πi/p is a non-
classical polynomial of degree d. However, there exist other examples of non-classical polynomials. Let
                                                                                        `
 f (x) be a degree d − (p − 1)(` − 1) polynomial over Z p` . Then F(x) = e f (x)2πi/p is also a non-classical
polynomial of degree d (note that F is still evaluated over FNp ). In fact, this is a complete classification of
all non-classical polynomials [12].

Theorem 1.4 ([2, 13, 14]). Fix prime p, d ≥ 1 and ε > 0. Let F : FNp → C be a function such that
kFk∞ ≤ 1 and kFkU d+1 ≥ ε. Then there exists a non-classical polynomial G : FNp → C of degree d such
that
                                                            
                                |hF, Gi| = Ex∈FNp F(x)G(x) ≥ δ

where δ = δ (F p , d, ε) > 0, i. e., δ does not depend on N.

    Consider, in this framework, the counterexample of S4 over F2 discussed in this paper. For x ∈ Fn2 and
f (x) = x(1) + · · · + x(n) (mod 8), i. e., f is a linear polynomial over Z8 , observe that S4 (x) is the most
                                                                                                       2πi
significant bit of f (x). Moreover, it can be easily checked that (−1)S4 is correlated to F(x) = e 8 f (x) , a
non-classical polynomial of degree 3 (to verify, note that both functions depend only on the hamming
weight of x modulo 8). Thus, S4 has a noticeable 4-th Gowers norm, and is indeed correlated to a
non-classical cubic polynomial.

                          T HEORY OF C OMPUTING, Volume 7 (2011), pp. 131–145                                  134
                              I NVERSE C ONJECTURE FOR THE G OWERS NORM IS FALSE

2 S4 has high 4-Gowers norm
This section contains the proof of Theorem 1.2. We show that there exists a positive constant c > 0 such
that kS4 kU 4 ≥ c, independent of the number of variables N.
    We start by explicitly describing the 4-th iterative derivative of S4 .
Claim 2.1. Let S(y1 , y2 , y3 , y4 ) = ∑i1 6=i2 6=i3 6=i4 y1 (i1 )y2 (i2 )y3 (i3 )y4 (i4 ), where the sum is over distinct
elements i1 , i2 , i3 , i4 ∈ [N]. Then

                                             (S4 )y1 ,y2 ,y3 ,y4 (x) = S(y1 , y2 , y3 , y4 ) .

In particular,
                                          kS4 kU164 = Ey1 ,y2 ,y3 ,y4 [(−1)S(y1 ,y2 ,y3 ,y4 ) ] .
Proof. Let i1 , i2 , i3 , i4 be distinct elements of [N]. Consider a monomial m(x) = x(i1 )x(i2 )x(i3 )x(i4 ). Its
4-th iterated derivative in directions y1 , . . . , y4 is, by definition,
                                                                            !
                                                                    4
                                     my1 ,y2 ,y3 ,y4 (x) = ∑ ∏ x(i j ) + ∑ yk (i j )                .
                                                            I⊆[4] j=1               k∈I

Expanding the right hand side as a sum of monimials, we observe that the only monomials appearing an
odd number of times are of the form ∏4j=1 y j (iπ( j) ), where π is a permutation on 4 elements. Since we
are working in F2 , we conclude
                                                                              4
                                            my1 ,y2 ,y3 ,y4 (x) =       ∑ ∏ y j (iπ( j) ) .
                                                                    π∈Sym4 j=1

The claim now follows by summing over all monomials in S4 .

     Let M = M(y1 , y2 , y3 , y4 ) denote the 4 × N matrix over F2 whose rows are given by y1 , y2 , y3 , y4 . Let
Mi1 ,i2 ,i3 ,i4 denote the 4 × 4 minor of M restricted to columns i1 , i2 , i3 , i4 . Observe that S(y1 , y2 , y3 , y4 ) can
be expressed as the sum over all permanents of 4 × 4 minors of M.
Claim 2.2. S(y1 , y2 , y3 , y4 ) = ∑i1 <i2 <i3 <i4 Per(Mi1 ,i2 ,i3 ,i4 ).
     We recall the Binet-Cauchy formula.
Lemma 2.3. Let A be an m × n matrix with m ≤ n over a field. For I ⊂ [n] such that |I| = m, let AI be
the m × m minor of A restricted to columns of I. Then

                                                   ∑        Det(AI )2 = Det(AAt ) .
                                              I⊂[n],|I|=m

   Recall that x2 = x in F2 , and also that determinants and permanents are identical in F2 . Therefore we
may apply the Binet-Cauchy formula for M, using Claim 2.2, and conclude that
Corollary 2.4. S(y1 , y2 , y3 , y4 ) = Det(MMt ).

                              T HEORY OF C OMPUTING, Volume 7 (2011), pp. 131–145                                      135
                         S HACHAR L OVETT, ROY M ESHULAM , AND A LEX S AMORODNITSKY

Proof.

         S(y1 , y2 , y3 , y4 ) =      ∑          Per(MI ) =      ∑          Det(MI ) =      ∑          Det(MI )2 = Det(MMt ) .
                                   I⊂[n],|I|=4                I⊂[n],|I|=4                I⊂[n],|I|=4




    The matrix MMt is a 4 × 4 symmetric matrix, whose (i, j) entry is given by hyi , y j i. We next show that
the distribution of MMt where y1 , y2 , y3 , y4 are chosen uniformly from FN2 is very close to the distribution
of a uniformly chosen 4 × 4 symmetric matrix. Recall that the statistical distance between two random
variables X,Y is given by
                                                1
                               sd(X,Y ) = ∑ | Pr[X = a] − Pr[Y = a]| .
                                                2 a

Claim 2.5. Let X denote a uniform 4 × 4 symmetric matrix over F2 . Then the distribution of MMt for
uniformly chosen y1 , y2 , y3 , y4 ∈ FN2 is O(2−N ) close to the distribution of X (in statistical distance). In
particular
                                Pr[Det(MMt ) = 0] ≥ Pr[Det(X) = 0] − O(2−N ) .

    The proof of the claim uses the following fact, which is easily verified using standard Fourier analysis
in FN2 (see, e. g., [3]).

Claim 2.6. Let Y ∈ Fk2 be a random variable, such that for all non-zero c ∈ Fk2

                                                                               1
                                                       Pr[hY, ci = 0] −          ≤α.
                                                                               2

Then the distribution of Y is 2k · α close (in statistical distance) to the uniform distribution over Fk2 .

Proof of Claim 2.5. We can consider the symmetric matrix MMt as a vector in F10
                                                                             2 indexed by {(i, j) :
1 ≤ i ≤ j ≤ 4}. Let c ∈ F10
                         2  be a non-zero vector. We will prove that

                                                                             1
                                                  Pr[hMMt , ci = 0] −          ≤ O 2−N .
                                                                                      
                                                                             2

This will imply that the distribution of MMt is O(2−N ) close to the uniform distribution over symmetric
4 × 4 matrices. To prove this, for a1 , a2 , a3 , a4 ∈ F2 define the 4 × 4 symmetric matrix A(a1 , a2 , a3 , a4 )
whose (i, j) entry is ai a j . Note that MMt = ∑Ni=1 A(y1 (i), y2 (i), y3 (i), y4 (i)). Therefore,

                                             h           i  N  h                                  i
                                                     t
                  2 Pr[hMMt , ci = 0] − 1 = E (−1)hMM ,ci = ∏ E (−1)A(y1 (i),y2 (i),y3 (i),y4 (i)) .
                                                                                i=1

Thus we have                                                         h                           iN
                         2 Pr[hMMt , ci = 0] − 1 = Ea1 ,a2 ,a3 ,a4 ∈F2 (−1)hA(a1 ,a2 ,a3 ,a4 ),ci     .

                               T HEORY OF C OMPUTING, Volume 7 (2011), pp. 131–145                                               136
                            I NVERSE C ONJECTURE FOR THE G OWERS NORM IS FALSE

To conclude, note that hA(a1 , a2 , a3 , a4 ), ci is a nonzero quadratic polynomial over F2 in the variables
a1 , a2 , a3 , a4 . Standard bounds on the number of roots of polynomials over a field [10] imply

                                       1                                       3
                                         ≤ Pr[hA(a1 , a2 , a3 , a4 ), ci = 0] ≤ .
                                       4                                       4
Therefore
                                           2 Pr hMMt , ci = 0 − 1 ≤ 2−N
                                                            

which, by Claim 2.6, gives
                                              sd(MMt , X) ≤ 1024 · 2−N .



    To conclude the proof of Theorem 1.2, we use the following fact, which can be easily verified.
                                                                                              9
Claim 2.7. Let X be a uniformly chosen 4 × 4 symmetric matrix over F2 . Then Pr[Det(X) = 0] = 16 .

    We now conclude the proof of Theorem 1.2. Combining Claims 2.1, 2.4, 2.5 and 2.7,
                h                      i   h          t
                                                        i   h          i        1
   kS4 kU164 = E (−1)S(y1 ,y2 ,y3 ,y4 ) = E (−1)Det(MM ) ≥ E (−1)Det(X) − O 2−N ≥ − O 2−N .
                                                                                         
                                                                                 8

3 S4 has no correlation with cubics
This section contains the proof of Theorem 1.3. We show there is an absolute constant α > 0 such, that
for any cubic polynomial g in N variables holds

                                                 hS4 , gi < exp(−αN) .

    A first step is to observe that there is a relation between the inner product of two functions and the
average inner product of their derivatives.

Lemma 3.1. For any two functions f and g holds

                                                h f , gi4 ≤ Ey [h fy , gy i2 ] .

Proof. This is an immediate corollary of a lemma in [9], but we give the elementary proof for complete-
ness. By the Cauchy-Schwarz inequality,

    Ey [h fy , gy i2 ] ≥ Ey [h fy , gy i]2 = Ex,y [(−1) f (x)+ f (x+y)+g(x)+g(x+y) ]2 = E[(−1) f (x)+g(x) ]4 = h f , gi4 .



Corollary 3.2.
                                              h f , gi8 ≤ Ey,z [h fy,z , gy,z i2 ] .


                           T HEORY OF C OMPUTING, Volume 7 (2011), pp. 131–145                                               137
                     S HACHAR L OVETT, ROY M ESHULAM , AND A LEX S AMORODNITSKY

      We will show that for any polynomial g of degree at most 3 it holds that
                                       h                  i
                                                        2
                                   Ey,z (S4 )y,z , gy,z     ≤ exp(−αN) .

First, here is a brief overview of the argument.
      The advantage in taking second derivatives is that a second derivative of g is a linear function, and
a second derivative of S4 is a quadratic. Fix y, z and let L(x) = gy,z (x) and Q(x) = (S4 )y,z (x) be the
corresponding linear and quadratic function. The correlation between L and Q is the absolute value of the
Fourier coefficient of Q which corresponds to the character given by L. The Fourier spectrum of quadratic
functions is well understood using a theorem of Dixon. It turns out that given the choice of directions
y, z, the quadratic Q falls into one of two classes. For half of the choices for y, z, all Fourier coefficients
of Q are exponentially small, and hence in particular the correlation between Q and L is exponentially
small. For the remaining choices of y, z however, the quadratic function Q has only a constant number of
Fourier coefficients. However, they all lie in an explicitly given 3-dimensional affine subspace depending
on y, z. We then argue that for any fixed cubic polynomial g, the support of the character (−1)gy,z lies in
this affine subspace with exponentially small probability over y, z.
      We proceed with computing the second derivatives of S4 .

3.1     Second derivatives of S4
Fix directions y, z ∈ Fn2 , and let Q(x) = (S4 )y,z (x). Write

                                    Q(x) = ∑ qi, j x(i)x( j) + ∑ `i x(i) + c .
                                               i< j                      i

We start by computing the coefficients qi, j . Let S(y, z) = ∑k6=` y(k)z(`) = hy, zi + hy, 1i · hz, 1i, where
1 ∈ FN2 denotes the all-1 vector.

Claim 3.3.

             qi, j = S(y, z) + hy, 1i · z(i) + z( j) + hz, 1i · y(i) + y( j) + y(i)z( j) + y( j)z(i) .
                                                                                                  


Proof. Direct computation gives
                                              qi, j =      ∑ y(k)z(`) .
                                                           k6=`
                                                        k,`6∈{i, j}

Using the inclusion-exclusion formula yields

                 qi, j = ∑ y(k)z(`) −     ∑ y(k)z(`) − ∑ y(k)z(`) + ∑                       y(k)z(`)
                        k6=`            k∈{i, j}                 `∈{i, j}    {k,`}={i, j}
                                        `6∈{i, j}                k6∈{i, j}

                     = S(y, z) − (y(i) + y( j)) hz, 1i − z(i) − z( j)
                                                                     
                                                                                            
                               − (z(i) + z( j)) hy, 1i − y(i) − y( j) + y(i)z( j) + y( j)z(i) .

Rearranging, and using the fact that the computation is over F2 , we obtain the claim.

                          T HEORY OF C OMPUTING, Volume 7 (2011), pp. 131–145                              138
                            I NVERSE C ONJECTURE FOR THE G OWERS NORM IS FALSE

      At this point we invoke (a corollary of) a theorem of Dixon (see for example [8], Section 15, Theorem
5):
Theorem 3.4. Let Q(x) = ∑i< j qi, j x(i)x( j) + ∑i `i x(i) + c be a quadratic polynomial over F2 . Consider
the symmetric matrix B with zeros on the diagonal and off-diagonal entries given by Bi, j = B j,i = qi, j .
Let the rank of B = 2h (it is always even). Then the function (−1)Q has exactly 22h non-zero Fourier
coefficients all of absolute value 2−h . Moreover, all these coefficients lie in an 2h-dimensional affine
subspace of Fn2 .
    Consider the matrix B in our case. Some notation: let J be the matrix with 0 on the diagonal and 1 off
the diagonal. Let u ⊗ v denote the outer product uvt . Then

            B = S(y, z) · J + hy, 1i · z ⊗ 1 + 1 ⊗ z + hz, 1i · y ⊗ 1 + 1 ⊗ y + y ⊗ z + z ⊗ y .
                                                                                           

Since the rank of J is at least N − 1 and the rank of each of the remaining matrices is at most 2, the matrix
B is almost of full rank if S(y, z) = 1. In this case, by Theorem 3.4, the Fourier coefficients of (−1)Q are
exponentially small. In fact,
Corollary 3.5. If S(y, z) = 1 then, for any cubic polynomial g(x),

                                            (S4 )y,z , gy,z ≤ 128 · 2−N .

Proof. The matrix B has rank at least N − 7, since J has rank at least N − 1 and each matrix of the form
u ⊗ v is a rank one matrix. The claim now follows from Theorem 3.4.

    Recall that we wish to show that for any polynomial g of degree at most 3, the average value of
h(S4 )y,z , gy,z i for uniformly chosen y, z ∈ Fn2 is exponentially small in n. Corollary 3.5 shows that this
holds for any y, z whenever S(y, z) = 1. We, therefore, may assume that S(y, z) = 0 from now on. In this
case the quadratic part of Q may be simplified as
                                                                                                             
            Q(x) = ∑ qi, j x(i)x( j) = hy, 1i · hx, 1ihx, zi + hz, 1i · hx, 1ihx, yi + hx, yihx, zi + hx, yzi .
                     i< j

Here yz denotes the pointwise product of the vectors y and z, that is (yz)(i) = y(i)z(i).
    Observe, that the above computation implies the non-zero Fourier coefficients of ∑i< j qi, j x(i)x( j) lie
in an affine subspace of Fn2 of dimension at most 3, given by yz + Span(1, y, z).
    Next, consider the linear part h`, xi = ∑i `(i)x(i) of Q.
Claim 3.6. If S(y, z) = 0 then ` ∈ Span(y, z, 1).
Proof. Straightforward calculation gives that
                                                                                                        
`(i) =     ∑        y(k)y(l)z( j)+y( j)y(l)z(k)+y( j)y(k)z(l) + y( j)z(k)z(l)+y(k)z( j)z(l)+y(l)z( j)z(k) .
         j<k<l6=i

This can be directly verified to be equal to

            S(y, z) + S(z, z) + hz, 1i · y(i) + S(y, z) + S(y, y) + hy, 1i · z(i)
                                                                         

                                             + S(y, y) · hz, 1i + S(z, z) · hy, 1i + hy, zi · hy + z, 1i .
                                                                                                        


                            T HEORY OF C OMPUTING, Volume 7 (2011), pp. 131–145                              139
                       S HACHAR L OVETT, ROY M ESHULAM , AND A LEX S AMORODNITSKY

By assumption, S(y, z) = hy, 1i·hz, 1i+hy, zi = 0. Note that this also implies hy, zi·hy+z, 1i = 0, implying
       `(i) = S(z, z) + hz, 1i · y(i) + (S(y, y) + hy, 1i · z(i) + S(y, y) · hz, 1i + S(z, z) · hy, 1i .
                                                                                                    

Consequently, the linear part of Q may be written as

            ∑ `(i)x(i) = S(z, z) + hz, 1i · hx, yi + (S(y, y) + hy, 1i · hx, zi
                                                                     
                  i
                                                               + S(y, y) · hz, 1i + S(z, z) · hy, 1i · hx, 1i.
                                                                                                    



      Consequently, we deduce
Corollary 3.7. If S(y, z) = 0 then the non-zero Fourier coefficients of the polynomial
                                         Q = ∑ qi, j x(i)x( j) + ∑ `(i)x(i) + c
                                               i< j                   i

lie in the affine subspace AFy,z = yz + Span (y, z, 1).

3.2     Second derivatives of a fixed polynomial of degree 3
Let
                                             g(x) =    ∑ ai, j,k x(i)x( j)x(k)
                                                      i< j<k

be a polynomial of degree 3. For directions y, z ∈ FN2 , consider the second derivative gy,z = ∑i vy,z (i)x(i) +
cy,z . We need to show that the probability of the vector vy,z falling in the affine space AFy,z = yz +
Span (y, z, 1) is exponentially small.
      First, we fix some notation. For 1 ≤ i ≤ N, let Gi be a symmetric N × N matrix over F2 with
(Gi ) j,k = (Gi )k, j = ai, j,k for all j 6= k. (Here we think about {i, j, k} as an unordered subset of [N].) The
diagonal entries of Gi are set to 0. For future use note the important property (Gi ) j,k = (G j )i,k = (Gk )i, j .
      These matrices are relevant because they describe the vector vy,z .
Lemma 3.8.
      • vy,z (i) = coefx(i) (gy,z (x)) = hy, Gi zi.
      • An alternative representation of vy,z will be more convenient for us. For z ∈ FN2 , let G(z) =
        ∑Ni=1 z(i)Gi . Then
                                                 vy,z = G(z) · y .

Proof. For the first claim of the lemma, by linearity of the derivative, it suffices to consider the monomial
g(x) = x(i)x( j)x(k). This case can be easily verified directly.
    For the second claim, note that
                          N                      N             N                 N          N
       (G(z) · y)(l) = ∑ (G(z))k,l y(k) = ∑ y(k) · ∑ z(i) (Gi )k,l = ∑ y(k) · ∑ (Gl )k,i z(i) = hy, Gl zi.
                        k=1                     k=1        i=1                  k=1        i=1




                              T HEORY OF C OMPUTING, Volume 7 (2011), pp. 131–145                                140
                           I NVERSE C ONJECTURE FOR THE G OWERS NORM IS FALSE

    Consider the event {vy,z ∈ AFy,z }. This means vy,z = yz + uy,z , for some vector uy,z ∈ Span(y, z, 1).
There are only 8 possible choices for uy,z . For convenience, let us assume, without loss of generality (as
can be easily seen from the proof), that uy,z = y + z + 1 is the most popular one. By the lemma, the event
{vy,z = yz + uy,z } is the same as {G(z) · y = yz + uy,z }. To simplify things some more, let Ai = Gi + ei ⊗ ei ,
i = 1, . . . , N. That is, Ai = Gi but for (Ai )i,i = 1. Let A(z) = ∑Ni=1 z(i)Ai . Note that A(z) · y = G(z) · y + yz.
Hence {G(z) · y = yz + uy,z } is the same as {A(z) · y = uy,z = y + z + 1}.
    We conclude the proof by a technical claim.
Proposition 3.9. Let {Ai }, i = 1, . . . , N be a family of symmetric N × N matrices over F2 with Ai (k, k) =
δik . Then, for y, z uniformly at random and independently from FN2 ,
                                                                   N
                                                                    3
                                      Pr [A(z) · y = y + z + 1] ≤        .
                                      y,z                           4
    The proof of the proposition is based on the claim that the rank of a matrix A(z) is typically large.
Lemma 3.10. Let matrices {Ai } be as in the proposition. Let C be any fixed symmetric N × N matrix.
Then
                                                          1 k−1 N
                                                                  
                          Pr [rank(A(z) +C) ≤ k − 1] ≤ N · ∑          .
                           z                             2 i=0 i
    The proof of Lemma 3.10 uses the following estimate on the number of common zeros of a set of
polynomials.
Lemma 3.11. Let { fI } be a set of K = Nk polynomials over F2 , indexed by k-subsets I of [N]. Assume
                                         

that for any such subset I holds                    !
                                           deg   fI (x) − ∏ xi     ≤ k − 1.
                                                         i∈I
Then,
                                                                     1 k−1 N
                                                                           
                                 Pr [ f1 (x) = · · · = fK (x) = 0] ≤ N ∑     .
                                x∈Fn2                               2 j=0 j
    We defer the proof of Lemma 3.11 to Subsection 3.3.

Proof of Lemma 3.10. Consider a family of Nk polynomials fI on FN2 . These polynomials are indexed
                                                   

by k-subsets of [N]. For a k-subset I, let fI (z) be the determinant of the I × I minor of A(z) +C. Clearly,
rank of A(z) +C is smaller than k if and only if z is a joint zero of { fI }.
    We now claim that the coefficient of ∏i∈I zi in fI (z) is 1. If this is true, deg( fI − ∏i∈I zi ) ≤ k − 1 and
the claim of the lemma will follow from Lemma 3.11.
    Let B(z) = A(z) +C. Since we are working in characteristic two, the symmetry of B(z) implies that
                                      N
               Det(B(z)) =      ∑ ∏ Biσ (i) (z) = ∑              ∏          (zi +Ci,i ) ·       ∏           Biσ (i) (z)
                               σ ∈SN i=1              σ ∈SN {i : σ (i)=i}                   {i : i<σ (i)}
                              σ =σ −1                σ =σ −1
                               n
                           = ∏ zi + lower order terms
                              i∈I


                          T HEORY OF C OMPUTING, Volume 7 (2011), pp. 131–145                                             141
                      S HACHAR L OVETT, ROY M ESHULAM , AND A LEX S AMORODNITSKY

where in the second equality we used the identity B2iσ (i) (z) = Biσ (i) (z) in F.
      We now prove Propostion 3.9.
Proof of Proposition 3.9. Let I denote the identity N × N matrix.
   Let p(z) = Pry [A(z) · y = y + z + 1]. Clearly p(z) ≤ 2− rank(A(z)+I) . By Lemma 3.10,
                                                                                         N
                                                                        1 N N −k
                                                                               
                                                   h
                                                     − rank(A(z)+I)
                                                                    i                     3
          Pr [A(z) · y = y + z + 1] = Ez [pz ] ≤ Ez 2                 ≤ N ∑      2 =         .
          y,z                                                          2 k=0 k            4


      Summing up, for any cubic polynomial g(x),
      • For any directions y, z ∈ Fn2 such that S(y, z) = 1 we have h(S4 )y,z , gy,z i ≤ O 2−N .
                                                                                              

      • For all but exponentially few directions y, z ∈ Fn2 such that S(y, z) = 0 we have h(S4 )y,z , gy,z i = 0.
Hence Ey,z [h(S4 )y,z , gy,z i] = 2−Ω(n) , and by Lemma 3.1 we have hS4 , gi = 2−Ω(n) . This concludes the
proof of Theorem 1.3.

3.3     Estimates on the number of common zeroes of some families of polynomials
The main claim of this subsection is the following proposition.
Proposition 3.12. Fix a prime p and let F = F p . Let M be thering of F-valued functions on FN , that is
M = F[x1 , . . . , xN ]/I, where I is the ideal x1p − x1 , . . . , xNp − xN . Let f1 , . . . , fK be polynomials in M. Let
S be the set of common zeroes of f1 , . . . , fK , that is
                                     S = u ∈ FN : f1 (u) = · · · = fK (u) = 0 .
                                          

Then
                                                   |S| ≤ dim (M/J)
where J is the ideal generated by { fi }, and dim (M/J) denotes the dimension of M/J, viewed as a vector
space over F.
Proof. For each u ∈ S, let qu ∈ M be defined by qu (u) = 1 and qu (v) = 0 for all v 6= u. We will show that
the family {qu + J}u∈S is linearly independent in M/J. This will immediately imply the claim of the
proposition.
    Consider a linear combination q = ∑u∈S λu qu such that q ∈ J. Let v ∈ S. We compute q(v) in two
ways. First, since q ∈ J, we have q(v) = 0. On the other hand, q(v) = ∑u∈S λu qu (v) = λv . This shows
λv = 0 for all v ∈ S, completing the proof.
      We now prove Lemma 3.11.
Proof of Lemma
              N
                 3.11. We will construct a generating subset of the vector space M/J of Ncardinality at
most ∑k−1j=0 j . We start from a trivial generating set {m+J}, where m runs through all the 2 multi-linear
monomials in N variables. Now, in the factor space M/J, we can replace any product of k variables,
∏i∈I xi , by a polynomial of degree smaller than k. Iterating
                                                            this procedure, we arrive to a generating set
                                                     k−1 N
spanned by {s + J}, where s now runs through ∑ j=0 j monomials of degree at most k − 1.


                           T HEORY OF C OMPUTING, Volume 7 (2011), pp. 131–145                                       142
                       I NVERSE C ONJECTURE FOR THE G OWERS NORM IS FALSE

Acknowledgements
We are grateful to Ben Green and Terence Tao for sending us their paper. We would also like to thank
them and Emanuele Viola for bringing the argument of Alon and Beigel to our attention. We thank the
anonymous reviewers for their careful reading and helpful comments.


References
 [1] N OGA A LON AND R ICHARD B EIGEL: Lower bounds for approximations by low degree polynomi-
     als over Zm . In Proc. 16th Ann. IEEE Conf. Comput. Complexity (CCC), pp. 184–187, Washington,
     DC, USA, 2001. IEEE Comp. Soc. Press. [doi:10.1109/CCC.2001.933885] 132, 133

 [2] V ITALY B ERGELSON , T ERENCE TAO , AND TAMAR Z IEGLER: An inverse theorem for the
     uniformity seminorms associated with the action of F∞p . Geom. Funct. Anal., 19:1539–1596, 2010.
     [doi:10.1007/s00039-010-0051-1] 134

 [3] A NDREJ B OGDANOV AND E MANUELE V IOLA: Pseudorandom bits for polynomials. SIAM J.
     Comput., 39:2464–2486, April 2010. [doi:10.1137/070712109] 132, 136

 [4] T IMOTHY G OWERS: A new proof of Szemerédi’s theorem. Geom. Funct. Anal., 11(3):465–588,
     2001. [doi:10.1007/s00039-001-0332-9] 132

 [5] B EN G REEN AND T ERENCE TAO: An inverse theorem for the Gowers U 3 (G) norm. Proc. Edinb.
     Math. Soc. (2), 51:73–153, 2008. [doi:10.1017/S0013091505000325] 132

 [6] B EN G REEN AND T ERENCE TAO: The distribution of polynomials over finite fields, with applica-
     tions to the Gowers norms. Contrib. Discrete Math., 4(2):1–36, 2009. http://cdm.ucalgary.
     ca/index.php/cdm/article/view/133/98. 132, 133

 [7] S HACHAR L OVETT, ROY M ESHULAM , AND A LEX S AMORODNITSKY: Inverse conjecture for the
     Gowers norm is false. In Proc. 40th STOC, pp. 547–556, New York, NY, USA, 2008. ACM Press.
     [doi:10.1145/1374376.1374454] 133

 [8] F. J. M AC W ILLIAMS AND N. J. A. S LOANE: The Theory of Error Correcting Codes. Amsterdam,
     North-Holland, 1977. 139

 [9] A LEX S AMORODNITSKY: Low-degree tests at large distances. In Proc. 39th STOC, pp. 506–515,
     New York, NY, USA, 2007. ACM Press. [doi:10.1145/1250790.1250864] 132, 137

[10] JACOB T. S CHWARTZ: Fast probabilistic algorithms for verification of polynomial identities. J.
     ACM, 27(4):701–717, 1980. [doi:10.1145/322217.322225] 137

[11] T ERENCE TAO: Structure and randomness in combinatorics. In Proc. 48th FOCS, pp. 3–15. IEEE
     Comp. Soc. Press, October 2007. [doi:10.1109/FOCS.2007.17] 132

                      T HEORY OF C OMPUTING, Volume 7 (2011), pp. 131–145                        143
                  S HACHAR L OVETT, ROY M ESHULAM , AND A LEX S AMORODNITSKY

[12] T ERENCE TAO:    Some notes on “non-classical” polynomials in finite charac-
     teristic, 2008.   Online blog:   http://terrytao.wordpress.com/2008/11/13/
     some-notes-on-non-classical-polynomials-in-finite-characteristic. 134

[13] T ERENCE TAO AND TAMAR Z IEGLER: The inverse conjecture for the Gowers norm over finite fields
     via the correspondence principle. Analysis & PDE, 3(1):1–20, 2010. [doi:10.2140/apde.2010.3.1]
     134

[14] T ERENCE TAO AND TAMAR Z IEGLER: The inverse conjecture for the Gowers norm over finite
     fields in low characteristic. Submitted, 2011. 134

[15] E MANUELE V IOLA: Guest column: Correlation bounds for polynomials over {0, 1}. SIGACT
     News, 40:27–44, February 2009. [doi:10.1145/1515698.1515709] 132

[16] E MANUELE V IOLA AND AVI W IGDERSON: Norms, XOR lemmas, and lower bounds for poly-
     nomials and protocols. Theory of Computing, 4:137–168, 2008. [doi:10.4086/toc.2008v004a007]
     132


AUTHORS

     Shachar Lovett
     member, School of Mathematics
     Institute for Advanced Study, Princeton, NJ
     slovett math ias edu
     http://www.math.ias.edu/~slovett


     Roy Meshulam
     professor, Department of Mathematics
     The Technion, Haifa, Israel
     meshulam math technion ac il
     http://www.math.technion.ac.il/~meshulam


     Alex Samorodnitsky
     professor, School of Engineering and Computer Science
     The Hebrew University of Jerusalem, Jerusalem, Israel
     salex cs huji ac il
     http://www.cs.huji.ac.il/~salex




                      T HEORY OF C OMPUTING, Volume 7 (2011), pp. 131–145                      144
                     I NVERSE C ONJECTURE FOR THE G OWERS NORM IS FALSE

ABOUT THE AUTHORS

    S HACHAR L OVETT graduated from the Weizmann Institute of Science in 2010; his advi-
       sors were Omer Reingold and Ran Raz. He is interested in the theory of computing,
       combinatorics, and coding theory, and in particular in the interplay between structure,
       randomness, and pseudorandomness.


    ROY M ESHULAM is a Professor of Mathematics at the Technion. He has worked in a number
      of areas, including applications of harmonic analysis to combinatorics and extremal
      problems for spaces of matrices. In recent years his main research interests have been
      applications of algebraic topology to discrete geometry and random simplicial complexes.


    A LEX S AMORODNITSKY graduated from the Hebrew University of Jerusalem in 1999; his
       advisor was Nathan Linial. He is interested in coding theory, combinatorics, and the
       theory of computing, and especially in applying tools of analysis and geometry to these
       areas of discrete math.




                    T HEORY OF C OMPUTING, Volume 7 (2011), pp. 131–145                          145