Fourier and Circulant Matrices are Not Rigid

Authors Zeev Dvir, Allen Liu,
Plaintext
                          T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48
                                       www.theoryofcomputing.org

                                        S PECIAL ISSUE : CCC 2019



Fourier and Circulant Matrices are Not Rigid
                                        Zeev Dvir∗                   Allen Liu
                Received July 7, 2019; Revised December 18, 2020; Published December 31, 2020




       Abstract. The concept of matrix rigidity was first introduced by Valiant in 1977. Roughly
       speaking, a matrix is rigid if its rank cannot be reduced significantly by changing a small
       number of entries. There has been considerable interest in the explicit construction of rigid
       matrices as Valiant showed in his MFCS’77 paper that explicit families of rigid matrices can
       be used to prove lower bounds for arithmetic circuits.
            In a surprising recent result, Alman and Williams (FOCS’19) showed that the 2n × 2n
       Walsh–Hadamard matrix, which was conjectured to be rigid, is actually not very rigid. This
       line of work was extended by Dvir and Edelman (Theory of Computing, 2019) to a family of
       matrices related to the Walsh–Hadamard matrix, but over finite fields. In the present paper
       we take another step in this direction and show that for any abelian group G and function
        f : G → C, the G-circulant matrix, given by Mxy = f (x − y) for x, y ∈ G, is not rigid over C.
       Our results also hold if we replace C with a finite field Fq and require that gcd(q, |G|) = 1.
       En route to our main result, we show that circulant and Toeplitz matrices (over finite fields
       or C) and Discrete Fourier Transform (DFT) matrices (over C) are not sufficiently rigid to
       carry out Valiant’s approach to proving circuit lower bounds. This complements a recent
       result of Goldreich and Tal (Comp. Complexity, 2018) who showed that Toeplitz matrices
       are nontrivially rigid (but not enough for Valiant’s method). Our work differs from previous
     A preliminary version of this paper appeared in the Proceedings of the 34th IEEE Conference on Computational Complexity,
2019.
   ∗ Research supported by NSF CAREER award DMS-1451191 and NSF grant CCF-1523816.



ACM Classification: F.2.2, F.1.3
AMS Classification: 68Q17, 68Q15
Key words and phrases: matrix rigidity, circulant matrix


© 2020 Zeev Dvir and Allen Liu
c b Licensed under a Creative Commons Attribution License (CC-BY)                             DOI: 10.4086/toc.2020.v016a020
                                       Z EEV DVIR AND A LLEN L IU

      non-rigidity results in that those papers considered matrices whose underlying group of
      symmetries was of the form Znp with p fixed and n tending to infinity, while in the families
      of matrices we study, the underlying group of symmetries can be any abelian group and, in
      particular, the cyclic group ZN , which has very different structure. Our results also suggest
      natural new candidates for rigidity in the form of matrices whose symmetry groups are highly
      non-abelian.


Contents
1   Introduction                                                                                         3
    1.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .     3
    1.2 Our contribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .   4
    1.3 Overview of the proof . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .    6
         1.3.1 Generalized Walsh–Hadamard matrices . . . . . . . . . . . . . . . . . . . . . .           6
         1.3.2 DFT matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .      7
         1.3.3 G-circulant matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .      8
    1.4 Rigidity over different fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .   8
    1.5 Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .   9

2   Preliminaries                                                                                        9
    2.1 Basic notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
    2.2 Special families of matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
    2.3 Matrix rigidity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
    2.4 Preliminary results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

3   Non-rigidity of generalized Walsh–Hadamard matrices                                                  13

4   Non-rigidity of DFT matrices of well-factorable size                                             17
    4.1 Structure of generalized Walsh–Hadamard and DFT matrices . . . . . . . . . . . . . . . 19
    4.2 Proof of Theorem 4.7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

5   Non-rigidity of all circulant matrices                                                               25

6   Non-rigidity of G-circulant matrices for abelian groups                                              27

7   Finite field case                                                                                 30
    7.1 Preliminaries about Galois theory and finite fields . . . . . . . . . . . . . . . . . . . . . 30
    7.2 Modifications to the main proofs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

8   G-circulant matrices over finite fields                                                              37

9   Final remarks and open questions                                                                     45

                       T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                              2
                                F OURIER AND C IRCULANT M ATRICES ARE N OT R IGID

1     Introduction
1.1    Background
A major goal in complexity theory is to prove lower bounds on the size and depth of arithmetic circuits
that compute certain functions. One specific problem that remains open despite decades of effort is to
find functions for which we can show super-linear size lower bounds for circuits of logarithmic depth. In
[19], Valiant introduced the notion of matrix rigidity as a possible method of proving such lower bounds
for arithmetic circuits. More precisely, over a field F, an m × n matrix M is said to be (r, s)-rigid if any
m × n matrix of rank at most r differs from M in at least s entries. Valiant showed that for any linear
function f : Fn → Fn that can be computed by an arithmetic circuit of size O(n) and depth O(log n), the
                                                       n                     1+ε ) entries for any ε > 0. Thus,
corresponding matrix can be reduced to rank O( log log    n ) by changing O(n
to prove a circuit lower bound for a function f , it suffices to lower bound
                                                                             the rigidity of the corresponding
                        n                                                      n            1+ε
matrix at rank O( log log n ). We call a matrix Valiant-rigid if it is O( log log n ), Ω(n ) -rigid for some
ε > 0, i. e., sufficiently rigid for Valiant’s method to yield circuit lower bounds. Over any infinite field,
Valiant shows that almost all n × n matrices are (r, (n − r)2 )-rigid for any r, while over a finite field one
can get a similar result with a logarithmic loss in the sparsity parameter. Despite much effort, explicit
constructions of rigid matrices have remained elusive.
     Over infinite (or very large) fields, there are ways to construct highly rigid matrices using either
algebraically independent entries or entries that have exponentially large description (see [16, 12, 15]).1
However, these constructions are not considered to be fully explicit as they do not tell us anything about
the computational complexity of the corresponding function. Ideally, we would be able to construct rigid
(0, 1)-matrices, but even a construction where the entries are in a reasonably simple field (such as the mth
cyclotomic field for a small value of m) would be a major breakthrough. The best known constructions
                               2
of such matrices are (r, Ω( nr log nr ))-rigid (see [18, 9]). There has also been work towards constructing
semi-explicit rigid matrices. Semi-explicit constructions which require O(n) bits of randomness (instead
of the usual O(n2 )) would still yield circuit lower bounds through Valiant’s approach.2 The best result in
                                                                          3
this realm (see [11]) shows that random Toeplitz matrices are (r, r2 nlog n )-rigid with high probability for
        √
r ≥ Ω( n).
     Note that both of these bounds become trivial when r is n/ log log n. Other variants of semi-explicit
                                                                                 1/4−o(1)
constructions have also been studied. [1] gives a construction of (2(log n)               , Ω(n2 ))-rigid matrices
using an NP-oracle. This construction is not in the regime for Valiant-rigidity.
     Many well-known families of matrices, such as Hadamard matrices (square matrices with ±1 entries
whose rows are orthogonal) and DFT (Discrete Fourier Transform) matrices, have been conjectured
to be Valiant-rigid [17]. However, a recent line of work (see [2, 7]) shows that certain well-structured
matrices are not rigid. Alman and Williams show in [2] that the Walsh–Hadamard matrix, i. e., the 2n × 2n
Hadamard matrix given by Hxy = (−1)hx,yi as x and y range over {0, 1}n , is not Valiant-rigid over Q.
Along similar lines, Dvir and Edelman show in [7] that G-circulant matrices for the additive group of Fnp ,
   1 It remains open to construct a matrix that is Valiant-rigid, even if we only require that the entries live in a number field of

dimension polynomial in the size of the matrix.
   2 Note however, that it is easy to construct rigid matrices with O(n1+ε ) bits of randomness for any ε > 0 (for example by

taking a random matrix with at most nε non-zeros per row) but this is not sufficient for Valiant’s approach.


                            T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                                                 3
                                              Z EEV DVIR AND A LLEN L IU

given by Mxy = f (x − y) where f : Fnp → F p and x, y range over Fnp , are not Valiant-rigid over F p (where
we view p as fixed and n goes to infinity). The Walsh–Hadamard matrix and the G-circulant matrices for
the additive group of Fnp have the property that for any ε > 0, there exists an ε 0 > 0 such that it is possible
                                                                0
to change at most N 1+ε entries and reduce the rank to N 1−ε (where N denotes the size of the matrix).
The proofs of both results rely on constructing a matrix determined by a polynomial P(x, y) that agrees
with the given matrix on almost all entries and then arguing that the constructed matrix has low rank.

1.2     Our contribution
Definition 1.1 (G-circulant matrices). Let G be a finite abelian group, F a field, and f : G → F a function.
The G-circulant matrix M( f ) is defined as the |G| × |G| matrix whose rows and columns are labeled by
the elements of G and whose (x, y) entry is f (x − y) (for x, y ∈ G).

    In this paper we prove that for an abelian group G, over any finite field with characteristic relatively
prime to |G| and over the complex numbers, G-circulant matrices are not Valiant-rigid (Theorem 1.5).
En route to our main result, we prove that the following commonly studied families of matrices are not
Valiant-rigid:

      • DFT matrices (over C).

      • Circulant matrices (over finite fields and C): matrices whose rows are obtained by cyclically
        shifting the top row.

      • Toeplitz matrices (over finite fields and C): matrices with constant diagonals.3

Remark 1.2. For circulant and Toeplitz matrices over finite fields, we do not require any additional
conditions, i.e. we do not require that the size of the matrix and the characteristic of the field are relatively
prime. See the beginning of Section 8 for a more in-depth discussion about why we require such a
condition for general abelian groups.

The families of matrices we consider in our paper have very different underlying group structure than
those considered in previous work. Both [2] and [7] analyze matrices constructed from an underlying
group of the form Znp with p a fixed prime number and n tending to infinity. In this paper we study
matrices whose underlying symmetry group can be any abelian group. In fact, the core of our proof is
handling the case when the underlying group is cyclic.
    Circulant matrices are the special case of G-circulants for cyclic groups G. Similarly, the DFT
matrices are the special case of the DFTG matrices (where DFTG is the matrix given by the character
table of an abelian group G) when G is cyclic. The Walsh–Hadamard matrices are another special
case of the DFTG matrices, where G is the group Zn2 . We use the fact, that every finite abelian group
can be decomposed into the direct product of cyclic groups, to extend our results to all abelian groups,
although this extension is by no means immediate. While most natural constructions of matrices are
highly symmetric, our results show that matrices that are symmetric under abelian groups are not rigid
   3 It is not hard to see that rigidity of circulant and Toeplitz matrices is essentially the same question so for the sake of

consistency with our (group theoretic) approach we will primarily consider circulant matrices.


                           T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                                             4
                           F OURIER AND C IRCULANT M ATRICES ARE N OT R IGID

and that perhaps we should look toward less structured matrices, or matrices whose symmetry group is
non-abelian, as candidates for rigidity.
   We now move into a more technical overview of our paper. We begin with a few definitions.

Definition 1.3. Define the regular-rigidity rFA (r) of a matrix A over a field F as the minimum value of s
such that it is possible to change at most s entries in each row and column of A to obtain a matrix of rank
at most r.

    When the field is clear from context, we will omit the superscript.The notion of regular-rigidity is
weaker than the usual notion of rigidity (and is also weaker than the commonly used notion of row-
rigidity) as if A is an n × n matrix and A is (r, ns)-rigid then rA (r) ≥ s. Note that this actually makes our
results stronger as we will show that the matrices we consider are not regular-rigid.
    To simplify the exposition, we define a qualitative notion of non-rigidity we call QNR (quasipolyno-
mial non-rigidity).

Definition 1.4. We say that a family A of matrices is quasipolynomially non-rigid (QNR) over a field F
if there are constants c1 , c2 > 0 such that for any ε > 0, all sufficiently large matrices M ∈ A satisfy
                                                                 
                                                      N
                                        F
                                       rM                           ≤ Nε ,
                                            exp (ε c1 (log N)c2 )

where M is an N × N matrix.

    We will prove that various families of matrices are QNR. Note that this immediately implies that they
are not Valiant-rigid. Our main results are stated below.

Theorem 1.5. Let G be an abelian group. The family of G-circulant matrices is QNR over C. For a finite
field Fq , if gcd(|G|, q) = 1, then the family of G-circulant matrices is QNR over Fq .

   The formal statement and proof of this result can be found in Section 6 (see Theorem 6.2) for the
complex numbers and Section 8 (see Theorem 8.5) for finite fields.

Theorem 1.6. Let G be an abelian group of order N. Then there exists m = Õ(N 3 ), depending only on G,
such that the rational G-circulant matrices are QNR over the mth cyclotomic field. The same also holds
for the matrix DFTG .

    The formal statement and proof of this result can be found in Section 6 (see Theorem 6.3).
    In addition to the aforementioned results for circulant, Toeplitz and DFT matrices, our main theorem
has a few more consequences that are worth mentioning. The following two corollaries to our main result
were pointed out by Babai and Kivva [4]:

    • The Paley–Hadamard matrices are QNR over C.

    • The Vandermonde matrices Vn (x1 , . . . , xn ) whose generators x1 , . . . , xn form a geometric progression
      are QNR over C.

                        T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                                   5
                                             Z EEV DVIR AND A LLEN L IU

1.3     Overview of the proof
We now take a more detailed look at the techniques used in the proof of Theorem 1.5.
    In general, matrices that we deal with will be over C except in Sections 7 and 8 where we extend our
results to matrices over finite fields. First we define two families of matrices that we will use extensively.

Definition 1.7 (Generalized Walsh–Hadamard (GWH) matrices). The generalized Walsh–Hadamard
(GWH) matrix Hd,n is a d n × d n complex matrix that has rows and columns indexed by Znd and entries
(Hd,n )I,J = ω I·J where ω = e2πi/d .

      Next we define the Discrete Fourier Transform (DFT) matrices.

Definition 1.8 (DFT matrix). The (x, y) entry of the N × N matrix DFTN (0 ≤ x, y ≤ N − 1) is ω xy where
ω = e2πi/N .

      Note DFTN = HN,1 and Hd,n = DFTd ⊗ · · · ⊗ DFTd where ⊗ denotes the Kronecker product.
                                  |      {z         }
                                                     n
    One key idea in our argument is the observation that, if all members of a family A of matrices are
simultaneously diagonalizable by a matrix M, then the rigidity of any matrix A ∈ A implies the rigidity
of the matrix M (Lemma 2.21). This situation happens, e. g., when A is the family of circulant matrices
and M is the DFT matrix. This simple, yet crucial observation allows us to deduce the non-rigidity of a
larger family of matrices.4

1.3.1    Generalized Walsh–Hadamard matrices
The first step in the proof of Theorem 1.5 is proving that the generalized Walsh–Hadamard matrices are
not rigid in the following sense, which is stronger than QNR.

Theorem 1.9 (Generalized Walsh–Hadamard matrices are not rigid).        For fixed d and 0 < ε < 0.01, there
           0                                                n(1−ε 0)     nε
exists an ε such that for all sufficiently large n, rHd,n d            ≤d .

     Note that Theorem 1.9 generalizes the main result of [2] (which deals with d = 2). The result of [2]
is stronger in the sense that it holds over Q while our results for GWH matrices require working over a
field extension. See Section 1.4 for a discussion on rigidity over different fields.
     Also, given any d n × d n matrix of the form Mxy = f (x − y) with f : Znd → C, we can permute its rows
so that it is diagonalized by Hd,n . Thus, we can apply the diagonalization trick mentioned above and
obtain the following result, which extends the results in [7] to matrices over C.

Corollary 1.10. Let f be a function from Znd → C and let M be a d n × d n matrix with Mxy = f (x − y).
                                                        0
Then
   for any
          0
            fixed d and 0 < ε < 0.01, there exists an ε > 0 such that for all sufficiently large n, we have
rM d n(1−ε ) ≤ d nε .

   4 The observation that DFT matrices diagonalize circulant matrices has similar algorithmic implications as if there were,

say, linear-size circuits for computing the DFT matrix then we would be able to obtain linear-size circuits for computing any
convolution.


                           T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                                           6
                            F OURIER AND C IRCULANT M ATRICES ARE N OT R IGID

1.3.2   DFT matrices

Equipped with the machinery for Generalized Walsh–Hadamard (GWH) matrices, the next step is to
prove non-rigidity for DFT matrices. The result we prove is the following.

Theorem 1.11 (DFT Matrices are Not Rigid). The family of DFT matrices DFTN where N ∈ N is QNR
over C.

    Our proof consists of two steps. First we show that for integers N of a very special form, the N × N
DFT matrix is not rigid because it can be decomposed into submatrices with GWH-type structure. We
say an integer N is well-factorable if it is a product of distinct primes q1 , . . . , ql such that qi − 1 has no
large prime power divisors for all i. We will make this notion more precise later, but informally, the first
step is as follows.

Theorem 1.12. Let A denote the family of DFT matrices DFTN where N is well-factorable. Then the
family A is QNR over C.

     The main intuition is that if N is a product of distinct primes q1 , . . . , ql , then within the DFT matrix
DFTN , we can find submatrices whose rows and columns can be indexed by F×                               ×
                                                                                           q1 × · · · × Fql where F
                                                                                                                    ×

denotes the multiplicative group of the field F. This multiplicative structure can be replaced by the
additive structure of Zq1 −1 × · · · × Zql −1 . We can then factor each additive group Zqi −1 into prime power
components. If q1 − 1, . . . , ql − 1 all have no large prime power divisors, we expect prime powers to
be repeated many times when all of the terms are factored. This allows us to find submatrices with Zld
additive structure to which we can apply tools such as Theorem 1.9 and Corollary 1.10 to reduce the rank
while changing a small number of entries. We then bound the rank and total number of entries changed
over all submatrices to deduce that DFTN is not rigid.
     The second step of our proof that DFT matrices are not rigid involves extending Theorem 1.12 to
all values of N. The diagonalization trick gives that N × N circulant matrices are not rigid when N is
well-factorable. We then show that for N 0 < N2 , we can rescale the columns of the N 0 × N 0 DFT matrix and
embed it into an N × N circulant matrix. As long as N 0 is not too much smaller than N (say N 0 > (logNN)2 ),
we get that the N 0 × N 0 DFT matrix is not rigid. Thus, for each well-factorable N and all N 0 in the range
   N
(log N)2
         < N 0 < N2 , the N 0 × N 0 DFT matrix is not rigid. We then use a number theoretic result of Baker and
Harman [5] to show that the multiplicative gaps between well-factorable integers are not too large. Thus,
the above intervals cover all integers as N runs over all well-factorable numbers, finishing the proof.
     As a corollary to Theorem 1.11 (due to the diagonalization trick), we get that circulant matrices are
not rigid.

Theorem 1.13 (Circulant Matrices are not Rigid). Let A denote the family of circulant matrices. Then A
is QNR over C.

    Also notice that since any Toeplitz matrix of size at most N2 can be embedded in an N × N circulant
matrix, the above implies an analogous result for all Toeplitz matrices. While [11] shows nontrivial
rigidity lower bounds for rank much smaller than N, our results imply that there are actually no nontrivial
rigidity lower bounds for rank close to N.

                         T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                                     7
                                        Z EEV DVIR AND A LLEN L IU

1.3.3 G-circulant matrices

We extend our results for DFT and circulant matrices to DFTG and G-circulant matrices for finite abelian
groups G by using the fundamental theorem of finite abelian groups to write G as a direct product of cyclic
groups. Note that any G-circulant matrix is diagonalized by the DFTG matrix which is the Kronecker
product of the DFT matrices for the individual cyclic groups. If there are many small cyclic groups in the
product, then we can use the same techniques that we use for GWH-matrices while if there are enough
large cyclic groups, then we can rely on our results for DFT matrices.



1.4   Rigidity over different fields

There are several interesting questions that arise when considering rigidity over different fields. Our
results for circulant and DFT matrices require working over a field extension. The matrix DFTN is defined
over the N-th cyclotomic field Q[ω] where ω is a primitive N th root of unity. But we are not able to
show non-rigidity of this matrix over Q[ω], only over a larger field that includes additional roots of
unity. Therefore the same holds for circulant matrices over Q whose non-rigidity we derived from the
non-rigidity of DFTN . The degree of the extension is Õ(N 2 ) (so combined with ω, the entire extension of
Q has degree Õ(N 3 )). See Theorem 6.3 for more details.
     We leave it as an open question whether our results for fields of characteristic 0 still hold without
field extensions or for extensions of lower degree.
     In Section 7 we extend our results for complex-valued matrices to any finite field, Fq . The main idea
is to first work over an extension, say Fq [α], where the matrices are not rigid, and then sum over the
matrices obtained by replacing α with its conjugates. A key ingredient in our proof is that over finite fields,
primitive nth roots of unity may have minimal polynomial with very low degree, even subpolynomial in
n. However, over Q, the primitive nth roots of unity have minimal polynomial with degree φ (n) (which
is n1−o(1) ) so our argument does not generalize to this case. We leave it as an open question whether
circulant matrices are rigid over Q. It is worth noting that the result of Alman and Williams in [2] does
hold over Q while the result of Dvir and Edelman in [7] holds only when the field is a finite field related
to the group structure of the matrix.
     Finally, over a finite field Fq , our result for G-circulant matrices requires that gcd(q, |G|) = 1. Our
result for circulant matrices (i.e. cyclic groups) over finite fields does not require this assumption. The
reason that we need gcd(q, |G|) = 1 for general abelian groups is that our techniques do not deal with
groups such as G = Z p2 × · · · × Z p2 (where p is the characteristic of Fq ) because pth roots of unity do
not exist over any extension of Fq so we cannot diagonalize G-circulant matrices even if we lift to a
field extension. This is not an issue for large cyclic groups because for a cyclic group, say ZN , we can
embed a ZN -circulant matrix in a circulant matrix of any given size at least 2N. We only require the
condition gcd(q, |G|) = 1 to rule out the case when G contains a direct product of many copies of the
same small cyclic group whose order is not relatively prime to q. See Section 8 for more details. We leave
it as an open question whether our results for G-circulant matrices still hold without the condition that
gcd(q, |G|) = 1. The results of Dvir and Edelman in [7] deal with a special case where gcd(q, |G|) > 1,
namely when G is a direct product of many copies of Z p where p is the characteristic of Fq .

                        T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                                8
                            F OURIER AND C IRCULANT M ATRICES ARE N OT R IGID

1.5    Organization
In Section 2, we introduce notation and prove several basic results that we will use throughout the paper.
In Section 3, we show that GWH matrices and several closely related families of matrices are not rigid.
In Section 4, we show that N × N DFT matrices are not rigid when N satisfies certain number-theoretic
conditions. In Section 5, we complete the proof that no (sufficiently large) DFT matrix is rigid. We then
deduce that Toeplitz matrices are not rigid. In Section 6, we use the results from the previous section
to show that G-circulant matrices for abelian groups G are not rigid. From Section 2 through Section 6,
we work with matrices over C for ease of exposition. In Section 7 and Section 8, we sketch how to
modify the proofs in the previous sections to deal with “missing” roots of unity in a finite field. Finally,
in Section 9, we discuss a few open questions and possible directions for future work.


2     Preliminaries
Throughout this paper, we let d ≥ 2 be an integer and ω = e2πi/d be a primitive d th root of unity. When
we consider an element of Znd , we will view it as an ordered n-tuple with entries in the range [0, d − 1].
When we say a list of d n elements x1 , . . . , xd n is indexed by Znd , we mean that each xi is labeled with an
element of Znd such that all labels are distinct and the labels of x1 , . . . , xd n are in lexicographical order.

2.1    Basic notation
We will frequently work with ordered tuples, say I = (i1 , . . . , in ) ∈ Znd . Below we introduce some notation
for dealing with ordered tuples that will be used later on.

Definition 2.1. For an ordered tuple I, we let I (i) denote its ith entry. For instance if I = (i1 , . . . , in ) then
I (k) = ik .

Definition 2.2. For an ordered n-tuple I = (i1 , i2 , . . . , in ), define the polynomial in n variables xI =
x1i1 · · · xnin .

Definition 2.3. For ω a d th root of unity and an ordered n-tuple I = (i1 , i2 , . . . , in ) ∈ Znd , we define
ω [I] = (ω i1 , . . . , ω in ).

Definition 2.4. For a function f : Znd → C, define the n-variable polynomial Pf as

                                                 Pf = ∑ f (I)xI .
                                                       I∈Znd


Definition 2.5. For an ordered n-tuple I = (i1 , i2 , . . . , in ), we define the set perm(I) to be a set of ordered
n-tuples consisting of all distinct permutations of the entries of I. Similarly, for a set of ordered n-tuples
S, we define perm(S) to be the set of all ordered n-tuples that can be obtained by permuting the entries of
some element of S.

Definition 2.6. We say a set S ⊆ Znd is symmetric if perm(I) ⊆ S for any I ∈ S.


                         T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                                      9
                                            Z EEV DVIR AND A LLEN L IU

Definition 2.7. For a set of ordered n-tuples S, let red(S) denote the set of equivalence classes under
permutation of entries in S. Let rep(S) be a set of ordered n-tuples formed by taking one representative
from each equivalence class in red(S) (note rep(S) is not uniquely determined but this will not matter for
our purposes).

    Note that if rep(S) = {I1 , . . . , Ik }, then the sets perm(I1 ), perm(I2 ), . . . , perm(Ik ) are disjoint and their
union contains S. If the set S is symmetric then their union is exactly S.

2.2     Special families of matrices
We now define notation for working with a few special families of matrices.

Definition 2.8. An N × N matrix M is called a Toeplitz matrix if Mi j depends only on i − j. An N × N
matrix M is called a Hankel matrix if Mi j depends only on i + j. Note that the rows of any Toeplitz matrix
can be permuted to obtain a Hankel matrix so any non-rigidity results we show for one family also hold
for the other.

Definition 2.9 (Adjusted G-circulant matrices). For an abelian group G and a function f : G → C, let
MG ( f ) denote the |G| × |G| matrix (over C) whose rows and columns are indexed by elements x, y ∈ G
and whose entries are given by Mxy = f (x + y). When it is clear what G is from context, we will simply
write M( f ). We let VG denote the family of matrices MG ( f ) as f ranges over all functions from G to C.
We call VG the family of adjusted G-circulant matrices for the group G. When G is a cyclic group, we
call the matrices in VG adjusted-circulant.

    Compared to the usual G-circulant (and circulant) matrices defined by Mxy = f (x − y), the matrix
MG ( f ) differs only in a permutation of the rows. In the subsequent sections, we will work with MG ( f ) for
technical reasons, but it is clear that the same non-rigidity results hold for the usual G-circulant matrices.
Similarly, we will use adjusted-circulant and Hankel matrices as it is clear that the same non-rigidity
results hold for circulant and Toeplitz matrices. Also note that adjusted-circulant matrices are a special
case of Hankel matrices.
    Recall that a character of an abelian group G is a homomorphism from G to C× , the multiplicative
group of complex numbers.

Definition 2.10 (Discrete Fourier Transfrom matrices). For a finite abelian group G, we define DFTG , the
DFT matrix for G, as the |G| × |G| matrix whose rows correspond to elements of G and whose columns
correspond to the characters of the group. To simplify notation, we will write DFTN for DFTZN , the
classical N × N Fourier Transform matrix (for the cyclic group ZN ).

      The following is immediate from the definition.

Fact 2.11. For a finite abelian group G, if G = H × K where H and K are subgroups, there is an ordering
of the rows and columns of G so that DFTG = DFTH ⊗ DFTK . In particular, if G = Zn1 × Zn2 × · · · × Zna
then
                                     DFTG = DFTn1 ⊗ · · · ⊗ DFTna .


                         T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                                         10
                            F OURIER AND C IRCULANT M ATRICES ARE N OT R IGID

2.3   Matrix rigidity
Here, we review basic notation for matrix rigidity.

Definition 2.12. For a matrix M and a real number r, we define RM (r) to be the smallest number s for
which there exists a matrix A with at most s nonzero entries and a matrix B of rank at most r such that
M = A + B. If RM (r) ≥ s, we say M is (r, s)-rigid.

Definition 2.13. For a matrix M and a real number r, we define rM (r) to be the smallest number s for
which there exists a matrix A with at most s nonzero entries in each row and column and a matrix B of
rank at most r such that M = A + B. If rM (r) ≥ s, we say M is (r, s)-regular rigid.

    It is clear that if a matrix is (r, ns)-rigid, then it must be (r, s)-regular rigid. In the following sections,
                                                      N       ε
we will show that various matrices are not ( log log     N , N )-regular rigid for any ε > 0 and this will imply
that Valiant’s method for showing circuit lower bounds in [19] cannot be applied for these matrices.


2.4   Preliminary results
Next, we mention several basic results that will be useful in the proofs later on.

Definition 2.14. For an m × n matrix A and p × q matrix B, the Kronecker product A ⊗ B is the mp × nq
matrix given by
                                                           
                                          a11 B . . . a1n B
                                         ..     ..     .. 
                                         .         .    . 
                                               am1 B . . .   amn B
where ai j are the entries of A.

Fact 2.15. For matrices A, B,C, D such that matrix products AC and BD are defined,

                                       (A ⊗ B)(C ⊗ D) = (AC) ⊗ (BD) .

Claim 2.16. Hd,n = DFTd ⊗ · · · ⊗ DFTd where ⊗ denotes the Kronecker product.
                   |      {z         }
                                   n

Proof. This can easily be verified from the definition.

                  ∗ = d n I where H ∗ is the conjugate transpose of H
Claim 2.17. Hd,n Hd,n              d,n                               d,n and I is the identity matrix.

Proof. We verify that DFTd DFT∗d = dI, and then use Claim 2.17 and Fact 2.15.

Claim 2.18. Let f : Znd → C be a function. Let ω be a primitive d th root of unity and set Pf = ∑I∈Znd f (I)xI
(see Definition 2.2). Let D = Hd,n MZnd ( f )Hd,n . Then D is a diagonal matrix with diagonal entries
d n Pf (ω [J] ) as J ranges over Znd .


                        T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                                   11
                                         Z EEV DVIR AND A LLEN L IU

Proof. First, we analyze the product MZnd ( f )Hd,n . This is a d n × d n matrix and its rows and columns can
naturally be indexed by ordered tuples I, J ∈ Znd . The entry with row indexed by I and column indexed by
J is
                                     0                              0
                    ∑ f (I + I 0 )ω I ·J = ω −I·J ∑ f (I + I 0 )ω (I +I)·J = ω −I·J Pf (ω [J] ) .
                    I 0 ∈Znd                     I 0 ∈Znd

Therefore, the columns of MZnd ( f )Hd,n are multiples of the columns of Hd,n    ∗ . In fact, the column of
                                                                              ∗ . Since H H ∗ = d n I, we
MZnd ( f )Hd,n indexed by J is Pf (ω [J] ) times the corresponding column of Hd,n           d,n d,n
                                                                                  n     [J]
deduce that D must be a diagonal matrix whose entries on the diagonal are d Pf (ω ) as J ranges over
Znd .

Claim 2.19. Let M be a d × d adjusted-circulant matrix. Then DFTd ·M · DFTd is a diagonal matrix.

Proof. Plug n = 1 into the above.

    Claim 2.18 gives us a characterization of the rank of matrices of the form MZnd ( f ).

Claim 2.20. Let f : Znd → C be a function. Let ω be a d th root of unity and assume Pf = ∑I∈Znd f (I)xI
has C roots among the set {(ω i1 , . . . , ω in ) | (i1 , . . . , in ) ∈ Znd } . Then rank(MZnd ( f )) = d n −C.

Proof. Consider the product D = Hd,n MZnd ( f )Hd,n . Note that Hd,n is clearly invertible by Claim 2.17.
Therefore, it suffices to compute the rank of D. By Claim 2.18, D must be a diagonal matrix whose entries
on the diagonal are d n Pf (ω [J] ) as J ranges over Znd . The rank of D is the number of nonzero diagonal
entries which is simply d n −C.

     As mentioned in the introduction, we can relate the rigidity of a matrix to the rigidity of matrices that
it diagonalizes.

Lemma 2.21. If B = A∗ DA where D is a diagonal matrix and rA (r) ≤ s then rB (2r) ≤ s2 . The same
inequality holds also for B0 = ADA.

Proof. Let E be the matrix with at most s nonzero entries in each row and column such that A − E has
rank at most r. We have
                             B − E ∗ DE = A∗ D(A − E) + (A∗ − E ∗ )DE .
Since rank(A − E) ≤ r, we get that rank(B − E ∗ DE) ≤ 2r. Also, E ∗ DE has at most s2 nonzero entries
in each row and column so rB (2r) ≤ s2 . The second part can be proved in the exact same way with A∗
replaced by A.

    In light of Lemma 2.21, Claim 2.19, and Claim 2.18, proving non-rigidity for d × d circulant matrices
reduces to proving non-rigidty for DFTd and proving non-rigidity for G-circulant matrices for G = Znd
reduces to proving non-rigidity for Hd,n . Below, we show that these statements are actually equivalent.

Claim 2.22. It is possible to rescale the rows and columns of Hd,n to get a matrix of the form MZnd ( f )
for some symmetric function f : Znd → C. In particular, it is possible to rescale the rows and columns of
DFTd to get an adjusted-circulant matrix.

                          T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                              12
                             F OURIER AND C IRCULANT M ATRICES ARE N OT R IGID

Proof. Let ζ be such that ζ 2 = ω. Multiply each row of Hd,n by ζ (I·I) and each column by ζ (J·J) to get a
matrix H 0 . We have
                                            0
                                          HIJ = ζ (I+J)·(I+J) .
                                                                                      2      2
For an ordered tuple x = (x1 , . . . , xn ) ∈ Znd , we define f (x) = ζ x1 +···+xn . To complete the proof, it suffices
                                                                                       2
to show that f : Znd → C is well defined. To do this, we will show that ζ x depends only on the residue
of x mod d. If d is odd, we can choose ζ to be a d th root of unity and the claim is clear. If d is
            2       2     2                                                                           2
even ζ (x+d) = ζ x ζ 2dx+d but since 2dx + d 2 is a multiple of 2d, we get that ζ 2dx+d = 1 and thus
       2      2
ζ (x+d) = ζ x .


3      Non-rigidity of generalized Walsh–Hadamard matrices
In this section, we show that the GWH matrix Hd,n becomes highly non-rigid for large values of n. The
precise result is stated below.

Theorem 3.1. Let N = d n for positive integers d, n. Let 0 < ε < 0.01 and assume n ≥ 1/ψ where

                                                                ε2
                                              ψ=                            .
                                                      400 log2 (1/ε)d log d

Then
                                                    rHd,n N 1−ψ ≤ N ε .
                                                               


   First we prove a few lemmas about symmetric polynomials that we will use in the proof of Theo-
rem 3.1.

Lemma 3.2. Let Tm denote the set of ordered tuples in Znd such that at least m entries are equal to 0. Let
rep(Tm ) = {I1 , . . . , Ik }. Consider the polynomials P1 (x1 , . . . , xn ), . . . , Pk (x1 , . . . , xn ) defined by

                                              Pi (x1 , . . . , xn ) =        ∑        xI .
                                                                        I∈perm(Ii )


For any complex numbers y1 , . . . , ym , and any polynomial Q(xm+1 , . . . xn ) that is symmetric and degree at
most d − 1 in each of its variables, there exist coefficients c1 , . . . , ck such that

                               Q(xm+1 , . . . , xn ) = ∑ ci Pi (y1 , . . . ym , xm+1 , . . . , xn ) .

Proof. It suffices to prove the statement for all Q of the form
                                                                                 00
                                                              ∑             xI
                                                         I 00 ∈perm(I 0 )


where I 0 ∈ Zn−m
             d   . We will prove this by induction on the degree. Clearly one of the Ii is (0, 0 . . . 0), so one
of the polynomials Pi (x1 , . . . , xn ) is constant. This finishes the case when Q has degree 0. Now we do the

                         T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                                      13
                                                      Z EEV DVIR AND A LLEN L IU

induction step. Note that we can extend I 0 to an element of Tm by setting the first m entries equal to 0.
Call this extension I and say that I ∈ perm(Ii ). We have
                                        00
                        ∑             xI = Pi (y1 , . . . , ym , xm+1 , . . . , xn ) − R(y1 , . . . , ym , xm+1 , . . . xn ) .
                   I 00 ∈perm(I 0 )


R(y1 , . . . , ym , xm+1 , . . . xn ), when viewed as a polynomial in xm+1 , . . . , xn (since y1 , . . . , ym are complex
numbers that we can plug in), is symmetric and of lower degree than the left hand side. Thus, using the
induction hypothesis, we can write R in the desired form. This completes the induction step.

   The key ingredient in the proof of Theorem 3.1 is the following lemma which closely resembles the
main result in [7], but deals with matrices over C.

Lemma 3.3. Let f : Znd → C be a symmetric function on the n variables. Let N = d n . Let 0 < ε < 0.01
and assume n ≥ 1/ψ where
                                                  ε2
                                   ψ=                         .
                                        400 log2 (1/ε)d log d
Then
                                                            rM( f ) N 1−ψ ≤ N ε .
                                                                         


    Let                                                                                   
                                                     ε                                1−δ 
                                             δ=             ,           and      m= n
                                                10 log(1/ε)                            d
and let S denote the set of all ordered tuples (i1 , i2 , . . . , in ) ∈ Znd such that the entries indexed 1, 2, . . . , m
are equal to 0, the entries indexed m + 1, . . . , 2m are equal to 1 and in general for 0 ≤ i ≤ d − 1, the entries
                                                                                    2
indexed im + 1, . . . , (i + 1)m are equal to i. Note |S| = d n−dm ≈ d δ n = N ε (since n − dm is approximately
δ n).
    The main idea will be to change f in a small number of locations so that it has many zeros in the
set {ω [I] | I ∈ Znd } in order to make use of Claim 2.20. More precisely, first we will change f to f 0 by
changing its values in at most N ε places so that f 0 is still symmetric in all of the variables and
                                                                
                                           ∀I ∈ S,       Pf ω [I] = 0 .
                                                             0



Note that although the size of S is small, the fact that f 0 is symmetric implies that f 0 also vanishes on
perm(S), which covers almost all of Znd . Once we have shown the above, we quantitatively bound the
number of entries changed between M( f ) and M( f 0 ) and also the rank of M( f 0 ) to complete the proof of
Lemma 3.3. To do the first part, we need the following sub-lemma.

Lemma 3.4. Let T denote the set of all ordered tuples (i1 , i2 , . . . , in ) ∈ Znd such that at least n 1 − δ of
                                                                                                              

the entries are 0. By changing the values of f only on elements of T , we can obtain f 0 satisfying
                                                         
                                      ∀I ∈ S,       Pf 0 ω [I] = 0 .                                         (3.1)


                            T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                                                14
                              F OURIER AND C IRCULANT M ATRICES ARE N OT R IGID

Proof. We interpret (3.1) as a system of linear equations where the unknowns are the values of f 0
at various points. Let rep(T ) = {J1 , J2 , . . . , Jk } for J1 , J2 , . . . Jk ∈ T . Since we must maintain that f 0 is
symmetric, there are essentially k variables each corresponding to an equivalence class of ordered tuples
under permutations. Each equivalence class is of the form perm(J j ) and we denote the corresponding
variable by m j . The system of equations in (3.1) can be rewritten in the form
                                               k
                                                                                                 0
                             ∀I ∈ S          ∑ mj            ∑       ω I·J + ∑ f (J 0 )ω I·J = 0 .
                                             j=1      J∈perm(J j )          J 0 ∈T
                                                                                /

If we let rep(S) = {I1 , I2 , . . . , Il }, the system has exactly l distinct equations corresponding to each element
of rep(S) due to our symmetry assumptions. Let M denote the l × k coefficient matrix represented by
Mi j = ∑J∈perm(J j ) ω Ii ·J . To show that the system has a solution, it suffices to show that the column span
of M is full. This is equivalent to showing that for each i = 1, 2, . . . , l there exist coefficients a1 , a2 , . . . , ak
such that
                                                         k
                                                        ∑ aj ·         ∑        ω Ii ·J 6= 0 ,
                                                        j=1      J∈perm(J j )
                                                         k
                                       ∀i0 6= i         ∑ aj ·         ∑        ω Ii0 ·J = 0 .
                                                        j=1      J∈perm(J j )


    Fix an index i0 . We can view each equation above as a polynomial in ω [Ii ] given by
                                                                  k
                                           P(x1 , . . . , xn ) = ∑ a j       ∑          xJ
                                                                 j=1     J∈perm(J j )


and the problem becomes equivalent to constructing a polynomial that vanishes on ω [Ii ] if and only if
i 6= i0 . Note that only the entries xdm+1 , . . . , xn matter as we have

                              x1 = · · · = xm = 1, . . . , x(d−1)m+1 = · · · = xdm = ω d−1

for all points we consider.
    For Ii = (i1 , i2 , . . . in ), let Ii0 denote the (ordered) sub-tuple (idm+1 , . . . , in ). The problem is equivalent
to constructing a polynomial

                         Q(xdm+1 , . . . , xn ) = P(1, 1, . . . , ω d−1 , . . . , ω d−1 , xdm+1 , . . . xn )
                                0
such that Q vanishes on ω [Ii ] if and only if i 6= i0 .
    Lemma 3.2 implies that by choosing the coefficients a1 , . . . , ak , we can make Q be any polynomial
that is symmetric in xdm+1 , . . . , xn and degree at most d − 1 in each of the variables.
    Now consider the polynomial
                                                             d
                                                               xdm+1 − 1              xnd − 1
                                                                                                  
                 Qi0 (xdm+1 , . . . , xn ) =       ∑                           ···                    .
                                                                       I 0(1)      xn − ω I
                                                                                            0(n−dm)
                                             I 0 ∈perm(I 0 ) xdm+1 − ω
                                                          i0



                          T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                                         15
                                           Z EEV DVIR AND A LLEN L IU

Note this is a polynomial with coefficients in C since each of the factors reduces to a degree d − 1
polynomial.
    It is clear that the above polynomial is symmetric in all of the variables and satisfies the degree
constraint so we know we can choose suitable coefficients a1 , . . . , ak . We claim that the polynomial we
                                [I 0 ]                       0
construct does not vanish on ω i0 but vanishes on ω [Ii ] for i 6= i0 . Indeed, the product
                                     d
                                         xdm+1 − 1                xnd − 1
                                                                              
                                                   0(1)
                                                         ···            0(n−dm)
                                       xdm+1 − ω I             xn − ω I
is 0 if and only if (xdm+1 , . . . , xn ) 6= I 0 . However, there is exactly one I 0 ∈ perm(Ii00 ) with I 0 = Ii00 and
none with I 0 = Ii0 for i 6= i0 since I1 , I2 , . . . , Il are representatives of distinct equivalence classes under
permutation of entries. This means that the polynomial Qi0 we constructed has the desired properties and
completes the proof that the system is solvable.

Proof of Lemma 3.3. Since M( f ) = (M( f ) − M( f 0 )) + M( f 0 ), to complete the proof of Lemma 3.3, it
suffices to bound the number of nonzero entries in M( f ) − M( f 0 ) and the rank of M( f 0 ).
                                                                                 0
    The number of nonzero entries in each row and column
                                    n
                                                            of (M( f ) − M( f )) is at most |T |. This is
exactly the number of elements of Zd with at least n 1 − δ entries equal to 0. Using standard tail bounds
on the binomial distribution (see [3]), the probability of a random ordered n-tuple having at least that
many 0s is at most
                                                                                      
                               1                                                      dδ
           exp −nD 1 − δ ||           = exp −n (1 − δ ) log(d(1 − δ )) + δ log
                               d                                                     d −1
                                                                                          
                                   −n(1−δ )                                           dδ
                               =d           exp −n (1 − δ ) log(1 − δ ) + δ log
                                                                                     d −1
where
                                                    a                  1−a
                                 D(a || b) = a log + (1 − a) log
                                                    b                  1−b
denotes the KL-divergence between Bernoulli distributions with means a and b.
    For δ < 0.01, the above is at most d −n(1−4δ log(1/δ )) . Since 4δ log(1/δ ) < ε, we change at most d εn
entries in each row and column.
    By Claim 2.20, the rank of M( f 0 ) is at most d n − |perm(S)|. Equivalently, this is the number of
ordered n-tuples such that some element in {0, 1, . . . , d − 1} appears less than (1−δ
                                                                                      d
                                                                                        )n
                                                                                           times. We use the
multiplicative Chernoff bound and then union bound over the d possibilities to get the probability that a
randomly chosen ordered n-tuple in Znd is outside perm(S) is at most

                                         δ 2n                  δ 2n
                                                                        
                               d exp −          = exp −             + log d .
                                          2d                   2d
                                                      2
   When n > 4d(log
                 δ2
                    d)
                       , the above is at most d −(δ n)/(4d log d) and thus the rank of M( f 0 ) is at most d (1−ψ)n
where
                                                         ε2
                                         ψ=                             ,
                                              400 log2 (1/ε)d log d
completing the proof of Lemma 3.3.

                         T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                                     16
                               F OURIER AND C IRCULANT M ATRICES ARE N OT R IGID

Proof of Theorem 3.1. Combine Claim 2.22 and Lemma 3.3.

   Using Theorem 3.1, Lemma 2.21, and Claim 2.18, we get the following result which extends
Lemma 3.3 to matrices where f is not symmetric.
Corollary 3.5. For any function f : Znd → C and any 0 < ε < 0.01 such that n ≥ 1/ψ where

                                                               ε2
                                              ψ=                           ,
                                                     400 log2 (1/ε)d log d
we have
                                                  rM( f ) 2N 1−ψ ≤ N 2ε
                                                                

where N = d n .


4     Non-rigidity of DFT matrices of well-factorable size
Our goal in this section is to show that we can find infinitely many values of N for which the DFT matrix
DFTN is highly non-rigid. The integers N we analyze will be products of many distinct primes qi with
the property that qi − 1 is smooth (has all prime factors small). For these values of N, we can decompose
the matrix DFTN into several submatrices that are closely related to Hadamard matrices. We then apply
the results from the previous section to show that each submatrix is non-rigid and aggregate over the
submatrices to conclude that DFTN is non-rigid.
    We first show precisely how to construct N. We rely on the following number theoretic result, found
in [5], that allows us to find a large set of primes qi for which qi − 1 is smooth.
Definition 4.1. For a positive integer m, let ρ + (m) denote the largest prime factor of m. For a fixed
positive integer a, let
                              πa (x, y) = |{p | a < p ≤ x, ρ + (p − a) ≤ y}|
where p ranges over all primes. In other words, πa (x, y) is the number of primes at most x such that p − a
is y-smooth.
Theorem 4.2 ([5]). There exist constants x0 ,C such that for β = 0.2961, x > x0 and y ≥ xβ we have 5
                                                                    x
                                                   π1 (x, y) >            .
                                                                 (log x)C
   Throughout the remainder of this paper, set C0 = C + 1 where C is the constant in Theorem 4.2. The
properties that we want N to have are stated in the following two definitions.
Definition 4.3. We say a prime q is (α, x)-good if the following conditions hold.
     • (logxx)C0 ≤ q ≤ x.

     • All prime powers dividing q − 1 are at most xα .
    5 [5] proves the same inequality with π (x, y) for any integer a where x may depend on a and C is an absolute constant.
                                           a                                0



                           T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                                              17
                                             Z EEV DVIR AND A LLEN L IU

Definition 4.4. We say an integer N is (l, α, x)-factorable if the following conditions hold.

    • N = q1 · · · ql where q1 , . . . , ql are distinct primes.

    • q1 , . . . , ql are all (α, x)-good.

   To show the existence of (l, α, x)-factorable integers, it suffices to show that there are many (α, x)-
good primes. This is captured in the following lemma.

Lemma 4.5. For a fixed constant C0 , any parameter α > 0.2961, and sufficiently large x (possibly
depending on α), there are at least 10x/(log x)C0 distinct (α, x)-good primes.

Proof of Lemma 4.5. Let y = xβ where β = 0.2961. By Theorem 4.2, for sufficiently large x, we can find
at least                                                  
                                           x         x
                                                −
                                        (log x)C (log x)C0
primes p1 , . . . , pl between x/(log x)C0 and x such that all prime factors of pi − 1 are at most xβ . Eliminate
all of the pi such that one of the prime powers in the prime factorization of pi − 1 is more than xα . Note
that there are at most xβ log x integers in the range [xα , x] that are powers of primes smaller than xβ .
Each of these prime powers can divide at most x1−α of the elements {p1 − 1, . . . , pl − 1}, so in total, we
eliminate at most x1−α+β log x of the pi . Thus, for sufficiently large x, the number of (α, x)-good primes
is at least
                                    x           x                            x
                                       C
                                         −         C
                                                      − x1−α+β log x ≥             .
                                (log x)     (log x) 0                    2(log x)C
    For simplicity, we will set α = 0.3 by default.

Definition 4.6. We say a prime is x-good if it is (0.3, x)-good. We say an integer N is (l, x)-factorable if
it is (l, 0.3, x)-factorable.

     Lemma 4.5 implies that for all sufficiently large x and l ≤ (logxx)C0 (where C0 is an absolute constant),
we can find (l, x)-factorable integers. We now show that if we choose x sufficiently large and N to be
(l, x)-factorable for some
                                           x                     x
                                               +100
                                                    ≤l≤                  ,
                                     (log x)C0             (log x)C0 +10
then DFTN is highly non-rigid.

Theorem 4.7. Let 0 < ε < 0.01 be some constant. For x sufficiently large and N a (l, x)-factorable
number with
                                     x                  x
                                         +100
                                              ≤l≤               ,
                                      C
                               (log x) 0          (log x)C0 +10
we must have                                                              
                                                            N
                                        rDFTN                                  ≤ N 7ε .
                                                    exp (ε (log N)0.36 )
                                                          6

    In order to prove Theorem 4.7, we will first prove a series of preliminary results that characterize the
structure of DFT and GWH matrices.

                          T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                               18
                            F OURIER AND C IRCULANT M ATRICES ARE N OT R IGID

4.1   Structure of generalized Walsh–Hadamard and DFT matrices
Lemma 4.8. Let n = x1 x2 · · · x j for pairwise relatively prime positive integers x1 , . . . , x j . There exists a
permutation of the rows and columns of DFTn , say DFT0 , such that

                                         DFT0 = DFTx1 ⊗ · · · ⊗ DFTx j ,

where ⊗ denotes the Kronecker product.

Proof. This follows from Fact 2.11.

Lemma 4.9. Let M = A ⊗ B where A is an m × m matrix and B is an n × n matrix. For any two integers
r1 , r2 we have
                                rM (r1 n + r2 m) ≤ rA (r1 )rB (r2 ) .

Proof. The proof of this lemma is similar to the proof of Lemma 2.21. There are matrices E, F with at
most rA (r1 ) and rB (r2 ) nonzero entries respectively such that rank(A + E) ≤ r1 and rank(B + F) ≤ r2 .
We will now show that rank(M − E ⊗ F) ≤ r1 n + r2 m. Indeed

                                   M − E ⊗ F = (A + E) ⊗ B − E ⊗ (B + F)

and the right hand side of the above has rank at most r1 n + r2 m since rank multiplies under the Kronecker
product. Clearly E ⊗ F has at most rA (r1 )rB (r2 ) nonzero entries in each row and column so we are
done.

Lemma 4.10. Consider the matrix

                         A = (DFTt1 ⊗ · · · ⊗ DFTt1 ) ⊗ · · · ⊗ (DFTtn ⊗ · · · ⊗ DFTtn ) .
                              |       {z          }              |       {z          }
                                          a1                                    an

Let 0 < ε < 0.01 be some chosen parameter and D be
                                                  some       sufficiently large constant (possibly depending
                                                   ti2 (logti )2
                                                                    
on ε). Assume t1 ≤ t2 ≤ · · · ≤ tn and ai ≥ max         ε 10
                                                                 , D for all i. Let P = t1a1 · · ·tnan and L =
d2 log log Pe. Then
                                           6     2
                                                             
                                    rA P1−ε /(10Ltn logtn ) ≤ P5ε .

Proof. First, we consider the case when there exists an integer B such that B ≤ t1a1 , . . . ,tnan ≤ B2 . Note that

                                         (DFTti ⊗ · · · ⊗ DFTti ) = Hti ,ai .
                                          |       {z         }
                                                    ai


By Theorem 3.1, for each i there exists a matrix Ei such that Ei has at most tiεai nonzero entries in each
row and column and
                                                        ai (1−ε 4 /(ti2 logti ))
                                 rank(Hti ,ai − Ei ) ≤ ti                        .

                        T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                                    19
                                                Z EEV DVIR AND A LLEN L IU

Let Ai = Hti ,ai − Ei . Then
(DFTt1 ⊗ · · · ⊗ DFTt1 ) ⊗ · · · ⊗ (DFTtn ⊗ · · · ⊗ DFTtn ) = (E1 + A1 ) ⊗ · · · ⊗ (En + An )
 |       {z          }              |       {z          }
          a1                                 an
                 !               !                         !            !                                                     !                   !
           O            O                                      O                  O                                    O            O
= ∑              Ai ⊗            Ei0   =       ∑                     Ai ⊗                     Ei0   +       ∑                Ai ⊗           Ei0       .
   S⊂[n]   i∈S           i0 ∈S
                            /              S⊂[n],|S|≥εn        i∈S                    i0 ∈S
                                                                                         /              S⊂[n],|S|<εn   i∈S          i0 ∈S
                                                                                                                                       /

Let the first term above be N1 and the second term be N2 . We bound the rank of N1 and the number of
nonzero entries in each row and column of N2 . Note that by grouping terms in the sum for N1 , we can
find matrices ES for all S ⊂ [n] with |S| = εn and write
                                                           !
                                                                          O
                                             N1 =              ∑                 Ai ⊗ ES .
                                                      S⊂[n],|S|=εn         i∈S

Now we have
                                                                                       
                                                                 n   1       P
               rank(N1 ) ≤      ∑ P ∏ ai ε 4 /(ti2 logti ) ≤ εn Bε 4 /(tn2 logtn ) εn
                           S⊂[n],|S|=εn i∈S ti
                                                                 εn
                                   (n)εn            P            3            P
                              ≤ εn εn                       ≤
                                              4 /(t 2 logt ) εn           5 n/(t 2 logt ) .
                                     3     Bε      n      n      ε    B ε       n      n

                          2
                            t (logt )2
                                         
Since we assumed ai ≥ max i ε 10 i , D , we get

                                              a ε 4 /(2tn2 logtn )                       Dε 4 /(2tn2 logtn )
                            4     2
                                                                                                            
                        Bε /(tn logtn ) ≥ tn n                       ≥ max tn0.5 logtn ,tn                     .

Either the first term is larger than (3/ε)2 or tn is bounded above by some function of ε in which case if
we choose D sufficiently large, the second term will be larger than (3/ε)2 . In any case we get
                                                  P                                   P                       5   2
                     rank(N1 ) ≤                                      ≤                             ≤ P1−ε /(4tn logtn ) .
                                           ε 4 /(tn2 logtn ) εn             5     2
                                                                          Bε n/(2tn logtn )
                                       ε
                                                               
                                       3 ·B

Now we bound the number of nonzero entries in each row and column of N2 . This number is at most
                                                 2n B2εn Pε ≤ 2n P3ε ≤ P4ε .
Thus, when we have B ≤ t1a1 , . . . ,tnan ≤ B2 ,
                                                 5    2
                                                                
                                           rA P1−ε /(4tn logtn ) ≤ P4ε .

    Now we move on to the case where we no longer have control over the range of values t1a1 , . . . ,tnan .
                                                                                       j−1  j
Fix k = 2D and consider the intervals I1 = [k, k2 ), I2 = [k2 , k4 ), . . . , I j = [k2 , k2 ), . . . and so on. Note
                                                                                 
                                              OO
                                       A=                        (DFTt j ⊗ · · · ⊗ DFTt j ) .
                                                                                           
                                                      
                                                          aj
                                              i∈[L]
                                                                  |        {z         }
                                                       t j ∈Ii                   aj


                        T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                                                                     20
                              F OURIER AND C IRCULANT M ATRICES ARE N OT R IGID

                                       a
For an integer i, let Pi = ∏t a j ∈I t j j . Let T be the set of indices i ∈ [L] such that Pi ≥ Pε/(2L) . Then
                               j   i

                                                                                                      
        O  O                             O  O
      A=      (DFTt j ⊗ · · · ⊗ DFTt j ) ⊗       (DFTt j ⊗ · · · ⊗ DFTt j ) = B ⊗C
                                                                                 
         i∈T aj                                     aj
                                                i∈T
                                                 /
                 |        {z         }                  |        {z         }
                    t j ∈Ii            aj                                        t j ∈Ii           aj

where naturally B denotes the first term and C denotes the second.
                                                                                      L
   Note that the dimension of the matrix C, which we denote by |C|, is at most Pε/(2L) = Pε/2 . We
now apply Lemma 4.9 repeatedly to bound the rigidity of B. Let
                                                                  
                                                O
                                           Bi =     (DFTt j ⊗ · · · ⊗ DFTt j ) .
                                                                               
                                                  aj  |        {z         }
                                                   t j ∈Ii               aj

Then we have,
                                               !                                !!                !4ε
                                                                  1
                              rB       ∏ Pi          ∑     ε 5 /(4tn2 logtn )
                                                                                      ≤    ∏ Pi         .
                                       i∈T           i∈T Pi                                i∈T

From the above inequality, the fact that Pi ≥ Pε/(2L) for all i ∈ T , and |C| ≤ Pε/2 we deduce
                                                                                   6     2
                                                                                                   !
           
             1−ε 6 /(10Ltn2 logtn )
                                         
                                              1−ε 6 /(8Ltn2 logtn )
                                                                             LP1−ε /(8Ltn logtn )
         rA P                         ≤ rA LP                         ≤ |C|rB                        ≤ P5ε .
                                                                                     |C|


4.2    Proof of Theorem 4.7
To complete the proof of Theorem 4.7, we will break DFTN into submatrices, show that each submatrix
is non-rigid using techniques from the previous section, and then combine our estimates to conclude that
DFTN is non-rigid. Recall that N is (l, x)-factorable with
                                                   x                  x
                                                            ≤l≤               ,
                                             (log x)C0 +100     (log x)C0 +10

meaning N = q1 q2 · · · ql for some distinct primes q1 , . . . , ql where qi − 1 has no large prime power divisors
for all i. Let γ be a primitive N th root of unity.

Definition 4.11. For a subset S ⊂ [l] define multN (S) = ∏s∈S qs and factN (S) = ∏s∈S (qs − 1).

Definition 4.12. For all S ⊂ [l] we will define TS as the subset of [N] × [N] indexed by (i, j) such that

                                               ∀s ∈ S           i j 6≡ 0        mod qs ,
                                               ∀s ∈
                                                  /S            ij ≡ 0          mod qs .

Note that as S ranges over all subsets of [l], the sets TS form a partition of [N] × [N].

                          T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                                  21
                                         Z EEV DVIR AND A LLEN L IU

    For each S, we will divide the set TS into submatrices such that when filled with the corresponding
entries of DFTN , we can apply Lemma 4.10 to show that each submatrix is nonrigid. The key intuition is
that for a given prime qi , once we restrict to nonzero residues, the multiplicative subgroup actually has the
additive structure of Zqi −1 . Since qi − 1 is smooth, Zqi −1 is a direct sum of cyclic groups of small order.

Definition 4.13. For all S ⊂ [l], we define the factN (S) × factN (S) matrix M(S) as follows. Let RS be the
set of residues modulo multN (S) that are relatively prime to multN (S). Note that |RS | = factN (S). Each
row and each column of M(S) is indexed by an element of RS and the entry in row i and column j is θ i· j
where θ is a primitive multN (S) root of unity. The exact order of the rows and columns will not matter
for our uses. Note that replacing θ with θ k for k relatively prime to multN (S) simply permutes the rows
so it does not matter which root of unity we choose.

Lemma 4.14. Consider the set of entries in DFTN indexed by elements of TS . We can partition this set
       / (2qs − 1) submatrices each of size factN (S) × factN (S) that are equivalent to M(S) up to some
into ∏s∈S
permutation of rows and columns.

Proof. In TS , for each prime qs with s ∈/ S, there are 2qs − 1 choices for what i and j are mod qs . Now
fix the choice of i, j mod qs for all s ∈/ S. We restrict to indices with i ≡ c1 mod ∏s∈S    / qs and j ≡ c2
mod ∏s∈S   q
         / s  for some  c ,
                         1 2c .
     We are left with a factN (S) × factN (S) matrix, call it A, where i and j run over all residues modulo
multN (S) that are relatively prime to multN (S). Naturally, label all rows and columns of this matrix by
what the corresponding indices i and j are modulo multN (S). For a row labeled a and a column labeled
                                                 0 0
b, we compute the entry Aab . The value is γ a ·b where a0 is the unique element of ZN such that a0 ≡ a
mod multN (S) and a0 ≡ c1 mod ∏s∈S                0
                                      / qs and b is defined similarly. We have

                                        a0 · b0 ≡ ab mod multN (S) ,
                                        a0 · b0 ≡ c1 c2 ≡ 0   mod ∏ qs .
                                                                  s∈S
                                                                   /

Therefore
                                       a0 b0 ≡ k ∏ qs ab mod multN (S)
                                                s∈S
                                                 /

where k is defined as an integer such that k ∏s∈S / qs ≡ 1 mod multN (S). Note that k clearly exists since
∏s∈S
   / sq and mult  N (S) are relatively prime. Since γ k ∏s∈S
                                                          / qs is a primitive mult (S) root of unity, the matrix
                                                                                  N
A is equivalent to M(S) up to some permutation, as desired.

Lemma 4.15. For a subset S ⊂ [l] with |S| = k and M(S) (as defined in Definition 4.13) a factN (S) ×
factN (S) matrix as described above. we have
                                                         
                                           factN (S)
                                 rM(S)                      ≤ (factN (S))6ε
                                         exp (ε 6 x0.37 )

as long as k ≥ (log x)xC0 +200 .


                           T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                             22
                            F OURIER AND C IRCULANT M ATRICES ARE N OT R IGID

Proof. Without loss of generality S = {1, 2, . . . , k}. Consider the factorizations of q1 − 1, . . . , qk − 1 into
prime powers. For each prime power pei i ≤ x0.3 , let c(pei i ) be the number of indices j for which pei i
appears (exactly) in the factorization of q j − 1. Consider all prime powers pei i for which c(pei i ) < x0.62 .
                                                                  x0.3
                                             c(t)        0.3 x0.62           0.92
                                     ∏     t      ≤   (x    )            ≤ xx .
                                   t,c(t)≤x0.62
                                                                                       c(t )   c(t )
Now consider all prime powers say t1 , . . . ,tn for which c(ti ) ≥ x0.62 . Let P = t1 1 · · ·tn n . From the above
we know that as long as x is sufficiently large
                                                                               εk
                                                                        x
                                  factN (S)                               C
                                                                  (log x) 0  +1
                          P≥          x 0.92 ≥ (factN (S))(1−ε)         x 0.92      ≥ (factN (S))(1−ε) .       (4.1)
                                    x                                 x
      We will use the prime powers ti and Theorem 3.1 to show that M(S) is not rigid. Note that we
can associate each row and column of M(S) with an ordered k-tuple (a1 , . . . , ak ) where ai ∈ Zqi −1 as
follows. First, it is clear that each row and column of M(S) can be associated with an ordered k-tuple
(z1 , . . . , zk ) ∈ F×              ×          ×
                      q1 × · · · × Fqk . Now Zqi can be viewed as a cyclic group on qi − 1 elements. This allows us
to create a bijection between the rows and columns of M(S) and elements of Zq1 −1 × · · · × Zqk −1 .
      Also note that for a row indexed by A = (a1 , . . . , ak ) and a column indexed by B = (b1 , . . . , bk ), the
entry M(S)AB is dependent only on A + B. We will now decompose M(S) into several P × P submatrices.
In particular, we can write qi − 1 = di Ti where Ti is a product of some subset of {t1 , . . . ,tn } and di is
relatively prime to Ti . We have T1 T2 · · · Tk = P. For each A0 , B0 ∈ Zd1 × · · · × Zdk , we can construct a
P × P submatrix M(S, A0 , B0 ) consisting of all entries M(S)AB of M(S) such that A ≡ A0 , B ≡ B0 (where the
equivalence is over Zd1 × · · · × Zdk ). This gives us d 2 different submatrices where d = d1 · · · dk . Naturally,
we can associate each row and column of a submatrix M(S, A0 , B0 ) with an element of ZT1 × · · · × ZTk
such that for a row labeled I and a column labeled J, the entry M(S, A0 , B0 )IJ only depends on I + J. In
particular, this means that X (M(S, A0 , B0 )) X is diagonal where X = DFTT1 ⊗ · · · ⊗ DFTTk . Now, using
Lemma 4.8, we can rewrite
                         X = (DFTt1 ⊗ · · · ⊗ DFTt1 ) ⊗ · · · ⊗ (DFTtn ⊗ · · · ⊗ DFTtn ) .
                              |       {z          }              |       {z          }
                                         c(t1 )                            c(tn )

    Since for x sufficiently large, c(ti ) ≥ x0.62 ≥ ti2 (logti )2 /ε 10 , we can use Lemma 4.10 and get that
                                               6                0.62
                                                                      
                                     rX P1−ε /(20(log log P)x ) ≤ P5ε .
   Let E be the matrix of changes to reduce the rank of X according to the above. We have that E has at
most Pε nonzero entries in each row and column, and
                                                           6              0.62 )
                                    rank(X − E) ≤ P1−ε /(20(log log P)x            .
We can write M(S) in block form as
                                                                                    
                           M(S, A1 , B1 ) M(S, A1 , B2 ) . . .        M(S, A1 , Bd )
                         M(S, A2 , B1 ) M(S, A2 , B2 ) . . .         M(S, A1 , Bd )
                                                                                    
                               ..             ..        ..                ..        
                                .              .            .              .        
                          M(S, Ad , B1 ) M(S, Ad , B2 ) . . .         M(S, Ad , Bd )

                        T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                                   23
                                               Z EEV DVIR AND A LLEN L IU

where A1 , . . . , Ad and B1 , . . . , Bd range over the elements of Zd1 × · · · × Zdk . We can rearrange the above
as                                                                                         
                          M(S, A1 , B1 ) . . . M(S, A1 , Bd )        XD11 X . . . XD1d X
                                   ..        ..          ..     ..              ..      .. 
                                                              = .
                        
                                   .           .         .                          .     . 
                         M(S, Ad , B1 ) . . .         M(S, Ad , Bd )         XDd1 X          ...   XDdd X
where the Di j are diagonal matrices. Now consider the matrix
                                                                
                                            ED11 E . . . ED1d E
                                   E(S) =  ...      ..       ..  .
                                          
                                                        .      . 
                                                           EDd1 E . . .     EDdd E
We have
                                                                    
                            XD11 X − ED11 E    . . . XD1d X − ED1d E
            M(S) − E(S) =         ..          ..             ..
                                                                     =
                                                                    
                                    .              .           .
                            XDd1 X − EDd1 E . . . XDdd X − EDdd E
                                                                                  
               XD11 (X − E) . . . XD1d (X − E)       (X − E)D11 E . . . (X − E)D1d E
                     ..     ..          ..                 ..       ..        ..
                                               +                                   .
                                                                                  
                     .         .        .                  .          .       .
               XDd1 (X − E) . . .         XDdd (X − E)                 (X − E)Dd1 E . . .           (X − E)Ddd E
    In the above expression, each of the two terms has rank at most
                                                                                                               
                       1−ε 6 /(20(log log P)x0.62 )            factN (S)             1         factN (S)
                    dP                                = ε 6 /(20(log log P)x0.62 ) ≤                                .
                                                       P                             2       exp (ε 6 x0.37 )
Note that when computing the rank, we only multiply by d (and not d 2 ) because the small blocks are all
multiplied by the same low rank matrix on either the left or right. The number of nonzero entries in each
row and column of E(S) is at most P5ε d = fact  N (S)
                                             P1−5ε
                                                      . Since P ≥ (factN (S))1−ε , we conclude
                                                        
                                          factN (S)
                                rM(S)                      ≤ (factN (S))6ε .
                                        exp (ε 6 x0.37 )
    We are now ready to complete the analysis of the non-rigidity of the DFT matrix DFTN .
Proof of Theorem 4.7. Set the threshold m = x0.365 and k0 = l − m. The sets TS , as S ranges over all
subsets of [l], form a partition of [N] × [N]. For each S ⊂ [l] with |S| ≥ k0 , we will divide TS into
factN (S) × factN (S) submatrices using Lemma 4.14 and change entries to reduce the rank of every
submatrix according to Lemma 4.15. We will not touch the entries in sets TS for |S| < k0 . Call the
resulting matrix M 0 . We now estimate the rank of M 0 and then the maximum number of entries changed
in any row or column.
      We remove all rows and columns corresponding to integers divisible by at least m2 of the primes
q1 , . . . , ql . The number of rows and columns removed is at most
                                                                            !m/2
                                                          
                                       1       N        l               l                 N
                  N ∑ ∏ ≤                      m/2
                                                              <N        x          ≤          0.365 .
                                      q                m/2                           (log x)x
                                                 
                      S⊂[l],|S|= m i∈S i
                           2
                                             x                      (log x)C0
                                               (log x)C0


                          T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                                         24
                            F OURIER AND C IRCULANT M ATRICES ARE N OT R IGID

    The remaining entries must be subdivided into matrices of the form M(S) for various subsets S ⊂ [l]
with |S| ≥ k0 . Let q1 < q2 < · · · < ql . The number of such submatrices is at most
                                                                                          2
                 N2
                                                           
                                                      2              q1 · · · qk0
                               2
                                   ≤ (qk0 +1 · · · ql )                                        ≤ 3(qk0 +1 · · · ql )2 ≤ 3x2m .
      ((q1 − 1) · · · (qk0 − 1))                               (q1 − 1) · · · (qk0 − 1)

Each one of the submatrices has rank at most
                                                                N
                                                                           ,
                                                          exp (ε 6 x0.37 )

so in total the rank is at most
                                                  3x2m         N
                                          N                ≤               .
                                              exp (ε x ) exp (ε 6 x0.369 )
                                                    6 0.37

Combining the two parts we easily get
                                                                       N
                                              rank(M 0 ) ≤                         .
                                                                 exp (ε 6 x0.365 )

     Now we bound the number of entries changed. The number of entries changed in each row or column
is at most
                                                                                  
                 N                6ε                            q1 · · · qk0
                                 N ≤ (qk0 +1 · · · ql )                              N 6ε ≤ 3N 6ε+1.1m/l ≤ N 7ε .
      ((q1 − 1) · · · (qk0 − 1))                          (q1 − 1) · · · (qk0 − 1)

As exp ε 6 x0.365 ≥ exp ε 6 (log N)0.36 for sufficiently large x, we conclude
                                       

                                                                              
                                                           N
                                       rDFTN                                       ≤ N 7ε .
                                                   exp (ε (log N)0.36 )
                                                         6



5    Non-rigidity of all circulant matrices
In the previous section, we showed that there exists an infinite set of DFT matrices that are not Valiant-
rigid. In this section, we will bootstrap the results from Section 4 to show that in fact, no (sufficiently
large) DFT matrix is rigid.
    The first ingredient will be a stronger form of Lemma 4.5. Recall that a prime q is defined to be
x-good if x/(log x)C0 ≤ q ≤ x and all prime powers dividing q − 1 are at most x0.3 and that an integer N is
defined to be (l, x)-factorable if it can be written as the product of l distinct x-good primes.
    To simplify our formulas we use the following notation.

Notation 5.1. Define
                                                                      x
                                                   gk (x) =                   .
                                                                 (log x)C0 +k
Lemma 5.2. For all sufficiently large integers K, there exist l, x, N such that the following conditions
hold:

                        T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                                                    25
                                          Z EEV DVIR AND A LLEN L IU

    • g100 (x) ≤ l ≤ g10 (x),
    • N is (l, x)-factorable,
    • K < N < K(log K)2 .
Proof. Call an N well-factorable if it is (l, x)-factorable for some x and g100 (x) ≤ l ≤ g10 (x). Let N0 be
the largest integer that is well-factorable with N0 ≤ K. Assume N0 is (l, x)-factorable.
     We have N0 = q1 · · · ql where q1 , . . . , ql are distinct, x-good primes. If l < bg10 (x)c then by Lemma 4.5,
we can find another x-good prime ql+1 . We can then replace N0 with ql+1 N0 . ql+1 N0 > K by the
maximality of N0 and also ql+1 N0 ≤ N0 x ≤ N0 (log N0 )2 so ql+1 N0 satisfies the desired conditions.
     We now consider the case where l = bg10 (x)c. First, if q1 , . . . , ql are not the l largest x-good primes
then we can replace one of them say q1 with q01 > q1 . The number N 0 = q01 q2 · · · ql is well-factorable and
between N0 and N0 (log x)C0 . Using the maximality of N0 , we deduce that N 0 must be in the desired range.
     On the other hand if q1 , . . . , ql are the l largest x-good primes, we know they are actually all between
3x/(log x)C0 and x. This is because by Lemma 4.5, there are at least 10x/(log x)C0 distinct x-good primes.
Let x0 = 2x. The above implies that q1 , . . . , ql are x0 -good and clearly g100 (x0 ) ≤ l ≤ g10 (x0 ). Furthermore,
g10 (x0 ) > g10 (x) + 1 so l = bg10 (x)c < bg10 (x0 )c and we can now repeat the argument from the first
case.

    We can now complete the proof that circulant matrices are not rigid.
Theorem 5.3. Let 0 < ε < 0.01 be a given parameter. For all sufficiently large N, if M is an N × N
circulant (or Toeplitz) matrix, then
                                                               
                                                   N
                                    rM                            ≤ N 15ε .
                                         exp (ε 6 (log N)0.35 )
Proof. First we analyze circulant matrices of size N0 where N0 is (l, x)-factorable for some g100 (x) ≤ l ≤
g10 (x). Theorem 4.7 and Lemma 2.21 imply that for M0 an N0 × N0 circulant matrix where N0 satisfies
the previously mentioned conditions,
                                                                
                                                  2N0
                                  rM0                              ≤ N014ε .
                                        exp (ε 6 (log N0 )0.36 )
Now for a circulant matrix M of arbitrary size N × N, note that it is possible to embed an M in the
upper left corner of a circulant matrix of any size at least 2N. By Lemma 5.2, there exists an N0 that is
(l, x)-factorable for some g100 (x) ≤ l ≤ g10 (x) such that
                                                 N0          N0
                                                         ≤N≤    .
                                              (log N0 )2     2
We deduce                                                     
                                                2N0
                                 rM                              ≤ N014ε .
                                      exp (ε 6 (log N0 )0.36 )
Rewriting the bounds in terms of N we get
                                                             
                                                 N
                                  rM                             ≤ N 15ε .
                                       exp (ε 6 (log N)0.35 )


                        T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                                    26
                              F OURIER AND C IRCULANT M ATRICES ARE N OT R IGID

Remark 5.4. Note that our proof actually shows something slightly stronger, namely that the changes
to reduce the rank of a circulant matrix are actually fixed linear combinations of the entries. See
Definition 8.1 and Claim 8.2 for a more precise statement.

    From the above and Claim 2.22, we immediately deduce that DFT matrices are not rigid.

Theorem 5.5. Let 0 < ε < 0.01 be a given parameter. For all sufficiently large N,
                                                            
                                                N
                              rDFTN                            ≤ N 15ε .
                                      exp (ε 6 (log N)0.35 )

6    Non-rigidity of G-circulant matrices for abelian groups
Using the results from the previous section, we can show that DFTG and G-circulant matrices are not
Valiant-rigid for any infinite class of finite abelian groups G. Our proof follows the same strategy as the
proof of Lemma 4.10.

Theorem 6.1. Let 0 < ε < 0.01 be fixed. Let G be an abelian group and f : G → C be a function. If |G|
is sufficiently large then                                    
                                                |G|
                             rDFTG                               ≤ |G|19ε .
                                      exp (ε 8 (log |G|)0.32 )
Proof. By the Fundamental Theorem of Finite Abelian Groups we can write G = Zn1 × · · · × Zna . By
Fact 2.11, we have DFTG = DFTn1 ⊗ · · · ⊗ DFTna . Let us write F = DFTG .
    Without loss of generality, n1 ≤ n2 ≤ · · · ≤ na . We will choose k to be a fixed, sufficiently large
positive integer. By Theorem 5.5, we can ensure that for N > k
                                                               
                                                   N
                                 rDFTN                            ≤ N 15ε .
                                         exp (ε 6 (log N)0.35 )
                                                                         j−1        j
Consider the ranges I1 = [k, k2 ), I2 = [k2 , k4 ), . . . I j = [k2 , k2 ) . . . and so on. Let S j be a multiset defined
by S j = I j ∩ {n1 , . . . , na }. Fix a j and let the elements of S j be x1 ≤ · · · ≤ xb . By Theorem 5.5, for each xi ,
there are matrices Exi and Axi such that DFTxi = Axi + Exi , Exi has at most xi15ε nonzero entries in each
row and column, and
                                                                     xi
                                             rank(Axi ) ≤                         .
                                                           exp (ε (log xi )0.35 )
                                                                   6

Now we can write

                                                                                                                     !                       !
                                                                                                        O                     O
       M j = DFTx1 ⊗ · · · ⊗ DFTxb = (Ax1 + Ex1 ) ⊗ · · · ⊗ (Axb + Exb ) = ∑                                   Axi       ⊗            Exi0
                                                                                                S⊂[b]    i∈S                  i0 ∈S
                                                                                                                                 /
                                                         !                      !                                   !                       !
                                             O                   O                                      O                    O
                          =       ∑                Axi       ⊗           Exi0       +       ∑                 Axi       ⊗            Exi0       .
                              S⊂[b],|S|≥εb   i∈S                 i0 ∈S
                                                                    /                   S⊂[b],|S|<εb    i∈S                  i0 ∈S
                                                                                                                                /



                         T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                                                                      27
                                                   Z EEV DVIR AND A LLEN L IU

Let the first term above be N1 and the second term be N2 . We will bound the rank of N1 and the number
of nonzero entries in each row and column of N2 . Note that by grouping the terms in the sum for N1 we
can write it in the form                            O
                                            ∑          Axi ⊗ ES
                                                      S⊂[b],|S|=dεbe i∈S

where ES is some matrix for each S. This implies that

                                                                  bdεbe
                           
                         b             x1 · · · xb                                   x1 · · · xb
          rank(N1 ) ≤                                 dεbe
                                                           ≤                                        dεbe
                       dεbe (exp (ε 6 (log x1 )0.35 ))         (3)εb dεbe
                                                                          (exp (ε 6 (log x1 )0.35 ))
                                                                                                 dεbe
                                                                                   3
                                                       = x1 · · · xb                                    .
                                                                       ε exp (ε 6 (log x1 )0.35 )

As long as k is sufficiently large, we have
                                                                dεbe                                             dεbe
                                                3                                                     1
       rank(N1 ) ≤ x1 · · · xb                                             ≤ x1 · · · xb
                                      ε exp (ε (log x1 )0.35 )
                                              6                                            exp (ε 6 (log x1 )0.34 )
                                                                                                      x1 · · · xb
                                                                                         ≤
                                                                                           exp (ε (log x1 · · · xb )0.33 )
                                                                                                  7


where in the last step we used the fact that xi ≤ x12 for all i. The number of nonzero entries in each row or
column of N2 is at most

         2b xb · · · xb−bεbc+1 (xb−bεbc · · · x1 )15ε = 2b (x1 · · · xb )15ε (xb · · · xb−bεbc+1 )1−15ε ≤ (x1 · · · xb )18ε .

Note in the last step above, we used the fact that xi ≤ x12 .
   For each integer c between 2 and k, let mc be the number of copies of c in the set {n1 , . . . , na }. If
mc ≥ k2 (log k)2 /ε 4 then by Theorem 3.1, if we define Ac = DFTc ⊗ · · · ⊗ DFTc then
                                                              |     {z         }
                                                                                            mc
                                                             4   2
                                                                          
                                                  rAc cmc (1−ε /(k log k)) ≤ cmc ε .

Let L = d2 log log |G|e and ensure that |G| is sufficiently large so that L > k. Let T be the set of integers c
between 2 and k such that cmc ≥ |G|ε/(2L) (note that as long as |G| is sufficiently large, all elements of T
must satisfy mc ≥ k2 (log k)2 /ε 4 ). Let R be the set of indices j for which ∏x∈S j x ≥ |G|ε/(2L) . Since S j is
clearly empty for j ≥ L, the matrix F can be written as
                                                                             !
                                          O                                                 O
                               F =               DFTc ⊗ · · · ⊗ DFTc  ⊗                         Mj    .
                                                   |      {z         }
                                         2≤c<k                 mc                          1≤ j≤L


Define                                                                                         
                                            O                                              O
                                  B=             DFTc ⊗ · · · ⊗ DFTc  ⊗                     M j .
                                                   |      {z         }
                                            c∈T
                                             /                 mc                          j∈R
                                                                                            /


                            T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                                               28
                                 F OURIER AND C IRCULANT M ATRICES ARE N OT R IGID

Note that the size of B is at most                                k+L
                                                         |G|ε/(2L)      ≤ |G|ε .
Also F = B ⊗ D where
                                                                                               !
                                              O                                         O
                                  D=               DFTc ⊗ · · · ⊗ DFTc  ⊗                 Mj       .
                                                     |      {z         }
                                              c∈T              mc                       j∈R


For any rank r, we have rM (|B|r) ≤ |B|rD (r). Applying Lemma 4.9 iteratively, we get
                                                                      
                                                                                18ε
                 |G|             1                         1                    |G|
            rD                             +
                       ∑ cmc ε 4 /(k2 log k) ∑                          ≤         .
                 |B| c∈T                               7
                                              j∈R exp ε (log ∏ x)0.33            |B|
                                                                                x∈S j

Note that
                                                                       
                  1                             1                            k                        L
  ∑            4   2         +∑                               ≤      5 /(2Lk2 log k) +
     c∈T   cmc ε /(k log k)     j∈R exp ε 7 (log ∏x∈S j x)0.33      |G|ε                   exp (ε (log |G|/2L)0.33 )
                                                                                                 8


                                                                                                                      1
                                                                                                       ≤                              .
                                                                                                           exp (ε 8 (log |G|)0.32 )
Overall, we conclude
                                                                                 18ε
                                                   |G|                        |G|
                                rF                                      ≤ |B|              ≤ |G|19ε .
                                         exp (ε 8 (log |G|)0.32 )             |B|
Theorem 6.2. Let 0 < ε < 0.01 be fixed. Let G be an abelian group and f : G → C be a function. Let
M = MG ( f ) be a G-circulant matrix. If |G| is sufficiently large then
                                                              
                                                2|G|
                                rM                               ≤ |G|38ε .
                                      exp (ε 8 (log |G|)0.32 )
Proof. Note DFTG diagonalizes M. Thus, combining Theorem 6.1 with Lemma 2.21 gives the desired
conclusion.

    The rigidity results we proved hold over C. By examining the proofs more carefully, we can actually
show that when G is an abelian group, the same results hold for G-circulant matrices over an abelian
extension of Q of degree Õ(N 3 ).
Theorem 6.3. Let G be an abelian group of order N. Then there exists m = Õ(N 3 ), depending only on G,
such that G-circulant matrices with entries in Q satisfy
                                                              
                                                2|G|
                                 rM                              ≤ |G|38ε
                                      exp (ε 8 (log |G|)0.32 )

over the mth cyclotomic field. The same bound holds for the matrix DFTG .

                              T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                                                       29
                                          Z EEV DVIR AND A LLEN L IU

Proof. First note that it is immediate from our proof that the GWH matrix Hd,n is not rigid over Q[ω]
where ω is a d th primitive root of unity. Now for well-factorable integers N = p1 p2 · · · pk , the additional
roots of unity that we need to adjoin for non-rigidity of DFTN are all roots of unity with order dividing
(p1 − 1)(p2 − 1) · · · (pk − 1). Thus, DFTN is not rigid over an extension Q[ω][α] where α is a root of
unity of order at most N. Finally for other values of N, we find a well-factorable integer N 0 = Õ(N)
and embed DFTN into an N 0 × N 0 circulant matrix. It is now immediate from Lemma 2.21 that DFTN is
not rigid over a cyclotomic field of order Õ(N 3 ). The proof in Theorem 6.2 for DFTG and G-circulant
matrices generalizes directly.


7      Finite field case
In this section, we sketch how to modify the proofs in the previous sections to deal with matrices over a
finite field. The main difficulty that arises when attempting to extend the above methods to finite fields is
that the entries of the corresponding DFT matrix might not exist in the field. Furthermore, for a finite
field Fq and integer k with gcd(k, q) > 1, there are no primitive kth roots of unity in any extension of
Fq . Because this section involves a significant amount of abstract algebra, we begin by giving a brief
overview of the algebraic tools that we will use.


7.1     Preliminaries about Galois theory and finite fields
Our standard reference for field extensions, Galois theory, and finite fields is Chapters V and VI of Lang’s
Algebra [13]. The monograph by Lidl and Niederreiter [14] is entirely dedicated to finite fields.

Definition 7.1. A field extension K/F means that K is a field and F is a subfield. The degree of the
extension is the dimension of K as a vector space over F. Given a field extension K/F, we say that α ∈ K
is algebraic over F if α is a root of some nonzero polynomial P with coefficients in F. We say that K/F
is an algebraic extension if every element of K is algebraic over F. A finite extension is an extension of
finite degree.

Fact 7.2. Every finite extension is algebraic.

Definition 7.3. Given a field extension K/F and α ∈ K that is algebraic over F, we say that the
polynomial P over F is the minimal polynomial of α if P is monic and has minimal degree among all
nonzero polynomials over F that have α as a root. The degree of α over F is the degree of its minimal
polynomial.

      It is not difficult to see that P exists, is unique, and must be irreducible over F.

Fact 7.4. Given a field extension K/F, let α ∈ K be algebraic over F. The set F[α], defined as the set of
all polynomials of α with coefficients in F, is a field. The degree of the extension F[α]/F is the degree of
α over F.

      F[α1 , α2 ] denotes F[α1 ][α2 ].

                          T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                             30
                          F OURIER AND C IRCULANT M ATRICES ARE N OT R IGID

Definition 7.5. Let K/F be a field extension and let α ∈ K be algebraic over F. Let P be the minimal
polynomial of α. The conjugates of α are the roots of P (including α itself) in an extension of K over
which P decomposes into linear factors.

Definition 7.6 (Galois extensions). An algebraic field extension K/F is normal if for every irreducible
polynomial P over F, if P has a root in K then P splits into linear factors over K. The extension K/F is
Galois if it is normal and for all α ∈ K, all roots of the minimal polynomial of α over F are distinct.

Fact 7.7 ([13, Ch. V, Thm. 5.5]). If K is a finite field then every extension K/F is Galois.

Definition 7.8 (Galois group). For a Galois extension K/F we write Gal(K/F) to denote the set of those
automorphisms of K that fix F elementwise.

    We begin by stating some basic facts.

Fact 7.9. Let K/F be a finite Galois extension of degree g and let α ∈ K have degree m. Let G =
Gal(K/F). Then the following hold.

  (i) |G| = g.

  (ii) m | g.

 (iii) The conjugates of α ∈ K are the elements π(α) for all π ∈ Gal(K/F). The list (π(α) | π ∈ G)
       includes each conjugate of α exactly g/m times. In particular, if K = F[α] (i. e., m = g) then the
       degree of this extension is the number of conjugates of α.

 (iv) F is precisely the set of common fixed points of Gal(K/F).

    The following consequence of item (iv) is immediate.

Fact 7.10. Let K/F be a Galois extension. Let α ∈ K and let α1 , . . . , αm be the conjugates of α with
α1 = α. Then Q(α1 , . . . , αm ) ∈ F for any symmetric polynomial Q.

Fact 7.11 ([13, Ch. V, Thm. 5.4]). Let K be a finite field and F a subfield. If F = Fq then Gal(K/F) is a
cyclic group, generated by the Frobenius automorphism x 7→ xq .

    We have the following consequence.

Fact 7.12. Let K be a finite field and F = Fq a subfield. Let α ∈ K. Then the conjugates of α over F are
                                       j
precisely the elements of the form α q for nonnegative integers j. In particular, if K = Fqm = F[α] then
                                                                     0    1             m−1
m is the degree of α over F and the conjugates of α are exactly {α q , α q , . . . , α q }. Moreover, if n is
the order of α in the multiplicative group of K then m = ordq (n), the order of q modulo n.

    Now we introduce the concept of primitive roots of unity over finite fields and prove some of their
basic properties.

Fact 7.13 ([13, Ch. V, Thm. 5.3]). The multiplicative group of a finite field is cyclic. (In fact, the finite
subgroups of the multiplicative group of any field are cyclic.)

                       T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                              31
                                       Z EEV DVIR AND A LLEN L IU

Definition 7.14. Let F be a field. We say that α ∈ F is an nth root of unity if α n = 1. We say that α ∈ F
is a primitive nth root of unity if α 6= 0 and the order of α in F× is n.

Fact 7.15 ([14, Thm. 2.47(ii)]). A primitive nth root of unity exists in the finite field of order q if and
only if n | q − 1.

Fact 7.16. Let F be a finite field of order q and let n | q − 1. Then the number of nth roots of unity is
precisely n, they are the powers of any primitive nth root of unity, and their sum is 0 if n ≥ 2 and 1 if
n = 1.

Proof. The nth roots of unity are precisely the roots of the polynomial xn − 1. They form a multiplicative
group which is therefore cyclic. The primitive nth roots of unity in F are precisely the generators of this
group. The sum of the roots of xn − 1 is the negative of the coefficient of xn−1 in xn − 1.

Fact 7.17. Let Fq be the finite field of order q and let n be an integer with gcd(q, n) = 1. Let ω be a
primitive nth root of unity in some extension field of Fq . Then the degree of the minimal polynomial of ω
over Fq is ordq (n), the order of q modulo n. The conjugates of ω are
                                                              ordq (n)−1
                                           ω, ω q , . . . , ω q            .

Proof. Immediate from Fact 7.12.

7.2   Modifications to the main proofs
In this section, we sketch how to modify the main proofs to work over finite fields. We work over a finite
field Fq where q is a fixed constant (when we say parameters are chosen to be sufficiently large, they may
be chosen in terms of q). We will first define the DFT matrices over finite fields.

Definition 7.18. Let Fq be a finite field and N be an integer with gcd(N, q) = 1. Pick a canonical
primitive N th root of unity ω in some extension of Fq . The (x, y) entry of the N × N matrix DFTN,Fq
(0 ≤ x, y ≤ N − 1) is ω xy . We will omit the second subscript Fq and just write DFTN when the base field
is clear from context.

Remark 7.19. The matrix DFTN,Fq is well-defined over any extension of Fq that contains ω. It can easily
be verified that the properties proved in Section 2 (namely that DFTN diagonalizes circulant matrices)
also hold in the finite field setting.

    The first lemma in this section allows us to lift to a field extension and then argue that if a matrix is
highly non-rigid over some low-degree extension then it also cannot be rigid over the base field.

Lemma 7.20. Consider a finite field Fq and a finite extension Fq [γ] where γ 6= 0. If the degree of γ over
Fq is g then for any matrix M ∈ Fn×n
                                 q   and any positive integer r,

                                             F              F [γ]
                                            rMq (gr) ≤ rMq (r) .


                       T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                              32
                              F OURIER AND C IRCULANT M ATRICES ARE N OT R IGID

Proof. Let the conjugates of γ be γ1 , . . . , γg where γ1 = γ. Let k be a positive integer such that γ1k +
· · · + γgk 6= 0. Such 0 ≤ k ≤ g − 1 exists because the columns of the Vandermonde matrix generated by
                                                F [γ]
the γi are linearly independent. Let s = rMq (r). There must be a matrix E ∈ Fq [γ]n×n with at most s
nonzero entries in each row and column such that rankFq [γ] (M − E) ≤ r. Now consider the g matrices
E1 = E, E2 , . . . , Eg where Ei is obtained by taking E and replacing γ with its ith conjugate, γi . While
naturally, we would like to consider the matrix (E1 + · · · + Eg )/g and write
                                       E1 + · · · + Eg M − E1       M − Eg
                                 M−                   =       +···+        ,
                                             g           g            g
the above expression is only valid when gcd(g, q) = 1 so we will need a slight modification. Define the
matrix E 0 as follows.
                                          1                                  
                               E0 = k                ·  γ k
                                                            E
                                                          1 1 + · · · + γ k
                                                                          g g .
                                                                            E
                                    γ1 + · · · + γgk
    Note that γ1k + · · · + γgk ∈ Fq and also γ1k E1 + · · · + γgk Eg ∈ Fn×n      q      since the entries are symmetric
                                          0      n×n           0
polynomials in (γ1 , . . . , γg ). Thus E ∈ Fq and E clearly has at most s nonzero entries in each row and
column. Next we observe that
                                             1                                                   
                          M − E0 = k                    ·  γ k
                                                             1 (M − E1 ) + · · · +  γ k
                                                                                      g (M  − Eg )  .
                                       γ1 + · · · + γgk

Note that rankFq [γi ] (M − Ei ) ≤ r for all i. This is because the determinant of every r × r submatrix of
M − E can be written as a formal polynomial in γ with coefficients in Fq and since γ is a root of each of
these polynomials, γi must be as well, implying that the determinant of each of the r × r submatrices of
M − Ei is 0. Thus, we conclude that rankFq (M − E 0 ) ≤ gr. Writing M = (M − E 0 ) + E 0 , we immediately
get the desired conclusion.

    Following the proof of Theorem 3.1, we can prove an analogue over finite fields. All we needed in
Theorem 3.1 was that we were working over a field that contained the roots of unity in the definition of
the GWH matrix. Over finite fields, it suffices to work over an extension that contains the necessary roots
of unity.
Theorem 7.21. Let Fq be a finite field and N = d n for positive integers d, n, q with gcd(d, q) = 1. Let ω
be a primitive d th root of unity in some extension of Fq . Let 0 < ε < 0.01 and assume n ≥ 1/ψ where

                                                            ε2
                                            ψ=                          .
                                                  400 log2 (1/ε)d log d
Let Hd,n = DFTd ⊗ · · · ⊗ DFTd . Then
           |      {z         }
                       n

                                                 F [ω]
                                                         N 1−ψ ≤ N ε .
                                                              
                                                rHqd,n

   With the above, we can now prove a finite field version of Lemma 4.10. The proof is the same as the
proof of Lemma 4.10, using Theorem 7.21 in place of Theorem 3.1. The only necessary change is that
we need to work over an extension of Fq that contains all of the necessary roots of unity.

                           T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                                     33
                                                 Z EEV DVIR AND A LLEN L IU

Lemma 7.22. Let 0 < ε < 0.01 be some chosen parameter, Fq be a fixed finite field, and D be some
sufficiently large constant (possibly depending on ε and q). Consider        positive integers t1 ≤ t2 · · · ≤ tn
                                                       ti2 (logti )2
                                                                        
with gcd(ti , q) = 1 for all i. Also assume ai ≥ max        ε 10
                                                                     , D for all i. Let P = t1a1 · · ·tnan and L =
d2 log log Pe. Consider the field extension Fq [ω1 , . . . , ωn ] where ωi is a primitive ti th root of unity.6 Let
DFTti be the ti × ti DFT matrix with entries in the field extension. Let

                             A = (DFTt1 ⊗ · · · ⊗ DFTt1 ) ⊗ · · · ⊗ (DFTtn ⊗ · · · ⊗ DFTtn ) .
                                  |       {z          }              |       {z          }
                                                 a1                                     an

Then we have                                                                     
                                           F [ω1 ,...,ωn ]        6      2
                                          r Aq                P1−ε /(10Ltn logtn ) ≤ P5ε .

   We also need a slight modification in the proof of Lemma 4.15. We will use the following definitions
from Section 4.

    • x is a sufficiently large integer.

    • l is an integer such that
                                                              g100 (x) ≤ l ≤ g10 (x)
       where gk (x) is defined as in Notation 5.1.

    • N is (l, x)-factorable.

    • multN (S), factN (S) are defined as in Definition 4.11 and the matrix M(S) is defined as in Defini-
      tion 4.13.

Remark 7.23. Note that since x is sufficiently large and N is (l, x)-factorable, gcd(N, q) = 1. This will
be important later on.

Remark 7.24. Note that if γ is a primitive N th root of unity, the matrix M(S) is defined over the extension
Fq [γ] for all subsets S.

Lemma 7.25. Let Fq be a fixed finite field. Let x be sufficiently large and N = q1 q2 · · · ql be an (l, x)-
factorable number with gcd(N, q) = 1 and g100 (x) ≤ l ≤ g10 (x). Let t1 , . . . ,ta be the set of prime powers
at most x0.3 that are relatively prime to q. Let ω1 , . . . , ωa be primitive t1 th , . . . ,ta th roots of unity and let γ
be a primitive N th root of unity. For a subset S ⊂ [l] with |S| = k and M(S) (as defined in Definition 4.13)
a factN (S) × factN (S) matrix, we have
                                                                   
                                 Fq [γ,ω1 ,...,ωa ]     factN (S)
                                rM(S)                       6  0.37
                                                                      ≤ (factN (S))6ε
                                                      exp (ε x )

as long as k ≥ (log x)xC0 +200 .

   6 The ω need not be distinct. We only need F [ω , . . . , ω ] to be an extension that contains all of ω , . . . , ω .
          i                                    q 1            n                                           1           n



                            T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                                          34
                              F OURIER AND C IRCULANT M ATRICES ARE N OT R IGID

Proof Sketch. Without loss of generality S = {1, 2, . . . , k}. Recall that in the proof of Lemma 4.15, we
argued that the matrix M(S) is Zq1 −1 × · · · × Zqk −1 circulant. Then, using the prime factorizations of each
of q1 − 1, . . . , qk − 1, we wrote Zq1 −1 × · · · × Zqk −1 as a direct product of cyclic groups of prime power
order. Since each qi − 1 must factor into prime powers that are at most x0.3 , we argued that some of
these cyclic groups must appear many times in the direct product and then we could apply Lemma 4.10.
Over finite fields, the only necessary change in the proof of Lemma 4.15 is due to the fact that for an
integer b with gcd(b, q) > 1, primitive bth roots of unity do not exist over an extension of Fq . Thus, in
the direct product of cyclic groups of prime power order, we cannot use those factors whose order is not
relatively prime to q (because we cannot diagonalize Zb -circulant matrices when gcd(b, q) > 1). To deal
with this, we will use a more precise bound than (4.1) where prime powers not relatively prime to q are
also excluded from the product on the left hand side.
     Consider the factorizations of q1 − 1, . . . , qk − 1 into prime powers. For each prime power pei i with
pi ≤ x0.3 , let c(pei i ) be the number of indices j for which pei i appears (exactly) in the factorization of
  ei

q j − 1. Also let p be the characteristic of the finite field Fq that we are working over (so q is a power of
p). Note that
                                                                                 2               f
                     (q1 − 1) · · · (qk − 1) = ∏ t c(t) = pc(p) p2c(p ) · · · p f c(p )                   ∏       t c(t) ,
                                                      t                                              gcd(t,p)=1


where t ranges over all prime powers at most x0.3 and p f is the largest power of p that is at most x0.3 . For
a power of p, say pi , let d(pi ) be the number of indices j such that q j − 1 is divisible (not necessarily
exactly divisible) by pi . Let L = b(1000 +C0 ) log p log xc. Then we have

                    2           f                     2              f       L           i   f        i                      1000+C0
         pc(p) p2c(p ) · · · p f c(p ) = pd(p)+d(p )+···+d(p ) ≤ p∑i=1 d(p )+∑i=L+1 d(p ) ≤ pLk+ f x/(log x)
                                                                           1000+C0
                                    ≤ (log x)(1000+C0 )k xx/(log x)                  .

Next, consider all prime powers pei i for which c(pei i ) < x0.62 . These satisfy
                                                                              x0.3
                                                          c(t)       0.3 x0.62           0.92
                                           ∏          t          ≤ (x )              ≤ xx .
                                       t,c(t)≤x0.62


Now without loss of generality, say {t1 , . . . ,tn } is the subset of {t1 , . . . ,ta } (n ≤ a) consisting of the set of
                                                                            c(t )        c(t )
prime powers for which gcd(ti , p) = 1 and c(ti ) ≥ x0.62 . Let P = t1 1 · · ·tn n . From the above we know
that as long as x is sufficiently large
                                                                         εk         
                                                                  x
           factN (S)                                        (log x)C0 +1
 P ≥ x0.92   (1000+C )k      (x)
                                 ≥ (factN (S))(1−ε) · x0.92                          ≥ (factN (S))(1−ε) .
    x (log x)           g
                    0 x 1000                         x (log x)(1000+C0 )k xg1000 (x)

Recall g1000 (x) = x/(log x)1000+C0 is as defined in Notation 5.1. The remainder of the proof can be
completed in the same way as Lemma 4.15 using Lemma 7.22 in place of Lemma 4.10.

    Using the above we can prove the following analogue of Theorem 4.7.

                         T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                                                         35
                                               Z EEV DVIR AND A LLEN L IU

Theorem 7.26. Let Fq be a fixed finite field and 0 < ε < 0.01 be some constant. Let x be sufficiently
large and N = q1 q2 · · · ql be an (l, x)-factorable number with gcd(N, q) = 1 and g100 (x) ≤ l ≤ g10 (x). Let
t1 , . . . ,ta be the set of prime powers at most x0.3 that are relatively prime to q. Let ω1 , . . . , ωa be primitive
t1 th , . . . ,ta th roots of unity and let γ be a primitive N th root of unity. Then
                                                                                    
                                         Fq [γ,ω1 ,...,ωn ]             N
                                        rDFT                                           ≤ N 7ε .
                                              N               exp (ε 6 (log N)0.36 )

Proof. As in the proof of Theorem 4.7, we can subdivide the matrix DFTN into submatrices of the form
M(S) for various subsets S ⊂ [l] using Lemma 4.14 (it is easily verified that Lemma 4.14 also holds
over finite fields). We can remove all of the rows and columns corresponding to integers divisible by
too many of the primes q1 , . . . , ql because the contribution of these rows and columns is low-rank. The
remaining entries can be subdivided into matrices of the form M(S) where |S| is sufficiently large so we
can then apply Lemma 7.25 to change a small number of entries in each row and column to reduce the
rank significantly. The precise computations are exactly the same as in Theorem 4.7.

   We will now combine Theorem 7.26 with Lemma 7.20 to get our main theorem for circulant matrices
over finite fields.

Theorem 7.27. Let 0 < ε < 0.01 be a given parameter and Fq be a fixed finite field. For all sufficiently
large N, if M is an N × N circulant or Toeplitz matrix then
                                                              
                                                  N
                                                                 ≤ N 15ε .
                                   Fq
                                  rM
                                        exp (ε 6 (log N)0.35 )

Proof. First we analyze circulant matrices of size N0 where N0 is (l, x)-factorable for some

                                                   g100 (x) ≤ l ≤ g10 (x) .

Note that as long as x is sufficiently large, N0 must be relatively prime to q. Since DFT matrices
diagonalize circulant matrices (even over finite fields), Theorem 7.26 and Lemma 2.21 imply that for M0 ,
an N0 × N0 circulant matrix where N0 satisfies the previously mentioned conditions,
                                                                             
                                Fq [γ,ω1 ,...,ωa ]             2N0
                              rM0                                               ≤ N014ε
                                                     exp (ε 6 (log N0 )0.36 )

where γ is a primitive N th root of unity and ω1 , . . . , ωa are primitive t1 th , . . . ,ta th roots of unity for t1 , . . . ,ta
being the set of prime powers at most x0.3 that are relatively prime to q. Now we analyze the degree of
the extension Fq [γ, ω1 , . . . , ωa ]. Note Fq [γ, ω1 , . . . , ωa ] ⊂ Fq [η] where η is a primitive root of unity of
order C = N0 lcm(t1 ,t2 , . . .ta ). By Fact 7.17, the degree of the extension Fq [η] is the order of q modulo
C. Since N0 factors intoa product of distinct x-good primes, by Fermat’s little theorem, the order of q
modulo N0 divides x0.3 !. Also, all prime powers dividing lcm(t1 ,t2 , . . . ,ta ) are at most x0.3 . Thus, the
order of q modulo C divides x0.3 !. Overall, the order of q mod C is at most
                                                        0.3
                                          x0.3 ! < x0.3x ≤ exp (log N0 )0.31 .
                                                                           


                           T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                                               36
                           F OURIER AND C IRCULANT M ATRICES ARE N OT R IGID

Thus the degree of the extension Fq [γ, ω1 , . . . , ωa ] is at most exp (log N0 )0.31 . By Lemma 7.20
                                                                                      

                                                                       
                                                       N0
                                                                            ≤ N014ε .
                                    F
                                   rMq0
                                              exp (ε (log N0 )0.359 )
                                                    6


To complete the proof, we can simply repeat the arguments in the proof of Theorem 5.3. For a circulant
matrix M of arbitrary size N × N, note that it is possible to embed an M in the upper left corner of a
circulant matrix of any size at least 2N. By Lemma 5.2, there exists an N0 that is (l, x)-factorable for
some g100 (x) ≤ l ≤ g10 (x) such that
                                             N0            N0
                                                   2
                                                     ≤N≤       .
                                          (log N0 )         2
We deduce                                                              
                                                       N0
                                                                            ≤ N014ε .
                                     F
                                    rMq
                                              exp (ε (log N0 )0.359 )
                                                    6

Rewriting the bounds in terms of N we get
                                                             
                                                 N
                                                                ≤ N 15ε .
                                  Fq
                                 rM
                                       exp (ε 6 (log N)0.35 )

8 G-circulant matrices over finite fields
We will now generalize Theorem 6.2 to matrices over a finite field Fq except we will require the additional
condition that gcd(|G|, q) = 1. Write the underlying abelian group G as a direct product of cyclic groups
Zn1 × · · · × Zna . While for matrices with entries in C, it sufficed to work with the Kronecker product of the
DFT matrices DFTn1 ⊗ · · · ⊗ DFTna , we require slightly different techniques for rigidity over a fixed finite
field as an extension containing all of the necessary roots of unity could have too high degree. Instead of
working through DFT matrices, we will work directly with the G-circulant matrices themselves.
     While for sufficiently large cyclic groups, we did not require the condition that gcd(|G|, q) = 1 (see
Theorem 7.27), we require the condition for general abelian groups because we need to use Theorem 7.21
to deal with the case when G contains the direct product of many copies of a small cyclic group.
In particular, our techniques do not handle a group such as Z p2 × · · · × Z p2 where p is equal to the
characteristic of the field Fq . It is an interesting open question to see if the condition that gcd(|G|, q) = 1
can be eliminated. The work in [7] deals with the case where q = pa for a prime p and G is a direct
product of many cyclic groups of order p but not the case when G is a direct product of many cyclic
groups of order p2 (or some other power of p).
     The first important observation is that Theorem 5.3 can be slightly strengthened so that to reduce
the rank of any circulant matrices, the locations to be changed are fixed and the changes are fixed linear
combinations of the entries of the circulant matrix. More precisely, we make the following definition.

Definition 8.1. Given a group G of order |G| = n, we say G is (r, s)-reducible over Fq if the following
condition holds. There exist

    • a set S ⊂ [n] × [n] of positions where S contains at most s positions in each row and column,

                       T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                                 37
                                                Z EEV DVIR AND A LLEN L IU

    • matrices A, B ∈ Fn×n
                       q   where rank(A), rank(B) ≤ r,

    • matrices E1 , . . . , En ∈ Fn×n
                                  q   with all nonzero entries in S, and

    • matrices Y1 , . . . ,Yn , Z1 , . . . , Zn ∈ Fqn×n

such that for any G-circulant matrix M with top row (x1 , . . . , xn ), we have

                    M = A(x1Y1 + · · · + xnYn ) + (x1 Z1 + · · · + xn Zn )B + (x1 E1 + · · · + xn En ) .

      In such a decomposition, the matrices A, B will be called (r, s)-reduction matrices and the matrices
Y1 , . . . ,Yn , Z1 , . . . , Zn , E1 , . . . , En will be called (r, s)-reduction helpers. We write YM = x1Y1 + · · · + xnYn
and similarly for ZM and EM .

    Following the proof of Theorem 5.3, we can show that ZN is
                                                                  
                                               N               15ε
                                                            ,N
                                     exp (ε 6 (log N)0.35 )

reducible over C. We now prove an analogue of this result for finite fields.

Claim 8.2. For fixed 0 < ε < 0.01 and all sufficiently large N, the group ZN is
                                                                   
                                                N               15ε
                                                             ,N
                                      exp (ε 6 (log N)0.35 )

reducible over Fq .

Proof. First consider an integer N0 that is (l, x)-factorable for some g100 (x) ≤ l ≤ g10 (x). As long as x
is sufficiently large, gcd(N0 , q) = 1. Let M0 be a N0 × N0 circulant matrix (i. e., a G-circulant matrix for
G = ZN0 ) over Fq and let the entries in its top row be x1 , . . . , xN0 . Let γ be a primitive N0 th root of unity
and t1 , . . .tn be the set of prime powers at most x0.3 that are relatively prime to q. Let ω1 , . . . , ωn be roots
of unity of order t1 , . . . ,tn respectively. By Theorem 7.26, there exists a matrix E over Fq [γ, ω1 , . . . , ωn ]
with at most N07ε nonzero entries in each row and column such that

                                                                        N0
                                       rank(DFTN0 −E) ≤                                .
                                                              exp (ε 6 (log N0 )0.36 )

Now write

               M0 = DFT∗N0 ·D · DFTN0 = (DFTN0 −E)∗ D · DFTN0 +E ∗ D(DFTN0 −E) + E ∗ DE

where D is a diagonal matrix whose entries are linear combinations of x1 , . . . , xN0 . Note that all of the
above matrices have entries contained in Fq [γ, ω1 , . . . , ωn ] ⊆ Fq [η] where η is a primitive root of unity of
order C = N0 lcm(t1 , . . . ,tn ). As argued before in the proof of Theorem 7.27, the degree of the extension
is at most exp (log N0 )0.31 . Let the conjugates of η be η1 = η, η2 , . . . , ηm . Let DFT1N0 , . . . , DFTm
                                                                                                            N0 be


                           T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                                          38
                            F OURIER AND C IRCULANT M ATRICES ARE N OT R IGID

obtained by taking DFTN0 and replacing η with its conjugates. Define D1 , . . . , Dm , E 1 , . . . , E m similarly.
As in the proof of Lemma 7.20, there exists an integer k such that η1k + · · · + ηmk 6= 0. We now have
                                                                                                               !
                           m                                                 m                      m
           1                                                                       i∗                     i∗
 M0 = k
     η1 + · · · + ηmk
                           ∑ ηik (DFTiN −E i )∗ Di · DFTiN + ∑ ηik E Di (DFTiN −E i ) + ∑ ηik E Di E i
                                            0                        0                      0
                                                                                                                   .
                           i=1                                               i=1                    i=1


Note that 1/(η1k + · · · + ηmk ) ∈ Fq and all three of the sums are matrices whose entries are linear combina-
tions of x1 , . . . , xN0 with coefficients in Fq . The last term satisfies the desired sparsity constraint as it has
at most N014ε nonzero entries in each row and column and the locations of these entries are independent
of M0 .
    It remains to argue that the first two terms satisfy the desired rank constraint. Note that the span of
the columns of (DFT1N0 −E 1 ), . . . , (DFTm          m
                                                N0 −E ) has dimension at most

                                           mN0                 N0
                                                        ≤
                                   exp (ε (log N0 ) ) exp (ε (log N0 )0.359 )
                                         6         0.36     6


over Fq [η]N0 . Therefore, the dimension of the intersection of this subspace with FNq 0 , say V , has dimension
at most
                                                      N0
                                                                     .
                                           exp (ε 6 (log N0 )0.359 )
In particular we can write
                               m
                            ∑ ηik (DFTiN −E i )∗ Di · DFTiN = x1C1 + · · · + xN CN
                                                0                        0              0       0
                            i=1

for some fixed matrices C1 , . . . ,CN0 with entries in Fq . Also all columns of C1 , . . . ,CN0 must be in V so
each can be written as AYi where A is a fixed matrix with rank at most
                                                            mN0
                                                                           .
                                                    exp (ε (log N0 )0.36 )
                                                          6


Thus there exists fixed matrices Y1 , . . . ,YN0 ∈ Fn×n
                                                    q   and a matrix A satisfying the desired rank constraint
such that
                           m
                          ∑ ηik (DFTiN −E i )∗ Di · DFTiN = A(x1Y1 + · · · + xN YN ) .
                                        0                        0                      0       0
                          i=1

A similar argument shows that the second term can also be written in the desired form.
    Now to extend to arbitrary N (not necessarily (l, x)-factorable), simply note that any circulant matrix
of size N can be embedded into a circulant matrix of any given size at least 2N where each entry of the
larger matrix is equal to some entry of the original matrix. We can then apply Lemma 5.2 and complete
the proof in the same way as Theorem 5.3.

   Claim 8.2 allows us to deal with large cyclic groups. We will also need a way of dealing with a direct
product of many copies of a small cyclic group.

                        T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                                      39
                                          Z EEV DVIR AND A LLEN L IU

Claim 8.3. Let Fq be a finite field and N = d n for positive integers d, n, q with gcd(d, q) = 1. Let
0 < ε < 0.01 and assume n ≥ 2/ψ where

                                                        ε2
                                           ψ=                       .
                                              400 log2 (1/ε)d log d

Let G = Zd ⊗ · · · ⊗ Zd . Then G is N 1−ψ/2 , N 2ε reducible over Fq .
                                                  
        |    {z       }
                n

Proof. Let ω be a primitive d th root of unity in some extension of Fq . We use Theorem 7.21 to find a
sparse matrix E with entries in Fq [ω] such that Hd,n − E has low rank where

                                          Hd,n = DFTd ⊗ · · · ⊗ DFTd .
                                                 |      {z         }
                                                                    n

We can then repeat the same argument as in the proof of Claim 8.2, using the fact that Hd,n diagonalizes
any G-circulant matrix.

    Now we introduce the main technical result of this section that allows us to deal with direct products
of different groups without going through the corresponding DFT matrices.

Claim 8.4. Consider a list of abelian groups, G1 , . . . , Ga , such that |Gi | = ni . Assume for each 1 ≤ i ≤ a ,
Gi is (ri , si )-reducible over Fq . Let G = G1 × · · · × Ga and |G| = n = n1 n2 · · · na . Then the group G is
(r, s)-reducible over Fq where
                                                               √
                                        r=       ∑         2l ∏ ri ni ∏ ni0 ,
                                             S⊂[a],|S|=l      i∈S             i0 ∈S
                                                                                 /

                                           s=        ∑        2|S| ∏ ni ∏ si0 .
                                                S⊂[a],|S|<l             i∈S   i0 ∈S
                                                                                 /

Proof. Let M be a G-circulant matrix. For each 1 ≤ i ≤ a, let Ai , Bi be the (ri , si )-reduction matrices for
the group Gi . Note that the reducibility assumption means that any Gi -circulant matrix can be written as
a sum of three matrices where the first contains a fixed high dimensional subspace in its left nullspace,
the second contains a fixed high dimensional subspace in its right nullspace, and the third is sparse. The
first step in our proof will involve writing M as a sum of 3a matrices. Roughly, each of these 3a matrices
corresponds to choosing one of the three possible components (large left nullspace, large right nullspace,
or sparse) for each of the groups Gi .
     More formally, for each i ∈ [a], consider the group Gi . Let its (ri , si )-reduction helpers be {Ygi }, {Zgi },
{Egi } for gi ∈ Gi . Let Y (gi ) = AiYgi , Z(gi ) = Zgi Bi and E(gi ) = Egi . By definition, for a G-circulant
matrix MGi with top row given by {xg } for g ∈ Gi ,

                                    MGi = ∑ (Y (gi ) + Z(gi ) + E(gi ))xgi .
                                             gi ∈G

Thus, for fixed gi , hi , ki ∈ Gi , the entry of Y (gi ) + Z(gi ) + E(gi ) indexed by (hi , ki ) is equal to 1 if
hi + ki = gi (in the group Gi ) and 0 otherwise.

                        T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                                     40
                                 F OURIER AND C IRCULANT M ATRICES ARE N OT R IGID

    We index the rows and columns of M with ordered tuples (h1 , . . . , ha ) and (k1 , . . . , ka ) respectively
(where hi , ki ∈ Gi ). Let the entries in the top row of M be xg1 ,...,ga where (g1 , . . . , ga ) ranges over
G1 × · · · × Ga . Now for each ordered tuple I = (i1 , . . . , ia ) ∈ {1, 2, 3}a , we will construct a |G| × |G|
matrix MI . Let S1 (I), S2 (I), S3 (I) ⊂ [a] denote the subsets of locations where the entry of I is 1, 2 or 3
respectively. We define
                                                                                         
                                                           O                            O                          O
                   MI =       ∑           xg1 ,...,ga               Y (gi ) ⊗                 Z(gi ) ⊗                  E(gi )
                          (g1 ,...,ga )                   i∈S1 (I)                    i∈S2 (I)                    i∈S3 (I)

where the sum is over all (g1 , . . . , ga ) ∈ G1 × · · · × Ga . The first important observation is that

                                                                M=         ∑          MI .                                             (8.1)
                                                                        I∈{1,2,3}a

To see this, it suffices to note that the coefficients of xg1 ,...,ga on the right hand side for fixed g1 , . . . , ga are
given by the matrix
                                                                                     
                                               O                           O                           O
                             ∑                          Y (gi ) ⊗                Z(gi ) ⊗                   E(gi ) =
                         I∈{1,2,3}a           i∈S1 (I)                   i∈S2 (I)                     i∈S3 (I)
                                                                                    O
                                                                                            (Y (gi ) + Z(gi ) + E(gi )) .
                                                                                    i∈[a]

The entry indexed by (h1 , . . . , ha ) and (k1 , . . . , ka ) on the right hand side is equal to 1 if hi + ki = gi for all
i and 0 otherwise. This completes the proof of (8.1).
    We would like to write M as a sum of three matrices, say P1 , P2 , P3 , whose entries are linear forms
in the variables xg1 ,...,ga and such that P1 = AY, P2 = ZB for some fixed low-rank matrices A, B and P3 is
sparse. Write
                                         M=           ∑ MI + ∑ MI .
                                                             I∈{1,2,3}a              I∈{1,2,3}a
                                                            |S3 (I)|≤a−l            |S3 (I)|>a−l

We will prove that P1 , P2 can be obtained by splitting the first sum and we can set P3 to be equal to the
second sum. For each 1 ≤ i ≤ a, there exists a set of linearly independent vectors vi1 , . . . , vini −ri such that
vij Ai = 0 and a set of linearly independent vectors ui1 , . . . , uini −ri such that Bi uij = 0 for all 1 ≤ j ≤ ni − ri .
We can complete the set {vi1 , . . . , vini −ri } to a basis {vi1 , . . . , vini } and similar for {ui1 , . . . , uini }. Consider
the basis of Fnq consisting of the vectors v1j1 ⊗ v2j2 ⊗ · · · ⊗ vaja where ( j1 , . . . , ja ) ∈ [n1 ] × · · · × [na ]. Now
assume we are given a matrix MI with I ∈ {1, 2, 3}a . The key observation is that if ji ≤ ni − ri for some
index i ∈ S1 (I), then
                                              v1j1 ⊗ v2j2 ⊗ · · · ⊗ vaja MI = 0 .
                                                                        
                                                                                                                             (8.2)
This is because Y (gi ) = AiYgi so by construction, viji Y (gi ) = 0 for all gi ∈ Gi . Using this observation and
examining the definition of MI , we immediately get (8.2). Similarly, we get that if ji ≤ ni − ri for some
index i ∈ S2 (I) then
                                       MI u1j1 ⊗ u2j2 ⊗ · · · ⊗ uaja = 0 .
                                                                    


                           T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                                                         41
                                                      Z EEV DVIR AND A LLEN L IU

Let R1 ⊂ {1, 2, 3}a be the set of ordered tuples I such that
                                                                       ri                  ri
                                                             ∏ ni ≤ ∏ ni .
                                                           i∈S1 (I)             i∈S2 (I)

Let R2 = {1, 2, 3}a \R1 . We now write

                                               ∑          MI =          ∑          MI +               ∑        MI
                                            I∈{1,2,3}a                I∈R1                           I∈R2
                                           |S3 (I)|≤a−l          |S3 (I)|≤a−l                   |S3 (I)|≤a−l


and will argue that the first term, which we call P1 , has a fixed, high dimensional subspace contained in its
left nullspace while the second term, which we call P2 , has a fixed, high dimensional subspace contained
in its right nullspace. We work with the basis v1j1 ⊗ v2j2 ⊗ · · · ⊗ vaja where ( j1 , . . . , ja ) ∈ [n1 ] × · · · × [na ]
and count the number of these basis vectors that are not in the left nullspace of P1 . For each I ∈ R1 , by
(8.2), MI contributes at most
                                              ∏ ri ∏ ni
                                                            i∈S1 (I)        i∈[a]\S1 (I)

basis vectors for which vMI is nonzero. Furthermore if S1 (I) ⊂ S1 (I 0 ) for two distinct ordered tuples
I and I 0 , the contributions of MI 0 are redundant with the contributions of MI . Thus, we can ignore the
contributions of MI for ordered tuples I for which |S3 (I)| < a − l. Overall, the number of basis vectors
outside the left nullspace of P1 is at most
                                                                               √                                             √
        ∑          ∏ ri ∏                  ni ≤       ∑           ∏             ri ni ∏ ni =                    ∑        2l ∏ ri ni ∏ ni0   (8.3)
         I∈R1    i∈S1 (I)   i∈[a]\S1 (I)           I∈{1,2,3}a i∈[a]\S3 (I)                 i∈S3 (I)        S⊂[a],|S|=l     i∈S     i0 ∈S
                                                                                                                                      /
    |S3 (I)|=a−l                                  |S3 (I)|=a−l

where to obtain the above inequality, we first used the fact that I ∈ R1 and then used that for a fixed set
S3 (I), there are 2l possible ordered tuples I. Note that the basis v1j1 ⊗ v2j2 ⊗ · · · ⊗ vaja where ( j1 , . . . , ja ) ∈
[n1 ] × · · · × [na ] is fixed (i. e., independent of the entries of M). Thus we can write P1 = AX where the
entries of X are linear forms in the entries of M and A is a fixed matrix with rank bounded above by the
expression in (8.3). A similar argument allows us to write P2 = Y B for a fixed matrix B with the desired
rank.
      Now it remains to bound the sparsity of

                                                                       ∑          MI .
                                                                 I∈{1,2,3}a
                                                                |S3 (I)|>a−l

We claim that the number of nonzero entries in each row and column of MI is at most

                                                              ∏ si ∏                       ni .
                                                            i∈S3 (I)        i∈[a]\S3 (I)

To see this, note that for each i, the matrices E(gi ), as gi ranges over all of Gi , have all of their nonzeros
contained in a fixed set Si where Si contains at most si distinct locations in each row and column. For

                             T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                                                            42
                               F OURIER AND C IRCULANT M ATRICES ARE N OT R IGID

                                                           3
each fixed subset S3 (I), there are exactly 2|S (I)| possible ordered tuples I. Thus the number of nonzero
entries in each row and column of the sum is at most

                                 ∑          ∏ si ∏                   ni =      ∑          2|S| ∏ si ∏ ni .
                             I∈{1,2,3}a    i∈S3 (I)   i∈[a]\S3 (I)          S⊂[a],|S|<l      i∈[a]\S   i∈S
                            |S3 (I)|>a−l

This completes the proof that the group G is (r, s)-reducible over Fq .

    We are now ready to prove the main theorem about rigidity of G-circulant matrices over finite fields.
Theorem 8.5. Let Fq be a fixed finite field and ε < 0.01 be a fixed constant. Let G be an abelian group.
As long as |G| is sufficiently large and gcd(|G|, q) = 1, for any G-circulant matrix M over Fq , we have
                                                                 
                                                   |G|
                                                                    ≤ |G|100ε .
                                    Fq
                                   rM
                                         exp (ε 20 (log |G|)0.3 )
Proof. By the Fundamental Theorem of Finite Abelian Groups we can write G = Zn1 × · · · × Zna . The
proof will essentially follow the same method as the proof of Theorem 6.1 except using Claim 8.4 to deal
with direct products of cyclic groups that are roughly the same order.
    Without loss of generality, n1 ≤ n2 ≤ · · · ≤ na . We will choose k to be a fixed, sufficiently large positive
                                                                                                         j−1  j
integer (possibly depending on q, ε). Consider the ranges I1 = [k, k2 ), I2 = [k2 , k4 ), . . . I j = [k2 , k2 ) . . .
and so on. Let S j be a multiset defined by S j = I j ∩ {n1 , . . . , na }. Fix a j and let the elements of S j be
x1 ≤ · · · ≤ xb . By Claim 8.2, we have that (since k sufficiently large) for each xi , the group Zxi is
                                                                        
                                                      xi             15ε
                                                                  ,x                                          (8.4)
                                          exp (ε 6 (log xi )0.35 ) i
reducible over Fq . Now we will use Claim 8.4 to argue about the group G j = Zx1 × · · · × Zxb . Set l = dεbe
in Claim 8.4. We get that G j = Zx1 × · · · × Zxb is (r, s) reducible for some r, s that are obtained by plugging
(8.4) into the expressions in Claim 8.4. We bound r and s more carefully below. We have

                                  2dεbe x1 · · · xb            (2b)dεbe
                       
                      b                                                            x1 · · · xb
              r≤                                            ≤
                    dεbe (exp (ε 6 (log x1 )0.35 ))dεbe/2        εb dεbe
                                                               (3)       (exp (ε (log x1 )0.35 ))
                                                                                6                dεbe/2
                                                                                              dεbe/2
                                                                               36
                                                     = x1 · · · xb                                     .
                                                                     ε exp (ε (log x1 )0.35 )
                                                                      2       6


    As long as k is sufficiently large, we have
                                                      dεbe/2                                         dεbe/2
                                      36                                                  1
        r ≤ x1 · · · xb                                        ≤ x1 · · · xb
                          ε 2 exp (ε 6 (log x1 )0.35 )                         exp (ε 6 (log x1 )0.34 )
                                                                                            x1 · · · xb
                                                                              ≤
                                                                                 exp (ε (log x1 · · · xb )0.33 )
                                                                                         7


where in the last step we used the fact that xi ≤ x12 for all i. Next we bound the sparsity s. We have

      s ≤ 4b xb · · · xb−bεbc+1 (xb−bεbc · · · x1 )15ε = 4b (x1 · · · xb )15ε (xb · · · xb−bεbc+1 )1−15ε ≤ (x1 · · · xb )18ε

                           T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                                               43
                                           Z EEV DVIR AND A LLEN L IU

where in the last step above, we used the fact that xi ≤ x12 for all i. Thus, we have shown that the group
G j = Zx1 × · · · × Zxb is
                                                                                      
                                           x1 · · · xb                             18ε
                                                                   , (x1 · · · xb )
                                 exp (ε 7 (log x1 · · · xb )0.33 )
reducible over Fq .
    The above allows us to deal with direct products of large cyclic groups that are all of roughly the
same order. In the direct product G = Zn1 × · · · × Zna , we will split the terms into cyclic groups of small
order, which can be dealt with using Claim 8.3, and several products of large cyclic groups that can each
be dealt with using the above. We now formalize this argument. For each integer c between 2 and k with
gcd(c, q) = 1, let mc be the number of copies of c in the set {n1 , . . . , na }. If mc ≥ k2 (log k)2 /ε 4 then by
Claim 8.3, the group Zc × · · · × Zc is
                      |     {z     }
                             mc

                                                    4   2       2
                                                                             
                                             cmc (1−ε /(k (log k) ) , c2mc ε

reducible over Fq . Let L = d2 log log |G|e and ensure that |G| is sufficiently large so that L > k. Let T be
the set of integers c between 2 and k with gcd(c, q) = 1 such that cmc ≥ |G|ε/(2L) . Note that as long as
|G| is sufficiently large, all elements of T must satisfy mc ≥ k2 (log k)2 /ε 4 . Let R be the set of indices j
for which ∏x∈S j x ≥ |G|ε/(2L) . Note that S j is clearly empty for j ≥ L. Recall that gcd(|G|, q) = 1 so the
group G can be written as
                                                                                              !
                            G=
                              
                                      ×
                                      2≤c<k
                                                 Zc × · · · × Zc 
                                                  |    {z       } 
                                                                     ×            ×G
                                                                                 1≤ j≤L
                                                                                              j       .
                                                          mc
                                    gcd(c,q)=1

    Define
                                                                                     

                                       ×Z| × ·{z· · × Z} × 
                                                                              O
                                  B=               c          c                    G j .
                                          c∈T
                                           /             mc                   j∈R
                                                                               /


    Note that
                                                         k+L
                                          |B| ≤ |G|ε/(2L)      ≤ |G|ε .

Also G = B × D where
                                                                                      !
                                  D=     ×Z| ⊗ ·{z· · ⊗ Z} × ×G
                                          c∈T
                                                     c          c
                                                                              j∈R
                                                                                      j       .
                                                         mc

Now we apply Claim 8.4 again on D where we view D as a direct product of groups of the form

                        T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                                  44
                             F OURIER AND C IRCULANT M ATRICES ARE N OT R IGID

Z ⊗ · · · ⊗ Zc and G j and we set l = 1. We get that D is (rD , sD ) reducible over Fq where
| c {z       }
     mc
                                                                                
                                1                              1                              |D|
      rD ≤ 2|D|  ∑           4    2         +∑                                ≤        8
                                                                                                             ,
                   c∈T   cmc ε /(2k log k)   j∈R exp 0.5ε 7 (log ∏x∈S j x)0.33      exp (ε   (log |G|)0.32 )

      sD ≤ |D|18ε .

Finally, note that the group B is trivially (0, |B|) reducible over Fq . Thus, by Claim 8.4, G = B × D is
(rG , sG ) reducible over Fq for

                                           2|D| · |B|                     |G|
                            rG ≤                               ≤                         ,
                                    exp (0.5ε 8 (log |G|)0.32 ) exp (ε 8 (log |G|)0.31 )
                            sG ≤ |B| · |D|18ε ≤ |G|19ε .

The above immediately implies that for any G-circulant matrix M with gcd(|G|, q) = 1,
                                                             
                                               |G|
                                                                ≤ |G|100ε .
                                Fq
                               rM
                                     exp (ε 20 (log |G|)0.3 )
This completes the proof.



9    Final remarks and open questions
Our main results, Theorems 6.2, 6.3, and 8.5, naturally raise some open questions. Recall that the N × N
DFT (Discrete Fourier Transform) matrix is the matrix (ω i j ) (i, j = 0, . . . , N − 1) where ω is a primitive
N th root of unity.

    • Are the N × N DFT matrices rigid over the N th cyclotomic field Q[ω] (where ω is a primitive N th
      root of unity) ? (Compare this question with Theorem 6.3.)

    • Do there exist circulant matrices, or G-circulant matrices for some class of abelian groups G, that
      are rigid over Q ? (Again, compare with Theorem 6.3.)

    • Does there exist a finite field Fq and G-circulant matrices for some class of abelian groups G with
      gcd(|G|, q) > 1 that are rigid over Fq ? (Compare this question with Theorem 8.5.)

    • Do there exist rigid G-circulant matrices over C for some class of (necessarily non-abelian) groups
      G?

    When G is non-abelian, it is no longer possible to simultaneously diagonalize the matrices MG ( f ) for
all f but there is a change of basis matrix A such that AMG ( f )A∗ is block-diagonal where the diagonal
blocks correspond to the irreducible representations of G. When all of the irreducible representations
of G have small degree (dimension), it may be possible to use similar techniques to the ones used here.

                         T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                                   45
                                      Z EEV DVIR AND A LLEN L IU

On the other hand, this suggests that perhaps MG ( f ) is a candidate for rigidity when all irreducible
representations of G have large degree. Frobenius proved in 1896 that the group SL2 (F p ) of 2 × 2
matrices over F p with determinant 1 has no nontrivial irreducible representations of degree less than
(p − 1)/2 over C and thus has highly nonabelian structure [10]. (See [6] for an accessible presentation.)
Thus we make the following conjecture.

Conjecture 9.1. For large primes p, a random G-circulant (0, 1)-matrix MG ( f ) for G = SL2 (F p ) is
Valiant-rigid over C with high probability. Here by “random” we mean the function f : SL2 (F p ) → {0, 1}
is chosen randomly.


Acknowledgments
We thank Lajos Rónyai for suggesting the literature references cited in Section 7.1. We would like to
thank Laci Babai and the editorial team at ToC for many important comments.


References
 [1] J OSH A LMAN AND L IJIE C HEN: Efficient construction of rigid matrices using an NP oracle. In
     Proc. 60th FOCS, pp. 1034–1055. IEEE Comp. Soc., 2019. [doi:10.1109/FOCS.2019.00067] 3

 [2] J OSH A LMAN AND RYAN W ILLIAMS: Probabilistic rank and matrix rigidity. In Proc. 49th STOC,
     pp. 17:1–17:23. ACM Press, 2017. [doi:10.1145/3055399.3055484, arXiv:1611.05558] 3, 4, 6, 8

 [3] R ICHARD A RRATIA AND L OUIS G ORDON: Tutorial on large deviations for the binomial distribu-
     tion. Bull. Mathematical Biology, 51(1):125–131, 1989. [doi:10.1007/BF02458840] 16

 [4] L ÁSZLÓ BABAI AND B OHDAN K IVVA: Matrix rigidity: More conjectures refuted. In preparation.
     5

 [5] ROGER C. BAKER AND G LYN H ARMAN: Shifted primes without large prime factors. Acta
     Arithmetica, 83(4):331–361, 1998. [doi:10.4064/aa-83-4-331-361] 7, 17

 [6] G IULIANA DAVIDOFF , P ETER S ARNAK , AND A LAIN VALETTE: Elementary Number Theory,
     Group Theory, and Ramanujan Graphs. Volume 55 of LMS Student Texts. Cambridge Univ. Press,
     2003. [doi:10.1017/CBO9780511615825] 46

 [7] Z EEV DVIR AND B ENJAMIN E DELMAN: Matrix rigidity and the Croot-Lev-Pach lemma. Theory of
     Computing, 15(8):1–7, 2019. [doi:10.4086/toc.2019.v015a008, arXiv:1708.01646] 3, 4, 6, 8, 14, 37

 [8] Z EEV DVIR AND A LLEN L IU: Fourier and circulant matrices are not rigid. In 34th Computational
     Complexity Conference (CCC 2019). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2019.
     [doi:10.4230/LIPIcs.CCC.2019.17]

 [9] J OEL F RIEDMAN: A note on matrix rigidity.               Combinatorica, 13(2):235–239, 1993.
     [doi:10.1007/BF01303207] 3

                      T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                           46
                         F OURIER AND C IRCULANT M ATRICES ARE N OT R IGID

[10] F ERDINAND G EORG F ROBENIUS: Über Gruppencharaktere. Sitzungsberichte der Preußischen
     Akademie der Wissenschaften zu Berlin, 39:985–1021, 1896. 46

[11] O DED G OLDREICH AND AVISHAY TAL: Matrix rigidity of random Toeplitz matrices. Comput.
     Complexity, 27(2):305–350, 2018. Preliminary version in STOC’16. [doi:10.1007/s00037-016-
     0144-9] 3, 7

[12] A BHINAV K UMAR , S ATYANARAYANA V. L OKAM , V IJAY M. PATANKAR , AND M. N. JAYALAL
     S ARMA: Using elimination theory to construct rigid matrices. Comput. Complexity, 23(4):531–563,
     2014. [doi:10.1007/s00037-013-0061-0, arXiv:0910.5301] 3

[13] S ERGE L ANG: Algebra. Volume 211 of Grad. Texts in Math. Springer, 3rd edition, 1996. 30, 31

[14] RUDOLF L IDL AND H ARALD N IEDERREITER: Finite Fields. Encycl. Math. Appl. Cambridge
     Univ. Press, 2nd edition, 1996. [doi:10.1017/CBO9780511525926] 30, 32

[15] S ATYANARAYANA V. L OKAM: On the rigidity of Vandermonde matrices. Theoret. Comput. Sci.,
     237(1–2):477–483, 2000. [doi:10.1016/S0304-3975(00)00008-6] 3

[16] S ATYANARAYANA V. L OKAM: Quadratic lower bounds on matrix rigidity. In Internat. Conf.
     on Theory and Appl. of Models of Computation (TAMC’06), pp. 295–307. Springer, 2006.
     [doi:10.1007/11750321_28] 3

[17] S ATYANARAYANA V. L OKAM: Complexity lower bounds using linear algebra. Foundations and
     Trends in Theoretical Computer Science, 4(1–2):1–155, 2009. [doi:10.1561/0400000011] 3

[18] M OHAMMAD A MIN S HOKROLLAHI , DANIEL A. S PIELMAN , AND VOLKER S TEMANN: A remark
     on matrix rigidity. Inform. Process. Lett., 64(6):283–285, 1997. [doi:10.1016/S0020-0190(97)00190-
     7] 3

[19] L ESLIE G. VALIANT: Graph-theoretic arguments in low-level complexity. In Math. Found. Comp.
     Sci. (MFCS’77), pp. 162–176. Springer, 1977. [doi:10.1007/3-540-08353-7_135] 3, 11


AUTHORS

      Zeev Dvir
      Associate professor
      Department of Mathematics and
      Department of Computer Science
      Princeton University
      Princeton, NJ, USA
      zdvir princeton edu
      https://www.cs.princeton.edu/~zdvir




                     T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                          47
                                   Z EEV DVIR AND A LLEN L IU

    Allen Liu
    Student
    Department of Mathematics
    Massachusetts Institute of Technology
    Cambridge, MA, USA
    cliu568 mit edu



ABOUT THE AUTHORS

    Z EEV DVIR was born in Jerusalem, Israel. He received his Ph. D. from the Weizmann
       Institute in Israel in 2008. His advisors were Ran Raz and Amir Shpilka. He has a broad
       interest in theoretical computer science and mathematics and especially in computational
       complexity, pseudorandomness, coding theory and discrete mathematics.


    A LLEN L IU was born in Rochester, NY. He is currently a fourth-year undergraduate student
       in mathematics at the Massachusetts Institute of Technology. His research interests
       include computational complexity and learning theory.




                   T HEORY OF C OMPUTING, Volume 16 (20), 2020, pp. 1–48                          48