The Shifted Partial Derivative Complexity of Elementary Symmetric Polynomials

Authors Herv{\'e} Fournier, Nutan Limaye, Meena Mahajan, Srikanth Srinivasan,
Plaintext
                            T HEORY OF C OMPUTING, Volume 13 (9), 2017, pp. 1–34
                                         www.theoryofcomputing.org




The Shifted Partial Derivative Complexity of
    Elementary Symmetric Polynomials
                Hervé Fournier∗ †                   Nutan Limaye∗              Meena Mahajan∗
                                                 Srikanth Srinivasan∗
            Received September 29, 2015; Revised September 1, 2016; Published September 25, 2017




       Abstract: We continue the study of the shifted partial derivative measure, introduced by
       Kayal (ECCC 2012), which has been used to prove many strong depth-4 circuit lower bounds
       starting from the work of Kayal, and that of Gupta et al. (CCC 2013).
           We show a strong lower bound on the dimension of the shifted partial derivative space of
       the elementary symmetric polynomials of degree d in N variables for d < lg N/ lg lg N. This
       extends the work of Nisan and Wigderson (Computational Complexity 1997), who studied
       the partial derivative space of these polynomials. Prior to our work, there have been no
       results on the shifted partial derivative measure of these polynomials.
           Our result implies a strong lower bound for elementary symmetric polynomials in the
       homogeneous ΣΠΣΠ model with bounded bottom fan-in. This strengthens (under our degree
       assumptions) a lower bound of Nisan and Wigderson who proved the analogous result for
       the homogeneous ΣΠΣ model (i. e., ΣΠΣΠ circuits with bottom fan-in 1).
           Our main technical lemma gives a lower bound for the ranks of certain inclusion-like
       matrices.

ACM Classification: F.1.3, F.2.2
AMS Classification: 68Q17, 68Q15, 68Q25
Key words and phrases: arithmetic circuits, depth-4 circuits, inclusion matrices
     An extended abstract of this paper appeared in the Proceedings of the 40th International Symposium on Mathematical
Foundations of Computer Science, 2015 [5].
   ∗ Supported by IFCPAR/CEFIPRA Project No 4702-1(A).
   † Supported by ANR project CompA (project number: ANR-13-BS02-0001-01).



© 2017 Hervé Fournier, Nutan Limaye, Meena Mahajan, and Srikanth Srinivasan
c b Licensed under a Creative Commons Attribution License (CC-BY)                         DOI: 10.4086/toc.2017.v013a009
             H ERV É F OURNIER , N UTAN L IMAYE , M EENA M AHAJAN , AND S RIKANTH S RINIVASAN

1     Introduction
1.1     Motivation
In an influential paper of Valiant [26] the two complexity classes VP and VNP were defined, which can
be thought of as algebraic analogues of Boolean complexity classes P and NP, respectively. Whether
VP equals VNP or not is one of the most fundamental problems in the study of algebraic computation.
It follows from the work of Valiant [26] that a super-polynomial lower bound for arithmetic circuits
computing the Permanent implies VP 6= VNP.
     The best known lower bound on uniform polynomials for general arithmetic circuits is Ω(N lg N) [4]
which is unfortunately quite far from the desired super-polynomial lower bound. Over the years, though
there has been no stronger lower bound for general arithmetic circuits, many super-polynomial lower
bounds have been obtained for special classes for arithmetic circuits [19, 21, 20].
     A very interesting such subclass of arithmetic circuits is the class of bounded-depth arithmetic circuits.
The question of proving lower bounds for bounded-depth circuits and in particular depth-3 and depth-4
circuits has received a lot of attention subsequent to the recent progress in efficient depth reduction of
arithmetic circuits [27, 1, 15, 25]. This sequence of results essentially implies that “strong enough” lower
bounds for depth-4 homogeneous circuits suffice to separate VP from VNP. More formally, it proves
that any sequence { fN }N of homogeneous
                                      √
                                                N-variate polynomials
                                                                    √
                                                                        of degree d = N O(1) in VP has depth-4
homogeneous circuits of size N O( d) . Hence, proving an N ω( d) lower bound for depth-4 homogeneous
circuits suffices to separate VP from VNP.
     Even more can be said about the depth-4 circuits obtained in the above results. For any integer
parameter t ≤ d, they give a ΣΠΣΠ circuit for fN where the layer-1 product gates (just above the inputs)
have fan-in at most t and the layer-3 gates are again Π gates with fan-in O(d/t). We will refer to such
circuits as ΣΠ[O(d/t)] ΣΠ[t] circuits. The depth-reduction results mentioned above produce     √      a depth-4
homogeneous ΣΠ[O(d/t)] ΣΠ[t] circuit of size N O((d/t)+t) and top fan-in N O(d/t) ; at t = d de, we get the
above depth-reduction result.
     The tightness of these results follows from recent progress on lower bounds for the model of
ΣΠ  [O(d/t)] ΣΠ[t] circuits. A flurry of results followed the groundbreaking work of Kayal [11], who
augmented the partial derivative method of Nisan and Wigderson [19] to devise a new complexity
measure called the shifted partial derivative measure. He used this measure to prove an exponential
lower bound for a special class of depth-4 circuits. Building on this, the first non-trivial lower bound for
ΣΠ[O(d/t)] ΣΠ[t] circuits was proved by Gupta, Kamath, Kayal, and Saptharishi [9] for the determinant
and permanent polynomials. This was further improved by Kayal, Saha, and Saptharishi [13] who gave
a family of explicit polynomials in VNP for which the shifted partial derivative complexity is (nearly)
as large as possible1 and hence showed a lower bound of N Ω(d/t) for the top fan-in of ΣΠ[O(d/t)] ΣΠ[t]
circuits computing these polynomials. Later, a similar result for a polynomial in VP was proved in [6]
and this was subsequently strengthened by Kumar and Saraf [17], who gave a polynomial computable
by homogeneous ΠΣΠ circuits such that any ΣΠ[O(d/t)] ΣΠ[t] circuits computing it must have top fan-in
N Ω(d/t) . Finally, using a variant of the shifted partial derivative measure, Kayal et al. [12] and Kumar and
Saraf [16] were able to prove similar lower bounds for general depth-4 homogeneous circuits as well.
    1 i. e., as large as it can be for a “generic” or “random” polynomial (as explained after Theorem 1.1).



                             T HEORY OF C OMPUTING, Volume 13 (9), 2017, pp. 1–34                             2
      T HE S HIFTED PARTIAL D ERIVATIVE C OMPLEXITY OF E LEMENTARY S YMMETRIC P OLYNOMIALS

     In this paper, we investigate the shifted partial derivative measure of the elementary symmetric
polynomials, a natural family of polynomials whose complexity has been the focus of much previous
work [19, 24, 23, 10]. Nisan and Wigderson [19] proved tight lower bounds on the depth-3 homogeneous
circuit complexity of these polynomials. Shpilka and Wigderson [24] and Shpilka [23] studied the general
(i. e., possibly inhomogeneous) depth-3 circuit complexity of these polynomials, and showed that for
certain degrees, the O(N 2 ) upper bound due to Ben-Or (see [24]) is tight.
     Under some degree constraints, we show strong lower bounds on the dimension of the shifted partial
derivative space of these polynomials, which implies that the elementary symmetric polynomial on N
variables of degree d cannot be computed by a ΣΠ[O(d/t)] ΣΠ[t] circuit of top fan-in less than N Ω(d/t) . This
strengthens the result of Nisan and Wigderson [19] for these degree parameters.
     By the upper bound of Ben-Or mentioned above, this also gives the first example of an explicit
polynomial with small ΣΠΣ circuits for which such a strong lower bound is known.

1.2    Our results
We show that, in a suitable range of parameters, the shifted partial derivative measure of the N-variate
elementary symmetric polynomial of degree d—denoted SNd —is large.

Theorem 1.1. Let α ∈ (0, 1/2) be a constant. Let N, d, k, τ ∈ N be such that 4k ≤ d ≤ α lg N/ lg lg N,
k = bd/(τ + 1)c, and τ ≥ 3 is an odd number. Over a field of characteristic zero, for any δ satisfying
α ≤ 1 − δ (τ + 1) < 1 − δ τ ≤ 1 − α, and for ` = bN 1−δ c, the following holds.

                                                 (1 − o(1)) · N+`    · N−`
                                                                         
                                                                 `
                                dimh∂k SNd i≤` ⩾                         k
                                                                              .
                                                  (3N 1−δ τ /2)k · (d + 1)τ

Here the o(1) term goes to 0 as N → ∞. For any multilinear polynomial F(X)        on N variables, the quantity
dimh∂k Fi≤` is at most the number of monomial shifts—which is N+`
                                                                            
                                                                          `   —times  the number of possible
                                                      N
                                                        
partial derivatives of order k, which is at most k . Our result says that this trivial upper bound is (in
some sense) close to optimal for the polynomial SNd . (The (N 1−δ τ )k factor in the denominator can be
made N εk for any constant ε > 0. To get an arbitrarily small exponent 1 − δ τ, the constant α must be
small. For, say, d ∈ o(lg N/ lg lg N), and for sufficiently large N, any α > 0 works.) All previous lower
bound results using the shifted partial derivative method also obtain similar statements [9, 6, 17, 16].
     To illustrate the ideas of the proof, we first prove a special case of Theorem 1.1, namely the case
when 2k divides d and τ equals d/k − 1 (all other conditions on the parameters are the same). This result
is stated in Theorem 3.1 in Section 3. (Note that in the statement of Theorem 1.1, the values of d, k do
not fix τ. E. g., for (d, k) = (40, 2), τ can be 13, 15, 17, or 19. The proof of Theorem 3.1, as given, works
when τ is fixed to be d/k − 1.) In Section 4, we describe the modifications needed to carry out the proof
when k = bd/(τ + 1)c for some odd number τ ≥ 3, establishing Theorem 1.1.
     The proof of Theorem 1.1 requires that the field over which the polynomial and the circuits are
defined should have characteristic 0. In Section 5 we state Theorem 5.1, our most general result, i. e.,
for general parameters (no divisibility constraint) and over any characteristic, and sketch its proof. Due
to the positive characteristic, in Theorem 5.1 we incur a loss of k + 1 in the denominator (compared to
Theorem 1.1).

                        T HEORY OF C OMPUTING, Volume 13 (9), 2017, pp. 1–34                                3
            H ERV É F OURNIER , N UTAN L IMAYE , M EENA M AHAJAN , AND S RIKANTH S RINIVASAN

    A corollary to our main result is an N Ω(d/t) lower bound on the top fan-in of any ΣΠ[O(d/t)] ΣΠ[t]
circuit computing SNd .
Theorem 1.2. Let ε ∈ (0, 1) be a constant. Let N, d, D,t ∈ N be such that
                                         10t      ε lg N
                                             ≤d≤           ,             D ≤ N 1−ε .
                                          ε      5 lg lg N

Any ΣΠ[D] ΣΠ[t] circuit of top fan-in s computing SNd satisfies s = N Ω(d/t) .
    It is worth noting that in most lower bounds of this flavour, the upper product gates have fan-in D
bounded by O(d/t). Our lower bound works for potentially much larger values of D.
    It√is known that for every d ≤ N, the elementary symmetric polynomial SNd has depth-4√ circuits of size
N2O( d) [24, Theorem 5.2]. A closer look at the construction there shows that the N2O( d) size circuits
are ΣΠ[O(d)] ΣΠ[d] circuits with bottom fan-in up to and including d. In contrast, our bound shows that for
small d, if the bottom fan-in t is restricted to be a fraction of d, then, even allowing for much larger D,
the top fan-in and hence size of an ΣΠ[D] ΣΠ[t] circuit shoots up to N Ω(d/t) .
    As a corollary to Theorem 1.2, we obtain a lower bound for homogeneous depth-4 circuits with
bounded bottom fan-in.
Corollary 1.3. Let parameters N, d,t be as in Theorem 1.2. Any ΣΠ[O(d/t)] ΣΠ[t] circuit computing SNd
must have top fan-in at least N Ω(d/t) . In particular, any homogeneous ΣΠΣΠ circuit C with bottom fan-in
bounded by t computing SNd must have top fan-in at least N Ω(d/t) .
    By the above depth-reduction results, thislower bound is tight up to the constant factor in the exponent.
Before our work, Nisan and Wigderson Nd /2d for all d, however with respect to homogeneous ΣΠΣ
circuits (i. e., the case t = 1).

1.3    Techniques
The analysis of the shifted partial derivative measure for any polynomial essentially requires the analysis
of the rank of a matrix arising from the shifted partial derivative space. In this paper, we analyse the
matrix arising from the shifted partial derivative space of the symmetric polynomials.
    Our analysis is quite different from those in previous papers (such as [6, 12, 16]), which are based
on either monomial counting (meaning that we find a large identity or upper triangular submatrix inside
our matrix) or an analytic inequality of Alon [2]. Neither of these techniques seems to be applicable in
our case. This is already visible from the work of Nisan and Wigderson [19], who analyse the partial
derivative matrix (without shifts) of the elementary symmetric polynomials. This matrix turns out to
be the well-known Disjointness matrix, defined as follows: for fixed parameters N, s,t ∈ N such that
s + t ≤ N, the rows and columns of this matrix are labelled by subsets of [N] of size s and t respectively;
the (S, T ) entry in the matrix is 1 if S ∩ T = 0/ and 0 otherwise.2 It is known (see [18, Sec. 2.3] for
example) that this matrix is full rank (i. e., has rank equal to the minimum of the number of rows and
columns) in characteristic 0 and almost full rank in other characteristics [28]. However, it is not clear
how to use either of the two techniques mentioned above to prove this result.
   2 Variants allowing sets of size at most s and t have also been considered.



                            T HEORY OF C OMPUTING, Volume 13 (9), 2017, pp. 1–34                            4
      T HE S HIFTED PARTIAL D ERIVATIVE C OMPLEXITY OF E LEMENTARY S YMMETRIC P OLYNOMIALS

    In our analysis of the shifted partial derivative space, we block-diagonalize our matrix3 into matrices
each of which is a more complicated version of the Inclusion matrix (similar to the Disjointness matrix
mentioned above and also known to be full rank), and bound its rank from below by using a technique
that, to the best of our knowledge, has not been used in this context before.
    Our method of bounding the rank of the matrices, in spirit, resembles the techniques in the papers [7, 8]
by Frankl and Wilson. In these papers, the authors consider the problem of bounding the sizes of some
interesting families of sets. They reduce this problem to analyzing ranks of certain classes of matrices
and use the linear algebra method [3] to bound the ranks. Though our method seems similar to these
techniques in a general sense, there are some differences. In their work, the results pertaining to ranks of
the matrices are extremal in nature. However, our matrix is very concrete. On the other hand, the matrices
considered in their work are simpler than the matrix we obtain while analyzing the rank of the shifted
partial derivative space of elementary symmetric polynomials. We give a brief overview of our technique
in the next section.
    Disjointness and inclusion matrices arise naturally in other branches of theoretical computer science
such as Boolean circuit complexity [22], communication complexity [18, Chapter 2] and also in combi-
natorics [28, 14]. Therefore, we believe that our analysis of the inclusion-like matrix arising from the
symmetric polynomial may find other applications.

1.4     Organisation of the paper
In Section 2, we set up basic notation, fix the main parameters, and give a high-level outline of our proof
of Theorem 3.1. In Section 3 we give the details of the actual proof. The circuit size lower bound from
Theorem 1.2 is established in Section 6. Recall that Theorem 3.1 is stated over fields of characteristic
zero when some parameters exhibit some divisibility property (which makes the combinatorics nicer).
The way to handle more general parameters is explained in Section 4. In Section 5 we describe how to
extend our results to arbitrary fields.


2     Proving Theorem 3.1: high-level outline
2.1     Notation
For a positive integer n, we let [n] = {1, . . . , n}. Let X = {x1 , . . . , xN }. For A ⊆ [N] we define XA = ∏i∈A xi .
The elementary symmetric polynomial of degree d over the set X of variables is defined as

                                                 SNd (X) =       ∑         XA ,
                                                             A⊆[N],|A|=d

and is abbreviated with SNd .
   For k, ` ∈ N and a multivariate polynomial f ∈ F[x1 , . . . , xn ], we define
                              (                                                                      )
                                j1      jn     ∂k f
             h∂k f i≤` = span x1 . . . xn · i1               i1 + · · · + in = k, j1 + . . . + jn ≤ ` .
                                           ∂ x1 . . . ∂ xnin
    3 Actually, we only work with a carefully chosen submatrix of the overall matrix.



                            T HEORY OF C OMPUTING, Volume 13 (9), 2017, pp. 1–34                                     5
            H ERV É F OURNIER , N UTAN L IMAYE , M EENA M AHAJAN , AND S RIKANTH S RINIVASAN

Our complexity measure is the dimension of this space, i. e., dim(h∂k f i≤` ) [11, 9].
   For a monomial m = ∏Ni=1 xini , deg(m) = n1 + n2 + · · · + nN is the total degree of m. We denote by
degxi (m) the degree of the variable xi in m (here degxi (m) = ni ). We define the support of m as

                                           supp(m) = {i ∈ [N] | ni > 0} .

For a monomial m and p > 0, let

                                       supp p (m) = {i ∈ [N], degxi (m) = p} .

      Let M`N the set of monomials of degree at most ` over the set X of variables. For integers n1 , . . . , n p ,
let
                          M`N (n1 , . . . , n p ) = {m ∈ M`N , | suppi (m)| = ni for i ∈ [p]}.

      Given p > 0, a monomial m ∈ M`N can be uniquely written as
                                                          p
                                               m = m̃ · ∏(Xsuppi (m) )i .
                                                         i=1

We write m ≡ [m̃, S1 , . . . , S p ] if Si = suppi (m) for all i ∈ [p] and
                                                               p
                                                  m = m̃ · ∏(XSi )i .
                                                              i=1

    For a finite set S, let U(S) denote the uniform distribution over the set S.
    We assume that we are working over a field F of characteristic zero. Our results also hold in non-
zero characteristic (see Section 5), but the first step of our proof (Lemma 3.2) becomes a little more
cumbersome.


2.2     Proof outline
We want to lower bound dim(h∂k SNd i≤` ) for suitable k, `. An alternate way of looking at the vector space
h∂k SNd i≤` is as follows. We fix some spanning set S for the set of all partial derivatives of SNd of order k and
consider the set P of all the polynomials obtained by multiplying the polynomials in S with monomials
of degree at most `. We define a matrix M whose columns contain the polynomials in the set P (seen as
vectors of coefficients of the various monomials). Lower bounding dim(h∂k SNd i≤` ) is equivalent to lower
bounding rank(M).
     Our lower bound on dim(h∂k SNd i≤` ) proceeds in 3 steps.
Step 1: We choose a suitable subset S of the partial derivative space. It is convenient to work with a set
that is slightly different from the set of partial derivatives themselves. To understand the advantage of this,
consider the simple setting where we are looking at the partial derivatives of the degree-2 polynomial SN2
of order 1. It is not difficult to show that the partial derivative with respect to variable xi is ri := ∑ j6=i x j .
Over characteristic zero, this set of polynomials is known to be linearly independent. One way to show

                           T HEORY OF C OMPUTING, Volume 13 (9), 2017, pp. 1–34                                    6
     T HE S HIFTED PARTIAL D ERIVATIVE C OMPLEXITY OF E LEMENTARY S YMMETRIC P OLYNOMIALS

this is by showing that each polynomial xi can be written as a linear combination of the r j s; explicitly,
one can write                                            !
                                             1
                                     xi =          ∑ r j − ri .
                                           n − 1 j∈[n]

Since the xi s are distinct monomials, they are clearly linearly independent and we are done. This illustrates
the advantage in moving to a “sparser” basis for the partial derivative space. We do something like this
for larger d and k (Lemma 3.2).
Step 2: After choosing the set S, we construct the set P of shifts of S (actually, we will only consider
a subset of P) and lower bound the rank of the corresponding matrix M. To do this, we also prune the
set of rows of the matrix M. In other words, we consider a carefully chosen set of monomials M and
project each polynomial in P down to these monomials. The objective in doing this is to infuse some
structure into the matrix while at the same time preserving its rank (up to small losses). Having chosen
M, we show that the corresponding submatrix can be block-diagonalized into matrices each of which is
described by a simple inclusion pattern between the (tuples of) sets labelling its rows and columns. This
is done in Lemma 3.13, Lemma 3.16, and Lemma 3.17.
Step 3: The main technical step in the proof is to lower bound the rank of the inclusion pattern matrix
mentioned above with an algebraic trick. We illustrate            this
                                                                    technique here with a toy example. Fix
parameters N, s ∈ N with s ≤ N/2 and define the Ns × Ns matrix DisjN,s whose rows and columns are
                                                            

labelled by sets of size s from the  universe    [N] and the (S, T ) entry is 1 if S ∩ T = 0/ and 0 otherwise. We
can similarly also define the Ns × Ns matrix IncN,s similarly with the only difference being that the (S, T )
                                           

entry is 1 if and only if S ⊆ T ; note that IncN,s is simply the identity matrix with the required dimensions
and is hence clearly full rank. It is also known that DisjN,s is full rank over fields of characteristic 0 (see,
e. g., [18, Chapter 2]). We prove a weaker      statement here in order to illustrate our proof method: we show
                                             N
that when s = o(N), DisjN,s has rank s (1 − o(1)).
     To see this, consider the following alternate     way of looking at the above matrices. We identify the
labels of the rows—which are elements of [N]        s —with their characteristic vectors, which are elements of
                                          N
the N-dimensional hypercube {0, 1} of weight exactly s. Each column is associated with a polynomial p
over 0-1 variables y1 , . . . , yN such that the entry in the column at the row labelled by a ∈ {0, 1}N is equal
to p(a). Specifically, in the matrix IncN,s , a column labelled T ⊆ [N] is associated with the monomial
mT = ∏i∈T yi ; it should be clear that this monomial evaluates to 1 at row a only if a encodes a subset
contained in T (which must be T itself). Similarly, in the matrix DisjN,s the column corresponding to T is
associated with the polynomial qT = ∏i∈T (1 − yi ). Now consider the following simple identity.
                                                                              0
                             mT = ∏ yi = ∏(1 − (1 − yi )) = ∑ (−1)|T | qT 0 .
                                    i∈T     i∈T                   T 0 ⊆T

The above tells us that the columns of IncN,s are spanned by the set of all column vectors corresponding
to polynomials of the form Q = {qT 0 | |T 0 | ≤ s}. Since IncN,s has rank Ns , the set of column vectors in
Q must have rank at least Ns . The subset of these columns corresponding to |T 0 | = s are exactly the
                              

columns of DisjN,s . Note that the remaining columns (corresponding to |T 0 | < s) are only ∑i<s Ni in
                                                                                                       

number and this is only o( Ns ) since s = o(N). Hence, the columns of DisjN,s must have rank at least
                               
 N
   
  s (1 − o(1)).


                         T HEORY OF C OMPUTING, Volume 13 (9), 2017, pp. 1–34                                   7
            H ERV É F OURNIER , N UTAN L IMAYE , M EENA M AHAJAN , AND S RIKANTH S RINIVASAN

    The main technical lemma (Lemma 3.21) is a generalization of the above trick to our setting. Given
the matrix whose rank we wish to lower bound (like DisjN,s above), we first find a full-rank matrix that is
closely related to our matrix and then show that the columns of our matrix can (with the aid of just a few
other columns) generate the columns of the full-rank matrix.

2.3     The main parameters
For proving Theorem 1.1, recall the parameters: α ∈ (0, 1/2), N, d, k, τ satisfying
                                                             
                                                           d
                       4k ≤ d ≤ α lg N/ lg lg N, k =            , τ ≥ 3 odd.
                                                         τ +1

      Our parameter choices are any δ satisfying α ≤ 1 − δ (τ + 1) and 1 − δ τ ≤ 1 − α, and ` = bN 1−δ c.
      The following are easy to verify for our choice of parameters.

Fact 2.1. τ 2 = o(`), τ = o(N δ ), and τ` = o(N). Also, (lg N)τ = O(N α ), and N δ (τ+1) = O(N 1−α ) = o(N).

      These facts also hold in the settings of Theorem 3.1 and Theorem 5.1.


3      Proving Theorem 1.1: details of a simpler case
In this section we first establish Theorem 3.1 given below. This is a restriction of Theorem 1.1 to the
special case when 2k divides d. This case illustrates all the ideas and technical constructs used. Small
modifications, described in the next section, establish Theorem 1.1.

Theorem 3.1. Let α ∈ (0, 1/2) be a constant. Let N, d, k ∈ N be such that 4k ≤ d ≤ α lg N/ lg lg N and
2k | d. Over a field of characteristic zero, for τ = d/k − 1, for any δ satisfying α ≤ 1 − δ (τ + 1) <
1 − δ τ ≤ 1 − α, and for ` = bN 1−δ c, the following holds.

                                                 (1 − o(1)) · N+`
                                                                    N−`
                                         d                       `  · k
                               dimh∂k SN i≤` ⩾        1−δ      k
                                                                           .
                                                  (3N     τ /2) · (d + 1)τ

3.1     Choice of basis: Step 1 of the proof
Lemma 3.2. Let k ≤ d ≤ N. Over fields of characteristic 0, the vector space spanned by the set of
k-partial derivatives of SNd , that is, h∂k SNd i≤0 , contains {pT | T ⊆ [N], |T | = k} where
                                                                    d−2k
                               pT =         ∑            XA = XT · SN−k  (X \ T ) .
                                      T ⊆A⊆[N],|A|=d−k

Proof. If d < 2k then the statement is vacuously true (there is no such pT ). So now assume that d ≥ 2k.
     For any set S ⊆ [N] of size |S| = k, let rS be the polynomial obtained by deriving SNd with respect to
all the variables in S (once each). Note that rS contains all degree-(d − k) multilinear monomials that
avoid variables xi for i ∈ S. Similarly, for S ⊆ [N] of size |S| < k, define the polynomials rS as sums of
degree-(d − k) multilinear monomials avoiding xi for i ∈ S. Correspondingly, define the complementary

                         T HEORY OF C OMPUTING, Volume 13 (9), 2017, pp. 1–34                             8
      T HE S HIFTED PARTIAL D ERIVATIVE C OMPLEXITY OF E LEMENTARY S YMMETRIC P OLYNOMIALS

polynomials: for T ⊆ [N] with |T | ≤ k, pT consists of all degree-(d − k) multilinear monomials that
include xi (i ∈ T ). Formally, for S, T ⊆ [N] with |S|, |T | ≤ k,
                          rS (x) =                  ∑         XA ;       pT (x) :=            ∑           XB .
                                          |A|=d−k, A∩S=0/                             T ⊆B⊆[N],|B|=d−k

     The claim is that linear combinations of the partial derivative polynomials, rS with |S| = k, generate
the polynomials pT (|T | = k). We can show this in two steps.
     The first step is to show that the rS0 (|S0 | = k) generate the polynomials rS (|S| ≤ k). Fix any S with
|S| = s < k. Let F = {S0 ⊆ [N] | |S0 | = k, S0 ⊇ S}. A simple computation shows that any monomial that
avoids all the variables in S appears in the same      positive number M of all the polynomials rS0 (S0 ∈ F)
                                         N−(d−k+s)
(in fact, it can be checked that M =         k−s      , though this will not be important for us). Hence, we
have rS = (1/M) ∑S0 ∈F rS0 , which shows that rS is indeed in the span of rS0 (|S0 | = k). Note that this step
assumes that we are working in characteristic 0.
     Finally, once we have rS for all set sizes in [k], we generate pT for |T | = k using inclusion-exclusion.
The monomials of PT are the monomials of SNd , minus monomials that miss at least one element of T , plus
the monomials that miss at least two elements of T (since we overcounted), and so on. The monomials
to be added/removed are exactly the monomials of various polynomials rS . To set this up formally, let
{0, 1}tN denote the set of all 0-1 vectors of Hamming weight exactly t. We use the natural correspondence
between this set and the set of all subsets of [N] of size exactly t.
            pT (x) =          ∑                 XB
                       T ⊆B⊆[N],|B|=d−k
                                                    !          !                                 !                     !
                  =       ∑                ∏ xi         ∏ yi         =       ∑          ∏ xi            ∏(1 − (1 − yi ))
                       y∈{0,1}N           i:yi =1       i∈T              y∈{0,1}N      i:yi =1          i∈T
                              d−k                                               d−k
                                                    !                                   !
                                                                     |S|
                  =       ∑                ∏ xi         ∑ (−1) ∏(1 − yi )
                       y∈{0,1}N           i:yi =1       S⊆T                i∈S
                              d−k
                                                                     !                    !
                                    |S|
                  = ∑ (−1)                    ∑           ∏ xi              ∏(1 − yi )        = ∑ (−1)|S| rS (x) .
                       S⊆T                y∈{0,1}N       i:yi =1            i∈S                   S⊆T
                                                 d−k

    This shows that each pT (|T | = k) is a linear combination of the rS (|S| ≤ k) and hence also of the
derivative polynomials rS0 (|S0 | = k).
      Let                  n                                                    o
                        P = m · pT T ⊆ [N], |T | = k, m ∈ M`N , supp(m) ∩ T = 0/ .
From Lemma 3.2, P ⊆ h∂k SNd i≤` . Hence, a lower bound on the dimension of span P is also a lower bound
on dim(h∂k SNd i≤` ).

3.2     Choice of shifts: Step 2 of the proof
Instead of considering arbitrary shifts m as in the definition of P, we will consider shifts by monomials m
with various values of | suppi (m)| for i ∈ [τ]. We first present a technical lemma that is needed to establish
the lower bound. It is a concentration bound for support sizes in random monomials.

                          T HEORY OF C OMPUTING, Volume 13 (9), 2017, pp. 1–34                                             9
           H ERV É F OURNIER , N UTAN L IMAYE , M EENA M AHAJAN , AND S RIKANTH S RINIVASAN

Definition 3.3. For i ∈ [τ], sˆi denotes the average number of variables with degree exactly i. That is,

                                           sˆi = Em∼U(M` ) [| suppi (m)|] .
                                                         N


Definition 3.4 (Good signature). Given m ∈ M`N , the signature of m, s(m), is the tuple (s1 , . . . , sτ ) such
that m ∈ M`N (s1 , . . . , sτ ). We call the signature (s1 , . . . , sτ ) a good signature if for each i ∈ [τ], we have
ŝi /2 ≤ si ≤ 3ŝi /2. Let S0 denote the set of all good signatures.
    The following lemma shows that for our choice of parameters, the average values ŝi for i ∈ [τ + 1]
are significantly large, and most monomials in M`N in fact have good signatures. This lemma imposes
the most stringent condition on d, τ: if τ is too large, then we cannot use the bounds from Fact 2.1, we
cannot get tight estimates on ŝi , and hence we cannot show that most monomials have good signatures.
Lemma 3.5. For our choice of the main parameters, the following statements hold.
                           ŝi
   1. For i ∈ [τ − 1],         ≥ Nδ .
                         ŝi+1

   2. For i ∈ [τ], ŝi = N 1−iδ (1 − o(1)).

   3. Pr[s(m) ∈ S0 ] = 1 − o(1).
Proof. Pick a random monomial of degree at most ` uniformly at random; m ∼ U(M`N ).
   Consider the following random variables.
   1. For i ∈ [τ + 1] and j ∈ [N], Zi, j denotes the 0-1 random variable that is 1 if and only if the variable
      x j has degree at least i in m.
      Let pi denote Em [Zi, j ]. (The average is the same for all j.)

   2. For i ∈ [τ + 1], let Zi = ∑ j∈[N] Zi, j be the random variable denoting the number of variables that
      have degree at least i in m.
      Let µi denote Em [Zi ]; clearly, µi = N pi .

   3. For i ∈ [τ], the number of variables of degree exactly i is given by Yi := Zi − Zi+1 .
      Then ŝi = Em [Yi ]; clearly, ŝi = µi − µi+1 .
    Consider pi , the probability that the variable x j has degree at least i in a monomial m ∼ U(M`N ). We
see that any such monomial can be written uniquely as m = m0 (x j )i where m0 is a monomial of degree at
most ` − i in the same set of variables X. Since the number of such monomials is exactly |M`−i N |, we have

                                          N+`−i
                                                
                              |M`−i
                                 N |        `−i
                        pi =          = N+` .
                               |M`N |        `

In particular,

                          µi     pi    N +`−i N +`
                              =      =       ≥     = 1 + Nδ ≥ Nδ .
                         µi+1   pi+1     `−i    `

                         T HEORY OF C OMPUTING, Volume 13 (9), 2017, pp. 1–34                                       10
      T HE S HIFTED PARTIAL D ERIVATIVE C OMPLEXITY OF E LEMENTARY S YMMETRIC P OLYNOMIALS

Hence,

                                       ŝi             pi − pi+1    pi − pi+1    pi
                                              =                   ≥           =      − 1 ≥ Nδ ,
                                     ŝi+1            pi+1 − pi+2      pi+1     pi+1
proving the first part of the lemma.
   Next, we use a simple fact about binomial coefficients.
Fact 3.6. For any N, `, i ∈ N such that i < `, we have
                                          i                       i        N+`−i                    i
                                                                                       
                                   `−i                      `−i               `−i               `
                                                  ≤                      ≤    N+`
                                                                                      ≤                    .
                                                          N +`−i
                                                                                  
                                   N +`                                        `
                                                                                              N +`

    By Fact 3.6, we have
                            i  i
                          `      `
                 pi ≤          ≤     ≤ N −iδ ;                           µi ≤ N 1−iδ ;           ŝi ≤ µi ≤ N 1−iδ .
                        N +`     N
    Also by Fact 3.6, we have
                                                  i                "         #i
                                                                       1 − `i
                                 
                                   `−i `      N           −iδ
                        pi ≥                          = N (1 − o(1))
                                                                       1 + N`
                                                                              
                                    ` N      N +`
                                                 2        
                              −iδ                   i    i`
                           ≥ N (1 − o(1)) exp −O      +        .
                                                    ` N

   By our choice of parameters (Fact 2.1), we have i2 /` ≤ (τ + 1)2 /` = o(1) and i`/N = o(1). Hence,
we have
                                                                         
            −iδ                     1−iδ                             µi+1
      pi ≥ N (1 − o(1)); µi ≥ N          (1 − o(1)); ŝi ≥ µi 1 −           ≥ N 1−iδ (1 − o(1)) .
                                                                       µi
Putting together the upper and lower bounds proves the second part of the lemma.
    In order to prove the third part of the lemma, we use the second moment method. To do this, we will
need to bound the second moment of Zi . In order to do this, we will need the following claim.
Claim 3.7. For any distinct j1 , j2 , we have E [Zi, j1 Zi, j2 ] ≤ E [Zi, j1 ] E [Zi, j2 ] = p2i .4
    The claim is proved below. First, we use this claim to finish the proof. The following standard
analysis will bound the second moment of Zi .

                      E Zi2 = ∑ E [Zi, j1 Zi, j2 ] = ∑ E Zi,2 j1 + ∑ E [Zi, j1 Zi, j2 ]
                                                             
                                        j1 , j2                     j1                j1 6= j2

                                    ≤ N pi + ∑ p2i ≤ N pi + N 2 p2i = µi + µi2 ,
                                                      j1 6= j2

   4 This claim posits a weak form of negative dependence on the degrees of distinct variables. It is actually not too hard to

prove much stronger forms of negative dependence which yield stronger probability estimates than the ones we give here.
However, these estimates suffice for our purposes.


                           T HEORY OF C OMPUTING, Volume 13 (9), 2017, pp. 1–34                                           11
           H ERV É F OURNIER , N UTAN L IMAYE , M EENA M AHAJAN , AND S RIKANTH S RINIVASAN

where the first inequality follows from the fact that Zi,2 j1 = Zi, j1 and Claim 3.7.
   Hence, we can bound the variance of Zi as follows.

                                    E (Zi − µi )2 = E Zi2 − µi2 ≤ µi .
                                                        


    Thus, by the Chebyshev inequality, we have

                                                          E (Zi − µi )2
                                                                       
                                                                            16
                                 Pr[|Zi − µi | > µi /4] ≤                 ≤    .
                                                             (µi /4)2       µi

    Union bounding over all i ∈ [τ + 1], we have

                                                            1    16              1
                Pr[∃i, |Zi − µi | > µi /4] ≤ 16     ∑ µi = N ∑ pi
                                                  i∈[τ+1]             i∈[τ+1]

                                               16                          16N (τ+1)δ (1 + o(1))
                                          ≤        ∑     N iδ (1 + o(1)) ≤
                                               N i∈[τ+1]                             N
                                               16 (1 + o(1))
                                          ≤                  = o(1) .
                                                    Nα
Thus with probability (1 − o(1)) we have |Zi − µi | ≤ µi /4 for all i ∈ [τ]. When this event occurs, we see
that for all i ∈ [τ],

                                              µi µi+1  µi            µi            ŝi (1 + o(1)) ŝi
 |Yi − ŝi | = |Zi − Zi+1 − (µi − µi+1 )| ≤     +     < (1 + N −δ ) < (1 + o(1)) ≤               <
                                              4   4    4             4                     4       2
for sufficiently large N. So s(m) ∈ S0 , which proves the third part of the lemma.
    Finally, it remains to prove the Claim.

Proof of Claim. Without loss of generality, assume that j1 = 1 and j2 = 2. Then, E [Zi, j1 Zi, j2 ] is the
probability that a random m ∼ M`N is divisible by the monomial x1i x2i . Such a monomial m may be
uniquely factored as x1i x2i m0 where m0 is a monomial of degree at most ` − 2i. Thus, the probability of
this event is precisely
                                                       N+`−2i
                                                              
                                            |M`−2i
                                               N |      `−2i
                                                    = N+` .
                                             |M`N |       `
    The statement of the claim says that the above quantity may be bounded by

                                                            N+`−i
                                                                  !2
                                                             `−i
                                                p2i =        N+`
                                                                        .
                                                              `

Rearranging, we see that this is equivalent to the following inequality
                                           N+`−2i        N+`−i
                                                             
                                                  `−2i           `−i
                                                 N+`−i
                                                           ≤    N+`
                                                                            .
                                                   `−i            `


                        T HEORY OF C OMPUTING, Volume 13 (9), 2017, pp. 1–34                            12
       T HE S HIFTED PARTIAL D ERIVATIVE C OMPLEXITY OF E LEMENTARY S YMMETRIC P OLYNOMIALS

Expanding the binomials above and canceling common terms between the numerators and denominators,
we may rewrite the inequality as
                                                                       
                      ` − 2i + 1          `−i           `−i+1             `
                                    ···          ≤                  ···
                    N + ` − 2i + 1      N +`−i        N +`−i+1          N +`
which is easy to verify since each term on the left hand side is upper bounded by the corresponding term
on the right hand side. This proves the claim.

    This completes the proof of Lemma 3.5.

Remark 3.8. By Lemma 3.5, for any good signature (s1 , . . . , sτ ), we have

                          si /si+1 = Ω(N δ )         and         sτ = Ω(N 1−τδ ) = Ω(N α ) .

Also
                                      (s1 ,...,sτ ) good MN (s1 , . . . , sτ )|
                                  |
                                      S                    `
                                                                                = 1 − o(1) .
                                                      |M`N |
    Given a signature (s1 , . . . , sτ ), let P(s1 , . . . , sτ ) denote the set of polynomials
                               n                                                                      o
          P(s1 , . . . , sτ ) = m · pT T ⊆ [N], |T | = k, m ∈ M`N (s1 , . . . , sτ ), supp(m) ∩ T = 0/ .

Note that all polynomials in P(s1 , . . . , sτ ) are homogeneous of degree at most ` + d − k.
Definition 3.9. For any signature s = (s1 , . . . , sτ ), let ri (s) = si for 1 ⩽ i ⩽ τ − 1 and rτ (s) = sτ + k;
also, let r(s) = ∑i ri (s) = (∑i si ) + k. Usually the signature s will be clear from context, and we use ri
and r instead of ri (s) and r(s) respectively. The matrix M(s1 , . . . , sτ ) is the matrix whose columns are
indexed by polynomials m · pT ∈ P(s1 , . . . , sτ ) and rows by the monomials w ∈ M`+d−k  N    (r1 , . . . , rτ ). The
coefficient in row w and column m · pT is the coefficient of the monomial w in the polynomial m · pT .
   Note that the columns of M(s1 , . . . , sτ ) are simply the polynomials in P(s1 , . . . , sτ ) projected to the
monomials that label the rows. In particular, a lower bound on the rank of M(s1 , . . . , sτ ) implies a lower
bound on the rank of the vector space spanned by P(s1 , . . . , sτ ).
   The number of columns of the matrix M(s1 , . . . , sτ ) is equal to
                                                                                                
                                                       N                     N − (s1 + · · · + sτ )
              |P(s1 , . . . , sτ )| =                                      ·
                                       s1 . . . sτ N − (s1 + · · · + sτ )            k
                                                                             
                                                         N
                                    =
                                       s1 . . . sτ k N − (s1 + · · · + sτ + k)
while the number of its rows is equal to
                                                                                       
                         N                                        N
                                   =
                     r1 . . . rτ         s1 . . . sτ−1 sτ + k N − (s1 + · · · + sτ + k)
                                                                                          
                                          1                           N
                                   = s +k ·                                                .
                                         τ
                                           k
                                                    s1 . . . sτ k N − (s1 + · · · + sτ + k)

                         T HEORY OF C OMPUTING, Volume 13 (9), 2017, pp. 1–34                                      13
            H ERV É F OURNIER , N UTAN L IMAYE , M EENA M AHAJAN , AND S RIKANTH S RINIVASAN

Hence M(s1 , . . . , sτ ) has |P(s1 , . . . , sτ )| columns but only |P(s1 , . . . , sτ )|/ sτ k+k rows, and the rank of the
                                                                                                  

matrix is no more than the number of rows in the matrix. The following lemma, proved in Section 3.3,
shows a lower bound that is quite close to this trivial upper bound.

Lemma 3.10. With parameters as above, for any good signature s = (s1 , . . . , sτ ),

                                                                |P(s1 , . . . , sτ )|
                                 rank(M(s1 , . . . , sτ )) ⩾         sτ +k
                                                                                     (1 − o(1)) .
                                                                         k

    Since P(s1 , . . . , sτ ) ⊆ P ⊆ h∂k f i≤` , the above immediately yields a lower bound on dim(h∂k f i≤` ).
Our final lower bound, which further improves this, is proved by considering polynomials corresponding
to a set of signatures.

Definition 3.11. Given a set of signatures S, define

                                M`N (S) =            M`N (s)                 P(S) =          P(s) .
                                               [                                        [
                                                                  and
                                               s∈S                                     s∈S

Also define the matrix M(S) as follows: the columns of M(S) are labelled by polynomials q ∈ P(S) and
the rows by monomials
                                   w ∈ M`+d−k
                                       [
                                             N   (r1 (s), . . . , rτ (s)) .
                                                   s∈S

The (w, q)th entry is the coefficient of w in q.

   Note that a lower bound on the rank of M(S) immediately lower bounds the dimension of the space
spanned by P(S) and hence also dimh∂k SNd i≤` .

Definition 3.12. A set of signatures S is well-separated if given any distinct signatures s = (s1 , . . . , sτ )
and s0 = (s01 , . . . , s0τ ) from S, maxi∈[τ] |si − s0i | ⩾ d + 1.

    To analyze the rank of M(S), we observe that for a well-separated set of signatures S, the matrix
M(S) is block-diagonalizable with |S| blocks, where the blocks are the matrices M(s) for s ∈ S. Since we
already have a lower bound on the ranks of M(s) (for good s), this will allow us to obtain a lower bound
on the rank of M(S) as well.

Lemma 3.13. Let S be a well-separated set of signatures. Then, the matrix M(S) is block-diagonalizable
with blocks M(s) for s ∈ S.

Proof. The proof is straightforward. Since the rows and columns of M(S) are labelled by elements of

                                       M`+d−k                                                 P(s)
                                 [                                                      [
                                        N     (r1 (s), . . . , rτ (s))       and
                                 s∈S                                                    s∈S

respectively, we can group them in blocks in a natural way: corresponding to each s ∈ S, we consider
the rows corresponding to MN`+d−k (r1 (s), . . . , rτ (s)) and columns corresponding to P(s). This is possible
since for s 6= s0 , the set of monomials M`+d−k
                                          N       (r1 (s), . . . , rτ (s)) and M`+d−k
                                                                                N     (r1 (s0 ), . . . , rτ (s0 )) are disjoint

                           T HEORY OF C OMPUTING, Volume 13 (9), 2017, pp. 1–34                                             14
     T HE S HIFTED PARTIAL D ERIVATIVE C OMPLEXITY OF E LEMENTARY S YMMETRIC P OLYNOMIALS

(because the mapping s 7→ (r1 (s), . . . , rτ (s)) is one-to-one). Clearly the diagonal block corresponding to
s ∈ S is exactly the matrix M(s).
    To argue that the matrix is block-diagonal, consider the entry in row w and column m · pT , where
for some signatures s, s0 ∈ S, the monomials m, w are in the sets m ∈ M`N (s) and w ∈ M`N (r(s0 )). (Recall
from the end of Section 3.1 that we only consider columns where the support of m is disjoint from T .)
Assume that this entry is 1. Hence for some A ⊆ [N] of size d − k containing T , the monomial w has the
form w = m · XA . Thus, any monomial w appearing in m · pT has the property that

                                       | suppi (w)| − | suppi (m)| ≤ d − k .

Thus, for i < τ, we have |s0i − si | ≤ (d − k) < d, and for i = τ, we have sτ − (d − k) ≤ s0τ + k ≤ sτ + (d − k).
Hence |s0τ − sτ | ≤ d.
    Since S is well-separated, both s and s0 are in S, and |s0i − si | ≤ d for all i ∈ [τ], it must be the case
that s0 = s.

    This allows us to give a good bound on the matrix M(S) if S is well-separated.
Lemma 3.14. For a well-separated set S of good signatures,

                                           (1 − o(1)) N−`
                                                           
                           rank(M(S)) ≥                 k
                                                             · |M`N (S)| .
                                            (3N 1−δ τ /2)k
Proof.

         rank(M(S)) = ∑ rank(M(s))                        (by Lemma 3.13, block-diagonalizability)
                         s∈S
                               |P(s)|
                       ≥∑       sτ +k
                                       (1 − o(1))        (by Lemma 3.10).
                         s∈S     k

Consider the numerator, the number of columns in M(s). For each monomial m, the set T generating
column m · pT can be chosen in N−| supp(m)|
                                           
                                     k       ways. So
                                                          
                                                 `     N −`
                                     |P(s)| ≥ |MN (s)|       .
                                                         k

   Next consider the denominator. Using the fact k = o(lg N) (Fact 2.1), while sτ ≥ ŝτ /2 = Ω(N 1−δ τ )
and ŝτ ≤ N 1−δ τ from Lemma 3.5, we have
                                                                        !k
                                   skτ       3ŝτ k            3N 1−δ τ
                                              
                     sτ + k                            1                      1
                              ≤          ≤                   ≤                      .
                        k       1 − o(1)      2     1 − o(1)      2        1 − o(1)

    Putting these back into our expression for rank(M(S)), we get

                                  |M`N (s)| N−`                 (1 − o(1)) N−`
                                                                                
                rank(M(S)) ≥ ∑                k
                                                k (1 − o(1)) =
                                                                              k
                                                                      1−δ τ /2)k
                                                                                   |M`N (S)| .
                               s∈S 3N
                                       1−δ τ /2                  (3N


                        T HEORY OF C OMPUTING, Volume 13 (9), 2017, pp. 1–34                                  15
            H ERV É F OURNIER , N UTAN L IMAYE , M EENA M AHAJAN , AND S RIKANTH S RINIVASAN

    Finally, we observe that there is a well-separated set S of good signatures such that the matrix M(S)
is quite large. Recall from Definition 3.4 that S0 is the set of all good signatures.

Proposition 3.15. There is a well-separated set of good signatures, S ⊆ S0 , satisfying

                                                               |M`N (S0 )|
                                                |M`N (S)| ≥                .
                                                                (d + 1)τ

Proof. Let D be the set D = {0, 1, . . . , d}. Define the mapping f : M`N (S0 ) −→ Dτ as follows: for
m ∈ M`N (S0 ) with signature s = (s1 , . . . , sτ ), set f (m) = (d1 , . . . , dτ ) = d where di ≡ si mod (d + 1). Then
there must be a dˆ ∈ Dτ such that
                                                                  `
                                                         ˆ ≥ |MN (S0 )| .
                                                | f −1 (d)|
                                                               (d + 1)τ
Define S to be this set f −1 (d);
                               ˆ it is easy to see that S is well-separated.

3.3    Bounding the rank of M: Step 3 of the proof
We now prove the lower bound on the rank of the matrix M(s1 , . . . , sτ ) as claimed in Lemma 3.10. We
first block-diagonalize it with matrices that have a simple combinatorial structure (their entries are 0 or 1
depending on intersection patterns of the sets that label the rows and columns). We then lower bound the
ranks of these matrices: this is the main technical step in the proof.

Lemma 3.16. Fix any signature (s1 , . . . , sτ ). The entry of M(s1 , . . . , sτ ) in row w ≡ [w̃, R1 , . . . , Rτ ] and
column m · pT with m ≡ [m̃, S1 , . . . , Sτ ] belongs to {0, 1} and is not zero if and only if w̃ = m̃ and the
following system is satisfied.                 
                                               
                                                  T ⊆ R1
                                                   S1 ⊆ R1 ∪ R2
                                               
                                               
                                               
                                               
                                                S2 ⊆ R2 ∪ R3
                                               
                                                  ..
                                               
                                               
                                                  .
                                                   S    ⊆ Rτ−1 ∪ Rτ
                                               
                                               
                                                τ−1
                                               
                                               
                                                   Sτ ⊆ Rτ
Moreover, the system above implies that T ∪ S1 ∪ · · · ∪ Sτ = R1 ∪ · · · ∪ Rτ .

Proof. The entry in row w and column m · pT belongs to {0, 1} and is not zero if and only if there exists
A ⊆ [N] with |A| = d − k such that T ⊆ A and XA · m = w. Assume there is such an A.
    Say w ≡ [w̃, R1 , . . . , Rτ ] and m ≡ [m̃, S1 , . . . , Sτ ]. Let m = ∏τi=1 (XSi )i and w = ∏τi=1 (XRi )i be the degree
at most τ parts of m and w respectively.
    Note that
                                                          τ         τ
                                 deg(w) − deg(m) = ∑ iri − ∑ isi = τk = d − k
                                                         i=1       i=1

by our choice of parameters rτ and k. Putting this together with the fact that w = XA · m for |A| = d − k,
we see that XA can only “contribute” to the “degree at most τ” part of m: formally, w = XA · m and hence,
w̃ = m̃.

                           T HEORY OF C OMPUTING, Volume 13 (9), 2017, pp. 1–34                                          16
     T HE S HIFTED PARTIAL D ERIVATIVE C OMPLEXITY OF E LEMENTARY S YMMETRIC P OLYNOMIALS

    Further, since
                                                         τ                  τ
                                    XA · m = XA\T XT ∏(XSi )i = ∏(XRi )i = w ,
                                                         i=1               i=1
and T ∩ (S1 ∪ · · · ∪ Sτ ) = 0,
                             / we have T ⊆ R1 . Since XA is multilinear, Si ⊆ Ri ∪ Ri+1 for all i ∈ [τ − 1], and
Sτ ⊆ Rτ is obvious.
    Conversely, assume that w̃ = m̃ and the inclusions T ⊆ R1 , Si ⊆ Ri ∪ Ri+1 for all i ∈ [τ − 1] and
Sτ ⊆ Rτ are satisfied. Then T ∪ S1 ∪ · · · ∪ Sτ ⊆ R1 ∪ · · · ∪ Rτ . Since
                                                             τ         τ
                              |T ∪ S1 ∪ · · · ∪ Sτ | = k + ∑ si = ∑ ri = |R1 ∪ · · · ∪ Rτ | ,
                                                          i=1         i=1

we get T ∪ S1 ∪ · · · ∪ Sτ = R1 ∪ · · · ∪ Rτ . Let Ai = Ri \ Si for i ∈ [τ] and A = A1 ∪ · · · ∪ Aτ . The sets Ai are
disjoint (because the Ri are disjoint). Moreover, |Aτ | = |Rτ \ Sτ | = rτ − sτ = k; and by induction,
                                                                                   τ        τ
                    |Ai | = |Ri \ Si | = |(Ri ∪ · · · ∪ Rτ ) \ (Si ∪ · · · ∪ Sτ )| = ∑ r j − ∑ s j = k .
                                                                                  j=i      j=i

Hence |Ai | = k for all i ∈ [τ]. Then |A| = τk = d − k. Moreover, T = A1 ⊆ A. And it holds that
                                                    τ             τ
                                               XA ∏(XSi )i = ∏(XRi )i .
                                                   i=1           i=1

Since w̃ = m̃, it follows that XA · m = w. Hence the entry in row w and column m · pT is 1.

Lemma 3.17. Let (s1 , . . . , sτ ) be any signature. The matrix M(s1 , . . . , sτ ) is block-diagonalizable with
blocks of size                                                          
                                              r                  r
                                                        ×                    .
                                        r1 r2 . . . rτ    s1 s2 . . . sτ k
Proof. Recall that r = s1 + · · · + sτ + k (Definition 3.9).
    Given a monomial w̃ and R ⊆ [N], a block of rows of M(s1 , . . . , sτ ) is defined by the set of monomials
w such that w ≡ [w̃, R1 , . . . , Rτ ] for some R1 , . . . , Rτ satisfying R1 ∪ · · · ∪ Rτ = R. In the same way, given m̃
and S ⊆ [N], a block of columns is defined by the set of polynomials m · pT such that m ≡ [m̃, S1 , . . . , Sτ ]
for some S1 , . . . , Sτ such that T ∪ S1 ∪ · · · ∪ Sτ = S.
    By Lemma 3.16, all blocks such that w̃ 6= m̃ or R 6= S are zero. Hence the matrix M(s1 , . . . , sτ ) is
diagonal by blocks of size                                                       
                                                   r                      r
                                                               ×                      .
                                             r1 r2 . . . rτ        s1 s2 . . . sτ k
    To obtain a lower bound on the rank of each block in the block diagonalization, we first establish a
technical lemma involving a multinomial inequality.
Lemma 3.18. For a good signature s = (s1 , . . . , sτ ), and the corresponding r as in Definition 3.9,
                                                                                    
                                         r                               r
              ∑        s01 s02 . . . s0τ−1 r − ∑τ−1  0 = s s ... s              τ−1     (1 + o(1)) .
     s0 ⩾s ,...,s0 ⩾s                           i=1 si        1 2      τ−1 r − ∑i=1 si
          1   1   τ−1   τ−1




                          T HEORY OF C OMPUTING, Volume 13 (9), 2017, pp. 1–34                                        17
           H ERV É F OURNIER , N UTAN L IMAYE , M EENA M AHAJAN , AND S RIKANTH S RINIVASAN

Proof. Note that τ = o(N δ ) (Fact 2.1). Also, from the definition of a good signature and from Remark 3.8,
si = Ω(si+1 N δ ) for i ∈ [τ]. Also, by our choice of parameters k, τ, we have r − (s1 + · · · + sτ−1 ) = sτ + k ∈
O(sτ ), so sτ−1 = Ω((sτ + k)N δ ).
    Thus for sufficiently large N we can find an absolute constant K > 0 such that
                                                                          
                                  r − (s1 + · · · + sτ−1 ) sτ−1         s2     K    1
                           max                            ,      ,··· ,      ≤ δ ≤     .
                                          sτ−1              sτ−2        s1    N    20τ

Define                                                                                                             
                                                                                          b
              S p (b, a1 , . . . , a p ) =       ∑                                                    p               .
                                             δ1 ,...δ p ∈N
                                                                 a1 + δ1 a2 + δ2 . . . a p + δ p b − ∑i=1 (ai + δi )

   The claimed result is a bound on Sτ−1 (r, s1 , s2 , . . . , sτ−1 ), and is a special case of the following, with
A = K/N δ . (For our choice of parameters, Kτ/N δ = o(1).)

Claim 3.19. Let A be a non-negative real number and p, b, a1 , . . . , a p be positive integers such that
b ≥ a1 + · · · + a p and
                                                                                               
                             b − (a1 + · · · + a p ) a p a p−1           a2                                 1
                         max                        ,      ,      ,··· ,                            ≤A≤        .
                                     ap               a p−1 a p−2        a1                                20p

Then the following holds.
                                                                                               
                                                                              b
                            S p (b, a1 , . . . , a p ) ⩽                               p            (1 + 5Ap) .
                                                                  a1 a2 . . . a p b − ∑i=1 ai

Proof. We prove this by induction on p.
    For the base case p = 1, we prove the following slightly stronger statement that will be used in the
inductive step as well.

Claim 3.20. Let R be a non-negative real number, and a, b be integers with 1 ≤ a ≤ b, satisfying
(b − a)/a ≤ R ≤ 1/20. Then
                                                     
                                               b       b
                           S1 (b, a) , ∑          0
                                                     ⩽    (1 + 2R) .
                                      δ 0 ∈N
                                              a+δ      a

Proof. By hypothesis, b − a ≤ aR. Notice that for δ 0 ⩾ 0,
                                                   b
                                                             
                                                 a+δ 0 +1            b−a−δ0 b−a
                                                                 =           ⩽   ⩽ R.
                                                   b                 a+δ0 +1
                                                        
                                                  a+δ 0
                                                                               a

Hence,                                                                              !  
                                                                         ∞
                                                b                   b            δ0    b
                    S1 (b, a) = ∑                                ⩽     1 + ∑ (R)     ⩽    (1 + 2R) .
                                    δ 0 ∈N
                                               a+δ0                 a     δ 0 =1
                                                                                       a


                           T HEORY OF C OMPUTING, Volume 13 (9), 2017, pp. 1–34                                           18
     T HE S HIFTED PARTIAL D ERIVATIVE C OMPLEXITY OF E LEMENTARY S YMMETRIC P OLYNOMIALS

    Note that Claim 3.20 implies the base case of induction for Claim 3.19.
    Now let p ≥ 2, and assume that the claim holds for all p0 < p. We have
                                                           
                                                       b
                  S p (b, a1 , . . . , a p ) = ∑             S p−1 (b − (a1 + δ1 ), a2 , . . . , a p ) .
                                              δ ∈N
                                                    a1 + δ1
                                               1

Note that for all non-negative δ1 ,
     (b − (a1 + δ1 )) − (a2 + · · · + a p ) ≤ (b − a1 ) − (a2 + · · · + a p ) = b − (a1 + a2 + · · · + a p ) ≤ a p A .
Hence the induction hypothesis is applicable to all the S p−1 terms, giving
                                                                                          
                                           b                    b − (a1 + δ1 )
  S p (b, a1 , . . . , a p ) ⩽ ∑                                              p                (1 + 5A(p − 1))
                               δ1 ∈N
                                      a1 + δ1        a2 a3 . . . a p b − (∑i=1    ai ) − δ1
                                                                                 
                                           b                    b − a1
                             ⩽ ∑                                            p          (1 + 5A(p − 1))
                               δ1 ∈N
                                      a1 + δ1        a2 a3 . . . a p b − ∑i=1    ai
                                                                                   
                                                         b − (a1 + δ1 )
                               (because                               p                 is a decreasing function of δ1 )
                                              a2 a3 . . . a p b − (∑i=1  ai ) − δ1
                                                                                                   
                                             b − a1                                               b
                             =                           p        (1 +  5A(p    − 1))    ∑ a1 + δ1
                                  a2 a3 . . . a p b − ∑i=1   ai                         δ1 ∈N
                                                                 
                                              b − a1
                             = ∗                           p        (1 + 5A(p − 1)) S1 (b, a1 ) .
                                   a2 a3 . . . a p b − ∑i=1    ai
We need to show that Claim 3.20 is applicable to S1 (b, a1 ). Note that
                            p         p             p
          b − a1      b − ∑i=1  ai + ∑i=2 ai b − ∑i=1  ai a p        a2
                    =                        =           + +···+
            a1                  a1                 a1      a1        a
                              p                                      1 
                        b − ∑i=1 ai
                                                      
                                          ap        a2         ap        a2         a2
                    =                          ···       +           ···    + ··· +
                            ap          a p−1       a1        a p−1      a1         a1
                           p
                    ⩽     ∑ Ai ⩽ 2A because A ≤ 1/20 .
                         i=1

Also,
                                                          1      1
                                                    2A ≤      ≤     .
                                                         10p 20
So we can use Claim 3.20 with R = 2A, a = a1 , and b. Continuing our derivation, we get
                                                                                     
                                                   b − a1                                b
          S p (b, a1 , . . . , a p ) ⩽                        p       (1 + 5A(p − 1))        (1 + 4A)
                                        a2 a3 . . . a p b − ∑i=1 ai                     a1
                                                                       
                                                        b
                                     =                           p        (1 + 5A(p − 1)) (1 + 4A)
                                        a1 a2 a3 . . . a p b − ∑i=1  ai
                                                                       
                                                        b
                                     ≤                           p        (1 + 5Ap)
                                        a1 a2 a3 . . . a p b − ∑i=1  ai
(using the assumption p ≤ 1/(20A)).

                           T HEORY OF C OMPUTING, Volume 13 (9), 2017, pp. 1–34                                            19
                H ERV É F OURNIER , N UTAN L IMAYE , M EENA M AHAJAN , AND S RIKANTH S RINIVASAN

    This completes the proof of Lemma 3.18.

    We now lower bound the rank of each block in the block diagonalization.
Lemma 3.21 (Main Technical lemma). Fix any good signature (s1 , . . . , sτ ). The rank of any diagonal
block of M(s1 , . . . , sτ ) is                    
                                          r
                                                     (1 − o(1)) .
                                 s1 s2 . . . sτ + k
Proof. Let M 0 be a diagonal block of the matrix M(s1 , . . . , sτ ). Recall from Lemma 3.17 that such a
diagonal block is defined by a monomial w̃ and a subset R ⊆ [N]. Rows of this block are labelled with all
monomials w ≡ [w̃, R1 , . . . , Rτ ] such that R1 ∪ · · · ∪ Rτ = R and columns of this block are labelled with
all polynomials m · pT where m ≡ [w̃, S1 , . . . , Sτ ] is such that T ∪ S1 ∪ · · · ∪ Sτ = R. First, we set up some
notation.
     For a partition B̃ = (B1 , . . . , B p ) of R, let b̃ = (b1 , . . . , b p ) be the tuple of part sizes, bi = |Bi |. We say
that b̃ is the signature of B̃.
     We say (a1 , . . . , a p )  (b1 , . . . , b p ) if ai ≤ bi for all i ∈ [p], and (a1 , . . . , a p ) ≺ (b1 , . . . , b p ) if
                         (a1 , . . . , a p )  (b1 , . . . , b p )       but   (a1 , . . . , a p ) 6= (b1 , . . . , b p ) .
    Define the following collections of partitions of R.
         X      = {R̃ = (R1 , . . . , Rτ ) | signature(R̃) = (r1 , . . . , rτ )} ,
         Y      = {S̃ = (S1 , . . . , Sτ , T ) | signature(S̃) = (s1 , . . . , sτ , k)} ,
            0
        Z       = {Q̃ = (Q1 , . . . , Qτ ) | signature(Q̃) = (q1 , . . . , qτ ); (s1 , . . . , sτ−1 )  (q1 , . . . , qτ−1 )} ,
         Z = {Q̃ = (Q1 , . . . , Qτ ) | signature(Q̃) = (q1 , . . . , qτ ); (s1 , . . . , sτ−1 ) ≺ (q1 , . . . , qτ−1 )} .
Note that                                                                       
                                                                       r
                                                        |X| =                      .
                                                              s1 s2 . . . sτ + k
Also, Z 0 \ Z is precisely X. By Lemma 3.18, |Z 0 | = |X|(1 + o(1)). Hence |Z| = |X| · o(1).
   For any S̃ ∈ Y , define the partition S˜X = (S1 , . . . , Sτ−1 , Sτ ∪ T ) ∈ X. We say that S̃ “extends” S˜X .
   The rows and columns of M 0 are indexed by elements of X and Y respectively (Lemma 3.17).
   We define two auxiliary matrices M1 and M2 as follows. The rows and columns of M1 are indexed by
elements of X. The entries of M1 are in {0, 1} and are defined as follows.
                                         1 if R0i ⊆ Ri ∪ Ri+1 for each i ∈ [τ − 1],
                                       
                                  0
                        M1 (R̃, R̃ ) =
                                         0 otherwise.
    The rows and columns of M2 are indexed by elements of X and Z respectively. The entries of M2 are
in {0, 1} and are defined as follows.
                                    
                                      1 if Qi ⊆ Ri ∪ Ri+1 for each i ∈ [τ − 1],
                      M2 (R̃, Q̃) =
                                      0 otherwise.
    Let I be the identity matrix with rows and columns indexed by elements of X.
    Our proof proceeds by establishing the following two claims.

                               T HEORY OF C OMPUTING, Volume 13 (9), 2017, pp. 1–34                                               20
     T HE S HIFTED PARTIAL D ERIVATIVE C OMPLEXITY OF E LEMENTARY S YMMETRIC P OLYNOMIALS

Claim 3.22. The columns of M 0 and M2 together span the columns of M1 ; hence

                                               rank(M1 ) ≤ rank(M 0 ) + rank(M2 ) .

Claim 3.23. The columns of M1 and M2 together span the columns of I; hence

                                                rank(I) ≤ rank(M1 ) + rank(M2 ) .

    It then follows that

        rank(M 0 ) ≥ rank(M1 ) − rank(M2 ) ≥ rank(I) − 2 rank(M2 ) ≥ |X| − 2|Z| = |X|(1 − o(1))

which is what we had set out to prove.
    To prove the claims, we set up some notation. For A ⊆ [τ], define the function ϕA : X × 2R −→ {0, 1}
as follows.                                    
                                                 1 if S ⊆ i∈A Ri ,
                                                            S
                                  ϕA (R̃, S) =
                                                 0 otherwise.
Note that if S = 0,    / then ϕA (R̃, S) = 1 for every A and R̃.
    With some abuse of notation, for sets of size 1 or 2 we drop the set notation. eg ϕi1 ,i2 (R̃, j) is the same
as ϕ{i1 ,i2 } (R̃, { j}).
    Since τ is odd and at least 3, we can express the functions ϕA (·, ·) for singleton sets A in terms of the
functions ϕB (·, ·) where |B| = 2. In particular, for A = {1} and for A = {τ}, we write

                                  ϕτ (R̃, j) = 1 − ϕ1,2 (R̃, j) − ϕ3,4 (R̃, j) − · · · − ϕτ−2,τ−1 (R̃, j)
                  and             ϕ1 (R̃, j) = 1 − ϕ2,3 (R̃, j) − ϕ4,5 (R̃, j) − · · · − ϕτ−1,τ (R̃, j) .

    For A = {1} or A = {τ} and S ⊆ [N] we write

                                                    ϕA (R̃, S) =       ∏ ϕA (R̃, j) .
                                                                       j∈S

    We use these functions to compactly describe the columns of M 0 , M1 , M2 , I. By definition,
                            τ−1
          M1 [R̃, R̃0 ] = ∏ ϕi,i+1 (R̃, R0i ) ;
                            i=1
                            τ−1
          M2 [R̃, Q̃] = ∏ ϕi,i+1 (R̃, Qi ) ;
                            i=1
                                                !                             !
                              τ                         τ−1
            I[R̃, R̃0 ] =    ∏      ϕi (R̃, R0i )   =   ∏     ϕi,i+1 (R̃, R0i )   ϕτ (R̃, R0τ ) = M1 [R̃, R̃0 ]ϕτ (R̃, R0τ ) ;
                             i=1                        i=1

where the second equality follows from the fact that R̃, R̃0 have the same signature. (RHS = 1 ⇒
ϕτ (R̃, R0τ ) = 1 ⇒ R0τ ⊆ Rτ ⇒ R0τ = Rτ because the sets are equi-sized. Then RHS = 1 further ⇒
ϕτ−1,τ (R̃, R0τ−1 ) = 1 ⇒ R0τ−1 ⊆ Rτ−1 ∪ Rτ . But R0τ−1 is disjoint from R0τ = Rτ . So R0τ−1 ⊆ Rτ−1 , and since
they are equi-sized, they must be the same. Continuing this way, we conclude R̃ = R̃0 .)

                            T HEORY OF C OMPUTING, Volume 13 (9), 2017, pp. 1–34                                                 21
           H ERV É F OURNIER , N UTAN L IMAYE , M EENA M AHAJAN , AND S RIKANTH S RINIVASAN

Proof of Claim 3.22. Starting with Lemma 3.16,

               M 0 [R̃, S̃] = ϕ1,2 (R̃, S1 )ϕ2,3 (R̃, S2 ) . . . ϕτ−1,τ (R̃, Sτ−1 )ϕτ (R̃, Sτ )ϕ1 (R̃, T )
                                                    !                       !                   !
                                     τ−1
                           =         ∏ ϕi,i+1 (R̃, Si )       ∏ ϕτ (R̃, j)           ∏ ϕ1 (R̃, T )
                                     i=1                      j∈Sτ                   j∈T
                                                        !
                                     τ−1
                           =         ∏ ϕi,i+1 (R̃, Si )
                                     i=1
                                                                                                       !
                                                                                                     
                                 ×        ∏ 1 − ϕ1,2 (R̃, j) − ϕ3,4 (R̃, j) − · · · − ϕτ−2,τ−1 (R̃, j)
                                          j∈Sτ
                                                                                              !
                                                                                            
                                 × ∏ 1 − ϕ2,3 (R̃, j) − ϕ4,5 (R̃, j) − · · · − ϕτ−1,τ (R̃, j)
                                          j∈T
                                                                               τ−1
                           =                           ∑                       ∏ (−1)|Q \S | ϕi,i+1 (R̃, Qi )
                                                                                             i   i

                                                                               i=1
                                  partition Q̃ = (Q1 , . . . , Qτ−1 , Qτ ) :
                                             ∀i ∈ [τ − 1], Si ⊆ Qi
                                                 τ−1
                           =      ∑ αS̃,Q̃ ∏ ϕi,i+1 (R̃, Qi ) .
                                 Q̃∈Z 0          i=1


The coefficients αS̃,Q̃ are all in {−1, 0, 1}. Observe that:

    • The coefficient αS̃,S˜X is 1, and the corresponding term is precisely M1 [R̃, S˜X ].

    • The coefficient αS̃,Q̃ is 0 for all Q̃ ∈ X \ {S˜X }. (One of the requirements Si ⊆ Qi must be violated.)

    • All other Q̃ are in Z, and the corresponding term is precisely M2 [R̃, Q̃].

Hence
                                     M 0 [R̃, S̃] = M1 [R̃, S˜X ] + ∑ αS̃,Q̃ M2 [R̃, Q̃] .
                                                                     Q̃∈Z

Since the coefficients in this combination do not depend on the row R̃, we obtain

                                     M 0 [∗, S̃] = M1 [∗, S˜X ] + ∑ αS̃,Q̃ M2 [∗, Q̃] .
                                                                     Q̃∈Z


    For every R̃0 ∈ X, arbitrarily pick any S̃ ∈ Y extending it. Then

                                     M1 [∗, R̃0 ] = M 0 [∗, S̃] − ∑ αS̃,Q̃ M2 [∗, Q̃] .
                                                                     Q̃∈Z



                         T HEORY OF C OMPUTING, Volume 13 (9), 2017, pp. 1–34                                   22
      T HE S HIFTED PARTIAL D ERIVATIVE C OMPLEXITY OF E LEMENTARY S YMMETRIC P OLYNOMIALS

Proof of Claim 3.23. Starting with I and proceeding in exactly the same way, we obtain
             I[R̃, R̃0 ] = M1 [R̃, R̃0 ]ϕτ (R̃, R0τ )
                                                                                                            !
                                                                                                          
                        = M1 [R̃, R̃0 ]        ∏ 1 − ϕ1,2 (R̃, j) − ϕ3,4 (R̃, j) − · · · − ϕτ−2,τ−1 (R̃, j)
                                            j∈R0τ

                        = M1 [R̃, R̃0 ] + ∑ βR̃0 ,Q̃ M2 [R̃, Q̃]
                                               Q̃∈Z

for some coefficients βR̃0 ,Q̃ independent of R̃. Hence
                                      I[∗, R̃0 ] = M1 [∗, R̃0 ] + ∑ βR̃0 ,Q̃ M2 [∗, Q̃] .
                                                                   Q̃∈Z

    With both claims established, the proof of Lemma 3.21 is now complete.
   Lemma 3.10 can now be proved using the block-diagonal decomposition (Lemma 3.17) and the rank
lower bound (Lemma 3.21).
Proof of Lemma 3.10. By Lemma 3.17, we know that M(s1 , . . . , sτ ) can be block-diagonalized into blocks
of size                                                              
                                           r                  r
                                                    ×                     .
                                    r1 r2 · · · rτ     s1 s2 · · · sτ k
Let B denote the number of blocks in this block diagonalization.
    By Lemma 3.21, we know that each block has rank
                                                                              
                                   r                                  r
                  (1 − o(1))                  = (1 − o(1))
                             r1 r2 · · · rτ                 s1 s2 · · · sτ + k
                                                                                        
                                                                1              r
                                              = (1 − o(1)) · s +k
                                                               τ
                                                                 k
                                                                        s1 s2 · · · sτ k
                                                            1 − o(1)
                                                        =     sτ +k
                                                                     · (# of columns in each block)
                                                               k
where the first equality is a result of our choice of parameters and the second follows from the combinato-
rial identity:                                                                
                                           r               1           r
                                                       = s +k                     .
                                  s1 s2 · · · sτ + k      τ
                                                            k
                                                                s1 s2 · · · sτ k
    Thus, the rank of the matrix M(s1 , . . . , sτ ), which is the sum of the ranks of the blocks, is
                                      1 − o(1)
                                        sτ +k
                                               · (# of columns in each block) · B
                                           k
                                     1 − o(1)
                                    = s +k · (# of columns in M(s1 , . . . , sτ ))
                                        τ
                                               k
                                      |P(s1 , . . . , sτ )|
                                    =      sτ +k
                                                           (1 − o(1)) ,
                                                   k
since |P(s1 , . . . , sτ )| is the number of columns in M(s1 , . . . , sτ ).

                           T HEORY OF C OMPUTING, Volume 13 (9), 2017, pp. 1–34                                 23
          H ERV É F OURNIER , N UTAN L IMAYE , M EENA M AHAJAN , AND S RIKANTH S RINIVASAN

3.4   Putting it together
We now have all the ingredients to establish that the shifted partial derivative measure of SNd is large.
Theorem 3.1 (Restated). Let α ∈ (0, 1/2) be a constant. Let N, d, k ∈ N be such that 4k ≤ d ≤
α lg N/ lg lg N and 2k | d. Over a field of characteristic zero, for τ = d/k − 1, for any δ satisfying
α ≤ 1 − δ (τ + 1) < 1 − δ τ ≤ 1 − α, and for ` = bN 1−δ c, the following holds.

                                               (1 − o(1)) · N+`
                                                                  N−`
                                       d                       `   · k
                              dimh∂k SN i≤` ⩾       1−δ      k
                                                                         .
                                                (3N     τ /2) · (d + 1)τ

Proof of Theorem 3.1. By Lemma 3.2, dimh∂k SNd i≤` ≥ dim(span(P)). This in turn is at least as large as
rank(M(S)) for any set S of signatures, since M(S) is a submatrix of the matrix that describes a basis for
P. By Proposition 3.15, there is a well-separated set of good signatures S with large |M`N (S)|. Choose
such a set. Then

       dimh∂k SNd i≤` ≥ dim(span(P))                               (by Lemma 3.2)
                     ≥ rank(M(S))
                       (1 − o(1)) N−`
                                         
                     ≥       1−δ
                                     k
                                       k
                                           · |M`N (S)|             (by Lemma 3.14)
                        (3N      τ /2)
                       (1 − o(1)) N−`
                                         
                                     k       |M`N (S0 )|
                     ≥                     ·                       (by Proposition 3.15)
                        (3N 1−δ τ /2)k        (d + 1)τ
                       (1 − o(1)) N−`
                                         
                                     k       |M`N |(1 − o(1))
                     ≥                     ·                       (by Lemma 3.5 and Remark 3.8)
                        (3N 1−δ τ /2)k           (d + 1)τ
                       (1 − o(1)) N−`       N+`
                                               
                                     k        `
                     =                            .
                       (3N 1−δ τ /2)k (d + 1)τ

4     Modification for the general case of parameters
The proof of Theorem 3.1 handles the case when d is divisible by 2k, and thus, by our choice of τ, it is
also divisible by τ + 1. In this section, we state the modifications which we make to the proof so that it
works in a more general setting, and thus prove Theorem 1.1.
Theorem 1.1 (Restated). Let α ∈ (0, 1/2) be a constant. Let N, d, k ∈ N be such that 4k ≤ d ≤
α lg N/ lg lg N and k = bd/(τ + 1)c for some odd number τ ≥ 3. Over a field of characteristic zero,
for any δ satisfying α ≤ 1 − δ (τ + 1) < 1 − δ τ ≤ 1 − α, and for ` = bN 1−δ c, the following holds.

                                                (1 − o(1)) · N+`
                                                                   N−`
                                       d                        `  · k
                               dimh∂k SN i≤` ⩾       1−δ      k
                                                                          .
                                                 (3N     τ /2) · (d + 1)τ
Proof sketch. We follow the proof outline for Theorem 3.1.
    1. The bounds concerning the parameters (Fact 2.1) and good signatures (Lemma 3.5) continue to
       hold.

                       T HEORY OF C OMPUTING, Volume 13 (9), 2017, pp. 1–34                                 24
      T HE S HIFTED PARTIAL D ERIVATIVE C OMPLEXITY OF E LEMENTARY S YMMETRIC P OLYNOMIALS

   2. In Section 2 in Definition 3.9, we had set the parameters (r1 , . . . , rτ ) corresponding to a signature
      (s1 , . . . , sτ ). Here we modify this slightly. Let g be the loss due to the floor function; g , d −k(τ +1).
      (Note: earlier, we had g = 0.) We have 0 ≤ g ≤ τ, and τk + g = d − k. Now, we let ri = si for
      i ∈ [τ − 2], let rτ−1 = sτ−1 − g, and let rτ = sτ + k + g. It can be verified that with this choice,
                                  τ           τ                              τ         τ        
                                  ∑ ri = ∑ si + k                  and         ∑ iri = ∑ isi + d − k .
                                  i=1         i=1                            i=1         i=1


   3. We define matrix M(s1 , s2 , . . . , sτ ) as in Definition 3.9 but with respect to the new parameter setting.
      It is easy to see that Lemma 3.16 and Lemma 3.17 hold in this new setting as well, because the
      proof only uses the fact that
                                        τ           τ                               τ          τ
                                        ∑ iri = ∑ isi + d − k            and       ∑ ri = ∑ si + k .
                                        i=1        i=1                             i=1         i=1


   4. Finally, we note that Lemma 3.21 holds with a few modifications to the proof. Recall the overall
      strategy: for partitions X,Y, Z, Z 0 and matrices M 0 , M1 , M2 , I, we had shown

             rank(M 0 ) ≥ rank(M1 ) − rank(M2 ) ≥ rank(I) − 2 rank(M2 ) ≥ |X| − 2|Z| = |X|(1 − o(1)) .

       We now define an additional set Ỹ of partitions as follows.

                             Ỹ    = {S̃ = (S1 , . . . , Sτ ∪ T ) | signature(S̃) = (s1 , . . . , sτ + k)} .

       Notice that Ỹ = Z \ Z 0 . Let D be the matrix whose rows are labelled by X and columns by Ỹ ,
       and defined in the following way: D[(R1 , . . . , Rτ ), (S1 , . . . , Sτ−1 , Sτ0 )] = 1 if Si ⊆ Ri ∪ Ri+1 for all
       i ∈ [τ − 1] and Sτ0 ⊆ Rτ ; otherwise, it is 0.5 Now our strategy is to show

                                  rank(M 0 ) ≥ rank(M1 ) − rank(M2 ) ≥ rank(D) − 2 rank(M2 )
                                               ≥ |Ỹ |(1 − o(1)) − 2|Z| = |Ỹ |(1 − o(1)) .

       The first step (columns of M 0 and M2 span those of M1 ) works exactly as before. So does the
       second step (columns of M1 and M2 span those of D): this is because for R̃ = (R1 , . . . , Rτ ) and
       S̃ = (S1 , . . . , Sτ−1 , Sτ0 ), we have

                  D[R̃, S̃] = M1 [R̃, S̃]ϕτ (R̃, Sτ0 )
                                                                                                                     !
                                                                                                                   
                             = M1 [R̃, S̃]              ∏ 1 − ϕ1,2 (R̃, j) − ϕ3,4 (R̃, j) − · · · − ϕτ−2,τ−1 (R̃, j) .
                                                        j∈Sτ0

       The step showing rank(M2 ) is small follows from Lemma 3.18 which holds as is, since Z, Z 0 depend
       only on the signature s and not on how we set r.
   5 Note that, in the original parameter setting when g = 0, we have Ỹ = Z \ Z 0 = X and D is the identity matrix I. When g 6= 0,

due to the new setting of parameters, Ỹ 6= X and D is different from I.


                            T HEORY OF C OMPUTING, Volume 13 (9), 2017, pp. 1–34                                               25
    H ERV É F OURNIER , N UTAN L IMAYE , M EENA M AHAJAN , AND S RIKANTH S RINIVASAN

We now need one additional step showing that D has rank |Ỹ |(1 − o(1)), and then we note that
                                                             
                                                    s
                                  |Ỹ | =                       .
                                           s1 s2 . . . sτ + k

First notice that the matrix D has a block diagonal structure. For a given pair of tuples

                             (R1 , R2 , . . . , Rτ−2 )   and       (S1 , S2 , . . . , Sτ−2 )

the block corresponding to this pair is
                      ∃Rτ−1 , Rτ , Sτ−1 , Sτ0 such that R̃ = (R1 , R2 , . . . , Rτ−2 , Rτ−1 , Rτ )
                                                                                                  
             (R̃, S̃)                                                                                .
                      and S̃ = (S1 , S2 , . . . , Sτ−2 , Sτ−1 , Sτ0 )
If (R1 , R2 , . . . , Rτ−2 ) 6= (S1 , S2 , . . . , Sτ−2 ) then any entry of the matrix in the block corresponding
to the pair is zero. If (S1 , S2 , . . . , Sτ−2 ) = (R1 , R2 , . . . , Rτ−2 ), then in the block defined by this pair,
consider sub-blocks where rows are grouped by Rτ−1 ∪ Rτ and columns by Sτ−1 ∪ Sτ0 . Again, entries
outside the diagonal sub-blocks are all zeroes.
So now consider a sub-block, with U = Rτ−1 ∪ Rτ = Sτ−1 ∪ Sτ0 . Each row can be thought of as
labelled by Rτ (this determines Rτ−1 as U \ Rτ ) and each column by Sτ0 (again, this determines
Sτ−1 ), where |Rτ | = rτ = sτ + g + k, and |Sτ0 | = sτ + k. And the entry in the cell labelled by row Rτ
and column Sτ0 is 1 exactly when Sτ0 ⊆ Rτ . Thus, this sub-block is an inclusion matrix. We use a
theorem of Wilson [28] to analyze the rank of each sub-block in the matrix. This theorem is stated
below.
                                                                    u the u × u inclusion matrix in
                                                                                  
Let u, v, w be non-negative integer parameters. We denote by Wvw           v      w
which each row is labelled by a set of size v from a universe of size u and each column is labelled
by a set of size w from the same universe. The (A, B)th entry of the matrix is 1 if A ⊆ B and 0
otherwise.
Theorem 4.1 (Wilson [28]). Assume u, v, w ∈ N. For v ≤ min{w, u − w}, the rank of the matrix
  u in characteristic p > 0 is equal to
Wvw
                                                      
                                             u        u
                                        ∑ i     −
                                                    i−1

where the sum is over indices i such that p does not divide w−i
                                                                
                                                            v−i .

When i = v, w−i
                  
               v−i = 1, i. e., this binomial coefficient is not divisible by any p. Therefore, for any p,
the term corresponding to i = v appears    in the summation.  Since each term in the summation is non-
                         u is at least u − u . (Note also that this holds over any characteristic, a
                                                   
negative, the rank of Wvw               v       v−1
fact that will be useful in Section 5).
                                                u , with u = s
Our sub-blocks are transposes of the matrices Wvw             τ−1 + sτ + k, v = sτ + k, w = sτ + k + g.
Hence each sub-block has rank at least
                                                                
                                 sτ−1 + sτ + k       sτ−1 + sτ + k
                                                −                    .
                                     sτ + k            sτ + k − 1

                   T HEORY OF C OMPUTING, Volume 13 (9), 2017, pp. 1–34                                           26
     T HE S HIFTED PARTIAL D ERIVATIVE C OMPLEXITY OF E LEMENTARY S YMMETRIC P OLYNOMIALS

       From Lemma 3.5, k, g, sτ = o(sτ−1 ). Hence the rank of each sub-block is at least
                                                         
                                            sτ−1 + sτ + k
                                                           (1 − o(1))
                                               sτ + k
       which is (1 − o(1))(number of columns in sub-block). Due to the diagonal structure, we can add
       up over all the sub-blocks and blocks to get
                        rank(D) ≥ (1 − o(1))(number of columns in D) = (1 − o(1))|Ỹ | .

    5. Putting together the steps as done in Section 3.4 establishes Theorem 1.1.


5    Modified proof for non-zero characteristic
In this section we desribe how to adapt our proofs of Theorem 1.1 and Theorem 3.1 to work over fields
with positive characteristic. The bound we obtain is slightly, but not significantly, weaker.
    We first observe that the only place in our proofs of Theorem 1.1 and Theorem 3.1 where we use
characteristic zero is Step 1 in the proof of Lemma 3.2. It is easy to see that all other steps, including the
adaptation described in Section 4, are independent of the characteristic.
    While working in positive characteristic, we replace Lemma 3.2 by a different statement. Recall the
use of Lemma 3.2; it allowed us to lower-bound dim h∂k SNd i≤` by the dimension of
                            n                                                       o
                       P = m · pT T ⊆ [N], |T | = k, m ∈ M`N , supp(m) ∩ T = 0/ .

Here, we consider a somewhat      different set P0 , and also end up with a 1/(k + 1) factor while lower-
                     d
                          
bounding dim h∂k SN i≤` .
   Let N 0 := N − k. We work with N 0 variables, with well-chosen 0-1 settings to the last k variables.
Recall that for any set S ⊆ [N] such that |S| ≤ k, the polynomial rS (x) is defined as follows.
                                       rS (x) :=             ∑                XA ,
                                                    A⊆[N],|A|=d−k,A∩S=0/

where XA := ∏i∈A xi . Let D denote the set {rS | S ⊆ [N], |S| = k}.
   Similarly, for any set S ⊆ [N 0 ] such that |S| ≤ k, define the polynomial rS0 (x) as follows.
                                       rS0 (x) :=            ∑                XA .
                                                    A⊆[N 0 ],|A|=d−k,A∩S=0/

     In Step 1 of the proof of Lemma 3.2 we showed that for any S0 ⊆ [N] with |S0 | ≤ k, the polynomial
rS0 is in the span of D. In the case of non-zero characteristic, we create k + 1 sets D0 , D1 , . . . , Dk , each
of dimension at most that of D, and show that for each S0 ⊆ [N 0 ] with |S0 | ≤ k, the union of these sets
contains rS0 0 .
     For every 0 ≤ i ≤ k, let πi : {x1 , . . . , xN } → {x1 , . . . , xN 0 } ∪ {0, 1} be defined as follows.

                                              x j if 1 ≤ j ≤ N 0 ,
                                             

                               πi (x j ) =        1    if (N 0 + 1) ≤ j ≤ (N 0 + i),
                                                  0     otherwise.
                                             


                        T HEORY OF C OMPUTING, Volume 13 (9), 2017, pp. 1–34                                  27
            H ERV É F OURNIER , N UTAN L IMAYE , M EENA M AHAJAN , AND S RIKANTH S RINIVASAN

The map πi naturally extends to a ring homomorphism from F[x1 , . . . , xN ] to F[x1 , . . . , xN 0 ]. For each
0 ≤ i ≤ k, let Di := πi (D) = {πi (rS ) | rS ∈ D}.
      Consider any S0 ⊆ [N 0 ] of size at most k. We augment S0 to a set S00 ⊆ [N] of size exactly k, using the
last k reserved indices, by defining S00 = S0 ∪ {N 0 + 1, . . . , N 0 + (k − |S|)}. (If |S0 | = k, then S00 = S0 .) Now
the projection πk−|S0 | applied to the augmented-set-polynomial rS00 gives back the polynomial rS0 0 ; that is,
rS0 0 = πk−|S0 | (rS00 ) ∈ Dk−|S0 | . Therefore we get
                                                                            k
                                       {rS0 0 | S0 ⊆ [N 0 ], |S0 | ≤ k} ⊆         Di .
                                                                            [

                                                                            i=1
    Let
                                           p0T (x) :=          ∑              XB .
                                                        T ⊆B⊆[N 0 ],|B|=d−k

Let P0 denote {p0T | T ⊆ [N 0 ], |T | = k}. The inclusion-exclusion argument in Step 2 of the proof of
Lemma 3.2 works exactly as before, over the set [N 0 ], and tells us that P0 is contained in
                                          span{rS0 0 | S0 ⊆ [N 0 ], |S0 | ≤ k}
and therefore, in span(∪i Di ). We define P0 to be set of shifts of P0 of degree (at most) `, similar to P
defined in Section 3.1, but restricting even the shifts to variables from [x1 , . . . , xN 0 ].
            n                                                          o
       P0 = m · p0T T ⊆ [N 0 ], |T | = k, m ∈ M`N 0 , supp(m) ∩ T = 0/ ⊆ {m · q | m ∈ M`N 0 , q ∈ P0 } .

Since P0 ⊆ span ∪i Di , we get
                                             k n                                         o
                                 P0 ⊆ span         m · πi (rS ) | rS ∈ D, m ∈ M`N 0
                                             [

                                             i=0
                                              k n                                  o
                                                   πi (m · rS ) | rS ∈ D, m ∈ M`N 0 .
                                             [
                                    = span
                                             i=0

    The equality holds because for monomials m ∈ M`N 0 , for every 0 ≤ i ≤ k, m = πi (m).
    Now, note that for any finite set X ⊆ F[x1 . . . , xN ] and a linear map π between vector spaces
F[x1 , . . . , xN ] and F[x1 , . . . , xN−k ], dim(span π(X)) ≤ dim(span X). Therefore, we get that
                        n                                  o            n                        o
             dim span πi (m · rS ) | rS ∈ D, m ∈ M`N 0         ≤ dim span m · rS | rS ∈ D, m ∈ M`N 0 .
    Thus,
                                                                                    !
                                0             k n                                o
                       dim span P                  πi (m · rS ) | rS ∈ D, m ∈ MN 0
                                                                               `
                                                [
                                     ≤ dim span
                                                             i=0
                                              k      n                                 o
                                          ≤ ∑ dim span πi (m · rS ) | rS ∈ D, m ∈ M`N 0
                                             i=0
                                                             n                         o
                                          ≤ (k + 1) dim span m · rS | rS ∈ D, m ∈ M`N 0
                                                                  
                                          ≤ (k + 1) dim h∂k SNd i≤` .


                         T HEORY OF C OMPUTING, Volume 13 (9), 2017, pp. 1–34                                       28
     T HE S HIFTED PARTIAL D ERIVATIVE C OMPLEXITY OF E LEMENTARY S YMMETRIC P OLYNOMIALS

    We can now proceed with the proof of Theorem       1.1 or Theorem 3.1 exactly as before, using
dim (span P0 ) as our lower bound for dim h∂k SNd i≤` . For our choice of parameters, recall that k, d ∈
o(lg N), so N 0 = N − k = Ω(N) and d = o(lg N 0 ) as well. Hence the earlier proof goes through. We have
established the following result.
Theorem 5.1. Let α ∈ (0, 1/2) be a constant. Let N, d, k ∈ N be such that 4k ≤ d ≤ α lg N/ lg lg N and
k = bd/(τ + 1)c for some odd number τ ≥ 3. For any δ satisfying α ≤ 1 − δ (τ + 1) < 1 − δ τ ≤ 1 − α,
and for ` = bN 1−δ c, the following holds.
                                                                   N−k+`
                                                                            N−k−`
                         d        dim (span {P0 }) (1 − o(1))         `     ·     k
                dimh∂k SN i≤` ⩾                   ⩾           ·     1−δ
                                                                                          .
                                       k+1            k+1       (3N     τ /2)k · (d + 1)τ

6    Lower bound on the size of depth-four circuits
In this section, we establish the lower bounds claimed in Theorem 1.2 and Corollary 1.3. As in [9], we
say that a ΣΠΣΠ circuit C is a ΣΠ[D] ΣΠ[t] circuit if the product gates at level 1 (just above the input
variables) have fan-in at most t and the product gates at level 3 have fan-in bounded by D.
    The following is implicit in [9] and is stated explicitly in [13].
Lemma 6.1 ([13], Lemma 4). Let P be a polynomial on N variables computed by a ΣΠ[D] ΣΠ[t] circuit of
top fan-in s. Then, we have
                                                                    
                                                  D   N + ` + (t − 1)k
                            dim(h∂k Pi≤` ) ≤ s ·    ·                    .
                                                  k     ` + (t − 1)k
    We are now ready to prove
Theorem 1.2 (Restated). Let ε ∈ (0, 1) be a constant. Let N, d, D,t ∈ N be such that
                               10t      ε lg N
                                   ≤d≤                   and      D ≤ N 1−ε .
                                ε      5 lg lg N

Any ΣΠ[D] ΣΠ[t] circuit of top fan-in s computing SNd satisfies s = N Ω(d/t) .
Proof. Assume there exists a Σs Π[D] ΣΠ[t] circuit computing SNd .
    We first illustrate the proof for one setting: the field has characteristic zero, ε = 3/4, and 4t + 2
divides d. Choosing α = 3/20, τ = 4t + 1, k = d/(4t + 2), δ = 1/(2τ), all the conditions for invoking
Theorem 3.1 are met. From Theorem 3.1 and Lemma 6.1, when N is large enough, it holds that
                                   N−`       N+`
                                                
                                     k         `             1 − o(1)
                              s ≥ D N+`+(t−1)k          1−δ τ /2)k (d + 1)τ
                                                                              .
                                     k
                                                      (3N
                                            `+(t−1)k

    For large enough N, since k, ` = o(N), we have
                                   N−`
                                            N −`−k k
                                                     k
                                    k                   N
                                    D
                                       ≥            ≥      .
                                    k
                                                D       2D

                        T HEORY OF C OMPUTING, Volume 13 (9), 2017, pp. 1–34                          29
           H ERV É F OURNIER , N UTAN L IMAYE , M EENA M AHAJAN , AND S RIKANTH S RINIVASAN

    Since kt < d = o(lg N), for large enough N we have
                                 N+`               kt 
                                                          1 kt
                                                            
                                                `
                                   `
                              N+`+(t−1)k
                                          ≥           ≥         ≥ N −δtk−o(k) .
                                              N +`       2N δ
                               `+(t−1)k

    By Fact 2.1, (d + 1)τ ≤ (lg N)τ ≤ N α .
    We are given that D ≤ N 1−ε .
    Putting it all together, we have obtained that asymptotically,
                                                k                                                    k
                             N                          1 − o(1)   1                 N
       s≥                                             ·          ≥ α·                                      .   (6.1)
                2D · N δt+o(1) · (3N 1−δ τ /2)            Nα      N     N 1−ε · N 1−δ τ · N δt · N o(1)

    By our choice of parameters, 1 − δ τ = 1/2. Also, t ≤ τ/4, so δt ≤ δ τ/4 = 1/8. And 1 − ε = 1/4.
Thus we see that Equation (6.1) yields a lower bound of N Ω(k) = N Ω(d/t) .
    The above proof idea (with some changes in parameters) can be made to give lower bounds of N Ω(d/t)
for D ≤ N 1−ε for any constant ε > 0. Firstly, to handle the absence of a divisibility constraint (in the
above setting, we had assumed that 4t + 2 divides d), we should use Theorem 1.1 instead of Theorem 3.1.
Then, we must choose α, τ, δ appropriately. It can be verified that if we choose α = ε/5, δ = ε/(10t),
and let τ be the smallest odd integer such that 1 − δ τ ≤ ε/2, everything works out.
    Finally, to obtain the same result over fields of positive characteristic, we can follow the same outline,
replacing the use of Theorem 3.1 or Theorem 1.1 by Theorem 5.1. This Theorem gives a slightly weaker
bound for dim(h∂k Pi≤` ). However, in the asymptotic bound stated in Theorem 1.2, the degradation of
this bound is irrelevant.

Corollary 1.3 (Restated). Let parameters N, d,t be as in Theorem 1.2. Any ΣΠ[O(d/t)] ΣΠ[t] circuit
computing SNd must have top fan-in at least N Ω(d/t) . In particular, any homogeneous ΣΠΣΠ circuit C with
bottom fan-in bounded by t computing SNd must have top fan-in at least N Ω(d/t) .

Proof. The first statement is an immediate corollary of Theorem 1.2 since d/t ≤ d = N o(1) . The second
follows from the first by a standard trick [9]: given any homogeneous ΣΠΣΠ circuit, we can ensure
that the fan-in of the layer-3 product gates is at most O(d/t) by repeatedly multiplying out pairs of
polynomials of degree at most t/2 that feed into it. This does not change the top fan-in of the circuit and
ensures that the bottom fan-in remains bounded by t. At the end of this procedure, each Π gate on layer 3
has at most 1 polynomial of degree < t/2 feeding into it; homogeneity now entails that the fan-in of the
Π-gate must be at most 2d/t + 1.

Remark 6.2. The above corollary also yields a lower bound for the model of regular formulas, which
were introduced and studied in the work of Kayal et al. [13]. [13, Theorem 9] shows that if we have a
lower bound of N Ω(d/t) for ΣΠ[O(d/t)] ΣΠ[t] circuits computing a polynomial F[x1 , . . . , xN ] for all t < d/100
then it follows that any regular formula computing F must have size N Ω(logt) . Using this theorem in
conjunction with Corollary 1.3 above we immediately get a lower bound of N Ω(log d) on the size of any
regular formula computing SNd (x1 , . . . , xN ) for d as in Corollary 1.3.


                          T HEORY OF C OMPUTING, Volume 13 (9), 2017, pp. 1–34                                   30
    T HE S HIFTED PARTIAL D ERIVATIVE C OMPLEXITY OF E LEMENTARY S YMMETRIC P OLYNOMIALS

References
 [1] M ANINDRA AGRAWAL AND V. V INAY: Arithmetic circuits: A chasm at depth four. In Proc. 49th
     FOCS, pp. 67–75. IEEE Comp. Soc. Press, 2008. [doi:10.1109/FOCS.2008.32] 2

 [2] N OGA A LON: Perturbed identity matrices have high rank: Proof and applications. Combin. Probab.
     Comput., 18(1-2):3–15, 2009. [doi:10.1017/S0963548307008917] 4

 [3] L ÁSZLÓ BABAI AND P ETER F RANKL: Linear Algebra Methods in Combinatorics. 1992. Book
     manuscript, University of Chicago. 5

 [4] WALTER BAUR AND VOLKER S TRASSEN: The complexity of partial derivatives. Theoret. Comput.
     Sci., 22(3):317–330, 1983. [doi:10.1016/0304-3975(83)90110-X] 2

 [5] H ERVÉ F OURNIER , N UTAN L IMAYE , M EENA M AHAJAN , AND S RIKANTH S RINIVASAN: The
     shifted partial derivative complexity of elementary symmetric polynomials. In Proc. 40th Math.
     Found. Comp. Sci. (MFCS’15), volume 9235 of LNCS, pp. 324–335. Springer, 2015. Preliminary
     version in ECCC. [doi:10.1007/978-3-662-48054-0_27] 1

 [6] H ERVÉ F OURNIER , N UTAN L IMAYE , G UILLAUME M ALOD , AND S RIKANTH S RINIVASAN:
     Lower bounds for depth-4 formulas computing iterated matrix multiplication. SIAM J. Comput.,
     44(5):1173–1201, 2015. Preliminary version in STOC’14. [doi:10.1137/140990280] 2, 3, 4

 [7] P ETER F RANKL: Intersection theorems and mod p rank of inclusion matrices. J. Comb. Theory,
     Ser. A, 54(1):85–94, 1990. [doi:10.1016/0097-3165(90)90007-J] 5

 [8] P ETER F RANKL AND R ICHARD M. W ILSON: Intersection theorems with geometric consequences.
     Combinatorica, 1(4):357–368, 1981. [doi:10.1007/BF02579457] 5

 [9] A NKIT G UPTA , P RITISH K AMATH , N EERAJ K AYAL , AND R AMPRASAD S APTHARISHI: Ap-
     proaching the chasm at depth four. J. ACM, 61(6):33:1–33:16, 2014. Preliminary version in CCC’13.
     [doi:10.1145/2629541] 2, 3, 6, 29, 30

[10] PAVEL H RUBEŠ AND A MIR Y EHUDAYOFF: Homogeneous formulas and symmetric polynomials.
     Comput. Complexity, 20(3):559–578, 2011. [doi:10.1007/s00037-011-0007-3, arXiv:0907.2621] 3

[11] N EERAJ K AYAL: An exponential lower bound for the sum of powers of bounded degree polynomials.
     Elect. Colloq. Comput. Complexity (ECCC), 19(81), 2012. ECCC. 2, 6

[12] N EERAJ K AYAL , N UTAN L IMAYE , C HANDAN S AHA , AND S RIKANTH S RINIVASAN: An exponen-
     tial lower bound for homogeneous depth four arithmetic formulas. SIAM J. Comput., 46(1):307–335,
     2017. Preliminary version in FOCS’14. [doi:10.1137/151002423] 2, 4

[13] N EERAJ K AYAL , C HANDAN S AHA , AND R AMPRASAD S APTHARISHI: A super-polynomial lower
     bound for regular arithmetic formulas. In Proc. 46th STOC, pp. 146–153. ACM Press, 2014.
     Preliminary version in ECCC. [doi:10.1145/2591796.2591847] 2, 29, 30

                     T HEORY OF C OMPUTING, Volume 13 (9), 2017, pp. 1–34                          31
          H ERV É F OURNIER , N UTAN L IMAYE , M EENA M AHAJAN , AND S RIKANTH S RINIVASAN

[14] P ETER K EEVASH AND B ENNY S UDAKOV: Set systems with restricted cross-intersections
     and the minimum rank of inclusion matrices. SIAM J. Discrete Math., 18(4):713–727, 2005.
     [doi:10.1137/S0895480103434634] 5

[15] PASCAL KOIRAN: Arithmetic circuits: The chasm at depth four gets wider. Theoret. Comput. Sci.,
     448:56–65, 2012. [doi:10.1016/j.tcs.2012.03.041, arXiv:1006.4700] 2

[16] M RINAL K UMAR AND S HUBHANGI S ARAF: On the power of homogeneous depth 4 arithmetic
     circuits. In Proc. 55th FOCS, pp. 364–373. IEEE Comp. Soc. Press, 2014. Preliminary version in
     ECCC. [doi:10.1109/FOCS.2014.46, arXiv:1404.1950] 2, 3, 4

[17] M RINAL K UMAR AND S HUBHANGI S ARAF: The limits of depth reduction for arithmetic formulas:
     It’s all about the top fan-in. SIAM J. Comput., 44(6):1601–1625, 2015. Preliminary version in
     STOC’14. [doi:10.1137/140999220] 2, 3

[18] E YAL K USHILEVITZ AND N OAM N ISAN: Communication Complexity. Cambridge Univ. Press,
     1997. 4, 5, 7

[19] N OAM N ISAN AND AVI W IGDERSON: Lower bounds on arithmetic circuits via partial
     derivatives. Comput. Complexity, 6(3):217–234, 1997. Preliminary version in FOCS’95.
     [doi:10.1007/BF01294256] 2, 3, 4

[20] R AN R AZ: Separation of multilinear circuit and formula size. Theory of Computing, 2(6):121–135,
     2006. [doi:10.4086/toc.2006.v002a006] 2

[21] R AN R AZ: Multi-linear formulas for permanent and determinant are of super-polynomial
     size.    J. ACM, 56(2):8:1–8:17, 2009.   Preliminary versions in STOC’04 and ECCC.
     [doi:10.1145/1502793.1502797] 2

[22] A LEXANDER A. R AZBOROV: Lower bounds on the size of bounded depth circuits over a com-
     plete basis with logical addition. Math. Notes of the Acad. Sci. USSR, 41(4):333–338, 1987.
     [doi:10.1007/BF01137685] 5

[23] A MIR S HPILKA: Affine projections of symmetric polynomials. J. Comput. System Sci., 65(4):639–
     659, 2002. Preliminary version in CCC’01. [doi:10.1016/S0022-0000(02)00021-1] 3

[24] A MIR S HPILKA AND AVI W IGDERSON: Depth-3 arithmetic circuits over fields of char-
     acteristic zero. Comput. Complexity, 10(1):1–27, 2001. Preliminary version in CCC’99.
     [doi:10.1007/PL00001609] 3, 4

[25] S ÉBASTIEN TAVENAS: Improved bounds for reduction to depth 4 and depth 3. Inform. and Comput.,
     240:2–11, 2015. Preliminary version in MFCS’13. [doi:10.1016/j.ic.2014.09.004, arXiv:1304.5777]
     2

[26] L ESLIE G. VALIANT: Completeness classes in algebra. In Proc. 11th STOC, pp. 249–261. ACM
     Press, 1979. [doi:10.1145/800135.804419] 2

                      T HEORY OF C OMPUTING, Volume 13 (9), 2017, pp. 1–34                         32
    T HE S HIFTED PARTIAL D ERIVATIVE C OMPLEXITY OF E LEMENTARY S YMMETRIC P OLYNOMIALS

[27] L ESLIE G. VALIANT, S VEN S KYUM , S TUART J. B ERKOWITZ , AND C HARLES R ACKOFF: Fast
     parallel computation of polynomials using few processors. SIAM J. Comput., 12(4):641–644, 1983.
     Preliminary version in MFCS’81. [doi:10.1137/0212043] 2

[28] R ICHARD M. W ILSON: A diagonal form for the incidence matrices of t-subsets vs. k-subsets.
     European J. Combin., 11(6):609–615, 1990. [doi:10.1016/S0195-6698(13)80046-7] 4, 5, 26


AUTHORS

      Hervé Fournier
      Institut de Mathématiques de Jussieu-Paris Rive Gauche
      Université Paris Diderot, France
      fournier math univ-paris-diderot fr
      https://webusers.imj-prg.fr/~herve.fournier


      Nutan Limaye
      Department of Computer Science and Engineering
      IIT Bombay, Mumbai, India
      nutan cse iitb ac in
      https://www.cse.iitb.ac.in/~nutan


      Meena Mahajan
      The Institute of Mathematical Sciences, Chennai, India
      meena imsc res in
      http://www.imsc.res.in/~meena


      Srikanth Srinivasan
      Department of Mathematics
      IIT Bombay, Mumbai, India
      srikanth math iitb ac in


ABOUT THE AUTHORS

      H ERVÉ F OURNIER obtained his Ph. D. at ENS Lyon under the supervision of Pascal Koiran.
         His interests include algebraic complexity and its connections to computational geometry.




                      T HEORY OF C OMPUTING, Volume 13 (9), 2017, pp. 1–34                           33
    H ERV É F OURNIER , N UTAN L IMAYE , M EENA M AHAJAN , AND S RIKANTH S RINIVASAN

N UTAN L IMAYE graduated from The Institute of Mathematical Sciences, Chennai, India,
   in 2009; her advisor was Meena Mahajan. Her thesis focused on the interconnection
   between language classes and complexity classes. She is interested in Boolean and
   arithmetic circuit complexity and graph algorithms. She likes all things Japanese and has
   been trying to learn Japanese for the past year.


M EENA M AHAJAN is a professor in the theoretical computer science group at the Institute
   of Mathematical Sciences, Chennai, India, reaching there by way of education at the IITs
   at Mumbai and Chennai. At work, she researches questions concerning computational
   complexity. As recreation, she loves solving jigsaw puzzles (creating order out of
   disorder), and also solving combinatorial puzzles, which crop up in real-life situations
   far oftener than one may think.


S RIKANTH S RINIVASAN got his undergraduate degree from the Indian Institute of Technol-
    ogy Madras, where his interest in the theory side of CS was piqued under the tutelage
    of N. S. Narayanswamy. Subsequently, he obtained his Ph. D. from The Institute of
    Mathematical Sciences in 2011; his advisor was V. Arvind. His research interests span
    all of TCS (in theory), but in practice are limited to circuit complexity, derandomization,
    and related areas of mathematics. He enjoys running and pretending to play badminton.




                T HEORY OF C OMPUTING, Volume 13 (9), 2017, pp. 1–34                              34