Proof Complexity Lower Bounds from Algebraic Circuit Complexity

Authors Michael A. Forbes, Amir Shpilka, Iddo Tzameret, Avi Wigderson,
Plaintext
                           T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88
                                        www.theoryofcomputing.org

                                         S PECIAL ISSUE : CCC 2016



         Proof Complexity Lower Bounds from
             Algebraic Circuit Complexity
  Michael A. Forbes*                     Amir Shpilka†                 Iddo Tzameret‡           Avi Wigderson§
                  Received June 15, 2016; Revised January 3, 2021; Published November 1, 2021




       Abstract. We give upper and lower bounds on the power of subsystems of the Ideal Proof
       System (IPS), the algebraic proof system recently proposed by Grochow and Pitassi (J. ACM,
       2018), where the circuits comprising the proof come from various restricted algebraic circuit
       classes. This mimics an established research direction in the Boolean setting for subsystems
       of Extended Frege proofs, where proof-lines are circuits from restricted Boolean circuit
       classes. Except one, all of the subsystems considered in this paper can simulate the well-
       studied Nullstellensatz proof system, and prior to this work there were no known lower
      A conference version of this paper appeared in the Proceedings of the 31st Computational Complexity Conference
(CCC’16) [29].
    * This work was performed when the author was at the Department of Computer Science of Princeton University, supported
by the Princeton Center for Theoretical Computer Science.
    † The research leading to these results has received funding from the European Community’s Seventh Framework Programme

(FP7/2007-2013) under grant agreement number 257575.
    ‡ Partly supported by The National Natural Science Foundation of China Grant (61373002) and has received funding from

the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant
agreement No. 101002742).
    § This research was partially supported by NSF grant CCF-1412958.



ACM Classification: F.1.3, F.1.2
AMS Classification: 68Q17, 68Q15, 03F20
Key words and phrases: Proof complexity, algebraic circuit complexity, lower bounds, algebraic proof
systems, IPS, polynomial calculus, hardness versus randomness, functional lower bounds, ABPs


© 2021 Michael A. Forbes, Amir B. Shpilka, Iddo Tzameret, and Avi Wigderson
c b Licensed under a Creative Commons Attribution License (CC-BY)                           DOI: 10.4086/toc.2021.v017a010
              M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

      bounds when measuring proof size by the algebraic complexity of the polynomials (except
      with respect to degree, or to sparsity).
           We give two general methods of converting certain algebraic circuit lower bounds into
      proof complexity ones. However, we need to strengthen existing lower bounds to hold for
      either the functional model or for multiplicities (see below). Our techniques are reminiscent
      of existing methods for converting Boolean circuit lower bounds into related proof complexity
      results, such as feasible interpolation. We obtain the relevant types of lower bounds for
      a variety of classes (sparse polynomials, depth-3 powering formulas, read-once oblivious
      algebraic branching programs, and multilinear formulas), and infer the relevant proof
      complexity results. We complement our lower bounds by giving short refutations of the
      previously studied subset-sum axiom using IPS subsystems, allowing us to conclude strict
      separations between some of these subsystems.
           Our first method is a functional lower bound, a notion due to Grigoriev and Razborov
      (Appl. Algebra Eng. Commun. Comput., 2000), which says that not only does a polynomial
       f require large algebraic circuits, but that any polynomial g agreeing with f on the Boolean
      cube also requires large algebraic circuits. For our classes of interest, we develop functional
      lower bounds where g(x) equals 1/p(x) where p is a constant-degree polynomial, which in
      turn yield corresponding IPS lower bounds for proving that p is nonzero over the Boolean
      cube. In particular, we show superpolynomial lower bounds for refuting variants of the
      subset-sum axiom in various IPS subsystems.
           Our second method is to give lower bounds for multiples, that is, to give explicit polyno-
      mials whose all (nonzero) multiples require large algebraic circuit complexity. By extending
      known techniques, we are able to obtain such lower bounds for our classes of interest, which
      we then use to derive corresponding IPS lower bounds. Such lower bounds for multiples are
      of independent interest, as they have tight connections with the algebraic hardness versus
      randomness paradigm.


1    Introduction
Propositional proof complexity aims to understand and analyze the computational resources required to
prove propositional tautologies, in the same way that circuit complexity studies the resources required to
compute Boolean functions. A typical goal would be to establish, for a given proof system, superpoly-
nomial lower bounds on the size of any proof of some propositional tautology. The seminal paper of
Cook and Reckhow [15] showed that this goal relates quite directly to fundamental hardness questions in
computational complexity such as the NP vs. coNP question: establishing superpolynomial lower bounds
for every propositional proof system would separate NP from coNP (and thus also P from NP). We refer
the reader to Krajíček [48] for more on this subject.
    Propositional proof systems come in a large variety, as different ones capture different forms of
reasoning, either reasoning used to actually prove theorems, or reasoning used by algorithmic techniques
for different types of search problems (as failure of the algorithm to find the desired object constitutes
a proof of its nonexistence). Much of the research in proof complexity deals with propositional proof
systems originating from logic or geometry. Logical proof systems include such systems as resolution

                      T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                             2
             P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

(whose variants are related to popular algorithms for automated theorem proving and SAT solving),
as well as the Frege proof system (capturing the most common logic text-book systems) and its many
subsystems. Geometric proof systems include cutting-plane proofs, capturing reasoning used in algorithms
for integer programming, as well as proof systems arising from systematic strategies for rounding linear-
or semidefinite-programming such as the lift-and-project or sum-of-squares hierarchies.
     In this paper we focus on algebraic proof systems, in which propositional tautologies (or rather
contradictions) are expressed as unsatisfiable systems of polynomial equations and algebraic tools are
used to refute them. This study originates with the work of Beame, Impagliazzo, Krajíček, Pitassi and
Pudlák [9], who introduced the Nullstellensatz refutation system (based on Hilbert’s Nullstellensatz),
followed by the Polynomial Calculus system of Clegg, Edmonds, and Impagliazzo [13], which is a
“dynamic” version of the Nullstellensatz. In both systems the main measures of proof size that have been
studied are the degree and sparsity of the polynomials appearing in the proof. Substantial work has led to
a very good understanding of the power of these systems with respect to these measures (see for example
[12, 69, 31, 41, 11, 4] and references therein).
     However, the above measures of degree and sparsity are rather rough measures of a complexity of a
proof. Accordingly, Grochow and Pitassi [35] have recently advocated measuring the complexity of such
proofs by their algebraic circuit size and shown that the resulting proof system can polynomially simulate
strong proof systems such as the Frege system (see also prior work on measuring algebraic proofs by
their algebraic circuit size [59, 32, 66, 65, 83] as well as the subsequent survey [60]). This naturally leads
to the question of establishing lower bounds for this stronger proof system, even for restricted classes of
algebraic circuits.
     In this article we establish such lower bounds for previously studied restricted classes of algebraic
circuits, and show that these lower bounds are interesting by providing non-trivial upper bounds in
these proof systems for refutations of interesting sets of polynomial equations. This provides what are
apparently the first examples of lower bounds on the algebraic circuit size of propositional proofs in the
Ideal Proof System (IPS) framework of Grochow and Pitassi [35].
     We note that obtaining proof complexity lower bounds from circuit complexity lower bounds is an
established tradition that takes many forms. Most prominent are the lower bounds for subsystems of
the Frege proof system defined by small-depth Boolean circuits, and lower bounds of Pudlák [61] on
Resolution and Cutting Planes system using the so-called feasible interpolation method. We refer the
reader again to Krajíček [48] for more details. Our approach here for algebraic systems shares features
with both of these approaches.
     The rest of this introduction is arranged as follows. In Section 1.1 we give the necessary background in
algebraic proof complexity, and explain the IPS system. In Section 1.2 we define the algebraic complexity
classes that will underlie the subsystems of IPS we will study. In Section 1.3 we state our results and
explain our techniques, for both the algebraic and proof complexity worlds.


1.1   Algebraic proof systems
We now describe the algebraic proof systems that are the subject of this paper. If one has a set of
polynomials (called axioms) f1 , . . . , fm ∈ F[x1 , . . . , xn ] over some field F, then (the weak version of)
Hilbert’s Nullstellensatz shows that the system f1 (x) = · · · = fm (x) = 0 is unsatisfiable (over the algebraic

                        T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                 3
                  M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

closure of F) if and only if there are polynomials g1 , . . . , gm ∈ F[x] such that ∑ j g j (x) f j (x) = 1 (as a
formal identity), or equivalently, that 1 is in the ideal generated by the { f j } j .
    Beame, Impagliazzo, Krajíček, Pitassi, and Pudlák [9] suggested to treat these {g j } j as a proof of
the unsatisfiability of this system of equations, called a Nullstellensatz refutation. This is in particular
relevant for complexity theory as one can restrict attention to Boolean solutions to this system by
adding the Boolean axioms, that is, adding the polynomials {xi2 − xi }ni=1 to the system. One can then
naturally encode NP-complete problems such as the satisfiability of 3CNF formulas as the satisfiability
of a system of constant-degree polynomials, and a Nullstellensatz refutation is then an equation of
the form ∑mj=1 g j (x) f j (x) + ∑ni=1 hi (x)(xi2 − xi ) = 1 for g j , hi ∈ F[x]. This proof system is sound (only
refuting unsatisfiable systems over {0, 1}n ) and complete (refuting any unsatisfiable system, by Hilbert’s
Nullstellensatz).
    Given that the above proof system is sound and complete, it is then natural to ask what is its power
to refute unsatisfiable systems of polynomial equations over {0, 1}n . To understand this question one
must define the notion of the size of the above refutations. Two popular notions are that of the degree,
and the sparsity (number of monomials). One can then show (see for example Pitassi [59]) that for
any unsatisfiable system which includes the Boolean axioms, there exists a refutation where the g j are
multilinear and the hi have degree at most O(n + d), given that each f j has degree at most d. In particular,
this implies that for any unsatisfiable system with d = O(n) there is a refutation of degree O(n) and
involving at most exp(O(n)) monomials. This intuitively agrees with the fact that coNP is a subset of
non-deterministic exponential time.
    Building on the suggestion of Pitassi [59] and various investigations into the power of strong alge-
braic proof systems ([32, 66, 65]), Grochow and Pitassi [35] have recently considered more succinct
descriptions of polynomials where one measures the size of a polynomial by the size of an algebraic
circuit needed to compute it. This is potentially much more powerful as there are polynomials such as
the determinant which are of high degree and involve exponentially many monomials and yet can be
computed by small algebraic circuits. They named the resulting system the Ideal Proof System (IPS)
which we now define.

Definition 1.1 (Ideal Proof System (IPS), Grochow-Pitassi [35]). Let f1 (x), . . . , fm (x) ∈ F[x1 , . . . , xn ] be
a system of polynomials. An IPS refutation for showing that the polynomials { f j } j have no common
solution in {0, 1}n is an algebraic circuit C(x, y, z) ∈ F[x, y1 , . . . , ym , z1 , . . . , zn ], such that

   1. C(x, 0, 0) = 0.

   2. C(x, f1 (x), . . . , fm (x), x12 − x1 , . . . , xn2 − xn ) = 1.

The size of the IPS refutation is the size of the circuit C. If C is of individual degree ≤ 1 in each y j and zi ,
then this is a linear IPS refutation (called Hilbert IPS by Grochow-Pitassi [35]), which we will abbreviate
as IPSLIN . If C is of individual degree ≤ 1 only in the y j then we say this is a IPSLIN0 refutation. If C
comes from a restricted class of algebraic circuits C, then this is a called a C-IPS refutation, and further
called a C-IPSLIN refutation if C is linear in y, z, and a C-IPSLIN0 refutation if C is linear in y.

   Notice also that our definition here by default adds the equations {xi2 − xi }i to the system { f j } j . For
convenience we will often denote the equations {xi2 − xi }i as x2 − x. One need not add the equations

                             T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                4
               P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

x2 − x to the system in general, but this is the most interesting regime for proof complexity and thus we
adopt it as part of our definition.
    The C-IPS system is sound for any C, and Hilbert’s Nullstellensatz shows that C-IPSLIN is complete
for any complete class of algebraic circuits C (that is, classes which can compute any polynomial,
possibly requiring exponential complexity). We note that we will also consider non-complete classes
such as multilinear-formulas (which can only compute multilinear polynomials, but are complete for
multilinear polynomials), where we will show that the multilinear-formula-IPSLIN system is not complete
for the language of all unsatisfiable sets of multilinear polynomials (Theorem 4.7), while the stronger
multilinear-formula-IPSLIN0 version is complete (Theorem 4.12). However, for the standard conversion
of unsatisfiable CNFs into polynomial systems of equations, the multilinear-formula-IPSLIN system is
complete (Theorem 1.2).
    The following theorem, due to Grochow and Pitassi [59, 35], shows that the IPS system has surprising
power and that lower bounds on this system give rise to computational lower bounds.
Theorem 1.2 (Pitassi [59],Grochow-Pitassi [35]). Let ϕ = C1 ∧ · · · ∧ Cm be an unsatisfiable CNF on
n-variables, and let f1 , . . . , fm ∈ F[x1 , . . . , xm ] be its encoding as a polynomial system of equations. If
there is a size-s Frege proof (or Extended Frege proof) that { f j } j , {xi2 − xi }i is unsatisfiable, then there is
a formula-IPSLIN (circuit-IPSLIN , resp.) refutation of size poly(n, m, s) that is checkable in randomized
poly(n, m, s) time.1
    Further, { f j } j has an IPSLIN refutation, where the refutation uses multilinear polynomials in VNP.
Thus, if every IPS refutation of { f j } j requires formula (or circuit) size ≥ s, then there is an explicit
polynomial (that is, in VNP) that requires size ≥ s algebraic formulas (circuits, resp.).

Remark 1.3. One point to note is that the transformation from Extended Frege to IPS refutations yields
circuits of polynomial size but without any guarantee on their degree. In particular, such circuits can
compute polynomials of exponential degree. In contrast, the conversion from Frege to IPS refutations
yields polynomial sized algebraic formulas and those compute polynomials of polynomially bounded
degree. This range of parameters, polynomials of polynomially bounded degree, is the more common
setting studied in algebraic complexity.

    The fact that C-IPS refutations are efficiently checkable (with randomness) follows from the fact that
we need to verify the polynomial identities stipulated by the definition. That is, one needs to solve an
instance of the polynomial identity testing (PIT) problem for the class C: given a circuit from the class
C decide whether it computes the identically zero polynomial. This problem is solvable in (one-sided)
probabilistic polynomial time (coRP) for general algebraic circuits, and there are various restricted classes
for which deterministic algorithms are known (see Section 3.1).
    Motivated by the fact that PIT of non-commutative formulas 2 can be solved deterministically ([64])
and admit exponential-size lower bounds ([52]), Li, Tzameret and Wang [50] have shown that IPS over
non-commutative polynomials (along with additional commutator axioms) can simulate Frege (they also
provided a quasipolynomial simulation of IPS over non-commutative formulas by Frege; see Li, Tzameret
and Wang [50] for more details).
   1 We note that Grochow and Pitassi [35] proved this for Extended Frege and circuits, but essentially the same proof relates

Frege and formula size.
   2 These are formulas over a set of non-commuting variables.



                           T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                            5
                M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

Theorem 1.4 (Li, Tzameret and Wang [50]). Let ϕ = C1 ∧ · · · ∧Cm be an unsatisfiable CNF on n-variables,
and let f1 , . . . , fm ∈ F[x1 , . . . , xm ] be its encoding as a polynomial system of equations. If there is a size-s
Frege proof that { f j } j , {xi2 − xi }i is unsatisfiable, then there is a non-commutative-IPS refutation of
formula-size poly(n, m, s), where the commutator axioms xi x j − x j xi are also included in the polynomial
system being refuted. Further, this refutation is checkable in deterministic poly(n, m, s) time.
    The above results naturally motivate studying C-IPS for various restricted classes of algebraic circuits,
as lower bounds for such proofs then intuitively correspond to restricted lower bounds for the Extended
Frege proof system. In particular, as exponential lower bounds are known for non-commutative formulas
([52]), this possibly suggests that such methods could even attack the full Frege system itself.

1.2     Algebraic circuit classes
Having motivated C-IPS for restricted circuit classes C, we now give formal definitions of the algebraic
circuit classes of interest to this paper, all of which were studied independently in algebraic complexity.
Some of them capture the state-of-art in our ability to prove lower bounds and provide efficient determin-
istic identity tests, so it is natural to attempt to fit them into the proof complexity framework. We define
each and briefly explain what we know about it. As the list is long, the reader may consider skipping to
the results (Section 1.3), and refer to the definitions of these classes as they arise.
     Algebraic circuits and formula (over a fixed chosen field) compute polynomials via addition and
multiplication gates, starting from the input variables and constants from the field. For background on
algebraic circuits in general and their complexity measures we refer the reader to the survey of Shpilka
and Yehudayoff [78]. We next define the restricted circuit classes that we will be studying in this paper.

1.2.1    Small-depth classes
We start by defining what are the simplest and most restricted classes of algebraic circuits. The first class
simply represents polynomials as a sum of monomials. This is also called the sparse representation of the
polynomial. Notationally we call this model ∑ ∏ formulas (to capture the fact that polynomials computed
in the class are represented simply as sums of products), but we will more often call these polynomials
“sparse”.

Definition 1.5. The class C = ∑ ∏ computes polynomials in their sparse representation, that is, as a sum
of monomials. The graph of computation has two layers with an addition gate at the top and multiplication
gates at the bottom. The size of a ∑ ∏ circuit of a polynomial f is the product of the number of monomials
in f , the number of variables, and the degree.

    This class of circuits is what is used in the Nullstellensatz proof system. In our terminology ∑ ∏-
IPSLIN is exactly the Nullstellensatz proof system.
    Another restricted class of algebraic circuits is that of depth-3 powering formulas (sometimes also
                                                                                    V
called “diagonal depth-3 circuits”). We will sometimes abbreviate this name as a “∑ ∑ formula”, where
V
   denotes the powering operation. Specifically, polynomials that are efficiently computed by small
formulas from this class can be represented as sum of powers of linear functions. This model appears
implicitly in Shpilka [75] and explicitly in the paper of Saxena [71].

                         T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                       6
             P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

                                                                          V
Definition 1.6. The class of depth-3 powering formulas, denoted ∑ ∑, computes polynomials of the
following form
                                                       s
                                              f (x) = ∑ `i (x)di ,
                                                      i=1
where `i (x) are linear functions. The degree of this ∑ ∑ representation of f is maxi {di } and its size is
                                                            V

n · ∑si=1 (di + 1).
    One reason for considering this class of circuits is that it is a simple, but non-trivial model that
is somewhat well-understood. In particular, the partial derivative method of Nisan–Wigderson [54]
implies lower bounds for this model and efficient polynomial identity testing algorithms are known
([71, 3, 26, 27, 24], as discussed further in Section 3.1).
    We also consider a generalization of this model where we allow powering of low-degree polynomials.
Definition 1.7. The class ∑ ∑ ∏t computes polynomials of the following form
                              V

                                                       s
                                              f (x) = ∑ fi (x)di ,
                                                      i=1
                                                                                  n+t
                                                                                            · ∑si=1 (di + 1).
                                                                                        
where the degree of the fi (x) is at most t. The size of this representation is    t

    We remark that the reason for defining the size this way is that we think of the fi as represented as
sum of monomials (there are n+t  t  n-variate monomials of degree at most t) and the size captures the
complexity of writing this as an algebraic formula. This model is the simplest that requires the method
of shifted partial derivatives of Kayal [46, 36] to establish lower bounds, and this has recently been
generalized to obtain polynomial identity testing algorithms ([21], as discussed further in Section 3.1).

1.2.2   Oblivious algebraic branching programs
Algebraic branching programs (ABPs) form a model whose computational power lies between that of
algebraic circuits and algebraic formulas, and certain read-once and oblivious ABPs are a natural setting
for studying the partial derivative matrix lower bound technique of Nisan [52].
Definition 1.8 (Nisan [52]). An algebraic branching program (ABP) with unrestricted weights of
depth D and width ≤ r, on the variables x1 , . . . , xn , is a directed acyclic graph such that:
    • The vertices are partitioned in D + 1 layers V0 , . . . ,VD , so that V0 = {s} (s is the source node), and
      VD = {t} (t is the sink node). Further, each edge goes from Vi−1 to Vi for some 0 < i ≤ D.
    • max |Vi | ≤ r.
    • Each edge e is weighted with a polynomial fe ∈ F[x].
The (individual) degree d of the ABP is the maximum (individual) degree of the edge polynomials fe .
The size of the ABP is the product n · r · d · D.
    Each s-t path is said to compute the polynomial which is the product of the labels of its edges, and
the algebraic branching program itself computes the sum over all s-t paths of such polynomials.
    There are also the following restricted ABP variants.

                        T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                   7
               M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

    • An algebraic branching program is said to be oblivious if for every layer `, all the edge labels in
      that layer are univariate polynomials in a single variable xi` .

    • An oblivious branching program is said to be a read-once oblivious ABP (roABP) if each xi appears
      in the edge label of exactly one layer, so that D = n. That is, each xi appears in the edge labels in
      exactly one layer. The layers thus define an order of the variables, which will be x1 < · · · < xn if
      not otherwise specified.

    • An oblivious branching program is said to be a read-k oblivious ABP if each variable xi appears in
      the edge labels of exactly k layers, so that D = kn.

    • An ABP is non-commutative if each fe is from the ring Fhxi of non-commuting variables and has
      deg fe ≤ 1, so that the ABP computes a non-commutative polynomial.

     Intuitively, roABPs are the algebraic analog of read-once Boolean branching programs, the non-
uniform model of the class RL, which are well-studied in Boolean complexity. Algebraically, roABPs
are also well-studied. In particular, roABPs are essentially equivalent to non-commutative ABPs ([27]),
a model at least as strong as non-commutative formulas. That is, as an roABP reads the variables in a
fixed order (hence not using commutativity) it can be almost directly interpreted as a non-commutative
ABP. Conversely, as non-commutative multiplication is ordered, one can interpret a non-commutative
polynomial in a read-once fashion by (commutatively) exponentiating a variable to its index in a monomial.
For example, the non-commutative xy − yx can be interpreted commutatively as x1 y2 − y1 x2 = xy2 − x2 y,
and one can show that this conversion preserves the relevant ABP complexity ([27]). The study of
non-commutative ABPs dates to Nisan [52], who proved lower bounds for non-commutative ABPs
(and thus also for roABPs, in any order). In a sequence of more recent papers, polynomial identity
testing algorithms were devised for roABPs ([64, 25, 27, 24, 2], see also Section 3.1). In terms of proof
complexity, Tzameret [83] studied a proof system with lines given by roABPs, and recently Li, Tzameret
and Wang [50] (Theorem 1.4) showed that IPS over non-commutative formulas is essentially equivalent
in power to the Frege proof system. Due to the close connections between non-commutative ABPs and
roABPs, this last result suggests the importance of proving lower bounds for roABP-IPS as a way of
attacking lower bounds for the Frege proof system (although we obtain roABP-IPSLIN lower bounds
without obtaining non-commutative-IPSLIN lower bounds).
     Finally, we mention that recently Anderson, Forbes, Saptharishi, Shpilka, and Volk [6] obtained
exponential lower bounds for read-k oblivious ABPs (when k = o(log n/ log log n)) as well as a slightly
subexponential polynomial identity testing algorithm.

1.2.3   Multilinear formulas
The last model that we consider is that of multilinear formulas.

Definition 1.9 (Multilinear formula). An algebraic formula is a multilinear formula if the polynomial
computed by each gate of the formula is multilinear (as a formal polynomial, that is, as an element of
F[x1 , . . . , xn ]). The product depth is the maximum number of multiplication gates on any input-to-output
path in the formula.

                       T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                              8
             P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

                                                  (a)

                                      IPSLIN                IPS


                                                                                    (c)
                            roABP-IPSLIN                 multlin.-formula-IPS
                  (d)
                                                                     (b)
                              V
                            ∑ ∑-IPSLIN                       sparse-IPSLIN

Figure 1: This diagram summarizes the main results of this paper, listing some of the upper bounds
given in Section 4, as well as some of the functional lower bounds of Section 5. The lower bounds via
multiples (Section 7) are not included here, as those lower bounds are slightly non-standard. Arrows
indicate efficient simulation of proof systems, where unlabelled arrows are straightforward. Labelled
results are as follows. (a) The simulation of general IPS by linear-IPS is given by Theorem 4.4. (b) The
simulation of sparse-IPSLIN (equivalent to the Nullstellensatz system) by multilinear-formula-IPS is given
by Theorem 4.12. (c) The region below the dashed line consists of proof systems where we prove
superpolynomial lower bounds (Theorem 5.15). (d) The region above the dotted line can efficiently
refute the subset-sum axiom ∑i xi + 1 = 0 over x ∈ {0, 1}n (Corollary 4.14, Theorem 4.15). In contrast,
Polynomial Calculus cannot efficiently refute this system of equations ([41]).


    Raz [63, 62] proved quasi-polynomial lower bounds for multilinear formulas and separated multilinear
formulas from multilinear circuits. Raz and Yehudayoff proved exponential lower bounds for small-depth
multilinear formulas [68]. Only slightly sub-exponential polynomial identity testing algorithms are
known for small-depth multilinear formulas ([57]).

1.3     Our results and techniques
In this section we summarize our results and techniques, stating some results in less than full generality
to more clearly convey the result. Figure 1 presents an extremely abbreviated listing of some highlights
of our results. We present the results in the order that we later prove them. We start by giving upper
bounds for IPS proofs (Section 1.3.1). We then describe our functional lower bounds and the IPSLIN
lower bounds they imply (Section 1.3.2). Finally, we discuss lower bounds for multiples and state our
corresponding lower bounds for IPS (Section 1.3.3).

1.3.1    Upper bounds for proofs within subclasses of IPS
Various previous articles have studied restricted algebraic proof systems and shown non-trivial upper
bounds. The general simulation (Theorem 1.2) of Grochow and Pitassi [35] showed that the formula-
IPS and circuit-IPS systems can simulate powerful proof systems such as Frege and Extended Frege,

                        T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                           9
                 M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

respectively. Li, Tzameret and Wang [50] (Theorem 1.4) have shown that even non-commutative-formula-
IPS can simulate Frege. Grigoriev and Hirsch [32] have shown that proofs manipulating depth-3 algebraic
formulas can refute hard axioms such as the pigeonhole principle, the subset-sum axiom, and Tseitin
tautologies. Raz and Tzameret [66, 65] somewhat strengthened their results by restricting the proof to
depth-3 multilinear proofs (in a dynamic system, see Appendix A).
    However, these upper bounds are for proof systems (IPS or otherwise) for which no proof lower
bounds are known. In this article we also study upper bounds for restricted subsystems of IPS. In
particular, we compare linear-IPS versus the full IPS system, and show that even for restricted C, C-IPS
can refute interesting unsatisfiable systems of equations arising from NP-complete problems (and we
will obtain corresponding proof lower bounds for these C-IPS systems).
    Our first upper bound is to show that linear-IPS can simulate the full IPS proof system when the
axioms are computationally simple, which essentially resolves a question of Grochow and Pitassi [35,
Open Question 1.133 ].

Theorem 1.10 (Theorem 4.4). For |F| ≥ poly(d), if f1 , . . . , fm ∈ F[x1 , . . . , xn ] are degree-d polynomials
computable by size-s algebraic formulas (or circuits) and they have a size-t formula-IPS (or circuit-IPS)
refutation, then they also have a size-poly(d, s,t) formula-IPSLIN (circuit-IPSLIN , resp.) refutation.

    This theorem is established by pushing the “non-linear” dependencies on the axioms into the IPS
refutation itself, which is possible as the axioms are assumed to themselves be computable by small
circuits. We note that Grochow and Pitassi [35] showed such a conversion, but only for IPS refutations
computable by sparse polynomials. Also, we remark that this result holds even for circuits of unbounded
degree, as opposed to just those of polynomial degree.
    We then turn our attention to IPS involving only restricted classes of algebraic circuits, and show
that they are complete proof systems. This is clear for complete models of algebraic circuits such
as sparse polynomials, depth-3 powering formulas 4 and roABPs. The models of sparse-IPSLIN and
roABP-IPSLIN can efficiently simulate the Nullstellensatz proof system measured in terms of number of
monomials, as the former is equivalent to this system, and the latter follows as sparse polynomials have
small roABPs. Note that depth-3 powering formulas cannot efficiently compute sparse polynomials in
general (Theorem 6.9) so cannot efficiently simulate the Nullstellensatz system. For multilinear formulas,
showing completeness (much less an efficient simulation of sparse-IPSLIN ) is more subtle as not every
polynomial is multilinear, however the result can be obtained by a careful multilinearization.
                                                                                                              V
Theorem 1.11 (Theorem 4.7 and Theorem 4.12). The proof systems of sparse-IPSLIN , ∑ ∑-IPSLIN (in
large characteristic fields), and roABP-IPSLIN are complete proof systems (for systems of polynomials
with no Boolean solutions). The multilinear-formula-IPSLIN proof system is not complete, but the depth-2
multilinear-formula-IPSLIN0 proof system is complete (for multilinear axioms) and can polynomially
simulate sparse-IPSLIN (for low-degree axioms).

   However, we recall that multilinear-formula-IPSLIN is complete when refuting unsatisfiable CNF
formulas (Theorem 1.2).
   3 This refers to the following version: http://arxiv.org/abs/1404.3820v1.
   4 Showing that depth-3 powering formulas are complete (in large characteristic) can be seen from the fact that any multilinear

monomial can be computed in this model, see for example Fischer [19].


                          T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                               10
             P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

    We next consider the equation ∑ni=1 αi xi − β along with the Boolean axioms {xi2 − xi }i . Deciding
whether this system of equations is satisfiable is the NP-complete subset-sum problem, and thus we do
not expect small refutations in general (unless NP = coNP). Indeed, Impagliazzo, Pudlák, and Sgall [41]
(Theorem A.4) have shown lower bounds for refutations in the polynomial calculus system (and thus
also the Nullstellensatz system) even when α = 1. Specifically, they showed that such refutations require
both Ω(n)-degree and exp(Ω(n))-many monomials, matching the worst-case upper bounds for these
complexity measures. In the language of this paper, they gave exp(Ω(n))-size lower bounds for refuting
this system in ∑ ∏-IPSLIN (which is equivalent to the Nullstellensatz proof system). In contrast, we
establish here poly(n)-size refutations for α = 1 in the restricted proof systems of roABP-IPSLIN and
depth-3 multilinear-formula-IPSLIN (despite the fact that multilinear-formula-IPSLIN is not complete).
Theorem 1.12 (Corollary 4.14 and Theorem 4.15). Let F be a field of characteristic char(F) > n. Then
the system of polynomial equations ∑ni=1 xi − β , {xi2 − xi }ni=1 is unsatisfiable for β ∈ F \ {0, . . . , n}, and
there are explicit poly(n)-size refutations within roABP-IPSLIN , as well as within depth-3 multilinear-
formula-IPSLIN .
    This theorem is proven by noting that the polynomial p(t) := ∏nk=0 (t − k) vanishes on ∑i xi modulo
  2
{xi − xi }ni=1 , but p(β ) is a nonzero constant. This implies that ∑i xi − β divides p(∑i xi ) − p(β ). Denoting
                                             1                                     2      n
the quotient by f (x), it follows that −p(β    ) · f (x) · (∑i xi − β ) ≡ 1 mod {xi − xi }i=1 , which is nearly a
linear-IPS refutation except for the complexity of establishing this relation over the Boolean cube. We
show that the quotient f is easily expressed as a depth-3 powering circuit. Unfortunately, proving the
above equivalence to 1 modulo the Boolean cube is not possible in the depth-3 powering circuit model.
However, by moving to more powerful models (such as roABPs and multilinear formulas) we can give
proofs of this multilinearization to 1 and thus give proper IPS refutations.

1.3.2   Linear-IPS lower bounds via functional lower bounds
Having demonstrated the power of various restricted classes of IPS refutations by refuting the subset-sum
axiom, we now turn to lower bounds. We give two paradigms for establishing lower bounds, the first
of which we discuss here, named a functional circuit lower bound. This idea appeared in the work of
Grigoriev and Razborov [34] as well as in the recent paper by Forbes, Kumar and Saptharishi [23]. We
briefly motivate this type of lower bound as a topic of independent interest in algebraic circuit complexity,
and then discuss the lower bounds we obtain and their implications to obtaining proof complexity lower
bounds.
     In Boolean complexity, the primary object of interest are functions. Generalizing slightly, one can
even seek to compute functions f : {0, 1}n → F for some field F. In contrast, in algebraic complexity one
seeks to compute polynomials as elements of the ring F[x1 , . . . , xn ]. These two regimes are tied by the fact
that every polynomial f ∈ F[x] induces a function fˆ : {0, 1}n → F via the evaluation fˆ : x 7→ f (x). That
is, the polynomial f functionally computes the function fˆ. As an example, x2 − x functionally computes
the zero function despite being a nonzero polynomial.
     Traditional algebraic circuit lower bounds for the n × n permanent are lower bounds for computing
permn as an element in the ring F[{xi, j }1≤i, j≤n ]. This is a strong notion of “computing the permanent”,
while one can consider the weaker notion of functionally computing the permanent, that is, a polynomial
 f ∈ F[{xi, j }] such that f = permn over {0, 1}n×n , where f is not required to equal permn as a polynomial.

                        T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                  11
                M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

As permn : {0, 1}n×n → F is #P-hard (for fields of large characteristic), assuming plausible conjectures
(such as the polynomial hierarchy being infinite) it follows that any polynomial f which functionally
computes permn must require large algebraic circuits. Unconditionally obtaining such a result is what we
term a functional lower bound.

Goal 1.13 (Functional Circuit Lower Bound ([34, 23])). Obtain an explicit function fˆ : {0, 1}n → F
such that for any polynomial f ∈ F[x1 , . . . , xn ] satisfying f (x) = fˆ(x) for all x ∈ {0, 1}n , it must be that f
requires large algebraic circuits.

     Obtaining such a result is challenging, in part because one must lower bound all polynomials agreeing
with the function fˆ (of which there are infinitely many). Prior work ([33, 34, 49]) has established
functional lower bounds for functions when computing with polynomials over constant-sized finite fields,
and the recent paper by Forbes, Kumar and Saptharishi [23] has established some lower bounds for any
field.
     While it is natural to hope that existing methods would yield such lower bounds, many lower-bound
techniques inherently use that algebraic computation is syntactic. In particular, techniques involving
partial derivatives (which include the partial derivative method of Nisan–Wigderson [54] and the shifted
partial derivative method of Kayal [46, 36]) cannot as is yield functional lower bounds as knowing a
polynomial on {0, 1}n is not enough to conclude information about its partial derivatives.
     We now explain how functional lower bounds imply lower bounds for linear-IPS refutations in certain
cases. Suppose one considers refutations of the unsatisfiable polynomial system f (x), {xi2 − xi }ni=1 . A
linear-IPS refutation would yield an equation of the form g(x) · f (x) + ∑i hi (x) · (xi2 − xi ) = 1 for some
polynomials g, hi ∈ F[x]. Viewing this equation modulo the Boolean cube, we have that g(x) · f (x) ≡ 1
mod {xi2 − xi }i . Equivalently, since f (x) is unsatisfiable over {0, 1}n , we see that g(x) = 1/ f (x) for
x ∈ {0, 1}n , as f (x) is never zero so this fraction is well-defined. It follows that if the function x 7→ 1/ f (x)
induces a functional lower bound then g(x) must require large complexity, yielding the desired linear-IPS
lower bound.
     Thus, it remains to instantiate this program. While we are successful, we should note that this
approach as is seems to only yield proof complexity lower bounds for systems with one non-Boolean
axiom and thus cannot encode polynomial systems where each equation depends on O(1) variables (such
as those naturally arising from 3CNFs).
     Our starting point is to observe that the subset-sum axiom already induces a weak form of functional
lower bound, where the complexity is measured by degree.

Theorem 1.14 (Theorem 5.4). Let F be a field of a characteristic at least poly(n) and β ∈           / {0, . . . , n}.
Then ∑i xi − β , {xi − xi }i is unsatisfiable and any polynomial f ∈ F[x1 , . . . , xn ] with f (x) = ∑ x1i −β for
                     2
                                                                                                       i
x ∈ {0, 1}n , satisfies deg f ≥ n.

    A lower bound of d n2 e + 1 was previously established by Impagliazzo, Pudlák, and Sgall [41]
(Theorem A.4), but the lower bound of n (which is tight) will be crucial for our results.
    We then lift this result to obtain lower bounds for stronger models of algebraic complexity. In
particular, by replacing “xi ” with “xi yi ” we show that the function ∑ xi1yi −β has maximal evaluation
                                                                          i
dimension between x and y, which is some measure of interaction between the variables in x and those in
y (see Section 3.3). This measure is essentially functional, so that one can lower bound this measure by

                        T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                     12
               P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

understanding the functional behavior of the polynomial on finite sets such as the Boolean cube. Our
lower bound for evaluation dimension follows by examining the above degree bound. Using known
relations between this complexity measure and algebraic circuit classes, we can obtain lower bounds for
depth-3 powering linear-IPS.

Theorem 1.15 (Theorem 5.10). Let F be a field of characteristic ≥ poly(n) and β ∈       / {0, . . . , n}. Then
  n                2             2
∑i=1 xi yi − β , {xi − xi }i , {yi − yi }i is unsatisfiable and any ∑ ∑-IPSLIN refutation requires size ≥
                                                                     V

exp(Ω(n)).

    The above axiom only gets maximal interaction between the variables across a fixed partition of
the variables. By introducing auxiliary variables we can create such interactions in variables across
any partition of (some) of the variables. By again invoking results showing such structure implies
computational hardness we obtain more linear-IPS lower bounds.

                                                                                                     / {0, . . . , 2n
                                                                                                                      
Theorem 1.16 (Theorem 5.15). Let F be a field of characteristic ≥ poly(n) and β ∈                                   2 }. Then
∑i< j zi, j xi x j − β , {xi2 − xi }ni=1 , {z2i, j − zi, j }i< j is unsatisfiable, and any roABP-IPSLIN refutation (in any
order of the variables) requires exp(Ω(n)) size. Further, any multilinear-formula-IPS refutation requires
                                                                                                                  1/d  2
nΩ(log n) -size, and any product-depth-d multilinear-formula-IPS refutation requires nΩ((n/log n) /d ) -size.

    Note that our result for multilinear-formulas is not just for the linear-IPS system, but actually for
the full multilinear-formula-IPS system. Thus, we show that even though roABP-IPSLIN and depth-3
multilinear formula-IPSLIN0 can refute the subset-sum axiom in polynomial size, slight variants of this
axiom do not have polynomial-size refutations.

1.3.3    Lower bounds for multiples
While the above paradigm can establish superpolynomial lower bounds for linear-IPS, it does not seem
able to establish lower bounds for the general IPS proof system over non-multilinear polynomials, even for
restricted classes. This is because such systems would induce equations such as h(x) f (x)2 + g(x) f (x) ≡ 1
mod {xi2 − xi }ni=1 , where we need to design a computationally simple axiom f so that this equation implies
at least one of h or g is of large complexity. In a linear-IPS proof it must be that h is zero, so that for
any x ∈ {0, 1}n we can solve for g(x), that is, g(x) = 1/ f (x). However, in general knowing f (x) does not
uniquely determine g(x) or h(x), which makes this approach significantly more complicated. Further,
even though we can efficiently simulate IPS by linear-IPS (Theorem 4.4) in general, this simulation
increases the complexity of the proof so that even if one started with a C-IPS proof for a restricted circuit
class C the resulting IPSLIN proof may not be in C-IPSLIN .
     For the reasons mentioned above, we introduce a second paradigm, called lower bounds for multiples,
which can yield C-IPS lower bounds for various restricted classes C. We begin by defining this question.

Goal 1.17 (Lower Bounds for Multiples). Design an explicit polynomial f (x) such that for any nonzero
g(x) we have that the multiple g(x) f (x) is hard to compute.

    We now explain how such lower bounds yield IPS lower bounds. Consider the system f , {xi2 − xi }i
with a single non-Boolean axiom. An IPS refutation is a circuit C(x, y, z) such that C(x, 0, 0) = 0 and
C(x, f , x2 − x) = 1, where (as mentioned) x2 − x denotes {xi2 − xi }i . Expressing C(x, f , x2 − x) as a

                          T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                           13
                 M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

univariate in f , we obtain that ∑i≥1 Ci (x, x2 − x) f i = 1 − C(x, 0, x2 − x) for some polynomials Ci . For
most natural measures of circuit complexity 1 − C(x, 0, x2 − x) has complexity roughly bounded          by
that of C itself. Thus, we see that a multiple of f has a small circuit, as ∑i≥1 Ci (x, x2 − x) f i−1 · f =
1 −C(x, 0, x2 − x), and one can use the properties of the IPS refutation to show this is in fact a nonzero
multiple. Thus, if we can show that all nonzero multiples of f require large circuits then we rule out a
small IPS refutation.
     We now turn to methods for obtaining polynomials with hard multiples. Intuitively if a polynomial f
is hard then so should small modifications such as f 2 + x1 f , and this intuition is supported by the result
of Kaltofen [44] which shows that if a polynomial has a small algebraic circuit then so do all of its factors.
As a consequence, if a polynomial requires superpolynomially large algebraic circuits then so do all
of its multiples. However, Kaltofen’s [44] result is about general algebraic circuits, and there are very
limited results about the complexity of factors of restricted algebraic circuits ([18, 56]) so that obtaining
polynomials for hard multiples via factorization results seems difficult.
     However, note that lower bound for multiples has a different order of quantifiers than the factoring
question. That is, Kaltofen’s [44] result speaks about the factors of any small circuit, while the lower
bound for multiples speaks about the multiples of a single polynomial. Thus, it seems plausible that
existing methods could yield such explicit polynomials, and indeed we show this is the case.
     We begin by noting that obtaining lower bounds for multiples is a natural instantiation of the algebraic
hardness versus randomness paradigm. In particular, Heintz–Schnorr [39] and Agrawal [1] showed that
obtaining deterministic (black-box) polynomial identity testing algorithms implies lower bounds (see
Section 3.1 for more on PIT), and we strengthen that connection here to lower bounds for multiples. We
can actually instantiate this connection, and we use slight modifications of existing PIT algorithms to
show that multiples of the determinant are hard in some models.

Theorem 1.18 (Informal Version of Theorem 6.2 and Theorem 6.7). Let C be a restricted class of
n-variate algebraic circuits. Full derandomization of PIT algorithms for C yields a (weakly) explicit
polynomial whose nonzero multiples require exp(Ω(n)) size as C-circuits.
    In particular, when C is the class of sparse polynomials, depth-3 powering formulas, ∑ ∑ ∏O(1)
                                                                                          V

formulas (in characteristic zero), or “every-order” roABPs, then all nonzero multiples of the n × n
determinant are exp(Ω(n))-hard in these models.

    The above statement shows that derandomization implies hardness. We also partly address the
converse direction by arguing (Section 6.1) that hardness-to-randomness construction of Kabanets and
Impagliazzo [43] only requires lower bounds for multiples to derandomize PIT. Unfortunately, this
direction is harder to instantiate for restricted classes as it requires lower bounds for classes with suitable
closure properties.5
    Unfortunately the above result is slightly unsatisfying from a proof complexity standpoint as the
(exponential-size) lower bounds for the subclasses of IPS one can derive from the above result would
involve the determinant polynomial as an axiom. While the determinant is efficiently computable, it is not
computable by the above restricted circuit classes (indeed, the above result proves that). Thus, this would

                                                                                                      ∑ ∑ ∏O(1) formulas)
   5 Although, we note that one can instantiate this connection with depth-3 powering formulas (or even   V

using the lower bounds for multiples developed in this paper, building on the work of Forbes [21]. However, the resulting PIT
algorithms are worse than those developed by Forbes [21].


                          T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                           14
               P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

not fit the real goal of proof complexity which seeks to show that there are statements whose proofs must
be superpolynomially larger than the length of the statement. Thus, if we measure the size of the IPS
proof and the axioms with respect to the same circuit measure, the lower bounds for multiples approach
cannot establish such superpolynomial lower bounds.
     However, we believe that lower bounds for multiples could lead, with further ideas, to proof complex-
ity lower bounds in the conventional sense. That is, it seems plausible that by adding extension variables
we can convert complicated axioms to simple, local axioms by tracing through the computation of that
axiom. That is, consider the axiom xyzw. This can be equivalently written as {a − xy, b − zw, c − ab, c},
where this conversion is done by considering a natural algebraic circuit for xyzw, replacing each gate with
a new variable, and adding an axiom ensuring the new variables respect the computation of the circuit.
While we are unable to understand the role of extension variables in this article we aim to give as simple
axioms as possible whose multiples are all hard as this may facilitate future work on extension variables.
     We now discuss the lower bounds for multiples we obtain.6

Theorem 1.19 (Corollaries 6.9, 6.11, 6.13, 6.21, and 6.23). We obtain the following lower bounds for
multiples.

     • All nonzero multiples of x1 · · · xn require exp(Ω(n)) size as a depth-3 powering formula (over any
       field), or as a ∑ ∑ ∏O(1) formula (in characteristic zero).
                        V


     • All nonzero multiples of (x1 + 1) · · · (xn + 1) require exp(Ω(n))-many monomials.

     • All nonzero multiples of ∏i (xi + yi ) require exp(Ω(n)) width as an roABP in any order of the
       variables where x precedes y.

     • All nonzero multiples of ∏i< j (xi + x j ) require exp(Ω(n)) width as an roABP in any order of the
       variables, as well as exp(Ω(n)) width as a read-twice oblivious ABP.

     We now briefly explain our techniques for obtaining these lower bounds, focusing on the simplest case
of depth-3 powering formulas. It follows from the partial derivative method of Nisan and Wigderson [53]
(see Kayal [45]) that such formulas require exponential size to compute the monomial x1 . . . xn exactly.
Forbes and Shpilka [26], in giving a PIT algorithm for this class, showed that this lower bound can be
scaled down and made robust. That is, if one has a size-s depth-3 powering formula, it follows that if it
computes a monomial xi1 · · · xi` for distinct i j then ` ≤ O(log s) (so the lower bound is scaled down). One
                                                                                                     ai     ai
can then show that regardless of what this formula actually computes the leading monomial xi1 1 · · · xi` `
(for distinct i j and positive ai j ) must have that ` ≤ O(log s). One then notes that leading monomials are
multiplicative. Thus, for any nonzero g the leading monomial of g · x1 . . . xn involves n variables so that if
g · x1 . . . xn is computed in size-s then n ≤ O(log s), giving s ≥ exp(Ω(n)) as desired. One can then obtain
the other lower bounds using the same idea, though for roABPs one needs to define a leading diagonal
(refining an argument of Forbes–Shpilka [25]).
     We now conclude our IPS lower bounds.
  6 While we discussed functional lower bounds for multilinear formulas, this class is not interesting for the lower bounds for

multiples question. This is because a multiple of a multilinear polynomial may not be multilinear, and thus clearly cannot have a
multilinear formula.


                           T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                              15
                  M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

Theorem 1.20 (Theorem 7.2, Theorem 7.3). We obtain the following lower bounds for subclasses of IPS.

      • In characteristic zero, the system of polynomials x1 · · · xn , x1 + · · · + xn − n, {xi2 − xi }ni=1 is unsatisfi-
                         V
        able, and any ∑ ∑-IPS refutation requires exp(Ω(n)) size.

      • In characteristic > n, the system of polynomials, ∏i< j (xi + x j − 1), x1 + · · · + xn − n, {xi2 − xi }i
        is unsatisfiable, and any roABP-IPS refutation (in any order of the variables), must be of size
        exp(Ω(n)).

     Note that the first result is a non-standard encoding of 1 = AND(x1 , . . . , xn ) = 0. Similarly, the second
is a non-standard encoding of AND(x1 , . . . , xn ) = 1 yet XOR(xi , x j ) = 1 for all i, j.

1.4     Subsequent developments
After the publication of the preliminary version of this article [29], there was follow-up work by Alekseev,
Hirsch, Tzameret and Grigoriev [5] who proved a conditional superpolynomial size lower bound against
general (unrestricted) IPS refutations over the rational numbers of a subset-sum principle with large
coefficients. The hard instance in [5] is the so-called Binary Value Principle ∑ni=1 2i−1 xi + 1 = 0, for
Boolean variables xi , and the lower bound is conditioned on the Shub–Smale hypothesis stating that
factorial numbers cannot be computed by small (poly-logarithmic size) algebraic expressions consisting
of the constants 0, 1, −1 only (that is, variable-free expressions) [79].

1.5     Organization
The rest of the paper is organized as follows. Section 2 contains the basic notation for the paper.
In Section 3 we give background from algebraic complexity, including several important complexity
measures such as coefficient dimension and evaluation dimension (see Section 3.2 and Section 3.3). We
present our upper bounds for IPS in Section 4. In Section 5 we give our functional lower bounds and from
them obtain lower bounds for IPSLIN . Section 6 contains our lower bounds for multiples of polynomials
and in Section 7 we derive lower bounds for IPS using them. In Section 8 we list some problems which
were left open by this work.
    In Appendix A we describe various other algebraic proof systems and their relations to IPS, such
as the dynamic Polynomial Calculus of Clegg, Edmonds, and Impagliazzo [13], the ordered formula
proofs of Tzameret [83], and the multilinear proofs of Raz and Tzameret [66]. In Appendix B we give an
explicit description of a multilinear polynomial occurring in our IPS upper bounds.


2     Notation
In this section we briefly describe notation used in this paper. We denote [n] := {1, . . . , n}. For a
vector a ∈ Nn , we denote xa := x1a1 · · · xnan so that in particular x1 = ∏ni=1 xi . The (total) degree of a
monomial xa , denoted deg xa , is equal to |a|1 := ∑i ai , and the individual degree, denoted ideg xa , is
equal to |a|∞ := max{ai }i . A monomial xa depends on |a|0 := |{i : ai 6= 0}| many variables. Degree and
individual degree can be defined for a polynomial f , denoted deg f and ideg f respectively, by taking the

                          T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                       16
              P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

maximum over all monomials with nonzero coefficients in f . We will sometimes compare vectors a and
b as “a ≤ b”, which is to be interpreted coordinatewise. We will use ≺ to denote a monomial order on
F[x], see Section 3.6.
    Polynomials will often be written out in their monomial expansion. At various points we will need to
extract coefficients from polynomials. When “taking the coefficient of yb in f ∈ F[x, y]” we mean that
both x and y are treated as variables and thus the coefficient returned is a scalar in F, and this will be
denoted Coeffyb ( f ). However, when “taking the coefficient of yb in f ∈ F[x][y]” we mean that x is now
part of the ring of scalars, so the coefficient will be a polynomial in F[x], and this coefficient will be
denoted Coeffx|yb ( f ).
     For a vector a ∈ Nn we denote a≤i ∈ Ni to be the restriction of a to the first i coordinates. For a set
S ⊆ [n] we let S denote the complement set. We will denote the size-k subsets of [n] by [n]
                                                                                             
                                                                                            k . We will use
ml : F[x] → F[x] to denote the multilinearization operator, defined by Theorem 3.12. We will use x2 − x
to denote the set of equations {xi2 − xi }i .
     To present algorithms that are field independent, this paper works in a model of computation where
field operations (such as addition, multiplication, inversion and zero-testing) over F can be computed at
unit cost, see for example Forbes [20, Appendix A]. We say that an algebraic circuit is t-explicit if it can
be constructed in t steps in this unit-cost model.


3     Algebraic complexity theory background
In this section we state some known facts regarding the algebraic circuit classes that we will be studying.
We also give some important definitions that will be used later in the paper.


3.1    Polynomial identity testing
In the polynomial identity testing (PIT) problem, we are given an algebraic circuit computing some
polynomial f , and we have to determine whether “ f ≡ 0”. That is, we are asking whether f is the zero
polynomial in F[x1 , . . . , xn ]. By the Polynomial Identity Lemma7 , if 0 6= f ∈ F[x] is a polynomial of degree
≤ d and S ⊆ F, and α ∈ Sn is chosen uniformly at random, then f (α) = 0 with probability at most 8
d/|S|. Thus, given the circuit, we can perform these evaluations efficiently, giving an efficient randomized
procedure for deciding whether “ f ≡ 0?”. It is an important open problem to find a derandomization of
this algorithm, that is, to find a deterministic procedure for PIT that runs in polynomial time (in the size
of circuit) (cf. [78, Chap. 4] or the surveys [72, 73]).
    Note that in this randomized algorithm we only use the circuit to compute the evaluation f (α). Such
algorithms are said to run in the black-box model. In contrast, an algorithm that can access the internal
structure of the circuit runs in the white-box model. It is a folklore result that efficient deterministic
black-box algorithms are equivalent to constructions of small hitting sets. That is, a hitting set is a set of
   7 This is sometimes called the “Schwartz–Zippel–DeMillo–Lipton Lemma” following [86, 74, 17], however, this lemma goes

back to Øystein Ore, 1922 [58] and was subsequently rediscovered by other authors; see [8] who unearthed the history of this
lemma.
   8 Note that this is non-trivial only if d < |S| ≤ |F|, which in particular implies that f is not the zero function.



                          T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                          17
                M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

inputs so that any nonzero circuit from the relevant class evaluates to nonzero on at least one of the inputs
in the set. For more on PIT we refer to the survey of Shpilka and Yehudayoff [78].
    A related notion to that of a hitting set is that of a generator, which is essentially a low-dimensional
curve whose image contains a hitting set. The equivalence between hitting sets and generators can be
found in the above mentioned survey.
Definition 3.1. Let C ⊆ F[x1 , . . . , xn ] be a set of polynomials. A polynomial G : F` → Fn is a generator
for C with seed length ` if for all f ∈ C, f ≡ 0 iff f ◦ G ≡ 0. That is, f (x) = 0 in F[x] iff f (G(y)) = 0 in
F[y].
    In words, a generator for a circuit class C is a mapping G : F` → Fn , such that for any nonzero
polynomial f , computed by a circuit from C, it holds that the composition f (G) is nonzero as well. By
considering the image of G on S` , where S ⊆ F is of polynomial size, we obtain a hitting set for C.
    We now list some existing work on derandomizing PIT for some of the classes of polynomials we
study in this paper.
Sparse Polynomials: There are many papers giving efficient black-box PIT algorithms for ∑ ∏ formulas.
For example, Klivans and Spielman [47] gave a hitting set of polynomial size.
Depth-3 Powering Formulas: Saxena [71] gave a polynomial-time white-box PIT algorithm and
Forbes, Shpilka, and Saptharishi [24] gave a sO(lg lg s) -size hitting set for size-s depth-3 powering formulas.
∑ ∑ ∏O(1) Formulas: Forbes [21] gave an sO(lg s) -size hitting set for size-s ∑ ∑ ∏O(1) formulas (in
  V                                                                                       V

large characteristic).
Read-once Oblivious ABPs: Raz and Shpilka [64] gave a polynomial-time white-box PIT algorithm.
A long sequence of papers calumniated in the work of Agrawal, Gurjar, Korwar, and Saxena [2], who
gave a sO(lg s) -sized hitting set for size-s roABPs.
Read-k Oblivious ABPs:          Recently, Anderson, Forbes, Saptharishi, Shpilka and Volk [6] obtained a
                                                   1−1/2k−1 )
white-box PIT algorithm running in time 2Õ(n                   for n-variate poly(n)-sized read-k oblivious ABPs.

3.2   Coefficient dimension and roABPs
This paper proves various lower bounds on roABPs using a complexity measures known as coefficient
dimension. In this section, we define this measures and recall basic properties. Full proofs of these claims
can be found for example in the thesis of Forbes [20].
    We first define the coefficient matrix of a polynomial, called the “partial derivative matrix” in the prior
work of Nisan [52] and Raz [63]. This matrix is formed from a polynomial f ∈ F[x, y] by arranging its
coefficients into a matrix. That is, the coefficient matrix has rows indexed by monomials xa in x, columns
indexed by monomials yb in y, and the (xa , yb )-entry is the coefficient of xa yb in the polynomial f . We
now define this matrix, recalling that Coeffxa yb ( f ) is the coefficient of xa yb in f .
Definition 3.2. Consider f ∈ F[x, y]. Define the coefficient matrix of f as the scalar matrix

                                            (C f )a,b := Coeffxa yb ( f ) ,

where coefficients are taken in F[x, y], for |a|1 , |b|1 ≤ deg f .

                        T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                  18
             P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

   We now give the related definition of coefficient dimension, which looks at the dimension of the row-
and column-spaces of the coefficient matrix. Recall that Coeffx|yb ( f ) extracts the coefficient of yb in f ,
where f is treated as a polynomial in F[x][y].

Definition 3.3. Let Coeffx|y : F[x, y] → 2F[x] be the space of F[x][y] coefficients, defined by
                                                   n               o
                                  Coeffx|y ( f ) := Coeffx|yb ( f )    n
                                                                         ,
                                                                      b∈N

where coefficients of f are taken in F[x][y].
   Similarly, define Coeffy|x : F[x, y] → 2F[y] by taking coefficients in F[y][x].

   The following basic lemma shows that the rank of the coefficient matrix equals the coefficient
dimension, which follows from simple linear algebra.
Lemma 3.4 (Nisan [52]). Consider f ∈ F[x, y]. Then the rank of the coefficient matrix C f obeys

                               rankC f = dim Coeffx|y ( f ) = dim Coeffy|x ( f ) .

   Thus, the ordering of the partition ((x, y) versus (y, x)) does not matter in terms of the resulting
dimension. The above matrix-rank formulation of coefficient dimension can be rephrased in terms of
low-rank decompositions.
Lemma 3.5. Let f ∈ F[x, y]. Then dim Coeffx|y ( f ) equals the minimum r such that there are g ∈ F[x]r
and h ∈ F[y]r such that f can be written as f (x, y) = ∑ri=1 gi (x)hi (y).
    We now state a convenient normal form for roABPs (see for example Forbes [20, Corollary 4.4.2]).
Lemma 3.6. A polynomial f ∈ F[x1 , . . . , xn ] is computed by width-r roABP iff there exist matrices
Ai (xi ) ∈ F[xi ]r×r of (individual) degree ≤ deg f such that f = (∏ni=1 Ai (xi ))1,1 . Further, this equivalence
preserves explicitness of the roABPs up to poly(n, r, deg f )-factors.
   By splitting an roABP into such variable-disjoint inner-products one can obtain a lower bound for
roABP width via coefficient dimension. In fact, this complexity measure characterizes roABP width.
Lemma 3.7. Let f ∈ F[x1 , . . . , xn ] be a polynomial. If f is computed by a width-r roABP        then r ≥
maxi dim Coeffx≤i |x>i ( f ). Further, f is computable by a width- maxi dim Coeffx≤i |x>i ( f ) roABP.
   Using this complexity measure it is rather straightforward to prove the following closure properties of
roABPs.

Fact 3.8. If f , g ∈ F[x] are computable by width-r and width-s roABPs respectively, then

    • f + g is computable by a width-(r + s) roABP.

    • f · g is computable by a width-(rs) roABP.

Further, roABPs are also closed under the following operations.

    • If f (x, y) ∈ F[x, y] is computable by a width-r roABP in some order of the variables then the partial
      substitution f (x, α), for α ∈ F|y| , is computable by a width-r roABP in the induced order on x,
      where the degree of this roABP is bounded by the degree of the roABP for f .

                       T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                  19
                 M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

      • If f (z1 , . . . , zn ) is computable by a width-r roABP in order of the variables z1 < · · · < zn , then
        f (x1 y1 , . . . , xn yn ) is computable by a poly(r, ideg f )-width roABP in order of the variables x1 <
        y1 < · · · < xn < yn .
Further, these operations preserve the explicitness of the roABPs up to polynomial factors in all relevant
parameters.
    We now state the extension of these techniques which yield lower bounds for read-k oblivious ABPs,
as recently obtained by Anderson, Forbes, Saptharishi, Shpilka and Volk [6].
Theorem 3.9 ([6]). Let f ∈ F[x1 , . . . , xn ] be a polynomial computed by a width-w read-k oblivious ABP.
Then there exists a partition x = (u, v, w) such that
   1. |u|, |v| ≥ n/kO(k) .

   2. |w| ≤ n/10.

   3. dimF(w) Coeffu|v ( fw ) ≤ w2k , where fw is f as a polynomial in F(w)[u, v].

3.3     Evaluation dimension
While coefficient dimension measures the size of a polynomial f (x, y) by taking all coefficients in y,
evaluation dimension is a complexity measure due to Saptharishi [70] that measures the size by taking all
possible evaluations in y over the field. This measure will be important for our applications as one can
restrict such evaluations to the Boolean cube and obtain circuit lower bounds for computing f (x, y) as a
polynomial via its induced function on the Boolean cube. We begin with the definition.
Definition 3.10 (Saptharishi [70]). Let S ⊆ F. Let Evalx|y,S : F[x, y] → 2F[x] be the space of F[x][y]
evaluations over S, defined by
                                                       n         o
                               Evalx|y,S ( f (x, y)) := f (x, β )  |y|
                                                                       .
                                                                      β ∈S

Define Evalx|y : F[x, y] → 2F[x] to be Evalx|y,S when S = F.
    Similarly, define Evaly|x,S : F[x, y] → 2F[y] by replacing x with all possible evaluations α ∈ S|x| , and
likewise define Evaly|x : F[x, y] → 2F[y] .
   The equivalence between evaluation dimension and coefficient dimension was shown by Forbes–
Shpilka [27] by appealing to interpolation.
Lemma 3.11 (Forbes–Shpilka [27]). Let f ∈ F[x, y]. For any S ⊆ F we have that Evalx|y,S ( f ) ⊆
span Coeffx|y ( f ) so that dim Evalx|y,S ( f ) ≤ dim Coeffx|y ( f ). In particular, if |S| > ideg f then
dim Evalx|y,S ( f ) = dim Coeffx|y ( f ).
    While evaluation dimension and coefficient dimension are equivalent when the field is large enough,
when restricting our attention to inputs from the Boolean cube this equivalence no longer holds (in
particular, we have to consider all polynomials that obtain the same values on the Boolean cube and not
just one polynomial), but evaluation dimension will be still helpful as it always lower bounds coefficient
dimension.

                         T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                20
              P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

3.4   Multilinear polynomials and multilinear formulas
We now turn to multilinear polynomials and classes that respect multilinearity such as multilinear formulas.
We first state some well-known facts about multilinear polynomials.

Fact 3.12. For any two multilinear polynomials f , g ∈ F[x1 , . . . , xn ], f = g as polynomials iff they agree
on the Boolean cube {0, 1}n . That is, f = g iff f |{0,1}n = g|{0,1}n .
    Further, there is a multilinearization map ml : F[x] → F[x] such that for any f , g ∈ F[x],

   1. ml( f ) is multilinear.

   2. f and ml( f ) agree on the Boolean cube, that is, f |{0,1}n = ml( f )|{0,1}n .

   3. deg ml( f ) ≤ deg f .

   4. ml( f g) = ml(ml( f ) ml(g)), and if f and g are defined on disjoint sets of variables then ml( f g) =
      ml( f ) ml(g).

   5. ml is linear, so that for any α, β ∈ F, ml(α f + β g) = α ml( f ) + β ml(g).
                                 min{ai ,1}
   6. ml(x1a1 · · · xnan ) = ∏i xi            .

   7. If f is the sum of at most s monomials (s-sparse) then so is ml( f ).

Also, if fˆ is a function {0, 1}n → F that only depends on the coordinates in S ⊆ [n], then the unique
multilinear polynomial f agreeing with fˆ on {0, 1}n is a polynomial only in {xi }i∈S .
    One can also extend the multilinearization map ml : F[x] → F[x] to matrices ml : F[x]r×r → F[x]r×r
by applying the map entrywise and the above properties still hold.

   Throughout the rest of this paper ‘ml’ will denote the multilinearization operator. Raz [63, 62]
gave lower bounds for multilinear formulas using the above notion of coefficient dimension, and Raz-
Yehudayoff [67, 68] gave simplifications and extensions to constant-depth multilinear formulas.
Theorem 3.13 (Raz-Yehudayoff [63, 68]). Let f ∈ F[x1 , . . . , x2n , z] be a multilinear polynomial in the set
of variables x and auxiliary variables z. Let fz denote the polynomial f in the ring F[z][x]. Suppose that
for any partition x = (u, v) with |u| = |v| = n that

                                                  dimF(z) Coeffu|v fz ≥ 2n .

Then f requires ≥ nΩ(log n) -size to be computed as a multilinear formula, and for d = o(log n/log log n), f
                     1/d 2
requires nΩ((n/log n) /d ) -size to be computed as a multilinear formula of product-depth-d.

3.5   Depth-3 powering formulas
In this section we review facts about depth-3 powering formulas. We begin with the duality trick of
Saxena [71], which shows that one can convert a power of a linear form to a sum of products of univariate
polynomials.

                          T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                             21
                M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

Theorem 3.14 (Saxena’s Duality Trick [77, 71, 22]). Let n ≥ 1, and d ≥ 0. If |F| ≥ nd + 1, then there are
poly(n, d)-explicit univariates fi, j ∈ F[xi ] such that
                                                             s
                                    (x1 + · · · + xn )d = ∑ fi,1 (x1 ) · · · fi,n (xn ) ,
                                                            i=1

where deg fi, j ≤ d and s = (nd + 1)(d + 1).
     We note that the parameters we use in Theorem 3.14 are from the proof of Forbes, Gupta, and
Shpilka [22], which works over any large enough field (irrespective of its characteristic). Saxena [71]
proved a stronger statement with s = nd + 1, however, his proof only worked over fields of large enough
characteristic. A similar version of this trick also appeared in Shpilka–Wigderson [77].
     Noting that the product fi,1 (x1 ) · · · fi,n (xn ) trivially has a width-1 roABP (in any order of the variables),
it follows that (x1 + · · · + xn )d has a poly(n, d)-width roABP over a large enough field. Thus, size-s ∑ ∑
                                                                                                                  V

formulas have poly(s)-size roABPs over large enough fields by appealing to closure properties of roABPs
(Theorem 3.8). As it turns out, this result also holds over any field as Forbes–Shpilka [27] adapted
Saxena’s [71] duality to work over any field. Their version works over any field, but loses the above clean
form (sum of product of univariates).
Theorem 3.15 (Forbes–Shpilka [27]). Let f ∈ F[x] be expressed as f (x) = ∑si=1 (αi,0 + αi,1 x1 + · · · +
αi,n xn )di . Then f is computable by a poly(r, n)-explicit width-r roABP of degree maxi {di }, in any order
of the variables, where r = ∑i (di + 1).
    One way to see this claim is to observe that for any variable partition, a linear function can be
expressed as the sum of two variable-disjoint linear functions `(x1 , x2 ) = `1 (x1 ) + `2 (x2 ). By the binomial
theorem, the d-th power of this expression is a summation of d + 1 variable-disjoint products, which
implies a coefficient dimension upper bound of d + 1 (Lemma 3.5) and thus also an roABP-width upper
bound (Lemma 3.7). One can then sum over the linear forms.
    While this simulation suffices for obtaining roABP upper bounds, we will also want the clean
form obtained via duality for application to multilinear-formula IPS proofs of the subset-sum axiom
(Theorem 4.15).

3.6     Monomial orders
We recall here the definition and properties of a monomial order, following Cox, Little and O’Shea [16].
We first fix the definition of a monomial in our context.

Definition 3.16. A monomial in F[x1 , . . . , xn ] is a polynomial of the form xa = x1a1 · · · xnan for a ∈ Nn .

    We will sometimes abuse notation and associate a monomial xa with its exponent vector a, so that we
can extend this order to the exponent vectors. Note that in this definition “1” is a monomial, and that
scalar multiples of monomials such as 2x are not considered monomials. We now define a monomial
order, which will be total order on monomials with certain natural properties.

Definition 3.17. A monomial ordering is a total order ≺ on the monomials in F[x] such that

      • For all a ∈ Nn \ {0}, 1 ≺ xa .

                         T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                      22
             P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

    • For all a, b, c ∈ Nn , xa ≺ xb implies xa+c ≺ xb+c .

   For nonzero f ∈ F[x], the leading monomial of f (with respect to a monomial order ≺), denoted
LM( f ), is the largest monomial in Supp( f ) := {xa : Coeffxa ( f ) 6= 0} with respect to the monomial order
≺. The trailing monomial of f , denoted TM( f ), is defined analogously to be the smallest monomial in
Supp( f ). The zero polynomial has neither leading nor trailing monomial.
   For nonzero f ∈ F[x], the leading (or trailing) coefficient of f , denoted LC( f ) (TC( f ), resp.), is
Coeffxa ( f ) where xa = LM( f ) (xa = TM( f ), resp.).

    Henceforth in this paper we will assume F[x] is equipped with some monomial order ≺. The
results in this paper will hold for any monomial order. However, for concreteness, one can consider the
lexicographic ordering on monomials, which is easily seen to be a monomial ordering (see also Cox,
Little and O’Shea [16]).
    We begin with a simple lemma about how taking leading or trailing monomials (or coefficients) is
homomorphic with respect to multiplication.

Lemma 3.18. Let f , g ∈ F[x] be nonzero polynomials. Then the leading monomial and trailing monomials
and coefficients are homomorphic with respect to multiplication, that is, LM( f g) = LM( f ) LM(g) and
TM( f g) = TM( f ) TM(g), as well as LC( f g) = LC( f ) LC(g) and TC( f g) = TC( f ) TC(g).

Proof: We do the proof for leading monomials and coefficients, the claim for trailing monomials and
coefficients is symmetric.
    Let f (x) = ∑a αa xa and g(x) = ∑b βb xb . Isolating the leading monomials,

         f (x) = LC( f ) · LM( f ) +      ∑          αa xa ,               g(x) = LC(g) · LM(g) +              ∑        βb xb ,
                                       xa ≺LM( f )                                                          xb ≺LM(g)

with LC( f ) = αLM( f ) and LC(g) = βLM(g) being nonzero. Thus,
                                                                                                  

  f (x)g(x) = LC( f ) LC(g) · LM( f ) LM(g) + LC( f ) LM( f )                       ∑        βb xb 
                                                                                  xb ≺LM(g)
                                                                                                                             

                              + LC(g) LM(g)                   ∑          αa x a  +       ∑          αa x a     ∑        βb xb  .
                                                            xa ≺LM( f )                  xa ≺LM( f )            xb ≺LM(g)


Using that xa xb ≺ LM( f ) LM(g) whenever xa ≺ LM( f ) or xb ≺ LM(g) due to the definition of a monomial
order, we have that LM( f ) LM(g) is indeed the maximal monomial in the above expression with nonzero
coefficient, and as its coefficient is LC( f ) LC(g).

    We now recall the well-known fact that for any set of polynomials the dimension of their span in F[x]
is equal to the number of distinct leading or trailing monomials in their span.
Lemma 3.19. Let S ⊆ F[x] be a set of polynomials. Then dim span S = |LM(span S)| = |TM(span S)|. In
particular, dim span S ≥ |LM(S)| , |TM(S)|.

                       T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                                          23
                M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

4     Upper bounds for linear-IPS
While the primary focus of this article is on lower bounds for restricted classes of the IPS proof system, we
begin by discussing upper bounds to demonstrate that these restricted classes can prove the unsatisfiability
of non-trivial systems of polynomials equations. In particular we go beyond existing work on upper
bounds ([32, 66, 65, 35, 50]) and place interesting refutations in IPS subsystems where we will also prove
lower bounds, as such upper bounds demonstrate the non-triviality of our lower bounds.
    We begin by discussing the power of the linear-IPS proof system. While one of the most novel
features of IPS proofs is their consideration of non-linear certificates, we show that in powerful enough
models of algebraic computation, linear-IPS proofs can efficiently simulate general IPS proofs, essentially
answering an open question of Grochow and Pitassi [35]. A special case of this result was obtained by
Grochow and Pitassi [35], where they showed that IPSLIN can simulate ∑ ∏-IPS. We then consider the
subset-sum axioms, previously considered by Impagliazzo, Pudlák, and Sgall [41], and show that they
can be refuted in polynomial size by the C-IPSLIN proof system where C is either the class of roABPs, or
the class of multilinear formulas.


4.1   Simulating IPS proofs with linear-IPS
We show here that general IPS proofs can be efficiently simulated by linear-IPS, assuming that the axioms
to be refuted are described by small algebraic circuits. Grochow and Pitassi [35] showed that whenever
the IPS proof computes sparse polynomials, one can simulate it by linear-IPS using (possibly non-sparse)
algebraic circuits. We give here a simulation of IPS when the proofs use general algebraic circuits.
    To give our simulation, we will need to show that if a small circuit f (x, y) is divisible by y, then the
quotient f (x,y)/y also has a small circuit. Such a result clearly follows from Strassen’s [81] elimination of
divisions in general, but we give two constructions for the quotient which tailor Strassen’s [81] technique
to optimize certain parameters.
    The first construction assumes that f has degree bounded by d, and produces a circuit for the quotient
whose size depends polynomially on d. This construction is efficient when f is computed by a formula or
branching program (so that d is bounded by the size of f ). In particular, this construction will preserve the
depth of f in computing the quotient, and for that reason we only present it for formulas. The construction
proceeds via interpolation to decompose f (x, y) = ∑i fi (x)yi into its constituent parts { fi (x)}i and then
directly constructs f (x,y)/y = ∑i fi (x)yi−1 .

Lemma 4.1. Let F be a field with |F| ≥ d + 1. Let f (x, y) ∈ F[x1 , . . . , xn , y] be a degree ≤ d polynomial
expressible as f (x, y) = ∑0≤i≤d fi (x)yi for fi ∈ F[x]. Assume f is computable by a size-s depth-D formula.
Then for a ≥ 1 one can compute
                                                   d
                                                  ∑ fi (x)yi−a ,
                                                  i=a

by a poly(s, a, d)-size depth-(D +2) formula. Further, given d and the formula for f , the resulting formula
is poly(s, a, d)-explicit. In particular, if ya | f (x, y) then the quotient f (x,y)/ya has a poly(s, a, d)-explicit
formula of size poly(s, a, d) and depth (D + 2).

                        T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                    24
              P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

Proof: Express f (x, y) ∈ F[x][y] by f (x, y) = ∑0≤i≤d fi (x)yi . As |F| ≥ 1 + degy f , by interpolation there
are poly(d)-explicit constants αi, j , β j ∈ F such that
                                                          d
                                                fi (x) = ∑ αi, j f (x, β j ) .
                                                         j=0

It then follows that
                                                                !
                       d              d     d                                d   d
                               i−a
                    ∑ fi (x)y        =∑    ∑ αi, j f (x, β j ) yi−a = ∑ ∑ αi, j f (x, β j )yi−a ,
                    i=a              i=a   j=0                              i=a j=0

which is clearly a formula of the appropriate size, depth, and explicitness. The claim about the quotient
f (x,y)/ya follows from seeing that if the quotient is a polynomial then f (x,y)/ya = ∑d f i (x)yi−a .
                                                                                       i=a

     The above construction suffices in the typical regime of algebraic complexity where the circuits
compute polynomials whose degree is polynomially related to their circuit size. However, the simulation
of Extended Frege by general IPS proved by Grochow-Pitassi [35] (Theorem 1.2) yields IPS refutations
with circuits of possibly exponential degree (see also Theorem 1.3). This motivates the search for an
efficient division lemma in this regime. We now provide such a lemma, which is a variant of Strassen’s [81]
homogenization technique for efficiently computing the low-degree homogeneous components of an
unbounded degree circuit. As weaker models of computation (such as formulas and branching programs)
cannot compute polynomials of degree exponential in their size, we only present this lemma for circuits.
Lemma 4.2. Let f (x, y) ∈ F[x1 , . . . , xn , y] be a polynomial expressible as f (x, y) = ∑i fi (x)yi for fi ∈ F[x],
and assume f is computable by a size-s circuit. Then for a ≥ 1 there is an O(a2 s)-size circuit with output
gates computing
                                         f0 (x), . . . , fa−1 (x), ∑ fi (x)yi−a .
                                                                  i≥a

Further, given a and the circuit for f , the resulting circuit is poly(s, a)-explicit. In particular, if ya | f (x, y)
then the quotient f (x,y)/ya has a poly(s, a)-explicit circuit of size O(a2 s).
Proof: The proof proceeds by viewing the computation in the ring F[x][y], and splitting each gate in
the circuit for f into its coefficients in terms of y. However, to avoid a dependence on the degree, we
only split out the coefficients of y0 , y1 , . . . , ya−1 , and then group the rest of the coefficients together.
That is, every node v computing the polynomial fv (x, y) = ∑i≥0 f(v,i) (x)yi in the circuit Φ for f , is split
into a + 1 nodes computing f(v,0) (x), . . . , f(v,a−1) (x) and ∑i≥a fi (x)yi−a , respectively. This is similar to
homogenizing algebraic circuits, see [78]. We can then locally update this split by appropriately keeping
track of how addition and multiplication affects this grouping of coefficients. We note that we can assume
without loss of generality that the circuit for f has fan-in 2, as this only increases the size of the circuit by
a constant factor (measuring the size of the circuit in number of edges) and simplifies the construction.
     construction: Let Φ denote the circuit for f . For a gate v in Φ, denote Φv to be the sub-circuit rooted
at v in Φ and let fv be the polynomial computed by the gate v. We will define the new circuit Ψ, which
will be defined by the gates {(v, i) : v ∈ Φ, 0 ≤ i ≤ a} and the wiring between them, as follows.
    • Φv ∈ F: Ψ(v,0) := Φv , Ψ(v,i) := 0 for i ≥ 1.

                           T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                   25
               M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

    • Φv = x j : Ψ(v,0) := x j , Ψ(v,i) := 0 for i ≥ 1.

    • Φv = y: Ψ(v,1) := 1, Ψ(v,i) := 0 for i 6= 1.

    • Φv = Φu + Φw : Ψ(v,i) := Ψ(u,i) + Ψ(w,i) , all i.

    • Φv = Φu × Φw , 0 ≤ i < a:
                                               Ψ(v,i) :=      ∑ Ψ(u, j) × Ψ(w,i− j) .
                                                           0≤ j≤i


    • Φv = Φu × Φw , i = a:

                                2(a−1)
                    Ψ(v,a) :=    ∑ y`−a ∑ Ψ(u,i) × Ψ(w, j) + ∑ Ψ(u,i) × Ψ(w,a) × yi
                                 `=a           i+ j=`                      0≤i<a
                                              0≤i, j<a

                                                 +    ∑ Ψ(u,a) × Ψ(w, j) × y j + Ψ(u,a) × Ψ(w,a) × ya .
                                                     0≤ j<a


     complexity: Split the gates in Ψ into two types, those gates (v, i) where i = a and v is a multiplication
gate in Φ, and then the rest. For the former type, Ψ(v,a) is computable by a size-O(a2 ) circuit in its
children, and there are at most s such gates. For the latter type, Ψ(v,i) is computable by a size-O(a) circuit
in its children, and there are at most O(as) such gates. Therefore, the total size is O(a2 s).
     correctness: We now establish correctness as a subclaim. For a gate (v, i) in Ψ, let g(v,i) denote the
polynomial that it computes.

Subclaim 4.3. For each gate v in Φ, for 0 ≤ i < a we have that g(v,i) = Coeffx|yi ( fv ) and for i = a we
have that g(v,a) = ∑i≥a Coeffx|yi ( fv )yi−a . In particular, fv = ∑ai=0 g(v,i) yi .

Sub-Proof: Note that the second part of the claim follows from the first. We now establish the first part
by induction on the gates of the circuit.

    • Φv ∈ F: fv = γ for γ ∈ F. By construction, g(v,0) = γ = Coeffx|y0 ( fv ), and for i ≥ 1, g(v,i) = 0 =
      Coeffx|yi ( fv ).

    • Φv = x j : fv = x j . By construction, g(v,0) = x j = Coeffx|y0 ( fv ), and for i ≥ 1, g(v,i) = 0 =
      Coeffx|yi ( fv ).

    • Φv = y: By construction, g(v,1) = 1 = Coeffx|y1 ( fv ), and for i 6= 1, g(v,i) = 0 = Coeffx|yi ( fv ).

    • Φv = Φu + Φw :

                                         g(v,i) = g(u,i) + g(w,i)
                                               = Coeffx|yi ( fu ) + Coeffx|yi ( fw )
                                               = Coeffx|yi ( fu + fw ) = Coeffx|yi ( fv ) .

                        T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                  26
             P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

    • Φv = Φu × Φw , 0 ≤ i < a:

                                                g(v,i) =     ∑ g(u, j) · g(w,i− j)
                                                            0≤ j≤i

                                                        =    ∑ Coeffx|y ( fu ) · Coeffx|y ( fw )
                                                                                     j                         i− j
                                                            0≤ j≤i

                                                        = Coeffx|yi ( fu · fw ) = Coeffx|yi ( fv ) .

    • Φv = Φu × Φw , i = a:
                       2(a−1)
            g(v,a) =    ∑ y`−a ∑ g(u,i) · g(w, j) + ∑ g(u,i) · g(w,a) · yi
                        `=a          i+ j=`                                   0≤i<a
                                    0≤i, j<a

                                            +        ∑ g(u,a) · g(w, j) · y j + g(u,a) · g(w,a) · ya
                                                 0≤ j<a
                       2(a−1)
                 =      ∑ y−a ∑ Coeffx|y ( fu )yi · Coeffx|y ( fw )y j
                                                              i                              j
                        `=a          i+ j=`
                                    0≤i, j<a
                                                                                                                                  !
                                            + ∑ Coeffx|yi ( fu ) ·                       ∑ Coeffx|y ( fw )y j−a · yi
                                                                                                           j
                                                 0≤i<a                                   j≥a
                                                                                                   !
                                            +        ∑       ∑ Coeffx|y ( fu )yi−a · Coeffx|y ( fw ) · y j
                                                                                 i                                        j
                                                 0≤ j<a      i≥a
                                                                                         !                                            !
                                            +        ∑ Coeffx|y ( fu )yi−a · ∑ Coeffx|y ( fw )y j−a · ya
                                                                          i                                           j
                                                     i≥a                                           j≥a

                 =       ∑ Coeffx|y ( fu )Coeffx|y ( fw )yi+ j−a + ∑ Coeffx|y ( fu )Coeffx|y ( fw )yi+ j−a
                                            i                         j                                               i                   j
                        i+ j≥a                                                                   0≤i<a
                       0≤i, j<a                                                                   j≥a

                       +      ∑ Coeffx|y ( fu )Coeffx|y ( fw )yi+ j−a + ∑ Coeffx|y ( fu )Coeffx|y ( fw )yi+ j−a
                                                 i                        j                                                   i               j
                            i≥a                                                                   i, j≥a
                           0≤ j<a

                 =      ∑ Coeffx|y ( fu )Coeffx|y ( fw ) · yi+ j−a
                                        i                         j
                       i+ j≥a

                 = ∑ Coeffx|y` ( fu · fw ) · y`−a
                       `≥a

                 = ∑ Coeffx|y` ( fv ) · y`−a .                                                                                                    
                       `≥a


    The correctness then follows by examining vout , the output gate of Φ, so that fvout = f . The gates
(vout , 0), . . . , (vout , a) are then outputs of Ψ and by the above subclaim have the desired functionality.
    quotient: The claim about the quotient f (x,y)/ya follows from seeing that if the quotient is a polynomial
then f (x,y)/ya = ∑i≥a fi (x)yi−a which is one of the outputs of the constructed circuit.

                           T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                                                  27
                M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

   We now give our simulation of general IPS by linear-IPS. In the set of axioms below, we do not
separate out the Boolean axioms from the rest, as this simplifies notation.

Proposition 4.4. Let f1 , . . . , fm ∈ F[x1 , . . . , xn ] be unsatisfiable polynomials with an IPS refutation C ∈
F[x, y1 , . . . , ym ]. Then f1 , . . . , fm have a linear-IPS refutation C0 ∈ F[x, y] under the following conditions.

   1. Suppose f1 , . . . , fm ,C are computed by size-s formulas, have degree at most d, and |F| ≥ d +1. Then
      C0 is computable by a poly(s, d, m)-size formula of depth-O(D), and C0 is poly(s, d, m)-explicit
      given d and the formulas for f1 , . . . , fm ,C.

   2. Suppose f1 , . . . , fm ,C are computed by size-s circuits. Then C0 is computable by a poly(s, m)-size
      circuit, and C0 is poly(s, m)-explicit given the circuits for f1 , . . . , fm ,C.

Proof: Express C(x, y) as a polynomial in F[x][y], so that C(x, y) = ∑a>0 Ca (x)y a , where we use that
C(x, 0) = 0 to see that we can restrict a to a > 0. Partitioning the a ∈ Nn based on the index of their first
nonzero value, and denoting a<i for the first i − 1 coordinates of a, we obtain

                                            C(x, y) = ∑ Ca (x)ya
                                                      a>0
                                                       n
                                                    =∑       ∑ Ca (x)ya .
                                                      i=1 a:a<i =0,
                                                            ai >0


Now define Ci (x, y) := ∑a:a<i =0, Ca (x)ya−ei , where ei is the i-th standard basis vector. Note that this is a
                              ai >0
valid polynomial as in this summation we assume ai > 0 so that a − ei ≥ 0. Thus,

                                                       n
                                            C(x, y) = ∑ Ci (x, y)yi .
                                                      i=1

     We now define C0 (x, y) := ∑ni=1 Ci (x, f (x))yi and claim it is the desired linear-IPS refutation, where
note that we have only partially substituted in the fi for the yi . First, observe that it is a valid refutation, as
C0 (x, 0) = ∑ni=1 Ci (x, f (x)) · 0 = 0, and C0 (x, f (x)) = ∑ni=1 Ci (x, f (x)) fi (x) = C(x, f (x)) = 1 via the above
equation and using that C is a valid IPS refutation.
     We now argue that C0 can be efficiently computed in the two above regimes.
     (1): Up to constant loss in the depth and polynomial loss in the size, for bounding the complexity of
 0
C it suffices to bound the complexity of each Ci (x, f (x)). First, note that

                        Ci (x, y)yi =     ∑ Ca (x)ya = C(x, 0, yi , y>i ) −C(x, 0, 0, y>i ) ,
                                        a:a<i =0,
                                          ai >0


where each “0” here is a vector of i − 1 zeros. Clearly each of C(x, 0, yi , y>i ) and C(x, 0, 0, y>i ) have
formula size and depth bounded by that of C. From our division lemma for formulas (Theorem 4.1) it
follows that Ci (x, y) = y1i (C(x, 0, yi , y>i ) −C(x, 0, 0, y>i )) has a poly(s, d)-size depth-O(D) formula of the

                         T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                      28
             P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

                                                                  
desired explicitness (as degy C(x, 0, yi , y>i ) −C(x, 0, 0, y>i ) ≤ degy C ≤ d ≤ |F| − 1, so that F is large
enough). We replace y ← f (x) to obtain Ci (x, f (x)) of the desired size and explicitness, using that the f j
themselves have small-depth formulas.
    (2): This follows as in (1), using now the division lemma for circuits (Theorem 4.2).

     Grochow and Pitassi [35, Open Question 1.139 ] asked whether one can relate the complexity of IPS
and linear-IPS, as they only established such relations for simulating ∑ ∏-IPS by (general) linear-IPS. Our
above result essentially answers this question for general formulas over sufficiently large fields, and for
circuits, over any field, under the assumption that the unsatisfiable polynomial system f1 = · · · = fm = 0
can be expressed using small algebraic formulas or circuits, respectively. This is a reasonable assumption
as it is the most common regime for proof complexity. However, the above result does not fully close the
question of Grochow-Pitassi [35] with respect to simulating C-IPS by D-IPSLIN for various restricted
subclasses C, D of algebraic computation. That is, for such a simulation our result requires D to at
the very least contain C composed with the axioms f1 , . . . , fm , and when applying this to the models
considered in this paper (sparse polynomials, depth-3 powering formulas, roABPs, multilinear formulas)
this seems to non-negligibly increase the complexity of the algebraic reasoning.

4.2   Multilinearizing roABP-IPSLIN
We now exhibit instances where one can efficiently prove that a polynomial equals its multilinearization
modulo the Boolean axioms. That is, for a polynomial f computed by a small circuit we wish to prove
that f ≡ ml( f ) mod x2 − x by expressing f − ml( f ) = ∑i hi · (xi2 − xi ) so that the hi also have small
circuits. Such a result will simplify the search for linear-IPS refutations by allowing us to focus on the
non-Boolean axioms. That is, if we are able to find a refutation of f , x2 − x given by

                                            ∑ g j f j ≡ 1 mod x2 − x ,
                                             j

where the g j have small circuits, multilinearization results of the above form guarantee that there are hi
so that
                                      ∑ g j f j + ∑ hi · (xi2 − xi ) = 1 ,
                                        j           i

which is a proper linear-IPS refutation.
    We establish such a multilinearization result when f is an roABP in this section, and consider the
cases where f is the product of a low-degree multilinear polynomial and a multilinear formula in the
next section. We will use these multilinearization results in our construction of IPS refutations of the
subset-sum axiom (Section 4.4).
    We begin by noting that multilinearization for these two circuit classes is rather special, as these
classes straddle the conflicting requirements of neither being too weak nor too strong. That is, some
circuit classes are simply too weak to compute their multilinearizations. An example is the class of
depth-3 powering formulas, where (x1 + · · · + xn )n has a small ∑ ∑ formula, but its multilinearization
                                                                    V

has the leading term n!x1 · · · xn and thus requires exponential size as a ∑ ∑ formula (by appealing to
                                                                            V

  9 This refers to the following version: http://arxiv.org/abs/1404.3820v1
                                                                         .


                       T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                               29
                 M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

Theorem 6.8). On the other hand, some circuit classes are too strong to admit efficient multilinearization
(under plausible complexity assumptions). That is, consider an n×n symbolic matrix X where (X)i, j = xi, j
and the polynomial f (X, y) := (x1,1 y1 + · · · + x1,n yn ) · · · (xn,1 y1 + · · · + xn,n yn ), which is clearly a simple
depth-3 (∏ ∑ ∏) circuit. Viewing this polynomial in F[X][y], one sees that CoeffX|y1 ···yn f = perm(X),
where perm(X) is the n × n permanent. Viewing ml( f ), the multilinearization of f , in F[X][y] one sees
that ml( f ) is of degree n and its degree n component is the coefficient of y1 · · · yn in ml( f ), which is still
perm(X). Hence, by interpolation, one can extract this degree n part and thus can compute a circuit for
perm(X) given a circuit for ml( f ). Since we believe perm(X) does not have small algebraic circuits it
follows that the multilinearization of f does not have small circuits. These examples show that efficient
multilinearization is a somewhat special phenomenon.
    We now give our result for multilinearizing roABPs, where we multilinearize variable by variable via
telescoping.

Proposition 4.5. Let f ∈ F[x1 , . . . , xn ] be computable by a width-r roABP, in order of the variables
x1 < · · · < xn , and with individual degrees at most d. Then ml( f ) has a poly(r, n, d)-explicit width-
r roABP in order of the variables x1 < · · · < xn , and there are poly(r, n, d)-explicit width-r roABPs
h1 , . . . , hn ∈ F[x] in order of the variables x1 < · · · < xn such that

                                                             n
                                         f (x) = ml( f ) + ∑ h j · (x2j − x j ) .
                                                            j=1


Further, ideg h j ≤ ideg f and the individual degree of the roABP for ml( f ) is ≤ 1.

Proof: Let f (x) = (∏ni=1 Ai (xi ))1,1 where Ai ∈ F[xi ]r×r have deg Ai ≤ d. We apply the multilinearization
map ml : F[x] → F[x] to matrices ml : F[x]r×r → F[x]r×r by applying the map entrywise (Theorem 3.12).
It follows then that Ai (xi ) − ml(Ai (xi )) ≡ 0 mod xi2 − xi , so that Ai (xi ) − ml(Ai (xi )) = Bi (xi ) · (xi2 − xi )
for some Bi (xi ) ∈ F[xi ]r×r where ideg Bi (xi ) ≤ ideg Ai (xi ).
     The idea of the proof is to construct the new roABP in stages. We first multilinearize the matrix
corresponding to x1 entry wise. This clearly does not change the dimension of the matrix. We can thus
write A1 (x1 ) = ml(A1 (x1 )) + B1 (x1 ) · (x12 − x1 ). We can therefore write our original roABP as the sum of
to roABPs, the first one having ml(A1 (x1 )) as the first matrix and the other one having B1 (x1 ) · (x12 − x1 ),
where the matrices in the other layers remain unchanged. The second roABP computes a polynomial
of the form h1 · (x12 − x1 ). We now continue multilinearizing the second layer in the first roABP etc.
Continuing in this way we get our desired representation of f .
     Formally, define ml≤i to be the map which multilinearizes the first i variables and leaves the others
intact, so that ml≤0 is the identity map and ml≤n = ml. Telescoping,
                                             !
            n                      n
           ∏ Ai (xi ) = ml<1 ∏ Ai (xi )
           i=1                    i=1
                                             !          "                   !                      !#
                                   n               n               n                      n
                       = ml≤n     ∏ Ai (xi ) + ∑ ml< j ∏ Ai (xi ) − ml≤ j ∏ Ai (xi )
                                  i=1             j=1             i=1                    i=1



                         T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                       30
               P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

using that the identity ml(gh) = ml(g) ml(h), for polynomials on disjoint sets of variables (Theorem 3.12),
naturally extends from scalars to matrices, and also to partial-multilinearization (by viewing ml≤i as
multilinearization in F[x>i ][x≤i ]),
                                                             "                                                                            #
                             n                          n
                       = ∏ ml(Ai (xi )) + ∑                     ∏ ml(Ai (xi )) ∏ Ai (xi ) − ∏ ml(Ai (xi )) ∏ Ai (xi )
                            i=1                     j=1           i< j               i≥ j            i≤ j                i> j
                                                             "                                                           #
                             n                          n                                                  
                       = ∏ ml(Ai (xi )) + ∑                     ∏ ml(Ai (xi )) · A j (x j ) − ml(A j (x j )) · ∏ Ai (xi )
                            i=1                     j=1           i< j                                                   i> j
                                                             "                                                               #
                             n                          n
                       = ∏ ml(Ai (xi )) + ∑                     ∏ ml(Ai (xi )) · B j (x j ) · ∏ Ai (xi ) · (x2j − x j ) ,
                            i=1                     j=1           i< j                        i> j


where we have used the definition of Bi (xi ) from above. Taking the (1, 1)-entry in the above yields that
                                  !
                        n
           f (x) =     ∏ Ai (xi )
                       i=1          1,1
                                          !                                                                   !
                        n                                   n
                 =     ∏ ml(Ai (xi ))               +∑               ∏ ml(Ai (xi )) · B j (x j ) · ∏ Ai (xi )            · (x2j − x j ) .
                       i=1                1,1               j=1      i< j                        i> j              1,1

Thus, define
                                          !                                                                                 !
                        n
               fˆ :=   ∏ ml(Ai (xi ))               ,               h j :=   ∏ ml(Ai (xi )) · B j (x j ) · ∏ Ai (xi )                 .
                       i=1                    1,1                             i< j                          i> j                1,1


It follows by construction that fˆ and each h j are computable by width-r roABPs of the desired explicitness
in the correct order of the variables. Further, ideg h j ≤ ideg f and fˆ has individual degree ≤ 1. Thus, the
above yields that f = fˆ + ∑ j h j · (x2j − x j ), from which it follows that ml( f ) = fˆ, as desired.

     We now conclude that in designing an roABP-IPSLIN refutation ∑ j g j · f j + ∑i hi · (xi2 − xi ) of
f1 (x), . . . , fm (x), x2 − x, it suffices to bound the complexity of the g j ’s.

Proposition 4.6. Let f1 , . . . , fm ∈ F[x1 , . . . , xn ] be a system of polynomials unsatisfiable over x ∈ {0, 1}n
computable by width-s roABPs in order of the variables x1 < · · · < xn . Suppose that there are g j ∈ F[x]
such that
                                              m
                                              ∑ g j (x) f j (x) ≡ 1 mod x2 − x ,
                                              j=1

where the g j have width-r roABPs in the order of the variables x1 < · · · < xn . Then there is an
roABP-IPSLIN refutation C(x, y, z) of individual degree at most 1 + ideg f and computable in width-
poly(s, r, n, m) in any order of the variables where x1 < · · · < xn . Furthermore, this refutation is
poly(s, r, ideg g, ideg f , n, m)-explicit given the roABPs for f j and g j .

                            T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                                             31
                 M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

Proof: We begin by noting that we can multilinearize the g j , so that ∑mj=1 ml(g j (x)) f j (x) ≡ 1 mod x2 −x,
and that ml(g j ) are poly(r, ideg g, n, m)-explicit multilinear roABPs of width-r by Theorem 4.5. Thus, we
assume going forward that the g j are multilinear.
    As g j and f j are computable by roABPs, their product g j f j is computable by a width-rs roABP in the
order of the variables x1 < · · · < xn (Theorem 3.8) with individual degree at most 1 + ideg f j (Lemma 3.6).
Thus, by the above multilinearization (Theorem 4.5), there are h j,i ∈ F[x] such that
                                                                        n
                                  g j (x) f j (x) = ml(g j f j ) + ∑ h j,i (x) · (xi2 − xi ) .
                                                                      i=1

where the h j,i are computable by width-rs roABPs of individual degree at most 1 + ideg f j . We now
define                                                            !
                                                    m                    n             m
                                  C(x, y, z) := ∑ g j (x)y j − ∑                      ∑ h j,i (x) zi .
                                                    j=1                 i=1           j=1

By the closure operations of roABPs (Theorem 3.8) it follows that C is computable by the appropriately
sized roABPs in the desired orders of the variables moreover that C has the desired individual degree, and
finally that C has the desired explicitness.
    We now show that this is a valid IPS refutation. Observe that C(x, 0, 0) = 0 and that
                                                                        !
                                                m                            n          m
                           C(x, f , x2 − x) = ∑ g j (x) f j (x) − ∑                    ∑ h j,i (x) (xi2 − xi )
                                               j=1                       i=1           j=1
                                                                                                          !
                                                m                                 n
                                            =∑            g j (x) f j (x) − ∑          h j,i (x)(xi2 − xi )
                                               j=1                               i=1
                                                m
                                            = ∑ ml(g j f j )
                                               j=1


as ∑mj=1 g j (x) f j (x) ≡ 1   mod x2 − x we have that
                                                                                             !
                                     m                            m
                                     ∑ ml(g j f j ) = ml ∑ g j (x) f j (x)                       =1,
                                     j=1                          j=1

where we appealed to linearity, and uniqueness, of multilinearization (Theorem 3.12), so that
                                                m
                           C(x, f , x2 − x) = ∑ ml(g j f j ) = 1 ,
                                               j=1

as desired.

4.3    Multilinear-formula-IPSLIN0
We now turn to proving that g · f ≡ ml(g · f ) mod x2 − x when f is low-degree and g is a multilinear
formula. This multilinearization can be used to complete our construction of multilinear-formula-IPS

                           T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                 32
               P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

refutations of subset-sum axiom (Section 4.4), though our actual construction will multilinearize more
directly (Theorem 4.15).
     More importantly, the multilinearization we establish here shows that multilinear-formula-IPS can
efficiently simulate sparse-IPSLIN (when the axioms are low-degree and multilinear). Such a simulation
holds intuitively, as multilinear formulas can efficiently compute any sparse (multilinear) polynomial,
and as we work over the Boolean cube we are morally working with multilinear polynomials. While
this intuition suggests that such a simulation follows immediately, this intuition is false. Specifically,
while sparse-IPSLIN is a complete refutation system for any system of unsatisfiable polynomials over the
Boolean cube, multilinear-formula-IPSLIN is incomplete as seen by the following example (though, by
Theorem 1.2, multilinear-formula-IPSLIN is complete for refuting unsatisfiable CNFs).

Example 4.7. Consider the unsatisfiable system of equations xy + 1, x2 − x, y2 − y. A multilinear linear-
IPS proof is a tuple of multilinear polynomials ( f , g, h) ∈ F[x, y] such that f · (xy + 1) + g · (x2 − x) +
                                                 1
h · (y2 − y) = 1. In particular, f (x, y) = xy+1   for x, y ∈ {0, 1}, which implies by interpolation over the
Boolean cube that f (x, y) = 1 · (1 − x)(1 − y) + 1 · (1 − x)y + 1 · x(1 − y) + 21 · xy = 1 − 21 · xy. Thus
 f · (xy + 1) contains the monomial x2 y2 . However, as g, h are multilinear we see that x2 y2 cannot appear
in g · (x2 − x) + h · (y2 − y) − 1, so that the equality f · (xy + 1) + g · (x2 − x) + h · (y2 − y) = 1 cannot hold.
Thus, xy + 1, x2 − x, y2 − y has no linear-IPS refutation only using multilinear polynomials.

      Put another way, the above example shows that in a linear-IPS refutation ∑ j g j f j + ∑i hi · (xi2 − xi ) = 1,
while one can multilinearize the g j (with a possible increase in complexity) and still retain a refutation,
one cannot multilinearize the hi in general.
      Thus, to simulate sparse-IPSLIN (a complete proof system) we must resort to using the more general
IPSLIN0 over multilinear formulas, where recall (Theorem 1.1) that the IPSLIN0 refutation system refutes
 f , x2 − x with a polynomial C(x, y, z) where C(x, 0, 0) = 0, C(x, f , x2 − x) = 1, with the added restriction
that when viewing C as in F[x, z][y] the degree of C with respect to the y-variables is at most 1, that
is, degy C ≤ 1. In fact, we establish such a simulation using the subclass of refutations of the form
C(x, y, z) = ∑ j g j y j +C0 (x, z) where C0 (x, 0) = 0. Note that such refutations are only linear in the non-
Boolean axioms, which allows us to circumvent Theorem 4.7.
      We now show how to prove g · f ≡ ml(g · f ) mod x2 − x when f is low-degree and g is a multilinear
formula, where the proof is supplied by the equality g · f − ml(g · f ) = C(x, x2 − x) where C(x, 0) = 0,
and where we seek to show that C has a small multilinear formula. We begin with the special case of f
and g being the same monomial.
Lemma 4.8. Let xa = ∏ni=1 xiai . Then, for a ≤ 1,

                                              (xa )2 − xa = C(x, x2 − x) ,

where C(x, z) ∈ F[x, z1 , . . . , zn ] is defined by

                                      C(x, z) := (z + x)a − xa =    ∑ zb xa−b ,
                                                                   0<b≤a


so that C(x, 0) = 0.

                          T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                  33
                M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

     Note that the first expression for C is a poly(n)-sized depth-3 expression, while the second is an
exp(n)-sized depth-2 expression. This difference will, going forward, show that we can multilinearize
efficiently in depth-3, but can only efficiently multilinearize low-degree monomials in depth-2. We now
give an IPS proof for showing how a monomial times a multilinear formula equals its multilinearization.

Lemma 4.9. Let g ∈ F[x1 , . . . , xn , y1 , . . . , yd ] be a multilinear polynomial.      Then there is a C ∈
F[x, y, z1 , . . . , zd ] such that

                                g(x, y) · y1 − ml(g(x, y) · y1 ) = C(x, y, y2 − y) ,

and C(x, y, 0) = 0.
    If g is t-sparse, then C is computable as a poly(t, n, 2d )-size depth-2 multilinear formula (which is
poly(t, n, 2d )-explicit given the computation for g), as well as being computable by a poly(t, n, d)-size
depth-3 multilinear formula with a +-output-gate (which is poly(t, n, d)-explicit given the computation
for g). If g is computable by a size-t depth-D multilinear formula, then C is computable by a poly(t, 2d )-
size depth-(D + 2) multilinear formula with a +-output-gate (which is poly(t, 2d )-explicit given the
computation for g).

Proof: defining C: Express g as g = ∑0≤a≤1 ga (x)y a in the ring F[x][y], so that each ga is multilinear.
Then
                                                                              !
            g(x, y) · y1 − ml(g(x, y) · y1 ) = ∑ ga (x)ya · y1 − ml   ∑ ga (x)ya · y1
                                                a                      a
                                                                
                                             = ∑ ga (x) ya+1 − y1 = ∑ ga (x)y1−a (ya )2 − ya ,
                                                                                            
                                                a                              a


and appealing to Lemma 4.8,

                                             = ∑ ga (x)y1−a ((y2 − y) + y)a − ya
                                                                                       
                                                a
                                             = C(x, y, y2 − y) ,

where we define
                                  C(x, y, z) := ∑ ga (x)y1−a (z + y)a − ya .
                                                                          
                                                    a

As (z + y)a = ya under z ← 0 we have that C(x, y, 0) = 0.
    g is t-sparse: As g is t-sparse and multilinear, so are each ga , so that ga (x) = ∑ti=1 αa,i xca,i , and thus

                                         t
                       C(x, y, z) = ∑ ∑ αa,i xca,i y1−a (z + y)a − ya
                                                                           
                                      a i=1
                                         t                                 t
                                  = ∑ ∑ αa,i xca,i y1−a (z + y)a − ∑ ∑ αa,i xca,i y1 ,
                                      a i=1                           a i=1


                        T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                   34
             P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

where this is clearly an explicit depth-3 (∑ ∏ ∑) multilinear formula (as y 1−a is variable-disjoint from
(z + y)a ), and the size of this formula is poly(n,t, d) as there are at most t such a where ga 6= 0 as g is
t-sparse. Continuing the expansion, appealing to Lemma 4.8,

                                           t                                          t
                                 = ∑ ∑ αa,i xca,i y1−a             ∑    zb ya−b − ∑ ∑ αa,i xca,i y1
                                     a i=1                      0≤b≤a              a i=1
                                           t
                                 = ∑ ∑ αa,i xca,i      ∑ zb ya−b
                                     a i=1           0<b≤a
                                           t
                                 = ∑∑          ∑ αa,i xc zb ya−b ,
                                                             a,i

                                     a i=1 0<b≤a


which is then easily seen to be explicit and poly(n,t, 2d )-sparse appealing to the above logic.
    g a multilinear formula: By interpolation, it follows that for each a there are poly(2d )-explicit con-
stants α a,β such that ga (x) = ∑β ∈{0,1}d αa,β g(x, β ). From this it follows that ga is computable by a depth
D + 1 multilinear formula of size poly(t, 2d ). Expanding the definition of C we get that

               C(x, y, z) = ∑ ga (x)y1−a (z + y)a − ya
                                                                   
                             a

                         = ∑ ga (x)y1−a (z + y)a − ∑ ga (x)y1
                             a                              a
                                                            1−a
                         =∑       ∑        αa,β g(x, β )y          (z + y)a − ∑     ∑        αa,β g(x, β )y1 ,
                             a β ∈{0,1}d                                       a β ∈{0,1}d


which is clearly an explicit depth-(D + 2) multilinear formula of size poly(t, 2d ), as D ≥ 1 so that the
computation of the zi + yi is parallelized with computing g(x, β ), and we absorb the subtraction into the
overall top-gate of addition.

   It is then straightforward to extend this to multilinearizing the product of a low-degree sparse
multilinear polynomial and a multilinear formula, as we can use that multilinearization is linear.
Corollary 4.10. Let g ∈ F[x1 , . . . , xn ] be a multilinear polynomial and f ∈ F[x] a s-sparse multilinear
polynomial of degree ≤ d. Then there is a C ∈ F[x, z1 , . . . , zn ] such that

                                       g · f − ml(g · f ) = C(x, x2 − x) ,

and C(x, 0) = 0.
    If g is t-sparse, then C is computable as a poly(s,t, n, 2d )-size depth-2 multilinear formula (which is
poly(s,t, n, 2d )-explicit given the computations for f , g), as well as being computable by a poly(s,t, n, d)-
size depth-3 multilinear formula with a +-output-gate (which is poly(s,t, n, d)-explicit given the compu-
tations for f , g). If g is computable by a size-t depth-D multilinear formula, then C is computable by a
poly(s,t, 2d )-size depth-(D + 2) multilinear formula with a +-output-gate (which is poly(s,t, 2d )-explicit
given the computations for f , g).

                       T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                     35
                M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

    We now show how to derive multilinear-formula-IPSLIN0 refutations for f , x2 − x from equations of
the form ∑ j g j f j ≡ 1 mod x2 − x.
Corollary 4.11. Let f1 , . . . , fm ∈ F[x1 , . . . , xn ] be degree at most d multilinear s-sparse polynomials that
are unsatisfiable over x ∈ {0, 1}n . Suppose that there are multilinear g j ∈ F[x] such that
                                          m
                                          ∑ g j (x) f j (x) ≡ 1 mod x2 − x .
                                          j=1

Then there is a multilinear-formula-IPSLIN0 refutation C(x, y, z) such that
    • If the g j are t-sparse, then C is computable by a poly(s,t, n, m, 2d )-size depth-2 multilinear formula
      (which is poly(s,t, n, m, 2d )-explicit given the computations for the f j , g j ), as well as being com-
      putable by a poly(s,t, n, m, d)-size depth-3 multilinear formula (which is poly(s,t, n, m, d)-explicit
      given the computations for f j , g j ).

    • If the g j are computable by size-t depth-D multilinear formula, then C is computable by a
      poly(s,t, m, 2d )-size depth-(D + 2) multilinear formula (which is poly(s,t, m, 2d )-explicit given
      the computations for f j , g j ).
Proof: construction: By the above multilinearization (Corollary 4.10), there are C j ∈ F[x, z] such that

                                      g j (x) f j (x) = ml(g j f j ) +C j (x, x2 − x) .

where C j (x, 0) = 0. We now define
                                                         m
                                       C(x, y, z) := ∑ (g j (x)y j −C j (x, z)) .
                                                        j=1

We now show that this is a valid IPS refutation. Observe that C(x, 0, 0) = 0 and that
                                                 m
                              C(x, f , x2 − x) = ∑ g j (x) f j (x) −C j (x, x2 − x)
                                                                                   
                                                       j=1
                                                        m
                                                    = ∑ ml(g j f j ) .
                                                       j=1

As ∑m                            2
    i=1 g j (x) f j (x) ≡ 1 mod x − x we have that
                                                                                          !
                                                m                        m
                                  2
                        C(x, f , x − x) = ∑ ml(g j f j ) = ml            ∑ g j (x) f j (x)    =1,
                                                j=1                      i=1

as desired.
    complexity: The claim now follows from appealing to Corollary 4.10 for bounding the complexity of
the C j . That is, if the g j are t-sparse then ∑mj=1 g j (x)y j is tm-sparse and thus computable by a poly(t, m)-
size depth-2 multilinear formula with +-output-gate. As each C j is computable by a poly(s,t, n, 2d )-size

                        T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                   36
               P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

depth-2 or poly(s,t, n, d)-size depth-3 multilinear formula (each having a +-output-gate), it follows that
C(x, y, z) := ∑mj=1 (g j (x)y j −C j (x, z)) can be explicitly computed by a poly(s,t, n, m, 2d )-size depth-2 or
poly(s,t, n, m, d)-size depth-3 multilinear formula (collapsing addition gates into a single level).
    If the g j are computable by size-t depth-D multilinear formula then ∑mj=1 g j (x)y j is computable by
size poly(m,t)-size depth-(D + 2) multilinear formula (with a +-output-gate), and each C j is computable
by a poly(s,t, 2d )-size depth-(D + 2) multilinear formula with a +-output-gate, from which it follows that
C(x, y, z) := ∑mj=1 (g j (x)y j −C j (x, z)) can be explicitly computed by a poly(s,t, m, 2d )-size depth-(D + 2)
multilinear formula by collapsing addition gates.

     We now conclude by showing that multilinear-formula-IPSLIN0 can efficiently simulate sparse-IPSLIN
when the axioms are low-degree. As this latter system is complete, this shows the former is as well. That
is, we allow the refutation to depend non-linearly on the Boolean axioms, but only linearly on the other
axioms.

Corollary 4.12. Let f1 , . . . , fm ∈ F[x1 , . . . , xn ] be degree at most d s-sparse polynomials unsatisfiable
over x ∈ {0, 1}n . Suppose they have an ∑ ∏-IPSLIN refutation, that is, that there are t-sparse polynomials
g1 , . . . , gm , h1 , . . . , hn ∈ F[x] such that ∑mj=1 g j f j + ∑ni=1 hi · (xi2 − xi ) = 1. Then f , x2 − x can be refuted
by a depth-2 multilinear-formula-IPSLIN0 proof of poly(s,t, n, m, 2d )-size, or by a depth-3 multilinear-
formula-IPSLIN0 proof of poly(s,t, n, m, d)-size.

Proof: This follows from Theorem 4.11 by noting that ∑ j ml(g j ) f j ≡ 1 mod x2 − x, and that the ml(g j )
are multilinear and t-sparse.

4.4    Refutations of the subset-sum axiom
We now give efficient IPS refutations of the subset-sum axiom, where these IPS refutations can be even
placed in the restricted roABP-IPSLIN or multilinear-formula-IPSLIN subclasses. That is, we give such
refutations for whenever the polynomial ∑i αi xi − β is unsatisfiable over the Boolean cube {0, 1}n , where
the size of the refutation is polynomial in the size of the set A := {∑i αi xi : x ∈ {0, 1}n }. A motivating
example is when α = 1 so that A = {0, . . . , n}.
    To construct our refutations, we first show that there is an efficiently computable polynomial f such
that f (x) · (∑i αi xi − β ) ≡ 1 mod x2 − x. This will be done by considering the univariate polynomial
p(t) := ∏α∈A (t − α). Using that for any univariate p(x) that x − y divides p(x) − p(y), we see that
p(∑i αi xi ) − p(β ) is a multiple of ∑i αi xi − β . As ∑i αi xi − β is unsatisfiable it must be that β ∈
                                                                                                        / A. This
                                      2
implies that p(∑i αi xi ) ≡ 0 mod x − x while p(β ) 6= 0. Consequently, p(∑i αi xi ) − p(β ) is equivalent
to a nonzero constant modulo x2 − x, yielding the Nullstellensatz refutation

                            1      p(∑i αi xi ) − p(β )
                                 ·                      · (∑i αi xi − β ) ≡ 1      mod x2 − x .
                          −p(β )      ∑i αi xi − β

By understanding the quotient p(∑∑i ααi xi xi )−p(β )                                                                  V
                                              i −β
                                                      we see that it can be efficiently computed by a small ∑ ∑
                                    i
formula and thus an roABP, so that using our multilinearization result for roABPs (Theorem 4.5) this
yields the desired roABP-IPSLIN refutation. However, this does not yield the desired multilinear-formula-
IPSLIN refutation. For this, we need to (over a large field) convert the above quotient to a sum of products

                          T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                          37
                M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

of univariates using duality (Theorem 3.14). We can then multilinearize this to a sum of products of linear
univariates, which is a depth-3 multilinear formula. By appealing to our proof-of-multilinearization result
for multilinear formula (Theorem 4.11) one obtains a multilinear-formula-IPSLIN0 refutation, and we give
a direct proof which actually yields the desired multilinear-formula-IPSLIN refutation.
     We briefly remark that for the special case of α = 1, one can explicitly describe the unique multilinear
polynomial f such that f (x)(∑i xi − β ) ≡ 1 mod x2 − x. This description (Theorem B.1) shows that f is
a linear combination of elementary symmetric polynomials, which implies the desired complexity upper
bounds for this case via known bounds on the complexity of elementary symmetric polynomials ([54]).
However, this proof strategy is more technical and thus we pursue the more conceptual outline given
above to bound the complexity of f for general A.

Proposition 4.13. Let α ∈ Fn , β ∈ F and A := {∑ni=1 αi xi : x ∈ {0, 1}n } be so that β ∈
                                                                                        / A. Then there is a
multilinear polynomial f ∈ F[x] such that

                                     f (x) · (∑i αi xi − β ) ≡ 1          mod x2 − x .

For any |F|, f is computable by a poly(|A|, n)-explicit poly(|A|, n)-width roABP of individual degree ≤ 1.
    If |F| ≥ poly(|A|, n), then f is computable as a sum of product of linear univariates (and hence a
depth-3 multilinear formula)
                                                       s
                                          f (x) = ∑ fi,1 (x1 ) · · · fi,n (xn ) ,
                                                      i=1

where each fi, j ∈ F[xi ] has deg fi, j ≤ 1, s ≤ poly(|A|, n), and this expression is poly(|A|, n)-explicit.

Proof: computing A: We first note that A can be computed from α in poly(|A|, n)-steps (as opposed to
                                                j
the naive 2n steps). That is, define A j := {∑i=1  αi xi : x1 , . . . , x j ∈ {0, 1}}, so that A0 = 0/ and An = A. It
follows that for all j, we have that A j ⊆ A and thus |A j | ≤ |A|, and that A j+1 ⊆ A j ∪ (A j + α j ). It follows
that we can compute A j+1 from A j in poly(|A|) time, so that A = An can be computed in poly(|A|, n)-time.
    defining f : Define p(t) ∈ F[t] by p(t) := ∏α∈A (t − α), so that p(A) = 0 and p(β ) 6= 0. Express p(t)
                                                |A|
in its monomial representation as p(t) = ∑k=0 γk t k , where the γk can be computed in poly(|A|) time from
α by computing A as above. Then observe that
                                                                     !
                                                |A|
                                                   tk − β k
                           p(t) − p(β ) =      ∑ γk t − β                (t − β )
                                               k=0
                                                                             !
                                                |A|        k−1
                                         =     ∑ γk ∑ t j β (k−1)− j (t − β )
                                               k=0          j=0
                                                                                    ! !
                                               |A|−1          |A|
                                                                         (k−1)− j
                                         =      ∑             ∑ γk β                 t j (t − β ) .
                                                j=0         k= j+1



                        T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                    38
             P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

Thus, plugging in t ← ∑i αi xi , we can define the polynomial g(x) ∈ F[x] by

                                         p(∑i αi xi ) − p(β )
                               g(x) :=
                                            ∑i αi xi − β
                                                                  !             !j
                                       |A|−1     |A|
                                    = ∑         ∑ γk β (k−1)− j       ∑ αi xi        .                     (4.1)
                                         j=0   k= j+1                  i

Hence,

                                   g(x)(∑i αi xi − β ) = p(∑i αi xi ) − p(β ) .

For any x ∈ {0, 1}n we have that ∑i αi xi ∈ A. As p(A) = 0 it follows that p(∑i αi xi ) = 0 for all x ∈ {0, 1}n .
This implies that p(∑i αi xi ) ≡ 0 mod x2 − x, yielding

                                 g(x)(∑i αi xi − β ) ≡ −p(β ) mod x2 − x .

As −p(β ) ∈ F \ {0}, we have that

                                 1
                                      · g(x) · (∑i αi xi − β ) ≡ 1    mod x2 − x .
                               −p(β )

We now simply multilinearize, and thus define the multilinear polynomial
                                                              
                                                    1
                                    f (x) := ml          · g(x) .
                                                  −p(β )

First, we see that this has the desired form, using the interaction of multilinearization and multiplication
(Theorem 3.12).
                                                                     
                                           1
                               1 = ml           g(x) · (∑i αi xi − β )
                                         −p(β )
                                                                           
                                                1
                                 = ml ml              · g(x) ml(∑i αi xi − β )
                                             −p(β )
                                                           
                                 = ml f · ml(∑i αi xi − β )
                                                        
                                 = ml f · (∑i αi xi − β ) .

Thus, f · (∑i αi xi − β ) ≡ 1 mod x2 − x as desired.
    computing f as an roABP: By Equation 4.1 we see that g(x) is computable by a poly(|A|, n)-size
 V                                                   V
∑ ∑-formula, and by the efficient simulation of ∑ ∑-formula by roABPs (Theorem 3.15) g(x) and thus
  1
−p(β ) · g(x) are computable by poly(|A|, n)-width roABPs of poly(|A|, n)-degree. Noting that roABPs
                                                                       1
can be efficiently multilinearized (Theorem 4.5) we see that f = ml( −p(β ) · g(x)) can thus be computed
by such an roABP also, where the individual degree of this roABP is at most 1. Finally, note that each of
these steps has the required explicitness.

                       T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                  39
                  M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

    computing f via duality: We apply duality (Theorem 3.14) to see that over large enough fields there
are univariates g j,`,i of degree at most |A|, where
                                                                   !               !j
                                 |A|−1      |A|
                                                        (k−1)− j
                        g(x) = ∑            ∑ γk β                       ∑ αi xi
                                  j=0     k= j+1                          i
                                                                   !
                                 |A|−1      |A|                        (n j+1)( j+1)
                              = ∑           ∑ γk β (k−1)− j                   ∑        g j,`,1 (x1 ) · · · g j,`,n (xn )
                                  j=0     k= j+1                           `=1

                                           
                        |A|
Absorbing the constant ∑k= j+1 γk β (k−1)− j into these univariates and re-indexing,

                                   s
                              = ∑ gi,1 (x1 ) · · · gi,n (xn )
                                 i=1

for some univariates gi, j , where s ≤ |A|(n|A| + 1)(|A| + 1) = poly(|A|, n).
    We now obtain f by multilinearizing the above expression.
                                                      
                                              1
                                  f = ml           g(x)
                                           −p(β )
                                                                                !
                                                    s
                                              1
                                    = ml           ∑ gi,1 (x1 ) · · · gi,n (xn )
                                           −p(β ) i=1

absorbing the constant 1/−p(β ) and renaming,
                                                                                   !
                                                    s
                                         = ml      ∑ g0i,1 (x1 ) · · · g0i,n (xn )
                                                   i=1
                                                                                                   !
                                                    s
                                         = ml      ∑ ml(g0i,1 (x1 )) · · · ml(g0i,n (xn ))
                                                   i=1


defining fi, j (x j ) := ml(gi, j (x j )), so that deg fi, j ≤ 1,
                                                                                   !
                                                    s
                                         = ml      ∑ fi,1 (x1 ) · · · fi,n (xn )
                                                   i=1


and we can drop the outside ml as this expression is now multilinear,

                                             s
                                         = ∑ fi,1 (x1 ) · · · fi,n (xn ) ,
                                            i=1

showing that f is computable as desired, noting that this expression has the desired explicitness.

                           T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                           40
             P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

    Note that computing f via duality also implies an roABP for f , but only in large enough fields
|F| ≥ poly(|A|, n). Of course, the field must at least have |F| ≥ |A|, but by using the field-independent
                  V
conversion of ∑ ∑ to roABP (Theorem 3.15) this shows that F need not be any larger than A for the
refutation to be efficient.
    The above shows that one can give an “IPS proof” g(x)(∑i αi xi − β ) + ∑i hi (x)(xi2 − xi ) = 1, where
g is efficiently computable. However, this may be an inefficient IPS proof as it does not bound the
complexity of the hi . We now extend this to an efficient IPS proof by using the above multilinearization
results for roABPs (Theorem 4.6), leveraging that ∑i αi xi − β is computable by an roABP in any order of
the variables (and that the above result works in any order of the variables).

Corollary 4.14. Let α ∈ Fn , β ∈ F and A := {∑ni=1 αi xi : x ∈ {0, 1}n } be so that β ∈
                                                                                      / A. Then ∑i αi xi −
     2
β , x − x has a poly(|A|, n)-explicit roABP-IPSLIN refutation of individual degree 2 computable in width-
poly(|A|, n) in any order of the variables.

   Note that while the above results give a small ∑ ∑ formula g such that g · (∑i αi xi − β ) ≡ −p(β )
                                                             V

mod x2 − x for nonzero scalar −p(β ), this does not translate to a ∑ ∑-IPS refutation as ∑ ∑ formulas
                                                                    V                     V

cannot be multilinearized efficiently (see the discussion in Section 4.2).
    We now turn to refuting the subset-sum axioms by multilinear-formula-IPSLIN (which is not itself a
complete proof system, but will suffice here). While one can use the multilinearization techniques for
multilinear-formula-IPSLIN0 of Theorem 4.11 it gives slightly worse results due to its generality, so we
directly multilinearize the refutations we built above using that the subset-sum axiom is linear.



Proposition 4.15. Let α ∈ Fn , β ∈ F and A := {∑ni=1 αi xi : x ∈ {0, 1}n } be so that β ∈              / A. If
|F| ≥ poly(|A|, n), then ∑i αi xi − β , x2 − x has a poly(|A|, n)-explicit poly(|A|, n)-size depth-3 multilinear-
formula-IPSLIN refutation.



Proof: By Theorem 4.13, there is a multilinear polynomial f ∈ F[x] such that f (x) · (∑i αi xi − β ) ≡ 1
mod x2 − x, and f is explicitly given as


                                                   s
                                         f (x) = ∑ fi,1 (x1 ) · · · fi,n (xn ) ,
                                                  i=1




where each fi, j ∈ F[xi ] has deg fi, j ≤ 1 and s ≤ poly(|A|, n).
    We now efficiently prove that f (x) · (∑ni=1 αi xi − β ) is equal to its multilinearization (which is 1)
modulo the Boolean cube. The key step is that for a linear function p(x) = γx + δ we have that
(γx + δ )x = (γ + δ )x + γ(x2 − x) = p(1)x + (p(1) − p(0))(x2 − x).

                        T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                 41
                 M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

    Thus,

                f (x) · (∑i αi xi − β )
                                                                      !
                               s
                     =        ∑ fi,1 (x1 ) · · · fi,n (xn ) · (∑i αi xi − β )
                              i=1
                          s
                     = ∑ −β fi,1 (x1 ) · · · fi,n (xn )
                         i=1
                                         s n                                                                      
                                       + ∑ ∑ α j ∏ fi,k (xk ) · fi, j (1)x j + ( fi, j (1) − fi, j (0))(x2j − x j )
                                         i=1 j=1             k6= j
                          s                                                     s   n
                     = ∑ −β fi,1 (x1 ) · · · fi,n (xn ) + ∑ ∑ α j ∏ fi,k (xk ) · fi, j (1)x j
                         i=1                                                   i=1 j=1    k6= j
                                             s       n
                                       + ∑ ∑ α j ∏ fi,k (xk ) · ( fi, j (1) − fi, j (0)) · (x2j − x j )
                                         i=1 j=1             k6= j


absorbing constants and renaming, using j = 0 to incorporate the above term involving β ,
                                                                                                  !
                          s        n     n                           n          s   n
                     = ∑ ∑ ∏ fi, j,k (xk ) + ∑                                 ∑ ∏ hi, j,k (xk ) (x2j − x j )
                         i=1 j=0 k=1                                 j=1       i=1 k=1


where each fi, j,k and hi, j,k are linear univariates. As f (x) · (∑ni=1 αi xi − β ) ≡ 1 mod x2 − x it follows that
∑i ∑ j ∏k fi, j,k (xk ) ≡ 1 mod x2 − x, but as each fi, j,k is linear it must actually be that ∑i ∑ j ∏k fi, j,k (xk ) =
1, so that,
                                                                           !
                                    n            s       n
                     = 1+ ∑                  ∑ ∏ hi, j,k (xk ) (x2j − x j ) .
                                   j=1       i=1 k=1


Define C(x, y, z) := f (x)y− ∑nj=1 h j (x)z j , where h j (x) := ∑si=1 ∏nk=1 hi, j,k (xk ). It follows that C(x, 0, 0) = 0
and that C(x, ∑i αi xi − β , x2 − x) = 1, so that C is a linear-IPS refutation. Further, as each f , h j is
computable as a sum of products of linear univariates, these are depth-3 multilinear formulas. By
distributing the multiplication of the variables y, z1 , . . . , zn to the bottom multiplication gates, we see that
C itself has a depth-3 multilinear formula of the desired complexity.


5    Lower bounds for linear-IPS via functional lower bounds
In this section we give functional circuit lower bounds for various measures of algebraic complexity,
such as degree, sparsity, roABPs and multilinear formulas. That is, while algebraic complexity typically
treats a polynomial f ∈ F[x1 , . . . , xn ] as a syntactic object, one can also see that it defines a function on
the Boolean cube fˆ : {0, 1}n → F. If this function is particularly complicated then one would expect
that this implies that the polynomial f requires large algebraic circuits. In this section we obtain such

                          T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                        42
             P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

lower bounds, showing that for any polynomial f (not necessarily multilinear) that agrees with a certain
function on the Boolean cube must in fact have large algebraic complexity.
     Our lower bounds will proceed by first showing that the complexity of f is an upper bound for the
complexity of its multilinearization ml( f ). While such a statement is known to be false for general
circuits (under plausible assumptions, see Section 4.2), such efficient multilinearization can be shown for
the particular restricted models of computation we consider. In particular, this multilinearization is easy
for degree and sparsity, for multilinear formulas (as f is already multilinear), and for roABPs this is seen
in Section 4.2. As then ml( f ) is uniquely defined by the function fˆ (Theorem 3.12), we then only need to
lower bound the complexity of ml( f ) using standard techniques. We remark that the actual presentation
will not follow the above exactly, as for roABPs and multilinear formulas it is just as easy to work directly
with the underlying lower bound technique.
     We then observe that by deriving such lower bounds for carefully crafted functions fˆ : {0, 1}n → F
one can easily obtain linear-IPS lower bounds for the above circuit classes. That is, consider the system
of equations f (x), x2 − x, where f (x) is chosen so this system is unsatisfiable, hence f (x) 6= 0 for all
x ∈ {0, 1}n . Any linear-IPS refutation yields an equation g(x) · f (x) + ∑i hi (x)(xi2 − xi ) = 1, which implies
that g(x) = 1/ f (x) for all x ∈ {0, 1}n (that this system is unsatisfiable allows us to avoid division by zero).
It follows that the polynomial g(x) agrees with the function ĝ(x) := 1/ f (x) on the Boolean cube. If the
function ĝ has a functional lower bound then this implies g must have large complexity, giving the desired
lower bound for the linear-IPS refutation.
     The section proceeds as follows. We begin by detailing the above strategy for converting functional
lower bounds into lower bounds for linear-IPS. We then derive a tight functional lower bound of n for the
degree of 1/(∑i xi +1). We then extend this via random restrictions to a functional lower bound of exp(Ω(n))
on the sparsity of 1/(∑i xi +1). We can then lift this degree bound to a functional lower bound of 2n on the
evaluation dimension of 1/(∑i xi yi +1) in the x|y partition, which we then symmetrize to obtain a functional
                                                                                                         
lower bound on the evaluation dimension in any partition of the related function 1/ ∑i< j zi, j xi x j +1 . In each
case, the resulting linear-IPS lower bounds are immediate via the known relations of these measures to
circuit complexity classes (Section 3).


5.1   The strategy
We give here the key lemma detailing the general reduction from linear-IPS lower bounds to functional
lower bounds.

Lemma 5.1. Let C ⊆ F[x1 , . . . , xn ] be a set of polynomials closed under partial substitutions. Let f ∈ F[x],
where the system f (x), x2 − x is unsatisfiable. Suppose that for all g ∈ F[x] with

                                                  1
                                       g(x) =         ,    ∀x ∈ {0, 1}n ,
                                                f (x)

       / C. Then f (x), x2 − x does not have C-IPSLIN refutations (nor C-IPSLIN0 refutations).
that g ∈

Proof: Suppose for contradiction that f (x), x2 − x has the C-IPSLIN refutation C(x, y, z) = g(x) · y +
∑i hi (x) · zi where C(x, f , x2 − x) = 1 (and clearly C(x, 0, 0) = 0). As g = C(x, 1, 0), it follows that g ∈ C

                        T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                  43
               M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

from the closure properties we assumed of C. Thus,

                                      1 = C(x, f , x2 − x)
                                        = g(x) · f (x) + ∑ hi (x)(xi2 − xi )
                                                             i
                                        ≡ g(x) · f (x) mod x2 − x .

Thus, for any x ∈ {0, 1}n , as f (x) 6= 0,
                                                 g(x) = 1/ f (x) .
However, this yields the desired contradiction, as this contradicts the assumed functional lower bound for
1/ f .


    We now note that the lower bound strategy of using functional lower bounds actually produces lower
bounds for IPSLIN0 (and even for the full IPS system if we have multilinear polynomials), and not just
IPSLIN . This is because we work modulo the Boolean axioms, so that any non-linear dependence on
these axioms vanishes, only leaving a linear dependence on the remaining axioms. This slightly stronger
lower bound is most interesting for multilinear-formulas, where the IPSLIN version is not complete in
general (Theorem 4.7) (though it is still interesting due to its short refutations of the subset-sum axiom
(Theorem 4.15)), while the IPSLIN0 version is complete (Theorem 4.12).

Lemma 5.2. Let C ⊆ F[x1 , . . . , xn ] be a set of polynomials closed under partial substitutions, and let
D be the set of differences of C, that is, D := {p(x) − q(x) : p, q ∈ C}. Let f ∈ F[x], where the system
f (x), x2 − x is unsatisfiable. Suppose that for all g ∈ F[x] with
                                                  1
                                       g(x) =         ,          ∀x ∈ {0, 1}n ,
                                                f (x)

       / D. Then f (x), x2 − x does not have C-IPSLIN0 refutations.
that g ∈
    Furthermore, if C (and thus D) are a set of multilinear polynomials, then f (x), x2 − x does not have
C-IPS refutations.

Proof: Suppose for contradiction that f (x), x2 − x has the C-IPSLIN0 refutation C(x, y, z). That
degy C(x, y, z) ≤ 1 implies there is the decomposition C(x, y, z) = C1 (x, z)y + C0 (x, z). As C1 (x, 0) =
C(x, 1, 0) − C(x, 0, 0), the assumed closure properties imply that C1 (x, 0) ∈ D. By the definition of an
IPS refutation, we have that 0 = C(x, 0, 0) = C1 (x, 0) · 0 + C0 (x, 0), so that C0 (x, 0) = 0. By using the
definition of an IPS refutation again, we have that 1 = C(x, f , x2 − x) = C1 (x, x2 − x) · f +C0 (x, x2 − x),
so that modulo the Boolean axioms,

                                   1 = C1 (x, x2 − x) · f +C0 (x, x2 − x)
                                     ≡ C1 (x, 0) · f +C0 (x, 0) mod x2 − x

using that C0 (x, 0) = 0,

                                     ≡ C1 (x, 0) · f      mod x2 − x .


                        T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                              44
              P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

Thus, for every x ∈ {0, 1}n we have that C1 (x, 0) = 1/ f (x) so that by the assumed functional lower bound
          / D, yielding the desired contradiction to the above C1 (x, 0) ∈ D.
C1 (x, 0) ∈
    Now suppose that C is a set of multilinear polynomials. Any C-IPS refutation C(x, y, z) of f (x), x2 − x
thus must have degy C ≤ 1 as C is multilinear, so that C is actually a C-IPSLIN0 refutation, thus the above
lower bound applies.

5.2    Degree of a polynomial
We now turn to obtaining functional lower bounds, and deriving the corresponding linear-IPS lower
bounds. We begin with a particularly weak form of algebraic complexity, the degree of a polynomial.
While it is trivial to obtain such bounds in some cases (as any polynomial that agrees with the AND
function on the Boolean cube {0, 1}n must have degree ≥ n), for our applications to proof complexity we
will need such degree bounds for functions defined by fˆ(x) = 1/ f (x) for simple polynomials f (x).
     We show that any multilinear polynomial agreeing with 1/ f (x), where f (x) is the subset-sum axiom
∑i i − β , must have the maximal degree n. We note that a degree lower bound of dn/2e + 1 was established
   x
by Impagliazzo, Pudlák, and Sgall [41] (Theorem A.4). They actually established this degree bound 10
when f (x) = ∑i αi xi − β for any α, while we only consider α = 1 here. However, we need the tight
bound of n here as it will be used crucially in the proof of Theorem 5.8.

Proposition 5.3. Let n ≥ 1 and F be a field with char(F) > n. Suppose that β ∈ F \ {0, . . . , n}. Let
f ∈ F[x1 , . . . , xn ] be a multilinear polynomial such that
                                                                !
                                         f (x)    ∑ xi − β          =1       mod x2 − x .
                                                   i

Then deg f = n.

Proof: ≤ n: This is clear as f is multilinear.
   ≥ n: Begin by observing that as β ∈ / {0, . . . , n}, this implies that ∑i xi − β is never zero on the Boolean
cube, so that the above functional equation implies that for x ∈ {0, 1}n the expression

                                                                      1
                                                       f (x) =              ,
                                                                  ∑i xi − β

is well defined.
    Now observe that this implies that f is a symmetric polynomial. That is, define the multilinear
polynomial g by symmetrizing f ,

                                                             1
                                    g(x1 , . . . , xn ) :=           f (xσ (1) , . . . , xσ (n) ) ,
                                                             n! σ∑
                                                                 ∈Sn

  10 The degree lower bound of Impagliazzo, Pudlák, and Sgall [41] (Theorem A.4) actually holds for the (dynamic) polynomial

calculus proof system (see Appendix A), while we only consider the (static) Nullstellensatz proof system here. Note that for
polynomial calculus Impagliazzo, Pudlák, and Sgall [41] also showed a matching upper bound of dn/2e + 1 for α = 1.


                          T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                          45
               M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

where Sn is the symmetric group on n symbols. Then we see that f and g agree on x ∈ {0, 1}n , as
                                            1
                                g(x) =              f (xσ (1) , . . . , xσ (n) )
                                            n! σ∑
                                                ∈Sn
                                            1            1         1           1
                                        =                        =
                                            n! σ∑     x
                                                ∈Sn ∑i σ (i) − β   n!  ∑
                                                                      σ ∈Sn ∑i i − β
                                                                              x
                                            1             1           1
                                        =      · n! ·           =           = f (x) .
                                            n!        ∑i xi − β   ∑i xi − β
It follows then that g = f as polynomials, since they are multilinear and agree on the Boolean cube
(Theorem 3.12). As g is clearly symmetric, so is f . Thus f can be expressed as f = ∑dk=0 γk Sn,k (x), where
d := deg f , Sn,k := ∑S∈([n]) ∏i∈S xi is the k-th elementary symmetric polynomial, and γk ∈ F are scalars
                          k
with γd 6= 0.
    Now observe that for k < n, we can understand the action of multiplying Sn,k by ∑i xi − β .
                        !                           !
                 ∑ xi − β       Sn,k (x) =       ∑            ∑ xi − β ∏ x j
                  i                             S∈([n]         i             j∈S
                                                    k)
                                                                                                        !
                                        =        ∑            ∑ xi ∏ x j + ∑ xi ∏ x j − β ∏ x j
                                                S∈([n]
                                                      )       i∈S
                                                               /     j∈S      i∈S   j∈S           j∈S
                                                    k
                                                                                                 

                                        =        ∑  ∑ ∏ x j + (k − β ) ∏ x j                           mod x2 − x
                                                                                                 
                                                S∈([n]        |T |=k+1 j∈T                  j∈S
                                                    k)
                                                                T ⊇S

                                        = (k + 1)Sn,k+1 + (k − β )Sn,k                    mod x2 − x .

Note that we used that each subset of [n] of size k + 1 contains exactly k + 1 subsets of size k.
   Putting the above together, suppose for contradiction that d < n. Then,
                                            !
                        1 = f (x)       ∑ xi − β                    mod x2 − x
                                            i
                                                 !                     !
                                   d
                            =     ∑ γk Sn,k               ∑ xi − β           mod x2 − x
                                  k=0                     i
                                                                     !
                                  d                                
                            =     ∑ γk (k + 1)Sn,k+1 + (k − β )Sn,k                          mod x2 − x
                                  k=0

                            = γd (d + 1)Sn,d+1 + (degree ≤ d) mod x2 − x .

However, as γd 6= 0, d + 1 ≤ n (so that d + 1 6= 0 in F and Sn,d+1 is defined) this shows that 1 (a multilinear
degree 0 polynomial) equals γd (d + 1)Sn,d+1 + (degree ≤ d) (a multilinear degree d + 1 polynomial)
modulo x2 − x, which is a contradiction to the uniqueness of representation of multilinear polynomials
modulo x2 − x. Thus, we must have d = n.

                       T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                          46
             P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

    To paraphrase the above argument, it shows that for multilinear f of deg f < n with ml( f (x) · (∑i xi −
β )) = 1 it holds that deg ml( f (x) · (∑i xi − β )) = deg f + 1. This contradicts the fact that deg 1 = 0, so
that deg f = n. It is tempting to attempt to argue this claim without using that ml( f (x) · (∑i xi − β )) = 1
in some way. That is, one could hope to argue that deg(ml( f (x) · (∑i xi − β ))) = deg f + 1 directly.
Unfortunately this is false, as seen by the example ml((x + y)(x − y)) = ml(x2 − y2 ) = x − y. However,
one can make this approach work to obtain a degree lower bound of dn/2e + 1, as shown by Impagliazzo,
Pudlák, and Sgall [41].
    Putting the above together with the fact that multilinearization is degree non-increasing we obtain
that any polynomial agreeing with ∑ x1i −β on the Boolean cube must be of degree ≥ n.
                                      i

Corollary 5.4. Let n ≥ 1 and F be a field with char(F) > n. Suppose that β ∈ F \ {0, . . . , n}. Let
f ∈ F[x1 , . . . , xn ] be a polynomial such that
                                                  !
                                   f (x)   ∑ xi − β   =1     mod x2 − x .
                                           i

Then deg f ≥ n.
Proof: By  simple properties of multilinearization      (Theorem 3.12) we see that 1 = ml f (x) ·
(∑i xi − β ) = ml ml( f ) · (∑i xi − β ) , so that ml( f ) · (∑i xi − β ) = 1 mod x2 − x. Thus deg f ≥
                                         

deg ml( f ) and deg ml( f ) = n by the above Theorem 5.3, yielding the claim.

    The above proof shows that the unique multilinear polynomial f agreeing with 1/(∑i xi −β ) on the
hypercube has degree n, but does so without actually specifying the coefficients of f . In Theorem B.1 we
compute the coefficients of this polynomial, giving an alternate proof that it has degree n (Corollary B.3).
In particular, this computation yields a small algebraic circuit for f , expressing it as an explicit linear
combination of elementary symmetric polynomials (which have small algebraic circuits).

5.3   Sparse polynomials
We now use the above functional lower bounds for degree, along with random restrictions, to obtain
functional lower bounds for sparsity. We then apply this to obtain exponential lower bounds for sparse-
IPSLIN refutations of the subset-sum axiom. Recall that sparse-IPSLIN is equivalent to the Nullstellensatz
proof system when we measure the size of the proof in terms of the number of monomials. While
we provide the proof here for completeness, we note that this result has already been obtained by
Impagliazzo-Pudlák-Sgall [41], who also gave such a lower bound for the stronger polynomial calculus
proof system.
    We first recall the random restrictions lemma. This lemma shows that by randomly setting half of the
variables to zero, sparse polynomials become sums of monomials involving few variables, which after
multilinearization is a low-degree polynomial.
Lemma 5.5. Let f ∈ F[x1 , . . . , xn ] be an s-sparse polynomial. Let ρ : F[x] → F[x] be the homomorphism
induced by randomly and independently setting each variable xi to 0 with probability 1/2 and leaving xi
intact with probability 1/2. Then with probability ≥ 1/2, each monomial in ρ( f (x)) involves ≤ lg s + 1
variables. Thus, with probability ≥ 1/2, deg ml(ρ( f )) ≤ lg s + 1.

                       T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                               47
                 M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

Proof: Consider a monomial xa involving ≥ t variables, t ∈ R. Then the probability that ρ(xa ) is nonzero
is at most 2−t . Now consider f (x) = ∑sj=1 α j xa j . By a union bound, the probability that any monomial
xa j involving at least t variables survives the random restriction is at most s2−t . For t = lg s + 1 this is at
most 1/2. The claim about the multilinearization of ρ( f (x)) follows by observing that for a monomial xa
involving ≤ lg s + 1 variables it must be that deg ml(ρ(xa )) ≤ lg s + 1 (Theorem 3.12).

    We now give our functional lower bound for sparsity. This follows from taking any refutation of the
subset-sum axiom and applying a random restriction. The subset-sum axiom will be relatively unchanged,
but any sparse polynomial will become (after multilinearization) low-degree, to which our degree lower
bounds (Section 5.2) can then be applied.

Proposition 5.6. Let n ≥ 8 and F be a field with char(F) > n. Suppose that β ∈ F \ {0, . . . , n}. Let
f ∈ F[x1 , . . . , xn ] be a polynomial such that

                                                                       1
                                                          f (x) =            ,
                                                                    ∑i i − β
                                                                      x

for x ∈ {0, 1}n . Then f requires ≥ 2n/4−1 monomials.

Proof: Suppose that f is s-sparse so that f (x) = ∑sj=1 α j xa j . Take a random restriction ρ as in Theo-
rem 5.5, so that with probability at least 1/2 we have that deg ml(ρ( f )) ≤ lg s+1. By the Chernoff bound,11
                                                                                         2
we see that ρ keeps alive at least n/4 variables with probability at least 1 − e−2·(1/4) ·n , which is ≥ 1 − e−1
for n ≥ 8. Thus, by a union bound the probability that ρ fails to have either that deg ml(ρ( f )) ≤ lg s + 1
or that it keeps at least n/4 variables alive is at most 1/2 + e−1 < 1. Thus a ρ exists obeying both properties.
    Thus, the functional equation for f implies that
                                                     !
                                      f (x)    ∑ xi − β         = 1 + ∑ hi (x)(xi2 − xi ) ,
                                                i                         i

for some hi ∈ F[x]. Applying the random restriction and multilinearization to both sizes of this equation,
we obtain that                                   !
                             ml(ρ( f )) ·       ∑ xi − β             ≡1       mod {xi2 − xi }ρ(xi )6=0 .
                                              ρ(xi )6=0

Thus, by appealing to the degree lower bound for this functional equation (Theorem 5.3) we obtain that
lg s + 1 ≥ deg ml(ρ( f )) is at least the number of variables which is ≥ n/4, so that s ≥ 2n/4−1 as desired.

    We remark that one can actually improve the sparsity lower bound to the optimal “≥ 2n ” by computing
the sparsity of the unique multilinear polynomial satisfying the above functional equation (Corollary B.3).
We now apply these functional lower bounds to obtain lower bounds for sparse-IPSLIN refutations of
∑i xi − β , x2 − x via our reduction (Theorem 5.1).
Corollary 5.7. Let n ≥ 1 and F be a field with char(F) > n. Suppose that β ∈ F \ {0, . . . , n}. Then
∑ni=1 xi − β , x2 − x is unsatisfiable and any sparse-IPSLIN refutation requires size exp(Ω(n)).
                                                                                                    2
                                                     1           n      ∑i i ∑i E[Xi ] ≤ −εn] ≤ e−2ε n .
  11 For independent [0, 1]-valued random variables X , . . . , X , Pr [ X −



                          T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                               48
             P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

5.4   Coefficient dimension in a fixed partition
We now seek to prove functional circuit lower bounds for more powerful models of computation such as
roABPs and multilinear formulas. As recalled in Section 3, the coefficient dimension complexity measure
can give lower bounds for such models. However, by definition it is a syntactic measure as it speaks
about the coefficients of a polynomial. Unfortunately, knowing that a polynomial f ∈ F[x] agrees with
a function fˆ : {0, 1}n → F on the Boolean cube {0, 1}n does not in general give enough information to
determine its coefficients. In contrast, the evaluation dimension measure is concerned with evaluations of
a polynomial (which is functional). Obtaining lower bounds for evaluation dimension, and leveraging the
fact that the evaluation dimension lower bounds coefficient dimension (Lemma 3.11) we can obtain the
desired lower bounds for this complexity measure.
    We now proceed to the lower bound. It will follow from the degree lower bound for the subset-sum
axiom (Theorem 5.4). That is, this degree bound shows that if f (z) · (∑i zi − β ) ≡ 1 mod z 2 − z then
 f must have degree ≥ n. We can then “lift” this lower bound by the use of a gadget, in particular by
replacing z ← x ◦ y, where ‘◦’ is the Hadamard (entrywise) product. Because the degree of f is maximal,
this gadget forces x and y to maximally “interact”, and hence the evaluation dimension is large in the x
versus y partition.

Proposition 5.8. Let n ≥ 1 and F be a field with char(F) > n. Suppose that β ∈ F \ {0, . . . , n}. Let
f ∈ F[x1 , . . . , xn , y1 , . . . , yn ] be a polynomial such that

                                                                1
                                            f (x, y) =                 ,
                                                          ∑i xi yi − β

for x, y ∈ {0, 1}n . Then dim Coeffx|y f ≥ 2n .

Proof: By showing that the coefficient dimension is not less than the evaluation dimension over the
Boolean cube (Lemma 3.11),

                               dim Coeffx|y f ≥ dim Evalx|y,{0,1} f
                                              = dim{ f (x, 1S ) : S ⊆ [n]}
                                              ≥ dim{ml( f (x, 1S )) : S ⊆ [n]} ,

where 1S ∈ {0, 1}n is the indicator vector for a set S. Note that we used that dimension is non-increasing
under linear maps. Now note that for x ∈ {0, 1}n ,

                                                               1
                                           f (x, 1S ) =               .
                                                          ∑i∈S xi − β

It follows that ml( f (x, 1S )) is a multilinear polynomial only depending on x|S (Theorem 3.12), and
by its functional behavior it follows from Theorem 5.3 that deg ml( f (x, 1S )) = |S|. As ml( f (x, 1S )) is
multilinear it thus follows that the leading monomial of ml( f (x, 1S )) is ∏i∈S xi , which is distinct for each
distinct S. This is also readily seen from the explicit description of ml( f (x, 1S )) given by Theorem B.1.

                       T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                 49
               M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

Thus, we can lower bound the dimension of this space by the number of leading monomials (Lemma 3.19),

                            dim Coeffx|y f ≥ dim{ml( f (x, 1S )) : S ⊆ [n]}
                                                                           
                                           ≥ LM {ml( f (x, 1S )) : S ⊆ [n]}
                                              (                )
                                           =       ∏ xi : S ⊆ [n]
                                                   i∈S
                                               n
                                           =2 .

    Note that in the above proof we crucially leveraged that the degree bound of Theorem 5.3 is exactly
n, not just Ω(n). This exact bound allows us to determine the leading monomials of these polynomials,
which seems not to follow from degree lower bounds of Ω(n).
    As coefficient dimension lower bounds roABP-width (Lemma 3.7) and depth-3 powering formulas
can be computed by roABPs in any order of the variables (Theorem 3.15), we obtain as a corollary our
functional lower bound for these models.
Corollary 5.9. Let n ≥ 1 and F be a field with char(F) > n. Suppose that β ∈ F \ {0, . . . , n}. Let
f ∈ F[x1 , . . . , xn , y1 , . . . , yn ] be a polynomial such that
                                                             1
                                           f (x, y) =               ,
                                                         ∑i i i − β
                                                           x y
for x, y ∈ {0, 1}n . Then f requires width ≥ 2n to be computed as an roABP in any order of the variables
where x precedes y. In particular, f requires exp(Ω(n)) size as a depth-3 powering formula.
   We now conclude with a lower bound for linear-IPS over roABPs in certain orders of the variables,
and thus also for depth-3 powering formulas, by appealing to our reduction to functional lower bounds
(Theorem 5.1).
Corollary 5.10. Let n ≥ 1 and F be a field with char(F) > n. Suppose that β ∈ F \ {0, . . . , n}. Then
∑ni=1 xi yi − β , x2 − x, y 2 − y is unsatisfiable and any roABP-IPSLIN refutation, where the roABP reads
x before y, requires width ≥ exp(Ω(n)). In particular, any ∑ ∑-IPSLIN refutation requires size ≥
                                                                    V

exp(Ω(n)).
Proof: That this system is unsatisfiable is clear from construction. The proof then follows from applying
our functional lower bound (Corollary 5.9) to our reduction strategy (Theorem 5.1), where we use that par-
tial evaluations of small roABPs yield small roABPs in the induced order of the variables (Theorem 3.8),
and that depth-3 powering formulas are a subclass of roABPs (in any order) (Theorem 3.15).

    The above result shows an roABP-IPSLIN lower bound for orders of the variables where x precedes y,
and we complement this by giving an upper bound showing there are small roABP-IPSLIN upper bounds
for orders of the variables where x and y are tightly interleaved. This is achieved by taking the roABP-
IPSLIN upper bound of Corollary 4.14 for ∑i zi − β , z2 − z under the substitution zi ← xi yi , and observing
that such substitutions preserve roABP width in the x1 < y1 < · · · < xn < yn order (Theorem 3.8). In
                V
particular, as ∑ ∑ formulas are small roABPs in every order of the variables, this allows us to achieve
                                      V
an exponential separation between ∑ ∑-IPSLIN and roABP-IPSLIN .

                       T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                              50
               P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

Corollary 5.11. Let n ≥ 1 and F be a field with char(F) > n. Suppose that β ∈ F \ {0, . . . , n}. Then
∑ni=1 xi yi − β , x2 − x, y2 − y is unsatisfiable, has a poly(n)-explicit poly(n)-size roABP-IPSLIN refutation
in the order of the variables x1 < y1 < · · · < xn < yn , and every ∑ ∑-IPSLIN refutation requires size
                                                                           V

≥ exp(Ω(n)).
                                                                                                       V
Proof: Theorem 5.10 showed that this system is unsatisfiable and has the desired ∑ ∑-IPSLIN lower
bound, so that it remains to prove the roABP upper bound.
    By Theorem 4.13 the unique multilinear polynomial f ∈ F[z] such that f (z) · (∑ni=1 zi − β ) ≡ 1
mod z2 − z has a multilinear poly(n)-size roABP in the order of the variables z1 < · · · < zn . Applying the
variable substitution zi ← xi yi , it follows that f 0 (x, y) := f (x1 y1 , . . . , xn yn ) obeys f 0 · (∑ni=1 xi yi − β ) ≡ 1
mod x2 − x, y2 − y (as z2i − zi ≡ 0 mod x2 − x, y2 − y under the substitution zi ← xi yi ) and that f 0 is com-
putable by a poly(n)-size roABP in the order of the variables x1 < y1 < · · · < xn < yn (Theorem 3.8, using
that f has individual degree 1). Appealing to the efficient multilinearization of roABPs (Theorem 4.6)
completes the claim as ∑i xi yi − β is computable by a poly(n)-size roABP (in any order).

5.5    Coefficient dimension in any variable partition
The previous section gave functional lower bounds for coefficient dimension, and thus roABP width, in
the x|y variable partition. However, this lower bound fails for other order of the variablesings where x and
y are interleaved because of corresponding upper bounds (Theorem 5.11). In this section we extend the
lower bound to any order of the variablesing by using suitable auxiliary variables to plant the previous
lower bound into any partition we desire by suitably evaluating the auxiliary variables.
     We begin by developing some preliminaries for how coefficient dimension works in the presence
of auxiliary indicator variables. That is, consider a polynomial f (x, y, z) where we wish to study the
coefficient dimension of f in the x|y partition. We can view this polynomial as lying in F[z][x, y] so that
its coefficients are polynomials in z and one studies the dimension of the coefficient space in the field of
rational functions F(z). Alternatively one can evaluate z at some point z ← α so that f (x, y, α) ∈ F[x, y]
and study its coefficient dimension over F. The following straightforward lemma shows the first dimension
over F(z) is lower-bounded by the second dimension over F.
Lemma 5.12. Let f ∈ F[x, y, z]. Let fz denote f as a polynomial in F[z][x, y], so that for any α ∈ F|z| we
have that fα (x, y) = f (x, y, α) ∈ F[x, y]. Then for any such α,

                                 dimF(z) Coeffx|y fz (x, y) ≥ dimF Coeffx|y fα (x, y) .

Proof: Let f (x, y, z) be written in F[x, y, z] as f = ∑a,b fa,b (z)xa y b . By Lemma 3.4 we see that
dimF(z) Coeffx|y fz (x, y) is equal to the rank (over F(z)) of the coefficient matrix C fz , so that its en-
tries (C fz )a,b = fa,b (z) are in F[z]. Similarly, dimF Coeffx|y fα (x, y) is equal to the rank (over F) of the
coefficient matrix C fα , so that as f (x, y, α) = ∑a,b fa,b (α)xa yb we have that (C fα )a,b = fa,b (α), which is
in F. Thus, it follows that C fz |z←α = C fα .
    The claim then follows by noting that for a matrix M(w) ∈ F[w]r×r it holds that rankF(w) M(w) ≥
rankF M(β ) for any β ∈ F|w| . This follows as the rank of M(w) is equal to the maximum size of a minor
with a non-vanishing determinant. Thus, determinants are polynomials in w, and they can only further
vanish when w ← β .

                          T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                           51
                M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

     We now use auxiliary variables to embed the coefficient dimension lower bound from Theorem 5.8
into any order of the variables. We do this by viewing the polynomial ∑i ui vi − β as using a matching
between variables in u and v. We then wish to embed this matching graph-theoretically into a complete
graph, where nodes are labelled with the variables x. Any equipartition of this graph will induce many
edges across this cut, and we can drop edges to find a large matching between the x variables which we
then identify as instance of ∑i ui vi − β . We introduce one new auxiliary variable zi, j per edge which, upon
setting it to 0 or 1, allows us to have this edge (respectively) dropped from or kept in the desired matching.
This leads to the new (symmetrized) equation ∑i< j zi, j xi x j − β , for which we now give the desired lower
bound.

Proposition 5.13. Let n ≥ 1 and F be a field with char(F) > 2n                                                   2n
                                                                                                                  
                                                                            2 . Suppose that β ∈ F \ {0, . . . , 2 }.
Let f ∈ F[x1 , . . . , x2n , z1 , . . . , z(2n) ] be a polynomial such that
                                  2


                                                                  1
                                           f (x, z) =                          ,
                                                        ∑i< j zi, j xi x j − β
                              2n
for x ∈ {0, 1}2n , z ∈ {0, 1}( 2 ) . Let fz denote f as a polynomial in F[z][x]. Then for any partition x = (u, v)
with |u| = |v| = n,
                                              dimF(z) Coeffu|v fz ≥ 2n .

Proof: We wish to embed ∑i ui vi − β in this instance via a restriction of z. Define the z-evaluation
           2n
α ∈ {0, 1}( 2 ) to restrict f to sum over those xi x j in the natural matching between u an v, so that
                                                 (
                                                  1 xi = uk , x j = vk
                                         αi. j =                       .
                                                  0 else

It follows then that f (u, v, α) = ∑n u1k vk −β for u, v ∈ {0, 1}n . Thus, by appealing to our lower bound
                                    k=1
for a fixed partition (Theorem 5.8) and the relation between the coefficient dimension in fz versus fα
(Theorem 5.12),

                               dimF(z) Coeffu|v fz (u, v) ≥ dimF Coeffu|v fα (u, v)
                                                             ≥ 2n .
                                                       √
    We remark that this lower bound is only exp(Ω( m)) where m = 2n + 2n
                                                                                 
                                                                                2 is the number of total
variables, while one could hope for an exp(Ω(m)) lower bound as 2m is the trivial upper bound for
multilinear polynomials. One can achieve such a lower bound by replacing the above auxiliary variable
scheme (which corresponds to a complete graph) with one derived from a constant-degree expander graph.
That is because in such graphs any large partition of the vertices induces a large matching across that
partition, where one can then embed the fixed-partition lower bounds of the previous section (Section 5.4).
However, we omit the details as this would not qualitatively change the results.
    We now obtain our desired functional lower bounds for roABPs and multilinear formulas.

                        T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                    52
               P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

Corollary 5.14. Let n ≥ 1 and F be a field with char(F) > 2n                                                  2n
                                                                                                               
                                                                        2  . Suppose that β ∈ F \ {0, . . . ,  2 }. Let
f ∈ F[x1 , . . . , x2n , z1 , . . . , z(2n) ] be a polynomial such that
                                 2


                                                                      1
                                                f (x, z) =                         ,
                                                             ∑i< j i, j xi x j − β
                                                                  z
                                2n
for x ∈ {0, 1}2n , z ∈ {0, 1}( 2 ) . Then f requires width ≥ 2n to be computed by an roABP in any order of the
variables. Also, f requires nΩ(log n) -size to be computed as a multilinear formula. For d = o(log n/log log n),
                        1/d 2
 f requires nΩ((n/log n) /d ) -size multilinear formulas of product-depth-d.

Proof: roABPs: Suppose that f (x, z) is computable by a width-r roABP in some order of the variables.
By pushing the z variables into the fraction field, it follows that fz ( f as a polynomial in F[z][x]) is also
computable by a width-r roABP over F(z) in the induced order of the variables on x (Theorem 3.8). By
splitting x in half along its order of the variables one obtains the lower bound by combining the coefficient
dimension lower bound of Theorem 5.13 with its relation to roABPs (Lemma 3.7).
     multilinear formulas: This follows immediately from our coefficient dimension lower bound (Theo-
rem 5.13) and the Raz [63] and Raz-Yehudayoff [68] results (Theorem 3.13).

   As before, this immediately yields the desired roABP-IPSLIN and multilinear-formula-IPS lower
bounds.

Corollary 5.15. Let n ≥ 1 and F be a field with char(F) > 2n                                                                  2n
                                                                                                                               
                                                                                         2 . Suppose that β ∈ F \ {0, . . . , 2 }.
Then ∑i< j zi, j xi x j − β , x2 − x, z2 − z ∈ F[x1 , . . . , x2n , z1 , . . . , z(2n) ] is unsatisfiable, and any roABP-IPSLIN
                                                                             2
refutation (in any order of the variables) requires exp(Ω(n)) size. Further, any multilinear-formula-IPS
refutation requires nΩ(log n) -size, and any product-depth-d multilinear-formula-IPS refutation requires
            1/d 2
nΩ((n/log n) /d ) -size.

Proof: The system is unsatisfiable as any setting of x ∈ {0, 1}n yields a sum over at most 2n
                                                                                              
                                                                                            2   z-variables,
                             2n
                               
which must be in {0, . . . , 2 } which by hypothesis does not contain β .
    The roABP-IPSLIN lower bound follows immediately from the above functional lower bound (Theo-
rem 5.14) along with our reduction (Theorem 5.1), just as in Theorem 5.10.
    The multilinear-formula-IPS lower bound also follows immediately from the above functional lower
bound (Theorem 5.14) along with our reduction from IPS lower bounds to functional lower bounds for
multilinear polynomials (Theorem 5.2). In particular, this application uses that multilinear formulas are
closed under partial evaluations, and that taking the difference of two formulas will only double its size
and does not change the product depth.


6     Lower bounds for multiples of polynomials
In this section we consider the problem of finding explicit polynomials whose nonzero multiples are all
hard. Such polynomials are natural to search for, as intuitively if f is hard to compute then so should
small modifications such as x1 f 2 + 4 f 3 . This intuition is buttressed by Kaltofen’s [44] result that if a
polynomial has a small algebraic circuit then so do all of its factors (up to some pathologies in small

                           T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                              53
                 M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

characteristic). Taken in a contrapositive, this says that if a polynomial f requires superpolynomial size
algebraic circuits, then so must all of its nonzero multiples. Thus, for general circuits the question of
lower bounds for multiples reduces to the standard lower bounds question.
    Unfortunately, for many restricted classes of circuits where lower bounds are known (depth-3 powering
formulas, sparse polynomials, roABPs) Kaltofen’s [44] result produces circuits for the factors which
do not fall into (possibly stronger) restricted classes of circuits where lower bounds are still known.12
Therefore, developing lower bounds for multiples against these restricted classes seems to require further
work beyond the standard lower bound question.
    We will begin by discussing the applications of this problem to the hardness versus randomness
paradigm in algebraic complexity. We then use existing derandomization results to show that multiples
of the determinant are hard for certain restricted classes. However, this method is very rigidly tied to
the determinant. Thus, we also directly study existing lower bound techniques for restricted models of
computation (depth-3 powering formulas, sparse polynomials, and roABPs) and extend these results
to also apply to multiples. We will show the applications of such polynomials to proof complexity in
Section 7.


6.1    Connections to hardness versus randomness and factoring circuits
To motivate the problem of finding polynomials with hard multiples, we begin by discussing the hardness
versus randomness approach to derandomizing polynomial identity testing. That is, Kabanets and
Impagliazzo [43] extended the hardness versus randomness paradigm of Nisan and Wigderson [53]
to the algebraic setting, showing that sufficiently good algebraic circuit lower bounds for an explicit
polynomial would qualitatively derandomize PIT. While much of the construction is similar (using
combinatorial designs, hybrid arguments, etc.) to the approach of Nisan and Wigderson [53] for Boolean
derandomization, there is a key difference. In the Boolean setting, obtaining a hardness versus randomness
connection requires converting worst-case hardness (no small computation can compute the function
everywhere) to average-case hardness (no small computation can compute the function on most inputs).
Such a reduction (obtained by Impagliazzo and Wigderson [42]) can in fact be obtained using certain
error-correcting codes based on multivariate polynomials (as shown by Sudan, Trevisan and Vadhan [82]).
    Such a worst-case to average-case reduction is also needed in the algebraic setting, but as multivariate
polynomials are one source of this reduction in the Boolean regime, it is natural to expect it to be easier
in the algebraic setting. Specifically, the notion of average-case hardness for a polynomial f (x) used in
Kabanets–Impagliazzo [43] is that for any g(x, y) satisfying g(x, f (x)) = 0, it must be that g then requires
large algebraic circuits (by taking g(x, y) := y − f (x) this implies f itself requires large circuits). This can
be interpreted as average-case hardness because if such a g existed with a small circuit, then for any value
α we have that g(α, y) is a univariate polynomial that vanishes on f (α). By factoring this univariate
(which can be done efficiently), we see that such g give a small list (of size at most deg g) of possible values
for f (α). By picking a random element from this list, one can correctly compute f (x) with noticeable
probability, which by an averaging argument one can convert to a (non-uniform) deterministic procedure
to compute f (x) on most inputs (over any fixed finite set). While this procedure (involving univariate
  12 While some results ([18, 56]) can bound the depth of the factors in terms of the depth of the input circuit, there are only

very weak lower bounds known for constant-depth algebraic circuits.


                          T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                              54
             P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

factorization) is not an algebraic circuit, the above argument shows that the Kabanets–Impagliazzo [43]
notion is a natural form of average case hardness.
    To obtain this form of average-case hardness from worst-case hardness, Kabanets and Impagli-
azzo [43] used a result of Kaltofen [44], who showed that (up to pathologies in low-characteristic fields),
factors of small (general) circuits have small circuits. As g(x, f (x)) = 0 iff y − f (x) divides g(x, y), it
follows that if g(x, y) has a small circuit then so does y − f (x), and thus so does f (x). Taking the contra-
positive, if f requires large circuits (worst-case hardness) then any such g(x, y) with g(x, f (x)) = 0 also
requires large circuits (average-case hardness). Note that this says that any worst-case hard polynomial is
also average-case hard. In contrast, this is provably false for Boolean functions, where such worst-case to
average-case reductions thus necessarily modify the original function.
    As mentioned above, Kaltofen’s [44] factoring algorithm does not preserve structural restrictions
(such as multilinearity, homogeneousness, small-depth, read-once-ness, etc.) of the original circuit, so
that obtaining average-case hardness for restricted classes of circuits requires worst-case hardness for
much stronger classes. While follow-up work has reduced the complexity of the circuits resulting from
Kaltofen’s [44] algorithm (Dvir–Shpilka–Yehudayoff [18] and Oliveira [56] extended Kaltofen’s [44] to
roughly preserve the depth of the original computation) this work is limited to factoring polynomials of
small individual degree and does not seem applicable to other types of computations such as roABPs.
Indeed, it even remains an open question to show any non-trivial upper bounds on the complexity of the
factors of sparse polynomials. In fact, we actually have non-trivial lower bounds. Specifically, von zur
Gathen and Kaltofen [30] gave an explicit s-sparse polynomial (over any field) which has a factor with
sΩ(log s) monomials, and Volkovich [85] gave, for a prime p, an explicit n-variate n-sparse polynomial of
degree-p which in characteristic p has a factor with n+p−2 n−1    monomials (an exponential separation for
p ≥ poly(n)). We refer the reader to the survey of Forbes and Shpilka [28] for more on the challenges in
factoring small algebraic circuits.
    While showing the equivalence of worst-case and average-case hardness for restricted circuit classes
seems difficult, to derandomize PIT via the method of Kabanets–Impagliazzo [43] only requires a single
polynomial that is average case hard. To facilitate obtaining such hard polynomials, we now record an
easy lemma showing that polynomials with only hard multiples are average-case hard.

Lemma 6.1. Let f (x) ∈ F[x] and g(x, y) ∈ F[x, y] both be nonzero, where g(x, 0) 6= 0 also. If g(x, f (x)) = 0
then g(x, 0) is a nonzero multiple of f (x).

Proof: Let g(x, y) = ∑i gi (x)yi and g0 (x) := g(x, 0). That g(x, f (x)) = 0 implies that

                        0 = g(x, f (x)) = ∑ gi (x)( f (x))i = g0 (x) + ∑ gi (x)( f (x))i
                                            i                          i≥1


so that g0 (x) = f (x) · − ∑i≥1 gi (x)( f (x))i−1 as desired.
                                                 


    That is, saying that f (x) is not average-case hard means that g(x, f (x)) = 0 for a nonzero g(x, y).
One can assume that g(x, 0) 6= 0, as otherwise one can replace g by g/yi for some i ≤ deg g, as this only
mildly increases the size for most measures of circuit size (see for example Section 4.1). As then the
complexity of g(x, 0) is bounded by that of g(x, y) (for natural measures), the lemma shows then that f
has a nonzero multiple of low-complexity. Taken contrapositively, if f only has hard nonzero multiples

                       T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                               55
                  M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

then it is average-case hard in the sense needed for applying the technique of Kabanets–Impagliazzo [43].
This shows that lower bounds for multiples is essentially the lower bound needed for algebraic hardness
versus randomness.13
     While in the sections below we are able to give explicit polynomials with hard multiples for various
restricted classes of algebraic circuits, some of these classes (such as sparse polynomials and roABPs)
still do not have the required closure properties to use Kabanets–Impagliazzo [43] to obtain deterministic
PIT algorithms. Even for classes with the needed closure properties (such as ∑ ∑ ∏O(1) formulas, where
                                                                                V

the hard polynomial is the monomial), the resulting PIT algorithms are only worse than existing results
(which for ∑ ∑ ∏O(1) formulas is the result of Forbes [21]). However, it seems likely that future results
               V

establishing polynomials with hard multiples would imply new PIT algorithms.

6.2     Lower bounds for multiples via PIT
This above discussion shows that obtaining lower bounds for multiples is sufficient for instantiating
the hardness versus randomness paradigm. We now observe the converse, showing that one can obtain
such polynomials with hard multiples via derandomizing (black-box) PIT, or equivalently, producing
generators with small seed-length. That is, Heintz–Schnorr [39] and Agrawal [1] showed that one can use
explicit generators for small circuits to obtain hard polynomials, and we observe here that the resulting
polynomials also have only hard multiples.
    Note that we give the construction based on a non-trivial generator for a class of circuits. While
one can analogously prove the hitting-set version of this claim, it is weaker. That is, it is possible to
consider classes C of unbounded degree and still have generators with small seed-length (see for example
Theorem 6.5 below), but for such classes one must have hitting sets with infinite size (as hitting univariate
polynomials of unbounded degree requires an infinite number of points).

Lemma 6.2. Let C ⊆ F[x] be a class of polynomials and let G : F` → F|x| be a generator for C. Suppose
0 6= h ∈ F[x] has h ◦ G = 0. Then for any nonzero g ∈ F[x] we have that g · h ∈
                                                                              / C.

Proof: By definition of G, for any f ∈ C, f = 0 iff f ◦ G = 0. Then for any nonzero g, g · h 6= 0 and
(g · h) ◦ G = (g ◦ G) · (h ◦ G) = (g ◦ G) · 0 = 0. Thus, we must have that g · h ∈
                                                                                 / C.

     That is, if ` < n then such an h exists (as the coordinates of G are algebraically dependent) and such
an h can be found in exponential time by solving an exponentially large linear system. Thus, h is a
weakly explicit polynomial with only hard multiples, which is sufficient for instantiating hardness versus
randomness.
     While there are now a variety of restricted circuit classes with non-trivial (black-box) PIT results,
it seems challenging to find for any given generator G an explicit nonzero polynomial f with f ◦ G = 0.
Indeed, to the best of our knowledge no such examples have ever been furnished for interesting generators.
Aside from the quest for polynomials with hard multiples, this question is independently interesting
as it demonstrates the limits of the generator in question, especially for generators that are commonly
used. There is not even a consensus as to whether the generators currently constructed could suffice
  13 However, it is not an exact equivalence between lower bounds for multiples and average case hardness, as the converse to

Theorem 6.1 is false, as seen by considering g(x, y) := y − x(x + 1), so that x|g(x, 0) but g(x, x) 6= 0.


                            T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                         56
                 P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

to derandomize PIT for general circuits. Agrawal [1] has even conjectured that a certain generator for
depth-2 circuits (sparse polynomials) would actually suffice for PIT of constant-depth circuits.
     We consider here the generator of Shpilka–Volkovich [76]. This generator has a parameter `, and
intuitively can be seen as an algebraic analogue of the Boolean pseudorandomness notion of a (randomness
efficient) `-wise independent hash function. Just as `-wise independent hash functions are ubiquitous in
Boolean pseudorandomness, the Shpilka–Volkovich [76] generator has likewise been used in a number of
papers on black-box PIT (for example [76, 7, 26, 24, 85, 21] is a partial list). Therefore, we believe it is
important to understand the limits of this generator.
     The Shpilka–Volkovich [76] generator is really a family of generators that all share a certain property.
                        SV                                               SV
Specifically, the map G`,n : Fr → Fn has the property 14 that the image G`,n (Fr ) contains all `-sparse vectors
in Fn . The most straightforward construction of a randomness efficient generator with this property (via
Lagrange interpolation, given by Shpilka–Volkovich [76]) has r = 2`. Even this construction is actually a
family of possible constructions, as there is significant freedom to choose the finite set of points where
Lagrange interpolation will be performed. Therefore, instead of studying a specific generator we seek to
                                                                                                        SV0
understand the power of the above property, and thus we are free to construct another generator G`,n with
                                                                                                      SV0
this property for which we can find an explicit nonzero f where f ◦ G`,n = 0 for small `. We choose a
variant of the original construction so that we can take f as the determinant.
     In the original Shpilka–Volkovich [76] generator, one first obtains the ` = 1 construction by using
two variables, the control variable y and another variable z. By using Lagrange polynomials to simulate
indicator functions, the value of y can be set to choose between the outputs ze1 , . . . , zen ∈ F[z]n , where
ei ∈ Fn is the i-th standard basis vector. By varying z one obtains all 1-sparse vectors in Fn . To obtain
  SV                                          SV
G`,n one can sum ` independent copies of G1,n . In contrast, our construction will simply use a different
control mechanism, where instead of using univariate polynomials we use bivariates.
Construction 6.3. Let n, ` ≥ 1. Let F be a field of size ≥ n. Let Ω := {ω1 , . . . , ωn } be distinct elements in
           SV0
F. Define G1,n2 : F3 → Fn×n by
                                SV0            
                                 G1,n2 (x, y, z)  = z · 1ωi ,Ω (x) · 1ω j ,Ω (y) .
                                                                 i, j

where 1ωi ,Ω (x) ∈ F[x] is the unique univariate polynomial of degree < n such that
                                                         (
                                                          1 j=i
                                         1ωi ,Ω (ω j ) =         .
                                                          0 else
           SV0
Define G`,n2 : F3` → Fn×n by the polynomial map
                         SV0                                            SV0                              SV0
                       G`,n2 (x1 , y1 , z1 , . . . , x` , y` , z` ) := G1,n2 (x1 , y1 , z1 ) + · · · + G1,n2 (x` , y` , z` ) ,
working in the ring F[x, y, z].                                                                                                             ♦
  14 The most obvious algebraic analogue of an `-wise independent hash function would require that for a generator G : Fr → Fn

that any subset S ⊆ [n] with |S| ≤ ` the output of G restricted to S is all of FS . This property is implied by the Shpilka–
Volkovich [76] property, but is strictly weaker, and is in fact too weak to be useful for PIT. That is, consider the map
(x1 , . . . , xn ) 7→ (x1 , . . . , xn , x1 + · · · + xn ). This map has this naive “algebraic n-wise independence” property, but does not even
fool linear polynomials (which the Shpilka–Volkovich [76] generator does).


                               T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                                       57
                  M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

    Note that this map has n2 outputs. Now observe that it is straightforward to see that this map has the
desired property.
                                                                                                               SV0
Lemma 6.4. Assume the setup of Construction 6.3. Then the image of the generator, G`,n2 (F3` ), contains
all `-sparse vectors in Fn×n .
    To the best of the authors knowledge, existing work using the Shpilka–Volkovich [76] generator 15
only use the above property (and occasionally also the fact that a coordinatewise sum of constantly many
such generators is a generator of the original form with similar parameters ([3, 24, 38, 21]), which our
alternate construction also satisfies). Thus, we can replace the standard construction with our variant in
known black-box PIT results (such as [76, 3, 26, 24, 38, 21]), some of which we now state.
                                                                                SV0
Corollary 6.5. Assume the setup of Construction 6.3. Then GO(log s),n2 is a generator for the following
classes of n-variate polynomials, of arbitrary degree.

     • Polynomials of sparsity s ([76, 38, 21]).

     • Polynomials computable as a depth-3 powering formula of top-fan-in s ([3, 26]).

     • Polynomials computable as a ∑ ∑ ∏O(1) formula of top-fan-in s ([21]), in characteristic zero.
                                                 V


     • Polynomials computable by width-s roABPs in every order of the variables, also known as commu-
       tative roABPs ([3, 24]).
                                                           SV0
   The above result shows the power of the G`,n2 generator to hit restricted classes of circuits. We now
observe that it is also limited by its inability to hit the determinant.
                                                                                          SV0
Proposition 6.6. Assume the setup of Construction 6.3. The output of G`,n2 is an n × n matrix of rank ≤ `,
when viewed as a matrix over the ring F(x, y, z). Thus, taking detn ∈ F[X] to be the n × n determinant, we
                  SV0
have that detn ◦ G`,n2 = 0 for ` < n.

Proof: ` = 1: For a field K, a (nonzero) matrix M ∈ Kn×n is rank-1 if it can be expressed as an outer-
product, so that (M)i, j = αi β j for α, β ∈ Kn . Taking α, β ∈ F(x, y, z)n defined by αi := z1ωi ,Ω (x) and
                                                     SV0
β j := 1ω j ,Ω (y) we immediately see that G1,n2 (x, y, z) is rank-1.
                                                                   SV0                                   SV0
    ` > 1: This follows as rank is subadditive, and G`,n2 is the sum of ` copies of G1,n2 .
    detn vanishes: This follows as the n × n determinant vanishes on matrices of rank < n.
                                                                                          SV0
    Note that in the above we could hope to find an f such that f ◦ G`,n2 = 0 for all ` < n2 , but we are
only able to handle ` < n. Given the above result, along with Theorem 6.2, we obtain that the multiples
of the determinant are hard.

Corollary 6.7. Let detn ∈ F[X] denote the n × n determinant. Then any nonzero multiple f · detn of detn ,
for 0 6= f ∈ F[X], has the following lower bounds.
  15 Note that for black-box PIT it is important that we use a generator that contains all sparse vectors, instead of just the set of

sparse vectors. As an example, the monomial x1 · · · xn is zero on all k-sparse vectors for k < n, but is nonzero when evaluated on
the Shpilka–Volkovich [76] generator for any ` ≥ 1.


                            T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                                 58
              P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

      • f · detn involves exp(Ω(n)) monomials.
      • f · detn requires size exp(Ω(n)) to be expressed as a depth-3 powering formula.
      • f · detn requires size exp(Ω(n)) to be expressed as a ∑ ∑ ∏O(1) formula, in characteristic zero.
                                                                     V

      • f · detn requires width-exp(Ω(n)) to be computed as an roABP in some order of the variables.
                              SV0
Proof: By Theorem 6.5, GO(log s),n2 is a generator for the above size-s computations in the above classes.
                                               SV0 
However, following Theorem 6.2, ( f · detn ) ◦ G`,n2 = 0 for ` < n. Thus, if f · detn (which is nonzero) is
computable in size-s it must be that O(log s) ≥ n, so that s ≥ exp(Ω(n)).
    Note that we crucially leveraged that the determinant vanishes on low-rank matrices. Therefore, the
above results do not seem to imply similar results for the permanent, despite the fact that the permanent is
a harder polynomial. That is, recall that because of VNP-completeness of the permanent the determinant
detn (X) can be written as a projection of the permanent, so that detn (X) = permm (A(X)) for an affine
matrix A(X) ∈ F[X]m×m with m ≤ poly(n). Then, given a multiple g(Y ) · permm (Y ) one would like
to use this projection to obtain g(A(X)) permm (A(X)) = g(A(X)) detn X, which is a multiple of detn .
Unfortunately this multiple may not be a nonzero multiple: it could be that g(A(X)) = 0, from which no
lower bounds for g(A(X)) detn (X) (and hence g(Y ) permm (Y )) can be derived.

6.3     Lower bounds for multiples via leading/trailing monomials
We now use the theory of leading (and trailing) monomials to obtain explicit polynomials with hard
multiples. We aim at finding as simple polynomials as possible so they will give rise to simple “axioms”
with no small refutations for our proof complexity applications in Section 7. These results will essentially
be immediate corollaries of previous work.

6.3.1    Depth-3 powering formulas
Kayal [45] observed that using the partial derivative method of Nisan and Wigderson [54] one can show
that these formulas require exp(Ω(n)) size to compute the monomial x1 · · · xn . Forbes and Shpilka [26]
later observed that this result can be made robust by modifying the hardness of representation technique
of Shpilka and Volkovich [76], in that similar lower bounds apply when the leading monomial involves
many variables, as we now quote.
Theorem 6.8 (Forbes–Shpilka [26]). Let f (x) ∈ F[x] be computed a ∑ ∑ formula of size ≤ s. Then the
                                                                               V

leading monomial xa = LM( f ) involves |a|0 ≤ lg s variables.
    We now note that as the leading monomial is multiplicative (Theorem 3.18) this lower bound
automatically extends to multiples of the monomial.
Corollary 6.9. Any nonzero multiple of x1 · · · xn requires size ≥ 2n to be computed as a ∑ ∑ formula.
                                                                                                     V

Proof: Consider any 0 6= g(x) ∈ F[x1 , . . . , xn ]. Then as the leading monomial is multiplicative (Theo-
rem 3.18) we have that LM(g · x1 · · · xn ) = LM(g) · x1 · · · xn , so that LM(g · x1 · · · xn ) involves n variables.
By the robust lower bound (Theorem 6.8) this implies g(x) · x1 · · · xn requires size ≥ 2n as a ∑ ∑
                                                                                                                 V

formula.

                        T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                     59
                 M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

6.3.2 ∑ ∑ ∏O(1) formulas
           V

Kayal [46] introduced the method of shifted partial derivatives, and Gupta–Kamath–Kayal–
Saptharishi [36] refined it to give exponential lower bounds for various sub-models of depth-4 formulas.
In particular, it was shown that the monomial x1 · · · xn requires exp(Ω(n)) size to be computed as a
∑ ∑ ∏O(1) formula. Applying the hardness of representation approach of Shpilka and Volkovich [76],
  V

Mahajan-Rao-Sreenivasaiah [51] showed how to develop a deterministic black-box PIT algorithm for
multilinear polynomials computed by ∑ ∑ ∏O(1) formulas. Independently, Forbes [21] (following
                                             V

Forbes–Shpilka [26]) showed that this lower bound can again be made to apply to leading monomials 16
(which implies a deterministic black-box PIT algorithm for all ∑ ∑ ∏O(1) formulas, with the same
                                                                      V

complexity as Mahajan-Rao-Sreenivasaiah [51]). This leading monomial lower bound, which we now
state, is important for its applications to polynomials with hard multiples.
Theorem 6.10 (Forbes [21]). Let f (x) ∈ F[x] be computed as a ∑ ∑ ∏t formula of size ≤ s. If char(F) ≥
                                                                              V

ideg(xa ), then the leading monomial xa = LM( f ) involves |a|0 ≤ O(t lg s) variables.
    As for depth-3 powering formulas (Theorem 6.9), this immediately yields that all multiples (of degree
below the characteristic) of the monomial are hard.
Corollary 6.11. All nonzero multiples of x1 · · · xn of degree < char(F) require size ≥ exp(Ω(n/t )) to be
computed as ∑ ∑ ∏t formula.
              V


6.3.3   Sparse polynomials
While the above approaches for ∑ ∑ and ∑ ∑ ∏O(1) formulas focus on leading monomials, one cannot
                                        V             V

show that the leading monomials of sparse polynomials involve few variables as sparse polynomials
can easily compute the monomial x1 · · · xn . However, following the translation idea of Agrawal-Saha-
Saxena [3], Gurjar-Korwar-Saxena-Thierauf [38] showed that sparse polynomials under full-support
translations have some monomial involving few variables, and Forbes [21] (using different techniques)
showed that in fact the trailing monomial involves few variables (translations do not affect the leading
monomial, so the switch to trailing monomials is necessary here).
Theorem 6.12 (Forbes [21]). Let f (x) ∈ F[x] be (≤ s)-sparse, and let α ∈ (F \ {0})n so that α has
full-support. Then the trailing monomial xa = TM( f (x + α)) involves |a|0 ≤ lg s variables.
    This result thus allows one to construct polynomials whose multiples are all non-sparse.

Corollary 6.13. All nonzero multiples of (x1 + 1) · · · (xn + 1) ∈ F[x] require sparsity ≥ 2n . Similarly, all
nonzero multiples of (x1 + y1 ) · · · (xn + yn ) ∈ F[x, y] require sparsity ≥ 2n .

Proof: Define f (x) = ∏ni=1 (xi + 1). For any 0 6= g(x) ∈ F[x] the multiple g(x) f (x) under the translation
x 7→ x − 1 is equal to g(x − 1) ∏i xi . Then all monomials (in particular the trailing monomial) involve n
variables (as g(x) 6= 0 implies g(x − 1) 6= 0). Thus, by Theorem 6.12 it must be that g(x) f (x) requires
≥ 2n monomials.
    The second part of the claim follows as the first, where we now work over the fraction field F(y)[x],
noting that this can only decrease sparsity. Thus, using the translation x 7→ x − y the above trailing
  16 The result there is stated for trailing monomials, but the argument equally applies to leading monomials.



                          T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                  60
               P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

monomial argument implies that the sparsity of nonzero multiples ∏i (xi + yi ) are ≥ 2n over F(y)[x] and
hence also over F[x, y].

    Note that it is tempting to derive the second part of the above corollary from the first, using that the
substitution y ← 1 does not increase sparsity. However, this substitution can convert nonzero multiples of
∏i (xi + yi ) to zero multiples of ∏i (xi + 1), which ruins such a reduction, as argued in the discussion after
Theorem 6.7.

6.4    Lower bounds for multiples of sparse multilinear polynomials
While the previous section established that all multiples of (x1 + 1) · · · (xn + 1) are non-sparse, the
argument was somewhat specific to that polynomial and fails to obtain an analogous result for (x1 +
1) · · · (xn + 1) + 1. Further, while that argument can show for example that all multiples of the n × n
determinant or permanent require sparsity ≥ exp(Ω(n)), this is the best sparsity lower bound obtainable
for these polynomials with this method.17 In particular, one cannot establish a sparsity lower bound of
“n!” for the determinant or permanent (which would be tight) via this method.
      We now give a different argument, due to Oliveira [55] that establishes a much more general result
showing that multiples of any multilinear polynomial have at least the sparsity of the original polynomial.
While Oliveira [55] gave a proof using Newton polytopes, we give a more compact proof here using
induction on variables (loosely inspired by a similar result of Volkovich [84] on the sparsity of factors of
multi-quadratic polynomials).

Proposition 6.14 (Oliveira [55]). Let f (x) ∈ F[x1 , . . . , xn ] be a nonzero multilinear polynomial with
sparsity exactly s. Then any nonzero multiple of f has sparsity ≥ s.

Proof: By induction on variables.
    n = 0: Then f is a constant, so that s = 1 as f 6= 0. All nonzero multiples are nonzero polynomials so
have sparsity ≥ 1.
    n ≥ 1: Partition the variables x = (y, z), so that f (y, z) = f1 (y)z + f0 (y), where fi (y) has sparsity si and
s = s1 + s0 . Consider any nonzero g(y, z) = ∑di=d 1
                                                     0
                                                       gi (y)zi with gd0 (y), gd1 (y) 6= 0 (possibly with d0 = d1 ).
Then
                                                          !
                                             d1
     g(y, z) f (y, z) = f1 (y)z + f0 (y) · ∑ gi (y)zi
                                                    i=d0
                                                  "                                                #
                                                                                               
                       = f1 (y)gd1 (y)zd1 +1 +         ∑           f1 (y)gi−1 (y) + f0 (y)gi (y) zi + f0 (y)gd0 (y)zd0 .
                                                    d0 <i≤d1

By partitioning this sum by powers of z so that there is no cancellation, and then discarding the middle
terms,
                                                                                 
                 Supp g(y, z) f (y, z) ≥ Supp f1 (y)gd1 (y) + Supp f0 (y)gd0 (y)
  17 Specifically, as the determinant and permanent are degree n multilinear polynomials, and thus so are their translations, their

monomials always involve ≤ n variables so no sparsity bound better than 2n can be obtained by using Theorem 6.12.


                           T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                                61
                M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

so that appealing to the induction hypothesis, as f0 and f1 are multilinear polynomials of sparsity s0 and
s1 respectively,

                                             ≥ s1 + s0 = s .

    We note that multilinearity is essential in the above lemma, even for univariates. This is seen by
noting that the 2-sparse polynomial xn − 1 is a multiple of xn−1 + · · · + x + 1.
    Thus, the above not only gives a different proof of the non-sparsity of multiples of ∏i (xi + 1)
(Theorem 6.13), but also establishes that nonzero multiples of ∏i (xi + 1) + 1 are ≥ 2n sparse, and nonzero
multiples of the determinant or permanent are n! sparse, which is tight. Note further that this lower bound
proof is “monotone” in that it applies to any polynomial with the same support, whereas the proof of
Theorem 6.13 is seemingly not monotone as seen by contrasting ∏i (xi + 1) and ∏i (xi + 1) + 1.

6.5     Lower bounds for multiples by leading/trailing diagonals
In the previous sections we obtained polynomials with hard multiples for various circuit classes by
appealing to the fact that lower bounds for these classes can be reduced to studying the number of
variables in leading or trailing monomials. Unfortunately this approach is restricted to circuit classes
where monomials (or translations of monomials) are hard to compute, which in particular rules out this
approach for roABPs. Thus, to develop polynomials with hard multiples for roABPs we need to develop
a different notion of a “leading part” of a polynomial. In this section, we define such a notion called a
leading diagonal, establish its basic properties, and obtain the desired polynomials with hard multiples.
The ideas of this section are a cleaner version of the techniques used in the PIT algorithm of Forbes and
Shpilka [25] for commutative roABPs.

6.5.1    Leading and trailing diagonals
We begin with the definition of a leading diagonal, which is a generalization of a leading monomial.
Definition 6.15. Let f ∈ F[x1 , . . . , xn , y1 , . . . , yn ] be nonzero. The leading diagonal of f , denoted LD( f ),
is the leading coefficient of f (x ◦ z, y ◦ z) when this polynomial is considered in the ring F[x, y][z1 , . . . , zn ],
and where x ◦ z denotes the Hadamard product (x1 z1 , . . . , xn zn ). The trailing diagonal of f is defined
analogously. The zero polynomial has no leading or trailing diagonal.
   As this notion has not explicitly appeared prior in the literature, we now establish several straightfor-
ward properties. The first is that extremal diagonals are homomorphic with respect to multiplication.
Lemma 6.16. Let f , g ∈ F[x1 , . . . , xn , y1 , . . . , yn ] be nonzero. Then LD( f g) = LD( f ) LD(g) and TD( f g) =
TD( f ) TD(g).
Proof: As LD( f ) = LCx,y|z ( f (x ◦ z, y ◦ z)), where this leading coefficient is taken in the ring F[x, y][z],
this automatically follows from the fact that leading coefficients are homomorphic with respect to
multiplication (Theorem 3.18). The result for trailing diagonals is symmetric.

   We now show how to relate the leading monomials of the coefficient space of f to the respective
monomials associated to the leading diagonal of f .

                         T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                      62
               P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

Proposition 6.17. Let f ∈ F[x1 , . . . , xn , y1 , . . . , yn ]. For any b, if Coeffx|yb (LD( f )) 6= 0, then
                                                                         
                                 LM Coeffx|yb (LD( f )) = LM Coeffx|yb ( f ) .

The respective trailing statement also holds.

Proof: We prove the leading statement, the trailing version is symmetric. Let f = ∑a,b αa,b xa yb . We can
then expand f (x ◦ z, y ◦ z) as follows.
                                                        
                  f (x ◦ z, y ◦ z) = ∑c ∑a+b=c αa,b xa yb zc

choose c0 so that LCx,y|z ( f ) = Coeffx,y|zc0 ( f ), we get that
                                                                                  
                                   = ∑a+b=c0 αa,b xa yb zc0 + ∑c≺c0 ∑a+b=c αa,b xa yb zc ,

where LD( f ) = ∑a+b=c0 αa,b xa y b and ∑a+b=c αa,b xa y b = 0 for c  c0 . In particular, this means that for
any b we have that αa,b = 0 for a  c0 − b.
   Thus we have that
                                                                                  
                    LM Coeffx|yb (LD( f )) = LM Coeffx|yb ∑a+b=c0 αa,b xa yb
                                                                    
                                              = LM αc0 −b,b xc0 −b

                                                      = xc0 −b ,

as we assume this leading monomial exists, which is equivalent here to αc0 −b,b 6= 0.
    In comparison,
                                                            
             LM Coeffx|yb ( f ) = LM Coeffx|yb ∑a,b αa,b xa yb
                                                
                                 = LM ∑a αa,b xa
                                                                                           
                                 = LM ∑ac0 −b αa,b xa + αc0 −b,b xc0 −b + ∑a≺c0 −b αa,b xa

as αa,b = 0 for a  c0 − b,
                                                                               
                                        = LM αc0 −b,b xc0 −b + ∑a≺c0 −b αa,b xa

                                        = xc0 −b ,

where in the last step we again used that αc0 −b,b 6= 0. This establishes the desired equality.

   We now relate the extremal monomials of the coefficient space of f to the monomials of the coefficient
space of the extremal diagonals of f .

                          T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                 63
                  M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

Corollary 6.18. Let f ∈ F[x1 , . . . , xn , y1 , . . . , yn ]. Then

                                     LM(Coeffx|y ( f )) ⊇ LM(Coeffx|y (LD( f ))) .

The respective trailing statement also holds.

Proof: This follows as LM(Coeffx|y (LD( f ))) is equal to
                        n                                              o
                          LM Coeffx|yb LD( f )       Coeffx|yb LD( f ) 6= 0 ,

but by Theorem 6.17 this set equals
                           n                                       o
                             LM Coeffx|yb ( f ) Coeffx|yb LD( f ) 6= 0 ,

which is clearly contained in LM(Coeffx|y ( f )).

    We now observe that the number of leading monomials of the coefficient space of a leading diagonal
is equal to its sparsity.

Lemma 6.19. Let f ∈ F[x1 , . . . , xn , y1 , . . . , yn ]. For a polynomial g, let |g|0 denotes its sparsity. Then
                                                                   
                                    LM Coeffx|y LD( f ) = | LD( f )|0 .

The respective trailing statement also holds.

Proof: We prove the claim for the leading diagonal, the trailing statement is symmetric. Note that
the claim is a vacuous “0=0” if f is zero. For nonzero f , express it as f = ∑a,b αa,b xa y b so that
LD( f ) = ∑a+b=c0 αa,b xa yb = ∑b αc0 −b,b xc0 −b yb for some c0 ∈ Nn . Then Coeffx|yb (LD( f )) = αc0 −b,b xc0 −b .
As the monomials xc0 −b are distinct and hence linearly independent for distinct b, it follows that
dim Coeffx|y (LD( f )) = |{b|αc0 −b,b 6= 0}|, which is equal the sparsity | LD( f )|0 .

    Finally, we now lower bound the coefficient dimension of a polynomial by the sparsity of its extremal
diagonals.
Corollary 6.20. Let f ∈ F[x1 , . . . , xn , y1 , . . . , yn ]. Then

                                        dim Coeffx|y ( f ) ≥ | LD( f )|0 , | TD( f )|0 ,

where for a polynomial g, |g|0 denotes its sparsity.

Proof: We give the proof for the leading diagonal, the trailing diagonal is symmetric. By the above,
                                                                            
          dim Coeffx|y ( f ) ≥ LM Coeffx|y ( f ) ≥ LM Coeffx|y LD( f ) = | LD( f )|0 ,

where we passed from span to number of leading monomials (Lemma 3.19), and then passed to the
leading monomials of the leading diagonal (Theorem 6.18), and then passed to sparsity of the leading
diagonal (Theorem 6.19).

                           T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                     64
              P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

6.5.2   Lower bounds for multiples for read-once and read-twice ABPs
Having developed the theory of leading diagonals in the previous section, we now turn to using this theory
to obtain explicit polynomials whose nonzero multiples all require large roABPs. We also generalize
this to read-O(1) oblivious ABPs, but only state the results for k = 2 as this has a natural application
                                                                                       V
to proof complexity (Section 7). As the restricted computations considered above (∑ ∑ formulas and
sparse polynomials) have small roABPs, the hard polynomials in this section will also have multiples
requiring large complexity in these models as well and thus qualitatively reprove some of the above
results. However, we included the previous sections as the hard polynomials there are simpler (being
monomials or translations of monomials), and more importantly we will need those results for the proofs
below.
    The proofs will use the characterization of roABPs by their coefficient dimension (Lemma 3.7), the
lower bound for coefficient dimension in terms of the sparsity of the extremal diagonals (Corollary 6.20),
and polynomials whose multiples are all non-sparse (Theorem 6.13).
Proposition 6.21. Let f (x, y) := ∏ni=1 (xi + yi + αi ) ∈ F[x1 , . . . , xn , y1 , . . . , yn ], for αi ∈ F. Then for any
0 6= g ∈ F[x, y],
                                        dim Coeffx|y (g · f ) ≥ 2n .
In particular, all nonzero multiples of f require width at least 2n to be computed by an roABP in any
order of the variables where x ≺ y.
Proof: Observe that the leading diagonal of f is insensitive to the αi . That is, LD(xi + yi + αi ) = xi + yi ,
so by multiplicativity of the leading diagonal (Theorem 6.16) we have that LD( f ) = ∏i (xi + yi ). Thus,
appealing to Corollary 6.20 and Theorem 6.13,
                                   dim Coeffx|y (g · f ) ≥ | LD(g · f )|0
                                                         = | LD(g) · LD( f )|0
                                                         = | LD(g) · ∏i (xi + yi )|0
                                                         ≥ 2n .
    The claim about roABP width follows from Lemma 3.7.

    Note that this lower bound actually works in the “monotone” setting (if we replace Theorem 6.13
with the monotone Theorem 6.14), as the result only uses the zero/nonzero pattern of the coefficients.
    The above result gives lower bounds for coefficient dimension in a fixed variable partition. We now
symmetrize this construction to get lower bounds for coefficient dimension in any variable partition. We
proceed as in Section 5.5, where we plant the fixed-partition lower bound into an arbitrary partition. Note
that unlike that construction, we will not need auxiliary variables here.
Corollary 6.22. Let f ∈ F[x1 , . . . , xn ] be defined by f (x) := ∏i< j (xi + x j + αi, j ) for αi, j ∈ F. Then for
any partition x = (u, v, w) with m := |u| = |v|, and any 0 6= g ∈ F[x],
                                           dimF(w) Coeffu|v (g · f ) ≥ 2m ,
where we treat g · f as a polynomial in F(w)[u, v]. In particular, all nonzero multiples of f require width
≥ 2bn/2c to be computed by an roABP in any order of the variables.

                         T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                        65
                M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

Proof: We can factor f into a copy of the hard polynomial from Theorem 6.21, and the rest. That is,
                                                           m
                          f (x) = ∏(xi + x j + αi, j ) = ∏(ui + vi + βi ) · f 0 (u, v, w) ,
                                  i< j                    i=1


for some βi ∈ F and nonzero f 0 (u, v, w) ∈ F[u, v, w]. Thus,

                                                                 m
                              g · f = g(u, v, w) · f 0 (u, v, w) · ∏(ui + vi + βi ) .
                                                                  i=1

Noting that g, f 0 are nonzero in F[u, v, w], they are also nonzero in F(w)[u, v], so that g · f is nonzero
multiple of ∏m i=1 (ui + vi + βi ) in F(w)[u, v]. Appealing to our lower bound for (nonzero) multiples of
coefficient dimension (Theorem 6.21), we have that
                                                                                   !
                                                                         m
                dimF(w) Coeffu|v (g · f ) = dimF(w) Coeffu|v g · f 0 · ∏(ui + vi + βi )       ≥ 2m .
                                                                        i=1

The statement about roABPs follows from Lemma 3.7.

    We briefly remark that the above bound does not match the naive bound achieved by writing the
                                                                                 2
polynomial ∏i< j (xi + x j + αi, j ) in its sparse representation, which has 2Θ(n ) terms. The gap between the
                                                      2
lower bound (2Ω(n) ) and the upper bound (2O(n ) ) is explained by our use of a complete graph to embed
the lower bounds of Theorem 6.21 into an arbitrary partition. As discussed after Theorem 5.13 one can
use expander graphs to essentially close this gap.
    We now observe that the above lower bounds for coefficient dimension suffices to obtain lower
bounds for read-twice oblivious ABPs, as we can appeal to the structural result of Anderson, Forbes,
Saptharishi, Shpilka and Volk [6] (Theorem 3.9). This result shows that for any read-twice oblivious
ABP that (after discarding some variables) there is a partition of the variables across which has small
coefficient dimension, which is in contrast to the above lower bound.

Corollary 6.23. Let f ∈ F[x1 , . . . , xn ] be defined by f (x) := ∏i< j (xi + x j + αi, j ) for αi, j ∈ F. Then for
any 0 6= g ∈ F[x], g · f requires width-2Ω(n) as a read-twice oblivious ABP.

Proof: Suppose that g · f has a read-twice oblivious ABP of width-w. By the lower-bound of Anderson,
Forbes, Saptharishi, Shpilka and Volk [6] (Theorem 3.9), there exists a partition x = (u, v, w) where
|u|, |v| ≥ Ω(n), and such that dimF(w) Coeffu|v (g · f ) ≤ w4 (where we treat g · f as a polynomial in
F(w)[u, v]). Note that we can enforce that the partition obeys m := |u| = |v| ≥ Ω(n), as we can balance
u and v by pushing variables into w, as this cannot increase the coefficient dimension (Theorem 3.8).
However, appealing to our coefficient dimension bound (Theorem 6.22)

                                 w4 ≥ dimF(w) Coeffu|v (g · f ) ≥ 2m ≥ 2Ω(n) ,

so that w ≥ 2Ω(n) as desired.

                        T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                    66
                P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

7      IPS lower bounds via lower bounds for multiples
In this section we use the lower bounds for multiples of Section 6 to derive lower bounds for C-IPS proofs
for various restricted algebraic circuit classes C. The advantage of this approach over the functional
lower bounds strategy of Section 5 is that we derive lower bounds for the general IPS system, not just
its subclass linear-IPS. While our equivalence (Theorem 4.4) of C-IPS and C-IPSLIN holds for any
strong-enough class C, the restricted classes we consider here (depth-3 powering formulas and roABPs) 18
are not strong enough to use Theorem 4.4 to lift the results of Section 5 to lower bounds for the full IPS
system. However, as discussed in the introduction, the techniques of this section can only yield lower
bounds for C-IPS refutations of systems of equations which are hard to compute within C (though our
examples are computable by small (general) circuits).
     We begin by first detailing the relation between IPS refutations and multiples. We then use our lower
                                                                                 V
bounds for multiples (Section 6) to derive as corollaries lower bounds for ∑ ∑-IPS and roABP-IPS
refutations.

Lemma 7.1. Let f , g, x2 − x ∈ F[x1 , . . . , xn ] be an unsatisfiable system of equations, where g, x2 − x is
satisfiable. Let C ∈ F[x, y, z, w] be an IPS refutation of f , g, x2 − x. Then

                                                    1 −C(x, 0, g, x2 − x)

is a nonzero multiple of f .

Proof: That C is an IPS refutation means that

                                      C(x, f , g, x2 − x) = 1,        C(x, 0, 0, 0) = 0 .

We first show that 1−C(x, 0, g, x2 −x) is a multiple of f , using the first condition on C. Expand C(x, y, z, w)
as a univariate in y, so that
                                     C(x, y, z, w) = ∑ Ci (x, z, w)yi ,
                                                               i≥0

for Ci ∈ F[x, z, w]. In particular, C0 (x, z, w) = C(x, 0, z, w). Thus,

                         1 −C(x, 0, g, x2 − x) = C(x, f , g, x2 − x) −C(x, 0, g, x2 − x)
                                               = ∑i≥0 Ci (x, g, x2 − x) f i −C0 (x, g, x2 − x)
                                                                           

                                                  = ∑i≥1 Ci (x, g, x2 − x) f i
                                                  = ∑i≥1 Ci (x, g, x2 − x) f i−1 · f .
                                                                                


Thus, 1 −C(x, 0, g, x2 − x) is a multiple of f as desired.
     We now show that this is a nonzero multiple, using the second condition on C and the satisfiability of
g, x − x. That is, the second condition implies that 0 = C(x, 0, 0, 0) = C0 (x, 0, 0). If 1 −C(x, 0, g, x2 − x)
    2

is zero, then by the above we have that C0 (x, g, x2 − x) = 1, so that C0 (x, z, w) is an IPS refutation of
    18 As in Section 6, we will not treat multilinear formulas in this section as they are less natural for the techniques under

consideration. Further, IPS lower bounds for multilinear formulas can be obtained via functional lower bounds (Theorem 5.15).


                            T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                            67
                M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

g, x2 − x, which contradicts the satisfiability of g, x2 − x as IPS is a sound proof system. So it must then
be that 1 −C(x, 0, g, x2 − x) is nonzero.
     That is, take an α satisfying g, x2 − x so that g(α) = 0, α 2 − α = 0. Substituting this α into
C0 (x, g, x2 − x), we have that C0 (x, g, x2 − x)|x←α = C0 (α, 0, 0), and because C0 (x, 0, 0) ≡ 0 in F[x] via
the above we have that C0 (α, 0, 0) = 0. Thus, we have that 1 −C(x, 0, g, x2 − x) = 1 −C0 (x, g, x2 − x) is a
nonzero polynomial as its evaluation at x ← α is 1.

      The above lemma thus gives a template for obtaining lower bounds for IPS. First, obtain a “hard”
polynomial f whose nonzero multiples are hard for C, where f is hopefully also computable by small
(general) circuits. Then find additional (simple) polynomials g such that g, x2 − x is satisfiable yet
 f , g, x2 − x is unsatisfiable. By the above lemma one then has the desired IPS lower bound for refuting
 f , g, x2 − x, assuming that C is sufficiently general. However, for our results we need to be more careful
as even though C(x, y, z, w) is from the restricted class C, the derived polynomial C(x, 0, g, x2 − x) may
not be, and thus we will need to appeal to lower bounds for stronger classes.
      We now instantiate this template, first for depth-3 powering formulas, where we use lower bounds for
multiples of the stronger ∑ ∑ ∏2 model.
                                 V

Corollary 7.2. Let F be a field with char(F) = 0. Let f := x1 · · · xn and g := x1 + · · · + xn − n with
f , g ∈ F[x1 , . . . , xn ]. Then f , g, x2 − x are unsatisfiable and any ∑ ∑-IPS refutation requires size at least
                                                                           V

exp(Ω(n)).
Proof: The hypothesis char(F) = 0 implies that {0, . . . , n} are distinct numbers. In particular, the system
g(x) = 0 and x2 − x = 0 is satisfiable and has the unique satisfying assignment 1. However, this single
assignment does not satisfy f , as f (1) = ∏ni=1 1 = 1 6= 0, so the entire system is unsatisfiable. Thus,
applying our strategy (Theorem 7.1), we see that for any ∑ ∑-IPS refutation C(x, y, z, w) of f , g, x2 − x,
                                                              V

the polynomial 1 −C(x, 0, g, x2 − x) is a nonzero multiple of f .
     Let s be the size of C as a ∑ ∑ formula. As g is linear and the Boolean axioms x2 − x are quadratic,
                                  V

it follows that 1 − C(x, 0, g, x2 − x) is a sum of powers of quadratics (∑ ∑ ∏2 ) of size poly(s). As
                                                                              V

nonzero multiples of f requires exp(Ω(n)) size as a ∑ ∑ ∏2 formula (Corollary 6.11) it follows that
                                                          V

poly(s) ≥ exp(Ω(n)), so that s ≥ exp(Ω(n)) as desired.

    We similarly obtain a lower bound for roABP-IPS, where here we use lower bounds for multiples of
read-twice oblivious ABPs.
Corollary 7.3. Let F be a field with char(F) > n. Let f := ∏i< j (xi + x j − 1) and g := x1 + · · · + xn − n
with f , g ∈ F[x1 , . . . , xn ]. Then f , g, x2 − x are unsatisfiable and any roABP-IPS refutation (in any order of
the variables) requires size ≥ exp(Ω(n)).
Proof: The hypothesis char(F) > n implies that {0, . . . , n} are distinct numbers. In particular, the system
g(x) = 0 and x2 − x = 0 is satisfiable and has the unique satisfying assignment 1. However, this single
assignment does not satisfy f as f (1) = ∏i< j (1 + 1 − 1) = 1 6= 0, so the entire system is unsatisfiable.
Thus, applying our strategy (Theorem 7.1), we see that for any roABP-IPS refutation C(x, y, z, w) of
f , g, x2 − x that 1 −C(x, 0, g, x2 − x) is a nonzero multiple of f .
     Let s be the size of C as an roABP. We now argue that 1 −C(x, 0, g, x2 − x) has a small read-twice
oblivious ABP. First, note that we can expand C(x, 0, z, w) into powers of z, so that C(x, 0, z, w) =

                        T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                    68
             P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

∑0≤i≤s Ci (x, w)zi (where we use that s bounds the width and degree of the roABP C). Each Ci (x, w)
has a poly(s)-size roABP (in the order of the variables of C where z is omitted) as we can compute Ci
via interpolation over z, using that each evaluation preserves roABP size (Theorem 3.8). Further, as
g is linear, for any i we see that gi can be computed by a poly(n, i)-size roABP (in any order of the
variables) (Theorem 3.15). Combining these facts using closure properties of roABPs under addition and
multiplication (Theorem 3.8), we see that C(x, 0, g, w), and hence 1 −C(x, 0, g, w), has a poly(s, n)-size
roABP in the order of the variables that C induces on x, w. Next observe, that as each Boolean axiom
xi2 − xi only refers to a single variable, substituting w ← x2 − x in the roABP for 1 − C(x, 0, g, w) will
preserve obliviousness of the ABP, but now each variable will be read twice, so that 1 −C(x, 0, g, x2 − x)
has a poly(s, n)-size read-twice oblivious ABP.
     Now, using that nonzero multiples of f requires exp(Ω(n)) size to be computed as read-twice oblivious
ABPs (Theorem 6.23) it follows that poly(s, n) ≥ exp(Ω(n)), so that s ≥ exp(Ω(n)) as desired.

    We note that the above lower bound is for the size of the roABP. One can also obtain the stronger
result (for similar but less natural axioms) showing that the width (and hence also the size) of the roABP
must be large, but we do not pursue this as it does not qualitatively change the result.


8    Discussion
In this work we proved new lower bounds for various natural restricted versions of the Ideal Proof System
(IPS) of Grochow and Pitassi [35]. While existing work in algebraic proof complexity showed limitations
of weak measures of complexity such as the degree and sparsity of a polynomial, our lower bounds are
for stronger measures of circuit size that match many of the frontier lower bounds in algebraic circuit
complexity. However, our work leaves several open questions and directions for further study, which we
now list.

    1. Can one obtain proof complexity lower bounds from the recent techniques for lower bounds for
       depth-4 circuits, such as the results of Gupta, Kamath, Kayal and Saptharishi [36]? Neither of
       our approaches (functional lower bounds or lower bounds for multiples) currently extend to their
       techniques.

    2. Many proof complexity lower bounds are for refuting unsatisfiable k-CNFs, where k = O(1), which
       can be encoded as systems of polynomial equations where each equation involves O(1) variables.
       Can one obtain interesting IPS lower bounds for such systems? Our techniques only establish
       exponential lower bounds where there is at least one axiom involving Ω(n) variables.

    3. Given an equation f (x) = 0 where f has a size-s circuit, there is a natural way to convert this
       equation to poly(s)-many equations on O(1) extension variables by tracing through the computation
       of f . Can one understand how introducing extension variables affects the complexity of refuting
       polynomial systems of equations? This seems a viable approach to the previous question when
       applied to our technique of using lower bounds for multiples.

                      T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                            69
                M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

    4. We have shown various lower bounds for multiples by invoking the hardness of the determinant
       (Theorem 6.7), but this does not lead to satisfactory proof lower bounds as the axioms are com-
       plicated. Can one implicitly invoke the hardness of the determinant? For example, consider the
       hard matrix identities suggested by Cook and Rackoff (see for example the survey of Beame and
       Pitassi [10]) and later studied by Soltys and Cook [80]. That is, consider unsatisfiable equations
       such as XY − In ,Y X − 2 · In , where X and Y are symbolic n × n matrices and In is the n × n identity
       matrix. The simplest refutations known involve the determinant (see Hrubeš-Tzameret [40], and
       the discussion in Grochow-Pitassi [35]), can one provide evidence that computing the determinant
       is intrinsic to such refutations?

    5. The lower bounds of this paper are for the static IPS system, where one cannot simplify intermediate
       computations. There are also dynamic algebraic proof systems (see Appendix A), can one extend
       our techniques to that setting?


A      Relating IPS to other proof systems
In this section we summarize some existing work on algebraic proof systems and how these other
proof systems compare to IPS. In particular, we define the (dynamic) Polynomial Calculus refutation
system over circuits (related to but slightly different than the system of Grigoriev and Hirsch [32]) and
relate it to the (static) IPS system ([59, 35]) considered in this paper. We then examine the roABP-PC
system, essentially considered by Tzameret [83], and its separations from sparse-PC. Finally, we consider
multilinear-formula-PC as studied by Raz and Tzameret [66, 65] and show that its tree-like version
simulates multilinear-formula-IPS, and is hence separated from sparse-PC.


A.1     Polynomial calculus refutations
A substantial body of prior work considers dynamic proof systems, which are systems that allow
simplification of intermediate polynomials in the proof. In contrast, IPS is a static system where
the proof is single object with no “intermediate” computations to simplify. We now define the principle
dynamic system of interest, the Polynomial Calculus system. We give a definition over an arbitrary
circuit class, which generalizes the definition of the system as introduced by Clegg, Edmonds, and
Impagliazzo [13]. In particular, the notion of proof size in the definition below is made to accommodate
very weak circuit classes, while being comparable to standard definitions of size for sufficiently strong
circuit measures. For example, in depth-3 powering formulas unbounded multiplications (such as x1 · · · xn )
are expensive and hence we must charge for the circuit size of the result of a product v = g · u as opposed
to the size of just g.

Definition A.1. Let f1 (x), . . . , fm (x) ∈ F[x1 , . . . , xn ] be a system of polynomials. A Polynomial Calculus
(PC) proof for showing that p ∈ F[x] is in the ideal generated by f , x2 − x is a directed acyclic graph with
a single sink, where

      • Leaves are labelled with an equation from f , x2 − x.

                        T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                  70
             P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

    • An internal node v with children u1 , . . . , uk for k > 1 is labelled with a linear combination v =
      α1 u1 + · · · + αk uk for αi ∈ F.

    • An internal node v with a single child u is labelled with the product g · u for some g ∈ F[x].

The value of a node in the proof is defined inductively via the above labels interpreted as equations, and
the value of the output node is required to be the desired polynomial p. The proof is tree-like if the
underlying graph is a tree, and is otherwise dag-like. A PC refutation of f , x2 − x is a proof that 1 is in
the ideal of f , x2 − x so that f , x2 − x is unsatisfiable.
    The size of each node is defined inductively as follows.

    • The size of a leaf v is the size of the minimal circuit agreeing with the value of v.

    • The size of an addition node v = α1 u1 + · · · + αk uk is k plus the size of each child ui , plus the size
      of the minimal circuit agreeing with the value of v.

    • The size of a product node v = g · u is the size of the child u plus the size of the minimal circuit
      agreeing with the value of v.

The size of the proof is the sum of the sizes of each node in the proof. For a restricted algebraic circuit
class C, a C-PC proof is a PC proof where the circuits are measured as their size coming from the
restricted class C.

     As with IPS, one can show this is a sound and complete proof system for unsatisfiability of equations.
Also as with IPS, in our definition of PC we included the Boolean axioms x2 − x as this in the most
common regime.
     An important aspect of the above proof system is that it is semantic, as the polynomials derived in the
proof are simplified to their smallest equivalent algebraic circuit. This is valid in that such simplifications
can be efficiently verified (with randomness) using polynomial identity testing (which can sometimes
be derandomized, see Section 3). In contrast, one could instead require a syntactic proof system, which
would have to provide a proof via syntactic manipulation of algebraic circuits that such simplifications are
valid. We will focus on semantic systems as they more naturally compare with IPS, which also requires
polynomial identity testing for verification.
     While many prior work ([59, 66, 65, 83, 35]) considered algebraic proof systems whose verification
relied on polynomial identity testing (because of semantic simplification or otherwise), we note that the
system of Grigoriev and Hirsch [32] (which they called “formula-PC”) is actually a syntactic system
and as such is deterministically checkable. Despite their system being restricted to being syntactic, it is
still strong enough to simulate Frege and obtain small-depth refutations of the subset-sum axiom, the
pigeonhole principle, and Tseitin tautologies.

Remark A.2. Note that our definition here varies slightly from the definition of Clegg, Edmonds, and
Impagliazzo [13], in that we allow products by an arbitrary polynomial g instead of only allowing products
of a single variable xi . For some circuit classes C these two definitions are polynomially equivalent
(see for example the discussion in Raz and Tzameret [66]). In general however, using the product rule
 f ` xi · f in a tree-like proof can only yield g · f where g is a small formula. However, we will be interested

                       T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                 71
                  M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

in algebraic circuit classes not known to be simulated by small formulas (such as roABPs, which can
compute iterated matrix products which are believed to require superpolynomial-size formulas) and thus
will consider this stronger product rule.

    We now observe that tree-like C-PC can simulate C-IPSLIN for natural restricted circuit classes C.

Lemma A.3. Let C be a restricted class of circuits computing polynomials in F[x1 , . . . , xn ], and suppose
that C-circuits grow polynomially in size under multiplication and addition, that is,

      • sizeC ( f · g) ≤ poly(sizeC ( f ), sizeC (g)).

      • sizeC ( f + g) ≤ poly(sizeC ( f )) + poly(sizeC (g)).

In particular, one can take C to be sparse polynomials, depth-3 powering formulas (in characteristic
zero), or roABPs.
     Then if f , x2 − x are computable by size-t C-circuits and have a C-IPSLIN refutation of size-s, then
f , x − x have a tree-like C-PC refutation of size-poly(s,t, n), which is poly(s,t, n)-explicit given the IPS
     2

refutation.

Proof: That the relevant classes obey these closure properties is mostly immediate. See for example
Theorem 3.8 for roABPs. For depth-3 powering formulas, the closure under addition is immediate and
for multiplication it follows from Fischer [19].
    Turning to the simulation, such an IPS refutation is an equation of the form ∑ j g j f j + ∑i hi · (xi2 − xi ) =
1. Using the closure properties of C, one can compute the expression ∑ j g j f j + ∑i hi · (xi2 − xi ) in the
desired size, which yields the required (explicit) derivation of 1.

    Note that the above claim does not work for multilinear formulas, as multilinear polynomials are not
closed under multiplication. That tree-like multilinear-formula-PC simulates multilinear-formula-IPSLIN
is more intricate, and is given in Theorem A.11.
    The Polynomial Calculus proof system has received substantial attention since its introduction by
Clegg, Edmonds, and Impagliazzo [13], typically when the complexity of the proofs is measured in terms
of the number of monomials. In particular, Impagliazzo, Pudlák and Sgall [41] showed an exponential
lower bound for the subset-sum axiom.
Theorem A.4 (Impagliazzo, Pudlák and Sgall [41]). Let F be a field of characteristic zero. Let α ∈
Fn , β ∈ F and A := {∑ni=1 αi xi : x ∈ {0, 1}n } be so that β ∈
                                                              / A. Then α1 x1 + · · · + αn xn − β , x2 − x is
unsatisfiable and any PC refutation requires degree ≥ dn/2e + 1 and exp(Ω(n))-many monomials.

A.2      roABP-PC
The class of roABPs are a natural restricted class of algebraic computation that non-trivially goes beyond
sparse polynomials. In proof complexity, roABP-PC was explored by Tzameret [83] (under the name
of ordered formulas, a formula-variant of roABPs, but the results there apply to roABPs as well). In
particular, Tzameret [83] observed that roABP-PC can be deterministically checked using the efficient
PIT algorithm for roABPs due to Raz and Shpilka [64].

                          T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                  72
              P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

    Given the Impagliazzo, Pudlák and Sgall [41] lower bound for the subset-sum axiom (Theorem A.4),
our roABP-IPS upper bound for this axiom (Corollary 4.14), and the relation between IPSLIN and tree-like
PC (Theorem A.3), we can conclude the following exponential separation.
Corollary A.5. Let F be a field of characteristic zero. Then x1 + · · · + xn + 1, x2 − x is unsatisfiable,
requires sparse-PC refutations of size exp(Ω(n)), but has poly(n)-explicit poly(n)-size roABP-IPSLIN
and tree-like roABP-PC refutations.
  This strengthens a result of Tzameret [83], who separated dag-like roABP-PC from sparse-PC.
However, we note that it is not clear whether sparse-PC can be efficiently simulated by roABP-IPSLIN .

A.3    Multilinear formula PC
We now proceed to study algebraic proofs defined in terms of multilinear formulas, as explored by
Raz and Tzameret [66, 65]. We seek to show that the tree-like version of this system can simulate
multilinear-formula-IPSLIN . While tree-like C-PC can naturally simulate C-IPSLIN if C is closed under
multiplication (Theorem A.3), the product of two multilinear polynomials may not be multilinear.
Therefore, the simulation we derive is more intricate, and is similar to the efficient multilinearization
results for multilinear formulas from Section 4.3. We first define the Raz-Tzameret [66, 65] system
(which they called fMC).

Definition A.6. Let f1 (x), . . . , fm (x) ∈ F[x1 , . . . , xn ] be a system of polynomials. A multilinear-formula-
PC ¬ refutation for showing that the system f , x2 −x is unsatisfiable is a multilinear-formula-PC refutation
of f (x), x2 − x, ¬x2 − ¬x, x ◦ ¬x in the ring F[x1 , . . . , xn , ¬x1 , . . . , ¬xn ], where ‘◦’ denotes the entrywise
product so that x ◦ ¬x = (x1 ¬x1 , . . . , xn ¬xn ).

     That is, a multilinear-formula-PC¬ refutation of f , x2 − x is a multilinear-formula-PC refutation with
the additional variables ¬x := (¬x1 , . . . , ¬xn ) which are constrained so that ¬xi = 1 − xi (so that ‘¬’ here
is simply a modifier of the symbol ‘x’ as opposed to being imbued with mathematical meaning). These
additional variables are important, as without them the system is not complete. For example, in attempting
to refute the subset-sum axiom ∑i xi + 1, x2 − x in multilinear-formula-PC alone, one can never multiply
the axiom ∑i xi + 1 by another (non-constant) polynomial as it would ruin multilinearity. However, in
multilinear-formula-PC¬ one can instead multiply by polynomials in ¬x and appropriately simplify. We
now formalize this by showing that tree-like multilinear-formula-PC¬ can simulate multilinear-formula-
IPSLIN0 (which is complete (Theorem 4.12)).
     We begin by proving a lemma on how the ¬x variables can help multilinearize products. In particular,
if we have a monomial (1 − ¬x)a (which is meant to be equal to xa ) and multiply by x1 we should be able
to prove that this product equals x1 modulo the axioms.

Lemma A.7. Working in the ring F[x1 , . . . , xd , ¬x1 , . . . , ¬xd ], and for 0 ≤ a ≤ 1,

                                        (1 − ¬x)a x1 − x1 = C(x, x ◦ ¬x) ,

for C(x, z) ∈ F[x, z] where C(x, x ◦ ¬x) can be poly(2d )-explicitly derived from the axioms x ◦ ¬x in
poly(2d ) steps using tree-like multilinear-formula-PC.

                         T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                     73
                M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

Proof:

                             (1 − ¬x)a x1 = x1−a · (x − x ◦ ¬x)a
                                                                                    !
                                            = x1−a ·       ∑ xa−b (−x ◦ ¬x)b
                                                         0≤b≤a
                                                                                            !
                                                 1−a      a              a−b            b
                                            =x         · x +     ∑ x           (−x ◦ ¬x)
                                                                0<b≤a
                                                 1             1−b
                                            =x +        ∑ x          (−x ◦ ¬x)b
                                                       0<b≤a

                                            = x1 +C(x, x ◦ ¬x) ,

where C(x, z) is defined by
                                           C(x, z) :=     ∑ x1−b (−z)b .
                                                        0<b≤a

Now note that C(x, x ◦ ¬x) can be derived by tree-like multilinear-formula-PC. That is, the expression
x1−b (−x ◦ ¬x)b is multilinear (as the product is variable disjoint) and in the ideal of x ◦ ¬x as b > 0,
and is clearly a poly(d)-size explicit multilinear formula. Summing over the 2d − 1 relevant b gives the
result.

   We now show how to prove the equivalence of g(x) and g(1 − ¬x) modulo x + ¬x − 1, if g is
computable by a small multilinear formula, where we proceed variable by variable.

Lemma A.8. Working in the ring F[x1 , . . . , xn , ¬x1 , . . . , ¬xn ], if g ∈ F[x] is computable by a size-t multi-
linear formula, than
                             g(x) − g(1 − ¬x) = C(x, x + ¬x − 1) ,
for C(x, z) ∈ F[x, z] where C(x, x + ¬x − 1) is derivable from x + ¬x − 1 in size-poly(t, n) tree-like
multilinear-formula-PC, which is poly(t, n)-explicit given the formula for g.

Proof: We proceed to replace 1 − ¬x with x one variable at a time. Using (x≤i , (1 − ¬x)>i ) to denote
(x1 , . . . , xi , 1 − ¬xi+1 , . . . , 1 − ¬xn ), we see that via telescoping that

              g(x) − g(1 − ¬x) = g(x≤n , (1 − ¬x)>n ) − g(x<1 , (1 − ¬x)≥1 )
                                  n                                          
                               = ∑ g(x≤i , (1 − ¬x)>i ) − g(x<i , (1 − ¬x)≥i )
                                     i=1
                                      n                                                        
                                  = ∑ g(x<i , xi , (1 − ¬x)>i ) − g(x<i , 1 − ¬xi , (1 − ¬x)>i ) .
                                     i=1

Now note that g(x<i , y, (1 − ¬x)>i ) is a multilinear polynomial, which as it is linear in y can be written as
                                                                             
   g(x<i , y, (1 − ¬x)>i ) = g(x<i , 1, (1 − ¬x)>i ) − g(x<i , 0, (1 − ¬x)>i ) y + g(x<i , 0, (1 − ¬x)>i ) .

                        T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                   74
             P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

Thus, plugging in xi and 1 − ¬xi ,

   g(x<i , xi , (1 − ¬x)>i ) − g(x<i , 1 − ¬xi , (1 − ¬x)>i )
                                                                                            
               = g(x<i , 1, (1 − ¬x)>i ) − g(x<i , 0, (1 − ¬x)>i ) xi + g(x<i , 0, (1 − ¬x)>i )
                                                                                                         
                    − g(x<i , 1, (1 − ¬x)>i ) − g(x<i , 0, (1 − ¬x)>i ) (1 − ¬xi ) + g(x<i , 0, (1 − ¬x)>i )
                                                                 
               = g(x<i , 1, (1 − ¬x)>i ) − g(x<i , 0, (1 − ¬x)>i ) (xi + ¬xi − 1) .

Plugging this into the above telescoping equation,
                              n                                                          
           g(x) − g(1 − ¬x) = ∑ g(x<i , xi , (1 − ¬x)>i ) − g(x<i , 1 − ¬xi , (1 − ¬x)>i )
                                 i=1
                                  n                                               
                              = ∑ g(x<i , 1, (1 − ¬x)>i ) − g(x<i , 0, (1 − ¬x)>i ) (xi + ¬xi − 1)
                                 i=1
                              =: C(x, x + ¬x − 1) .

Clearly each g(x<i , b, (1 − ¬x)>i ) for b ∈ {0, 1} has a poly(t)-size multilinear algebraic formula, so the
entire expression C(x, x + ¬x − 1) can be computed by tree-like multilinear-formula-PC from x + ¬x − 1
explicitly in poly(t, n) steps.

   Using the above lemma, we now show how to multilinearize a multilinear-formula times a low-degree
multilinear monomial.

Lemma A.9. Let g, f ∈ F[x1 , . . . , xn , y1 , . . . , yd ], where g is computable by a size-t multilinear formula
and y = ∏di=1 yi . Then

                     g(1 − ¬x, 1 − ¬y)y1 − ml(g(x, y)y1 ) = C(x, y, x + ¬x − 1, y ◦ ¬y) ,

where C(x, y, x + ¬x − 1, y ◦ ¬y) can be derived from the axioms x + ¬x − 1, y ◦ ¬y in poly(2d ,t, n) steps
using tree-like multilinear-formula-PC.

Proof: Express g as g(x, y) = ∑0≤a≤1 ga (x)ya in the ring F[x][y], so that each ga is multilinear. Then,

            g(1 − ¬x, 1 − ¬y) · y1 =     ∑ ga (1 − ¬x)(1 − ¬y)a · y1
                                        0≤a≤1


appealing to Theorem A.7 to obtain (1 − ¬y)a y1 = y1 +Ca (y, y ◦ ¬y) for Ca (y, y ◦ ¬y) derivable in tree-like
multilinear-formula-PC from y ◦ ¬y in poly(2d ) steps,
                                                                      
                                     = ∑ ga (1 − ¬x) y1 +Ca (y, y ◦ ¬y)
                                        a


                        T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                   75
               M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

appealing to Theorem A.8 to obtain ga (1 − ¬x) = ga (x) + Ba (x, x + ¬x − 1) for Ba (x, x + ¬x − 1) derivable
in tree-like multilinear-formula-PC from x + ¬x − 1 in poly(t, n) steps,
                                                                                         
                                  = ∑ ga (x) + Ba (x, x + ¬x − 1) · y1 +Ca (y, y ◦ ¬y)
                                      a
                                                   
                                   = ∑ ga (x)y1 + ∑ Ba (x, x + ¬x − 1)y1 + ga (x)Ca (y, y ◦ ¬y)
                                      a              a
                                                                                      
                                                    + Ba (x, x + ¬x − 1)Ca (y, y ◦ ¬y)

                                   = ml(g(x, y)y1 ) +C(x, y, x + ¬x − 1, y ◦ ¬y) ,
by defining C appropriately, and as
                                                                         
                                 ml(g(x, y)y1 ) = ml ∑0≤a≤1 ga (x)ya · y1
                                                                       
                                                = ml ∑0≤a≤1 ga (x)ya+1

                                                = ∑0≤a≤1 ga (x)y1 .
By interpolation, it follows that for each exponent a there are constants α a,β such that ga (x) =
∑β ∈{0,1}d αa,β g(x, β ). From this it follows that ga is computable by a multilinear formula of size-
poly(t, 2d ). It thus follows that C(x, y, x + ¬x − 1, y ◦ ¬y) is a sum of 2d terms, each of which is explicitly
derivable in poly(2d ,t, n) steps in tree-like multilinear-formula-PC from x + ¬x − 1, y ◦ ¬y (as the mul-
tiplications are variable-disjoint), and thus C(x, y, x + ¬x − 1, y ◦ ¬y) is similar derived by tree-like
multilinear-formula-PC.

   By linearity we can extend the above to multilinearization of a multilinear-formula times a sparse
low-degree multilinear polynomial.
Corollary A.10. Let g, f ∈ F[x1 , . . . , xn ] be multilinear, where g is computable by a size-t multilinear
formula and f is s-sparse and deg f ≤ d. Then
                       g(1 − ¬x) · f (x) − ml(g(x) · f (x)) = C(x, x + ¬x − 1, x ◦ ¬x) ,
where C(x, x + ¬x − 1, x ◦ ¬x) can be derived from the axioms x + ¬x − 1, x ◦ ¬x in poly(2d , s,t, n) steps
using tree-like multilinear-formula-PC.
     We now conclude by showing that tree-like multilinear-formula-PC¬ can efficiently simulate
multilinear-formula-IPSLIN0 . Recall that this proof system simply requires an IPS refutation that is
linear in the non-Boolean axioms, so that in particular ∑ j g j f j ≡ 1 mod x2 − x for efficiently computable
g j.
Corollary A.11. Let f1 , . . . , fm ∈ F[x1 , . . . , xn ] be degree at most d multilinear s-sparse polynomials
which are unsatisfiable over x ∈ {0, 1}n . Suppose that there are multilinear g j ∈ F[x] computable by
size-t multilinear formula such that
                                       m
                                      ∑ g j (x) f j (x) ≡ 1 mod x2 − x .
                                      i=1


                       T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                 76
               P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

Then there is a tree-like multilinear-formula-PC ¬ refutation of f , x2 − x of size poly(2d , s,t, n, m), which
is poly(2d , s,t, n, m)-explicit given the formulas for the f j , g j .
     In particular, if there is a size-t multilinear-formula-IPSLIN0 refutation of f , x2 − x, then there is a tree-
like multilinear-formula-PC ¬ refutation of f , x2 − x of size poly(2d , s,t, n, m) which is poly(2d , s,t, n, m)-
explicit given the refutation of f , x2 − x and formulas for the f j .

Proof: By the above multilinearization (Corollary A.10), there are C j ∈ F[x, z, w] such that

                          g j (1 − ¬x) f j (x) = ml(g j (x) f j (x)) +C j (x, x + ¬x − 1, x ◦ ¬x) .

where C j (x, x + ¬x − 1, x ◦ ¬x) is derivable from x + ¬x − 1, x ◦ ¬x in poly(2d , s,t, n) steps of tree-like
multilinear-formula-PC. Thus, as g j (1 − ¬x) has a poly(t)-size multilinear formula, in poly(2d , s,t, n, m)
steps we can derive from f (x), x + ¬x − 1, x ◦ ¬x,

                     m                                                           m
                    ∑ g j (1 − ¬x) f j (x) −C j (x, x + ¬x − 1, x ◦ ¬x) = ∑ ml(g j (x) f j (x))
                    j=1                                                           j=1


as ∑m
    i=1 g j (x) f j (x) ≡ 1   mod x2 − x we have that
                                                                                 !
                                   m                              m
                                  ∑ ml(g j (x) f j (x)) = ml ∑ g j (x) f j (x)        =1,
                                  j=1                            i=1

where we appealed to linearity of multilinearization (Theorem 3.12), so that

                     m                                                      
                    ∑ g j (1 − ¬x) f j (x) −C j (x, x + ¬x − 1, x ◦ ¬x) = 1 ,
                    j=1


yielding the desired refutation, where the explicitness is clear.
    The claim about multilinear-formula-IPSLIN0 follows, and thus a refutation induces the equation
∑m                             2
  i=1 g j (x) f j (x) ≡ 1 mod x − x with the appropriate size bounds.


   Given this efficient simulation of multilinear-formula-IPSLIN0 by tree-like multilinear-formula-PC¬
(Theorem A.11), our multilinear-formula-IPSLIN refutation of the subset-sum axiom (Theorem 4.15),
and the lower bound for sparse-PC of the subset-sum axiom (Theorem A.4), we obtain the following
separation result.
Corollary A.12. Let F be a field of characteristic zero. Then x1 + · · · + xn + 1, x2 − x is unsatisfiable,
requires sparse-PC refutations of size exp(Ω(n)), but has poly(n)-explicit poly(n)-size multilinear-
formula-IPSLIN and tree-like multilinear-formula-PC ¬ refutations.
    This strengthens a results of Raz and Tzameret [66, 65], who separated dag-like multilinear-formula-
PC¬ from sparse-PC. However, we note that it is not clear whether sparse-PC can be efficiently simulated
by multilinear-formula-IPSLIN0 .

                           T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                77
               M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

B    Explicit multilinear polynomial satisfying a functional equation

In Section 5.2 we showed that any polynomial that agrees with the function x 7→ 1/(∑i xi −β ) on the Boolean
cube must have degree ≥ n. However, as there is a unique multilinear polynomial obeying this functional
equation it is natural to ask for an explicit description of this polynomial, which we now give.


Proposition B.1. Let n ≥ 1 and F be a field with char(F) > n. Suppose that β ∈ F \ {0, . . . , n}. Let
f ∈ F[x1 , . . . , xn ] be the unique multilinear polynomial such that


                                                                 1
                                                  f (x) =              ,
                                                             ∑i xi − β

for x ∈ {0, 1}n . Then
                                                         n
                                                                     k!
                                           f (x) = − ∑       k
                                                                          Sn,k ,
                                                        k=0 ∏ j=0 (β − j)


where Sn,k := ∑S⊆([n]) ∏i∈S xi is the k-th elementary symmetric polynomial.
                    k




Proof: It follows from the uniqueness of the evaluations of multilinear polynomials over the Boolean
cube that


                                       f (x) = ∑ f (1T ) ∏ xi ∏(1 − xi )
                                               T ⊆[n]          i∈T        i∈T
                                                                           /



where 1T ∈ {0, 1}n is the indicator vector of the set T , so that


                                                          1
                                            = ∑                ∏ xi ∏(1 − xi ) .
                                                      |T | − β i∈T
                                               T ⊆[n]               i∈T
                                                                     /



Using this, let us determine the coefficient of ∏i∈S xi in f (x), for S ⊆ [n] with |S| = k. First observe that
setting xi = 0 for i ∈
                     / S preserves this coefficient, so that

                                                                                        !
                                                                  1
               Coeff∏i∈S xi       f (x) = Coeff∏i∈S xi        ∑ |T | − β ∏ xi ∏(1 − xi )
                                                             T ⊆[n]      i∈T i∈T
                                                                              /             xi ←0,i∈S


                         T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                             78
             P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

and thus those sets T with T 6⊆ S are zeroed out,
                                                                                       !
                                                                  1
                                       = Coeff∏i∈S xi     ∑            ∏ xi ∏ (1 − xi )
                                                         T ⊆S |T | − β i∈T i∈S\T
                                                                                       !
                                                  1
                                       = ∑             Coeff∏i∈S xi    ∏ xi ∏ (1 − xi )
                                         T ⊆S |T | − β                 i∈T   i∈S\T
                                                   1
                                       = ∑               (−1)k−|T |
                                         T ⊆S |T  | −  β
                                           k  
                                               k        1
                                       =∑                   (−1)k− j
                                         j=0    j    j −  β
                                                  k!
                                       =−                   ,
                                            ∏kj=0 (β − j)
where the last step uses the subclaim below.
Subclaim B.2.
                                   k   
                                       k  1                   k!
                                  ∑ j j − β (−1)k− j = − ∏k (β − j) .
                                  j=0                     j=0

Sub-Proof: Clearing denominators,
                   k              k                 k  
                                     k  1     k− j       k
                  ∏   (i − β ) · ∑ j j−β  (−1)     = ∑ j (−1)k− j ∏(i − β ) .
                  i=0            j=0                 j=0          i6= j

Note that the right hand side is a univariate degree ≤ k polynomial in β , so it is determined by its value
on ` ∈ {0, . . . , k} (that F has large characteristic implies that these values are distinct). Note that on these
values,
                         k                            
                             k                           k
                        ∑ j (−1) ∏(i − `) = ` (−1)k−` ∏ (i − `) · ∏ (i − `)
                                      k− j
                        j=0                i6= j                     0≤i<`         `<i≤k
                                                        
                                                         k
                                                    =      (−1)k−` · (−1)` `! · (k − `)!
                                                         `
                                                    = (−1)k k! .

Thus by interpolation ∑kj=0 kj (−1)k− j ∏i6= j (i − β ) = (−1)k k! for all β , and thus dividing by ∏ki=0 (i − β )
                              

and clearing −1’s yields the claim.                                                                             


    This then gives the claim as the coefficient of ∏i∈S xi only depends on |S| = k.

   Noting that the above coefficients are all nonzero because char(F) > n. Thus, we obtain the fol-
lowing corollary by observing that degree and sparsity are non-increasing under multilinearization
(Theorem 3.12).

                        T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                                  79
               M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

Corollary B.3. Let n ≥ 1 and F be a field with char(F) > n. Suppose that β ∈ F \ {0, . . . , n}. Let
f ∈ F[x1 , . . . , xn ] be a polynomial such that
                                                  !
                                  f (x)   ∑ xi − β   =1    mod x2 − x .
                                          i

Then deg f ≥ n, and f requires ≥ 2n monomials.


Acknowledgments
We would like to thank Rafael Oliveira for telling us of Theorem 6.14, Mrinal Kumar and Ramprasad
Saptharishi for conversations [23] clarifying the roles of functional lower bounds in this work, as well as
Avishay Tal for pointing out how Theorem B.1 implies an optimal functional lower bound for sparsity
(Corollary B.3). We would also like to thank Joshua Grochow for helpful discussions regarding this work.
We are grateful for the anonymous reviewers for their careful read of the paper and for their comments.


References
 [1] M ANINDRA AGRAWAL: Proving lower bounds via pseudo-random generators. In Proc. 25th
     Found. Softw. Techn. Theoret. Comp. Sci. Conf. (FSTTCS’05), pp. 92–105. Springer, 2005.
     [doi:10.1007/11590156 6] 14, 56, 57

 [2] M ANINDRA AGRAWAL , ROHIT G URJAR , A RPITA KORWAR , AND N ITIN S AXENA: Hitting-
     sets for ROABP and sum of set-multilinear circuits. SIAM J. Comput., 44(3):669–697, 2015.
     [doi:10.1137/140975103, arXiv:1406.7535, ECCC:TR14-085] 8, 18

 [3] M ANINDRA AGRAWAL , C HANDAN S AHA , AND N ITIN S AXENA: Quasi-polynomial hitting-
     set for set-depth-∆ formulas.    In Proc. 45th STOC, pp. 321–330. ACM Press, 2013.
     [doi:10.1145/2488608.2488649, arXiv:1209.2333] 7, 58, 60

 [4] M ICHAEL A LEKHNOVICH AND A LEXANDER A. R AZBOROV: Lower bounds for polynomial
     calculus: Non-binomial case. Trudy Mat. Inst. Steklova, 242:23–43, 2003 (Russian). MathNet.Ru.
     Preliminary version in FOCS’01. 3

 [5] YAROSLAV A LEKSEEV, D IMA G RIGORIEV, E DWARD A. H IRSCH , AND I DDO T ZAMERET: Semi-
     algebraic proofs, IPS lower bounds, and the τ-conjecture: Can a natural number be negative? In
     Proc. 52nd STOC, pp. 54–67. ACM Press, 2020. [doi:10.1145/3357713.3384245, arXiv:1911.06738,
     ECCC:TR19-142] 16

 [6] M ATTHEW A NDERSON , M ICHAEL A. F ORBES , R AMPRASAD S APTHARISHI , A MIR S HPILKA ,
     AND B EN L EE VOLK : Identity testing and lower bounds for read-k oblivious algebraic branching
     programs. ACM Trans. Comput. Theory, 10(1):1–30, 2018. Preliminary version in CCC’16.
     [doi:10.1145/3170709, arXiv:1511.07136, ECCC:TR15-184] 8, 18, 20, 66

                      T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                             80
            P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

 [7] M ATTHEW A NDERSON , D IETER VAN M ELKEBEEK , AND I LYA VOLKOVICH: Derandomizing
     polynomial identity testing for multilinear constant-read formulae. In Proc. 26th IEEE Conf. on
     Comput. Complexity (CCC’11), pp. 273–282. IEEE Comp. Soc., 2011. [doi:10.1109/CCC.2011.18,
     ECCC:TR10-188] 57

 [8] V IKRAMAN A RVIND , P USHKAR S. J OGLEKAR , PARTHA M UKHOPADHYAY, AND S. R AJA:
     Randomized polynomial-time identity testing for noncommutative circuits. Theory of Com-
     puting, 15(7):1–36, 2019. Preliminary version in STOC’17. [doi:10.4086/toc.2019.v015a007,
     arXiv:1606.00596, ECCC:TR16-089] 17

 [9] PAUL B EAME , RUSSELL I MPAGLIAZZO , JAN K RAJÍ ČEK , T ONIANN P ITASSI , AND PAVEL
     P UDLÁK: Lower bounds on Hilbert’s Nullstellensatz and propositional proofs. Proc. London
     Math. Soc., s3-73(1):1–26, 1996. Preliminary version in FOCS’94. [doi:10.1112/plms/s3-73.1.1] 3,
     4

[10] PAUL B EAME AND T ONIANN P ITASSI: Propositional proof complexity: Past, present and future.
     Bull. EATCS, 65:66–89, 1998. [ECCC:TR98-067] 70

[11] S AMUEL R. B USS , D IMA G RIGORIEV, RUSSELL I MPAGLIAZZO , AND T ONIANN P ITASSI: Linear
     gaps between degrees for the polynomial calculus modulo distinct primes. J. Comput. System Sci.,
     62(2):267–289, 2001. Preliminary versions in CCC’99 and STOC’99. [doi:10.1006/jcss.2000.1726]
     3

[12] S AMUEL R. B USS , RUSSELL I MPAGLIAZZO , JAN K RAJÍ ČEK , PAVEL P UDLÁK , A LEXANDER A.
     R AZBOROV, AND J I ŘÍ S GALL: Proof complexity in algebraic systems and bounded depth Frege sys-
     tems with modular counting. Comput. Complexity, 6(3):256–298, 1996. [doi:10.1007/BF01294258]
     3

[13] M ATTHEW C LEGG , J EFF E DMONDS , AND RUSSELL I MPAGLIAZZO: Using the Groebner basis
     algorithm to find proofs of unsatisfiability. In Proc. 28th STOC, pp. 174–183. ACM Press, 1996.
     [doi:10.1145/237814.237860] 3, 16, 70, 71, 72

[14] S TEPHEN A. C OOK AND ROBERT A. R ECKHOW: Corrections for “On the lengths of proofs in
     the propositional calculus (Preliminary version)”. SIGACT News, 6(3):15–22, 1974. Preliminary
     version in STOC’74. [doi:10.1145/1008311.1008313]

[15] S TEPHEN A. C OOK AND ROBERT A. R ECKHOW: The relative efficiency of propositional proof
     systems. J. Symbolic Logic, 44(1):36–50, 1979. [doi:10.2307/2273702] 2

[16] DAVID C OX , J OHN L ITTLE , AND D ONAL O’S HEA: Ideals, Varieties, and Algorithms. Springer,
     3rd edition, 2007. [doi:10.1007/978-3-319-16721-3] 22, 23

[17] R ICHARD A. D E M ILLO AND R ICHARD J. L IPTON: A probabilistic remark on algebraic program
     testing. Inform. Process. Lett., 7(4):193–195, 1978. [doi:10.1016/0020-0190(78)90067-4] 17

                     T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                         81
              M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

[18] Z EEV DVIR , A MIR S HPILKA , AND A MIR Y EHUDAYOFF: Hardness-randomness tradeoffs for
     bounded depth arithmetic circuits. SIAM J. Comput., 39(4):1279–1293, 2010. Preliminary version
     in STOC’08. [doi:10.1137/080735850] 14, 54, 55

[19] I SMOR F ISCHER: Sums of like powers of multivariate linear forms. Math. Magazine, 67(1):59–61,
     1994. [doi:10.1080/0025570X.1994.11996185] 10, 72

[20] M ICHAEL A. F ORBES: Polynomial Identity Testing of Read-Once Oblivious Algebraic Branching
     Programs. Ph. D. thesis, MIT, 2014. LINK at handle.net. 17, 18, 19

[21] M ICHAEL A. F ORBES: Deterministic divisibility testing via shifted partial derivatives. In Proc.
     56th FOCS, pp. 451–465. IEEE Comp. Soc., 2015. [doi:10.1109/FOCS.2015.35] 7, 14, 18, 56, 57,
     58, 60

[22] M ICHAEL A. F ORBES , A NKIT G UPTA , AND A MIR S HPILKA: Personal Communication to Gupta,
     Kamath, Kayal, Saptharishi [37], 2013. 22

[23] M ICHAEL A. F ORBES , M RINAL K UMAR , AND R AMPRASAD S APTHARISHI: Functional lower
     bounds for arithmetic circuits and connections to boolean circuit complexity. In Proc. 31st Comput.
     Complexity Conf. (CCC’16), pp. 33:1–19. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 2016.
     [doi:10.4230/LIPIcs.CCC.2016.33, arXiv:1605.04207] 11, 12, 80

[24] M ICHAEL A. F ORBES , R AMPRASAD S APTHARISHI , AND A MIR S HPILKA: Hitting sets for
     multilinear read-once algebraic branching programs, in any order. In Proc. 46th STOC, pp. 867–875.
     ACM Press, 2014. [doi:10.1145/2591796.2591816, arXiv:1309.5668] 7, 8, 18, 57, 58

[25] M ICHAEL A. F ORBES AND A MIR S HPILKA: On identity testing of tensors, low-rank re-
     covery and compressed sensing. In Proc. 44th STOC, pp. 163–172. ACM Press, 2012.
     [doi:10.1145/2213977.2213995, arXiv:1111.0663, ECCC:TR11-147] 8, 15, 62

[26] M ICHAEL A. F ORBES AND A MIR S HPILKA: Explicit Noether Normalization for simultaneous
     conjugation via polynomial identity testing. In Proc. 17th Internat. Workshop on Randomization and
     Computation (RANDOM’13), pp. 527–542. Springer, 2013. [doi:10.1007/978-3-642-40328-6 37,
     arXiv:1303.0084, ECCC:TR13-033] 7, 15, 57, 58, 59, 60

[27] M ICHAEL A. F ORBES AND A MIR S HPILKA: Quasipolynomial-time identity testing of non-
     commutative and read-once oblivious algebraic branching programs. In Proc. 54th FOCS, pp.
     243–252. IEEE Comp. Soc., 2013. [doi:10.1109/FOCS.2013.34, arXiv:1209.2408, ECCC:TR12-
     115] 7, 8, 20, 22, 86

[28] M ICHAEL A. F ORBES AND A MIR S HPILKA: Complexity theory column 88: Challenges in
     polynomial factorization. SIGACT News, 46(4):32–49, 2015. [doi:10.1145/2852040.2852051] 55

[29] M ICHAEL A. F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON: Proof
     complexity lower bounds from algebraic circuit complexity. In Proc. 31st Comput. Complex-
     ity Conf. (CCC’16), pp. 32:1–17. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 2016.
     [doi:10.4230/LIPIcs.CCC.2016.32, arXiv:1606.05050, ECCC:TR16-098] 1, 16

                     T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                           82
            P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

[30] J OACHIM VON ZUR G ATHEN AND E RICH L. K ALTOFEN: Factoring sparse multivariate poly-
     nomials. J. Comput. System Sci., 31(2):265–287, 1985. Preliminary version in FOCS’83.
     [doi:10.1016/0022-0000(85)90044-3] 55

[31] D IMA G RIGORIEV: Tseitin’s tautologies and lower bounds for Nullstellensatz proofs. In Proc. 39th
     FOCS, pp. 648–652. IEEE Comp. Soc., 1998. [doi:10.1109/SFCS.1998.743515] 3

[32] D IMA G RIGORIEV AND E DWARD A. H IRSCH: Algebraic proof systems over formulas. Theoret.
     Comput. Sci., 303(1):83–102, 2003. [doi:10.1016/S0304-3975(02)00446-2, ECCC:TR01-011] 3, 4,
     10, 24, 70, 71

[33] D IMA G RIGORIEV AND M AREK K ARPINSKI: An exponential lower bound for depth 3 arithmetic
     circuits. In Proc. 30th STOC, pp. 577–582. ACM Press, 1998. [doi:10.1145/276698.276872] 12

[34] D IMA G RIGORIEV AND A LEXANDER A. R AZBOROV: Exponential lower bounds for depth 3
     arithmetic circuits in algebras of functions over finite fields. Applicable Algebra Eng. Commun.
     Comput., 10(6):465–487, 2000. Preliminary version in FOCS’98. [doi:10.1007/s002009900021]
     11, 12

[35] J OSHUA A. G ROCHOW AND T ONIANN P ITASSI: Circuit complexity, proof complexity, and polyno-
     mial identity testing: The ideal proof system. J. ACM, 65(6):37:1–59, 2018. [doi:10.1145/3230742]
     3, 4, 5, 9, 10, 24, 25, 29, 69, 70, 71

[36] A NKIT G UPTA , P RITISH K AMATH , N EERAJ K AYAL , AND R AMPRASAD S APTHARISHI: Ap-
     proaching the chasm at depth four. J. ACM, 61(6):33:1–16, 2014. Preliminary version in CCC’13.
     [doi:10.1145/2629541] 7, 12, 60, 69

[37] A NKIT G UPTA , P RITISH K AMATH , N EERAJ K AYAL , AND R AMPRASAD S APTHARISHI: Arith-
     metic circuits: A chasm at depth three. SIAM J. Comput., 45(3):1064–1079, 2016. Preliminary
     version in FOCS’13. [doi:10.1137/140957123, ECCC:TR13-026] 82

[38] ROHIT G URJAR , A RPITA KORWAR , N ITIN S AXENA , AND T HOMAS T HIERAUF: Deterministic
     identity testing for sum of read-once oblivious arithmetic branching programs. Comput. Com-
     plexity, 26(4):1–46, 2016. Preliminary version in CCC’15. [doi:10.1007/s00037-016-0141-z,
     arXiv:1411.7341] 58, 60

[39] J OOS H EINTZ AND C LAUS -P ETER S CHNORR: Testing polynomials which are easy to compute (ex-
     tended abstract). In Proc. 12th STOC, pp. 262–272. ACM Press, 1980. [doi:10.1145/800141.804674]
     14, 56

[40] PAVEL H RUBEŠ AND I DDO T ZAMERET: Short proofs for the determinant identities. SIAM
     J. Comput., 44(2):340–383, 2015. Preliminary version in STOC’12. [doi:10.1137/130917788,
     arXiv:1112.6265, ECCC:TR11-174] 70

[41] RUSSELL I MPAGLIAZZO , PAVEL P UDLÁK , AND J I ŘÍ S GALL: Lower bounds for the poly-
     nomial calculus and the Gröbner basis algorithm. Comput. Complexity, 8(2):127–144, 1999.
     [doi:10.1007/s000370050024, ECCC:TR97-042] 3, 9, 11, 12, 24, 45, 47, 72, 73

                     T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                          83
              M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

[42] RUSSELL I MPAGLIAZZO AND AVI W IGDERSON: P=BPP if E requires exponential circuits:
     Derandomizing the XOR lemma. In Proc. 29th STOC, pp. 220–229. ACM Press, 1997.
     [doi:10.1145/258533.258590] 54

[43] VALENTINE K ABANETS AND RUSSELL I MPAGLIAZZO: Derandomizing polynomial identity tests
     means proving circuit lower bounds. Comput. Complexity, 13(1–2):1–46, 2004. Preliminary version
     in STOC’03. [doi:10.1007/s00037-004-0182-6] 14, 54, 55, 56

[44] E RICH L. K ALTOFEN: Factorization of polynomials given by straight-line programs. In S ILVIO
     M ICALI, editor, Randomness and Computation, volume 5 of Adv. Comput. Res., pp. 375–412. JAI
     Press, Inc., 1989. LINK at author’s website. 14, 53, 54, 55

[45] N EERAJ K AYAL: Personal Communication to Saxena [71], 2008. 15, 59

[46] N EERAJ K AYAL: An exponential lower bound for the sum of powers of bounded degree polynomials.
     Electron. Colloq. Comput. Complexity, TR12-081(81), 2012. [ECCC] 7, 12, 60

[47] A DAM K LIVANS AND DANIEL A. S PIELMAN: Randomness efficient identity testing of multivariate
     polynomials. In Proc. 33rd STOC, pp. 216–223. ACM Press, 2001. [doi:10.1145/380752.380801]
     18

[48] JAN K RAJÍ ČEK: Bounded Arithmetic, Propositional Logic, and Complexity Theory. Volume 60 of
     Encycl. Math. Appl. Cambridge Univ. Press, 1995. [doi:10.1017/CBO9780511529948] 2, 3

[49] M RINAL K UMAR AND R AMPRASAD S APTHARISHI: An exponential lower bound for homogeneous
     depth-5 circuits over finite fields. In Proc. 32nd Comput. Complexity Conf. (CCC’17), pp. 31:1–30.
     Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 2017. [doi:10.4230/LIPIcs.CCC.2017.31,
     arXiv:1507.00177, ECCC:TR15-109] 12

[50] F U L I , I DDO T ZAMERET, AND Z HENGYU WANG: Characterizing propositional proofs as non-
     commutative formulas. SIAM J. Comput., 47(4):1424–1462, 2018. [doi:10.1137/16M1107632,
     arXiv:1412.8746] 5, 6, 8, 10, 24

[51] M EENA M AHAJAN , BV R AGHAVENDRA R AO , AND K ARTEEK S REENIVASAIAH: Building above
     read-once polynomials: Identity testing and hardness of representation. Algorithmica, 76(4):890–
     909, 2016. Preliminary version in COCOON’14. [ECCC:TR15-202] 60

[52] N OAM N ISAN: Lower bounds for non-commutative computation. In Proc. 23rd STOC, pp. 410–418.
     ACM Press, 1991. [doi:10.1145/103418.103462] 5, 6, 7, 8, 18, 19

[53] N OAM N ISAN AND AVI W IGDERSON: Hardness vs randomness. J. Comput. System Sci., 49(2):149–
     167, 1994. Preliminary version in FOCS’88. [doi:10.1016/S0022-0000(05)80043-1] 15, 54

[54] N OAM N ISAN AND AVI W IGDERSON: Lower bounds on arithmetic circuits via partial
     derivatives. Comput. Complexity, 6(3):217–234, 1996. Preliminary version in FOCS’95.
     [doi:10.1007/BF01294256] 7, 12, 38, 59

                     T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                          84
            P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

[55] R AFAEL O LIVEIRA: Personal communication, 2015. 61

[56] R AFAEL O LIVEIRA: Factors of low individual degree polynomials. Comput. Complexity, 25(2):507–
     561, 2016. Preliminary version in CCC’15. 14, 54, 55

[57] R AFAEL O LIVEIRA , A MIR S HPILKA , AND B EN LEE VOLK: Subexponential size hitting sets
     for bounded depth multilinear formulas. Comput. Complexity, 25(2):455–505, 2016. Preliminary
     version in CCC’15. 9

[58] Ø YSTEIN O RE: Über höhere Kongruenzen. Norsk Mat. Forenings Skrifter Ser. I, 7(15):27, 1922. 17

[59] T ONIANN P ITASSI: Algebraic propositional proof systems. In Descriptive Complexity and Finite
     Models, volume 31 of DIMACS Ser. Discr. Math. and Theoret. Comp. Sci., pp. 215–244. Amer.
     Math. Soc., 1997. LINK at author’s website. [doi:10.1090/dimacs/031] 3, 4, 5, 70, 71

[60] T ONIANN P ITASSI AND I DDO T ZAMERET: Algebraic proof complexity: progress, frontiers and
     challenges. ACM SIGLOG News, 3(3):21–43, 2016. [doi:10.1145/2984450.2984455] 3

[61] PAVEL P UDLÁK: Lower bounds for resolution and cutting plane proofs and monotone computations.
     J. Symbolic Logic, 62(3):981–998, 1997. [doi:10.2307/2275583] 3

[62] R AN R AZ: Separation of multilinear circuit and formula size. Theory of Computing, 2(6):121–135,
     2006. Preliminary version in FOCS’04. [doi:10.4086/toc.2006.v002a006] 9, 21

[63] R AN R AZ: Multi-linear formulas for permanent and determinant are of super-polynomial size.
     J. ACM, 56(2):8:1–17, 2009. Preliminary version in STOC’04. [doi:10.1145/1502793.1502797,
     ECCC:TR03-067] 9, 18, 21, 53

[64] R AN R AZ AND A MIR S HPILKA: Deterministic polynomial identity testing in non-commutative mod-
     els. Comput. Complexity, 14(1):1–19, 2005. Preliminary version in CCC’04. [doi:10.1007/s00037-
     005-0188-8] 5, 8, 18, 72

[65] R AN R AZ AND I DDO T ZAMERET: Resolution over linear equations and multilinear proofs. Ann.
     Pure Appl. Logic, 155(3):194–224, 2008. [doi:10.1016/j.apal.2008.04.001, arXiv:0708.1529,
     ECCC:TR07-078] 3, 4, 10, 24, 70, 71, 73, 77

[66] R AN R AZ AND I DDO T ZAMERET: The strength of multilinear proofs. Comput. Complexity,
     17(3):407–457, 2008. [doi:10.1007/s00037-008-0246-0, ECCC:TR06-001] 3, 4, 10, 16, 24, 70, 71,
     73, 77

[67] R AN R AZ AND A MIR Y EHUDAYOFF: Balancing syntactically multilinear arithmetic circuits.
     Comput. Complexity, 17(4):515–535, 2008. [doi:10.1007/s00037-008-0254-0] 21

[68] R AN R AZ AND A MIR Y EHUDAYOFF: Lower bounds and separations for constant depth mul-
     tilinear circuits. Comput. Complexity, 18(2):171–207, 2009. Preliminary version in CCC’08.
     [doi:10.1007/s00037-009-0270-8, ECCC:TR08-006] 9, 21, 53

                     T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                         85
              M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

[69] A LEXANDER A. R AZBOROV: Lower bounds for the polynomial calculus. Comput. Complexity,
     7(4):291–324, 1998. [doi:10.1007/s000370050013] 3

[70] R AMPRASAD S APTHARISHI: 2012. Personal communication to Forbes–Shpilka [27]. 20

[71] N ITIN S AXENA: Diagonal circuit identity testing and lower bounds. In Proc. 35th Internat.
     Colloq. on Automata, Languages, and Programming (ICALP’08), pp. 60–71. Springer, 2008.
     [doi:10.1007/978-3-540-70575-8 6, ECCC:TR07-124] 6, 7, 18, 21, 22, 84

[72] N ITIN S AXENA: Progress on polynomial identity testing. Bull. EATCS, 99:49–79, 2009. EATCS.
     [arXiv:1401.0976, ECCC:TR09-101] 17

[73] N ITIN S AXENA: Progress on polynomial identity testing - II. In M. AGRAWAL AND V. A RVIND,
     editors, Perspectives in Computational Complexity: The Somenath Biswas Anniversary Volume, pp.
     131–146. Springer, 2014. [doi:10.1007/978-3-319-05446-9 7, arXiv:1401.0976, ECCC:TR13-186]
     17

[74] JACOB T. S CHWARTZ: Fast probabilistic algorithms for verification of polynomial identities. J.
     ACM, 27(4):701–717, 1980. Preliminary version in EUROSAM’79. [doi:10.1145/322217.322225]
     17

[75] A MIR S HPILKA: Affine projections of symmetric polynomials. J. Comput. System Sci., 65(4):639–
     659, 2002. Preliminary version in CCC’01. [doi:10.1016/S0022-0000(02)00021-1, ECCC:TR01-
     035] 6

[76] A MIR S HPILKA AND I LYA VOLKOVICH: Improved polynomial identity testing for read-once
     formulas. In Proc. 13th Internat. Workshop on Randomization and Computation (RANDOM’09),
     volume 5687, pp. 700–713. Springer, 2009. 57, 58, 59, 60

[77] A MIR S HPILKA AND AVI W IGDERSON: Depth-3 arithmetic circuits over fields of characteristic
     zero. Comput. Complexity, 10(1):1–27, 2001. [doi:10.1007/PL00001609] 22

[78] A MIR S HPILKA AND A MIR Y EHUDAYOFF: Arithmetic circuits: A survey of recent results and open
     questions. Found. Trends Theor. Comp. Sci., 5(3–4):207–388, 2010. [doi:10.1561/0400000039] 6,
     17, 18, 25

[79] M ICHAEL S HUB AND S TEVE S MALE: On the intractability of Hilbert’s Nullstellensatz and an
     algebraic version of “NP6=P?". Duke Math. J., 81:47–54, 1995. Available in the Collected papers of
     Stephen Smale, pp. 1508–1515, World Scientific 2000. 16

[80] M ICHAEL S OLTYS AND S TEPHEN A. C OOK: The proof complexity of linear alge-
     bra. Ann. Pure Appl. Logic, 130(1–3):277–323, 2004. Preliminary version in LICS’02.
     [doi:10.1016/j.apal.2003.10.018] 70

[81] VOLKER S TRASSEN: Vermeidung von Divisionen. J. reine angew. Math., 264:184–202, 1973.
     [doi:10.1515/crll.1973.264.184] 24, 25

                     T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                          86
            P ROOF C OMPLEXITY L OWER B OUNDS FROM A LGEBRAIC C IRCUIT C OMPLEXITY

[82] M ADHU S UDAN , L UCA T REVISAN , AND S ALIL P. VADHAN: Pseudorandom generators without
     the XOR lemma. J. Comput. System Sci., 62(2):236–266, 2001. Preliminary version in STOC’99.
     [doi:10.1006/jcss.2000.1730] 54

[83] I DDO T ZAMERET: Algebraic proofs over noncommutative formulas. Inform. Comput.,
     209(10):1269–1292, 2011. Preliminary version in TAMC’10. [doi:10.1016/j.ic.2011.07.004,
     arXiv:1004.2159, ECCC:TR10-097] 3, 8, 16, 70, 71, 72, 73

[84] I LYA VOLKOVICH: Computations beyond exponentiation gates and applications. Electron. Colloq.
     Comput. Complexity, TR15-042, 2015. [ECCC] 61

[85] I LYA VOLKOVICH: Deterministically factoring sparse polynomials into multilinear factors and
     sums of univariate polynomials. In Proc. 19th Internat. Workshop on Randomization and Com-
     putation (RANDOM’15), pp. 943–958. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 2015.
     [doi:10.4230/LIPIcs.APPROX-RANDOM.2015.943, ECCC:TR14-168] 55, 57

[86] R ICHARD Z IPPEL: Probabilistic algorithms for sparse polynomials. In Proc. Internat. Symp. Sym-
     bolic and Algebraic Manipulation (EUROSAM’79), pp. 216–226. Springer, 1979. [doi:10.1007/3-
     540-09519-5 73] 17


AUTHORS

      Michael A. Forbes
      Assistant professor
      University of Illinois at Urbana-Champaign
      IL, USA
      miforbes illinois edu
      http://miforbes.cs.illinois.edu/


      Amir Shpilka
      Professor
      Tel Aviv University
      Tel Aviv, Israel
      shpilka post tau ac il
      www.cs.tau.ac.il/~shpilka




                     T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                        87
            M ICHAEL F ORBES , A MIR S HPILKA , I DDO T ZAMERET, AND AVI W IGDERSON

    Iddo Tzameret
    Professor
    Imperial College London
    London, UK
    iddo tzameret gmail com
    https://www.doc.ic.ac.uk/~itzamere/

    Avi Wigderson
    Institute for Advanced Study, Princeton
    avi ias edu
    www.math.ias.edu/avi/home

ABOUT THE AUTHORS
    M ICHAEL A. F ORBES obtained his Ph. D. in Electrical Engineering and Computer Science
       from the Massachusetts Institute of Technology in 2014, where he was co-advised by
       Scott Aaronson and Amir Shpilka. In his dissertation, he developed new deterministic
       algorithms to solve cases of the Polynomial Identity Testing problem. In 2017, he joined
       the faculty of the University of Illinois at Urbana-Champaign. His research focuses on
       the interaction of randomness, algebra, and computation.

    A MIR S HPILKA obtained his Ph. D. in Computer Science and Mathematics from the Hebrew
       University in Jerusalem in 2001 under the supervision of Avi Wigderson. From 2005 to
       2014 he was a faculty member at the CS department at the Technion. In October 2014 he
       joined the CS department at Tel Aviv University. His research interests lie in complexity
       theory, especially in algebraic complexity.

    I DDO T ZAMERET holds a Chair in Computational Complexity at Imperial College London.
        Before that he was a professor at the University of London (Royal Holloway), a visiting
        scholar at Oxford University, an assistant professor at Tsinghua University and a research
        fellow at the Academy of Sciences in Prague. He received his Ph. D. from Tel Aviv
        University under the supervision of Ran Raz and Nachum Dershowitz. His research lies
        in complexity theory, exploring different approaches to the limits of efficient computa-
        tion and inference, both as a natural and a mathematical phenomenon. This includes
        algebraic, logical and combinatorial approaches in complexity, lower bounds on concrete
        computational models, proof complexity and satisfiability. Among his other expertise is
        the history of rock ’n roll.

    AVI W IGDERSON was born in Haifa, Israel in 1956, and received his Ph. D. in 1983 at
       Princeton University under Dick Lipton. He enjoys and is fascinated with studying the
       power and limits of efficient computation, and the remarkable impact of this field on
       understanding our world. Avi’s other major source of fascination and joy are his three
       kids, Eyal, Einat, and Yuval, and his granddaughter Tamar.


                    T HEORY OF C OMPUTING, Volume 17 (10), 2021, pp. 1–88                            88