Monotone Circuit Lower Bounds from Resolution

Authors Ankit Garg, Mika G{\"o}{\"o}s, Pritish Kamath, Dmitry Sokolov,
Plaintext
                           T HEORY OF C OMPUTING, Volume 16 (13), 2020, pp. 1–30
                                        www.theoryofcomputing.org




                 Monotone Circuit Lower Bounds
                       from Resolution
       Ankit Garg                 Mika Göös∗                 Pritish Kamath†      Dmitry Sokolov‡
                Received July 26, 2018; Revised September 26, 2020; Published November 9, 2020




       Abstract. For any unsatisfiable CNF formula F that is hard to refute in the Resolution proof
       system, we show that a gadget-composed version of F is hard to refute in any proof system
       whose lines are computed by efficient communication protocols—or, equivalently, that a
       monotone function associated with F has large monotone circuit complexity. Our result
       extends to monotone real circuits, which yields new lower bounds for the Cutting Planes
       proof system.

ACM Classification: F.1.3, F.2.3
AMS Classification: 68Q17, 68Q15, 03F20
Key words and phrases: circuit complexity, communication complexity, proof complexity




     An extended abstract of this paper appeared in the Proceedings of the 50th Symposium on Theory of Computing
(STOC’18) [16].
   ∗ Supported by Michael O. Rabin Postdoctoral Fellowship.
   † Supported in parts by NSF grants CCF-1650733, CCF-1733808, and IIS-1741137.
   ‡ Supported by the Swedish Research Council grant 2016-00782, Knut and Alice Wallenberg Foundation grants KAW

2016.0066 and KAW 2016.0433.


c 2020 Ankit Garg, Mika Göös, Pritish Kamath, and Dmitry Sokolov
c b Licensed under a Creative Commons Attribution License (CC-BY)                    DOI: 10.4086/toc.2020.v016a013
                 A NKIT G ARG , M IKA G Ö ÖS , P RITISH K AMATH , AND D MITRY S OKOLOV

1    Appetizer

DAG-like communication protocols [46, 40, 52], generalizing the usual notion of tree-like communication
protocols [35, 31, 41], provide a useful abstraction to study two kinds of objects in complexity theory:

    • Monotone circuits. Let f be a monotone boolean function. The monotone circuit complexity of f
      can be characterized in the language of DAG-like protocols. Namely, it equals the least size of a
      DAG -like protocol that solves the monotone Karchmer–Wigderson (mKW) search problem for f .

    • Propositional proofs. Let F be a CNF contradiction (an unsatisfiable CNF formula). Lower
      bounds for the Resolution refutation length complexity of F—or indeed lower bounds for any
      propositional proof system whose lines are computed by efficient communication protocols—can
      be proved via DAG-like protocols. Namely, a lower bound is given by the least size of a DAG-like
      protocol that solves a certain CNF search problem associated with F.

In this paper, we prove a query-to-communication lifting theorem that escalates lower bounds for a
DAG -like query model (essentially Resolution) to lower bounds for DAG -like communication protocols.
In particular, this yields a new technique to prove size lower bounds for monotone circuits and several
types of proof systems (including Cutting Planes).

The result can be interpreted as a converse to monotone feasible interpolation [10, 33], which is a popular
method to prove refutation size lower bounds for proof systems (such as Resolution and Cutting Planes)
by reductions to monotone circuit lower bounds. A theorem of this type was conjectured by Beame,
Huynh, and Pitassi [5, §6]. We also note that lifting theory for deterministic tree-like protocols—with
applications to monotone formula size, tree-like refutation size, and size–space tradeoffs—has been
developed in quite some detail [42, 28, 20, 21, 13, 56, 11]. We import techniques from this line of work
into the DAG-like setting.
    A follow-up article [18] has obtained several concrete applications using our technique: an exponential
monotone circuit lower bound for XOR - SAT, and a separation showing that the Nullstellensatz proof
system can be exponentially more powerful than Cutting Planes.
    We formalize our result in Section 3 after we have defined our DAG-like models in Section 2.



2    DAG -like models

We define all computational models as solving search problems, defined by a relation S ⊆ I × O for
some finite input and output sets I and O. On input x ∈ I the search problem is to find some output in
S(x) := {o ∈ O : (x, o) ∈ S}. We always assume S is total so that S(x) 6= 0/ for all x ∈ I. We also define
S−1 (o) := {x ∈ I : (x, o) ∈ S}. For applications, the two most important examples of search problems, one
associated with a monotone function f : {0, 1}n → {0, 1}, another with an n-variable CNF contradiction
     V
F = i Di (where Di are disjunctions of literals), are as follows.

                       T HEORY OF C OMPUTING, Volume 16 (13), 2020, pp. 1–30                             2
                        M ONOTONE C IRCUIT L OWER B OUNDS FROM R ESOLUTION


        mKW search problem S f =           input: a pair (x, y) ∈ f −1 (1) × f −1 (0)
                                          output: a coordinate i ∈ [n] such that xi > yi

          CNF search problem SF =          input: an n-variable truth assignment z ∈ {0, 1}n
                                          output: clause D of F unsatisfied by z, i. e., D(z) = 0



2.1   Abstract DAGs
We work with a top-down definition of DAG-like models. A version of the following definition (with a
specialized F) was introduced by [46] and subsequently simplified in [40, 52].

Top-down definition. Let F be a family of functions I → {0, 1}. An F-DAG solving S ⊆ I × O is a
directed acyclic graph of fan-out ≤ 2 where each node v is associated with a function fv ∈ F (we call
fv−1 (1) the feasible set for v) satisfying the following conditions.

   1. Root: There is a distinguished root node r (fan-in 0), and fr ≡ 1 is the constant 1 function.
   2. Non-leaves: For each non-leaf node v with children u,u0 , we have fv−1 (1) ⊆ fu−1 (1) ∪ fu−1
                                                                                                0 (1).

   3. Leaves: Each leaf node v is labeled with an output ov ∈ O such that fv−1 (1) ⊆ S−1 (ov ).

   The size of an F-DAGis its number of nodes. If we specialize S to be a CNF search problem SF , the
above specializes to the familiar definition of refutations in a proof system whose lines are negations of
functions in F. Here is that dual definition, specialized to S = SF .

Bottom-up definition. Let G be a family of functions {0, 1}n → {0, 1}. (To match up with the top-
down definition, one should take G := {¬ f : f ∈ F}.) A (semantic) G-refutation of an n-variable CNF
contradiction F is a directed acyclic graph of fan-out ≤ 2 where each node (or line) v is associated with a
function gv ∈ G satisfying the following conditions.

   1. Root: There is a distinguished root node r (fan-in 0), and gr ≡ 0 is the constant 0 function.
                                                                                               −1
   2. Non-leaves: For each non-leaf node v with children u,u0 , we have g−1          −1
                                                                            v (1) ⊇ gu (1) ∩ gu0 (1).
                                                                             −1       −1
   3. Leaves: Each leaf node v is labeled with a clause D of F such that gv (1) ⊇ D (1).

2.2   Concrete DAGs
We now instantiate the abstract model for the purposes of communication and query complexity.

Rectangle-DAGs (DAG-like protocols). Consider a bipartite input domain I := X × Y so that Alice
holds x ∈ X, Bob holds y ∈ Y, and let F be the set of all indicator functions of (combinatorial) rectangles
over X × Y (sets of the form X ×Y with X ⊆ X, Y ⊆ Y). Call such F-DAGs simply rectangle-DAGs. For a
search problem S ⊆ X × Y × O we define its rectangle-DAG complexity by

                      rect-DAG(S) := least size of a rectangle-DAG that solves S.

                       T HEORY OF C OMPUTING, Volume 16 (13), 2020, pp. 1–30                             3
                   A NKIT G ARG , M IKA G Ö ÖS , P RITISH K AMATH , AND D MITRY S OKOLOV

           Query x2          >                                  Resolve x2           ⊥

                  x2 = 0

    Query x1       (¬x2 )            x2 = 1               Resolve x1         (x2 )

         x1 = 0             x1 = 1

       (¬x1 ∧ ¬x2 ) (x1 ∧ ¬x2 )         (x2 )                   (x1 ∨ x2 )       (¬x1 ∨ x2 )   (¬x2 )
      Conjunction-DAG: Top-down definition                    Resolution proof : Bottom-up definition

Figure 1: Two equivalent ways to view a Resolution refutation, illustrated in the tree-like case (see [31,
§18.2] for more discussion of the tree-like case).


    In circuit complexity, a straightforward generalization of the Karchmer–Wigderson depth characteri-
zation [32] shows that the monotone circuit complexity of any monotone function f equals rect-DAG(S f );
see [40, 52].
    In proof complexity, a useful-to-study semantic proof system is captured by Fc -DAGs solving CNF
search problems SF where Fc is the family of all functions X × Y → {0, 1} (where X × Y = {0, 1}n
corresponds to a bipartition of the n input variables of SF ) that can be computed by tree-like protocols of
communication cost c, say for c = polylog(n). Such a proof system can simulate other systems (such
as Resolution and Cutting Planes with bounded coefficients), and hence lower bounds against Fc -DAGs
imply lower bounds for other concrete proof systems. Moreover, any Fc -DAG can be simulated by a
rectangle-DAG with at most a factor 2c blow-up in size, and hence we do not lose much generality by
studying only rectangle-DAGs.

Conjunction-DAGs (essentially Resolution). Consider the n-bit input domain I := {0, 1}n and let F
be the set of all conjunctions of literals over the n input variables. Call such F-DAGs simply conjunction-
DAG s. We define the width of a conjunction- DAG Π as the maximum width of a conjunction associated
with a node of Π. For a search problem S ⊆ {0, 1}n × O we define

                     conj-DAG(S) := least size of a conjunction-DAG that solves S,
                            w(S) := least width of a conjunction-DAG that solves S.

    In the context of CNF search problems S = SF , conjunction-DAGs are equivalent to Resolution
refutations; see also Figure 1. Indeed, conj-DAG(SF ) is just the Resolution refutation length complexity
of F, and w(SF ) is the Resolution width complexity of F [8].

The complexity measures introduced so far are related as follows; here S0 is any two-party version of S
obtained by choosing some bipartition X × Y = {0, 1}n of the input domain of S.

                                  rect-DAG(S0 ) ≤ conj-DAG(S) ≤ nO(w(S)) .                              (2.1)

                           T HEORY OF C OMPUTING, Volume 16 (13), 2020, pp. 1–30                           4
                        M ONOTONE C IRCUIT L OWER B OUNDS FROM R ESOLUTION

The first inequality holds because each conjunction can be simulated by a rectangle. The second inequality
holds since there are at most nO(w) many distinct width-w conjunctions, and we may assume w.l.o.g. that
any f ∈ F is associated with at most one node in an F-DAG (any incoming edge to a node v can be rewired
to the lowest node u, in topological order, such that fv = fu ).


3    Our results
Our first theorem is a characterization of the rectangle-DAG complexity for composed search problems
of the form S ◦ gn . Here S ⊆ {0, 1}n × O is an arbitrary n-bit search problem, and g : X × Y → {0, 1} is
some carefully chosen two-party gadget that helps to distribute each input variable of S between the two
parties. More precisely, S ◦ gn ⊆ Xn × Yn × O is the search problem where Alice holds x ∈ Xn , Bob holds
y ∈ Yn , and their goal is to find some o ∈ S(z) for z := gn (x, y) = (g(x1 , y1 ), . . . , g(xn , yn )).
    Our concrete choice for a gadget is the usual m-bit index function I NDm : [m] × {0, 1}m → {0, 1}
mapping (x, y) 7→ yx . For large enough m, we show that the bounds (2.1) are tight.
Theorem 3.1. Let m = m(n) := n∆ for a large enough constant ∆. For any S ⊆ {0, 1}n × O,
                                     rect-DAG(S ◦ I NDnm ) = nΘ(w(S)) .
    We note that the conjunction-DAG width complexity of S ◦ I NDnm depends on how Alice’s gadget
inputs xi ∈ [m] are encoded as binary variables. For example, we can have w(S ◦ I NDnm ) = Θ(w(S)) when
using a “unary” encoding; see Section 8 for a discussion.

Implications. The primary advantage of such a lifting theorem is that we obtain, in a generic fashion, a
large class of hard (explicit) monotone functions and CNF contradictions. Let us outline how to apply our
theorem. We can start with any n-variable k-CNF contradiction F of Resolution width w, and conclude
from Theorem 3.1 that the composed problem S0 := SF ◦ I NDnm has rectangle-DAG complexity nΘ(w) . Then
we can use reductions (either new or known; see Section 8 for known ones) to translate S0 back to a
mKW/CNF search problem. The upshot will be the following.
    − S0 reduces to S f 0 where f 0 is some N-bit monotone function with N := nO(k) .
    − S0 reduces to SF 0 where F 0 is some nO(1) -variable 2k-CNF contradiction.
    A follow-up article [18] has provided concrete applications using a novel reduction framework based
on the above template. For example, they consider a monotone function 3XOR - SATn : {0, 1}N → {0, 1}
over N := 2n3 input bits defined as follows. An input x ∈ {0, 1}N is interpreted as (the indicator vector
of) a set of 3XOR constraints over n boolean variables v1 , . . . , vn (there are N possible constraints). We
define 3XOR - SATn (x) := 1 iff the set x is unsatisfiable, that is, no boolean assignment to the vi exists
that satisfies all constraints in x. They proceed to show that if F is an n-variable “Tseitin” contradiction
(which is hard for Resolution [54]), then S0 = SF ◦ I NDnm reduces to S3XOR - SATmn . Combining this with
Theorem 3.1, one obtains the following.
                                                                                   Ω(1)
Corollary 3.2 ([18, Thm. 1]). 3XOR - SATn requires monotone circuits of size 2n           .
   Since 3XOR - SATn is in NC2 [37], this improves on the exponential monotone vs. non-monotone
separation due to Tardos [53]; her function is in P and not known to be in NC.

                       T HEORY OF C OMPUTING, Volume 16 (13), 2020, pp. 1–30                               5
                  A NKIT G ARG , M IKA G Ö ÖS , P RITISH K AMATH , AND D MITRY S OKOLOV




           (a)                        (b)                         (c)                         (d)
Figure 2: We show lifting theorems for DAGs whose feasible sets are (a) rectangles or (b) triangles. It
remains open (see Section 10) to prove any lower bounds for explicit mKW/CNF search problems when
the feasible sets are (c) block-diagonal, which a special case of (d) intersections of 2 triangles.


Limitations. A disadvantage, stemming from the large gadget size m = n∆ , is that we get at best (using
w = Θ(n)) a monotone circuit lower bound of exp(N ε ) for a small constant ε ≥ 1/(∆ + 1). Such lower
bounds fall short of the current best record of exp(N 1/3−o(1) ) due to Harnik and Raz [25]. We inherit
the need for large gadgets from prior work [19, 22]; see Section 4. For this reason (and others), it is
an important open problem to develop a lifting theory for gadgets of size m = O(1). In particular, an
optimal 2Ω(N) lower bound would follow from an appropriate constant-size-gadget version of Theorem 3.1;
see Section 8 for details.


Techniques. We use tools developed in the context of tree-like lifting theorems, specifically from [19,
22]. These tools allow us to relate large rectangles in the input domain of S ◦ I NDnm with large subcubes in
the input domain of S; see Section 4. Given these tools, the proof of Theorem 3.1 is relatively short (two
pages). The proof is extremely direct: from any rectangle-DAG of size nd solving S ◦ I NDnm we extract a
width-O(d) conjunction-DAG solving S.
    Classical work on monotone circuit lower bounds has typically focused on specific monotone
functions [44, 3, 1, 23, 50] and more generally on studying the power of the underlying proof methods [45,
55, 47, 51, 9, 2]. A notable exception is Jukna’s criterion [30], recently applied in [26, 14], which is a
general sufficient condition for a monotone function to require large monotone circuit complexity. Our
perspective is seemingly even more abstract, as our result is phrased for arbitrary search problems (not
just of mKW/CNF type). However, it remains unclear exactly how the power of our methods compare
with the classical techniques; for example, can our result be rephrased in the language of Razborov’s
method of approximations? (An anonymous reviewer thinks this is possible, but not instructive.)


3.1   Extension: Monotone real circuits

Triangle-DAGs. Consider a bipartite input domain I := X × Y and let F be the set of all indicator
functions of (combinatorial) triangles over X × Y; here a triangle T ⊆ X × Y is a set that can be written as
T = {(x, y) ∈ X × Y : aT (x) < bT (y)} for some labeling of the rows aT : X → R and columns bT : Y → R
by real numbers; see Figure 2b. In particular, every rectangle is a triangle. Call such F-DAGs simply

                       T HEORY OF C OMPUTING, Volume 16 (13), 2020, pp. 1–30                               6
                        M ONOTONE C IRCUIT L OWER B OUNDS FROM R ESOLUTION

triangle-DAGs. For a search problem S ⊆ X × Y × O we define

                        tri-DAG(S) := least size of a triangle-DAG that solves S.

     Hrubeš and Pudlák [27] showed recently that the monotone real circuit complexity of an f equals
tri-DAG(S f ). Monotone real circuits [24, 38] generalize monotone circuits by allowing the wires to carry
arbitrary real numbers and the binary gates to compute arbitrary monotone functions R × R → R. The
original motivation to study such circuits, and what interests us here, is that lower bounds for monotone
real circuits imply lower bounds for the Cutting Planes proof system [12]. In our language, semantic
Cutting Planes refutations are equivalent to L-DAGs solving CNF search problems, where L is the family
of linear threshold functions (each f ∈ L is defined by some (n + 1)-tuple a ∈ Rn+1 so that f (x) = 1 iff
∑i∈[n] ai xi > an+1 ).

Our second theorem states that Theorem 3.1 holds more generally with rectangle-DAGs replaced with
triangle-DAGs. The proof is however more involved than the proof for Theorem 3.1.

Theorem 3.3. Let m = m(n) := n∆ for a large enough constant ∆. For any S ⊆ {0, 1}n × O,

                                      tri-DAG(S ◦ I NDnm ) = nΘ(w(S)) .

    A pithy corollary is that if we start with any k-CNF contradiction F that is hard for Resolution and
compose F with a gadget (as described in Section 8), the formula becomes hard for Cutting Planes. In
particular, the composed formula can itself be written as a 2k-CNF.

Corollary 3.4. For any unsatisfiable k-CNF F on n variables, there is a related unsatisfiable 2k-CNF F 0
on nO(1) variables, such that any Cutting Planes refutation for F 0 has length at least nΩ(w(SF )) .

    The follow-up article [18] observed a near-immediate corollary: the Nullstellensatz proof system
(over any field) can be exponentially more powerful than Cutting Planes.

Corollary 3.5 ([18, §4.2]). There exists an n-variable, nO(1) -clause CNF contradiction F that can be
refuted by Nullstellensatz (over any field) in degree O(log n), but that requires Cutting Planes refutations
            Ω(1)
of length 2n .

    Previously, only few examples of hard contradictions were known for Cutting Planes, all proved via
feasible interpolation [38, 24, 26, 14]. A widely-asked question has been to improve this state-of-the-art by
developing alternative lower bound methods; see the surveys [6, §4] and [49, §5]. In particular, Jukna [31,
Research Problem 19.17] asked to find a more intuitive “combinatorial” proof method “explicitly showing
what properties of [contradictions] force long derivations.” While our method does implicitly use feasible
interpolation for Cutting Planes, at least it does afford a simple combinatorial intuition: the hardness is
simply borrowed from the realm of Resolution (where we understand very well what makes formulas
hard).

                       T HEORY OF C OMPUTING, Volume 16 (13), 2020, pp. 1–30                               7
                  A NKIT G ARG , M IKA G Ö ÖS , P RITISH K AMATH , AND D MITRY S OKOLOV

4      Subcubes from rectangles
In this section, as preparation, we recall some technical notions from [19, 22] concerning the index gadget
g := I NDm . Specifically, writing G := gn : [m]n × {0, 1}mn → {0, 1}n for n copies of g, we explain how
large rectangles in the domain of G are related with large subcubes in the codomain of G. In what follows,
we will always assume that m ≥ n∆ for a sufficiently large constant ∆.

4.1     Structured rectangles
For a partial assignment ρ ∈ {0, 1, ∗}n we let free ρ := ρ −1 (∗) denote its free coordinates, and fix ρ :=
[n] r free ρ denote its fixed coordinates. The number of fixed coordinates |fix ρ| is the width of ρ. Width-d
partial assignments are naturally in 1-to-1 correspondence with width-d conjunctions: for any ρ we define
Cρ : {0, 1}n → {0, 1} as the width-|fix ρ| conjunction that accepts an x ∈ {0, 1}n iff x is consistent with ρ.
Thus Cρ−1 (1) = {x ∈ {0, 1}n : xi = ρi for all i ∈ fix ρ} is a subcube. We say that R ⊆ [m]n × {0, 1}mn is
ρ-like if the image of R under G is precisely the subcube of n-bit strings consistent with ρ, that is,

                                  R is ρ-like    ⇐⇒       G(R) = Cρ−1 (1).

    For a random variable x we let H∞ (xx) := minx log(1/Pr[xx = x]) denote the usual min-entropy of x .
When x ∈ [m]J for some index set J, we write xI ∈ [m]I for the marginal distribution of x on a subset
I ⊆ J of coordinates. For a set X we use the boldface X to denote a random variable uniformly distributed
over X.

Definition 4.1 ([19]). A random variable x ∈ [m]J is δ -dense if for every nonempty I ⊆ J, x I has
min-entropy rate ≥ δ , that is, H∞ (xxI ) ≥ δ · |I| log m.

Definition 4.2 ([17, 22]). A rectangle R := X ×Y ⊆ [m]n × {0, 1}mn is ρ-structured if

    1. X fix ρ is fixed, and every z ∈ G(R) is consistent with ρ, that is, G(R) ⊆ Cρ−1 (1);
    2. X free ρ is 0.9-dense;
    3. Y is large enough: H∞ (Y  Y ) ≥ mn − n3 .

Lemma 4.3 ([17, 22]). For m ≥ n∆ , every ρ-structured rectangle is ρ-like.

    In this article we need a slight strengthening of Lemma 4.3: for a ρ-structured R, there is a single row
of R that is already ρ-like. The proof of the following lemma is defered to Section 9.

Lemma 4.4. Let X ×Y be ρ-structured. For m ≥ n∆ , there exists x ∈ X such that {x} ×Y is ρ-like.

      We remark that the only reason why our proofs require m ≥ n∆ is due to Lemma 4.4.

4.2     Rectangle partition scheme
We claim that, given any rectangle R := X ×Y ⊆ [m]n × {0, 1}mn , we can partition most of X ×Y into
ρ-structured subrectangles with |fix ρ| bounded in terms of the size of X × Y . Indeed, we describe a
simple 2-round partitioning scheme from [22] below; see also Figure 3. In the 1st round of the algorithm,

                        T HEORY OF C OMPUTING, Volume 16 (13), 2020, pp. 1–30                               8
                         M ONOTONE C IRCUIT L OWER B OUNDS FROM R ESOLUTION
                                Y                                                        Yerr




              X


                                                          Xerr
                                (a)                                           (b)
Figure 3: (a) Rectangle Scheme partitions R = X ×Y first along rows, then along columns. (b) Lemma 4.5
illustrated: most subrectangles are ρ-structured for low-width ρ, except some error parts (highlighted in
figure) that are contained in few error rows/columns Xerr , Yerr .

Rectangle Scheme
Input: R = X ×Y ⊆ [m]n × {0, 1}mn .
Output: A partition of R into subrectangles.
 1: 1st round: Iterate the following for i = 1, 2, . . . , until X becomes empty:
       (i) Let Ii ⊆ [n] be a maximal subset (possibly Ii = 0)
                                                            / such that X Ii has min-entropy rate < 0.95,
                                                                X Ii = αi ] > m−0.95|Ii |
           and let αi ∈ [m]Ii be an outcome witnessing this: Pr[X
      (ii) Define X i := {x ∈ X : xIi = αi }
     (iii) Update X ← X r X i
 2: 2nd round: For each part X i and γ ∈ {0, 1}Ii , define Y i,γ := {y ∈ Y : gIi (αi , yIi ) = γ}
              i,γ
 3: return    R := X i ×Y i,γ : Y i,γ 6= 0/


we partition the rows as X = i X i where each X i will be fixed on some blocks Ii ⊆ [n] and 0.95-dense on
                               F

the remaining blocks [n] r Ii . In the 2nd round, each X i ×Y is further partitioned along columns so as to
fix the outputs of the gadgets on coordinates Ii .
    All the properties of Rectangle Scheme that we will subsequently need are formalized below; see
also Figure 3. For terminology, given a subset A0 ⊆ A we define its density (inside A) as |A0 |/|A|. The
proof of the following lemma is postponed to Section 7.

Lemma 4.5 (Rectangle Lemma). Fix any parameter k ≤ n log n. Given a rectangle R ⊆ [m]n ×{0, 1}mn , let
R = i Ri be the output of Rectangle Scheme. Then there exist “error” sets Xerr ⊆ [m]n and Yerr ⊆ {0, 1}mn ,
    F

both of density ≤ 2−k , such that for each i, one of the following holds:

   • Structured case: Ri is ρ i -structured for some ρ i of width at most O(k/ log n).
   • Error case: Ri is covered by error rows/columns, i. e., Ri ⊆ Xerr × {0, 1}mn ∪ [m]n ×Yerr .

Finally, a query alignment property holds: for every x ∈ [m]n r Xerr , there exists a subset Ix ⊆ [n] with
|Ix | ≤ O(k/ log n) such that every “structured” Ri intersecting {x} × {0, 1}mn has fix ρ i ⊆ Ix .

                         T HEORY OF C OMPUTING, Volume 16 (13), 2020, pp. 1–30                           9
                  A NKIT G ARG , M IKA G Ö ÖS , P RITISH K AMATH , AND D MITRY S OKOLOV

5     Lifting for rectangle-DAGs
In this section we prove the nontrivial direction of Theorem 3.1: Let Π be a rectangle-DAG solving S ◦ G
of size nd for some d. Our goal is to show that w(S) ≤ O(d).

5.1     Game semantics for DAGs
For convenience (and fun), we use the language of two-player competitive games, introduced in [39, 4],
which provide an alternative way of thinking about conjunction-DAGs solving S ⊆ {0, 1}n × O. The game
involves two competing players, Explorer and Adversary, and proceeds in rounds. The state of the game
in each round is modeled as a partial assignment ρ ∈ {0, 1, ∗}n . At the start of the game, ρ := ∗n . In each
round, Explorer makes one of two moves:
     − Query a variable: Explorer specifies an i ∈ free ρ, and Adversary responds with a bit b ∈ {0, 1}.
       The state ρ is updated by ρi ← b.
     − Forget a variable: Explorer specifies an i ∈ fix ρ, and the state is updated by ρi ← ∗.
An important detail is that Adversary is allowed to choose b ∈ {0, 1} afresh even if the i-th variable was
queried and subsequently forgotten during past play. The game ends when a solution to S can be inferred
from ρ, that is, when Cρ−1 (1) ⊆ S−1 (o) for some o ∈ O.
    Explorer’s goal is to end the game while keeping the width of the game state ρ as small as possible.
Indeed, Atserias and Dalmau [4] prove that w(S) is characterized (up to an additive ±1) as the least w
such that the Explorer has a strategy for ending the game that keeps the width of the game state at most w
throughout the game. (A similar characterization exists for DAG size [39].) Hence our goal becomes to
describe an Explorer-strategy for S such that the width of the game state never exceeds O(d) regardless
of how the Adversary plays.

5.2     Simplified proof
To explain the basic idea, we first give a simplified version of the proof: We assume that all rectangles R
involved in Π—call them the original rectangles—can be partitioned errorlessly into ρ-structured
subrectangles for ρ of width O(d). That is, invoking Rectangle Scheme for each original R, we assume
that
    (∗∗) Assumption: All subrectangles in the partition R =
                                                              F   i
                                                               i R output by Rectangle Scheme satisfy the
         “structured” case of Lemma 4.5 for k := 2d log n.
In Section 5.3 we remove this assumption by explaining how the proof can be modified to work in the
presence of some error rows/columns.

Overview. We extract a width-O(d) Explorer-strategy for S by walking down the rectangle-DAG Π,
starting at the root. For each original rectangle R that is reached in the walk, we maintain a ρ-structured
subrectangle R0 ⊆ R chosen from the partition of R. Note that ρ will have width O(d) by our choice of k.
The intention is that ρ will record the current state of the game. There are three issues to address: (1)
Why is the starting condition of the game met? (2) How do we take a step from a node of Π to one of its
children? (3) Why are we done once we reach a leaf?

                       T HEORY OF C OMPUTING, Volume 16 (13), 2020, pp. 1–30                              10
                          M ONOTONE C IRCUIT L OWER B OUNDS FROM R ESOLUTION

(1) Root case. At start, the root of Π is associated with the original rectangle R = [m]n × {0, 1}mn
comprising the whole domain. The partition of R computed by Rectangle Scheme is trivial: it contains a
single part, the ∗n -structured R itself. Hence we simply maintain the ∗n -structured R ⊆ R, which meets
the starting condition for the game.


(2) Internal step. This is the crux of the argument. Supposing the game has reached state ρR0 and we
are maintaining some ρR0 -structured subrectangle R0 ⊆ R where R is associated with an internal node v,
we want to move to some ρL0 -structured subrectangle L0 ⊆ L where L is associated with a child of v. We
must keep the width of the game state at most O(d) during this move.


                                                                                          L1
                    L0                                            Y0

                               R0
                                        (x∗ , y∗ )                                X0
                          x∗




     Since R0 =: X 0 ×Y 0 is ρR0 -structured, we have from Lemma 4.4 that there exists some x∗ ∈ X 0 such
that {x∗ } × Y 0 is ρR0 -like. Let the two original rectangles associated with the children of v be L0 and
L1 . Let i Lbi be the partition of Lb output by Rectangle Scheme. By query alignment in Lemma 4.5,
          F

there is some Ib∗ ⊆ [n], |Ib∗ | ≤ O(d), such that all Lbi that intersect the x∗ -th row are ρ i -structured with
fix ρ i ⊆ Ib∗ . As Explorer, we now query the input variables in coordinates J := (I0∗ ∪ I1∗ ) r fix ρR0 (in any
order) obtaining some response string zJ ∈ {0, 1}J from the Adversary. As a result, the state of the game
becomes the extension of ρR0 by zJ , call it ρ ∗ , which has width |fix ρ ∗ | = |fix ρR0 ∪ J| ≤ O(d).
     Note that there is some y∗ ∈ Y 0 (and hence (x∗ , y∗ ) ∈ R0 ⊆ L0 ∪ L1 ) such that G(x∗ , y∗ ) is consistent
with ρ ∗ ; indeed, the whole row {x∗ } ×Y 0 is ρR0 -like and ρ ∗ extends ρR0 . Suppose (x∗ , y∗ ) ∈ L0 ; the case
of L1 is analogous. In the partition of L0 , let L0 be the unique part such that (x∗ , y∗ ) ∈ L0 . Note that L0 is
ρL0 -like for some ρL0 that is consistent with G(x∗ , y∗ ) and fix ρL0 ⊆ I0∗ (by query alignment). Hence ρ ∗
extends ρL0 . As Explorer, we now forget all queried variables in ρ ∗ except those queried in ρL0 .
     We have recovered our invariant: the game state is ρL0 and we maintain a ρL0 -structured subrectangle
  0
L of an original rectangle L0 . Moreover, the width of the game state remained O(d).


(3) Leaf case. Suppose the game state is ρ and we are maintaining an associated ρ-structured subrect-
angle R0 ⊆ R corresponding to a leaf node. The leaf node is labeled with some solution o ∈ O satisfying
R0 ⊆ (S ◦ G)−1 (o), that is, G(R0 ) ⊆ S−1 (o). But G(R0 ) = Cρ−1 (1) by Lemma 4.3 so that Cρ−1 (1) ⊆ S−1 (o).
Therefore the game ends. This concludes the (simplified) proof.

                         T HEORY OF C OMPUTING, Volume 16 (13), 2020, pp. 1–30                                 11
                  A NKIT G ARG , M IKA G Ö ÖS , P RITISH K AMATH , AND D MITRY S OKOLOV

5.3    Accounting for error
Next, we explain how to get rid of Assumption (∗∗) by accounting for the rows and columns that are
classified as error in Lemma 4.5 for k := 2d log n. The partitioning of rectangles in Π is done more
carefully. We sort all original rectangles in reverse topological order R1 , R2 , . . . , Rnd from leaves to root,
that is, if Ri is a descendant of R j then Ri comes before R j in the order. Then we perform the following
process on the rectangles in this order.
                                  ∗ = Y ∗ := 0.
Initialize cumulative error sets Xerr        / Iterate for i = 1, 2, . . . , nd rounds:
                                       err
                                        ∗ , Y ∗ . That is, update
    1. Remove from Ri the rows/columns Xerr  err

                                                ∗                        ∗
                                                    × {0, 1}mn ∪ [m]n ×Yerr
                                                                            
                                     Ri ← Ri r Xerr                          .

    2. Apply the Rectangle Scheme for Ri . Output all resulting subrectangles that satisfy the “structured”
       case of Lemma 4.5 for k := 2d log n. (All non-structured subrectangles are omitted). Call the
       resulting error rows/columns Xerr and Yerr .
               ∗ ← X ∗ ∪ X and Y ∗ ← Y ∗ ∪Y .
    3. Update Xerr  err   err   err   err  err

In words, an original rectangle Ri is processed only after all of its descendants are partitioned. Each
descendant may contribute some error rows/columns, accumulated into sets Xerr       ∗ , Y ∗ , which are deleted
                                                                                         err
from Ri before it is partitioned. The partitioning of Ri will in turn contribute its error rows/columns to its
ancestors.
    We may now repeat the proof of Section 5.2 verbatim using only the structured subrectangles output
by the above process. That is, we still maintain the same invariant: when the game state is ρ, we maintain
a ρ-structured R0 (output by the above process) of an original R. We highlight only the key points below.

(1) Root case. The cumulative error at the end of the process is tiny: Xerr     ∗ , Y ∗ have density at most
                                                                                     err
nd · n−2d ≤ 1% by a union bound over all rounds. In particular, the root rectangle Rnd (with errors
removed) still has density 98% inside [m]n × {0, 1}mn , and so the partition output by Rectangle Scheme is
trivial, containing only the ∗n -structured Rnd itself. This meets the starting condition for the game.

(2) Internal step. By construction, the cumulative error sets shrink when we take a step from a node
to one of its children. This means that our error handling does not interfere with the internal step: each
structured subrectangle R0 of an original rectangle R is wholly covered by the structured subrectangles of
the children of R.

(3) Leaf case. This case is unchanged.


6     Lifting for triangle-DAGs
In this section we prove the nontrivial direction of Theorem 3.3. Let Π be a triangle-DAG solving S ◦ G of
size nd for some d. Our goal is to show that w(S) ≤ O(d).

                        T HEORY OF C OMPUTING, Volume 16 (13), 2020, pp. 1–30                                  12
                         M ONOTONE C IRCUIT L OWER B OUNDS FROM R ESOLUTION



                                Li               T ∩ Ri



                                                                            Ri
Figure 4: Structured case of Lemma 6.1: The subtriangle T ∩ Ri is sandwiched between two ρ i -structured
rectangles Li and Ri .


    The proof is conceptually the same as for rectangle-DAGs. The only difference is that we need to
replace Rectangle Scheme (and the associated Lemma 4.5) with an algorithm that partitions a given
triangle T ⊆ [m]n × {0, 1}mn into subtriangles that behave like conjunctions.

6.1     Triangle partition scheme
We introduce a triangle partitioning algorithm, Triangle Scheme. Its precise definition is postponed to
Section 7.2. For now, we only need its high-level description. On input a triangle T , Triangle Scheme
outputs a disjoint cover i Ri ⊇ T where Ri are rectangles. This induces a partition of T into subtriangles
                         F

T ∩ Ri . Each (non-error) rectangle Ri is ρ i -structured (for low-width ρ i ) and is associated with a ρ i -
structured “inner” subrectangle Li ⊆ Ri satisfying Li ⊆ T ∩ Ri ⊆ Ri ; see Figure 4. Hence T ∩ Ri is ρ i -like,
as it is sandwiched between two ρ i -like rectangles.
     More formally, all the properties of Triangle Scheme that we will subsequently need are formalized
below (note the similarity with Lemma 4.5); see Section 7.4 for the proof.

Lemma 6.1 (Triangle Lemma). Fix any parameter k ≤ n log n. Given a triangle T ⊆ [m]n × {0, 1}mn , let
F i                                                                                n                 mn
 i R ⊇ T be the output of Triangle Scheme. Then there exist “error” sets Xerr ⊆ [m] and Yerr ⊆ {0, 1} ,
                   −k
both of density ≤ 2 , such that for each i, one of the following holds.

      • Structured case: Ri is ρ i -structured for some ρ i of width at most O(k/ log n). Moreover, there
        exists an “inner” rectangle Li ⊆ T ∩ Ri such that Li is also ρ i -structured.
      • Error case: Ri is covered by error rows/columns, i. e., Ri ⊆ Xerr × {0, 1}mn ∪ [m]n ×Yerr .

Finally, a query alignment property holds: for every x ∈ [m]n r Xerr , there exists a subset Ix ⊆ [n] with
|Ix | ≤ O(k/ log n) such that every “structured” Ri intersecting {x} × {0, 1}mn has fix ρ i ⊆ Ix .

6.2     Simplified proof
As in the rectangle case, we give a simplified proof assuming no errors. That is, invoking Triangle Scheme
for each triangle T involved in Π, we assume that
                                                     i
                                                  i R ⊇ T output by Triangle Scheme satisfy the “structured”
                                                 F
  (††) Assumption: All rectangles in the cover
       case of Lemma 6.1 for k := 2d log n.

                        T HEORY OF C OMPUTING, Volume 16 (13), 2020, pp. 1–30                              13
                  A NKIT G ARG , M IKA G Ö ÖS , P RITISH K AMATH , AND D MITRY S OKOLOV

The argument for getting rid of the assumption (††) is the same as in the rectangle case, and hence we
omit that step—one only needs to observe that removing cumulative error rows/columns from a triangle
still leaves us with a triangle.


Overview. As before, we extract a width-O(d) Explorer-strategy for S by walking down the triangle-
DAG Π, starting at the root. For each triangle T of Π that is reached in the walk, we maintain a ρ-structured
inner rectangle L ⊆ T . Here ρ (of width O(d) by the choice of k) will record the current state of the
game. There are the three steps (1)–(3) to address, of which (1) and (3) remain exactly the same as in the
rectangle case. So we only explain step (2), which requires us to replace the use of Lemma 4.5 with the
new Lemma 6.1.


(2) Internal step. Supposing the game has reached state ρL and we are maintaining some ρL -structured
inner rectangle L ⊆ T associated with an internal node v, we want to move to some ρLe -structured inner
rectangle L  e ⊆ Te associated with a child of v. Moreover, we must keep the width of the game state at most
O(d) during this move.
     Since L =: X 0 ×Y 0 is ρL -structured, we have from Lemma 4.4 that there exists some x∗ ∈ X 0 such
that {x∗ } × Y 0 is ρL -like. Let the two triangles associated with the children of v be T0 and T1 , so that
L ⊆ T0 ∪ T1 .
     Let i Rib be the rectangle cover of Tb output by Triangle Scheme. By query alignment in Lemma 6.1,
         F

there is some Ib∗ ⊆ [n], |Ib∗ | ≤ O(d), such that all Rib that intersect the x∗ -th row are ρ i -structured with
fix ρ i ⊆ Ib∗ . As Explorer, we now query the input variables in coordinates J := (I0∗ ∪ I1∗ ) r fix ρL (in any
order) obtaining some response string zJ ∈ {0, 1}J from the Adversary. As a result, the state of the game
becomes the extension of ρL by zJ , call it ρ ∗ , which has width |fix ρ ∗ | = |fix ρL ∪ J| ≤ O(d).
     Note that there is some y∗ ∈ Y 0 (and hence (x∗ , y∗ ) ∈ L ⊆ T0 ∪ T1 ) such that G(x∗ , y∗ ) is consistent
with ρ ∗ ; indeed, the whole row {x∗ } ×Y 0 is ρL -like and ρ ∗ extends ρL . Suppose (x∗ , y∗ ) ∈ T0 ; the case of
T1 is analogous. In the rectangle covering of T0 , let R be the unique part such that (x∗ , y∗ ) ∈ R. Note that
R is ρR -like for some ρR that is consistent with G(x∗ , y∗ ) and fix ρR ⊆ I0∗ (by query alignment). Hence ρ ∗
extends ρR . As Explorer, we now forget all queried variables in ρ ∗ except those queried in ρR . Also we
move to the inner rectangle L   e ⊆ R promised by Lemma 6.1 that satisfies L  e ⊆ T0 and is ρe = ρR structured.
                                                                                              L
     We have recovered our invariant: the game state is ρLe and we maintain a ρLe -structured subrectangle L     e
of a triangle T0 . Moreover, the width of the game state remained O(d).



7    Partitioning rectangles and triangles
In this section, we prove Lemma 4.5, define Triangle Scheme, and prove Lemma 6.1. We use repeatedly
the following simple fact about min-entropy.

                                                              X | E) ≥ H∞ (X
Fact 7.1. Let X be a random variable and E an event. Then H∞ (X            X ) − log 1/Pr[E].


                        T HEORY OF C OMPUTING, Volume 16 (13), 2020, pp. 1–30                                  14
                          M ONOTONE C IRCUIT L OWER B OUNDS FROM R ESOLUTION

7.1    Proof of Rectangle Lemma
The proof is more-or-less implicit in [19, 22]. We start by recording a key property of the 1st round of
Rectangle Scheme.
Claim 7.2. Each part X i obtained in 1st round of Rectangle Scheme satisfies the following conditions.
   − Blockwise-density: X i[n]rIi is 0.95-dense.
   − Relative size: |X ⩾i | ≤ mn−0.05|Ii | where X ⩾i := j≥i X j .
                                                        S

Proof. By definition, X i = (X X ⩾i | X ⩾i                                          i
                                        Ii = αi ). Suppose for contradiction that X [n]rIi is not 0.95-dense.
Then there is some nonempty subset K ⊆ [n] r Ii and an outcome β ∈ [m]K violating the min-entropy
condition, namely Pr[X X iK = β ] > m−0.95|K| . But this contradicts the maximality of Ii since the larger set
Ii ∪ K now violates the min-entropy condition for X ⩾i :
           X ⩾i
        Pr[X                     X ⩾i
             Ii ∪K = αi β ] = Pr[X                X iK = β ] > m−0.95|Ii | · m−0.95|K| = m−0.95(|Ii ∪K|) .
                                   Ii = αi ] · Pr[X

This shows the first property. For the second property, apply Theorem 7.1 for X i = (X    X ⩾i | X ⩾i
                                                                                                   Ii = αi )
                      i           ⩾i                                           i
to find that H∞ (X ) ≥ H∞ (X ) − 0.95|Ii | log m. On the other hand, since
                   X            X                                            X   is fixed on Ii , we have
H∞ (X                                                                       X ⩾i ) ≤ (n − 0.05|Ii |) log m,
     X i ) ≤ (n − |Ii |) log m. Combining these two inequalities we get H∞ (X
which yields the second property.
                                                                                                              2
Proof of Lemma 4.5. Identifying Yerr , Xerr . We define Yerr := i,γ Y i,γ subject to |Y i,γ | < 2mn−n . To
                                                                         S

bound the size of Yerr , we claim that there are at most (4m)n possible choices of i, γ. Indeed, each X i is
associated with a unique pair (Ii ⊆ [n], αi ∈ [m]Ii ), and there are at most 2n choices of Ii and at most mn
choices of corresponding αi . Also, for each X i , there are at most 2n possible assignments to γ ∈ {0, 1}Ii .
                                        2                                                                2
For each i, γ, we add at most 2mn−n columns to Yerr . Thus, Yerr has density at most (4m)n · 2−n < 2−k
inside {0, 1}mn .
    We define Xerr := i X i subject to |Ii | > 20k/ log m. Let i be the least index with |Ii | > 20k/ log m so
                        F

that Xerr ⊆ X ⩾i . By Claim 7.2, |X ⩾i | ≤ mn−0.05|Ii | < mn · 2−k since |Ii | > 20k/ log m. In other words, X ⩾i ,
and hence Xerr , has density at most 2−k inside [m]n .
Structured vs. error. Let Ri,γ := X i ×Y i,γ , where Xi is associated with (Ii , αi ), be a rectangle not contained
                                                                                             2
in the error rows/columns. By definition of Xerr , Yerr , this means |Y i,γ | ≥ 2mn−n (so that H∞ (Y        Y i,γ ) ≥
         2                                                                i
mn − n ) and |Ii | ≤ 20k/ log m. We have from Claim 7.2 that X [n]rIi is 0.95-dense. Hence, Ri,γ is
ρ i -structured where ρ i equals γ on Ii and consists of stars otherwise.
Query alignment. For each x ∈ [m]n r Xerr , we define Ix = Ii where X i is the unique part that contains x. It
follows that any ρ-structured rectangle that intersects the x-th row is of the form X i ×Y i,γ and hence has
fix ρ = Ii . Since X i 6⊆ Xerr , we have |Ii | ≤ O(k/ log n).

7.2    Definition of Triangle Scheme
In the description of Triangle Scheme, we denote projections of a set S ⊆ [m]n × {0, 1}mn by
                            X S := {x ∈ [m]n : ∃y ∈ {0, 1}mn such that (x, y) ∈ S} ,
                            Y S := {y ∈ {0, 1}mn : ∃x ∈ [m]n such that (x, y) ∈ S} .


                        T HEORY OF C OMPUTING, Volume 16 (13), 2020, pp. 1–30                                     15
                   A NKIT G ARG , M IKA G Ö ÖS , P RITISH K AMATH , AND D MITRY S OKOLOV

Triangle Scheme
Input: Triangle T ⊆ [m]n × {0, 1}mn with labeling functions (aT , bT )
Output: A disjoint rectangle cover i Ri ⊇ T
                                  F


 1: Yerr ← Column Cleanup on T
 2: Initialize R0alive := {[m]n × ({0, 1}mn rYerr )};     Rralive := 0/ for all r ≥ 1; Rfinal := 0/
 3: loop for r = 0, 1, 2, . . . , rounds until Rralive is empty:
 4:     for all R ∈ Rralive do
            F i
 5:           i R ← Rectangle Scheme on R relative to free coordinates
 6:         for all parts Ri do
                            i       i
 7:              if |X T ∩R | ≥ |X R |/2 then
 8:                   Add Ri to Rfinal
 9:              else
10:                   Ri,top := top half of Ri according to aT (in particular T ∩ Ri ⊆ Ri,top )
11:                   Add Ri,top to Rr+1
                                       alive subject to T ∩ R
                                                             i,top 6= 0
                                                                      /
12: return Rfinal ∪ {[m]n ×Yerr }




Overview. Triangle Scheme computes a disjoint rectangle cover i Ri of T . Starting with a trivial
                                                                              F

cover of the whole communication domain by a single part, the algorithm progressively refines this cover
over several rounds as guided by the input triangle T . As outlined in Section 6.1, the goal is to end
up with ρ-structured rectangles Ri that contain a large enough portion of T so that we may sandwich
Li ⊆ T ∩ Ri ⊆ Ri where Li is a ρ-structured “inner” rectangle.
    The main idea is as follows. The algorithm maintains a pool of alive rectangles. In a single round,
for each alive rectangle R, we first invoke Rectangle Scheme in order to restore ρ-structuredness for the
resulting subrectangles Ri . Then for each Ri we check if the subtriangle T ∩ Ri occupies at least half the
rows of Ri . If yes, we add it to the final pool, which will eventually form the output of the algorithm.
If no, we discard the “lower” half of Ri as determined by the labeling aT , that is, the half that does not
intersect T . The “top” half (containing T ∩ Ri ) will enter the alive pool for next round.



Column Cleanup. An important detail is the subroutine Column Cleanup, run at the start of Triangle
Scheme, which computes a small set of columns that will eventually be declared as Yerr . By discarding
the columns Yerr , we ensure that whatever subrectangle Ri is returned by Rectangle Scheme, the rows
                                                                      i
of T ∩ Ri will satisfy an empty-or-heavy dichotomy: for every x ∈ X R , the x-th row of T ∩ Ri is either
                                                    2
empty, or “heavy”, that is, of size at least 2mn−n . For intuition, an extreme bad example we want
to avoid is a triangle T that is just a single column; such T would be completely declared as “error”
by Column Cleanup. Having many heavy rows helps towards satisfying the 3rd item in Theorem 4.2
of ρ-stucturedness, and hence in finding the inner rectangle Li . This property of Column Cleanup is
formalized in Claim 7.3 below.

                        T HEORY OF C OMPUTING, Volume 16 (13), 2020, pp. 1–30                           16
                           M ONOTONE C IRCUIT L OWER B OUNDS FROM R ESOLUTION

Column Clean-up
Input: Triangle T ⊆ [m]n × {0, 1}mn with labeling functions (aT , bT )
Output: Error columns Yerr ⊆ {0, 1}mn

 1: Yerr ← 0/
 2: For I ⊆ [n], α ∈ [m]I , γ ∈ {0, 1}I , define YI,α,γ :=
                                                        
                                                          y ∈ {0, 1}mn : gI (α, yI ) = γ
                                                                                       2
 3: while there exists I, α, γ, x such that 0 < |T ∩ ({x} × (YI,α,γ rYerr ))| < 2mn−n do
 4:    Yerr ← Yerr ∪Y T ∩({x}×YI,α,γ )
 5: return Yerr


Free coordinates. Another detail to explain is the underlined phrase relative to free coordinates. For
each alive rectangle R we tacitly associate a subset of free coordinates JR ⊆ [n] and fixed coordinates
[n] r JR . At start, the single alive rectangle has JR := [n], and whenever we invoke Rectangle Scheme for
a rectangle R relative to free coordinates, the understanding is that in line (i) of Rectangle Scheme, the
choice of Ii is made among subsets of JR alone. The resulting subrectangle Ri = X i ×Y i , obtained by
fixing the coordinates Ii in X i , will have its free coordinates JRi := JR r Ii . (Restricting a rectangle to its
top half on line 10 does not modify the free coordinates.)

7.3   Properties of Triangle Scheme
Claim 7.3. For a triangle T ⊆ [m]n × {0, 1}mn , let Yerr be the output of Column Cleanup. Then Yerr has
the following properties.
   − Empty-or-heavy: For every triple (I ⊆ [n], α ∈ [m]I , γ ∈ {0, 1}I ), and every x ∈ [m]n , it holds that
                                                                           2
     T ∩ ({x} × (YI,α,γ rYerr )) is either empty or has size at least 2mn−n .
                                     2
   − Size bound: |Yerr | ≤ 2mn−Ω(n ) .
Proof. The first property is immediate by definition of Column Cleanup. For the second property,
                                         2
in each while-iteration, at most 2mn−n columns get added to Yerr . Moreover, there are no more than
2n · mn · 2n · mn = (2m)2n choices of I ⊆ [n], α ∈ [m]I , γ ∈ {0, 1}I and x ∈ [m]n , and the loop executes at
                                                                       2             2
most once for each choice of I, α, γ, x. Thus, |Yerr | ≤ (2m)2n · 2mn−n ≤ 2mn−Ω(n ) .

   Next, we list some key invariants that hold for Triangle Scheme.
Lemma 7.4. For every r ≥ 0, there exists a partition Xr := X i i of [m]n satisfying the following.
                                                            

   (P1) For every R ∈ Rralive we have X R ∈ Xr .
   (P2) Each X i ∈ Xr is labeled by a pair (Ii ⊆ [n], αi ∈ [m]Ii ) such that X iIi = αi is fixed.
   (P3) The partition Xr+1 is a refinement of Xr . The labels respect this: if X j ∈ Xr+1 is a subset of
        X i ∈ Xr , then I j ⊇ Ii and α j agrees with αi on coordinates Ii .
                       ∗
Moreover, let X := Xr be the final partition assuming Triangle Scheme completes in r∗ rounds.
   (P4) For every R ∈ Rfinal the row set X R is a union of parts of X. If X i ∈ X, labeled (Ii , αi ), is such
        that X R ⊇ X i , then the fixed coordinates of R are a subset of Ii .

                        T HEORY OF C OMPUTING, Volume 16 (13), 2020, pp. 1–30                                  17
                   A NKIT G ARG , M IKA G Ö ÖS , P RITISH K AMATH , AND D MITRY S OKOLOV

   (P5) For every r ≥ 0, Xr and X agree on a fraction ≥ 1 − 2−r of rows, that is, there is a subset of
       “final” parts Xrfinal ⊆ Xr such that Xrfinal has density ≥ 1 − 2−r inside [m]n , and Xrfinal ⊆ X.
                                           S


Proof. Let us define the row partitions Xr . The partition X1 contains only a single part, [m]n , labeled
          / Supposing Xr has been defined, the next partition Xr+1 is obtained by refining each old part
by I1 := 0.
X i ∈ Xr . Consider one such old part X i ∈ Xr with label (Ii , αi ). If there is no rectangle R ∈ Rralive with
X R = X i then we need not partition X i any further; we simply include X i in Xr+1 as a whole. Otherwise,
let R ∈ Rralive be any rectangle such that X R = X i ; we emphasize that there can be many such choices for
R, but the upcoming refinement of X i will not depend on that choice. The r-th round of the algorithm
first computes R = i Ri using Rectangle Scheme, and then each Ri might be horizontally split in half.
                      F

We interpret this as a refinement of X i according to the 1st round of Rectangle Scheme on R (which
only depends on X R = X i ), with each part adding more fixed coordinates to the label (Ii , αi ). Letting
X i = j X i, j denote the resulting row partition, we then split each X i, j into two halves X i, j,top and X i, j,bot .
       F

This completes the definition of Xr+1 .
     The properties (P1)–(P5) are straightforward to verify. For (P5), we only note that when the algorithm
horizontally splits a rectangle (inducing X i, j = X i, j,top ∪ X i, j,bot ), the bottom halves are discarded, and
                                                                0
never again touched in future rounds. That is, X i, j,bot ∈ Xr for all r0 > r. This cuts the number of “alive”
rows R∈RraliveX R in half each round.
       S


Lemma 7.5 (Error rows). Let X = {X i }i be the final row partition in Lemma 7.4. Fix any parameter
k < n log n. There is a density-2−k subset Xerr ⊆ [m]n (which is a union of parts of X) such that for any
part X i 6⊆ Xerr , we have |Ii | ≤ O(k/ log n).

Proof. Our strategy is as follows (cf. [22, Lemma 7]). For x ∈ [m]n , let i(x) be the unique index such that
x ∈ X i(x) ∈ X; recall that X i(x) is labeled by some (Ii(x) , αi(x) ). We will study a uniform random x ∼ [m]n
and show that the distribution of the number of fixed coordinates |Ii(xx) | has an exponentially decaying
tail. This allows us to define Xerr as the set of outcomes of x for which |Ii(xx) | is exceptionally large. More
quantitatively, it suffices to show for a large constant C,

                                         Pr |Ii(xx) | > C · k/ log n ≤ 2−k .
                                                                   
                                                                                                            (7.1)

    Recall that X and X` , where ` := k + 1, agree on all but a fraction 2−k /2 of rows by (P5). Hence by a
union bound, it suffices to show a version of (7.1) truncated at level `:

                              Pr |Ii0 (xx) | > C0 · `/ log n ≤ 2−` (= 2−k /2),
                                                           
                                                                                                      (7.2)
                                                             0
where i0 (x) is defined as the unique index with x ∈ X i (x) ∈ X` .
     Partitions as a tree. The sequence X0 , . . . , X` , of row partitions can be visualized as a depth-` tree
where the nodes at depth r corresponds to parts of Xr , and there is an edge from X ∈ Xr to X 0 ∈ Xr+1 iff
X 0 ⊆ X. A way to generate a uniform random x ∼ [m]n is to take a random walk down this tree, starting
at the root:

   − At a non-leaf node X ∈ Xr we take a tree edge (X, X 0 ) with probability |X 0 |/|X|.
   − Once at a leaf node X ∈ X` , we output a uniformly random x ∼ X.

                         T HEORY OF C OMPUTING, Volume 16 (13), 2020, pp. 1–30                                      18
                          M ONOTONE C IRCUIT L OWER B OUNDS FROM R ESOLUTION

   Potential function. We define a nonnegative potential function on the nodes of the tree. For each part
X ∈ Xr , labeled (I ⊆ [n], α ∈ {0, 1}I ), we define

                                   D(X) := (n − |I|) log m − log |X| ≥ 0.

How does the potential change as we take a step starting at node X ∈ Xr labeled (J, α)? If X has one
child, the value of D remains unchanged. Otherwise, we move to a child of X in two substeps.
   − Substep 1: Recall that we partition X = i X i according to the 1st round of Rectangle Scheme
                                                       F

     relative to free coordinates. That is, X i is further restricted on Ii ⊆ [n] r J to some value αi ∈ [m]Ii .
     For a child X i labeled (J t Ii , α t αi ) the potential change is

                     D(X i ) − D(X) = (n − |J ∪ Ii |) log m − log |X i | − (n − |J|) log m + log |X|
                                      = log |X| − log |X i | − |Ii | log m
                                      = log(|X|/|X ⩾i |) − log(|X i |/|X ⩾i |) − |Ii | log m
                                                                  X ⩾i
                                      = log(|X|/|X ⩾i |) − log Pr[X Ii = αi ] − |Ii | log m
                                      ≤ log(|X|/|X ⩾i |) + 0.95|Ii | log m − |Ii | log m
                                      = δ (i) − 0.05|Ii | log m.                  (where δ (i) := log(|X|/|X ⩾i |))

   − Substep 2: Each X i gets split into two halves, X i,top and X i,bot . Moving to either child makes the
     potential increase by exactly 1 bit.
    In summary, when we take a step to a random child in our random walk, the overall change in
potential is itself a random variable, which is at most

                                              δ − 0.05|II | log m + 1,                                        (7.3)

where (II , · ) is the label of the random child, and δ := δ (ii) is the random variable generated by choosing
i with Pr[ii = i] = |X i |/|X|. Summing (7.3) over ` many rounds, we see that ` steps of the random
walk takes us to a node X j ∈ X` with random index j , which is labeled (I j , α j ), and which satisfies
D(X j ) ≤ ∑r∈[`] (δδ r + 1) − 0.05|I j | log m where δ r is the “δδ ” variable corresponding to the r-th step. Since
the potential is nonnegative, we get that
                                                       20
                                          |I j | ≤        · ∑ (δδ r + 1).                                     (7.4)
                                                     log m r∈[`]

Bounding this quantity is awkward since, in general, the variables δ r are not mutually independent.
However, a standard trick to overcome this is to define mutually independent and identically distributed
random variables d r and couple them with δ r so that δ r ≤ d r with probability 1.
   − Definition of d r : Sample a uniform real p r ∈ [0, 1) and define d r := log(1/(1 − p r )) and couple
     with δ r such that δ r = δ (ii) where i is such that p r falls in the i -th interval, assuming we have
     partitioned [0, 1) into half-open intervals with lengths |X i |/|X| (where X 1 , X 2 , . . . are the sets from
     Substep 1) in the natural left-to-right order. Thus, δ r is correctly distributed and δ r ≤ d r holds
     with probability 1.

                        T HEORY OF C OMPUTING, Volume 16 (13), 2020, pp. 1–30                                    19
                  A NKIT G ARG , M IKA G Ö ÖS , P RITISH K AMATH , AND D MITRY S OKOLOV

Note that E[2d r /2 ] = 01 1/(1 − p)1/2 dp = 2. For a large enough constant C > 0, we calculate
                       R


                               Pr ∑r∈[`] d r > C` = Pr[2∑r∈[`] (dd r /2) > 2C`/2 ]
                                                

                                                     ≤ E[2∑r∈[`] (dd r /2) ]/2C`/2
                                                     = ∏r∈[`] E[2d r /2 ] /2C`/2
                                                                              

                                                     = 2` · 2−C`/2
                                                     ≤ 2−C`/3 .

Plugging this estimate in (7.4) (using δ r ≤ d r ) we get that Pr[|I j | > C0 · `/ log n] < 2−` for a sufficiently
large C0 . This proves (7.2) and concludes the proof of the lemma.

7.4   Proof of Triangle Lemma
Identifying Yerr , Xerr . The column error set Yerr is already defined by Triangle Scheme. Note that only one
rectangle, [m]n ×Yerr , is covered by the error columns. Claim 7.3 ensures that Yerr has density at most
      2
2−Ω(n ) < 2−k . The row error set X err is defined by Lemma 7.5 (for the given k).
Structured vs. error. Let i Ri be the output of Triangle Scheme, and consider an Ri = X i ×Y i which is
                           F

not covered by error rows/columns; in particular Ri ∈ Rfinal . Let Ii ⊆ [n] denote the fixed coordinates of
Ri such that X iIi = αi for some αi ∈ {0, 1}Ii . From Claim 7.2 we have that X i[n]rIi is 0.95-dense. From
(P4) and Lemma 7.5 we have |Ii | ≤ O(k/ log n). Moreover, we observe that Y i = YIi ,αi ,γi rYerr for some
γi ∈ {0, 1}Ii (notation from Column Cleanup) since Rectangle Scheme, and hence Triangle Scheme by
extension, only partitions columns by fixing individual gadget outputs. We have |YIi ,αi ,γi | ≥ 2mn−n by
definition, and so |Y i | ≥ 2mn−2n is large enough. Hence we conclude that Ri is ρ i -structured for ρ i that
equals γi on Ii and consists of stars otherwise.
    Next, we locate the associated inner rectangle Li ⊆ Ri . All final rectangles returned by Triangle
                               i
Scheme are such that |X (T ∩R ) | ≥ |X i |/2. That is, every top row in Ri,top has a nonempty intersection with
T . Hence the empty-vs-heavy property of Claim 7.3 says that for all x ∈ X i,top , we have |T ∩ ({x} ×Y i )| ≥
      2
2mn−n . Moreover, note that X i,top is 0.9-dense on its free coordinates [n] r Ii (we lose at most 1 bit of
min-entropy compared to X i by Theorem 7.1). We can now define Li := X i,top ×Y 0 ⊆ T ∩ Ri where Y 0 is
                                               2
the set of the first (according to bT ) 2mn−n columns of Y i ; see Figure 4. This Li meets all the conditions
for being ρ i -structured.
Query alignment. For x ∈ [m]n r Xerr , we define (Ix , αx ) as the label of the unique part i(x) such that
x ∈ X i(x) ∈ X. By Lemma 7.5, |Ix | ≤ O(k/ log n). Every ρ-structured rectangle R j := X j × Y j with
X j ⊇ X i(x) is, by (P4), such that fix ρ ⊆ Ix .


8     Translating between mKW and CNF
In this section, for exposition, we recall some known reductions between mKW and CNF search problems
(as outlined in Section 3). These reductions are generic in that they are not adapted to the special
properties of the search problem S ⊆ {0, 1}n × O one starts with. For concrete applications to natural

                        T HEORY OF C OMPUTING, Volume 16 (13), 2020, pp. 1–30                                  20
                         M ONOTONE C IRCUIT L OWER B OUNDS FROM R ESOLUTION

problems, one often needs more fine-grained reductions; for example, as mentioned in Section 3, the
follow-up article [18] has introduced a more specific framework.
    In an effort to add some new perspective to the old reductions expounded here, we continue to use the
somewhat abstract search problem–centric “top-down” language. We encourage the readers who prefer
the CNF-centric “bottom-up” language to refer to the original cited papers.

Certificates. The key property of an n-variable search problem S ⊆ {0, 1}n × O that facilitates an
efficient reduction to a mKW/CNF search problem is having a low certificate (a.k.a. nondeterministic)
complexity. A certificate for (x, o) ∈ S is a partial assignment ρ ∈ {0, 1, ∗}n such that x is consistent with
ρ and o is a valid output for every input consistent with ρ; in short, x ∈ Cρ−1 (1) ⊆ S−1 (o). A certificate
for x is a certificate for (x, o) ∈ S for some o ∈ S(x). The certificate complexity of x is the least width of
a certificate for x. The certificate complexity of S is the maximum over all x ∈ {0, 1}n of the certificate
complexity of x.
    For any search problem S one can associate a “certification” search problem Scert : on input x to S,
return a certificate for x in S. Algorithmically speaking, such an Scert is clearly at least as hard as S: if we
solve Scert by finding a certificate for (x, o) ∈ S, we can solve S by returning o.

CNF search ⇔ low certificate complexity. For any k-CNF contradiction F, the associated CNF
search problem SF has certificate complexity at most k. Conversely [36], for any total search problem
S ⊆ {0, 1}n × O, we can construct a k-CNF contradiction F, where k is the certificate complexity of S,
such that SF is a type of certification problem for S (and hence at least as hard as S). Namely, we can pick
a collection C of width-k certificates, one for each x ∈ {0, 1}n . The k-CNF formula F is then defined as
  ρ∈C ¬Cρ .
V


Gadget composition. For the purposes of query complexity, there are two ways to represent the first
argument x ∈ [m] to the index function I NDm : [m] × {0, 1}m → {0, 1} as a binary string. The simplest
is to write x as a log m-bit string. Under this convention, I NDm has certificate complexity log m + 1. If
S ⊆ {0, 1}n × O has certificate complexity k, the composed problem S ◦ I NDnm has certificate complexity
k(log m + 1) (by composing certificates). This means that if we start with a k-CNF contradiction F, we
may reduce SF ◦ I NDnm to solving SF 0 where F 0 is a k(log m + 1)-CNF contradiction over O(mn) variables.
     A better representation [5, 13], which does not blow up the certificate complexity (or CNF width), is to
write x as an m-bit string of Hamming weight 1 (the index of the unique 1-entry encodes x ∈ [m]). Under
this convention, I NDm : {0, 1}m × {0, 1}m → {0, 1} becomes a partial function of certificate complexity 2.
Hence, if S has certificate complexity k, the partial composed problem S0 := S ◦ I NDnm has certificate
complexity 2k.
     Moreover, the partial problem S0 can be extended into a total problem Stot without making it any
easier to solve for rectangle/triangle-DAGs, while still allowing for a O(log m)-depth decision tree to
find a 1-entry in a given x. Indeed, we introduce new variables/certificates in order to say that an input
(x, y) to S0 is trivially solved with output ⊥ ∈/ O, if xi = 0m for some i ∈ [n]. Specifically, Alice will
                            0
receive new input bits x ∈ ({0, 1} ) (in addition to the original x ∈ ({0, 1}m )n ) with the convention
                                      m n

that xi,1 := 0 and xi,m+1 := 1. We say that an Alice input xx0 is good if whenever the string xi0 ∈ {0, 1}m
contains the substring 01 starting at position j, then xi, j = 1. Note that since each xi0 starts with a 0

                       T HEORY OF C OMPUTING, Volume 16 (13), 2020, pp. 1–30                                 21
                  A NKIT G ARG , M IKA G Ö ÖS , P RITISH K AMATH , AND D MITRY S OKOLOV

and ends with a 1, the substring 01 must appear somewhere in xi0 . Thus when xx0 is good, each xi will
have Hamming weight at least 1. If xx0 is not good (meaning some xi0 contains a substring 01 but the
corresponding bit of xi is 0), there is a width-3 certificate witnessing this. Our total search problem
Stot ⊆ {0, 1}2mn × {0, 1}mn × (O ∪ {⊥}) is defined by all these width-3 certificates (for output ⊥) together
with all the original certificates of S0 . To see that Stot is at least as hard as S0 for rectangle/triangle-DAGs,
we note that for any input (x, y) to S0 , Alice can compute a unique x0 so that xx0 is good. Now any
output o ∈ Stot (xx0 , y) is also such that o ∈ S0 (x, y). Finally, we note that for every i ∈ [n], there is a
(log m + 1)-query decision tree that either finds some j ∈ [m] with xi, j = 1 or finds a certificate that xx0 is
not good; namely, the decision tree performs a binary search on xi0 for an occurrence of the substring 01.
(This decision tree is useful when finding upper bounds for Stot , such as for Theorem 3.4.)
     In summary, we can reduce (in the context of our DAG-like models) SF ◦ I NDnm to solving SF 0 where
F 0 is a 2k-CNF contradiction over 2mn variables.

mKW problems. A rectangle R ⊆ X × Y is monochromatic for a search problem S ⊆ X × Y × O if
R ⊆ S−1 (o) for some o ∈ O. The nondeterministic communication complexity of S is the logarithm
of the least number of monochromatic rectangles that cover the whole input domain X × Y. If S has
nondeterministic communication complexity log N, then by a standard reduction (e. g., [15, Lemma 2.3])
S reduces to S f for some monotone f : {0, 1}N → {0, 1}.
     Consider a composed search problem SF ◦ gn obtained from a k-CNF contradiction with ` clauses. Its
nondeterministic communication complexity is at most log ` + k · (log m + 1); intuitively, it takes log `
bits to specify an unsatisfied clause C, and log m + 1 bits to verify the output of a single gadget, and there
are k gadgets relevant to C. In summary, SF ◦ gn reduces to S f for some monotone f : {0, 1}N → {0, 1}
on N = ` · (2m)k variables.
     Suppose for a moment that a version of Theorem 3.1, proving a 2Ω(w) lower bound, held for a gadget
of constant size m = O(1). Then we could lift any of the known CNF contradictions with parameters
k = O(1), ` = O(n), w = Ω(n), to obtain an explicit monotone function on N = Θ(n) variables, with
essentially maximal monotone circuit complexity 2Ω(N) . This gives some motivation to further develop
lifting tools for small gadgets.


9    Proof of Lemma 4.4
To prove Lemma 4.4, we recall two claims from [22] (which were used to prove Lemma 4.3). We need
the first claim in a slightly strengthened form. First, define χ(z) := (−1)∑i zi .
Claim 9.1 (Strengthening [22, Lemma 8]). For any ρ-structured X ×Y with free ρ =: J ⊆ [n],
                         ∀I ⊆ J, I 6= 0/ :      EX EY [χ(gI (X    Y I ))] ≤ 2−5|I| log n .
                                                             X I ,Y
Proof. Fix any I ⊆ J, I 6= 0.
                           / Define subsets
      X + := x ∈ X : EY [χ(gI (xI ,Y                             X - :=     x ∈ X : EY [χ(gI (xI ,Y
                                                                          
                                    Y I ))] > 0          and                                      Y I ))] < 0
so that
                                     |X + |                                  |X - |
       EX EY [χ(gI (xI ,Y
                        Y I ))] =           · EX + EY [χ(gI (X
                                                             X +I ,Y
                                                                   Y I ))] +        · EX - EY [−χ(gI (X
                                                                                                      X -I ,Y
                                                                                                            Y I ))] .
                                      |X|                                     |X|

                        T HEORY OF C OMPUTING, Volume 16 (13), 2020, pp. 1–30                                           22
                              M ONOTONE C IRCUIT L OWER B OUNDS FROM R ESOLUTION

It suffices to show that each of the two terms is at most 0.5 · 2−5|I| log n . Let us focus only on the first
term (a similar argument takes care of the second term). If |X + | ≤ 0.5 · 2−5|I| log n · |X|, then we are
already done, so assume the contrary so that H∞ (X    X +I ) ≥ H∞ (XX I ) − 5|I| log n − 1 ≥ 0.8|I| log m; here
recall that H∞ (X                                                  60
                 X I ) ≥ 0.9|I| log m and we may assume m ≥ n . To complete the proof, we rely on a
calculation from [22, Lem. 8]. There, the following is proved for constant 0.9 in place of 0.8, but this is
inconsequential, as one can always increase the exponent in m = n∆ if necessary.
Calculation from [22, Lem. 8, Eq. 4]: If H∞ (X X +I ) ≥ 0.8|I| log m and H∞ (Y Y ) ≥ mn − n3 then

                                       | EX + EY [χ(gI (X     Y I ))]| ≤ 0.5 · 2−5|I| log n .
                                                        X +I ,Y

Claim 9.2 ([22, Lem. 9]). If a random variable z J over {0, 1}J satisfies1 | E[χ(zzI )]| ≤ 2−3|I| log n for every
nonempty I ⊆ J, then z J has full support over {0, 1}J .

                                                              Y I ))]| ≤ 2−3|I| log n for all 0/ 6= I ⊆ J. By
Proof of Lemma 4.4. Say that x ∈ X is good if | EY [χ(gI (xI ,Y
applying Markov’s inequality to Theorem 9.1, we have for a uniform random x ∼ X and any 0/ 6= I ⊆ J
that                          h                                  i
                                           Y I ))] > 2−3|I| log n ≤ 2−2|I| log n .
                      Prx∼X EY [χ(gI (xxI ,Y

Taking a union bound over all 0/ 6= I ⊆ J, we get

                   Prx ∼X [ x is not good ] ≤ ∑06/ =I⊆J Prx ∼X | EY [χ(gI (xxI ,YY I ))]| > 2−3|I| log n
                                                                                                        

                                                                        |J|
                                            ≤ ∑06/ =I⊆J 2−2|I| log n = ∑d=1 |J|
                                                                                   −2d log n
                                                                                d ·2
                                                       |J|
                                                ≤ ∑d=1 2−d log n ≤ 2/n.

Hence most x ∈ X are good. Finally, observe that for any good x, the random variable z J defined as
gJ (x, y ) for a random y ∼ Y , satisfies the Fourier condition in Theorem 9.2. Therefore, such a z J has full
support over {0, 1}J , which means that {x} ×Y is ρ-like.


10      Open problems
If the long line of work on tree-like lifting theory is of any indication, there should be much to explore
also in the DAG-like setting. We propose a few concrete directions.
     Can our methods be extended to prove lower bounds for DAGs whose feasible sets are intersections of
k triangles for k ≥ 2? See Figure 2. This would imply lower bounds for proofs systems such as width-k
Resolution over Cutting Planes [34] and Resolution over linear equations [43, 29].

Question 10.1. Prove a lifting theorem for F-DAGs where F := {intersections of k triangles}.

    One of the most important open problems (e. g., [49, §5]) regarding semi-algebraic proof systems
that manipulate low-degree polynomials—where F is, say, degree-d polynomial threshold functions—is
to prove lower bounds on their DAG-like refutation length (tree-like lower bounds are known [7, 20]).
   1 In [22, §4.6], the claim is proved for the condition | E[χ(zz )]| ≤ 2−5|I| log n . However, the proof still works with the weaker
                                                                    I
condition 2−3|I| log n , as we only require that z J has full support instead of being pointwise-close to uniform.


                            T HEORY OF C OMPUTING, Volume 16 (13), 2020, pp. 1–30                                                 23
                 A NKIT G ARG , M IKA G Ö ÖS , P RITISH K AMATH , AND D MITRY S OKOLOV

Since degree-d polynomials can be efficiently evaluated by (d + 1)-party number-on-forehead (NOF)
protocols, one might hope to prove a DAG-like NOF lifting theorem. However, we currently lack a good
understanding of NOF lifting even in the tree-like case. We believe the first necessary step should be to
settle the following (a two-party analogue of which was proved in [19]).

Question 10.2. Prove a nondeterministic lifting theorem for NOF protocols.

   The proof of Theorem 3.1, which extracts a width-O(d) conjunction-DAG from a size-nd rectangle-
DAG , has the additional property of preserving the DAG depth (up to an O(d) factor). This raises the
question of whether one could investigate size–depth tradeoffs for monotone circuits via lifting.

Question 10.3. Does there exist, for any d ≥ 1, an f : {0, 1}n → {0, 1} computable with monotone
circuits of size nd such that any subexponential-size monotone circuit computing f has depth nΩ(d) ?

    Razborov [48] has recently obtained related results for Resolution, but the parameters in his construc-
tion seem not to be good enough for a direct application of Theorem 3.1.



Acknowledgements
We thank Jakob Nordström for extensive feedback on an early draft of this paper. We also thank Toniann
Pitassi, Thomas Watson, and anonymous STOC and ToC reviewers for comments.


References
 [1] N OGA A LON AND R AVI B. B OPPANA: The monotone circuit complexity of boolean functions.
     Combinatorica, 7(1):1–22, 1987. [doi:10.1007/BF02579196] 6

 [2] K AZUYUKI A MANO AND A KIRA M ARUOKA: The potential of the approximation method. SIAM J.
     Comput., 33(2):433–447, 2004. [doi:10.1137/S009753970138445X] 6

 [3] A LEXANDER E. A NDREEV: On a method for obtaining lower bounds for the complexity of
     individual monotone functions. Dokl. Math., 282(5):1033–1037, 1985. Link at Math-Net.ru. 6

 [4] A LBERT ATSERIAS AND V ÍCTOR DALMAU: A combinatorial characterization of resolution width.
     J. Comput. System Sci., 74(3):323–334, 2008. [doi:10.1016/j.jcss.2007.06.025] 10

 [5] PAUL B EAME , T RINH H UYNH , AND T ONIANN P ITASSI: Hardness amplification in proof com-
     plexity. In Proc. 42nd STOC, pp. 87–96. ACM Press, 2010. [doi:10.1145/1806689.1806703] 2,
     21

 [6] PAUL B EAME AND T ONIANN P ITASSI: Propositional proof complexity: Past, present, and future.
     In Current Trends in Theoretical Computer Science: Entering the 21st Century, pp. 42–70. World
     Scientific, 2001. [doi:10.1142/9789812810403_0001, ECCC:TR98-067] 7

                      T HEORY OF C OMPUTING, Volume 16 (13), 2020, pp. 1–30                             24
                      M ONOTONE C IRCUIT L OWER B OUNDS FROM R ESOLUTION

 [7] PAUL B EAME , T ONIANN P ITASSI , AND NATHAN S EGERLIND: Lower bounds for Lovász–
     Schrijver systems and beyond follow from multiparty communication complexity. SIAM J. Comput.,
     37(3):845–869, 2007. [doi:10.1137/060654645] 23
 [8] E LI B EN -S ASSON AND AVI W IGDERSON: Short proofs are narrow—resolution made simple. J.
     ACM, 48(2):149–169, 2001. [doi:10.1145/375827.375835] 4
 [9] C HRISTER B ERG AND S TAFFAN U LFBERG:             Symmetric approximation arguments for
     monotone lower bounds without sunflowers.           Comput. Complexity, 8(1):1–20, 1999.
     [doi:10.1007/s000370050017] 6
[10] M ARIA L UISA B ONET, T ONIANN P ITASSI , AND R AN R AZ: Lower bounds for cutting planes
     proofs with small coefficients. J. Symbolic Logic, 62(3):708–728, 1997. [doi:10.2307/2275569] 2
[11] A RKADEV C HATTOPADHYAY, M ICHAL KOUCKÝ , B RUNO L OFF , AND S AGNIK M UKHOPAD -
     HYAY: Simulation theorems via pseudo-random properties. Comput. Complexity, 28:617–659, 2019.
     [doi:10.1007/s00037-019-00190-7, arXiv:1704.06807] 2
[12] W ILLIAM C OOK , C OLLETTE C OULLARD , AND G YÖRGY T URÁN: On the complexity of cutting-
     plane proofs. Discr. Appl. Math., 18(1):25–38, 1987. [doi:10.1016/0166-218X(87)90039-4] 7
[13] S USANNA F. DE R EZENDE , JAKOB N ORDSTRÖM , AND M ARC V INYALS: How limited interaction
     hinders real communication (and what it means for proof and circuit complexity). In Proc. 57th
     FOCS, pp. 295–304. IEEE Comp. Soc., 2016. [doi:10.1109/FOCS.2016.40] 2, 21
[14] N OAH F LEMING , D ENIS PANKRATOV, T ONIANN P ITASSI , AND ROBERT ROBERE: Random
     Θ(log n)-CNFs are hard for cutting planes. In Proc. 58th FOCS, pp. 109–120. IEEE Comp. Soc.,
     2017. Update: ECCC TR17-045. [doi:10.1109/FOCS.2017.19] 6, 7
[15] A NNA G ÁL: A characterization of span program size and improved lower bounds for monotone
     span programs. Comput. Complexity, 10(4):277–296, 2001. [doi:10.1007/s000370100001] 22
[16] A NKIT G ARG , M IKA G ÖÖS , P RITISH K AMATH , AND D MITRY S OKOLOV: Monotone cir-
     cuit lower bounds from resolution. In Proc. 50th STOC, pp. 902–911. ACM Press, 2018.
     [doi:10.1145/3188745.3188838] 1
[17] M IKA G ÖÖS , P RITISH K AMATH , T ONIANN P ITASSI , AND T HOMAS WATSON: Query-to-
     communication lifting for PNP . Comput. Complexity, 28(1):113–144, 2019. Preliminary version in
     CCC’17. [doi:10.1007/s00037-018-0175-5] 8
[18] M IKA G ÖÖS , P RITISH K AMATH , ROBERT ROBERE , AND D MITRY S OKOLOV: Adven-
     tures in monotone complexity and TFNP. In Proc. 10th Innovations in Theoret. Comp.
     Sci. (ITCS’19), pp. 38:1–38:19. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 2019.
     [doi:10.4230/LIPIcs.ITCS.2019.38] 2, 5, 7, 21
[19] M IKA G ÖÖS , S HACHAR L OVETT, R AGHU M EKA , T HOMAS WATSON , AND DAVID Z UCK -
     ERMAN : Rectangles are nonnegative juntas. SIAM J. Comput., 45(5):1835–1869, 2016.
     [doi:10.1137/15M103145X] 6, 8, 15, 24

                     T HEORY OF C OMPUTING, Volume 16 (13), 2020, pp. 1–30                       25
                A NKIT G ARG , M IKA G Ö ÖS , P RITISH K AMATH , AND D MITRY S OKOLOV

[20] M IKA G ÖÖS AND T ONIANN P ITASSI: Communication lower bounds via critical block
     sensitivity. SIAM J. Comput., 47(5):1778–1806, 2018. Preliminary version in STOC’14.
     [doi:10.1137/16M1082007] 2, 23
[21] M IKA G ÖÖS , T ONIANN P ITASSI , AND T HOMAS WATSON: Deterministic communication vs.
     partition number. SIAM J. Comput., 47(6):2435–2450, 2018. Preliminary version in FOCS’15.
     [doi:10.1137/16M1059369] 2
[22] M IKA G ÖÖS , T ONIANN P ITASSI , AND T HOMAS WATSON: Query-to-communication lift-
     ing for BPP. SIAM J. Comput., 49(4):132–143, 2020. Preliminary version in FOCS’17.
     [doi:10.1137/17M115339X] 6, 8, 15, 18, 22, 23
[23] A RMIN H AKEN: Counting bottlenecks to show monotone P 6= NP. In Proc. 36th FOCS, pp. 36–40.
     IEEE Comp. Soc., 1995. [doi:10.1109/SFCS.1995.492460] 6
[24] A RMIN H AKEN AND S TEPHEN A. C OOK: An exponential lower bound for the size of monotone
     real circuits. J. Comput. System Sci., 58(2):326–335, 1999. [doi:10.1006/jcss.1998.1617] 7
[25] DANNY H ARNIK AND R AN R AZ: Higher lower bounds on monotone size. In Proc. 32nd STOC,
     pp. 378–387. ACM Press, 2000. [doi:10.1145/335305.335349] 6
[26] PAVEL H RUBEŠ AND PAVEL P UDLÁK: Random formulas, monotone circuits, and interpolation. In
     Proc. 58th FOCS, pp. 121–131. IEEE Comp. Soc., 2017. [doi:10.1109/FOCS.2017.20] 6, 7
[27] PAVEL H RUBEŠ AND PAVEL P UDLÁK: A note on monotone real circuits. Information Processing
     Letters, 131:15–19, 2018. [doi:10.1016/j.ipl.2017.11.002, ECCC:TR17-048] 7
[28] T RINH H UYNH AND JAKOB N ORDSTRÖM: On the virtue of succinct proofs: Amplifying commu-
     nication complexity hardness to time–space trade-offs in proof complexity. In Proc. 44th STOC, pp.
     233–248. ACM Press, 2012. [doi:10.1145/2213977.2214000] 2
[29] D MITRY I TSYKSON AND D MITRY S OKOLOV: Lower bounds for splittings by linear combinations.
     In Proc. 39th Math. Found. Comp. Sci. (MFCS’14), pp. 372–383. Springer, 2014. [doi:10.1007/978-
     3-662-44465-8_32] 23
[30] S TASYS J UKNA: Finite limits and monotone computations: The lower bounds criterion. In
     Proc. 12th IEEE Conf. on Comput. Complexity (CCC’97), pp. 302–313. IEEE Comp. Soc., 1997.
     [doi:10.1109/CCC.1997.612325] 6
[31] S TASYS J UKNA: Boolean Function Complexity: Advances and Frontiers. Volume 27 of Algorithms
     and Combinatorics. Springer, 2012. [doi:10.1007/978-3-642-24508-4] 2, 4, 7
[32] M AURICIO K ARCHMER AND AVI W IGDERSON: Monotone circuits for connectivity require super-
     logarithmic depth. SIAM J. Discr. Math., 3(2):255–265, 1990. Preliminary version in STOC’88.
     [doi:10.1137/0403021] 4
[33] JAN K RAJÍ ČEK: Interpolation theorems, lower bounds for proof systems, and independence results
     for bounded arithmetic. J. Symbolic Logic, 62(2):457–486, 1997. [doi:10.2307/2275541] 2

                     T HEORY OF C OMPUTING, Volume 16 (13), 2020, pp. 1–30                          26
                       M ONOTONE C IRCUIT L OWER B OUNDS FROM R ESOLUTION

[34] JAN K RAJÍ ČEK: Discretely ordered modules as a first-order extension of the cutting planes proof
     system. J. Symbolic Logic, 63(4):1582–1596, 1998. [doi:10.2307/2586668] 23

[35] E YAL K USHILEVITZ AND N OAM N ISAN: Communication Complexity. Cambridge Univ. Press,
     1997. [doi:10.1017/CBO9780511574948] 2

[36] L ÁSZLÓ L OVÁSZ , M ONI NAOR , I LAN N EWMAN , AND AVI W IGDERSON: Search problems in the
     decision tree model. SIAM J. Discr. Math., 8(1):119–132, 1995. [doi:10.1137/S0895480192233867]
     21

[37] K ETAN M ULMULEY: A fast parallel algorithm to compute the rank of a matrix over an arbitrary
     field. Combinatorica, 7(1):101–104, 1987. [doi:10.1007/BF02579205] 5

[38] PAVEL P UDLÁK: Lower bounds for resolution and cutting plane proofs and monotone computations.
     J. Symbolic Logic, 62(3):981–998, 1997. [doi:10.2307/2275583] 7

[39] PAVEL P UDLÁK: Proofs as games.               Amer. Math. Monthly, 107(6):541–550, 2000.
     [doi:10.2307/2589349] 10

[40] PAVEL P UDLÁK: On extracting computations from propositional proofs (a survey). In Proc. 30th
     Found. Software Techn. Theoret. Comp. Sci. (FSTTCS’10), pp. 30–41. Schloss Dagstuhl–Leibniz-
     Zentrum fuer Informatik, 2010. [doi:10.4230/LIPIcs.FSTTCS.2010.30] 2, 3, 4

[41] A NUP R AO AND A MIR Y EHUDAYOFF: Communication Complexity and Applications. Cambridge
     Univ. Press, 2020. [doi:10.1017/9781108671644] 2

[42] R AN R AZ AND P IERRE M C K ENZIE: Separation of the monotone NC hierarchy. Combinatorica,
     19(3):403–435, 1999. [doi:10.1007/s004930050062] 2

[43] R AN R AZ AND I DDO T ZAMERET: Resolution over linear equations and multilinear proofs. Ann.
     Pure Appl. Logic, 155(3):194–224, 2008. [doi:10.1016/j.apal.2008.04.001] 23

[44] A LEXANDER A. R AZBOROV: Lower bounds on the monotone complexity of some Boolean
     functions. Dokl. Math., 31(4):354–357, 1985. Link at Math-Net.ru. 6

[45] A LEXANDER A. R AZBOROV: On the method of approximations. In Proc. 21st STOC, pp. 167–176.
     ACM Press, 1989. [doi:10.1145/73007.73023] 6

[46] A LEXANDER A. R AZBOROV: Unprovability of lower bounds on circuit size in certain fragments of
     bounded arithmetic. Izvestiya Math., (1):205–227, 1995. Link at Math-Net.ru. 2, 3

[47] A LEXANDER A. R AZBOROV: On small size approximation models. In The Mathematics of Paul
     Erdös I, pp. 385–392. Springer, 1997. [doi:10.1007/978-3-642-60408-9_28] 6

[48] A LEXANDER A. R AZBOROV: A new kind of tradeoffs in propositional proof complexity. J. ACM,
     63(2):16:1–16:14, 2016. [doi:10.1145/2858790] 24

                     T HEORY OF C OMPUTING, Volume 16 (13), 2020, pp. 1–30                          27
                A NKIT G ARG , M IKA G Ö ÖS , P RITISH K AMATH , AND D MITRY S OKOLOV

[49] A LEXANDER A. R AZBOROV: Proof complexity and beyond. SIGACT News, 47(2):66–86, 2016.
     [doi:10.1145/2951860.2951875] 7, 23

[50] B ENJAMIN ROSSMAN: The monotone complexity of k-clique on random graphs. SIAM J. Comput.,
     43(1):256–279, 2014. [doi:10.1137/110839059] 6

[51] JANOS S IMON AND S HI -C HUN T SAI: On the bottleneck counting argument. Theoret. Comput. Sci.,
     237(1–2):429–437, 2000. Preliminary version in CCC’97. [doi:10.1016/S0304-3975(99)00321-7] 6

[52] D MITRY S OKOLOV: Dag-like communication and its applications. In Proc. 12th Comp. Sci. Symp.
     in Russia (CSR’17), pp. 294–307. Springer, 2017. [doi:10.1007/978-3-319-58747-9_26] 2, 3, 4

[53] É VA TARDOS: The gap between monotone and non-monotone circuit complexity is exponential.
     Combinatorica, 8(1):141–142, 1988. [doi:10.1007/BF02122563] 5

[54] A LASDAIR U RQUHART: Hard examples for resolution.               J. ACM, 34(1):209–219, 1987.
     [doi:10.1145/7531.8928] 5

[55] AVI W IGDERSON: The fusion method for lower bounds in circuit complexity. In Combinatorics,
     Paul Erdős is Eighty, pp. 453–468. János Bolyai Math. Soc., Budapest, 1993. 6

[56] X IAODI W U , P ENGHUI YAO , AND H ENRY Y UEN: Raz–McKenzie simulation with the inner
     product gadget. Electron. Colloq. Comput. Complexity, TR17-010, 2017. [ECCC] 2


AUTHORS

      Ankit Garg
      Researcher
      Microsoft Research India
      garga microsoft com
      https://ankit-garg-6.github.io/


      Mika Göös
      Assistant professor
      EPFL
      mika goos epfl ch
      https://theory.epfl.ch/mika/




                     T HEORY OF C OMPUTING, Volume 16 (13), 2020, pp. 1–30                       28
                     M ONOTONE C IRCUIT L OWER B OUNDS FROM R ESOLUTION

    Pritish Kamath
    Postdoctoral scholar
    Toyota Technological Institute at Chicago
    Chicago, IL, USA
    pritish alum mit edu
    https://pritishkamath.github.io/


    Dmitry Sokolov
    Postdoctoral researcher
    KTH Royal Institute of Technology
    sokolovd kth se
    https://logic.pdmi.ras.ru/~sokolov/


ABOUT THE AUTHORS

    A NKIT G ARG is a researcher at Microsoft Research India. Previously, he obtained his
       Ph. D. in computer science from Princeton University under the supervision of Mark
       Braverman, and then spent two years as a postdoc at Microsoft Research New England.
       Even before that, he got his undergraduate degree in computer science from the Indian
       Insitute of Technology Delhi. His research interests lie in algebraic complexity, optimiza-
       tion, and communication complexity. He grew up in Bathinda (a small city in Punjab,
       India) and like most Indians, loves playing and watching cricket.


    M IKA G ÖÖS is an assistant professor at EPFL in the Theory Group. Previously, he was
       a postdoc at Stanford Theory, the Institute for Advanced Study, and the ToC group at
       Harvard. He completed his Ph. D. at the University of Toronto under the watchful eye
       of Toniann Pitassi. He obtained his B. S. from Aalto University, and his M. S. from the
       University of Oxford. In his spare time, Mika enjoys rock climbing, hoping to climb a
       7C-grade boulder one day.


    P RITISH K AMATH is a postdoctoral scholar at the Toyota Technological Institute at Chicago.
        He completed his Ph. D. at MIT, co-advised by Madhu Sudan and Ronitt Rubinfeld.
        Prior to that, he completed a B. Tech. in Computer Science at IIT Bombay in 2012 after
        which he was a Research Fellow at Microsoft Research India until 2013, where he was
        mentored by Neeraj Kayal. He has broad interests in complexity theory and has worked
        in areas touching upon communication complexity, information theory, Boolean and
        algebraic circuit complexity and proof complexity. Most recently he has also become
        interested in foundational aspects of machine learning. He likes to juggle multiple things
        in life; sometimes on a bicycle.



                    T HEORY OF C OMPUTING, Volume 16 (13), 2020, pp. 1–30                            29
          A NKIT G ARG , M IKA G Ö ÖS , P RITISH K AMATH , AND D MITRY S OKOLOV

D MITRY S OKOLOV is a postdoc at Lund University and the University of Copenhagen. He
   got his Ph. D. in mathematics from the St. Petersburg Department of Steklov Mathemati-
   cal Institute of the Russian Academy of Sciences under the supervision of Edward Hirsch
   and Dmitry Itsykson, and then spent two years as a postdoc at KTH Royal Institute
   of Technology. He defended his master’s thesis in computer science at St. Petersburg
   Academic University. His research interests include proof complexity, communication
   complexity, and circuit complexity. He likes to play table tennis.




               T HEORY OF C OMPUTING, Volume 16 (13), 2020, pp. 1–30                         30