DOKK Library

Shrinkage under Random Projections, and Cubic Formula Lower Bounds for $\mathsf{AC}^0$

Authors Yuval Filmus, Or Meir, Avishay Tal,

License CC-BY-3.0

Plaintext
                           Theory of Computing, Volume 19 (7), 2023, pp. 1–51
                                        www.theoryofcomputing.org




     Shrinkage under Random Projections,
      and Cubic Formula Lower Bounds
                  for 𝑨π‘ͺ 0
                       Yuval Filmusβˆ—                     Or Meir†          Avishay Tal
             Received February 19, 2021; Revised December 28, 2021; Published December 5, 2023




       Abstract. HΓ₯stad showed that any De Morgan formula (composed of AND, OR
       and NOT gates) shrinks by a factor of 𝑂̃(𝑝^2 ) under a random restriction that leaves
       each variable alive independently with probability 𝑝 [SICOMP, 1998]. Using this
       result, he gave an Ξ©Μƒ(𝑛^3 ) formula size lower bound for the Andreev function, which,
       up to lower order improvements, remains the state-of-the-art lower bound for any
       explicit function.
           In this paper, we extend the shrinkage result of HΓ₯stad to hold under a far wider
       family of random restrictions and their generalization β€” random projections. Based
       on our shrinkage results, we obtain an Ξ©Μƒ(𝑛^3 ) formula size lower bound for an explicit
       function computable in 𝑨π‘ͺ 0 . This improves upon the best known formula size lower
       bounds for 𝑨π‘ͺ 0 , that were only quadratic prior to our work. In addition, we prove
       that the KRW conjecture [Karchmer et al., Computational Complexity 5(3/4), 1995]
       holds for inner functions for which the unweighted quantum adversary bound is
       tight. In particular, this holds for inner functions with a tight Khrapchenko bound.
           Our random projections are tailor-made to the function’s structure so that the
       function maintains structure even under projection β€” using such projections is
       necessary, as standard random restrictions simplify 𝑨π‘ͺ 0 circuits. In contrast, we
       show that any De Morgan formula shrinks by a quadratic factor under our random
       projections, allowing us to prove the cubic lower bound.
     An extended abstract of this article appeared in the Proceedings of the 12th Innovations in Theoretical Computer
Science Conference (ITCS’21).
   βˆ— Taub Fellow β€” supported by the Taub Foundations. The research was funded by ISF grant 1337/16.
   † Partially supported by the Israel Science Foundation (grant No. 1445/16).



Β© 2023 Yuval Filmus, Or Meir, and Avishay Tal
Licensed under a Creative Commons Attribution License (CC-BY)                       DOI: 10.4086/toc.2023.v019a007

         Our proof techniques build on HΓ₯stad’s proof for the simpler case of balanced
      formulas. This allows for a significantly simpler proof at the cost of slightly worse
      parameters. As such, when specialized to the case of 𝑝-random restrictions, our
      proof can be used as an exposition of HΓ₯stad’s result.

ACM Classification: F.1.1
AMS Classification: 68Q06
Key words and phrases: circuit complexity, lower bounds, shrinkage, formulas, formula
complexity, KRW conjecture

1     Introduction
1.1   Background
Is there an efficiently solvable computational task that does not admit an efficient parallel
algorithm? A formalization of this question asks: is P ⊈ NC1 ? The answer is still unknown. The
question can be rephrased as follows (ignoring issues of uniformity): is there a Boolean function
in P that does not have a (De Morgan) formula of polynomial size? (We define De Morgan
formulas formally in Section 2; informally, these are formulas composed of AND, OR and NOT
gates.)
    The history of formula lower bounds for functions in P goes back to the 1960s, with the
seminal result of Subbotovskaya [55] that introduced the technique of random restrictions.
Subbotovskaya showed that the Parity function on 𝑛 variables requires formulas of size at least
Ξ©(𝑛 1.5 ). Khrapchenko [33], using a different proof technique, showed that in fact the Parity
function on 𝑛 variables requires formulas of size Θ(𝑛 2 ). Later, Andreev [4] came up with a
new explicit function (now known as the Andreev function) for which he was able to obtain an
Ξ©(𝑛 2.5 ) size lower bound. This lower bound was subsequently improved in [29, 44, 23, 56] to
𝑛 3βˆ’π‘œ(1) .
    The line of work initiated by Subbotovskaya and Andreev relies on the shrinkage of formulas
under 𝑝-random restrictions. A 𝑝-random restriction is a randomly chosen partial assignment
to the inputs of a function. Set a parameter 𝑝 ∈ (0, 1). We fix each variable independently with
probability 1 βˆ’ 𝑝 to a uniformly random bit, and we keep the variable alive with probability 𝑝.
Under such a restriction, formulas shrink (in expectation) by a factor more significant than 𝑝.
Subbotovskaya showed that formulas shrink to at most 𝑝 1.5 times their original size, whereas
subsequent work [44, 29] improved the bound to 𝑝 1.55 and then to 𝑝 1.63 . Finally, HΓ₯stad [23]
showed that the shrinkage exponent of formulas is 2, or in other words, that formulas shrink by
a factor of 𝑝 2βˆ’π‘œ(1) under 𝑝-random restrictions. Tal [56] improved the shrinkage factor to 𝑂(𝑝 2 )
β€” obtaining a tight result, as exhibited by the Parity function.
    In a nutshell, shrinkage results are useful for proving lower bounds as long as the explicit
function being analyzed maintains structure under such restrictions and does not trivialize. For
example, the Parity function does not become constant as long as at least one variable remains
alive. Thus any formula 𝐹 that computes Parity must be of at least quadratic size, or else the


formula 𝐹 under restriction, keeping each variable alive with probability 100/𝑛, would likely
become a constant function, whereas Parity would not. Andreev’s idea is similar, though he
manages to construct a function such that under a random restriction keeping only Θ(log 𝑛) of
the variables, the formula size should be at least Ξ©Μƒ(𝑛) (in expectation). This ultimately gives the
nearly cubic lower bound.
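The Parity argument above is easy to make concrete. The following Python sketch (our own illustration; the function names are hypothetical, not from the paper) samples a 𝑝-random restriction with 𝑝 = 100/𝑛 and counts how often the restricted Parity becomes constant:

```python
import random

def p_random_restriction(n, p, rng):
    """Each variable stays alive ('*') with probability p and is otherwise
    fixed to a uniformly random bit, independently across variables."""
    return [('*' if rng.random() < p else rng.randint(0, 1)) for _ in range(n)]

def parity_is_constant(rho):
    # Restricted Parity is constant iff no variable remains alive.
    return '*' not in rho

rng = random.Random(0)
n, p, trials = 1000, 100 / 1000, 100
constant = sum(parity_is_constant(p_random_restriction(n, p, rng))
               for _ in range(trials))
print(constant)  # 0: Pr[no live variable] = (1 - p)^n = 0.9^1000, negligible
```

Since the expected number of live variables is 100, restricted Parity essentially never trivializes, which is exactly what the quadratic lower bound for Parity exploits.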

1.1.1   The KRW Conjecture
Despite much effort, proving P ⊈ NC1 , and even just breaking the cubic barrier in formula lower
bounds, have remained a challenge for more than two decades. An approach to solve the P
versus NC1 problem was suggested by Karchmer, Raz and Wigderson [31]. They conjectured
that when composing two Boolean functions, 𝑓 and 𝑔, the formula size of the resulting function,
𝑓 ⋄ 𝑔, is (roughly) the product of the formula sizes of 𝑓 and 𝑔.1 We will refer to this conjecture
as the β€œKRW conjecture”. Under the KRW conjecture (and even under weaker variants of it),
[31] constructed a function in P with no polynomial-size formulas. It remains a major open
challenge to settle the KRW conjecture.
    A few special cases of the KRW conjecture are known to be true. The conjecture holds when
either 𝑓 or 𝑔 is the AND or the OR function. HΓ₯stad’s result [23] and its improvement [56]
show that the conjecture holds when the inner function 𝑔 is the Parity function and the outer
function 𝑓 is any function. This gives an alternative explanation to the 𝑛 3βˆ’π‘œ(1) lower bound for
the Andreev function. Indeed, the Andreev function is at least as hard as the composition of
a maximally hard function 𝑓 on log 𝑛 bits and 𝑔 = Parity_{𝑛/log 𝑛} , where the formula size of 𝑓 is
Ξ©Μƒ(𝑛) and the formula size of Parity_{𝑛/log 𝑛} is Θ(𝑛^2 /log^2 𝑛). Since the KRW conjecture holds for
this special case, the formula size of the Andreev function is at least Ξ©Μƒ(𝑛 3 ). In other words, the
state-of-the-art formula size lower bounds for explicit functions follow from a special case of the
KRW conjecture β€” the case in which 𝑔 is the Parity function. Moreover, this special case follows
from the shrinkage of formulas under 𝑝-random restrictions.
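For concreteness, the block composition used throughout can be sketched as follows (a minimal illustration with hypothetical names; the formal definition in the paper splits the input into blocks in the same way):

```python
def compose(f, g, k, m):
    """Block composition of f and g: split the k*m input bits into k blocks
    of m bits each, apply g to every block, then apply f to the k results."""
    def h(x):
        assert len(x) == k * m
        return f([g(x[i * m:(i + 1) * m]) for i in range(k)])
    return h

parity = lambda bits: sum(bits) % 2

# Parity composed with Parity is just Parity on all k*m bits.
h = compose(parity, parity, k=2, m=3)
print(h([1, 0, 1, 1, 1, 0]))  # 0: the six bits contain four ones
```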

1.1.2   Bottom-up versus top-down techniques
Whereas random restrictions are a β€œbottom-up” proof technique [24], a different line of work
suggested a β€œtop-down” approach using the language of communication complexity. The
connection between formula size and communication complexity was introduced in the seminal
paper by Karchmer and Wigderson [32]. They defined for any Boolean function 𝑓 a two-party
communication problem KW 𝑓 : Alice gets an input π‘₯ such that 𝑓 (π‘₯) = 1, and Bob gets an input 𝑦
such that 𝑓 (𝑦) = 0. Their goal is to identify a coordinate 𝑖 on which π‘₯ 𝑖 β‰  𝑦 𝑖 , while minimizing
their communication. It turns out that there is a one-to-one correspondence between any
protocol tree solving KW 𝑓 and any formula computing the function 𝑓 . Since protocols naturally
traverse the tree from root to leaf, proving lower bounds on their size or depth is done usually
in a top-down fashion. This framework has proven to be very useful in proving formula lower

   1More precisely, the original KRW conjecture [31] concerns depth complexity rather than formula complexity.
The variant of the conjecture for formula complexity, which is discussed above, was posed in [19].



bounds in the monotone setting (see, e. g., [32, 20, 47, 31, 46, 21, 45]) and in studying the KRW
conjecture (see, e. g., [31, 14, 26, 19, 13, 34, 41, 12, 42]). Moreover, in a recent paper [13], Dinur
and Meir were able to reprove HΓ₯stad’s cubic lower bound using the framework of Karchmer
and Wigderson. As Dinur and Meir’s proof showed that top-down techniques can replicate
HΓ₯stad’s cubic lower bound, a natural question (which motivated this project) arose:
      Are top-down techniques superior to bottom-up techniques?
Towards that, we focused on a candidate problem: prove a cubic lower bound for an explicit
function in 𝑨π‘ͺ 0 .2 Based on the work of Dinur and Meir [13], we suspected that such a lower
bound could be achieved using top-down techniques. We were also certain that the problem
cannot be solved using the random restriction technique. Indeed, in order to prove a lower
bound on a function 𝑓 using random restrictions, one should argue that 𝑓 remains hard under a
random restriction; however, it is well known that functions in 𝑨π‘ͺ 0 trivialize under 𝑝-random
restrictions [1, 16, 57, 22]. Based on this intuition, it seemed that random restrictions could not
possibly show that a function in 𝑨π‘ͺ 0 requires cubic size. Our intuition turned out to be false.

1.2   Our results
In this article, we construct an explicit function in 𝑨π‘ͺ 0 which requires formulas of size 𝑛 3βˆ’π‘œ(1) .
Surprisingly, our proof is conducted via the bottom-up technique of random projections, which
is a generalization of random restrictions (more details below).
Theorem 1.1. There exists a family of Boolean functions β„Ž 𝑛 : {0, 1} 𝑛 β†’ {0, 1} for 𝑛 ∈ β„• such that
   1. β„Ž 𝑛 can be computed by uniform depth-4 unbounded fan-in formulas of size 𝑂(𝑛 3 ).

   2. The formula size of β„Ž 𝑛 is at least 𝑛 3βˆ’π‘œ(1) .
    Prior to our work, the best formula size lower bounds for an explicit function in 𝑨π‘ͺ 0 were
only quadratic [43, 10, 30, 6].
    Our hard function is a variant of the Andreev function. More specifically, recall that the
Andreev function is based on the composition 𝑓 ⋄ 𝑔, where 𝑓 is a maximally hard function and
𝑔 is the Parity function. Since Parity is not in 𝑨π‘ͺ 0 , we cannot take 𝑔 to be the Parity function in
our construction. Instead, our hard function is obtained by replacing the Parity function with
the Surjectivity function of Beame and Machmouchi [6].
    As in the case of the Andreev function, we establish the hardness of our function by proving
an appropriate special case of the KRW conjecture. To this end, we introduce a generalization of
the unweighted adversary method [2], called the soft-adversary bound, which we denote Adv𝑠 (𝑔)
(see Section 7.1). We prove the KRW conjecture for the special case in which the outer function 𝑓
is any function, and 𝑔 is a function whose formula complexity is bounded tightly by the soft-
adversary bound. We then obtain Theorem 1.1 by applying this version of the KRW conjecture
to the case where 𝑔 is the Surjectivity function. We note that our KRW result holds in particular
   2Recall that 𝑨π‘ͺ 0 is the class of functions computed by constant depth polynomial size circuits composed of AND
and OR gates of unbounded fan-in, with variables or their negation at the leaves.



for functions 𝑔 with a large Khrapchenko bound, which in turn implies the known lower bounds
in the cases where 𝑔 is the Parity function [23] and the Majority function [17].
     Our proof of the special case of the KRW conjecture follows the methodology of HΓ₯stad [23],
who proved the special case in which 𝑔 is Parity on π‘š variables. HΓ₯stad proved that formulas
shrink by a factor of (roughly) 𝑝 2 under 𝑝-random restrictions. Choosing 𝑝 = 1/π‘š shrinks a
formula for 𝑓 ⋄ 𝑔 by a factor of roughly π‘š^2 , which coincides with the formula complexity of 𝑔.
On the other hand, on average each copy of 𝑔 simplifies to a single input variable, and so 𝑓 ⋄ 𝑔
simplifies to 𝑓 . This shows that 𝐿( 𝑓 ⋄ 𝑔) ≳ 𝐿( 𝑓 ) Β· 𝐿(𝑔).
     Our main technical contribution is a new shrinkage theorem that works in a far wider range
of scenarios than just 𝑝-random restrictions: A random projection is a generalization of a random
restriction in which each of the input variables π‘₯ 1 , . . . , π‘₯ 𝑛 may either be fixed to a constant or be
replaced with a literal from the set 𝑦1 , . . . , π‘¦π‘š , 𝑦̄1 , . . . , π‘¦Μ„π‘š , where 𝑦1 , . . . , π‘¦π‘š are new variables.
Given a function 𝑔 with soft-adversary bound Adv𝑠 (𝑔), we construct a random projection which,
on the one hand, shrinks formulas by a factor of Adv𝑠 (𝑔), and on the other hand, simplifies 𝑓 ⋄ 𝑔
to 𝑓 . We thus show that 𝐿( 𝑓 ⋄ 𝑔) ≳ 𝐿( 𝑓 ) Β· Adv𝑠 (𝑔), and in particular, if Adv𝑠 (𝑔) β‰ˆ 𝐿(𝑔), then
𝐿( 𝑓 ⋄ 𝑔) ≳ 𝐿( 𝑓 ) Β· 𝐿(𝑔), just as in HΓ₯stad’s proof. Our random projections are tailored specifically
to the structure of the function 𝑓 ⋄ 𝑔, ensuring that 𝑓 ⋄ 𝑔 simplifies to 𝑓 under projection. This
enables us to overcome the aforementioned difficulty. In contrast, 𝑝-random restrictions that do
not respect the structure of 𝑓 ⋄ 𝑔 would likely result in a restricted function that is much simpler
than 𝑓 and in fact would be a constant function with high probability.
     Our shrinkage theorem applies more generally to two types of random projections, which
we call fixing projections and hiding projections. Roughly speaking, fixing projections are random
projections in which substituting a constant for one of the variables 𝑦1 , . . . , π‘¦π‘š results in a
projection that is much more probable. Hiding projections are random projections in which
substituting a constant for a variable 𝑦 𝑗 hides which of the input variables π‘₯ 1 , . . . , π‘₯ 𝑛 were
mapped to {𝑦 𝑗 , 𝑦̄ 𝑗 } before the substitution. We note that our shrinkage theorem for fixing
projections captures HΓ₯stad’s result for 𝑝-random restrictions as a special case.
     The proof of our shrinkage theorem is based on HΓ₯stad’s proof [23], but also simplifies it. In
particular, we take the simpler argument that HΓ₯stad uses for the special case of completely
balanced trees, and adapt it to the general case. As such, our proof avoids a complicated case
analysis, at the cost of slightly worse bounds. Using our bounds, it is nevertheless easy to obtain
the 𝑛 3βˆ’π‘œ(1) lower bound for the Andreev function. Therefore, one can see the specialization of
our shrinkage result to 𝑝-random restrictions as an exposition of HΓ₯stad’s cubic lower bound.


1.2.1   An example: our techniques when specialized to 𝑓 ⋄ Majorityπ‘š

To illustrate our choice of random projections, we present its instantiation to the special case of
𝑓 ⋄ 𝑔, where 𝑓 : {0, 1} π‘˜ β†’ {0, 1} is non-constant and 𝑔 = Majorityπ‘š for some odd integer π‘š. In this
case, the input variables to 𝑓 ⋄ 𝑔 are composed of π‘˜ disjoint blocks, 𝐡1 , . . . , 𝐡 π‘˜ , each containing
π‘š variables. We use the random projection that for each block 𝐡 𝑖 = {π‘₯ π‘š(π‘–βˆ’1)+1 , . . . , π‘₯ π‘šπ‘– }, picks
one variable in the block 𝐡 𝑖 uniformly at random, projects this variable to the new variable 𝑦 𝑖 ,
and fixes the rest of the variables in the block in a balanced way so that the number of zeros and


ones in the block is equal (i. e., we have exactly (π‘š βˆ’ 1)/2 zeros and (π‘š βˆ’ 1)/2 ones). It is not hard
to see that under this choice, 𝑓 ⋄ 𝑔 simplifies to 𝑓 . On the other hand, we show that this choice
of random projections shrinks the formula complexity by a factor of β‰ˆ 1/π‘š 2 . Combining the
two together, we get that 𝐿( 𝑓 ⋄ Majorityπ‘š ) ≳ 𝐿( 𝑓 ) Β· π‘š^2 . Note that in this distribution of random
projections, the different coordinates are not independent of one another, and this feature allows
us to maintain structure.
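The projection just described is straightforward to sample. A small sketch (our own illustration, with hypothetical names) for the case of 𝑓 composed with Majorityπ‘š:

```python
import random

def majority_projection(k, m, rng):
    """For each block B_i of m variables (m odd): project one uniformly
    random variable in the block to the fresh variable y_i, and fix the
    remaining m-1 variables to exactly (m-1)//2 zeros and (m-1)//2 ones."""
    assert m % 2 == 1
    pi = {}
    for i in range(k):
        block = list(range(i * m, (i + 1) * m))
        live = rng.choice(block)
        fills = [0] * ((m - 1) // 2) + [1] * ((m - 1) // 2)
        rng.shuffle(fills)
        for x in block:
            pi[x] = ('y', i) if x == live else fills.pop()
    return pi

# Majority of each block now equals y_i (the live variable breaks the tie),
# so the composed function collapses to f(y_1, ..., y_k).
pi = majority_projection(k=2, m=5, rng=random.Random(1))
block0 = [pi[x] for x in range(5)]
print(sorted(v for v in block0 if v in (0, 1)))  # [0, 0, 1, 1]
```

Note that, as stated above, the coordinates within a block are sampled jointly rather than independently: the balanced fixing is what preserves the structure of Majority.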


1.3   Related work
Our technique of using tailor-made random projections was inspired by the celebrated result of
Rossman, Servedio, and Tan [25] that proved an average-case depth hierarchy. In fact, the idea
to use tailor-made random restrictions goes back to HΓ₯stad’s thesis [27, Chapter 6.2]. Similar to
our case, in [27, 25], 𝑝-random restrictions are too crude to separate depth 𝑑 from depth 𝑑 + 1
circuits. Given a circuit 𝐢 of depth 𝑑 + 1, the main challenge is to construct a distribution of
random restrictions or projections (tailored to the circuit 𝐢) that on the one hand maintains
structure for 𝐢, but on the other hand simplifies any depth-𝑑 circuit 𝐢′.


1.4   Paper outline
The paper starts with brief preliminaries in Section 2. Then, in Section 3, we define the notions of
fixing and hiding projections, state the corresponding shrinkage theorems, and sketch how they
can be used to prove a cubic formula lower bound for 𝑨π‘ͺ 0 . We prove the shrinkage theorems
for fixing and hiding projections in Sections 4 and 5, respectively. In Section 6 we provide a brief
interlude on the join operation for projections. The quantum adversary bound, the Khrapchenko
bound, and their relation to hiding projections are discussed in Section 7. Finally, Section 8
contains a proof of Theorem 1.1, as a corollary of a more general result which is a special case of
the KRW conjecture. In the same section we also rederive the cubic lower bound on Andreev’s
function, and the cubic lower bound on the Majority-based variant considered in [17].


2     Preliminaries
Throughout the paper, we use bold letters to denote random variables. For any 𝑛 ∈ β„• , we
denote by [𝑛] the set {1, . . . , 𝑛}. Given a bit 𝜎 ∈ {0, 1}, we denote its negation by 𝜎̄. We
assume familiarity with the basic definitions of communication complexity (see, e. g., [37]). All
logarithms in this paper are base 2.

Definition 2.1. A (De Morgan) formula (with bounded fan-in) is a binary tree, whose leaves are
labeled with literals from the set {π‘₯ 1 , π‘₯Μ„ 1 , . . . , π‘₯ 𝑛 , π‘₯Μ„ 𝑛 }, and whose internal vertices are labeled as
AND (∧) or OR (∨) gates. The size of a formula πœ™, denoted size(πœ™), is the number of leaves in the
tree. The depth of the formula is the depth of the tree. A formula with unbounded fan-in is defined
similarly, but every internal vertex in the tree can have any number of children. Unless stated
explicitly otherwise, whenever we say β€œformula” we refer to a formula with bounded fan-in.
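Definition 2.1 translates directly into code. The sketch below (our own illustration, not from the paper) represents a De Morgan formula as a binary tree and computes its size (number of leaves) and depth:

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Leaf:
    var: int             # variable index
    negated: bool = False

@dataclass
class Gate:
    op: str              # 'AND' or 'OR'
    left: 'Formula'
    right: 'Formula'

Formula = Union[Leaf, Gate]

def size(phi):
    """size(phi): the number of leaves of the tree."""
    return 1 if isinstance(phi, Leaf) else size(phi.left) + size(phi.right)

def depth(phi):
    return 0 if isinstance(phi, Leaf) else 1 + max(depth(phi.left), depth(phi.right))

def evaluate(phi, x):
    if isinstance(phi, Leaf):
        return bool(x[phi.var]) ^ phi.negated
    l, r = evaluate(phi.left, x), evaluate(phi.right, x)
    return (l and r) if phi.op == 'AND' else (l or r)

# x1 XOR x2 as a size-4, depth-2 De Morgan formula:
# (x1 AND NOT x2) OR (NOT x1 AND x2).
xor = Gate('OR', Gate('AND', Leaf(0), Leaf(1, True)),
                 Gate('AND', Leaf(0, True), Leaf(1)))
print(size(xor), depth(xor), evaluate(xor, [1, 0]))  # 4 2 True
```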


Definition 2.2. A formula πœ™ computes a Boolean function 𝑓 : {0, 1} 𝑛 β†’ {0, 1} in the natural
way. The formula complexity of a Boolean function 𝑓 : {0, 1} 𝑛 β†’ {0, 1}, denoted 𝐿( 𝑓 ), is the size
of the smallest formula that computes 𝑓 . The depth complexity of 𝑓 , denoted 𝐷( 𝑓 ), is the smallest
depth of a formula that computes 𝑓 . For convenience, we define the size and depth of the
constant functions to be zero.
    A basic property of formula complexity is that it is subadditive:
Fact 2.3. For every two functions 𝑓1 , 𝑓2 : {0, 1} 𝑛 β†’ {0, 1} we have 𝐿( 𝑓1 ∧ 𝑓2 ) ≀ 𝐿( 𝑓1 ) + 𝐿( 𝑓2 ) and
𝐿( 𝑓1 ∨ 𝑓2 ) ≀ 𝐿( 𝑓1 ) + 𝐿( 𝑓2 ).
   The following theorem shows that every small formula can be β€œbalanced” to obtain a shallow
formula.
Theorem 2.4 (Formula balancing, [7], following [54, 8]). For every 𝑠 ∈ β„• and 𝛼 > 0, the following
holds: For every formula πœ™ of size 𝑠, there exists an equivalent formula πœ™β€² of depth at most 𝑂(2^{1/𝛼} Β· log 𝑠)
and size at most 𝑠^{1+𝛼} (where the constant inside the big-𝑂 notation does not depend on the choice of 𝛼).
Notation 2.5. With a slight abuse of notation, we will often identify a formula πœ™ with the
function it computes. In particular, the notation 𝐿(πœ™) denotes the formula complexity of the function
computed by πœ™, and not the size of πœ™ (which is denoted by size(πœ™)).
Notation 2.6. Given a Boolean variable 𝑧, we denote by 𝑧^0 and 𝑧^1 the literals 𝑧 and 𝑧̄, respectively.
In other words, 𝑧^𝑏 = 𝑧 βŠ• 𝑏.
Notation 2.7. Given a literal β„“ , we define var(β„“ ) to be the underlying variable, that is, var(𝑧) =
var(𝑧̄) = 𝑧.
Notation 2.8. Let Ξ  be a deterministic communication protocol that takes inputs from π’œ Γ— ℬ,
and recall that the leaves of the protocol induce a partition of π’œ Γ— ℬ into combinatorial rectangles.
For every leaf β„“ of Ξ , we denote by π’œβ„“ Γ— ℬℓ the combinatorial rectangle that is associated with β„“ .
    We use the framework of Karchmer–Wigderson relations [32], which relates the complexity
of 𝑓 to the complexity of a related communication problem KW 𝑓 .
Definition 2.9 ([32]). Let 𝑓 : {0, 1} 𝑛 β†’ {0, 1} be a Boolean function. The Karchmer–Wigderson
relation of 𝑓 , denoted KW 𝑓 , is the following communication problem: The inputs of Alice and
Bob are strings π‘Ž ∈ 𝑓 βˆ’1 (1) and 𝑏 ∈ 𝑓 βˆ’1 (0), respectively, and their goal is to find a coordinate
𝑖 ∈ [𝑛] such that π‘Ž 𝑖 β‰  𝑏 𝑖 . Note that such a coordinate must exist since 𝑓 βˆ’1 (1) ∩ 𝑓 βˆ’1 (0) = βˆ… and
hence π‘Ž β‰  𝑏.
Theorem 2.10 ([32], see also [48]). Let 𝑓 : {0, 1} 𝑛 β†’ {0, 1}. The communication complexity of KW 𝑓
is equal to 𝐷( 𝑓 ), and the minimal number of leaves in a protocol that solves KW 𝑓 is 𝐿( 𝑓 ).
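The KW 𝑓 problem is easy to simulate. In the trivial protocol sketched below (our own illustration) Alice sends her entire input, costing 𝑛 bits; Theorem 2.10 says the interesting measure is the best protocol tree, whose number of leaves equals 𝐿( 𝑓 ):

```python
def solve_kw_trivially(f, a, b):
    """KW_f: Alice holds a in f^-1(1), Bob holds b in f^-1(0); they must
    agree on a coordinate i with a_i != b_i.  In the trivial protocol Alice
    sends all of a, and Bob announces the first differing coordinate."""
    assert f(a) == 1 and f(b) == 0   # a != b is guaranteed, so i exists
    return next(i for i in range(len(a)) if a[i] != b[i])

parity = lambda bits: sum(bits) % 2
print(solve_kw_trivially(parity, (0, 1, 0), (0, 0, 0)))  # 1
```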
   We use the following two standard inequalities.
Fact 2.11 (the AM–GM inequality). If π‘₯, 𝑦 β‰₯ 0 then √(π‘₯ Β· 𝑦) ≀ (π‘₯ + 𝑦)/2.

Fact 2.12 (special case of the Cauchy–Schwarz inequality). For every 𝑑 non-negative real numbers
π‘₯ 1 , . . . , π‘₯ 𝑑 we have √π‘₯ 1 + Β· Β· Β· + √π‘₯ 𝑑 ≀ βˆšπ‘‘ Β· √(π‘₯ 1 + Β· Β· Β· + π‘₯ 𝑑 ).


3     Main results
In this section we define the notions of fixing and hiding projections and state their associated
shrinkage theorems (see Sections 3.1 and 3.2, respectively). We then sketch the proof of our
cubic formula lower bounds for 𝑨π‘ͺ 0 assuming those theorems (see Section 3.3). We start by
defining projections and the relevant notation.

Definition 3.1. Let π‘₯ 1 , . . . , π‘₯ 𝑛 and 𝑦1 , . . . , π‘¦π‘š be Boolean variables. A projection πœ‹ from π‘₯ 1 , . . . , π‘₯ 𝑛
to 𝑦1 , . . . , π‘¦π‘š is a function from the set {π‘₯ 1 , . . . , π‘₯ 𝑛 } to the set {0, 1, 𝑦1 , 𝑦̄1 , . . . , π‘¦π‘š , π‘¦Μ„π‘š }. Given
such a projection πœ‹ and a Boolean function 𝑓 : {0, 1} 𝑛 β†’ {0, 1} over the variables π‘₯ 1 , . . . , π‘₯ 𝑛 , we
denote by 𝑓 | πœ‹ : {0, 1} π‘š β†’ {0, 1} the function obtained from 𝑓 by substituting πœ‹(π‘₯ 𝑖 ) for π‘₯ 𝑖 for
each 𝑖 ∈ [𝑛] in the natural way. Unless stated explicitly otherwise, all projections in this section
are from π‘₯1 , . . . , π‘₯ 𝑛 to 𝑦1 , . . . , π‘¦π‘š , and all functions from {0, 1} 𝑛 to {0, 1} are over the variables
π‘₯ 1 , . . . , π‘₯ 𝑛 . A random projection is a distribution over projections.

Notation 3.2. Let πœ‹ be a projection. For every 𝑗 ∈ [π‘š] and bit 𝜎 ∈ {0, 1}, we denote by πœ‹ 𝑦 𝑗 β†πœŽ the
projection that is obtained from πœ‹ by substituting 𝜎 for 𝑦 𝑗 .

Notation 3.3. With a slight abuse of notation, if a projection πœ‹ maps all the variables π‘₯ 1 , . . . , π‘₯ 𝑛
to constants in {0, 1}, we will sometimes treat it as a binary string in {0, 1} 𝑛 .

3.1    Fixing projections
Intuitively, a π‘ž-fixing projection is a random projection 𝝅 in which for every variable π‘₯ 𝑖 , the
probability that 𝝅 maps a variable π‘₯ 𝑖 to a literal 𝑦 π‘—πœŽ is much smaller than the probability that
𝝅 fixes 𝑦 𝑗 to a constant, regardless of the values that πœ‹ assigns to the other variables. This property is
essentially the minimal property that is required in order to carry out the argument of HΓ₯stad
[23]. Formally, we define π‘ž-fixing projections as follows.

Definition 3.4. Let 0 ≀ π‘ž0 , π‘ž1 ≀ 1. We say that a random projection 𝝅 is a (π‘ž0 , π‘ž1 )-fixing projection
if for every projection πœ‹, every bit 𝜎 ∈ {0, 1}, and every variable π‘₯ 𝑖 , we have

                        Pr[𝝅(π‘₯ 𝑖 ) βˆ‰ {0, 1} and 𝝅_{var(𝝅(π‘₯ 𝑖 ))β†πœŽ} = πœ‹] ≀ π‘ž 𝜎 Β· Pr[𝝅 = πœ‹] .                 (3.1)

For shorthand, we say that 𝝅 is a π‘ž-fixing projection, for π‘ž = √(π‘ž0 π‘ž1 ).

    If needed, one can consider without loss of generality only variables π‘₯ 𝑖 such that πœ‹(π‘₯ 𝑖 ) ∈ {0, 1},
as otherwise Equation (3.1) holds trivially, with the left-hand side equaling zero.

Example 3.5. In order to get intuition for the definition of fixing projections, let us examine how
this definition applies to random restrictions. In our terms, a restriction is a projection from
π‘₯ 1 , . . . , π‘₯ 𝑛 to π‘₯ 1 , . . . , π‘₯ 𝑛 that maps every variable π‘₯ 𝑖 either to itself or to a constant in {0, 1}.
Suppose that 𝝆 is any distribution over restrictions, and that 𝜌 is some fixed restriction. In this
case, the condition of being π‘ž-fixing can be rewritten as follows:

                               Pr[𝝆(π‘₯ 𝑖 ) = π‘₯ 𝑖 and 𝝆_{π‘₯ 𝑖 β†πœŽ} = 𝜌] ≀ π‘ž 𝜎 Β· Pr[𝝆 = 𝜌] .

Denote by 𝝆′, πœŒβ€² the restrictions obtained from 𝝆, 𝜌 by truncating π‘₯ 𝑖 (i. e., 𝝆′ = 𝝆| {π‘₯1 ,...,π‘₯ 𝑛 }βˆ’{π‘₯ 𝑖 } ).
Using this notation, we can rewrite the foregoing equation as

      Pr[𝝆(π‘₯ 𝑖 ) = π‘₯ 𝑖 and 𝝆′ = πœŒβ€² and 𝝆_{π‘₯ 𝑖 β†πœŽ} (π‘₯ 𝑖 ) = 𝜌(π‘₯ 𝑖 )] ≀ π‘ž 𝜎 Β· Pr[𝝆(π‘₯ 𝑖 ) = 𝜌(π‘₯ 𝑖 ) and 𝝆′ = πœŒβ€²] .

Now, observe that it is always the case that 𝝆_{π‘₯ 𝑖 β†πœŽ} (π‘₯ 𝑖 ) = 𝜎, and therefore the probability on the
left-hand side is non-zero only if 𝜌(π‘₯ 𝑖 ) = 𝜎. Hence, we can restrict ourselves to the latter case,
and the foregoing equation can be rewritten again as

                     Pr[𝝆(π‘₯ 𝑖 ) = π‘₯ 𝑖 and 𝝆′ = πœŒβ€²] ≀ π‘ž 𝜎 Β· Pr[𝝆(π‘₯ 𝑖 ) = 𝜎 and 𝝆′ = πœŒβ€²] .

Finally, if we divide both sides by Pr[𝝆′ = πœŒβ€²], we obtain the following intuitive condition:

                            Pr[𝝆(π‘₯ 𝑖 ) = π‘₯ 𝑖 | 𝝆′ = πœŒβ€²] ≀ π‘ž 𝜎 Β· Pr[𝝆(π‘₯ 𝑖 ) = 𝜎 | 𝝆′ = πœŒβ€²] .

This condition informally says the following: 𝝆 is a fixing projection if the probability of leaving
π‘₯ 𝑖 unfixed is at most π‘ž 𝜎 times the probability of fixing it to 𝜎, and this holds regardless of what
the restriction assigns to the other variables.
     In particular, it is now easy to see that the classic random restrictions are fixing projections.
Recall that a 𝑝-random restriction fixes each variable independently with probability 1 βˆ’ 𝑝 to
a random bit. Due to the independence of the different variables, the foregoing condition
simplifies to
                                 Pr[𝝆(π‘₯ 𝑖 ) = π‘₯ 𝑖 ] ≀ π‘ž 𝜎 Β· Pr[𝝆(π‘₯ 𝑖 ) = 𝜎] ,

and it is easy to see that this condition is satisfied for π‘ž0 = π‘ž1 = 2𝑝/(1 βˆ’ 𝑝).
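For a 𝑝-random restriction, the per-coordinate condition Pr[alive] ≀ π‘ž 𝜎 Β· Pr[fixed to 𝜎] reads 𝑝 ≀ π‘ž 𝜎 Β· (1 βˆ’ 𝑝)/2, so π‘ž 𝜎 = 2𝑝/(1 βˆ’ 𝑝) suffices, with equality. A Monte Carlo sanity check of this calculation (our own sketch, not from the paper):

```python
import random

rng = random.Random(0)
p, N = 0.2, 200_000
alive, fixed = 0, [0, 0]
for _ in range(N):
    # Sample one coordinate of a p-random restriction.
    if rng.random() < p:
        alive += 1                    # x_i stays alive
    else:
        fixed[rng.randint(0, 1)] += 1  # x_i fixed to a random bit sigma

q = 2 * p / (1 - p)  # the claimed q_0 = q_1 for p-random restrictions
lhs = alive / N                           # estimates Pr[rho(x_i) = x_i] = p
rhs = [q * fixed[s] / N for s in (0, 1)]  # estimates q_sigma * Pr[rho(x_i) = sigma]
print(lhs, rhs)  # all three values concentrate around p = 0.2
```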


We prove the following shrinkage theorem for π‘ž-fixing projections, which is analogous to the
shrinkage theorem of [23] for random restrictions in the case of balanced formulas.

Theorem 3.6 (Shrinkage under fixing projections). Let πœ™ be a formula of size 𝑠 and depth 𝑑, and let
𝝅 be a π‘ž-fixing projection. Then

                                  𝔼[𝐿(πœ™| 𝝅 )] = 𝑂(π‘ž^2 Β· 𝑑^2 Β· 𝑠 + π‘ž Β· βˆšπ‘ ) .

We prove Theorem 3.6 in Section 4. It is instructive to compare our shrinkage theorem to the
shrinkage theorem by HΓ₯stad [23] for unbalanced formulas:
Theorem 3.7 ([23]). Let πœ™ be a formula of size 𝑠, and let 𝝆 be a 𝑝-random restriction. Then

                     𝔼[𝐿(πœ™| 𝝆 )] = 𝑂(𝑝^2 Β· (1 + log^{3/2} min{1/𝑝, 𝑠}) Β· 𝑠 + 𝑝 Β· βˆšπ‘ ) .
    Our Theorem 3.6 has somewhat worse parameters compared to Theorem 3.7: specifically,
the factor of 𝑑 2 does not appear in Theorem 3.7. The reason is that the proof of [23] uses a fairly
complicated case analysis in order to avoid losing that factor, and we chose to skip this analysis
in order to obtain a simpler proof. We did not check if the factor of 𝑑 2 in our result can be
avoided by using a similar case analysis. By applying formula balancing (Theorem 2.4) to our
shrinkage theorem, we can obtain the following result, which is independent of the depth of the
formula.


Corollary 3.8. Let 𝑓 : {0, 1} 𝑛 β†’ {0, 1} be a function with formula complexity 𝑠, and let 𝝅 be a π‘ž-fixing
projection. Then

                           𝔼[𝐿( 𝑓 | 𝝅 )] = π‘ž^2 Β· 𝑠^{1+𝑂(1/√log 𝑠)} + π‘ž Β· 𝑠^{1/2+𝑂(1/√log 𝑠)} .

Proof. By assumption, there exists a formula πœ™ of size 𝑠 that computes 𝑓 . We balance the
formula πœ™ by applying Theorem 2.4 with 𝛼 = 1/√log 𝑠, and obtain a new formula πœ™β€² that
computes 𝑓 and has size 𝑠^{1+1/√log 𝑠} and depth 𝑂(2^{√log 𝑠} Β· log 𝑠) = 𝑠^{𝑂(1/√log 𝑠)} . The required
result now follows by applying Theorem 3.6 to πœ™β€².                                              β–‘

3.2   Hiding projections
Intuitively, a hiding projection is a random projection 𝝅 in which, given 𝝅_{y_j←σ}, it is hard
to tell in which locations the variable y_j appears in 𝝅. Formally, we define q-hiding projections
as follows.

Definition 3.9. Let 0 ≤ q_0, q_1 ≤ 1. We say that a random projection 𝝅 is a (q_0, q_1)-hiding projection
if for every projection π, every bit σ ∈ {0,1}, and all variables x_i, y_j, we have

                      Pr[ 𝝅(x_i) ∈ {y_j, ¬y_j} | 𝝅_{y_j←σ} = π ] ≤ q_σ,

whenever the event conditioned on has positive probability. For shorthand, we say that 𝝅 is a
q-hiding projection, for q = √(q_0 · q_1).

To illustrate the definition, consider the following natural random restriction: given n variables
x_1, . . . , x_n, the restriction chooses a set of m variables uniformly at random, and fixes all the
other variables to random bits. This restriction is not captured by the notion of p-random
restrictions or by fixing projections, but as we demonstrate next, it can be implemented by
hiding projections. We start with the simple case of m = 1, and then consider the general case.

Example 3.10. In order to implement the case of m = 1, consider the random projection 𝝅
from x_1, . . . , x_n to y that is defined as follows: the projection 𝝅 chooses an index 𝒊 ∈ [n] and
a bit 𝝉 ∈ {0,1} uniformly at random, sets 𝝅(x_𝒊) = y^𝝉, and sets 𝝅(x_{i′}) to a random bit for all
i′ ∈ [n] \ {𝒊}. It is clear that 𝝅 is essentially equivalent to the random restriction described
above for m = 1. We claim that 𝝅 is a 1/n-hiding projection. To see this, observe that for every
bit σ ∈ {0,1}, the projection 𝝅_{y←σ} is a uniformly distributed string in {0,1}^n, and moreover, this
is true conditioned on any possible value of 𝒊. In particular, the random variable 𝒊 is independent
of 𝝅_{y←σ}. Therefore, for every projection π ∈ {0,1}^n and index i ∈ [n] we have

        Pr[ 𝝅(x_i) ∈ {y, ¬y} | 𝝅_{y←σ} = π ] = Pr[ 𝒊 = i | 𝝅_{y←σ} = π ] = Pr[𝒊 = i] = 1/n,

so 𝝅 satisfies the definition of a (q_0, q_1)-hiding projection with q_0 = q_1 = 1/n. Intuitively, given
𝝅_{y←σ}, one cannot guess the original location of y in 𝝅 better than a random guess.


Example 3.11. We turn to consider the case of a general m ∈ ℕ. Let 𝝅 be the random projection
from x_1, . . . , x_n to y_1, . . . , y_m that is defined as follows: the projection 𝝅 chooses m distinct
indices 𝒊_1, . . . , 𝒊_m ∈ [n] and m bits 𝝉_1, . . . , 𝝉_m ∈ {0,1} uniformly at random, sets 𝝅(x_{𝒊_j}) = y_j^{𝝉_j} for
every j ∈ [m], and sets all the other variables x_i to random bits. It is clear that 𝝅 is essentially
equivalent to the random restriction described above. We show that 𝝅 is a 1/(n−m+1)-hiding
projection. To this end, we should show that for every i ∈ [n], j ∈ [m], and σ ∈ {0,1} we have

                      Pr[ 𝝅(x_i) ∈ {y_j, ¬y_j} | 𝝅_{y_j←σ} = π ] ≤ 1/(n−m+1).

For simplicity of notation, we focus on the case where j = m. Observe that 𝝅_{y_m←σ} reveals
the values of 𝒊_1, . . . , 𝒊_{m−1}, and the projection of 𝝅_{y_m←σ} on the remaining n−m+1 indices
is uniform over {0,1}^{n−m+1}. Moreover, the latter assertion remains true conditioned on any
possible value of 𝒊_m (which must be different from the known values of 𝒊_1, . . . , 𝒊_{m−1}). Therefore,
given 𝒊_1, . . . , 𝒊_{m−1}, the random variable 𝒊_m (ranging over all indices other than 𝒊_1, . . . , 𝒊_{m−1})
is independent of the projection of 𝝅_{y_m←σ} on the n−m+1 indices other than 𝒊_1, . . . , 𝒊_{m−1}.
It follows that for every projection π in the support of 𝝅_{y_m←σ} and every index i ∈ [n], if
𝝅(x_{i_1}) = y_1^{𝝉_1}, . . . , 𝝅(x_{i_{m−1}}) = y_{m−1}^{𝝉_{m−1}}, then:

        Pr[ 𝝅(x_i) ∈ {y_m, ¬y_m} | 𝝅_{y_m←σ} = π ] = Pr[ 𝒊_m = i | 𝝅_{y_m←σ} = π ]
                                                   = Pr[ 𝒊_m = i | 𝒊_1 = i_1, . . . , 𝒊_{m−1} = i_{m−1} ] ≤ 1/(n−m+1),

as required.
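As a sanity check on Example 3.11 (again an illustrative sketch with our own encoding of projections, not part of the proof), one can enumerate all outcomes for small n and m and confirm the bound 1/(n − m + 1) on the conditional probability:

```python
from itertools import product, permutations
from fractions import Fraction
from collections import Counter

n, m = 4, 2

# All outcomes of the projection of Example 3.11 for m = 2: ordered distinct
# indices (i1, i2), bits (t1, t2) with pi(x_{i_j}) = y_j^{t_j}, and random
# bits for the remaining n - m variables.
outcomes = []
for i1, i2 in permutations(range(n), 2):
    others = [h for h in range(n) if h not in (i1, i2)]
    for t1, t2, b in product((0, 1), (0, 1), product((0, 1), repeat=n - m)):
        outcomes.append((i1, i2, t1, t2, dict(zip(others, b))))

for sigma in (0, 1):
    joint, marginal = Counter(), Counter()
    for i1, i2, t1, t2, fixed in outcomes:
        pi = [None] * n
        pi[i1] = ('lit', 1, t1)                    # still maps to the literal y_1^{t1}
        pi[i2] = ('bit', 1 if sigma == t2 else 0)  # y_2 <- sigma has been substituted
        for h, bit in fixed.items():
            pi[h] = ('bit', bit)
        pi = tuple(pi)
        joint[(pi, i2)] += 1
        marginal[pi] += 1
    # Pr[i_2 = i | pi_{y_2 <- sigma} = pi] <= 1/(n - m + 1) for all pi and i.
    assert all(Fraction(joint[(pi, i)], marginal[pi]) <= Fraction(1, n - m + 1)
               for (pi, i) in joint)
```

Here the bound is in fact attained with equality: given 𝝅_{y_2←σ}, the index 𝒊_2 is uniform over the n − m + 1 positions not holding the literal of y_1.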
    In Section 5, we prove the following shrinkage theorem for hiding projections.
Theorem 3.12 (Shrinkage under hiding projections). Let φ be a formula of size s and depth d, and
let 𝝅 be a q-hiding projection to the variables y_1, . . . , y_m. Then

                      E[L(φ|𝝅)] = O( m^4 · q^2 · d^2 · s + m^2 · q · √s ).


    Applying formula balancing, we can obtain an analog of Corollary 3.8, with an identical
proof.

Corollary 3.13. Let f : {0,1}^n → {0,1} be a function with formula complexity s, and let 𝝅 be a
q-hiding projection. Then

                      E[L(f|𝝅)] = m^4 · q^2 · s^{1 + O(1/√log s)} + m^2 · q · s^{1/2 + O(1/√log s)}.

Remark 3.14. Note that Theorem 3.12 loses a factor of m^4 compared to Theorem 3.6. While this
loss might be large in general, it will be of no importance in our applications. The factor m^4 can
be improved to m^2 in our actual applications, as discussed in Section 5.1.
    The following example shows that the loss of a factor of at least m^2 is necessary for
Theorem 3.12. Let f = Parity_n be the Parity function over x_1, . . . , x_n, and note that the formula
complexity of f is s = Θ(n^2). Let 𝝅 be the random projection from Example 3.11, and recall
that it is a q-hiding projection for q = 1/(n−m+1). Then, f|𝝅 is the Parity function over m bits, and
therefore its formula complexity is Θ(m^2). It follows that when m ≤ n/2,

                      E[L(f|𝝅)] = Θ(m^2) = Θ(m^2 · q^2 · s).
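For completeness, the arithmetic behind the last equality (our own expansion of the stated parameters):

```latex
% With s = \Theta(n^2) and q = 1/(n-m+1), the assumption m \le n/2 gives n-m+1 = \Theta(n), so
q^2 \cdot s \;=\; \frac{\Theta(n^2)}{(n-m+1)^2} \;=\; \Theta(1),
\qquad\text{hence}\qquad
\Theta(m^2 \cdot q^2 \cdot s) \;=\; \Theta(m^2).
```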

3.3     Sketch: Cubic formula lower bound for AC^0

We now sketch how our shrinkage theorems can be used to prove the cubic formula lower
bound for AC^0. We start by sketching how our shrinkage theorems can be used to reprove a
quadratic formula lower bound for the surjectivity function Surj (originally due to [6]). Then,
we sketch a lower bound for compositions of the form f ◇ Surj, where f is an arbitrary function.
Finally, we combine the latter bound with an idea of Andreev [4] to obtain a cubic lower bound
for an explicit function in AC^0. We note that the first step is the key idea of the proof, whereas
the two latter steps are relatively standard. A formal proof following this sketch can be found in
Section 8.

3.3.1    Lower bound for the surjectivity function

The surjectivity function Surj_n : {0,1}^n → {0,1} [6] takes as input a list of r numbers in [S] for
some S ∈ ℕ, encoded in binary, and outputs 1 if and only if all the numbers in [S] appear in
the list. For our purposes, we write S′ = S − 2 and set r = (3/2)·S′ + 2. We sketch a proof that
L(Surj_n) ≈ Ω(S^2). Since the input length of Surj_n, when measured in bits, is n = O(S log S), this
implies that L(Surj_n) ≈ Ω(n^2 / log^2 n).
    To this end, we construct a random projection 𝝅 for Surj_n that maps the input variables
x_1, . . . , x_n to a single variable y as follows: We choose two distinct numbers 𝒊, 𝒋 ∈ [S] at
random, and then partition the remaining S − 2 numbers at random to two sets 𝑨 and 𝑩 of size
(1/2)(S − 2) = (1/2)S′ each. Then, we create a list of r = (3/2)·S′ + 2 numbers in [S] as follows:


   1. Initially, all the places in the list are vacant.

   2. We put each of the (1/2)S′ + 1 numbers in 𝑨 ∪ {𝒋} at a single random vacant place in the list.

   3. We put each of the (1/2)S′ numbers in 𝑩 in two random vacant places in the list (for a total of
      S′ places).

   4. Note that so far we filled (3/2)S′ + 1 = r − 1 vacant places in the list. The remaining vacant
      place in the list is left empty.
We set 𝝅 such that it fixes all the input variables x_1, . . . , x_n to constants according to the list
we chose, except for the variables corresponding to the empty place in the list. Then, the
⌈log S⌉ binary input variables that correspond to the empty place are mapped to {0, 1, y, ¬y} such
that if we substitute y = 1 we get the number 𝒊 and if we substitute y = 0 we get the number 𝒋.
Observe that if y = 1, then the numbers in 𝑨 ∪ {𝒊, 𝒋} appear in the list exactly once and the
numbers in 𝑩 appear exactly twice. On the other hand, if y = 0, then the numbers in 𝑨 appear
in the list exactly once, the numbers in 𝑩 ∪ {𝒋} appear in the list exactly twice, and 𝒊 does not
appear at all.
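The list construction above can be sketched in code. This is an illustrative implementation under our own naming (and it assumes S is even, so that S′ = S − 2 splits into two equal halves); substituting y = 1 or y = 0 into the empty place and testing surjectivity reproduces the behavior described in the text.

```python
import random

def sample_projection_list(S, seed=0):
    """Sample the list of Section 3.3.1: S' = S - 2, r = (3/2)S' + 2 places,
    one of which is left empty (it corresponds to the variable y)."""
    rng = random.Random(seed)
    Sp = S - 2                                 # assumes S is even, so Sp/2 is an integer
    r = 3 * Sp // 2 + 2
    i, j = rng.sample(range(1, S + 1), 2)      # the two special numbers
    rest = [v for v in range(1, S + 1) if v not in (i, j)]
    rng.shuffle(rest)
    A, B = rest[:Sp // 2], rest[Sp // 2:]      # random halves of the remaining numbers
    places = list(range(r))
    rng.shuffle(places)
    lst, vacant = [None] * r, iter(places)
    for v in A + [j]:                          # each number of A ∪ {j}: one place
        lst[next(vacant)] = v
    for v in B:                                # each number of B: two places
        lst[next(vacant)] = v
        lst[next(vacant)] = v
    empty = next(vacant)                       # the single remaining place stays empty
    return lst, empty, i, j

S = 10
lst, empty, i, j = sample_projection_list(S)
for y, number in ((1, i), (0, j)):   # the empty place reads i if y = 1, j if y = 0
    filled = [number if p == empty else v for p, v in enumerate(lst)]
    # Surjectivity of the filled list equals the value of y.
    assert (set(filled) == set(range(1, S + 1))) == bool(y)
```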
    Observe that Surj_n|𝝅 = y. Indeed, if y = 1 then the foregoing list of numbers contains all
the numbers in [S] and thus Surj_n takes the value 1, and if y = 0 then the list of numbers does
not contain 𝒊 and thus Surj_n takes the value 0. We claim that 𝝅 is an O(1/S)-hiding projection.
Intuitively, if we substitute a value σ ∈ {0,1} in y, then given the resulting projection 𝝅_{y←σ} it is
difficult to tell which input variables were mapped to y in the original projection 𝝅. The reason
is that in order to tell that, one has to guess the empty place in the original list. To this end,
one has to guess 𝒊 among the ≈ (1/2)S′ numbers that appear exactly once in the list (if σ = 1) or to
guess 𝒋 among the ≈ (1/2)S′ numbers that appear exactly twice in the list (if σ = 0).
    A bit more formally, observe that 𝝅 maps an input variable x_h to {y, ¬y} only if x_h corresponds
to the empty place in the list of 𝝅, and that 𝝅_{y←0} and 𝝅_{y←1} represent lists of r numbers in [S].
Hence, to find an upper bound on the probability that 𝝅(x_h) ∈ {y, ¬y} conditioned on the choice
of 𝝅_{y←σ} (for σ ∈ {0,1}), we need an upper bound on the probability that a place in the list
represented by 𝝅_{y←σ} was the empty place in the list of 𝝅. We consider two cases:
   • If σ = 1, then the empty place in the list of 𝝅 contains 𝒊 in the list of 𝝅_{y←1}. From the
     perspective of someone who only sees 𝝅_{y←1}, the number 𝒊 could be any number among
     the (1/2)S′ + 2 numbers that appear exactly once in the list of 𝝅_{y←1}. Therefore, conditioned
     on 𝝅_{y←1}, the empty place in the list of 𝝅 is uniformly distributed among (1/2)S′ + 2 places in the
     list of 𝝅_{y←1}. Hence, any place in the list of 𝝅_{y←1} has probability of at most
     1/((1/2)S′ + 2) = O(1/S) to be the empty place in 𝝅.

   • On the other hand, if σ = 0, then the empty place in the list of 𝝅 contains 𝒋 in the list
     of 𝝅_{y←0}. From the perspective of someone who only sees 𝝅_{y←0}, the number 𝒋 could be
     any number among the (1/2)S′ + 1 numbers that appear exactly twice in the list of 𝝅_{y←0}.
     Therefore, conditioned on 𝝅_{y←0}, the empty place in the list of 𝝅 is uniformly distributed
     among 2·((1/2)S′ + 1) places in the list of 𝝅_{y←0}. Hence, any place in the list has probability
     of at most 1/(2·((1/2)S′ + 1)) = O(1/S) to be the empty place in 𝝅.

We showed that 𝝅 is an O(1/S)-hiding projection, and therefore by our shrinkage theorem for
hiding projections (specifically, Corollary 3.13) we get that

                      E[L(Surj_n|𝝅)] ≲ (1/S^2) · L(Surj_n).

(We ignored the power of 1/√(log L(Surj_n)) from Corollary 3.13 in order to simplify the exposition.)
On the other hand, as explained above, we have Surj_n|𝝅 = y, and therefore L(Surj_n|𝝅) = 1. It
follows that L(Surj_n) ≳ S^2, as required.
    We note that the above proof is a special case of a more general phenomenon. Specifically,
in Section 7.1, we show that such a hiding projection can be constructed for every function that
has a large (unweighted) adversary bound. As [6] showed, the surjectivity function has such a
large adversary bound, and therefore we can construct a hiding projection for it.


3.3.2   Lower bound for f ◇ Surj_n

Let f : {0,1}^k → {0,1} be any function, and let Surj_n be defined as in the previous section.
The block-composition of f and Surj_n, denoted f ◇ Surj_n, is the function that takes k inputs
x_1, . . . , x_k ∈ {0,1}^n for Surj_n and outputs

                      (f ◇ Surj_n)(x_1, . . . , x_k) = f(Surj_n(x_1), . . . , Surj_n(x_k)).

We sketch a proof that L(f ◇ Surj_n) ≈ Ω( L(f) · n^2 / log^2 n ). Denote by x_{i,1}, . . . , x_{i,n} the boolean
variables of the input x_i for each i ∈ [k]. Let 𝝅 be the O(1/S)-hiding projection from the previous
section, and let 𝝅^k be the random projection from x_{1,1}, . . . , x_{k,n} to y_1, . . . , y_k that is obtained by
joining k independent copies of 𝝅. In Section 6 we show that the join operation maintains
the hiding property, and therefore 𝝅^k is an O(1/S)-hiding projection. Now, it is not hard to see
that (f ◇ Surj_n)|𝝅^k = f and therefore L((f ◇ Surj_n)|𝝅^k) = L(f). On the other hand, by our shrinkage
theorem for hiding projections (specifically, Corollary 3.13) we get that

                      E[L((f ◇ Surj_n)|𝝅^k)] ≲ (1/S^2) · L(f ◇ Surj_n).
(Again, we ignored the power of 1/√(log L(f ◇ Surj_n)) from Corollary 3.13 in order to simplify the
exposition.) It follows that

                      L(f) ≲ (1/S^2) · L(f ◇ Surj_n),

and therefore

                      L(f ◇ Surj_n) ≈ Ω( L(f) · S^2 ) = Ω̃( L(f) · n^2 ),

as required.

3.3.3   Cubic lower bound for AC^0

Let n = 2^k · (k + 1) and let h_n : {0,1}^n → {0,1} be the function that takes as input the truth table
of a function f : {0,1}^k → {0,1} and k inputs x_1, . . . , x_k for Surj_{2^k}, and outputs

                      h_n(f, x_1, . . . , x_k) = (f ◇ Surj)(x_1, . . . , x_k).

As mentioned above, the function h_n is a variant of the Andreev function in which the parity
function is replaced with Surj. It is not hard to show that h_n is in AC^0 (see Section 8.2 for details).
We sketch a proof that L(h_n) ≈ Ω(n^{3−o(1)}).
    To see that L(h_n) ≈ Ω(n^{3−o(1)}), recall that by a standard counting argument, there exists a
function f : {0,1}^k → {0,1} such that L(f) = Θ(2^k / log k). We now hardwire f as an input to h_n,
which turns the function h_n into the function f ◇ Surj_{2^k}. By the lower bound we have for f ◇ Surj_{2^k},
it follows that

        L(h_n) ≥ L(f ◇ Surj_{2^k}) ≈ Ω̃( L(f) · 2^{2k} ) = Ω̃( (2^k / log k) · 2^{2k} ) = Ω(n^{3−o(1)}),

as required.
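For completeness, the translation between k and n in the last step (our own expansion; recall n = 2^k · (k + 1), so k = (1 − o(1)) log n):

```latex
2^k = \frac{n}{k+1},
\qquad\text{so}\qquad
\frac{2^k}{\log k} \cdot 2^{2k}
  \;=\; \frac{n^3}{(k+1)^3 \log k}
  \;=\; \frac{n^3}{\mathrm{polylog}(n)}
  \;=\; n^{3-o(1)}.
```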


Remark 3.15. Note that in the proof of this result, we did not use the shrinkage theorem for fixing
projections: indeed, we only used the shrinkage theorem for hiding projections. However, the
proof of the shrinkage theorem for hiding projections itself relies on the shrinkage theorem for
fixing projections.


4   Proof of shrinkage theorem for fixing projections

In this section, we prove our shrinkage theorem for fixing projections. Our proof is based on
the ideas of [23], but the presentation is different. We start by recalling the definition of fixing
projections and the theorem's statement.

Definition 3.4. Let 0 ≤ q_0, q_1 ≤ 1. We say that a random projection 𝝅 is a (q_0, q_1)-fixing projection
if for every projection π, every bit σ ∈ {0,1}, and every variable x_i, we have

                      Pr[ 𝝅(x_i) ∉ {0,1} and 𝝅_{var(𝝅(x_i))←σ} = π ] ≤ q_σ · Pr[𝝅 = π].

For shorthand, we say that 𝝅 is a q-fixing projection, for q = √(q_0 · q_1).

Theorem 3.6 (Shrinkage under fixing projections). Let φ be a formula of size s and depth d, and let
𝝅 be a q-fixing projection. Then

                      E[L(φ|𝝅)] = O( q^2 · d^2 · s + q · √s ).


    Fix a formula φ of size s and depth d, and let 𝝅 be a q-fixing projection. We would like to get
an upper bound on the expectation of L(φ|𝝅). As in [23], we start by giving an upper bound on the
probability that the projection 𝝅 shrinks a formula to size 1. Specifically, we prove the following
lemma in Section 4.1.

Lemma 4.1. Let f : {0,1}^n → {0,1} be a Boolean function, and let 𝝅 be a q-fixing projection. Then,

                      Pr[L(f|𝝅) = 1] ≤ q · √(L(f)).

    Next, we show that to find an upper bound on the expectation of L(φ|𝝅), it suffices to
give an upper bound on the probability that the projection 𝝅 shrinks two formulas to size 1
simultaneously. In order to state this claim formally, we introduce some notation.

Notation 4.2. Let g be a gate of φ. We denote the depth of g in φ by depth_φ(g), and omit φ if it is
clear from context (the root has depth 0). If g is an internal gate, we denote the subformulas that
are rooted in its left and right children by left(g) and right(g), respectively. Here, by "internal
gate", we refer to a gate which is an internal node of the tree, i. e., either an AND or an OR gate.

    We prove the following lemma, which says that in order to upper-bound E[L(φ|𝝅)] it suffices
to upper-bound, for every internal gate g, the probability that left(g) and right(g) shrink to size 1
under 𝝅.

                      T HEORY OF C OMPUTING, Volume 19 (7), 2023, pp. 1–51                           15
                                     Y UVAL F ILMUS , O R M EIR , AND AVISHAY TAL

Lemma 4.3. For every projection π it holds that

        L(φ|π) ≤ Σ_{internal gate g of φ} (depth(g) + 2) · 1{L(left(g)|π) = 1 and L(right(g)|π) = 1} + 1{L(φ|π) = 1}.


    We would like to use Lemmas 4.1 and 4.3 to prove the shrinkage theorem. As a warm-up,
let us make the simplifying assumption that for every two functions f_1, f_2 : {0,1}^n → {0,1}, the
events L(f_1|𝝅) = 1 and L(f_2|𝝅) = 1 are independent. If this were true, we could have given an
upper bound on E[L(φ|𝝅)] as follows:

E[L(φ|𝝅)] ≤ Σ_{int. gate g of φ} (depth(g) + 2) · E[ 1{L(left(g)|𝝅) = 1 and L(right(g)|𝝅) = 1} ] + E[ 1{L(φ|𝝅) = 1} ]
                                                                                        (Lemma 4.3)
          ≤ (d + 2) · Σ_{int. gate g of φ} Pr[ L(left(g)|𝝅) = 1 and L(right(g)|𝝅) = 1 ] + E[ 1{L(φ|𝝅) = 1} ]
                                                                                        (φ is of depth d)
          = (d + 2) · Σ_{int. gate g of φ} Pr[L(left(g)|𝝅) = 1] · Pr[L(right(g)|𝝅) = 1] + E[ 1{L(φ|𝝅) = 1} ]
                                                                                        (simplifying assumption)
          ≤ (d + 2) · Σ_{int. gate g of φ} q^2 · √(L(left(g)) · L(right(g))) + q · √s    (Lemma 4.1)
          ≤ q^2 · (d + 2) · Σ_{int. gate g of φ} ( L(left(g)) + L(right(g)) ) + q · √s   (AM–GM inequality)
          ≤ q^2 · (d + 2) · Σ_{int. gate g of φ} ( size(left(g)) + size(right(g)) ) + q · √s
          = q^2 · (d + 2) · Σ_{int. gate g of φ} size(g) + q · √s.

The last sum counts every leaf ℓ of φ once for each internal ancestor of ℓ, so the last expression
is equal to

        q^2 · (d + 2) · Σ_{leaf ℓ of φ} depth(ℓ) + q · √s
          ≤ q^2 · (d + 2) · Σ_{leaf ℓ of φ} d + q · √s
          = q^2 · (d + 2) · d · s + q · √s
          = O( q^2 · d^2 · s + q · √s ),



which is the bound we wanted. However, the above calculation only works under our simplifying
assumption, which is false: the events L(f_1|𝝅) = 1 and L(f_2|𝝅) = 1 will often be dependent. In
particular, in order for the foregoing calculation to work, we need the following inequality to
hold:

                      Pr[ L(f_2|𝝅) = 1 | L(f_1|𝝅) = 1 ] ≤ q · √(L(f_2)).

This inequality holds under our simplifying assumption by Lemma 4.1, but may not hold in
general. Nevertheless, we prove the following similar statement in Section 4.2.

Lemma 4.4. Let 𝝅 be a q-fixing projection. Let f_1, f_2 : {0,1}^n → {0,1}, let σ, τ ∈ {0,1}, and let y_j be
a variable. Then,

                      Pr[ L(f_2|𝝅_{y_j←σ}) = 1 | f_1|𝝅 = y_j^τ ] ≤ q · √(L(f_2)).

    Intuitively, Lemma 4.4 breaks the dependency between the events L(f_1|𝝅) = 1 and L(f_2|𝝅) = 1
by fixing in f_2 the single literal to which f_1 has shrunk. We would now like to use Lemma 4.4 to
prove the theorem. To this end, we prove an appropriate variant of Lemma 4.3, which allows
using the projection π_{y_j←σ} rather than π in the second function. This variant is motivated by
the following "one-variable simplification rules" of [23], which are easy to verify.

Fact 4.5 (one-variable simplification rules). Let h : {0,1}^m → {0,1} be a function over the variables
y_1, . . . , y_m, and let σ ∈ {0,1}. We denote by h_{y_j←σ} the function obtained from h by setting y_j to the
bit σ, and write σ̄ = 1 − σ. Then:

   • The function y_j^σ ∨ h is equal to the function y_j^σ ∨ h_{y_j←σ̄}.

   • The function y_j^σ ∧ h is equal to the function y_j^σ ∧ h_{y_j←σ}.
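The simplification rules can be verified mechanically. The sketch below is illustrative only; it uses the convention that the literal y^σ evaluates to 1 iff y = σ, writes σ̄ = 1 − σ, and checks both rules over all functions h of two variables:

```python
from itertools import product

def literal(y, sigma):
    """The literal y^sigma: evaluates to 1 iff y == sigma."""
    return 1 if y == sigma else 0

# Exhaustively verify the one-variable simplification rules for every
# function h on two variables (y, z), restricting on the variable y.
for sigma in (0, 1):
    for table in product((0, 1), repeat=4):        # all h: {0,1}^2 -> {0,1}
        h = lambda y, z: table[2 * y + z]
        for y, z in product((0, 1), repeat=2):
            # y^sigma OR h   ==  y^sigma OR  h_{y <- 1-sigma}
            assert (literal(y, sigma) | h(y, z)) == (literal(y, sigma) | h(1 - sigma, z))
            # y^sigma AND h  ==  y^sigma AND h_{y <- sigma}
            assert (literal(y, sigma) & h(y, z)) == (literal(y, sigma) & h(sigma, z))
```

The point of the asymmetry: an OR gate is already decided when the literal fires (y = σ), so h only matters on y = σ̄; an AND gate is only undecided when the literal fires, so h only matters on y = σ.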

    In order to use the simplification rules, we define, for every internal gate g of φ and
projection π, an event ℰ_{g,π} as follows: if g is an OR gate, then ℰ_{g,π} is the event that there exists
some literal y_j^σ (for σ ∈ {0,1}) such that left(g)|π = y_j^σ and L(right(g)|π_{y_j←σ̄}) = 1. If g is an AND
gate, then ℰ_{g,π} is defined similarly, except that we replace π_{y_j←σ̄} with π_{y_j←σ}. We have the
following lemma, which is proved in Section 4.3.

Lemma 4.6. For every projection π we have

        L(φ|π) ≤ Σ_{internal gate g of φ} (depth(g) + 2) · 1{ℰ_{g,π}} + 1{L(φ|π) = 1}.


    We can now use the following corollary of Lemma 4.4 to replace our simplifying assumption.

Corollary 4.7. For every internal gate g of φ we have

                      Pr[ℰ_{g,𝝅}] ≤ q^2 · √(L(left(g)) · L(right(g))).




Proof. Let g be an internal gate of φ. We prove the corollary for the case where g is an OR gate,
and the proof for the case that g is an AND gate is similar. We have

Pr[ℰ_{g,𝝅}] = Pr[ ∃ literal y_j^σ : left(g)|𝝅 = y_j^σ and L(right(g)|𝝅_{y_j←σ̄}) = 1 ]
           = Σ_{literal y_j^σ} Pr[ left(g)|𝝅 = y_j^σ and L(right(g)|𝝅_{y_j←σ̄}) = 1 ]
           = Σ_{literal y_j^σ} Pr[ L(right(g)|𝝅_{y_j←σ̄}) = 1 | left(g)|𝝅 = y_j^σ ] · Pr[ left(g)|𝝅 = y_j^σ ]
           ≤ q · √(L(right(g))) · Σ_{literal y_j^σ} Pr[ left(g)|𝝅 = y_j^σ ]              (Lemma 4.4)
           = q · √(L(right(g))) · Pr[L(left(g)|𝝅) = 1]
           ≤ q^2 · √(L(left(g)) · L(right(g))),                                          (Lemma 4.1)

as required.                                                                             □

    The shrinkage theorem now follows using the same calculation as above, replacing Lemma 4.3
with Lemma 4.6 and the simplifying assumption with Corollary 4.7:

  𝔼[𝐿(πœ™|𝝅)] ≀ (𝑑 + 2) Β· βˆ‘_{internal gate 𝑔 of πœ™} Pr[ℰ𝑔,𝝅] + π‘ž Β· βˆšπ‘                                          (Lemma 4.6)
            ≀ π‘žΒ² Β· (𝑑 + 2) Β· βˆ‘_{internal gate 𝑔 of πœ™} √(𝐿(left(𝑔)) Β· 𝐿(right(𝑔))) + π‘ž Β· βˆšπ‘                 (Corollary 4.7)
            ≀ π‘žΒ² Β· (𝑑 + 2) Β· βˆ‘_{internal gate 𝑔 of πœ™} (𝐿(left(𝑔)) + 𝐿(right(𝑔))) + π‘ž Β· βˆšπ‘                  (AM–GM ineq.)
            = π‘žΒ² Β· (𝑑 + 2) Β· βˆ‘_{internal gate 𝑔 of πœ™} size(𝑔) + π‘ž Β· βˆšπ‘ 
            ≀ 𝑂(π‘žΒ² Β· 𝑑² Β· 𝑠 + π‘ž Β· βˆšπ‘ ).

In the remainder of this section, we prove Lemmas 4.1, 4.4, and 4.6.

Remark 4.8. In this paper, we do not prove Lemma 4.3, since we do not actually need it to
prove our main results in Section 8. However, this lemma can be established using the proof of
Lemma 4.6, with some minor changes.

4.1       Proof of Lemma 4.1
We begin with proving Lemma 4.1, restated next.


Lemma 4.1. Let 𝑓 : {0, 1}^𝑛 β†’ {0, 1} be a Boolean function, and let 𝝅 be a π‘ž-fixing projection. Then,

                 Pr[𝐿(𝑓|𝝅) = 1] ≀ π‘ž Β· √𝐿(𝑓).


    Let 𝑓 : {0, 1}^𝑛 β†’ {0, 1}, and let β„° be the set of projections πœ‹ such that 𝐿(𝑓|πœ‹) = 1. We prove
that the probability that 𝝅 ∈ β„° is at most π‘ž Β· √𝐿(𝑓). Our proof follows closely the proof of [23,
Lemma 4.1].
    Let Ξ  be a protocol that solves KW 𝑓 and has 𝐿( 𝑓 ) leaves (such a protocol exists by Theo-
rem 2.10). Let π’œ and ℬ be the sets of projections πœ‹ for which 𝑓 | πœ‹ is the constants 1 and 0,
respectively. We extend the protocol Ξ  to take inputs from π’œ Γ— ℬ as follows: when Alice and
Bob are given as inputs the projections πœ‹π΄ ∈ π’œ and πœ‹π΅ ∈ ℬ, respectively, they construct strings
π‘Ž, 𝑏 ∈ {0, 1} 𝑛 from πœ‹π΄ , πœ‹π΅ by substituting 0 for all the variables 𝑦1 , . . . , π‘¦π‘š , and invoke Ξ  on the
inputs π‘Ž and 𝑏. Observe that π‘Ž and 𝑏 are indeed legal inputs for Ξ  (since 𝑓 (π‘Ž) = 1 and 𝑓 (𝑏) = 0).
Moreover, recall that the protocol Ξ  induces a partition of π’œ Γ— ℬ to combinatorial rectangles,
and that we denote the rectangle of the leaf β„“ by π’œβ„“ Γ— ℬℓ (see Notation 2.8).
       Our proof strategy is the following: We associate with every projection πœ‹ ∈ β„° a leaf of Ξ ,
denoted leaf(πœ‹). We consider the two disjoint events β„° + , β„° βˆ’ that correspond to the event that
 𝑓 | πœ‹ is a single positive literal or a single negative literal, respectively, and show that for every
leaf β„“ we have

                 Pr[𝝅 ∈ β„°^+ and leaf(𝝅) = β„“] ≀ π‘ž Β· √(Pr[𝝅 ∈ π’œβ„“] Β· Pr[𝝅 ∈ ℬℓ])          (4.1)
                 Pr[𝝅 ∈ β„°^βˆ’ and leaf(𝝅) = β„“] ≀ π‘ž Β· √(Pr[𝝅 ∈ π’œβ„“] Β· Pr[𝝅 ∈ ℬℓ]).         (4.2)


Together, the two inequalities imply that

                 Pr[𝝅 ∈ β„° and leaf(𝝅) = β„“] ≀ 2π‘ž Β· √(Pr[𝝅 ∈ π’œβ„“] Β· Pr[𝝅 ∈ ℬℓ]).


The desired bound on Pr[𝝅 ∈ β„°] will follow by summing the latter bound over all the leaves β„“
of Ξ .
    We start by explaining how to associate a leaf with every projection πœ‹ ∈ β„°^+. Let πœ‹ ∈ β„°^+.
Then, it must be the case that 𝑓|πœ‹ = 𝑦_𝑗 for some 𝑗 ∈ [π‘š]. We define the projections πœ‹1 = πœ‹_{𝑦_𝑗←1}
and πœ‹0 = πœ‹_{𝑦_𝑗←0}, and observe that πœ‹1 ∈ π’œ and πœ‹0 ∈ ℬ. We now define leaf(πœ‹) to be the leaf at
which Ξ  arrives when invoked on the inputs πœ‹1 and πœ‹0. Observe that the output of Ξ  at leaf(πœ‹)
must be a variable π‘₯_𝑖 that satisfies πœ‹(π‘₯_𝑖) ∈ {𝑦_𝑗, ¬𝑦_𝑗}, and thus πœ‹_{var(πœ‹(π‘₯_𝑖))←1} = πœ‹1.
    Next, fix a leaf β„“. We prove that Pr[𝝅 ∈ β„°^+ and leaf(𝝅) = β„“] ≀ π‘ž1 Β· Pr[𝝅 ∈ π’œβ„“]. Let π‘₯_𝑖 be the


output of the protocol Ξ  at β„“. Then,

  Pr[𝝅 ∈ β„°^+ and leaf(𝝅) = β„“] ≀ Pr[𝝅1 ∈ π’œβ„“ and 𝝅(π‘₯_𝑖) βˆ‰ {0, 1} and 𝝅_{var(𝝅(π‘₯_𝑖))←1} = 𝝅1]
                              ≀ Pr[𝝅(π‘₯_𝑖) βˆ‰ {0, 1} and 𝝅_{var(𝝅(π‘₯_𝑖))←1} ∈ π’œβ„“]
                              = βˆ‘_{πœ‹ ∈ π’œβ„“} Pr[𝝅(π‘₯_𝑖) βˆ‰ {0, 1} and 𝝅_{var(𝝅(π‘₯_𝑖))←1} = πœ‹]
                              ≀ π‘ž1 Β· βˆ‘_{πœ‹ ∈ π’œβ„“} Pr[𝝅 = πœ‹]                  (since 𝝅 is (π‘ž0, π‘ž1)-fixing)
                              = π‘ž1 Β· Pr[𝝅 ∈ π’œβ„“].

Similarly, it can be proved that Pr[𝝅 ∈ β„° + and leaf(𝝅) = β„“ ] ≀ π‘ž 0 Β· Pr[𝝅 ∈ ℬℓ ]. Together, the two
bounds imply that

  Pr[𝝅 ∈ β„°^+ and leaf(𝝅) = β„“] ≀ √(π‘ž1 Pr[𝝅 ∈ π’œβ„“] Β· π‘ž0 Pr[𝝅 ∈ ℬℓ]) = π‘ž Β· √(Pr[𝝅 ∈ π’œβ„“] Β· Pr[𝝅 ∈ ℬℓ])

for every leaf β„“ of Ξ . We define leaf(πœ‹) for projections πœ‹ ∈ β„° βˆ’ in an analogous way, and then a
similar argument shows that

                 Pr[𝝅 ∈ β„°^βˆ’ and leaf(𝝅) = β„“] ≀ π‘ž Β· √(Pr[𝝅 ∈ π’œβ„“] Β· Pr[𝝅 ∈ ℬℓ]).

It follows that
                 Pr[𝝅 ∈ β„° and leaf(𝝅) = β„“] ≀ 2π‘ž Β· √(Pr[𝝅 ∈ π’œβ„“] Β· Pr[𝝅 ∈ ℬℓ]).
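The way two linear bounds π‘ž1 Β· π‘Ž and π‘ž0 Β· 𝑏 on the same probability are merged above into the single geometric-mean bound π‘ž Β· √(π‘Žπ‘) uses min(𝑒, 𝑣) ≀ βˆšπ‘’π‘£. A quick numerical check (hypothetical random values, not from the paper):

```python
import math
import random

rng = random.Random(1)
# If P <= q1 * A and P <= q0 * B, then P <= sqrt(q1 * A * q0 * B):
# the minimum of two nonnegative quantities is at most their geometric mean.
for _ in range(1000):
    q0, q1, A, B = (rng.random() for _ in range(4))
    P = min(q1 * A, q0 * B) * rng.random()   # any P satisfying both bounds
    assert P <= math.sqrt(q0 * q1 * A * B) + 1e-12
```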

Finally, let β„’ denote the set of leaves of Ξ . It holds that

  Pr[𝝅 ∈ β„°] = βˆ‘_{β„“ ∈ β„’} Pr[𝝅 ∈ β„° and leaf(𝝅) = β„“]
            ≀ 2π‘ž Β· βˆ‘_{β„“ ∈ β„’} √(Pr[𝝅 ∈ π’œβ„“] Β· Pr[𝝅 ∈ ℬℓ])
            ≀ 2π‘ž Β· √|β„’| Β· √(βˆ‘_{β„“ ∈ β„’} Pr[𝝅 ∈ π’œβ„“] Β· Pr[𝝅 ∈ ℬℓ])          (Cauchy–Schwarz – see Fact 2.12)
            = 2π‘ž Β· √𝐿(𝑓) Β· √(βˆ‘_{β„“ ∈ β„’} Pr[𝝅 ∈ π’œβ„“] Β· Pr[𝝅 ∈ ℬℓ]).


    We conclude the proof by showing that βˆ‘_{β„“ ∈ β„’} Pr[𝝅 ∈ π’œβ„“] Β· Pr[𝝅 ∈ ℬℓ] ≀ 1/4. To this end, let 𝝅^𝐴, 𝝅^𝐡
be two independent random variables that are distributed identically to 𝝅. Then, we have
  βˆ‘_{β„“ ∈ β„’} Pr[𝝅 ∈ π’œβ„“] Β· Pr[𝝅 ∈ ℬℓ] = βˆ‘_{β„“ ∈ β„’} Pr[𝝅^𝐴 ∈ π’œβ„“] Β· Pr[𝝅^𝐡 ∈ ℬℓ]
                                     = βˆ‘_{β„“ ∈ β„’} Pr[(𝝅^𝐴, 𝝅^𝐡) ∈ π’œβ„“ Γ— ℬℓ]          (𝝅^𝐴, 𝝅^𝐡 are independent)
                                     = Pr[(𝝅^𝐴, 𝝅^𝐡) ∈ π’œ Γ— ℬ]          (the π’œβ„“ Γ— ℬℓ are a partition of π’œ Γ— ℬ)
                                     = Pr[𝝅^𝐴 ∈ π’œ] Β· Pr[𝝅^𝐡 ∈ ℬ]          (𝝅^𝐴, 𝝅^𝐡 are independent)
                                     = Pr[𝝅 ∈ π’œ] Β· Pr[𝝅 ∈ ℬ]
                                     ≀ Pr[𝝅 ∈ π’œ] Β· (1 βˆ’ Pr[𝝅 ∈ π’œ]) ≀ 1/4.          (π’œ, ℬ are disjoint)  β–‘



4.2      Proof of Lemma 4.4
We turn to proving Lemma 4.4, restated next.

Lemma 4.4. Let 𝝅 be a π‘ž-fixing projection. Let 𝑓1, 𝑓2 : {0, 1}^𝑛 β†’ {0, 1}, let 𝜎, 𝜏 ∈ {0, 1}, and let 𝑦_𝑗 be
a variable. Then,
                 Pr[𝐿(𝑓2|𝝅_{𝑦_π‘—β†πœŽ}) = 1 | 𝑓1|𝝅 = 𝑦_𝑗^𝜏] ≀ π‘ž Β· √𝐿(𝑓2).          (4.3)
    Let 𝝅 be a (π‘ž0, π‘ž1)-fixing random projection, and let π‘ž = √(π‘ž0 π‘ž1). Let 𝑓1, 𝑓2 : {0, 1}^𝑛 β†’ {0, 1},
let 𝜎, 𝜏 ∈ {0, 1}, and let 𝑦_𝑗 be a variable. We prove Equation 4.3. For simplicity, we focus on
the case that 𝑓1|𝝅 = 𝑦_𝑗; the case that 𝑓1|𝝅 = ¬𝑦_𝑗 can be dealt with similarly. The crux of the
proof is to show that the random projection 𝝅_{𝑦_π‘—β†πœŽ} is essentially a (π‘ž0, π‘ž1)-fixing projection even
when conditioned on the event 𝑓1|𝝅 = 𝑦_𝑗, and therefore Equation 4.3 is implied immediately by
Lemma 4.1.
    In order to carry out this argument, we first establish some notation. Let 𝑰^+ = 𝝅^{βˆ’1}(𝑦_𝑗)
and 𝑰^βˆ’ = 𝝅^{βˆ’1}(¬𝑦_𝑗), and denote by 𝝅0 the projection obtained from 𝝅 by restricting its domain
to {π‘₯1, . . . , π‘₯𝑛} \ (𝑰^+ βˆͺ 𝑰^βˆ’). We denote by 𝑓2,𝑰^+,𝑰^βˆ’ the function over {π‘₯1, . . . , π‘₯𝑛} \ (𝑰^+ βˆͺ 𝑰^βˆ’) that is
obtained from 𝑓2 by hard-wiring 𝜎 and 1βˆ’πœŽ to the variables in 𝑰^+ and 𝑰^βˆ’, respectively. Observe that
𝑓2|𝝅_{𝑦_π‘—β†πœŽ} = 𝑓2,𝑰^+,𝑰^βˆ’|𝝅0, so it suffices to prove that for every two disjoint sets 𝐼^+, 𝐼^βˆ’ βŠ† {π‘₯1, . . . , π‘₯𝑛}
we have

                 Pr[𝐿(𝑓2,𝐼^+,𝐼^βˆ’|𝝅0) = 1 | 𝑓1|𝝅 = 𝑦_𝑗, 𝑰^+ = 𝐼^+, 𝑰^βˆ’ = 𝐼^βˆ’] ≀ π‘ž Β· √𝐿(𝑓2).          (4.4)

Let 𝐼 + , 𝐼 βˆ’ βŠ† {π‘₯ 1 , . . . , π‘₯ 𝑛 } be disjoint sets, and let ℐ be the event that 𝑰+ = 𝐼 + and 𝑰 βˆ’ = 𝐼 βˆ’ . For
convenience, let 𝐾 = {π‘₯ 1 , . . . , π‘₯ 𝑛 } \(𝐼 + βˆͺ 𝐼 βˆ’ ) and π‘Œ = {𝑦1 , . . . , π‘¦π‘š } \ 𝑦 𝑗 , so 𝝅0 is a random
projection from 𝐾 to π‘Œ when conditioned on ℐ. To prove Equation 4.4, it suffices to prove that
𝝅0 is a (π‘ž0 , π‘ž1 )-fixing projection when conditioned on the events ℐ and 𝑓1 | 𝝅 = 𝑦 𝑗 , and then the
inequality will follow from Lemma 4.1. We first prove that 𝝅0 is a (π‘ž0 , π‘ž1 )-fixing projection when
conditioning only on the event ℐ (and not on 𝑓1 | 𝝅 = 𝑦 𝑗 ).

Proposition 4.9. Conditioned on the event ℐ, the projection 𝝅0 is a (π‘ž 0 , π‘ž1 )-fixing projection.


Proof. We prove that 𝝅0 satisfies the definition of a fixing projection. Let πœ‹0 be a projection from
𝐾 to π‘Œ, let π‘₯_𝑖 ∈ 𝐾, and let 𝜎 ∈ {0, 1}. We have

  Pr[𝝅0(π‘₯_𝑖) βˆ‰ {0, 1} and 𝝅0_{var(𝝅0(π‘₯_𝑖))β†πœŽ} = πœ‹0 | ℐ]
    = Pr[𝝅(π‘₯_𝑖) βˆ‰ {0, 1} and (𝝅_{var(𝝅(π‘₯_𝑖))β†πœŽ})|𝐾 = πœ‹0 and ℐ] / Pr[ℐ]
    = βˆ‘_{πœ‹ : πœ‹|𝐾 = πœ‹0, πœ‹^{βˆ’1}(𝑦_𝑗) = 𝐼^+, πœ‹^{βˆ’1}(¬𝑦_𝑗) = 𝐼^βˆ’} Pr[𝝅(π‘₯_𝑖) βˆ‰ {0, 1} and 𝝅_{var(𝝅(π‘₯_𝑖))β†πœŽ} = πœ‹] / Pr[ℐ]
    ≀ βˆ‘_{πœ‹ : πœ‹|𝐾 = πœ‹0, πœ‹^{βˆ’1}(𝑦_𝑗) = 𝐼^+, πœ‹^{βˆ’1}(¬𝑦_𝑗) = 𝐼^βˆ’} π‘ž_𝜎 Β· Pr[𝝅 = πœ‹] / Pr[ℐ]          (since 𝝅 is (π‘ž0, π‘ž1)-fixing)
    = π‘ž_𝜎 Β· Pr[𝝅|𝐾 = πœ‹0 and ℐ] / Pr[ℐ]
    = π‘ž_𝜎 Β· Pr[𝝅0 = πœ‹0 | ℐ],

as required.  β–‘
   We now prove that 𝝅0 remains a (π‘ž0 , π‘ž1 )-fixing projection when conditioning on 𝑓1 | 𝝅 = 𝑦 𝑗 in
addition to ℐ. The crucial observation is that the event 𝑓1 | 𝝅 = 𝑦 𝑗 is essentially a filter, defined
next.
Definition 4.10. A set of projections β„° from π‘₯1 , . . . , π‘₯ 𝑛 to 𝑦1 , . . . , π‘¦π‘š is a filter if it is closed under
assignment to variables, i. e., if for every πœ‹ ∈ β„°, every variable 𝑦 𝑗 , and every bit 𝜏 ∈ {0, 1}, we
have πœ‹ 𝑦 𝑗 β†πœ ∈ β„°.
  It turns out that the property of a projection being a (π‘ž0 , π‘ž1 )-fixing projection is preserved
when conditioning on filters. Formally:
Proposition 4.11. Let β„° be a filter and let π…βˆ— be a (π‘ž 0 , π‘ž1 )-fixing projection. Then, π…βˆ— |β„° is a (π‘ž 0 , π‘ž1 )-fixing
projection.
Proof. Let πœ‹βˆ— be a projection, and let π‘₯ 𝑖 be a variable. Let 𝜎 ∈ {0, 1}. We would like to prove that
                   h                                                                 i
               Pr π…βˆ— (π‘₯ 𝑖 ) βˆ‰ {0, 1} and π…βˆ—var(π…βˆ— (π‘₯ 𝑖 ))β†πœŽ = πœ‹βˆ— π…βˆ— ∈ β„° ≀ π‘ž 𝜎 Β· Pr[π…βˆ— = πœ‹βˆ— | π…βˆ— ∈ β„°] .

If πœ‹βˆ— βˆ‰ β„°, then both sides of the equation are equal to zero: this is obvious for the right-hand
side, and holds for the left-hand side since if there is a projection πœ‹0 ∈ β„° and a variable 𝑦 𝑗 such
that πœ‹0𝑦 𝑗 β†πœŽ = πœ‹βˆ— then it must be the case that πœ‹βˆ— ∈ β„° by the definition of a filter. Thus, we may
assume that πœ‹βˆ— ∈ β„°. Now, it holds that
               h                                                                 i
          Pr π…βˆ— (π‘₯ 𝑖 ) βˆ‰ {0, 1} and π…βˆ—var(π…βˆ— (π‘₯ 𝑖 ))β†πœŽ = πœ‹βˆ— π…βˆ— ∈ β„°
                   h                                                     i
             ≀ Pr 𝝅 (π‘₯ 𝑖 ) βˆ‰ {0, 1} and 𝝅var(π…βˆ— (π‘₯ 𝑖 ))β†πœŽ = πœ‹ /Pr[π…βˆ— ∈ β„°]
                        βˆ—                           βˆ—                βˆ—


             ≀ π‘ž 𝜎 Β· Pr[π…βˆ— = πœ‹βˆ— ] /Pr[π…βˆ— ∈ β„°]                                                (π…βˆ— is (π‘ž0 , π‘ž1 )-fixing)
             = π‘ž 𝜎 Β· Pr[ π…βˆ— = πœ‹βˆ— | π…βˆ— ∈ β„°] ,                                                               (πœ‹βˆ— ∈ β„°)



as required.  β–‘

      Consider the event 𝑓1 | 𝝅 = 𝑦 𝑗 . Viewed as a set of projections from π‘₯ 1 , . . . , π‘₯ 𝑛 to 𝑦1 , . . . , π‘¦π‘š ,
this event is not a filter, since it is not closed under assignments to 𝑦 𝑗 . However, this event is
closed under assignments to all variables except 𝑦 𝑗 : when 𝑓1 | 𝝅 = 𝑦 𝑗 , the equality continues to hold
even if the variables in π‘Œ are fixed to constants. Moreover, observe that conditioned on ℐ, the
event 𝑓1 | 𝝅 = 𝑦 𝑗 depends only on the values that 𝝅 assigns to 𝐾. Thus, we can view the event
𝑓1 | 𝝅 = 𝑦 𝑗 as a set of projections from 𝐾 to π‘Œ, and taking this view, this event is a filter. Since 𝝅0 is
a (π‘ž0 , π‘ž1 )-fixing projection from 𝐾 to {𝑦1 , . . . , π‘¦π‘š } \ 𝑦 𝑗 when conditioned on ℐ, we conclude
                                                             
that it is a (π‘ž0 , π‘ž1 )-fixing projection when conditioned on both ℐ and 𝑓1 | 𝝅 = 𝑦 𝑗 . It follows by
Lemma 4.1 that
    Pr[𝐿(𝑓2,𝐼^+,𝐼^βˆ’|𝝅0) = 1 | 𝑓1|𝝅 = 𝑦_𝑗, 𝑰^+ = 𝐼^+, 𝑰^βˆ’ = 𝐼^βˆ’] ≀ π‘ž Β· √𝐿(𝑓2,𝐼^+,𝐼^βˆ’) ≀ π‘ž Β· √𝐿(𝑓2),

as required.


4.3     Proof of Lemma 4.6
Finally, we prove Lemma 4.6, restated next.

Lemma 4.6. For every projection πœ‹ we have

                 𝐿(πœ™|πœ‹) ≀ βˆ‘_{internal gate 𝑔 of πœ™} (depth(𝑔) + 2) Β· 1_{ℰ𝑔,πœ‹} + 1_{𝐿(πœ™|πœ‹)=1}.          (4.5)


   Let πœ‹ be a projection. We prove that πœ‹ satisfies Equation 4.5. Recall that ℰ𝑔,πœ‹ is the event
that there exists some literal 𝑦 π‘—πœŽ such that left(𝑔)| πœ‹ = 𝑦 π‘—πœŽ and

      β€’ 𝐿(right(𝑔)| πœ‹ 𝑦 𝑗 β†πœŽ ) = 1 if 𝑔 is an OR gate, or

      β€’ 𝐿(right(𝑔)| πœ‹ 𝑦 𝑗 β†πœŽ ) = 1 if 𝑔 is an AND gate.

We prove Equation 4.5 by induction. If πœ™ consists of a single leaf, then the upper bound clearly
holds. Otherwise, the root of πœ™ is an internal gate. Without loss of generality, assume that the
root is an OR gate. We denote the subformulas rooted at the left and right children of the root
by πœ™β„“ and πœ™ π‘Ÿ , respectively. We consider several cases:

      β€’ If 𝐿(πœ™| πœ‹ ) = 1, then the upper bound clearly holds.

      β€’ Suppose that 𝐿(πœ™β„“ | πœ‹ ) β‰₯ 2. By the subadditivity of formula complexity (Fact 2.3), we have

                                                  𝐿(πœ™| πœ‹ ) ≀ 𝐿(πœ™β„“ | πœ‹ ) + 𝐿(πœ™ π‘Ÿ | πœ‹ ).


   By the induction hypothesis, we have

          𝐿(πœ™β„“|πœ‹) ≀ βˆ‘_{internal gate 𝑔 of πœ™β„“} (depth_{πœ™β„“}(𝑔) + 2) Β· 1_{ℰ𝑔,πœ‹} + 1_{𝐿(πœ™β„“|πœ‹)=1}
                  = βˆ‘_{internal gate 𝑔 of πœ™β„“} (depth_{πœ™β„“}(𝑔) + 2) Β· 1_{ℰ𝑔,πœ‹}          (𝐿(πœ™β„“|πœ‹) β‰₯ 2)
                  = βˆ‘_{internal gate 𝑔 of πœ™β„“} (depth_πœ™(𝑔) + 1) Β· 1_{ℰ𝑔,πœ‹},

   where the last equality holds since depth_{πœ™β„“}(𝑔) = depth_πœ™(𝑔) βˆ’ 1 for every gate 𝑔 of πœ™β„“. Since
   𝐿(πœ™β„“|πœ‹) β‰₯ 2, at least one of the terms in the last sum must be non-zero, so it holds that

          βˆ‘_{internal gate 𝑔 of πœ™β„“} (depth_πœ™(𝑔) + 1) Β· 1_{ℰ𝑔,πœ‹} ≀ βˆ‘_{internal gate 𝑔 of πœ™β„“} (depth_πœ™(𝑔) + 2) Β· 1_{ℰ𝑔,πœ‹} βˆ’ 1.


   Next, by the induction hypothesis we have

          𝐿(πœ™π‘Ÿ|πœ‹) ≀ βˆ‘_{internal gate 𝑔 of πœ™π‘Ÿ} (depth_{πœ™π‘Ÿ}(𝑔) + 2) Β· 1_{ℰ𝑔,πœ‹} + 1_{𝐿(πœ™π‘Ÿ|πœ‹)=1}
                  ≀ βˆ‘_{internal gate 𝑔 of πœ™π‘Ÿ} (depth_πœ™(𝑔) + 2) Β· 1_{ℰ𝑔,πœ‹} + 1.


   By combining the two bounds, we get that

          𝐿(πœ™|πœ‹) ≀ 𝐿(πœ™β„“|πœ‹) + 𝐿(πœ™π‘Ÿ|πœ‹)
                 ≀ βˆ‘_{internal gate 𝑔 of πœ™β„“} (depth_πœ™(𝑔) + 2) Β· 1_{ℰ𝑔,πœ‹} βˆ’ 1
                     + βˆ‘_{internal gate 𝑔 of πœ™π‘Ÿ} (depth_πœ™(𝑔) + 2) Β· 1_{ℰ𝑔,πœ‹} + 1
                 ≀ βˆ‘_{internal gate 𝑔 of πœ™} (depth(𝑔) + 2) Β· 1_{ℰ𝑔,πœ‹}
                 ≀ βˆ‘_{internal gate 𝑔 of πœ™} (depth(𝑔) + 2) Β· 1_{ℰ𝑔,πœ‹} + 1_{𝐿(πœ™|πœ‹)=1},

   as required.

β€’ If 𝐿(πœ™ π‘Ÿ | πœ‹ ) β‰₯ 2, then we use the same argument of the previous case by exchanging πœ™β„“ and
  πœ™π‘Ÿ .

β€’ Suppose that 𝐿(πœ™| πœ‹ ) β‰₯ 2, 𝐿(πœ™β„“ | πœ‹ ) ≀ 1 and 𝐿(πœ™ π‘Ÿ | πœ‹ ) ≀ 1. Then, it must be the case that
  𝐿(πœ™| πœ‹ ) = 2 and also that 𝐿(πœ™β„“ | πœ‹ ) = 1 and 𝐿(πœ™ π‘Ÿ | πœ‹ ) = 1 (or otherwise 𝐿(πœ™| πœ‹ ) = 1). In


       particular, πœ™β„“ | πœ‹ is equal to some literal 𝑦 π‘—πœŽ . It follows that πœ™| πœ‹ = 𝑦 π‘—πœŽ ∨ πœ™ π‘Ÿ | πœ‹ , and by the
       one-variable simplification rules (Fact 4.5), this function is equal to

                                             𝑦 π‘—πœŽ ∨ πœ™ π‘Ÿ | πœ‹ 𝑦 𝑗 β†πœŽ = 𝑦 π‘—πœŽ ∨ πœ™ π‘Ÿ | πœ‹ 𝑦 𝑗 β†πœŽ .
                                                              


       Thus, it must be the case that 𝐿(πœ™ π‘Ÿ | πœ‹ 𝑦 𝑗 β†πœŽ ) = 1 (since 𝐿(πœ™| πœ‹ ) = 2). It follows that if we let 𝑔
       be the root of πœ™, then the event ℰ𝑔,πœ‹ occurs and so

                                            (depth(𝑔) + 2) Β· 1ℰ𝑔,πœ‹ = 2 = 𝐿(πœ™| πœ‹ ),

       so the desired upper bound holds.
We proved that the upper bound holds in each of the possible cases, so the required result
follows.
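As a concrete sanity check, the inequality of Lemma 4.6 can be verified exhaustively for a toy formula under plain restrictions (a special case of projections in which no variables are merged or negated). The encoding below is our own, not the paper's; formula complexity 𝐿 is computed exactly for all functions on three bits by relaxation from literals.

```python
from itertools import product

N = 3                                   # toy number of variables
FULL = (1 << (1 << N)) - 1              # truth table of the constant 1

def lit_tt(i, sigma):
    """Truth table (bitmask over all 2^N inputs) of x_i (sigma=1) or NOT x_i (sigma=0)."""
    tt = 0
    for a, bits in enumerate(product((0, 1), repeat=N)):
        if bits[i] == sigma:
            tt |= 1 << a
    return tt

def formula_sizes():
    """Exact formula complexity L(f) for every function on N bits: literals have size 1,
    constants size 0 (by convention), and L(g AND h), L(g OR h) <= L(g) + L(h)."""
    best = {lit_tt(i, s): 1 for i in range(N) for s in (0, 1)}
    best[0] = best[FULL] = 0
    while True:
        updated = False
        for f1, s1 in list(best.items()):
            for f2, s2 in list(best.items()):
                for g in (f1 & f2, f1 | f2):
                    if best.get(g, 1 << 30) > s1 + s2:
                        best[g] = s1 + s2
                        updated = True
        if not updated:
            return best

L = formula_sizes()
LITS = {lit_tt(i, s): (i, s) for i in range(N) for s in (0, 1)}

# A toy formula phi = (x1 OR x2) AND (NOT x1 OR x3), as nested tuples.
PHI = ('AND', ('OR', ('lit', 0, 1), ('lit', 1, 1)),
              ('OR', ('lit', 0, 0), ('lit', 2, 1)))

def restrict_tt(node, rho):
    """Truth table of node|rho; rho maps a variable index to 0, 1, or None (alive)."""
    if node[0] == 'lit':
        _, i, s = node
        if rho[i] is None:
            return lit_tt(i, s)
        return FULL if rho[i] == s else 0
    op, l, r = node
    lt, rt = restrict_tt(l, rho), restrict_tt(r, rho)
    return lt & rt if op == 'AND' else lt | rt

def event_sum(node, rho, depth):
    """Sum of (depth(g) + 2) over internal gates g at which the event E_{g,rho} occurs."""
    if node[0] == 'lit':
        return 0
    op, l, r = node
    total = event_sum(l, rho, depth + 1) + event_sum(r, rho, depth + 1)
    lt = restrict_tt(l, rho)
    if lt in LITS:                               # left child simplifies to a literal
        i, s = LITS[lt]
        rho2 = list(rho)
        rho2[i] = (1 - s) if op == 'OR' else s   # OR: make the literal false; AND: true
        if L[restrict_tt(r, rho2)] == 1:
            total += depth + 2
    return total

violations = 0
for rho in product((0, 1, None), repeat=N):
    lhs = L[restrict_tt(PHI, rho)]
    rhs = event_sum(PHI, rho, 0) + (1 if lhs == 1 else 0)
    if lhs > rhs:
        violations += 1
assert violations == 0
```

The check passes for all 27 restrictions; the bound is typically loose (e.g., for the empty restriction the right-hand side is 6 while the multiplexer on the left needs only 4 leaves).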


5    Proof of shrinkage theorem for hiding projections
In this section we prove our shrinkage theorem for hiding projections. We start by recalling
the definition of hiding projections and the corresponding shrinkage theorem.
Definition 3.9. Let 0 ≀ π‘ž0, π‘ž1 ≀ 1. We say that a random projection 𝝅 is a (π‘ž0, π‘ž1)-hiding projection
if for every projection πœ‹, every bit 𝜎 ∈ {0, 1}, and all variables π‘₯_𝑖, 𝑦_𝑗, we have

    Pr[𝝅(π‘₯_𝑖) ∈ {𝑦_𝑗, ¬𝑦_𝑗} | 𝝅_{𝑦_π‘—β†πœŽ} = πœ‹] ≀ π‘ž_𝜎,

whenever the event conditioned on has positive probability. For shorthand, we say that 𝝅 is a
π‘ž-hiding projection, for π‘ž = √(π‘ž0 π‘ž1).
Theorem 3.12 (Shrinkage under hiding projections). Let πœ™ be a formula of size 𝑠 and depth 𝑑, and
let 𝝅 be a π‘ž-hiding projection. Then

    𝔼[𝐿(πœ™|𝝅)] = 𝑂(π‘š^4 Β· π‘ž^2 Β· 𝑑^2 Β· 𝑠 + π‘š^2 Β· π‘ž Β· βˆšπ‘ ).


   The crux of the proof of Theorem 3.12 is that hiding projections can be converted into fixing
projections. This is captured by the following result.
Lemma 5.1. Let 𝝅 be a π‘ž-hiding projection from π‘₯ 1 , . . . , π‘₯ 𝑛 to 𝑦1 , . . . , π‘¦π‘š . Then, there exists a (4π‘š 2 Β· π‘ž)-
fixing projection 𝝅0 from π‘₯ 1 , . . . , π‘₯ 𝑛 to 𝑦1 , . . . , π‘¦π‘š and an event β„‹ of probability at least 12 such that
𝝅0 = 𝝅 conditioned on β„‹ . Furthermore, the event β„‹ is independent of 𝝅.
    We prove Lemma 5.1 in Section 5.1. By combining Lemma 5.1 with our shrinkage theorem
for fixing projections, we obtain the following shrinkage theorem for hiding projections.
Theorem 3.12. Let πœ™ be a formula of size 𝑠 and depth 𝑑, and let 𝝅 be a π‘ž-hiding projection. Then
                                                                    √ 
                               𝔼 𝐿(πœ™| 𝝅 ) = 𝑂 π‘š Β· π‘ž Β· 𝑑 Β· 𝑠 + π‘š Β· π‘ž Β· 𝑠 .
                                                        4       2     2        2




Proof. Let 𝝅0 and β„‹ be the (4π‘š^2 Β· π‘ž)-fixing projection and event that are obtained from 𝝅 by
Lemma 5.1. Since Pr[β„‹] β‰₯ 1/2, we have

    𝔼[𝐿(πœ™|𝝅0)] β‰₯ Pr[β„‹] Β· 𝔼[𝐿(πœ™|𝝅0) | β„‹] β‰₯ (1/2) Β· 𝔼[𝐿(πœ™|𝝅) | β„‹] = (1/2) Β· 𝔼[𝐿(πœ™|𝝅)],

where the second inequality holds since conditioned on β„‹ it holds that 𝝅0 = 𝝅, and the
last equality holds since the event β„‹ is independent of 𝝅. On the other hand, by applying
Theorem 3.6 to 𝝅0, we obtain that

    𝔼[𝐿(πœ™|𝝅0)] = 𝑂(π‘š^4 Β· π‘ž^2 Β· 𝑑^2 Β· 𝑠 + π‘š^2 Β· π‘ž Β· βˆšπ‘ ).

The theorem follows by combining the two bounds.  β–‘

5.1     Proof of Lemma 5.1
We use the following straightforward generalization of the property of hiding projections.
Claim 5.2. Let 𝝅 be a (π‘ž 0 , π‘ž1 )-hiding projection, and let 𝓔 be a random set of projections that is
independent of 𝝅. Then, for every 𝜎 ∈ {0, 1}, we have
                                          Pr 𝝅(π‘₯ 𝑖 ) ∈ 𝑦 𝑗 , 𝑦 𝑗         𝝅 𝑦 𝑗 β†πœŽ ∈ 𝓔 ≀ π‘ž 𝜎 .
                                                                                   

Proof. Let 𝝅 and 𝓔 be as in the claim, and let 𝜎 ∈ {0, 1}. We have

    Pr[𝝅(π‘₯_𝑖) ∈ {𝑦_𝑗, ¬𝑦_𝑗} | 𝝅_{𝑦_π‘—β†πœŽ} ∈ 𝓔]
      = βˆ‘_πœ‹ Pr[𝝅(π‘₯_𝑖) ∈ {𝑦_𝑗, ¬𝑦_𝑗} | 𝝅_{𝑦_π‘—β†πœŽ} = πœ‹ and πœ‹ ∈ 𝓔] Β· Pr[𝝅_{𝑦_π‘—β†πœŽ} = πœ‹ | 𝝅_{𝑦_π‘—β†πœŽ} ∈ 𝓔]
      = βˆ‘_πœ‹ Pr[𝝅(π‘₯_𝑖) ∈ {𝑦_𝑗, ¬𝑦_𝑗} | 𝝅_{𝑦_π‘—β†πœŽ} = πœ‹] Β· Pr[𝝅_{𝑦_π‘—β†πœŽ} = πœ‹ | 𝝅_{𝑦_π‘—β†πœŽ} ∈ 𝓔]          (𝝅 and 𝓔 are independent)
      ≀ βˆ‘_πœ‹ π‘ž_𝜎 Β· Pr[𝝅_{𝑦_π‘—β†πœŽ} = πœ‹ | 𝝅_{𝑦_π‘—β†πœŽ} ∈ 𝓔]          (𝝅 is (π‘ž0, π‘ž1)-hiding)
      = π‘ž_𝜎.  β–‘
      We turn to proving Lemma 5.1, restated next.
Lemma 5.1. Let 𝝅 be a π‘ž-hiding projection from π‘₯ 1 , . . . , π‘₯ 𝑛 to 𝑦1 , . . . , π‘¦π‘š . Then, there exists a (4π‘š 2 Β· π‘ž)-
fixing projection 𝝅0 from π‘₯ 1 , . . . , π‘₯ 𝑛 to 𝑦1 , . . . , π‘¦π‘š and an event β„‹ of probability at least 12 such that
𝝅0 = 𝝅 conditioned on β„‹ . Furthermore, the event β„‹ is independent of 𝝅.
    Suppose that 𝝅 is a (π‘ž0, π‘ž1)-hiding projection from π‘₯1, . . . , π‘₯𝑛 to 𝑦1, . . . , π‘¦π‘š. Let 𝝆 be a
(1 βˆ’ 1/(2π‘š))-random restriction over 𝑦1, . . . , π‘¦π‘š, i. e., 𝝆 is the random projection from 𝑦1, . . . , π‘¦π‘š
to 𝑦1, . . . , π‘¦π‘š that assigns each 𝑦_𝑗 independently as follows:

               ⎧ 𝑦_𝑗   with probability 1 βˆ’ 1/(2π‘š),
    𝝆(𝑦_𝑗) =   ⎨ 0     with probability 1/(4π‘š),
               ⎩ 1     with probability 1/(4π‘š).


For convenience, we define 𝝆(0) = 0, 𝝆(1) = 1, and 𝝆(¬𝑦_𝑗) = ¬𝝆(𝑦_𝑗) for every 𝑗 ∈ [π‘š]. We now
choose the random projection 𝝅′ to be the composition 𝝆 ∘ 𝝅, and define the event β„‹ to be the
event that 𝝆(𝑦_𝑗) = 𝑦_𝑗 for every 𝑗 ∈ [π‘š]. Observe that the event 𝝅′ = 𝝅 occurs whenever β„‹ occurs,
and moreover

    Pr[β„‹] = (1 βˆ’ 1/(2π‘š))^π‘š β‰₯ 1 βˆ’ π‘š Β· 1/(2π‘š) = 1/2,
as required. Moreover, the event β„‹ is independent of 𝝅, since 𝝆 is independent of 𝝅. We show
that 𝝅′ is a (4π‘šΒ² Β· π‘ž_0 , 4π‘šΒ² Β· π‘ž_1)-fixing projection. To this end, we should show that for every
projection πœ‹β€², every bit 𝜎 ∈ {0, 1}, and every variable π‘₯_𝑖,

    Pr[𝝅′(π‘₯_𝑖) βˆ‰ {0, 1} and 𝝅′_{var(𝝅′(π‘₯_𝑖))←𝜎} = πœ‹β€²] ≀ 4π‘šΒ² Β· π‘ž_𝜎 Β· Pr[𝝅′ = πœ‹β€²] .        (5.1)
This is implied by the following inequality, which we will prove for all 𝑗 ∈ [π‘š]:

    Pr[𝝅′(π‘₯_𝑖) ∈ {𝑦_𝑗 , ¬𝑦_𝑗} and 𝝅′_{𝑦_𝑗←𝜎} = πœ‹β€²] ≀ 4π‘š Β· π‘ž_𝜎 Β· Pr[𝝅′ = πœ‹β€²] .        (5.2)

Let 𝑗 ∈ [π‘š] and let 𝝆_{βˆ’π‘—} be the random projection obtained by restricting 𝝆 to the do-
main {𝑦_1, . . . , 𝑦_π‘š} \ {𝑦_𝑗}. First, observe that if 𝝆(𝑦_𝑗) = 𝜎 then 𝝅′ = 𝝆 ∘ 𝝅 = 𝝆_{βˆ’π‘—} ∘ 𝝅_{𝑦_𝑗←𝜎},
and therefore

    Pr[𝝅′ = πœ‹β€²] β‰₯ Pr[𝝆(𝑦_𝑗) = 𝜎 and 𝝆_{βˆ’π‘—} ∘ 𝝅_{𝑦_𝑗←𝜎} = πœ‹β€²] = (1/(4π‘š)) Β· Pr[𝝆_{βˆ’π‘—} ∘ 𝝅_{𝑦_𝑗←𝜎} = πœ‹β€²] ,
where the equality holds since 𝝆(𝑦_𝑗) is independent of 𝝆_{βˆ’π‘—} and 𝝅. In the rest of this section
we will prove the following inequality, which together with the last inequality will imply
Equation 5.2:

    Pr[𝝅′(π‘₯_𝑖) ∈ {𝑦_𝑗 , ¬𝑦_𝑗} and 𝝅′_{𝑦_𝑗←𝜎} = πœ‹β€²] ≀ π‘ž_𝜎 Β· Pr[𝝆_{βˆ’π‘—} ∘ 𝝅_{𝑦_𝑗←𝜎} = πœ‹β€²] .        (5.3)

To this end, observe that the event 𝝅′(π‘₯_𝑖) ∈ {𝑦_𝑗 , ¬𝑦_𝑗} happens if and only if 𝝅(π‘₯_𝑖) ∈ {𝑦_𝑗 , ¬𝑦_𝑗} and
𝝆(𝑦_𝑗) = 𝑦_𝑗, and therefore

    Pr[𝝅′(π‘₯_𝑖) ∈ {𝑦_𝑗 , ¬𝑦_𝑗} and 𝝅′_{𝑦_𝑗←𝜎} = πœ‹β€²]
    = Pr[𝝅(π‘₯_𝑖) ∈ {𝑦_𝑗 , ¬𝑦_𝑗} and 𝝆(𝑦_𝑗) = 𝑦_𝑗 and 𝝅′_{𝑦_𝑗←𝜎} = πœ‹β€²] .

Next, observe that if 𝝆(𝑦_𝑗) = 𝑦_𝑗 then 𝝅′_{𝑦_𝑗←𝜎} = 𝝆_{βˆ’π‘—} ∘ 𝝅_{𝑦_𝑗←𝜎}. It follows that the latter expression is
equal to

    Pr[𝝅(π‘₯_𝑖) ∈ {𝑦_𝑗 , ¬𝑦_𝑗} and 𝝆(𝑦_𝑗) = 𝑦_𝑗 and 𝝆_{βˆ’π‘—} ∘ 𝝅_{𝑦_𝑗←𝜎} = πœ‹β€²]
    ≀ Pr[𝝅(π‘₯_𝑖) ∈ {𝑦_𝑗 , ¬𝑦_𝑗} and 𝝆_{βˆ’π‘—} ∘ 𝝅_{𝑦_𝑗←𝜎} = πœ‹β€²]
    = Pr[𝝅(π‘₯_𝑖) ∈ {𝑦_𝑗 , ¬𝑦_𝑗} | 𝝆_{βˆ’π‘—} ∘ 𝝅_{𝑦_𝑗←𝜎} = πœ‹β€²] Β· Pr[𝝆_{βˆ’π‘—} ∘ 𝝅_{𝑦_𝑗←𝜎} = πœ‹β€²]
    ≀ π‘ž_𝜎 Β· Pr[𝝆_{βˆ’π‘—} ∘ 𝝅_{𝑦_𝑗←𝜎} = πœ‹β€²] ,

where the last inequality follows by applying Claim 5.2 with 𝓔 being the set of projections πœ‹
that satisfy 𝝆_{βˆ’π‘—} ∘ πœ‹ = πœ‹β€². This concludes the proof of Equation 5.3.


Remark 5.3. We can improve the bound π‘šΒ² in the statement of Lemma 5.1 to π‘š Β· π‘˜, where π‘˜ is the
maximal number of the variables 𝑦_1, . . . , 𝑦_π‘š that could appear in any position of 𝝅. The reason
is that in the latter case, the transition from Equation 5.2 to Equation 5.1 incurs a factor of π‘˜
rather than π‘š. This is useful, for example, since the random projections 𝝅 of Section 8.1 have
the feature that for each 𝑖 ∈ [𝑛] there is a unique 𝑗 ∈ [π‘š] such that 𝝅(π‘₯_𝑖) ∈ {0, 1, 𝑦_𝑗 , ¬𝑦_𝑗}, and so
for these projections π‘˜ = 1.


6     Joining projections
In this section, we define a join operation on fixing and hiding projections, and show that
it preserves the corresponding properties. This operation provides a convenient tool for
constructing random projections, and will be used in our applications.

Definition 6.1. Let 𝛼 be a projection from π‘₯_1, . . . , π‘₯_{𝑛_π‘Ž} to 𝑦_1, . . . , 𝑦_{π‘š_π‘Ž}, and let 𝛽 be a projection
from 𝑀_1, . . . , 𝑀_{𝑛_𝑏} to 𝑧_1, . . . , 𝑧_{π‘š_𝑏}. The join 𝛼 ⊎ 𝛽 is the projection from π‘₯_1, . . . , π‘₯_{𝑛_π‘Ž}, 𝑀_1, . . . , 𝑀_{𝑛_𝑏} to
𝑦_1, . . . , 𝑦_{π‘š_π‘Ž}, 𝑧_1, . . . , 𝑧_{π‘š_𝑏} obtained by joining 𝛼 and 𝛽 together in the obvious way.
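Concretely, if each projection is represented as a map from input variables to output literals or constants, the join is just a union of two maps with disjoint domains and ranges. A minimal Python sketch (our representation, not the paper's):

```python
def join(alpha, beta):
    """Join of two projections, each given as a dict mapping input
    variables to output literals or to the constants 0/1.  The domains
    (and ranges) are assumed disjoint, so a plain union is safe."""
    assert not set(alpha) & set(beta), "domains must be disjoint"
    joined = dict(alpha)
    joined.update(beta)
    return joined

alpha = {"x1": "y1", "x2": "~y1", "x3": 0}   # projection on the x-variables
beta = {"w1": "z1", "w2": 1}                 # projection on the w-variables
gamma = join(alpha, beta)
```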

Lemma 6.2. Let 𝜢 and 𝜷 be independent random projections. If 𝜢 and 𝜷 are (π‘ž_0 , π‘ž_1)-fixing projections,
then so is 𝜢 ⊎ 𝜷.

Proof. Let 𝜢 and 𝜷 be (π‘ž_0 , π‘ž_1)-fixing projections, and let 𝜸 = 𝜢 ⊎ 𝜷. We prove that 𝜸 is a (π‘ž_0 , π‘ž_1)-
fixing projection. Let 𝛼 be a projection from π‘₯_1, . . . , π‘₯_{𝑛_π‘Ž} to 𝑦_1, . . . , 𝑦_{π‘š_π‘Ž}, let 𝛽 be a projection from
𝑀_1, . . . , 𝑀_{𝑛_𝑏} to 𝑧_1, . . . , 𝑧_{π‘š_𝑏}, and let 𝛾 = 𝛼 ⊎ 𝛽. We should show that for every 𝜎 ∈ {0, 1} and every
input variable 𝑒 (either π‘₯_𝑖 or 𝑀_𝑖) we have

    Pr[𝜸(𝑒) βˆ‰ {0, 1} and 𝜸_{var(𝜸(𝑒))←𝜎} = 𝛾] ≀ π‘ž_𝜎 Β· Pr[𝜸 = 𝛾] .

Let 𝜎 ∈ {0, 1} be a bit, and let 𝑒 be an input variable. Assume that 𝑒 = π‘₯ 𝑖 for some 𝑖 ∈ [𝑛 π‘Ž ] (if
𝑒 = 𝑀 𝑖 for some 𝑖 ∈ [𝑛𝑏 ], the proof is similar). The above equation is therefore equivalent to

    Pr 𝜢(π‘₯ 𝑖 ) βˆ‰ {0, 1} and 𝜢 var(𝜢(π‘₯ 𝑖 ))β†πœŽ = 𝛼 and 𝜷var(𝜢(π‘₯ 𝑖 ))β†πœŽ = 𝛽 ≀ π‘ž 𝜎 Β· Pr 𝜢 = 𝛼 and 𝜷 = 𝛽 .
                                                                                                                      

Since the ranges of 𝜢 and 𝜷 are disjoint, the variable var(𝜢(π‘₯ 𝑖 )) does not appear in the range of 𝜷,
and therefore 𝜷var(𝜢(π‘₯ 𝑖 ))β†πœŽ = 𝜷. The independence of 𝜢 and 𝜷 shows that the above inequality is
equivalent to

           Pr 𝜢(π‘₯ 𝑖 ) βˆ‰ {0, 1} and 𝜢 var(𝜢(π‘₯ 𝑖 ))β†πœŽ = 𝛼 Β· Pr 𝜷 = 𝛽 ≀ π‘ž 𝜎 Β· Pr[𝜢 = 𝛼] Β· Pr 𝜷 = 𝛽 ,
                                                                                                              

which follows from 𝜢 being (π‘ž_0 , π‘ž_1)-fixing.                                                                                β–‘
Lemma 6.3. Let 𝜢 and 𝜷 be independent random projections. If 𝜢 and 𝜷 are (π‘ž_0 , π‘ž_1)-hiding projections,
then so is 𝜢 ⊎ 𝜷.

Proof. Let 𝜢 and 𝜷 be (π‘ž_0 , π‘ž_1)-hiding projections, and let 𝜸 = 𝜢 ⊎ 𝜷. We prove that 𝜸 is a (π‘ž_0 , π‘ž_1)-
hiding projection. Let 𝛼 be a projection from π‘₯_1, . . . , π‘₯_{𝑛_π‘Ž} to 𝑦_1, . . . , 𝑦_{π‘š_π‘Ž}, let 𝛽 be a projection from


𝑀_1, . . . , 𝑀_{𝑛_𝑏} to 𝑧_1, . . . , 𝑧_{π‘š_𝑏}, and let 𝛾 = 𝛼 ⊎ 𝛽. We should show that for every 𝜎 ∈ {0, 1}, every
input variable 𝑒 (either π‘₯_𝑖 or 𝑀_𝑖), and every output variable 𝑣 (either 𝑦_𝑗 or 𝑧_𝑗) we have

    Pr[𝜸(𝑒) ∈ {𝑣, ¬𝑣} | 𝜸_{π‘£β†πœŽ} = 𝛾] ≀ π‘ž_𝜎 .

Let 𝜎 ∈ {0, 1} be a bit, let 𝑒 be an input variable, and let 𝑣 be an output variable. Assume that
𝑒 = π‘₯ 𝑖 for some 𝑖 ∈ [𝑛 π‘Ž ] (if 𝑒 = 𝑀 𝑖 for some 𝑖 ∈ [𝑛𝑏 ], the proof is similar). In this case, we may
assume that 𝑣 = 𝑦 𝑗 for some 𝑗 ∈ [π‘š π‘Ž ], since otherwise the probability on the left-hand side is 0.
Thus, the above equation is equivalent to

                               Pr 𝜢(π‘₯ 𝑖 ) ∈ 𝑦 𝑗 , 𝑦 𝑗             𝜢 𝑦 𝑗 β†πœŽ = 𝛼 and 𝜷 𝑦 𝑗 β†πœŽ = 𝛽 ≀ π‘ž 𝜎 .
                                                                                               

Since 𝜢 and 𝜷 are independent, whenever the conditioned event has positive probability, we
have

      Pr 𝜢(π‘₯ 𝑖 ) ∈ 𝑦 𝑗 , 𝑦 𝑗       𝜢 𝑦 𝑗 β†πœŽ = 𝛼 and 𝜷 𝑦 𝑗 β†πœŽ = 𝛽 = Pr 𝜢(π‘₯ 𝑖 ) ∈ 𝑦 𝑗 , 𝑦 𝑗                 𝜢 𝑦𝑖 β†πœŽ = 𝛼 ≀ π‘ž 𝜎 ,
                                                                                                                

where the inequality holds since 𝜢 is (π‘ž_0 , π‘ž_1)-hiding.                                                                       β–‘


7      Hiding projections from complexity measures
In this section we define a generalization of the unweighted quantum adversary bound due to
Ambainis [2], and show how it can be used for constructing hiding projections. We also derive a
relationship between this measure to the Khrapchenko bound [33], and conclude that the latter
bound can be used for constructing hiding projections as well.

7.1     The soft-adversary method
Ambainis [2] defined the following complexity measure for Boolean functions, called the
unweighted quantum adversary bound, and proved that it is a lower bound on quantum query
complexity (for a definition of quantum query complexity, see [2] or [9]).

Definition 7.1. Let 𝑓 : {0, 1} 𝑛 β†’ {0, 1} be a non-constant function. Let 𝑅 βŠ† 𝑓 βˆ’1 (1) Γ— 𝑓 βˆ’1 (0),
and let 𝐴, 𝐡 be the projections of 𝑅 into the first and second coordinates, respectively. Let
𝑅(π‘Ž, 𝐡) = {𝑏 ∈ 𝐡 : (π‘Ž, 𝑏) ∈ 𝑅}, let 𝑅 𝑖 (π‘Ž, 𝐡) = {𝑏 ∈ 𝐡 : (π‘Ž, 𝑏) ∈ 𝑅, π‘Ž 𝑖 β‰  𝑏 𝑖 }, and define
𝑅(𝐴, 𝑏), 𝑅 𝑖 (𝐴, 𝑏) analogously. The unweighted quantum adversary bound of 𝑓 is
    Adv_𝑒(𝑓) =  max_{𝑅 βŠ† 𝑓⁻¹(1) Γ— 𝑓⁻¹(0)}  √( ( min_{π‘Žβˆˆπ΄} |𝑅(π‘Ž, 𝐡)| Β· min_{π‘βˆˆπ΅} |𝑅(𝐴, 𝑏)| ) / ( max_{π‘–βˆˆ[𝑛], π‘Žβˆˆπ΄} |𝑅_𝑖(π‘Ž, 𝐡)| Β· max_{π‘–βˆˆ[𝑛], π‘βˆˆπ΅} |𝑅_𝑖(𝐴, 𝑏)| ) ) .

Theorem 7.2 ([2]). The quantum query complexity of a non-constant function 𝑓 : {0, 1} 𝑛 β†’ {0, 1} is
Ξ©(Adv𝑒 ( 𝑓 )).

      We define the following generalization of the unweighted quantum adversary bound.


Definition 7.3. Let 𝑓 : {0, 1} 𝑛 β†’ {0, 1} be a Boolean function. We define the soft-adversary bound
of 𝑓 , which we denote Adv𝑠 , to be the maximum of the quantity
    √( min_{π‘Žβˆˆsupp(𝒂), π‘–βˆˆ[𝑛]} 1/Pr[𝒂_𝑖 β‰  𝒃_𝑖 | 𝒂 = π‘Ž]  Β·  min_{π‘βˆˆsupp(𝒃), π‘–βˆˆ[𝑛]} 1/Pr[𝒂_𝑖 β‰  𝒃_𝑖 | 𝒃 = 𝑏] )


over all distributions (𝒂, 𝒃) supported on 𝑓 βˆ’1 (1) Γ— 𝑓 βˆ’1 (0).

    We chose to call this notion β€œsoft-adversary bound” since we view it as a variant of the
unweighted adversary bound in which, instead of choosing for each pair (π‘Ž, 𝑏) whether it
belongs to 𝑅 or not (a β€œhard” decision), we assign each pair (π‘Ž, 𝑏) a probability of being in the
relation 𝑅 (a β€œsoft” decision). We now show that this bound indeed generalizes the unweighted
quantum adversary method.

Proposition 7.4. Let 𝑓 : {0, 1}^𝑛 β†’ {0, 1} be a non-constant function. Then Adv_𝑠(𝑓) β‰₯ Adv_𝑒(𝑓).

Proof. Let 𝑅 βŠ† 𝑓 βˆ’1 (1) Γ— 𝑓 βˆ’1 (0) be a relation that attains Adv𝑒 ( 𝑓 ). Let (𝒂, 𝒃) be a uniformly
distributed element of 𝑅. Using Pr[𝒂 𝑖 β‰  𝒃 𝑖 | 𝒂 = π‘Ž] = |𝑅 𝑖 (π‘Ž, 𝐡)|/|𝑅(π‘Ž, 𝐡)| and Pr[𝒂 𝑖 β‰  𝒃 𝑖 | 𝒃 =
𝑏] = |𝑅 𝑖 (𝐴, 𝑏)|/|𝑅(𝐴, 𝑏)|, it is easy to check that the quantity in the definition of Adv𝑠 ( 𝑓 ) is at
least Adv_𝑒(𝑓).                                                                                     β–‘
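The quantity in Definition 7.3 can likewise be evaluated directly for an explicitly given distribution. The sketch below is our code, not the paper's; it computes the quantity for the uniform distribution over the relation 𝑅 = {(01, 00), (10, 00)} for OR_2, recovering √2 = Adv_𝑒(OR_2), as Proposition 7.4 predicts.

```python
def soft_adv_quantity(dist, n):
    """Evaluate the quantity from Definition 7.3 for a joint distribution,
    given as a list of ((a, b), prob) pairs with f(a) = 1 and f(b) = 0."""
    def max_pr_diff(side):
        # max over v in the support (and over i) of Pr[a_i != b_i | coord = v];
        # the min over 1/Pr in the definition is the reciprocal of this max.
        best = 0.0
        supp = {pair[side] for pair, p in dist if p > 0}
        for v in supp:
            mass = sum(p for pair, p in dist if pair[side] == v)
            for i in range(n):
                pr = sum(p for (a, b), p in dist
                         if (a, b)[side] == v and a[i] != b[i]) / mass
                best = max(best, pr)
        return best

    return ((1 / max_pr_diff(0)) * (1 / max_pr_diff(1))) ** 0.5

# uniform distribution over the relation R = {(01, 00), (10, 00)} for OR_2
uniform_R = [(((0, 1), (0, 0)), 0.5), (((1, 0), (0, 0)), 0.5)]
```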

   Following [53, 38], it can be shown that the soft-adversary bound is a lower bound on
formula complexity. We have the following result, which is proved in Section A.

Proposition 7.5. For any non-constant function 𝑓 : {0, 1}^𝑛 β†’ {0, 1} we have 𝐿(𝑓) β‰₯ Adv_𝑠(𝑓)Β².

   Our motivation for introducing the soft-adversary bound is that it can be used for constructing
hiding projections. We have the following result.

Lemma 7.6. Let 𝑓 : {0, 1}^𝑛 β†’ {0, 1} be a non-constant Boolean function. There is a π‘ž-hiding projection
𝝅 to a single variable 𝑦 such that π‘ž = 1/Adv_𝑠(𝑓) and such that 𝑓|_𝝅 is a non-constant function with
probability 1.

Proof. Let (𝒂, 𝒃) be a distribution supported on 𝑓 βˆ’1 (1) Γ— 𝑓 βˆ’1 (0) which attains Adv𝑠 ( 𝑓 ). We
construct a π‘ž-hiding projection 𝝅 from π‘₯ 1 , . . . , π‘₯ 𝑛 to a single variable 𝑦 for π‘ž = 1/Adv𝑠 ( 𝑓 ), such
that 𝝅 𝑦←1 = 𝒂 and 𝝅 𝑦←0 = 𝒃. Note that this implies in particular that 𝑓 | 𝝅 is a non-constant
function, since 𝒂 ∈ 𝑓 βˆ’1 (1) and 𝒃 ∈ 𝑓 βˆ’1 (0). Specifically, for every input variable π‘₯ 𝑖 of 𝑓 , we
define 𝝅 as follows:

   β€’ If 𝒂_𝑖 = 𝒃_𝑖 then 𝝅(π‘₯_𝑖) = 𝒂_𝑖.

   β€’ If 𝒂_𝑖 = 1 and 𝒃_𝑖 = 0 then 𝝅(π‘₯_𝑖) = 𝑦.

   β€’ If 𝒂_𝑖 = 0 and 𝒃_𝑖 = 1 then 𝝅(π‘₯_𝑖) = ¬𝑦.


It is not hard to see that indeed 𝝅_{𝑦←1} = 𝒂 and 𝝅_{𝑦←0} = 𝒃. We now show that 𝝅 is (1/Adv_𝑠(𝑓))-hiding.
To this end, we show that 𝝅 is (π‘ž_0 , π‘ž_1)-hiding for

               π‘ž0 =     max Pr[𝒂 𝑖 β‰  𝒃 𝑖 | 𝒃 = 𝑏],            π‘ž1 =    max Pr[𝒂 𝑖 β‰  𝒃 𝑖 | 𝒂 = π‘Ž].
                      π‘βˆˆsupp(𝒃)                                      π‘Žβˆˆsupp(𝒂)
                        π‘–βˆˆ[𝑛]                                          π‘–βˆˆ[𝑛]
Note that indeed √(π‘ž_0 π‘ž_1) = 1/Adv_𝑠(𝑓). In order to prove that 𝝅 is (π‘ž_0 , π‘ž_1)-hiding, we need to
prove that

    Pr[𝝅(π‘₯_𝑖) ∈ {𝑦, ¬𝑦} | 𝝅_{π‘¦β†πœŽ} = πœ‹] ≀ π‘ž_𝜎

for every 𝑖 ∈ [𝑛] and every 𝜎 ∈ {0, 1}. To this end, observe that the event 𝝅(π‘₯_𝑖) ∈ {𝑦, ¬𝑦} occurs if
and only if 𝒂_𝑖 β‰  𝒃_𝑖, and that 𝝅_{𝑦←1}, 𝝅_{𝑦←0} are equal to the strings 𝒂, 𝒃 respectively. It follows that

                       Pr 𝝅(π‘₯ 𝑖 ) ∈ {𝑦, 𝑦} 𝝅 𝑦←1 = πœ‹ = Pr[𝒂 𝑖 β‰  𝒃 𝑖 | 𝒂 = πœ‹] ≀ π‘ž1
                                                         

                       Pr 𝝅(π‘₯ 𝑖 ) ∈ {𝑦, 𝑦} 𝝅 𝑦←0 = πœ‹ = Pr[𝒂 𝑖 β‰  𝒃 𝑖 | 𝒃 = πœ‹] ≀ π‘ž0
                                                         

as required.                                                                                            β–‘
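The three cases in the construction above are mechanical to implement. A small sketch (our code, with projections represented as dicts and the negated literal written "~y"):

```python
def hide(a, b):
    """Build the projection pi of Lemma 7.6 from a fixed pair (a, b):
    agreeing coordinates are fixed to the common value, and disagreeing
    coordinates are mapped to the literal y or to its negation ~y."""
    pi = {}
    for i, (ai, bi) in enumerate(zip(a, b)):
        if ai == bi:
            pi[i] = ai       # pi(x_i) = a_i = b_i
        elif ai == 1:
            pi[i] = "y"      # a_i = 1 and b_i = 0
        else:
            pi[i] = "~y"     # a_i = 0 and b_i = 1
    return pi

def restrict(pi, sigma):
    """Apply y <- sigma: the literal y becomes sigma, ~y becomes 1 - sigma."""
    return tuple({"y": sigma, "~y": 1 - sigma}.get(v, v) for v in pi.values())

a, b = (1, 0, 1, 1), (1, 1, 0, 1)
pi = hide(a, b)
```

The two checks π_{𝑦←1} = π‘Ž and π_{𝑦←0} = 𝑏 are exactly the property used in the proof.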
Remark 7.7. Several generalizations of Ambainis’ original quantum adversary bound have
appeared in the literature. Ε palek and Szegedy [53] showed that all of these methods are
equivalent, and so they are known collectively as the strong quantum adversary bound. The
formulation of the strong quantum adversary bound in Ambainis [3] makes it clear that it
subsumes our soft version. Laplante, Lee and Szegedy [38] showed that the strong quantum
adversary bound is a lower bound on formula complexity, and they came up with an even
stronger measure, maxPIΒ², that is still a lower bound on formula complexity, but no longer a
lower bound on quantum query complexity.
    HΓΈyer, Lee and Ε palek [28] generalized the strong quantum adversary bound, coming up
with a new measure known as the general adversary bound, which is a lower bound both on
quantum query complexity and (after squaring) on formula complexity. Reichardt [49, 50, 39, 51]
showed that the general adversary bound in fact coincides with quantum query complexity (up
to constant factors). These results are described in a recent survey by Li and Shirley [40].
Remark 7.8. We note that Adv𝑠 ( 𝑓 ) ≀ 𝑛 (always). In order to prove this, we show that each
factor in Definition 7.3 is at most 𝑛. In particular, observe that it always holds that 𝒂 β‰  𝒃, and
hence for every possible value π‘Ž of 𝒂, we have
    Ξ£_{𝑖=1}^{𝑛} Pr[𝒂_𝑖 β‰  𝒃_𝑖 | 𝒂 = π‘Ž] β‰₯ Pr[βˆƒ 𝑖 ∈ [𝑛] s.t. 𝒂_𝑖 β‰  𝒃_𝑖 | 𝒂 = π‘Ž] = 1.

By averaging, there must be a coordinate 𝑖 ∈ [𝑛] such that Pr[𝒂_𝑖 β‰  𝒃_𝑖 | 𝒂 = π‘Ž] β‰₯ 1/𝑛, and therefore
    min_{π‘Žβˆˆsupp(𝒂), π‘–βˆˆ[𝑛]} 1/Pr[𝒂_𝑖 β‰  𝒃_𝑖 | 𝒂 = π‘Ž] ≀ 𝑛.

The upper bound for the other factor in Definition 7.3 can be proved similarly.
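This averaging argument is easy to check numerically: for any distribution on pairs with 𝒂 β‰  𝒃, the disagreement probabilities across coordinates sum to at least 1. A small sketch (our code, with the distribution given as a list of pairs with probabilities):

```python
def min_inverse_factor(dist, n):
    """For each a in supp(a), the probabilities Pr[a_i != b_i | a = a] sum to
    at least 1 (since a != b always), so some coordinate has Pr >= 1/n and
    the factor min_{a,i} 1/Pr[a_i != b_i | a = a] is at most n."""
    best = float("inf")
    supp = {a for (a, _), p in dist if p > 0}
    for a0 in supp:
        mass = sum(p for (a, _), p in dist if a == a0)
        total = 0.0
        for i in range(n):
            pr = sum(p for (a, b), p in dist if a == a0 and a[i] != b[i]) / mass
            total += pr
            if pr > 0:
                best = min(best, 1 / pr)
        assert total >= 1 - 1e-9   # the disagreement probabilities sum to >= 1
    return best

dist = [(((0, 1), (0, 0)), 0.5), (((1, 0), (0, 0)), 0.5)]
```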


7.2   Khrapchenko bound
Khrapchenko [33] defined the following complexity measure for Boolean functions, and proved
that it is a lower bound on formula complexity.
Definition 7.9. Let 𝑓 : {0, 1} 𝑛 β†’ {0, 1} be a non-constant function. For every two sets 𝐴 βŠ† 𝑓 βˆ’1 (1)
and 𝐡 βŠ† 𝑓 βˆ’1 (0), let 𝐸(𝐴, 𝐡) be the set of pairs (π‘Ž, 𝑏) ∈ 𝐴 Γ— 𝐡 such that π‘Ž and 𝑏 differ on exactly
one coordinate. The Khrapchenko bound of 𝑓 is

    Khr(𝑓) =  max_{𝐴 βŠ† 𝑓⁻¹(1), 𝐡 βŠ† 𝑓⁻¹(0)}  |𝐸(𝐴, 𝐡)|Β² / (|𝐴| Β· |𝐡|) .

Theorem 7.10 ([33]). For every non-constant function 𝑓 : {0, 1} 𝑛 β†’ {0, 1} we have 𝐿( 𝑓 ) β‰₯ Khr( 𝑓 ).
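For small functions the bound can be computed exactly by enumerating all pairs of subsets. The brute-force sketch below is ours, for illustration only; it recovers the classical value Khr(XOR_2) = 4, which matches 𝐿(XOR_2) = 4.

```python
from itertools import chain, combinations, product

def khr(f, n):
    """Brute-force the Khrapchenko bound of f: {0,1}^n -> {0,1}."""
    ones = [a for a in product((0, 1), repeat=n) if f(a) == 1]
    zeros = [b for b in product((0, 1), repeat=n) if f(b) == 0]

    def subsets(s):
        return chain.from_iterable(combinations(s, r)
                                   for r in range(1, len(s) + 1))

    best = 0.0
    for A in subsets(ones):
        for B in subsets(zeros):
            # E(A, B): pairs differing in exactly one coordinate
            e = sum(1 for a in A for b in B
                    if sum(x != y for x, y in zip(a, b)) == 1)
            best = max(best, e * e / (len(A) * len(B)))
    return best
```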
    Laplante, Lee, and Szegedy [38] observed that the strong adversary method generalizes the
Khrapchenko bound. We show that this holds even for the unweighted adversary bound, up to
a constant factor. Specifically, we have the following result, which is proved in Section C.
Proposition 7.11. For any non-constant function 𝑓 : {0, 1}^𝑛 β†’ {0, 1} we have Adv_𝑒(𝑓) β‰₯ √(Khr(𝑓)/4).
   By combining Propositions 7.4 and 7.11 with Lemma 7.6, it follows that the Khrapchenko
bound too can be used for constructing hiding projections.
Corollary 7.12. Let 𝑓 : {0, 1}^𝑛 β†’ {0, 1} be a non-constant Boolean function. There is a π‘ž-hiding
projection 𝝅 to a single variable 𝑦 such that π‘ž = √(4/Khr(𝑓)) and such that 𝑓|_𝝅 is a non-constant function
with probability 1.
   In Section B, we describe a new complexity measure, the min-entropy Khrapchenko bound,
which generalizes the Khrapchenko bound in the same way the soft-adversary bound generalizes
the unweighted adversary bound. As we show there, the min-entropy Khrapchenko bound
provides a simple way to prove lower bounds on the soft-adversary bound.


8     Applications
In this section, we apply our shrinkage theorems to obtain new results regarding the KRW
conjecture and the formula complexity of 𝑨π‘ͺ 0 . First, in Section 8.1, we prove the KRW conjecture
for inner functions for which the soft-adversary bound is tight. We use this version of the KRW
conjecture in Section 8.2 to prove cubic formula lower bounds for a function in 𝑨π‘ͺ 0 . Finally, we
rederive some closely related known results in Section 8.3.

8.1   Application to the KRW conjecture
Given two Boolean functions 𝑓 : {0, 1}^π‘š β†’ {0, 1} and 𝑔 : {0, 1}^𝑛 β†’ {0, 1}, their (block-)composition
is the function 𝑓 β‹„ 𝑔 : {0, 1}^{π‘šπ‘›} β†’ {0, 1} defined by

    (𝑓 β‹„ 𝑔)(π‘₯_1, . . . , π‘₯_π‘š) = 𝑓(𝑔(π‘₯_1), . . . , 𝑔(π‘₯_π‘š)) ,


where π‘₯ 1 , . . . , π‘₯ π‘š ∈ {0, 1} 𝑛 . It is easy to see that 𝐿( 𝑓  𝑔) ≀ 𝐿( 𝑓 ) Β· 𝐿(𝑔). The KRW conjecture [31]
asserts that this is roughly optimal, namely, that 𝐿( 𝑓  𝑔) & 𝐿( 𝑓 ) Β· 𝐿(𝑔). In this section, we prove
the following related result.
Theorem 8.1. Let 𝑓 : {0, 1}^π‘š β†’ {0, 1} and 𝑔 : {0, 1}^𝑛 β†’ {0, 1} be non-constant Boolean functions.
Then,

    𝐿(𝑓 β‹„ 𝑔)^{1+𝑂(1/√log 𝐿(𝑓⋄𝑔))} β‰₯ Ξ©(π‘š^{βˆ’1/4}) Β· 𝐿(𝑓) Β· Adv_𝑠(𝑔)Β² .


Proof. Let 𝝅 be the π‘ž-hiding projection constructed for 𝑔 in Lemma 7.6 with π‘ž = 1/Adv_𝑠(𝑔), and
recall that 𝑔|_𝝅 is non-constant with probability 1. Let 𝝅_1, . . . , 𝝅_π‘š be independent copies of 𝝅, and
let 𝝅^π‘š denote their π‘š-fold join. Observe that 𝝅^π‘š is π‘ž-hiding according to Lemma 6.3. Applying
Corollary 3.13 and using the estimate √π‘₯ + π‘₯ = 𝑂(π‘₯ + 1), we see that 𝑠 = 𝐿(𝑓 β‹„ 𝑔) satisfies

    𝔼[𝐿((𝑓 β‹„ 𝑔)|_{𝝅^π‘š})] = π‘š^{1/4} Β· (1/Adv_𝑠(𝑔)Β²) Β· 𝑠^{1+𝑂(1/√log 𝑠)} + 𝑂(1).
On the other hand, it is not hard to see that

    ((𝑓 β‹„ 𝑔)|_{𝝅^π‘š})(π‘₯_1, . . . , π‘₯_π‘š) = 𝑓(𝑔|_{𝝅_1}(π‘₯_1), . . . , 𝑔|_{𝝅_π‘š}(π‘₯_π‘š)) ,

and since 𝑔|_{𝝅_1}, . . . , 𝑔|_{𝝅_π‘š} are non-constant, the function 𝑓 reduces to (𝑓 β‹„ 𝑔)|_{𝝅^π‘š}. In particular,

    𝐿(𝑓) ≀ 𝔼[𝐿((𝑓 β‹„ 𝑔)|_{𝝅^π‘š})] = π‘š^{1/4} Β· (1/Adv_𝑠(𝑔)Β²) Β· 𝑠^{1+𝑂(1/√log 𝑠)} + 𝑂(1).
This means that there exists a constant 𝐢 > 0 such that for all non-constant 𝑓 and 𝑔

    𝐿(𝑓) ≀ π‘š^{1/4} Β· (1/Adv_𝑠(𝑔)Β²) Β· 𝐿(𝑓 β‹„ 𝑔)^{1+𝐢/√log 𝐿(𝑓⋄𝑔)} + 𝐢.
We consider two cases depending on whether 𝐿(𝑓) β‰₯ 2𝐢 or not. If 𝐿(𝑓) β‰₯ 2𝐢, then the above
inequality implies

    𝐿(𝑓)/2 ≀ π‘š^{1/4} Β· (1/Adv_𝑠(𝑔)Β²) Β· 𝐿(𝑓 β‹„ 𝑔)^{1+𝐢/√log 𝐿(𝑓⋄𝑔)} ,

and we obtain the theorem by rearranging.
   If 𝐿(𝑓) < 2𝐢, then we have 𝐿(𝑓 β‹„ 𝑔) β‰₯ 𝐿(𝑔), since 𝑔 or ¬𝑔 is a subfunction of 𝑓 β‹„ 𝑔 (as follows
from the fact that 𝑓 is non-constant). This means that in this case

    𝐿(𝑓 β‹„ 𝑔) β‰₯ 𝐿(𝑔) > 𝐿(𝑓) Β· 𝐿(𝑔)/(2𝐢) β‰₯ 𝐿(𝑓) Β· Adv_𝑠(𝑔)Β²/(2𝐢) ,

which is stronger than the required conclusion.                                                                               β–‘
  A direct consequence of the theorem is the following corollary, which is a special case of the
KRW conjecture for inner functions 𝑔 for which the soft-adversary bound is almost tight.


Corollary 8.2. Let 𝑓 : {0, 1}^π‘š β†’ {0, 1} and 𝑔 : {0, 1}^𝑛 β†’ {0, 1} be non-constant Boolean functions
such that Adv_𝑠(𝑔)Β² β‰₯ 𝐿(𝑔)^{1βˆ’π‘‚(1/√log 𝐿(𝑔))}. Then,

    𝐿(𝑓 β‹„ 𝑔) β‰₯ Ξ©(π‘š^{βˆ’1/4}) Β· 𝐿(𝑓)^{1βˆ’π‘‚(1/√log 𝐿(𝑓))} Β· 𝐿(𝑔)^{1βˆ’π‘‚(1/√log 𝐿(𝑔))} .

Proof. By substituting the assumption on 𝑔 in the bound of Theorem 8.1, we obtain that

    𝐿(𝑓 β‹„ 𝑔)^{1+𝑂(1/√log 𝐿(𝑓⋄𝑔))} β‰₯ Ξ©(π‘š^{βˆ’1/4}) Β· 𝐿(𝑓) Β· 𝐿(𝑔)^{1βˆ’π‘‚(1/√log 𝐿(𝑔))} .

Moreover, since 𝐿(𝑓 β‹„ 𝑔) ≀ 𝐿(𝑓) Β· 𝐿(𝑔), we have

    𝐿(𝑓 β‹„ 𝑔)^{1+𝑂(1/√log 𝐿(𝑓⋄𝑔))} ≀ 𝐿(𝑓 β‹„ 𝑔) Β· (𝐿(𝑓) Β· 𝐿(𝑔))^{𝑂(1/√log 𝐿(𝑓⋄𝑔))}
                                   ≀ 𝐿(𝑓 β‹„ 𝑔) Β· 𝐿(𝑓)^{𝑂(1/√log 𝐿(𝑓⋄𝑔))} Β· 𝐿(𝑔)^{𝑂(1/√log 𝐿(𝑓⋄𝑔))}
                                   ≀ 𝐿(𝑓 β‹„ 𝑔) Β· 𝐿(𝑓)^{𝑂(1/√log 𝐿(𝑓))} Β· 𝐿(𝑔)^{𝑂(1/√log 𝐿(𝑔))} .

The corollary follows by combining the two bounds.                                                                                                                     β–‘
   Using the fact that Khr(𝑔) ≀ Adv²_s(𝑔) (Proposition 7.11), we obtain the following immediate
corollary.
Corollary 8.3. Let 𝑓 : {0, 1}^π‘š β†’ {0, 1} and 𝑔 : {0, 1}^𝑛 β†’ {0, 1} be non-constant Boolean functions.
Then,

        𝐿( 𝑓 β‹„ 𝑔)^{1+𝑂(1/√(log 𝐿( 𝑓 β‹„ 𝑔)))} β‰₯ 𝑂(π‘š^{1/4}) Β· 𝐿( 𝑓 ) Β· Khr(𝑔).

Moreover, if Khr(𝑔) β‰₯ 𝐿(𝑔)^{1βˆ’π‘‚(1/√(log 𝐿(𝑔)))}, then

        𝐿( 𝑓 β‹„ 𝑔) β‰₯ 𝑂(π‘š^{1/4}) Β· 𝐿( 𝑓 )^{1βˆ’π‘‚(1/√(log 𝐿( 𝑓 )))} Β· 𝐿(𝑔)^{1βˆ’π‘‚(1/√(log 𝐿(𝑔)))}.

8.2   Formula lower bounds for 𝑨π‘ͺ 0
In this section, we derive our second main application: cubic formula lower bounds for 𝑨π‘ͺ 0 .
Formally, we have the following result.
Theorem 1.1. There exists a family of Boolean functions β„Ž_𝑛 : {0, 1}^𝑛 β†’ {0, 1} for 𝑛 ∈ β„• such that

   1. β„Ž_𝑛 can be computed by uniform depth-4 unbounded fan-in formulas of size 𝑂(𝑛³).

   2. The formula size of β„Ž_𝑛 is at least 𝑛^{3βˆ’π‘œ(1)}.

                        T HEORY OF C OMPUTING, Volume 19 (7), 2023, pp. 1–51                                                                                           34

   The function β„Ž 𝑛 is constructed similarly to the Andreev function [4], with the parity function
replaced with the surjectivity function [6], defined next.

Definition 8.4. Let Ξ£ be a finite alphabet, and let π‘Ÿ ∈ β„• be such that π‘Ÿ β‰₯ |Ξ£|. The surjectivity
function SurjΞ£,π‘Ÿ : Ξ£π‘Ÿ β†’ {0, 1} interprets its input as a function from [π‘Ÿ] to Ξ£, and outputs whether
the function is surjective. In other words, Surj(𝜎1 , . . . , πœŽπ‘Ÿ ) = 1 if and only if every symbol in Ξ£
appears in (𝜎1 , . . . , πœŽπ‘Ÿ ).
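As a concrete reference point, Definition 8.4 can be evaluated directly; the following sketch (the function and variable names are ours, not from the paper) checks whether every symbol of the alphabet occurs in the input:

```python
def surj(sigma, alphabet):
    """Surjectivity: output 1 iff every symbol of the alphabet appears in sigma."""
    return 1 if set(alphabet) <= set(sigma) else 0

# Alphabet Sigma = {0, 1, 2} with r = 4 positions.
assert surj((0, 1, 2, 1), range(3)) == 1   # all three symbols appear
assert surj((0, 1, 1, 1), range(3)) == 0   # the symbol 2 is missing
```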

    In order to prove Theorem 1.1, we will use Theorem 8.1 with the inner function 𝑔 being Surj𝑛 .
To this end, we use the following result of [6].

Lemma 8.5 ([6]). For every natural number 𝑠 such that 𝑠 β‰₯ 2, we have Adv_u(Surj_{[𝑠],2π‘ βˆ’2}) = Ξ©(𝑠).

    Note that we can view the inputs of Surj_{Ξ£,π‘Ÿ} as binary strings of length 𝑛 = π‘Ÿ Β· ⌈log |Ξ£|βŒ‰ by
fixing some arbitrary binary encoding of the symbols in Ξ£. In what follows, we extend the
definition of Surj to every sufficiently large input length 𝑛 as follows: Given 𝑛 ∈ β„•, we choose 𝑠
to be the largest number such that (2𝑠 βˆ’ 2) Β· ⌈log π‘ βŒ‰ ≀ 𝑛, and define Surj_𝑛 to be the function
that interprets its input as a string in [𝑠]^{2π‘ βˆ’2} and computes Surj_{[𝑠],2π‘ βˆ’2} on it. Lemma 8.5 now
implies that for every sufficiently large 𝑛 ∈ β„• we have Adv_u(Surj_𝑛) = Ξ©(𝑛/log 𝑛). In particular,
this implies the same bound for the soft-adversary bound, i. e., Adv_s(Surj_𝑛) = Ξ©(𝑛/log 𝑛). For
completeness, we provide an alternative proof for this lower bound on Adv_s(Surj_𝑛) in Section B.
We turn to prove Theorem 1.1 using our special case of the KRW conjecture (Corollary 8.2).
We turn to prove Theorem 1.1 using our special case of the KRW conjecture (Corollary 8.2).

Proof of Theorem 1.1. We would like to construct, for every sufficiently large 𝑛 ∈ β„•, a func-
tion β„Ž_𝑛 : {0, 1}^𝑛 β†’ {0, 1} that is computable by a formula with unbounded fan-in of depth 4
and size 𝑂(𝑛³/log³ 𝑛) such that 𝐿(β„Ž_𝑛) = Ξ©(𝑛^{3βˆ’π‘œ(1)}). We start by constructing the function β„Ž_𝑛 for
input lengths 𝑛 of a special form, and then extend the construction to all sufficiently large input
lengths.
    First, assume that the input length 𝑛 is of the form 2^π‘˜ Β· (π‘˜ + 1), where π‘˜ ∈ β„• is sufficiently
large such that Surj_{2^π‘˜} is well-defined. The function β„Ž_𝑛 : {0, 1}^𝑛 β†’ {0, 1} takes two inputs: the
truth table of a function 𝑓 : {0, 1}^π‘˜ β†’ {0, 1}, and π‘˜ strings π‘₯_1, . . . , π‘₯_π‘˜ ∈ {0, 1}^{2^π‘˜}. On such inputs,
the function β„Ž_𝑛 outputs

        β„Ž_𝑛( 𝑓 , π‘₯_1, . . . , π‘₯_π‘˜) = ( 𝑓 β‹„ Surj_{2^π‘˜})(π‘₯_1, . . . , π‘₯_π‘˜) = 𝑓 (Surj_{2^π‘˜}(π‘₯_1), . . . , Surj_{2^π‘˜}(π‘₯_π‘˜)).

It is easy to see that the function β„Ž_𝑛 has input length 𝑛. We show that β„Ž_𝑛 can be computed by a
formula with unbounded fan-in of depth 4 and size 𝑂(𝑛³/log³ 𝑛). We start by constructing a formula
for Surj_{2^π‘˜}. Recall that Surj_{2^π‘˜} takes as input (the binary encoding of) a string (𝜎_1, . . . , 𝜎_π‘Ÿ) ∈ [𝑠]^π‘Ÿ,
where π‘Ÿ, 𝑠 = 𝑂(2^π‘˜/π‘˜). For simplicity, let us assume that every number in [𝑠] is encoded by exactly
one binary string, and that if the binary input to Surj_{2^π‘˜} contains a binary string in {0, 1}^{⌈log π‘ βŒ‰}



that does not encode any number in [𝑠], then we do not care what the formula outputs. Now,
observe that
                                                                                         π‘Ÿ
                                                                                        Γ›Γœ
                                      Surj2π‘˜ (𝜎1 , . . . , πœŽπ‘Ÿ ) =                                 (𝜎 𝑗 = 𝛾).                                        (8.1)
                                                                                    π›Ύβˆˆ[𝑠] 𝑗=1

It is not hard to see that the expression on the right-hand sidecan be implemented by a formula
with unbounded fan-in of depth 3 and size 𝑠 Β· π‘Ÿ Β· log 𝑠 = 𝑂 2π‘˜ . Next, observe that
                                                                                                           2π‘˜




                                                                                                  π‘˜
                                                                   "                                                                       !#
                                                     Ü                                            Γ›
                  β„Ž 𝑛 ( 𝑓 , π‘₯1 , . . . , π‘₯ π‘˜ ) =                       ( 𝑓 (𝑦) = 1) ∧                    Surj2π‘˜ (π‘₯ 𝑖 ) = 𝑦 𝑖                    .
                                                   π‘¦βˆˆ{0,1}   π‘˜                                     𝑖=1


Using the foregoing formula for surjectivity, it is not hard to see that the expression on the
right-hand side can be implemented by a formula with unbounded fan-in of depth 4 and size

        𝑂(2^π‘˜ Β· π‘˜ Β· 2^{2π‘˜}/π‘˜) = 𝑂(2^{3π‘˜}) = 𝑂(𝑛³/log³ 𝑛),

as required.
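As a sanity check, identity (8.1) can be verified by brute force for small parameters; this sketch (with our own names) compares the AND-of-ORs expression against the direct definition of surjectivity:

```python
from itertools import product

def surj_direct(sigma, s):
    """Direct definition: every symbol of [s] appears in sigma."""
    return all(g in sigma for g in range(s))

def surj_formula(sigma, s):
    """Equation (8.1): AND over symbols gamma of OR over positions j of (sigma_j = gamma)."""
    return all(any(sj == g for sj in sigma) for g in range(s))

s, r = 3, 5
assert all(surj_direct(x, s) == surj_formula(x, s) for x in product(range(s), repeat=r))
```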
    Finally, we prove that 𝐿(β„Ž_𝑛) = Ξ©(𝑛^{3βˆ’π‘œ(1)}). To this end, let us hardwire the input 𝑓 to be some
function from {0, 1}^π‘˜ to {0, 1} such that 𝐿( 𝑓 ) = Ξ©(2^π‘˜/log π‘˜) (such a function exists by a well-known
counting argument, see [30, Theorem 1.23]). After hardwiring the input 𝑓 , the function β„Ž_𝑛
becomes exactly the function 𝑓 β‹„ Surj_{2^π‘˜}. Now, by observing that Adv²_s(Surj_{2^π‘˜}) = Ω̃(𝐿(Surj_{2^π‘˜})) and
applying Corollary 8.2, it follows that
        𝐿(β„Ž_𝑛) β‰₯ 𝑂(π‘˜^{1/4}) Β· 𝐿( 𝑓 )^{1βˆ’π‘‚(1/√(log 𝐿( 𝑓 )))} Β· 𝐿(Surj_{2^π‘˜})^{1βˆ’π‘‚(1/√(log 𝐿(Surj_{2^π‘˜})))}
               β‰₯ 𝑂(π‘˜^{1/4}) Β· (2^π‘˜/log π‘˜)^{1βˆ’π‘‚(1/βˆšπ‘˜)} Β· (2^{2π‘˜}/π‘˜Β²)^{1βˆ’π‘‚(1/βˆšπ‘˜)}
               β‰₯ 2^{3π‘˜βˆ’π‘œ(1)}
               = 𝑛^{3βˆ’π‘œ(1)},

as required.
    It remains to deal with input lengths 𝑛 that are not of the form 2^π‘˜ Β· (π‘˜ + 1). For such input
lengths 𝑛, we choose π‘˜ to be the largest natural number such that 2^π‘˜ Β· (π‘˜ + 1) ≀ 𝑛, and proceed
as before. It can be verified that for this choice of π‘˜ we have 2^π‘˜ Β· (π‘˜ + 1) = Θ(𝑛), and therefore all
the foregoing asymptotic bounds continue to hold. ∎
Remark 8.6. We note that our methods cannot prove formula lower bounds that are better than
cubic. As explained in Remark 7.8, it holds that Adv²_s( 𝑓 ) ≀ 𝑛². Thus, one cannot expect to obtain
a lower bound that is better than cubic by combining these measures with Andreev's argument.


8.3     Reproving known formula lower bounds
The proof of Theorem 1.1 combines a lower bound on Adv𝑠 (Surj𝑛 ) with an upper bound on
𝐿(Surj𝑛 ). More generally, we can prove the following result along similar lines.
Theorem 8.7. Let 𝑔 : {0, 1}^𝑛 β†’ {0, 1} be an arbitrary function, and let π‘˜ be an integer satisfying
π‘˜ = log 𝑛 + 𝑂(1). Let 𝐹 : {0, 1}^{2^π‘˜+π‘›π‘˜} β†’ {0, 1} be a function on 𝑁 = Θ(𝑛 log 𝑛) variables, whose input
consists of a function 𝑓 : {0, 1}^π‘˜ β†’ {0, 1} and π‘˜ strings π‘₯_1, . . . , π‘₯_π‘˜ ∈ {0, 1}^𝑛, given by

        𝐹( 𝑓 , π‘₯_1, . . . , π‘₯_π‘˜) = ( 𝑓 β‹„ 𝑔)(π‘₯_1, . . . , π‘₯_π‘˜).

We call 𝐹 the 𝑔-based Andreev function.
    1. If either Adv²_s(𝑔) or Khr(𝑔) is at least 𝑛^{2βˆ’π‘œ(1)}, then 𝐿(𝐹) = Ξ©(𝑁^{3βˆ’π‘œ(1)}).

    2. If 𝐿(𝑔) = 𝑂(𝑛^{2+π‘œ(1)}), then 𝐿(𝐹) = 𝑂(𝑁^{3+π‘œ(1)}).
    The proof of Theorem 8.7 is very similar to the proof of Theorem 1.1, and so we leave it
to the reader (the only difference is that we may use Corollary 8.3 instead of Corollary 8.2).
Using Theorem 8.7, we can derive two known special cases: the original Andreev function (in
which 𝑔 is Parity), and the variant considered in [17]. In particular, using the known facts that
Khr(Parity) β‰₯ 𝑛² and Khr(Majority) β‰₯ Ξ©(𝑛²) [33], we obtain the following results.
Corollary 8.8. The formula complexity of the Parity𝑛 -based Andreev function is Θ(𝑁 3Β±π‘œ(1) ).
Corollary 8.9. For 𝑛 β‰₯ 3 odd, let Majority_𝑛 : {0, 1}^𝑛 β†’ {0, 1} be the Majority function. The formula
complexity of the Majority_𝑛-based Andreev function is Ξ©(𝑁^{3βˆ’π‘œ(1)}).
   The best known upper bound on the formula complexity of Majority_𝑛 is only 𝑂(𝑛^{3.91}) [52],
and so we do not expect Corollary 8.9 to be tight.


A       Proof of Proposition 7.5
In this appendix, we prove Proposition 7.5, restated next.
Proposition 7.5. For any 𝑓 : {0, 1}^𝑛 β†’ {0, 1} we have 𝐿( 𝑓 ) β‰₯ Adv²_s( 𝑓 ).

Preliminaries
While the proof of Proposition 7.5 is fairly simple, it uses some basic concepts from information
theory which we review next.
Definition A.1. Let 𝒙, π’š, 𝒛 be discrete random variables.

   β€’ The entropy of 𝒙 is 𝐻(𝒙) = 𝔼_{π‘₯←𝒙} [log(1/Pr[𝒙 = π‘₯])].

   β€’ The conditional entropy of 𝒙 given π’š is 𝐻(𝒙 | π’š) = 𝔼_{π‘¦β†π’š} [𝐻(𝒙 | π’š = 𝑦)].


   β€’ The mutual information between 𝒙 and π’š is 𝐼(𝒙; π’š) = 𝐻(𝒙) βˆ’ 𝐻(𝒙 | π’š).

   β€’ The conditional mutual information between 𝒙 and π’š given 𝒛 is

                                      𝐼(𝒙; π’š | 𝒛) = 𝐻(𝒙 | 𝒛) βˆ’ 𝐻(𝒙 | π’š, 𝒛).


    We use the following basic facts from information theory (see [11] for proofs).

Fact A.2. Let 𝒙, π’š, 𝒛 be discrete random variables with finite supports, and let 𝑋 denote the support of 𝒙.

   β€’ It holds that 0 ≀ 𝐻(𝒙) ≀ log |𝑋 |.

   β€’ It holds that 0 ≀ 𝐻(𝒙 | π’š) ≀ 𝐻(𝒙), where 𝐻(𝒙 | π’š) = 0 if and only if π’š determines 𝒙 (i. e., 𝒙 is a
     function of π’š).

   β€’ If π’š determines 𝒙 conditioned on 𝒛 (i. e., 𝒙 is a function of π’š and 𝒛) then 𝐻(π’š | 𝒛) β‰₯ 𝐻(𝒙 | 𝒛).

   β€’ It holds that 0 ≀ 𝐼(𝒙; π’š) ≀ 𝐻(𝒙). Similarly, we have 0 ≀ 𝐼(𝒙; π’š | 𝒛) ≀ 𝐻(𝒙 | 𝒛), where
     𝐼(𝒙; π’š | 𝒛) = 0 if and only if 𝒙 and π’š are independent conditioned on any value of 𝒛.

   β€’ It holds that 𝐼(𝒙; π’š) = 𝐼(π’š; 𝒙). Similarly, 𝐼(𝒙; π’š | 𝒛) = 𝐼(π’š; 𝒙 | 𝒛).

   β€’ The chain rule: It holds that

                                  𝐼(𝒙; π’š, 𝒛 | π’˜) = 𝐼(𝒙; 𝒛 | π’˜) + 𝐼(𝒙; π’š | 𝒛, π’˜).


    Finally, we use the following result from the theory of interactive information complexity.

Claim A.3 ([5, Fact 4.15]). Let Ξ  be a deterministic protocol. Suppose we invoke Ξ  on random inputs 𝒙
and π’š for Alice and Bob, respectively, and let β„“ denote the random leaf that Ξ  reaches on those inputs.
Then,
                                 𝐼(𝒙, π’š; β„“) β‰₯ 𝐼(𝒙; β„“ | π’š) + 𝐼(π’š; β„“ | 𝒙).

Proof sketch. Let 𝒕 denote the transcript of the protocol that is associated with β„“. We prove that
𝐼(𝒙, π’š; 𝒕) β‰₯ 𝐼(𝒙; 𝒕 | π’š) + 𝐼(π’š; 𝒕 | 𝒙), as this is equivalent to the claim. Suppose Alice speaks first,
and denote the (random) bit she sends by 𝒕1 . We show that 𝐼(𝒙, π’š; 𝒕1 ) β‰₯ 𝐼(𝒙; 𝒕1 | π’š) + 𝐼(π’š; 𝒕1 | 𝒙).
Using the chain rule, we can write

             𝐼(𝒙, π’š; 𝒕1 ) = 𝐼(π’š; 𝒕1 ) + 𝐼(𝒙; 𝒕1 | π’š) β‰₯ 𝐼(𝒙; 𝒕1 | π’š) = 𝐼(𝒙; 𝒕1 | π’š) + 𝐼(π’š; 𝒕1 | 𝒙),

where the last equality follows since 𝐼(π’š; 𝒕1 | 𝒙) = 0, as Alice’s message 𝒕1 is independent of π’š
given her input 𝒙. Proceeding by induction on the coordinates of 𝒕 using the chain rule finishes
the proof.                                                                                      


The proof of Proposition 7.5
Our proof of Proposition 7.5 generalizes similar arguments in [32, 19]. Let Ξ  be a protocol that
solves KW_𝑓. For every leaf β„“ of Ξ , we denote by 𝑖_β„“ the output of the protocol at leaf β„“, so we
have π‘Ž_{𝑖_β„“} β‰  𝑏_{𝑖_β„“} for any pair of inputs (π‘Ž, 𝑏) that reach the leaf β„“. We prove that 𝐿(Ξ ) β‰₯ Adv²_s( 𝑓 ). Let
(𝒂, 𝒃) be a distribution for 𝑓 that attains Adv_s( 𝑓 ), and let

        π‘ž_0 = max_{π‘βˆˆsupp(𝒃), π‘–βˆˆ[𝑛]} Pr[𝒂_𝑖 β‰  𝒃_𝑖 | 𝒃 = 𝑏],        π‘ž_1 = max_{π‘Žβˆˆsupp(𝒂), π‘–βˆˆ[𝑛]} Pr[𝒂_𝑖 β‰  𝒃_𝑖 | 𝒂 = π‘Ž].

Let β„“ be the leaf that Ξ  reaches on input (𝒂, 𝒃). By Fact A.2, we have
                                         𝐼(β„“; 𝒂, 𝒃) ≀ 𝐻(β„“) ≀ log |𝐿(Ξ )| .
On the other hand, Claim A.3 implies that
                                       𝐼(β„“; 𝒂, 𝒃) β‰₯ 𝐼(β„“; 𝒂 | 𝒃) + 𝐼(β„“; 𝒃 | 𝒂).
Next, observe that 𝒂 and 𝒃 together determine β„“, and that the event β„“ = β„“ implies that 𝒂 𝑖ℓ β‰  𝒃 𝑖ℓ .
It follows that
      𝐼(β„“; 𝒂 | 𝒃) = 𝐻(β„“ | 𝒃) βˆ’ 𝐻(β„“ | 𝒂, 𝒃)                                          (by definition)
                  = 𝐻(β„“ | 𝒃)                                              (𝒂 and 𝒃 determine β„“)
                  = 𝔼_{ℓ←ℓ, 𝑏←𝒃} [log(1/Pr[β„“ = β„“ | 𝒃 = 𝑏])]                      (Definition A.1)
                  β‰₯ 𝔼_{ℓ←ℓ, 𝑏←𝒃} [log(1/Pr[𝒂_{𝑖_β„“} β‰  𝒃_{𝑖_β„“} | 𝒃 = 𝑏])]
                  β‰₯ 𝔼_{ℓ←ℓ, 𝑏←𝒃} [log(1/π‘ž_0)]                                 (Definition of π‘ž_0)
                  = log(1/π‘ž_0).

Similarly, it can be shown that 𝐼(β„“; 𝒃 | 𝒂) β‰₯ log(1/π‘ž_1). By combining the foregoing equations, it
follows that

        log |𝐿(Ξ )| β‰₯ 𝐼(β„“; 𝒂 | 𝒃) + 𝐼(β„“; 𝒃 | 𝒂) β‰₯ log(1/π‘ž_0) + log(1/π‘ž_1),

and thus

        |𝐿(Ξ )| β‰₯ (1/π‘ž_0) Β· (1/π‘ž_1) = Adv²_s( 𝑓 ),

as required.
Remark A.4. We note that Proposition 7.5 can also be proved by observing that the soft-adversary
bound is a special case of the weighted adversary bound [3], which was shown to be a lower
bound on formula complexity by [38]. Alternatively, one could prove Proposition 7.5 (up to a
constant factor) by combining Lemma 7.6 along with Lemma 5.1 and Lemma 4.1.


B    The min-entropy Khrapchenko bound
In this appendix, we describe a generalization of the Khrapchenko bound that provides a simple
lower bound technique for the soft-adversary bound. We then show how this generalization
can be used to give an alternative proof for the lower bound on the soft-adversary bound of the
surjectivity function. We start by recalling the original Khrapchenko bound.

Definition 7.9. Let 𝑓 : {0, 1} 𝑛 β†’ {0, 1} be a non-constant function. For every two sets 𝐴 βŠ† 𝑓 βˆ’1 (1)
and 𝐡 βŠ† 𝑓 βˆ’1 (0), let 𝐸(𝐴, 𝐡) be the set of pairs (π‘Ž, 𝑏) ∈ 𝐴 Γ— 𝐡 such that π‘Ž and 𝑏 differ on exactly
one coordinate. The Khrapchenko bound of 𝑓 is

        Khr( 𝑓 ) = max_{π΄βŠ†π‘“^{βˆ’1}(1), π΅βŠ†π‘“^{βˆ’1}(0)} |𝐸(𝐴, 𝐡)|²/(|𝐴| Β· |𝐡|).


    The Khrapchenko measure can be viewed as follows: Consider the subgraph of the Hamming
cube that consists only of the cut between 𝐴 and 𝐡. Then, the measure |𝐸(𝐴, 𝐡)|²/(|𝐴| Β· |𝐡|) is the
product of the average degree of a vertex in 𝐴 (which is |𝐸(𝐴, 𝐡)|/|𝐴|) and the average degree of a
vertex in 𝐡 (which is |𝐸(𝐴, 𝐡)|/|𝐡|). Note that the average degree of 𝐴 can also be described as the
average, over all strings π‘Ž ∈ 𝐴, of the number of coordinates 𝑖 ∈ [𝑛] such that if we flip the 𝑖-th
bit of π‘Ž we get a string in 𝐡. We generalize the Khrapchenko measure as follows:

    β€’ Whereas the Khrapchenko bound maximizes over all cuts (𝐴, 𝐡) of the Hamming cube
      with 𝐴 βŠ† 𝑓 βˆ’1 (1) and 𝐡 βŠ† 𝑓 βˆ’1 (0), we are maximizing over all distributions over edges (𝒂, 𝒃)
      of the Hamming cube, where 𝒂 ∈ 𝑓 βˆ’1 (1) and 𝒃 ∈ 𝑓 βˆ’1 (0).

    β€’ Whereas the Khrapchenko bound considers the average number of coordinates 𝑖 as
      described above, we consider the min-entropy of the coordinate π’Š on which 𝒂, 𝒃 differ.

    β€’ Whereas the Khrapchenko bound considers functions whose inputs are binary strings, we
      consider inputs that are strings over arbitrary finite alphabets.

Our generalization uses the following notion: The conditional min-entropy of a random variable 𝒙
conditioned on a random variable π’š is 𝐻_∞(𝒙 | π’š) = min_{π‘₯,𝑦} log(1/Pr[𝒙 = π‘₯ | π’š = 𝑦]). We can now
define our generalization formally:
our generalization formally:

Definition B.1. Let Ξ£ be a finite alphabet, and let 𝑓 : Σ𝑛 β†’ {0, 1} be a non-constant Boolean
function. We say that a distribution (𝒂, 𝒃) on 𝑓 βˆ’1 (1) Γ— 𝑓 βˆ’1 (0) is a Khrapchenko distribution for 𝑓 if
𝒂 and 𝒃 always differ on a unique coordinate π’Š ∈ [𝑛]. We define the min-entropy Khrapchenko
bound of 𝑓 , which we denote Khr𝐻∞ ( 𝑓 ), to be the maximum of the quantity

                                        2^{𝐻_∞(π’Š|𝒂)+𝐻_∞(π’Š|𝒃)}

over all Khrapchenko distributions (𝒂, 𝒃) for 𝑓 .
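Given a Khrapchenko distribution as an explicit table of probabilities, the conditional min-entropies in Definition B.1 are directly computable; a sketch (names are ours):

```python
from collections import defaultdict
from math import log2

def min_entropy_cond(table):
    """H_inf(i | w) for a table {(w, i): probability}: -log2 of the largest conditional probability."""
    w_marg = defaultdict(float)
    for (w, i), q in table.items():
        w_marg[w] += q
    return -log2(max(q / w_marg[w] for (w, i), q in table.items() if q > 0))

# Uniform distribution over the n edges incident to the all-zeros vertex of the n-cube:
# given a, the coordinate i is uniform; given b, it is determined.
n = 8
a = (0,) * n
table_a = {(a, i): 1 / n for i in range(n)}
table_b = {(tuple(1 if j == i else 0 for j in range(n)), i): 1 / n for i in range(n)}
assert abs(min_entropy_cond(table_a) - log2(n)) < 1e-9   # H_inf(i | a) = log n
assert min_entropy_cond(table_b) == 0.0                  # H_inf(i | b) = 0
```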


    As noted above, the min-entropy Khrapchenko bound can be used to lower bound the
soft-adversary bound. In order to define the adversary bound of a function 𝑓 : Σ^𝑛 β†’ {0, 1}
over a non-boolean alphabet Ξ£, we view 𝑓 as if its input is a binary string in {0, 1}^{𝑛·⌈log|Ξ£|βŒ‰} that
encodes a string in Σ^𝑛 via some fixed encoding (the choice of the encoding does not matter for
what follows).
Lemma B.2. For every non-constant function 𝑓 : Σ^𝑛 β†’ {0, 1} we have Adv_s( 𝑓 ) β‰₯ √(Khr_{𝐻_∞}( 𝑓 )).

Proof. Let (𝒂, 𝒃) be a Khrapchenko distribution that attains Khr_{𝐻_∞}( 𝑓 ). Let π’Š ∈ [𝑛] be the unique
index at which 𝒂, 𝒃 differ, and let 𝒂_{𝑖,𝑗}, 𝒃_{𝑖,𝑗} be the 𝑗-th bits of the binary encoding of 𝒂_𝑖, 𝒃_𝑖 ∈ Ξ£.
For any π‘Ž ∈ supp(𝒂) and (𝑖, 𝑗) ∈ [𝑛] Γ— [⌈log |Ξ£|βŒ‰], we have

        Pr[𝒂_{𝑖,𝑗} β‰  𝒃_{𝑖,𝑗} | 𝒂 = π‘Ž] ≀ Pr[𝒂_𝑖 β‰  𝒃_𝑖 | 𝒂 = π‘Ž] = Pr[π’Š = 𝑖 | 𝒂 = π‘Ž] ≀ 2^{βˆ’π»_∞(π’Š|𝒂)}.

It follows that the first factor in the definition of Adv_s is at least √(2^{𝐻_∞(π’Š|𝒂)}). Similarly, the second
factor is at least √(2^{𝐻_∞(π’Š|𝒃)}), and so the entire expression is at least √(Khr_{𝐻_∞}( 𝑓 )). ∎
     Similarly to the original Khrapchenko bound, the min-entropy Khrapchenko bound also
gives a lower bound on the formula complexity 𝐿( 𝑓 ), which can be proved by combining
Proposition 7.5 and Lemma B.2. Again, in order to define the formula complexity of a function
𝑓 : Σ^𝑛 β†’ {0, 1} over a non-boolean alphabet, we view 𝑓 as if its input is a binary string
in {0, 1}^{𝑛·⌈log|Ξ£|βŒ‰}.
Corollary B.3. Let Ξ£ be a finite alphabet, and let 𝑓 : Σ𝑛 β†’ {0, 1} be a non-constant Boolean function.
Then 𝐿( 𝑓 ) β‰₯ Khr𝐻∞ ( 𝑓 ).
By combining Lemmas 7.6 and B.2, it also follows that the min-entropy Khrapchenko bound too
can be used for constructing hiding projections.
Corollary B.4. Let Ξ£ be a finite alphabet, and let 𝑓 : Σ^𝑛 β†’ {0, 1} be a non-constant Boolean function.
There is a π‘ž-hiding projection 𝝅 to a single variable 𝑦 such that π‘ž = 1/√(Khr_{𝐻_∞}( 𝑓 )) and such that 𝑓 |_𝝅
is a non-constant function with probability 1.

B.1   Lower bound on the surjectivity function
In order to demonstrate the usefulness of the min-entropy Khrapchenko bound, we use it to
prove a lower bound on the surjectivity function from Section 8.2. This implies a similar bound
on the soft-adversary bound of the surjectivity function, and provides an alternative way for
proving our main theorem.
Definition 8.4. Let Ξ£ be a finite alphabet, and let π‘Ÿ ∈ β„• be such that π‘Ÿ β‰₯ |Ξ£|. The surjectivity
function SurjΞ£,π‘Ÿ : Ξ£π‘Ÿ β†’ {0, 1} interprets its input as a function from [π‘Ÿ] to Ξ£, and outputs whether
the function is surjective. In other words, Surj(𝜎1 , . . . , πœŽπ‘Ÿ ) = 1 if and only if every symbol in Ξ£
appears in (𝜎1 , . . . , πœŽπ‘Ÿ ).
Lemma B.5. For every 𝑠 ∈ β„•, we have Khr_{𝐻_∞}(Surj_{[2𝑠+1],3𝑠+1}) = Ξ©(𝑠²).



Proof. Let 𝑠 ∈ β„• , let Ξ£ = [2𝑠 + 1], and let Surj = SurjΞ£,3𝑠+1 . We define a Khrapchenko distribu-
tion (𝒂, 𝒃) for Surj as follows. The input 𝒃 ∈ Surjβˆ’1 (0) is a uniformly distributed string in Ξ£3𝑠+1
in which 𝑠 + 1 of the symbols in Ξ£ appear exactly twice, 𝑠 βˆ’ 1 of the symbols in Ξ£ appear exactly
once, and the remaining symbol in Ξ£ does not appear at all. The string 𝒂 is sampled by choosing
uniformly at random a coordinate π’Š such that 𝒃 π’Š is one of the symbols that appear twice in 𝒃,
and replacing 𝒃 π’Š with the unique symbol in Ξ£ that does not appear in 𝒃.
    We turn to bound 𝐻∞ (π’Š | 𝒃) and 𝐻∞ (π’Š | 𝒂). First, observe that conditioned on any choice
of 𝒃, the coordinate π’Š is uniformly distributed among 2𝑠 + 2 coordinates, and therefore

                                       𝐻∞ (π’Š | 𝒃) = log(2𝑠 + 2).

In order to bound 𝐻∞ (π’Š | 𝒂), observe that 𝒂 is distributed like a uniformly distributed string
in Ξ£3𝑠+1 in which 𝑠 + 1 of the symbols in Ξ£ appear exactly once, and the remaining 𝑠 symbols in Ξ£
appear exactly twice. Conditioned on any choice of 𝒂, the coordinate π’Š is uniformly distributed
over the 𝑠 + 1 coordinates of symbols that appear exactly once. It follows that

                                        𝐻∞ (π’Š | 𝒂) β‰₯ log(𝑠 + 1).

We conclude that

        Khr_{𝐻_∞}(Surj) β‰₯ 2^{𝐻_∞(π’Š|𝒂)+𝐻_∞(π’Š|𝒃)} β‰₯ (𝑠 + 1) Β· (2𝑠 + 2) = Ξ©(𝑠²). ∎
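The computation in this proof can be confirmed exhaustively for small 𝑠. The sketch below (all names are ours) enumerates the support of 𝒃 for 𝑠 = 2, builds the coupled triples (𝒂, 𝒃, π’Š), and checks both conditional min-entropies:

```python
from collections import Counter, defaultdict
from itertools import product
from math import log2

s = 2
Sigma = range(2 * s + 1)   # |Sigma| = 2s + 1 = 5
r = 3 * s + 1              # input length 3s + 1 = 7

# Support of b: s+1 symbols appear twice, s-1 appear once, and one symbol is missing.
support = [x for x in product(Sigma, repeat=r)
           if sorted(Counter(x).values()) == [1] * (s - 1) + [2] * (s + 1)]

# Equally likely triples (a, b, i): i is one of the 2s+2 coordinates holding a
# doubled symbol of b, and a is b with position i set to the missing symbol.
triples = []
for b in support:
    counts = Counter(b)
    missing = next(g for g in Sigma if g not in counts)
    for i in (i for i in range(r) if counts[b[i]] == 2):
        triples.append((b[:i] + (missing,) + b[i + 1:], b, i))

cond_a, cond_b = defaultdict(list), defaultdict(list)
for a, b, i in triples:
    cond_a[a].append(i)
    cond_b[b].append(i)

H_inf_b = -log2(max(v.count(i) / len(v) for v in cond_b.values() for i in set(v)))
H_inf_a = -log2(max(v.count(i) / len(v) for v in cond_a.values() for i in set(v)))
assert abs(H_inf_b - log2(2 * s + 2)) < 1e-9                     # = log(2s + 2)
assert abs(H_inf_a - log2(s + 1)) < 1e-9                         # = log(s + 1)
assert 2 ** (H_inf_a + H_inf_b) >= (s + 1) * (2 * s + 2) - 1e-6  # >= (s+1)(2s+2)
```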

    Lemma B.5 implies, via Corollary B.3, a quadratic lower bound on the formula complexity
of Surj. The same result also follows from the lower bound on the adversary bound of Surj due
to [6] (Lemma 8.5).

B.1.1   Comparison to the Khrapchenko bound
We now further compare the min-entropy Khrapchenko bound to the original Khrapchenko
bound. Observe that when Ξ£ = {0, 1}, the min-entropy 𝐻_∞(π’Š | 𝒂) is exactly the min-entropy of a
random neighbor of 𝒂. In particular, when (𝒂, 𝒃) is the uniform distribution over a set of edges,
the min-entropy 𝐻_∞(π’Š | 𝒂) is the logarithm of the minimal degree of a vertex π‘Ž. Moreover, if
the latter set of edges also induces a regular graph, then the measure 2^{𝐻_∞(π’Š|𝒂)+𝐻_∞(π’Š|𝒃)} coincides
exactly with the original measure |𝐸(𝐴, 𝐡)|²/(|𝐴| Β· |𝐡|) of Khrapchenko. More generally, when
Ξ£ = {0, 1}, the bound Khr_{𝐻_∞}( 𝑓 ) is within a constant factor of the original Khrapchenko bound,
as we show in Section C.
Proposition B.6. For any non-constant function 𝑓 : {0, 1}^𝑛 β†’ {0, 1} it holds that

        Khr( 𝑓 )/4 ≀ Khr_{𝐻_∞}( 𝑓 ) ≀ Khr( 𝑓 ).

   Unfortunately, when Ξ£ is a larger alphabet, the connection between Khr_{𝐻_∞}( 𝑓 ) and the
measure |𝐸(𝐴, 𝐡)|²/(|𝐴||𝐡|) is not so clean. Specifically, the min-entropy 𝐻_∞(π’Š | 𝒂) has no clear
connection to the degree of 𝒂, since the vertex 𝒂 may have multiple neighbors that correspond
to the same coordinate π’Š.


B.1.2    Previous work on the Khrapchenko bound
Several versions of the Khrapchenko bound appeared in the literature: Zwick [58] generalized
the Khrapchenko bound such that different input coordinates can be given different weights,
and Koutsoupias [36] gave a spectral generalization of the bound. The paper of HΓ₯stad [23]
observed that his analogue of Lemma 4.1 can be viewed as a generalization of the Khrapchenko
bound. Ganor, Komargodski, and Raz [18] considered a variant of the Khrapchenko bound
in which the edges of the Boolean hypercube are replaced with random walks on the noisy
hypercube. Of particular relevance is a paper of Laplante, Lee, and Szegedy [38] that defined a
complexity measure that is very similar to our min-entropy Khrapchenko bound, except that
the entropy is replaced with Kolmogorov complexity.

B.1.3    An entropy Khrapchenko bound
It is possible to generalize the complexity measure Khr𝐻∞ by replacing the min-entropy in
Definition B.1 with Shannon entropy. Such a measure would still give a lower bound on formula
complexity β€” specifically, the proof of Corollary B.3 would go through without a change.
However, we do not know how to use such a measure for constructing hiding projections as in
Corollary B.4. We note that it is easy to prove that such a measure is an upper bound on Khr𝐻∞ .
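The last remark rests on the general inequality 𝐻(𝑿) β‰₯ 𝐻∞ (𝑿), which holds for every random variable 𝑿. A minimal Python sketch (the distribution is an arbitrary illustrative choice, not taken from the paper) checks the two quantities side by side:

```python
import math

def shannon_entropy(probs):
    """H(X) = -sum_x p(x) * log2 p(x), over outcomes with p(x) > 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def min_entropy(probs):
    """H_inf(X) = -log2 max_x p(x)."""
    return -math.log2(max(probs))

# Hypothetical skewed distribution: Shannon entropy strictly exceeds
# min-entropy, so an entropy variant of the measure can only be larger.
probs = [0.5, 0.25, 0.125, 0.125]
assert min_entropy(probs) <= shannon_entropy(probs)
print(shannon_entropy(probs), min_entropy(probs))  # 1.75 1.0
```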


C       Proof of Propositions B.6 and 7.11
In this appendix we prove Propositions B.6 and 7.11, restated next.
Proposition B.6. For any non-constant function 𝑓 : {0, 1} 𝑛 β†’ {0, 1} we have Khr( 𝑓 )/4 ≀ Khr𝐻∞ ( 𝑓 ) ≀ Khr( 𝑓 ).

Proposition 7.11. For any 𝑓 : {0, 1} 𝑛 β†’ {0, 1} we have √(Khr( 𝑓 )/4) ≀ Adv𝑒 ( 𝑓 ).

    We relate Khr( 𝑓 ) to Khr𝐻∞ ( 𝑓 ) using an auxiliary measure Khrmin ( 𝑓 ):

                   Khrmin ( 𝑓 ) = maxπ΄βŠ† 𝑓 βˆ’1 (1),π΅βŠ† 𝑓 βˆ’1 (0) ( minπ‘Žβˆˆπ΄ |𝐸(π‘Ž, 𝐡)| Β· minπ‘βˆˆπ΅ |𝐸(𝐴, 𝑏)| ),

where 𝐸(π‘Ž, 𝐡) = 𝐸({π‘Ž}, 𝐡) and similarly 𝐸(𝐴, 𝑏) = 𝐸(𝐴, {𝑏}).
  We first show that Khrmin ( 𝑓 ) and Khr( 𝑓 ) are equal up to constants.
Claim C.1. For any 𝑓 : Σ𝑛 β†’ {0, 1} we have Khr( 𝑓 )/4 ≀ Khrmin ( 𝑓 ) ≀ Khr( 𝑓 ).
Proof. The inequality Khrmin ( 𝑓 ) ≀ Khr( 𝑓 ) is simple since minπ‘Žβˆˆπ΄ |𝐸(π‘Ž, 𝐡)| ≀ |𝐸(𝐴, 𝐡)|/|𝐴| and
similarly minπ‘βˆˆπ΅ |𝐸(𝐴, 𝑏)| ≀ |𝐸(𝐴, 𝐡)|/|𝐡|.
   The other direction is more subtle. For ease of notation, for any sets 𝐴 βŠ† 𝑓 βˆ’1 (1) and
𝐡 βŠ† 𝑓 βˆ’1 (0) we write Khr(𝐴, 𝐡) = |𝐸(𝐴, 𝐡)|²/(|𝐴||𝐡|). Thus,

                                Khr( 𝑓 ) = maxπ΄βŠ† 𝑓 βˆ’1 (1),π΅βŠ† 𝑓 βˆ’1 (0) Khr(𝐴, 𝐡).                        (C.1)



Assume that 𝐴 and 𝐡 are sets that maximize Khr(𝐴, 𝐡) in Equation C.1. We show that

                                       βˆ€π‘Ž ∈ 𝐴 :      |𝐸(π‘Ž, 𝐡)| β‰₯ |𝐸(𝐴,𝐡)|
                                                                   2|𝐴|
                                                                          ,                           (C.2)
                                        βˆ€π‘ ∈ 𝐡 :     |𝐸(𝐴, 𝑏)| β‰₯ |𝐸(𝐴,𝐡)|
                                                                   2|𝐡|
                                                                          .                           (C.3)

In words, the min-degree is at least half the average degree. Before showing why Equations C.2
and C.3 hold, we show that they imply the statement of the claim. Indeed,

        Khrmin ( 𝑓 ) β‰₯ minπ‘Žβˆˆπ΄ |𝐸(π‘Ž, 𝐡)| Β· minπ‘βˆˆπ΅ |𝐸(𝐴, 𝑏)| β‰₯ (|𝐸(𝐴, 𝐡)|/(2|𝐴|)) Β· (|𝐸(𝐴, 𝐡)|/(2|𝐡|)) = Khr(𝐴, 𝐡)/4 = Khr( 𝑓 )/4.


   It remains to show that Equations C.2 and C.3 hold. We focus on Equation C.2 due to
symmetry. Assume by contradiction that there exists an π‘Ž ∈ 𝐴 for which

                                              |𝐸(π‘Ž, 𝐡)| < |𝐸(𝐴, 𝐡)|/(2|𝐴|).

It must be the case that |𝐴| > 1, as otherwise 𝐴 contains only one element and |𝐸(π‘Ž, 𝐡)| =
|𝐸(𝐴, 𝐡)|/|𝐴| for this element. Consider now the set 𝐴0 = 𝐴 \ {π‘Ž} β€” which is non-empty by
the above discussion. We claim that Khr(𝐴0 , 𝐡) > Khr(𝐴, 𝐡), contradicting the choice of 𝐴, 𝐡.
Indeed,

              Khr(𝐴0 , 𝐡) = |𝐸(𝐴0 , 𝐡)|²/(|𝐴0 ||𝐡|)
                          = (|𝐸(𝐴, 𝐡)| βˆ’ |𝐸(π‘Ž, 𝐡)|)²/((|𝐴| βˆ’ 1)|𝐡|)
                          > (|𝐸(𝐴, 𝐡)| βˆ’ |𝐸(𝐴, 𝐡)|/(2|𝐴|))²/((|𝐴| βˆ’ 1)|𝐡|)      (By assumption on π‘Ž)
                          = |𝐸(𝐴, 𝐡)|² Β· (1 βˆ’ 1/(2|𝐴|))²/(|𝐴||𝐡| Β· (1 βˆ’ 1/|𝐴|))
                          = Khr(𝐴, 𝐡) Β· (1 βˆ’ 1/(2|𝐴|))²/(1 βˆ’ 1/|𝐴|)
                          > Khr(𝐴, 𝐡).                                           β–‘
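Claim C.1 can be sanity-checked by brute force on tiny functions. The sketch below (illustrative code, not part of the paper, and feasible only for very small 𝑛) computes Khr( 𝑓 ) and Khrmin ( 𝑓 ) for parity on three bits, for which both equal Khrapchenko's classical bound 𝑛² = 9:

```python
from itertools import combinations, product

def neighbors(x, n):
    """Points at Hamming distance 1 from x (x encoded as an integer)."""
    return [x ^ (1 << i) for i in range(n)]

def subsets(s):
    """All non-empty subsets of s."""
    for r in range(1, len(s) + 1):
        for c in combinations(s, r):
            yield set(c)

def khr_and_khr_min(f, n):
    """Brute-force Khr(f) and Khr_min(f) over all pairs A, B."""
    ones = [x for x in range(2 ** n) if f(x)]
    zeros = [x for x in range(2 ** n) if not f(x)]
    khr, khr_min = 0, 0
    for A, B in product(subsets(ones), subsets(zeros)):
        # Degrees in the bipartite graph E(A, B) of hypercube edges.
        deg_a = {a: sum(y in B for y in neighbors(a, n)) for a in A}
        deg_b = {b: sum(y in A for y in neighbors(b, n)) for b in B}
        e = sum(deg_a.values())  # |E(A, B)|
        khr = max(khr, e * e / (len(A) * len(B)))
        khr_min = max(khr_min, min(deg_a.values()) * min(deg_b.values()))
    return khr, khr_min

parity = lambda x: bin(x).count("1") % 2
khr, khr_min = khr_and_khr_min(parity, 3)
assert khr / 4 <= khr_min <= khr  # Claim C.1
print(khr, khr_min)  # 9.0 9
```

Here the optimal choice takes 𝐴, 𝐡 to be all odd- and even-parity points, so every vertex has exactly 𝑛 = 3 neighbors on the other side.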

   It turns out that over the binary alphabet, Khrmin ( 𝑓 ) and Khr𝐻∞ ( 𝑓 ) coincide.

Claim C.2. For any 𝑓 : {0, 1} 𝑛 β†’ {0, 1} we have Khr𝐻∞ ( 𝑓 ) = Khrmin ( 𝑓 ).

Proof. We first show that Khr𝐻∞ ( 𝑓 ) β‰₯ Khrmin ( 𝑓 ). Let 𝐴, 𝐡 be sets that maximize the expression
(minπ‘Žβˆˆπ΄ |𝐸(π‘Ž, 𝐡)| Β· minπ‘βˆˆπ΅ |𝐸(𝐴, 𝑏)|). We take (𝒂, 𝒃) to be a uniformly distributed pair in 𝐸(𝐴, 𝐡).
It is not hard to see that
                               2𝐻∞ (π’Š|𝒂) = 2𝐻∞ (𝒃|𝒂) = min |𝐸(π‘Ž, 𝐡)|,
                                                                π‘Žβˆˆπ΄



and similarly 2^𝐻∞ (π’Š|𝒃) = minπ‘βˆˆπ΅ |𝐸(𝐴, 𝑏)|. We thus get Khr𝐻∞ ( 𝑓 ) β‰₯ Khrmin ( 𝑓 ).
    Next, we show the other direction, Khrmin ( 𝑓 ) β‰₯ Khr𝐻∞ ( 𝑓 ). Let (𝒂, 𝒃) be a random vari-
able distributed according to a Khrapchenko distribution for 𝑓 that attains the maximum of
2^(𝐻∞ (𝒂|𝒃)+𝐻∞ (𝒃|𝒂)) over all such distributions. Let 𝐴 := supp(𝒂) and 𝐡 := supp(𝒃) be the supports
of 𝒂 and 𝒃, respectively. By definition of 𝐻∞ we have 2^𝐻∞ (𝒂|𝒃) = 1/maxπ‘Ž,𝑏 Pr[𝒂 = π‘Ž | 𝒃 = 𝑏]. Rearranging,
we get that for any 𝑏 in the support of 𝒃 it holds that

                                    Pr[𝒂 = π‘Ž | 𝒃 = 𝑏] ≀ 1/2^𝐻∞ (𝒂|𝒃) .

In particular, since these probabilities sum to 1, it must be the case that there are at least 2^𝐻∞ (𝒂|𝒃)
neighbors of 𝑏 in 𝐴 β€” i. e., |𝐸(𝐴, 𝑏)| β‰₯ 2^𝐻∞ (𝒂|𝒃) for all 𝑏 ∈ 𝐡. Similarly, |𝐸(π‘Ž, 𝐡)| β‰₯ 2^𝐻∞ (𝒃|𝒂) for all
π‘Ž ∈ 𝐴. The two sets 𝐴 and 𝐡 show that Khrmin ( 𝑓 ) β‰₯ 2^𝐻∞ (𝒂|𝒃) Β· 2^𝐻∞ (𝒃|𝒂) = Khr𝐻∞ ( 𝑓 ).               β–‘

   Proposition B.6 follows by combining these two claims. Proposition 7.11 follows from
Claim C.1 using the following simple observation.

Claim C.3. For any 𝑓 : {0, 1} 𝑛 β†’ {0, 1} we have Adv𝑒 ( 𝑓 ) β‰₯ √(Khrmin ( 𝑓 )).

Proof. Let 𝐴, 𝐡 be sets that attain Khrmin ( 𝑓 ). Define 𝑅 to be the set of pairs (π‘Ž, 𝑏) ∈ 𝐴 Γ— 𝐡 that
differ at a single coordinate. Thus |𝑅(π‘Ž, 𝐡)| = |𝐸(π‘Ž, 𝐡)| and |𝑅(𝐴, 𝑏)| = |𝐸(𝐴, 𝑏)|. Moreover,
|𝑅𝑖 (π‘Ž, 𝐡)| ≀ 1, since there is a unique 𝑏 that differs from π‘Ž only on the 𝑖-th coordinate. Similarly,
|𝑅𝑖 (𝐴, 𝑏)| ≀ 1. The claim now immediately follows by comparing the definitions of Khrmin ( 𝑓 )
and Adv𝑒 ( 𝑓 ).                                                                                     β–‘

Acknowledgements
A. T. would like to thank Igor Carboni Oliveira for bringing the question of proving formula size
lower bounds for 𝑨π‘ͺ 0 to his attention. We are also grateful to Robin Kothari for posing this open
question on β€œTheoretical Computer Science Stack Exchange” [35], and to Kaveh Ghasemloo and
Stasys Jukna for their feedback on this question. We would like to thank Anna GΓ‘l for very
helpful discussions, and Gregory Rosenthal for helpful comments on the manuscript. Finally,
we are grateful to anonymous referees for comments that improved the presentation of this
work.
    This work was partly carried out while the authors were visiting the Simons Institute for
the Theory of Computing in association with the DIMACS/Simons Collaboration on Lower
Bounds in Computational Complexity, which is conducted with support from the National
Science Foundation.


References
 [1] MiklΓ³s Ajtai: Σ¹₁-formulae on finite structures. Ann. Pure Appl. Logic, 24(1):1–48, 1983.
     [doi:10.1016/0168-0072(83)90038-6] 4


 [2] Andris Ambainis: Quantum lower bounds by quantum arguments. J. Comput. System Sci.,
     64(4):750–767, 2002. Special issue on STOC 2000 (Portland, OR). [doi:10.1006/jcss.2002.1826]
     4, 29

 [3] Andris Ambainis: Polynomial degree vs. quantum query complexity. J. Comput. System Sci.,
     72(2):220–238, 2006. [doi:10.1016/j.jcss.2005.06.006] 31, 39

 [4] Alexander E. Andreev: On a method for obtaining more than quadratic effective lower
     bounds for the complexity of πœ‹-schemes. Moscow University Mathematics Bulletin, 42(1):24–
     29, 1987. Vestnik Moskov. Univ. Ser. 1: Mat. Mekh. 1987(1) 70–73 (Russian original on
     Math-net.ru) MR 0883632. 2, 12, 35

 [5] Boaz Barak, Mark Braverman, Xi Chen, and Anup Rao: How to compress interactive
     communication. SIAM J. Comput., 42(3):1327–1363, 2013. [doi:10.1137/100811969] 38

 [6] Paul Beame and Widad Machmouchi: The quantum query complexity of AC0 . Quantum
     Inf. Comput., 12(7–8):670–676, 2012. [doi:10.26421/QIC12.7-8-11] 4, 12, 13, 35, 42

 [7] Maria Luisa Bonet and Samuel R. Buss: Size-depth tradeoffs for Boolean formulae. Inform.
     Process. Lett., 49(3):151–155, 1994. [doi:10.1016/0020-0190(94)90093-0] 7

 [8] Richard P. Brent: The parallel evaluation of general arithmetic expressions. J. ACM,
     21(2):201–206, 1974. [doi:10.1145/321812.321815] 7

 [9] Harry Buhrman and Ronald de Wolf: Complexity measures and decision tree complexity:
     a survey. Theoret. Comput. Sci., 288(1):21–43, 2002. [doi:10.1016/S0304-3975(01)00144-X] 29

[10] Andrew M. Childs, Shelby Kimmel, and Robin Kothari: The quantum query complexity of
     read-many formulas. In Proc. 20th Eur. Symp. Algorithms (ESA’12), pp. 337–348. Springer,
     2012. [doi:10.1007/978-3-642-33090-2_30] 4

[11] Thomas M. Cover and Joy A. Thomas: Elements of Information Theory. Wiley-Interscience,
     1991. [doi:10.1002/047174882X] 38

[12] Susanna F. de Rezende, Or Meir, Jakob NordstrΓΆm, Toniann Pitassi, and Robert Robere:
     KRW composition theorems via lifting. In Proc. 61st FOCS, pp. 43–49. IEEE Comp. Soc.,
     2020. [doi:10.1109/FOCS46700.2020.00013, ECCC:TR20-099] 4

[13] Irit Dinur and Or Meir: Toward the KRW composition conjecture: Cubic formula
     lower bounds via communication complexity. Comput. Complexity, 27(3):375–462, 2018.
     [doi:10.1007/s00037-017-0159-x] 4

[14] Jeff Edmonds, Russell Impagliazzo, Steven Rudich, and JirΓ­ Sgall: Communication
     complexity towards lower bounds on circuit depth. Comput. Complexity, 10(3):210–246, 2001.
     [doi:10.1007/s00037-001-8195-x] 4


[15] Yuval Filmus, Or Meir, and Avishay Tal: Shrinkage under random projections, and cubic
     formula lower bounds for AC0 (extended abstract). In Proc. 12th Innovations in Theoret.
     Comp. Sci. Conf. (ITCS’21), pp. 89:1–7. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik,
     2021. [doi:10.4230/LIPIcs.ITCS.2021.89]

[16] Merrick L. Furst, James B. Saxe, and Michael Sipser: Parity, circuits, and the polynomial-
     time hierarchy. Math. Sys. Theory, 17(1):13–27, 1984. [doi:10.1007/BF01744431] 4

[17] Anna GΓ‘l, Avishay Tal, and Adrian Trejo NuΓ±ez: Cubic formula size lower bounds
     based on compositions with majority. In Proc. 10th Innovations in Theoret. Comp. Sci.
     Conf. (ITCS’19), pp. 35:1–35:13. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 2019.
     [doi:10.4230/LIPIcs.ITCS.2019.35] 5, 6, 37

[18] Anat Ganor, Ilan Komargodski, and Ran Raz: The spectrum of small De Morgan formulas.
     Electron. Colloq. Comput. Complexity, TR12-174, 2012. [ECCC] 43

[19] Dmitry Gavinsky, Or Meir, Omri Weinstein, and Avi Wigderson: Toward better formula
     lower bounds: The composition of a function and a universal relation. SIAM J. Comput.,
     46(1):114–131, 2017. [doi:10.1137/15M1018319] 3, 4, 39

[20] Mikael Goldmann and Johan HΓ₯stad: A simple lower bound for monotone clique using
     a communication game. Inform. Process. Lett., 41(4):221–226, 1992. [doi:10.1016/0020-
     0190(92)90184-W] 4

[21] Mika GΓΆΓΆs and Toniann Pitassi: Communication lower bounds via critical block sensitivity.
     SIAM J. Comput., 47(5):1778–1806, 2018. [doi:10.1137/16M1082007] 4

[22] Johan HΓ₯stad: Almost optimal lower bounds for small depth circuits. In S. Micali, editor,
     Randomness and Computation, volume 5 of Adv. Computing Research, pp. 143–170. JAI Press,
     1989. DSpace@MIT. Preliminary version in STOC’86. 4

[23] Johan HΓ₯stad: The shrinkage exponent of De Morgan formulas is 2. SIAM J. Comput.,
     27(1):48–64, 1998. Preliminary version in FOCS’93. [doi:10.1137/S0097539794261556] 2, 3,
     5, 8, 9, 15, 17, 19, 43

[24] Johan HΓ₯stad, Stasys Jukna, and Pavel PudlΓ‘k: Top-down lower bounds for depth-three
     circuits. Comput. Complexity, 5(2):99–112, 1995. [doi:10.1007/BF01268140] 3

[25] Johan HΓ₯stad, Benjamin Rossman, Rocco A. Servedio, and Li-Yang Tan: An average-case
     depth hierarchy theorem for Boolean circuits. J. ACM, 64(5):35:1–27, 2017. Preliminary
     version in FOCS’15. [doi:10.1145/3095799] 6

[26] Johan HΓ₯stad and Avi Wigderson: Composition of the universal relation. In Jin-Yi Cai,
     editor, Advances In Computational Complexity Theory, volume 13 of DIMACS Ser. in Discr.
     Math., pp. 119–134. Amer. Math. Soc., 1993. [doi:10.1090/dimacs/013/07] 4


[27] Johan Torkel HΓ₯stad: Computational limitations of small-depth circuits. Ph. D. thesis, MIT,
     1986. Available on author’s website, published as a book by MIT Press, 1987. 6

[28] Peter HΓΈyer, Troy Lee, and Robert Ε palek: Negative weights make adversaries stronger.
     In Proc. 39th STOC, pp. 526–535. ACM Press, 2007. [doi:10.1145/1250790.1250867] 31

[29] Russell Impagliazzo and Noam Nisan: The effect of random restrictions on formula size.
     Random Struct. Algor., 4(2):121–134, 1993. [doi:10.1002/rsa.3240040202] 2

[30] Stasys Jukna: Boolean Function Complexity – Advances and Frontiers. Volume 27 of Algorithms
     and Combinatorics. Springer, 2012. [doi:10.1007/978-3-642-24508-4] 4, 36

[31] Mauricio Karchmer, Ran Raz, and Avi Wigderson: Super-logarithmic depth lower bounds
     via the direct sum in communication complexity. Comput. Complexity, 5(3–4):191–204, 1995.
     [doi:10.1007/BF01206317] 3, 4, 33

[32] Mauricio Karchmer and Avi Wigderson: Monotone circuits for connectivity require
     super-logarithmic depth. SIAM J. Discr. Math., 3(2):255–265, 1990. [doi:10.1137/0403021] 3,
     4, 7, 39

[33] Valerii Mikhailovich Khrapchenko: A method of obtaining lower bounds for the complex-
     ity of πœ‹-schemes. Math. Notes. Acad. Sci. USSR (English translation), 10:474–479, 1971. Matem.
     Zametki 10(1) 83-92, 1971 (Russian original on Math-net.ru). [doi:10.1007/BF01747074] 2,
     29, 32, 37

[34] Sajin Koroth and Or Meir: Improved composition theorems for functions and relations.
     In Proc. 22nd Internat. Conf. on Randomization and Computation (RANDOM’18), pp. 48:1–18.
     Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 2018. [doi:10.4230/LIPIcs.APPROX-
     RANDOM.2018.48] 4

[35] Robin Kothari: Formula size lower bounds for AC0 functions, 2011. Theoretical Computer
     Science Stack Exchange. 45

[36] Elias Koutsoupias: Improvements on Khrapchenko’s theorem. Theoret. Comput. Sci.,
     116(2):399–403, 1993. [doi:10.1016/0304-3975(93)90330-V] 43

[37] Eyal Kushilevitz and Noam Nisan: Communication Complexity. Cambridge Univ. Press,
     1997. [doi:10.1017/9781108671644] 6

[38] Sophie Laplante, Troy Lee, and Mario Szegedy: The quantum adversary method and classi-
     cal formula size lower bounds. Comput. Complexity, 15(2):163–196, 2006. [doi:10.1007/s00037-
     006-0212-7] 30, 31, 32, 39, 43

[39] Troy Lee, Rajat Mittal, Ben W. Reichardt, Robert Ε palek, and Mario Szegedy: Quantum
     query complexity of state conversion. In Proc. 52nd FOCS, pp. 344–353. IEEE Comp. Soc.,
     2011. [doi:10.1109/FOCS.2011.75] 31


[40] Lily Li and Morgan Shirley:        The general adversary bound:       A survey, 2021.
     [arXiv:2104.06380] 31

[41] Or Meir: Toward better depth lower bounds: Two results on the multiplexor relation.
     Comput. Complexity, 29(1):4, 2020. [doi:10.1007/s00037-020-00194-8] 4

[42] Ivan Mihajlin and Alexander Smal: Toward better depth lower bounds: The XOR-
     KRW conjecture. In Proc. 36th Comput. Complexity Conf. (CCC’21), pp. 38:1–24. Schloss
     Dagstuhl–Leibniz-Zentrum fuer Informatik, 2021. [doi:10.4230/LIPIcs.CCC.2021.38] 4

[43] Èduard Ivanovič Nečiporuk: On a Boolean function. Soviet Math. Doklady, 7(4):999–1000,
     1966. Doklady Akad. Nauk SSSR 169(4), 1966, 765–766 (Russian original on Math-net.ru). 4

[44] Mike Paterson and Uri Zwick: Shrinkage of De Morgan formulae under restriction. Random
     Struct. Algor., 4(2):135–150, 1993. [doi:10.1002/rsa.3240040203] 2

[45] Toniann Pitassi and Robert Robere: Strongly exponential lower bounds for
     monotone computation. In Proc. 49th STOC, pp. 1246–1255. ACM Press, 2017.
     [doi:10.1145/3055399.3055478] 4

[46] Ran Raz and Pierre McKenzie: Separation of the monotone NC hierarchy. Combinatorica,
     19(3):403–435, 1999. [doi:10.1007/s004930050062] 4

[47] Ran Raz and Avi Wigderson: Monotone circuits for matching require linear depth. J. ACM,
     39(3):736–744, 1992. [doi:10.1145/146637.146684] 4

[48] Alexander A. Razborov: Applications of matrix methods to the theory of lower bounds in
     computational complexity. Combinatorica, 10(1):81–93, 1990. [doi:10.1007/BF02122698] 7

[49] Ben W. Reichardt: Span programs and quantum query complexity: the general adversary
     bound is nearly tight for every Boolean function. In Proc. 50th FOCS, pp. 544–551. IEEE
     Comp. Soc., 2009. [doi:10.1109/FOCS.2009.55] 31

[50] Ben W. Reichardt: Reflections for quantum query algorithms. In Proc. 22nd
     Ann. ACM–SIAM Symp. on Discrete Algorithms (SODA’11), pp. 560–569. SIAM, 2011.
     [doi:10.1137/1.9781611973082.44] 31

[51] Ben W. Reichardt: Span programs are equivalent to quantum query algorithms. SIAM J.
     Comput., 43(3):1206–1219, 2014. [doi:10.1137/100792640] 31

[52] Igor Sergeevich Sergeev: Complexity and depth of formulas for symmetric Boolean
     functions. Moscow University Mathematics Bulletin, 71(3):127–130, 2016. Vestnik
     Moskov. Univ. Ser. 1: Mat. Mekh. 2016(3) 53–57 (Russian original on Math-net.ru).
     [doi:10.3103/S0027132216030098] 37

[53] Robert Ε palek and Mario Szegedy: All quantum adversary methods are equivalent. Theory
     of Computing, 2(1):1–18, 2006. [doi:10.4086/toc.2006.v002a001] 30, 31


[54] Philip M. Spira: On time-hardware complexity tradeoffs for Boolean functions. In Proc. 4th
     Hawaii International Symposium on System Sciences, pp. 525–527. Western Periodicals, 1971.
     Google books. 7

[55] Bella Abramovna Subbotovskaya: Realizations of linear functions by formulas using ∨, &, Β―.
     Soviet Math. Doklady, 2:110–112, 1961. Doklady Akad. Nauk SSSR 136(3), 553–555, 1961
     (Russian original on Math-net.ru). 2

[56] Avishay Tal: Shrinkage of De Morgan formulae by spectral techniques. In Proc. 55th FOCS,
     pp. 551–560. IEEE Comp. Soc., 2014. [doi:10.1109/FOCS.2014.65] 2, 3

[57] Andrew Chi-Chih Yao: Separating the polynomial-time hierarchy by oracles (preliminary
     version). In Proc. 26th FOCS, pp. 1–10. IEEE Comp. Soc., 1985. [doi:10.1109/SFCS.1985.49] 4

[58] Uri Zwick: An extension of Khrapchenko’s theorem. Inform. Process. Lett., 37(4):215–217,
     1991. [doi:10.1016/0020-0190(91)90191-J] 43


AUTHORS

     Yuval Filmus
     Associate Professor
     Technion β€” Israel Institute of Technology
     The Henry and Marilyn Taub Faculty of Computer Science
     Faculty of Mathematics
     Haifa 3200003, Israel
     yuvalfi cs technion ac il
     https://yuvalfilmus.cs.technion.ac.il/


     Or Meir
     Associate Professor
     Department of Computer Science
     University of Haifa
     Haifa 3303220, Israel
     ormeir cs haifa ac il
     https://cs.haifa.ac.il/~ormeir/





    Avishay Tal
    Assistant Professor
    Department of Electrical Engineering and Computer Sciences
    University of California, Berkeley
    Berkeley, CA 94720, United States
    atal berkeley edu
    https://www.avishaytal.org/


ABOUT THE AUTHORS

    Yuval Filmus received his Ph. D. from the University of Toronto under the supervision
       of Toni Pitassi. Subsequently, he held postdoctoral positions at the Simons
       Institute for the Theory of Computing and the Institute for Advanced Study. He
       is interested in Boolean function analysis, complexity theory, and combinatorics.


    Or Meir received his Ph. D. from the Weizmann Institute of Science under the
       supervision of Oded Goldreich. Subsequently, he held postdoctoral positions
       at Stanford University, the Institute for Advanced Study, and the Weizmann
       Institute of Science. He is interested in complexity theory, and in particular in
       circuit complexity, communication complexity, coding theory, probabilistic proof
       systems, and the theory of pseudorandomness.


    Avishay Tal received his M. Sc. from the Technion under the supervision of
       Amir Shpilka, and his Ph. D. from the Weizmann Institute of Science under
       the supervision of Ran Raz. Subsequently, he held postdoctoral positions
       at the Institute for Advanced Study, the Simons Institute for the Theory of
       Computing, and Stanford University. He is interested in Boolean function analysis,
       complexity theory and combinatorics, and in particular in circuit complexity,
       query complexity, computational learning theory, quantum complexity, and the
       theory of pseudorandomness.



