Randomized Query Complexity of Sabotaged and Composed Functions

Authors Shalev Ben-David, Robin Kothari,
Plaintext
                           T HEORY OF C OMPUTING, Volume 14 (5), 2018, pp. 1–27
                                        www.theoryofcomputing.org




             Randomized Query Complexity of
            Sabotaged and Composed Functions
                               Shalev Ben-David∗                    Robin Kothari†
                      Received June 9, 2016; Revised March 2, 2017; Published May 4, 2018




       Abstract: We study the composition question for bounded-error randomized query complex-
       ity: Is R( f ◦ g) = Ω(R( f ) R(g)) for all Boolean functions f and g? We show that inserting a
       simple Boolean function h, whose query complexity is only Θ(log R(g)), in between f and
       g allows us to prove R( f ◦ h ◦ g) = Ω(R( f ) R(h) R(g)).
            We prove this using a new lower bound measure for randomized query complexity we
       call randomized sabotage complexity, RS( f ). Randomized sabotage complexity has several
       desirable properties, such as a perfect composition theorem, RS( f ◦ g) ≥ RS( f ) RS(g), and
       a composition theorem with randomized query complexity, R( f ◦ g) = Ω(R( f ) RS(g)). It is
       also a quadratically tight lower bound for total functions and can be quadratically superior to
       the partition bound, the best known general lower bound for randomized query complexity.
            Using this technique we also show implications for lifting theorems in communication
       complexity. We show that a general lifting theorem for zero-error randomized protocols
       implies a general lifting theorem for bounded-error protocols.

ACM Classification: F.1.2, F.1.3
AMS Classification: 68W20, 68Q15, 68Q17
Key words and phrases: randomized algorithms, query complexity, composition theorems

     A conference version of this paper appeared in the Proceedings of the 43rd International Colloquium on Automata,
Languages, and Programming (ICALP 2016) [6].
   ∗ Partially supported by Scott Aaronson’s NSF Waterman award under grant number 1249349.
   † Partially supported by ARO grant number W911NF-12-1-0486.



© 2018 Shalev Ben-David and Robin Kothari
c b Licensed under a Creative Commons Attribution License (CC-BY)                    DOI: 10.4086/toc.2018.v014a005 title
                                      S HALEV B EN -DAVID AND ROBIN KOTHARI

1     Introduction
1.1     Composition theorems
A basic structural question that can be asked in any model of computation is whether there can be resource
savings when computing the same function on several independent inputs. We say a direct sum theorem
holds in a model of computation if solving a problem on n independent inputs requires roughly n times
the resources needed to solve one instance of the problem. Direct sum theorems hold for deterministic
and randomized query complexity [14], fail for circuit size [21], and remain open for communication
complexity [16, 5, 9].
    More generally, instead of merely outputting the n answers, we could compute another function of
these n answers. If f is an n-bit Boolean function and g is an m-bit Boolean function, we define the
composed function f ◦ g to be an nm-bit Boolean function such that f ◦ g(x1 , . . . , xn ) = f (g(x1 ), . . . , g(xn )),
where each xi is an m-bit string. The composition question now asks if there can be significant savings in
computing f ◦ g compared to simply running the best algorithm for f and using the best algorithm for
g to evaluate the input bits needed to compute f . If we let f be the identity function on n bits that just
outputs all its inputs, we recover the direct sum problem.
    Composition theorems are harder to prove and are known for only a handful of models, such as
deterministic [24, 20] and quantum query complexity [22, 19, 17]. More precisely, let D( f ), R( f ), and
Q( f ) denote the deterministic, randomized, and quantum query complexities of f . Then for all (possibly
partial) Boolean1 functions f and g, we have

                          D( f ◦ g) = D( f ) D(g)          and        Q( f ◦ g) = Θ(Q( f ) Q(g)) .                         (1.1)

In contrast, in the randomized setting we only have the upper bound R( f ◦ g) = O(R( f ) R(g) log R( f )).
Proving a composition theorem for randomized query complexity remains a major open problem.

Open Problem 1. Does it hold that R( f ◦ g) = Ω(R( f ) R(g)) for all Boolean functions f and g?

    In this paper we prove something close to a composition theorem for randomized query complexity.
While we cannot rule out the possibility of synergistic savings in computing f ◦ g, we show that a
composition theorem does hold if we insert a small gadget in between f and g to obfuscate the output
of g. Our gadget is “small” in the sense that its randomized (and even deterministic) query complexity
is Θ(log R(g)). Specifically we choose the index function, which on an input of size k + 2k interprets
the first k bits as an address into the next 2k bits and outputs the bit stored at that address. The index
function’s query complexity is k + 1 and we choose k = Θ(log R(g)).

Theorem 1.1. Let f and g be (partial) Boolean functions and let I ND be the index function with
R(I ND) = Θ(log R(g)). Then

                       R( f ◦ I ND ◦ g) = Ω(R( f ) R(I ND) R(g)) = Ω(R( f ) R(g) log R(g)) .
    1 Composition theorems usually fail for trivial reasons for non-Boolean functions. Hence we restrict our attention to Boolean

functions, which have domain {0, 1}n (or a subset of {0, 1}n ) and range {0, 1}.


                             T HEORY OF C OMPUTING, Volume 14 (5), 2018, pp. 1–27                                              2
             R ANDOMIZED Q UERY C OMPLEXITY OF S ABOTAGED AND C OMPOSED F UNCTIONS

     Theorem 1.1 can be used instead of a true composition theorem in many applications. For example,
recently a composition theorem for randomized query complexity was needed in the special case when g
is the A ND function [2]. Our composition theorem would suffice for this application, since the separation
shown there only changes by a logarithmic factor if an index gadget is inserted between f and g.
     We prove Theorem 1.1 by introducing a new lower bound technique for randomized query complexity.
This is not surprising since the composition theorems for deterministic and quantum query complexities
are also proved using powerful lower bound techniques for these models, namely, the adversary argument
and the negative-weights adversary bound [12], respectively.

1.2   Sabotage complexity
To describe the new lower bound technique, consider the problem of computing a Boolean function f on
an input x ∈ {0, 1}n in the query model. In this model we have access to an oracle, which when queried
with an index i ∈ [n] responds with xi ∈ {0, 1}.
     Imagine that a hypothetical saboteur damages the oracle and makes some of the input bits unreadable.
For these input bits the oracle simply responds with a ∗. We can now view the oracle as storing a string
p ∈ {0, 1, ∗}n as opposed to a string x ∈ {0, 1}n . Although it is not possible to determine the true input
x from the oracle string p, it may still be possible to compute f (x) if all input strings consistent with p
evaluate to the same f value. On the other hand, it is not possible to compute f (x) if p is consistent with
a 0-input and a 1-input to f . We call such a string p ∈ {0, 1, ∗}n a sabotaged input. For example, let f be
the O R function that computes the logical O R of its bits. Then p = 00∗0 is a sabotaged input since it is
consistent with the 0-input 0000 and the 1-input 0010. However, p = 01∗0 is not a sabotaged input since
it is only consistent with 1-inputs to f .
     Now consider a new problem in which the input is promised to be sabotaged (with respect to a
function f ) and our job is to find the location of a ∗. Intuitively, any algorithm that solves the original
problem f when run on a sabotaged input must discover at least one ∗, since otherwise it would answer
the same on 0- and 1-inputs consistent with the sabotaged input. Thus the problem of finding a ∗ in a
sabotaged input is no harder than the problem of computing f , and hence naturally yields a lower bound
on the complexity of computing f . As we show later, this intuition can be formalized in several models
of computation.
     As it stands the problem of finding a ∗ in a sabotaged input has multiple valid outputs, as the location
of any star in the input is a valid output. For convenience we define a decision version of the problem as
follows: Imagine there are two saboteurs and one of them has sabotaged our input. The first saboteur,
Asterix, replaces input bits with an asterisk (∗) and the second, Obelix, uses an obelisk (†). Promised that
the input has been sabotaged exclusively by one of Asterix or Obelix, our job is to identify the saboteur.
This is now a decision problem since there are only two valid outputs. We call this decision problem fsab ,
the sabotage problem associated with f .
     We now define lower bound measures for various models using fsab . For example, we can define the
deterministic sabotage complexity of f as DS( f ) := D( fsab ) and in fact, it turns out that for all f , DS( f )
equals D( f ) (Theorem 8.1).
     We could define the randomized sabotage complexity of f as R( fsab ), but instead we define it as
RS( f ) := R0 ( fsab ), where R0 denotes zero-error randomized query complexity, since R( fsab ) and R0 ( fsab )
are equal up to constant factors (Theorem 3.3). RS( f ) has the following desirable properties.

                         T HEORY OF C OMPUTING, Volume 14 (5), 2018, pp. 1–27                                  3
                                    S HALEV B EN -DAVID AND ROBIN KOTHARI

   1. (Lower bound for R)           For all f , R( f ) = Ω(RS( f )).                          (Theorem 3.4)
   2. (Perfect composition)         For all f and g, RS( f ◦ g) ≥ RS( f ) RS(g).              (Theorem 4.5)
   3. (Composition with R)          For all f and g, R( f ◦ g) = Ω(R( f ) RS(g)).             (Theorem 4.7)
   4. (Superior to prt( f ))        There exists a total f with RS( f ) ≥ prt( f )2−o(1) .    (Theorem 7.1)
   5. (Superior to Q( f ))          There exists a total f with RS( f ) = Ω(Q(
                                                                          e       f )2.5 ).   (Theorem 7.1)
   6. (Quadratically tight)         For all total f , R( f ) = O(RS( f )2 log RS( f )).       (Theorem 7.3)
Here prt( f ) denotes the partition bound [13, 15], which subsumes most other lower bound techniques
such as approximate polynomial degree, randomized certificate complexity, block sensitivity, etc. The
only general lower bound technique not subsumed by prt( f ) is quantum query complexity, Q( f ), which
can also be considerably smaller than RS( f ) for some functions. In fact, we are unaware of any total
function f for which RS( f ) = o(R( f )), leaving open the intriguing possibility that this lower bound
technique captures randomized query complexity for total functions.

1.3   Lifting theorems
Using randomized sabotage complexity we are also able to show a relationship between lifting theorems
in communication complexity. A lifting theorem relates the query complexity of a function f with the
communication complexity of a related function created from f . Recently, Göös, Pitassi, and Watson [11]
showed that there is a communication problem GI ND , also known as the two-party index gadget, with
communication complexity Θ(log n) such that for any function f on n bits, Dcc ( f ◦GI ND ) = Ω(D( f ) log n),
where Dcc (F) denotes the deterministic communication complexity of a communication problem F.
    Analogous lifting theorems are known for some complexity measures, but no such theorem is known
for either zero-error randomized or bounded-error randomized query complexity. Our second result
shows that a lifting theorem for zero-error randomized query complexity implies one for bounded-error
randomized query. We use Rcc             cc
                            0 (F) and R (F) to denote the zero-error and bounded-error communication
complexities of F, respectively.
Theorem 1.2. Let G : X × Y → {0, 1} be a communication problem with min{|X|, |Y|} = O(log n). If it
holds that for all n-bit partial functions f ,
                                       Rcc
                                        0 ( f ◦ G) = Ω(R0 ( f )/ polylog n) ,                          (1.2)
then for all n-bit partial functions f ,
                                      Rcc ( f ◦ GI ND ) = Ω(R( f )/ polylog n) ,                       (1.3)
                                b
where GI ND : {0, 1}b × {0, 1}2 → {0, 1} is the index gadget (Definition 6.1) with b = Θ(log n).
     Proving a lifting theorem for bounded-error randomized query complexity remains an important
open problem in communication complexity. Such a theorem would allow the recent separations in
communication complexity shown by Anshu et al. [4] to be proved simply by establishing their query
complexity analogues, which was done in [2] and [3]. Our result shows that it is sufficient to prove a
lifting theorem for zero-error randomized protocols instead.

                         T HEORY OF C OMPUTING, Volume 14 (5), 2018, pp. 1–27                              4
               R ANDOMIZED Q UERY C OMPLEXITY OF S ABOTAGED AND C OMPOSED F UNCTIONS

1.4     Open problems
The main open problem is to determine whether R( f ) = Θ(RS(  e     f )) for all total functions f . This is
known to be false for partial functions, however. Any partial function where all inputs in Dom( f ) are far
apart in Hamming distance necessarily has low sabotage complexity. For example, any sabotaged input
                                                                                                       √
to the collision problem2 has at least half the bits sabotaged making RS( f ) = O(1), but R( f ) = Ω( n).
     It would also be interesting to extend the sabotage idea to other models of computation and see if
it yields useful lower bound measures. For example, we can define quantum sabotage complexity as
QS( f ) := Q( fsab ), but we were unable to show that it lower bounds Q( f ).

1.5     Paper organization
In Section 2, we present some preliminaries and useful properties of randomized algorithms (whose proofs
appear in Appendix A for completeness). We then formally define sabotage complexity in Section 3 and
prove some basic properties of sabotage complexity. In Section 4 we establish the composition properties
of randomized sabotage complexity described above (Theorem 4.5 and Theorem 4.7). Using these results,
we establish the main result (Theorem 1.1) in Section 5. We then prove the connection between lifting
theorems (Theorem 1.2) in Section 6. In Section 7 we compare randomized sabotage complexity with
other lower bound measures. We end with a discussion of deterministic sabotage complexity in Section 8.


2     Preliminaries
In this section we define some basic notions in query complexity that will be used throughout the paper.
Note that all the functions in this paper have Boolean input and output, except sabotaged functions whose
input alphabet is {0, 1, ∗, †}. For any positive integer n, we define [n] := {1, 2, . . . , n}.
     In the model of query complexity, we wish to compute an n-bit Boolean function f on an input x
given query access to the bits of x. The function f may be total, i. e., f : {0, 1}n → {0, 1}, or partial,
which means it is defined only on a subset of {0, 1}n , which we denote by Dom( f ). The goal is to output
 f (x) using as few queries to the bits of x as possible. The number of queries used by the best possible
deterministic algorithm (over worst-case choice of x) is denoted D( f ).
     A randomized algorithm is a probability distribution over deterministic algorithms. The worst-case
cost of a randomized algorithm is the worst-case (over all the deterministic algorithms in its support)
number of queries made by the algorithm on any input x. The expected cost of the algorithm is the
expected number of queries made by the algorithm (over the probability distribution) on an input x
maximized over all inputs x. A randomized algorithm has error at most ε if it outputs f (x) on every x
with probability at least 1 − ε.
     We use Rε ( f ) to denote the worst-case cost of the best randomized algorithm that computes f with
error ε. Similarly, we use Rε ( f ) to denote the expected cost of the best randomized algorithm that
computes f with error ε. When ε is unspecified it is taken to be ε = 1/3. Thus R( f ) denotes the
bounded-error randomized query complexity of f . Finally, we also define zero-error expected randomized
    2 In the collision problem, we are given an input x ∈ [m]n , and we have to decide if x viewed as a function from [n] to [m] is

1-to-1 or 2-to-1 promised that one of these holds.


                             T HEORY OF C OMPUTING, Volume 14 (5), 2018, pp. 1–27                                                5
                                S HALEV B EN -DAVID AND ROBIN KOTHARI

query complexity, R0 ( f ), which we also denote by R0 ( f ) to be consistent with the literature. For precise
definitions of these measures as well as the definition of quantum query complexity Q( f ), see the survey
by Buhrman and de Wolf [7].

2.1     Properties of randomized algorithms
We will assume familiarity with the following basic properties of randomized algorithms. For complete-
ness, we prove these properties in Appendix A.
    First, we have Markov’s inequality, which allows us to convert an algorithm with a guarantee on the
expected number of queries into an algorithm with a guarantee on the maximum number of queries with
a constant factor loss in the query bound and a constant factor increase in the error. This can be used, for
example, to convert zero-error randomized algorithms into bounded-error randomized algorithms.

Lemma 2.1 (Markov’s Inequality). Let A be a randomized algorithm that makes T queries in expectation
(over its internal randomness). Then for any δ ∈ (0, 1), the algorithm A terminates within bT /δ c queries
with probability at least 1 − δ .

      The next property allows us to amplify the success probability of an ε-error randomized algorithm.

Lemma 2.2 (Amplification). If f is a function with Boolean output and A is a randomized algorithm for
 f with error ε < 1/2, repeating A several times and taking the majority vote of the outcomes decreases
the error. To reach error ε 0 > 0, it suffices to repeat the algorithm 2 ln(1/ε 0 )/(1 − 2ε)2 times.

    Recall that we defined Rε ( f ) to be the minimum expected number of queries made by a randomized
algorithm that computes f with error probability at most ε. Clearly, we have Rε ( f ) ≤ Rε ( f ), since the
expected number of queries made by an algorithm is at most the maximum number of queries made by
the algorithm. Using Lemma 2.1, we can now relate them in the other direction.

Lemma 2.3. Let f be a partial function, δ > 0, and ε ∈ [0, 1/2). Then we have
                                                 1 − 2ε             1
                                  Rε+δ ( f ) ≤          Rε ( f ) ≤    Rε ( f ) .
                                                   2δ              2δ
    The next lemma shows how to relate these measures with the same error ε on both sides of the
inequality. This also shows that Rε ( f ) is only a constant factor away from Rε ( f ) for constant ε.

Lemma 2.4. If f is a partial function, then for all ε ∈ (0, 1/2), we have

                                                        ln(1/ε)
                                       Rε ( f ) ≤ 14             Rε ( f ) .
                                                       (1 − 2ε)2

When ε = 1/3, we can improve this to R( f ) ≤ 10R( f ).

    Although these measures are closely related for constant error, Rε ( f ) is more convenient than Rε ( f )
for discussing composition and direct sum theorems.
    We can also convert randomized algorithms that find certificates with bounded error into zero-error
randomized algorithms.

                         T HEORY OF C OMPUTING, Volume 14 (5), 2018, pp. 1–27                               6
             R ANDOMIZED Q UERY C OMPLEXITY OF S ABOTAGED AND C OMPOSED F UNCTIONS

Lemma 2.5. Let A be a randomized algorithm that uses T queries in expectation and finds a certificate
with probability 1 − ε. Then repeating A when it fails to find a certificate turns it into an algorithm that
always finds a certificate (i. e., a zero-error algorithm) that uses at most T /(1 − ε) queries in expectation.

    Finally, the following lemma is useful for proving lower bounds on randomized algorithms.

Lemma 2.6. Let f be a partial function. Let A be a randomized algorithm that solves f using at most T
expected queries and with error at most ε. For x, y ∈ Dom( f ) if f (x) 6= f (y) then when A is run on x, it
must query an entry on which x differs from y with probability at least 1 − 2ε.


3    Sabotage complexity
We now formally define sabotage complexity. Given a (partial or total) n-bit Boolean function f , let
Pf ⊆ {0, 1, ∗}n be the set of all partial assignments of f that are consistent with both a 0-input and
a 1-input. That is, for each p ∈ Pf , there exist x, y ∈ Dom( f ) such that f (x) 6= f (y) and xi = yi = pi
whenever pi 6= ∗. Let Pf† ⊆ {0, 1, †}n be the same as Pf , except using the symbol † instead of ∗. Observe
that Pf and Pf† are disjoint. Let Q f = Pf ∪ Pf† ⊆ {0, 1, ∗, †}n . We then define fsab as follows.

Definition 3.1. Let f be an n-bit partial function. We define fsab : Q f → {0, 1} as fsab (q) = 0 if q ∈ Pf
and fsab (q) = 1 if q ∈ Pf† .

    Note that even when f is a total function, fsab is always a partial function. See Section 1.2 for more
discussion and motivation for this definition. Now that we have defined fsab , we can define deterministic
and randomized sabotage complexity.

Definition 3.2. Let f be a partial function. Then DS( f ) := D( fsab ) and RS( f ) := R0 ( fsab ).

    We will primarily focus on RS( f ) in this work and only discuss DS( f ) in Section 8. To justify
defining RS( f ) as R0 ( fsab ) instead of R( fsab ) (or R( fsab )), we now show these definitions are equivalent
up to constant factors.

Theorem 3.3. Let f be a partial function. Then R0 ( fsab ) ≥ Rε ( fsab ) ≥ (1 − 2ε) R0 ( fsab ).

Proof. The first inequality follows trivially. For the second, let x ∈ Q f be any valid input to fsab . Let
x0 be the input x with asterisks replaced with obelisks and vice versa. Then since fsab (x) 6= fsab (x0 ), by
Lemma 2.6 any ε-error randomized algorithm that solves fsab must find a position on which x and x0 differ
with probability at least 1 − 2ε. The positions at which they differ are either asterisks or obelisks. Since
x was an arbitrary input, the algorithm must always find an asterisk or obelisk with probability at least
1 − 2ε. Since finding an asterisk or obelisk is a certificate for fsab , by Lemma 2.5, we get a zero-error
algorithm for fsab that uses Rε ( fsab )/(1 − 2ε) expected queries. Thus R0 ( fsab ) ≤ Rε ( fsab )/(1 − 2ε), as
desired.

    Finally, we prove that RS( f ) is indeed a lower bound on R( f ), i. e., R( f ) = Ω(RS( f )).

Theorem 3.4. Let f be a partial function. Then Rε ( f ) ≥ Rε ( f ) ≥ (1 − 2ε) RS( f ).

                         T HEORY OF C OMPUTING, Volume 14 (5), 2018, pp. 1–27                                  7
                                S HALEV B EN -DAVID AND ROBIN KOTHARI

Proof. Let A be a randomized algorithm for f that uses Rε ( f ) randomized queries and outputs the correct
answer on every input in Dom( f ) with probability at least 1 − ε. Now fix a sabotaged input x, and let p
be the probability that A finds a ∗ or † when run on x. Let q be the probability that A outputs 0 if it doesn’t
find a ∗ or † when run on x. Let x0 and x1 be inputs consistent with x such that f (x0 ) = 0 and f (x1 ) = 1.
Then A outputs 0 on x1 with probability at least q(1 − p), and A outputs 1 on x0 with probability at least
(1 − q)(1 − p). These are both errors, so we have q(1 − p) ≤ ε and (1 − q)(1 − p) ≤ ε. Summing them
gives 1 − p ≤ 2ε, or p ≥ 1 − 2ε.
    This means A finds a ∗ entry within Rε ( f ) expected queries with probability at least 1 − 2ε. By
Lemma 2.5, we get
                          1
                              Rε ( f ) ≥ RS( f ) ,   or     Rε ( f ) ≥ (1 − 2ε) RS( f ) .                 (3.1)
                       1 − 2ε
    We also define a variant of RS where the number of asterisks (or obelisks) is limited to one. Specifi-
cally, let U ⊆ {0, 1, ∗, †}n be the set of all partial assignments with exactly one ∗ or †. Formally,
                            U := x ∈ {0, 1, ∗, †}n : |{i ∈ [n] : xi ∈
                                  
                                                                    / {0, 1}}| = 1 .                  (3.2)
Definition 3.5. Let f be an n-bit partial function. We define fusab as the restriction of fsab to U, the set
of strings with only one asterisk or obelisk. That is, fusab has domain Q f ∩U, but is equal to fsab on its
domain. We then define RSu ( f ) := R0 ( fusab ). If Q f ∩U is empty, we define RSu ( f ) := 0.
    The measure RSu will play a key role in our lifting result in Section 6. Since fusab is a restriction
of fsab to a promise, it is clear that its zero-error randomized query complexity cannot increase, and so
RSu ( f ) ≤ RS( f ). Interestingly, when f is total, RSu ( f ) equals RS( f ). In other words, when f is total,
we may assume without loss of generality that its sabotaged version has only one asterisk or obelisk.
Theorem 3.6. If f is a total function, then RSu ( f ) = RS( f ).
Proof. We already argued that RS( f ) ≥ RSu ( f ). To show RSu ( f ) ≥ RS( f ), we argue that any zero-error
algorithm A for fusab also solves fsab . The main observation we need is that any input to fsab can be
completed to an input to fusab by replacing some asterisks or obelisks with 0s and 1s. To see this, let x
be an input to fsab . Without loss of generality, x ∈ Pf . Then there are two strings y, z ∈ Dom( f ) that are
consistent with x, satisfying f (y) = 0 and f (z) = 1.
     The strings y and z disagree on some set of bits B, and x has a ∗ or † on all of B. Consider starting with
y and flipping the bits of B one by one, until we reach the string z. At the beginning, we have f (y) = 0,
and at the end, we reach f (z) = 1. This means that at some point in the middle, we must have flipped a
bit that flipped the string from a 0-input to a 1-input. Let w0 and w1 be the inputs where this happens.
They differ in only one bit. If we replace that bit with ∗ or †, we get a partial assignment w consistent
with both, so w ∈ Pf . Moreover, w is consistent with x. This means we have completed an arbitrary input
to fsab to an input to fusab , as claimed.
     The algorithm A, which correctly solves fusab , when run on w (a valid input to fusab ) must find an
asterisk or obelisk in w. Now consider running A on the input x to fsab and compare its execution to when
it is run on w. If A ever queries a position that is different in x and w, then it has found an asterisk or
obelisk and the algorithm can now halt. If not, then it must find the single asterisk or obelisk present in w,
which is also present in x. This shows that the slightly modified version of A that halts if it queries an
asterisk or obelisk solves fsab and hence RS( f ) = R0 ( fsab ) ≤ R0 ( fusab ) = RSu ( f ).

                        T HEORY OF C OMPUTING, Volume 14 (5), 2018, pp. 1–27                                 8
              R ANDOMIZED Q UERY C OMPLEXITY OF S ABOTAGED AND C OMPOSED F UNCTIONS

4      Direct sum and composition theorems
In this section, we establish the main composition theorems for randomized sabotage complexity. To do
so, we first need to establish direct sum theorems for the problem fsab . In fact, our direct sum theorems
hold more generally for zero-error randomized query complexity of partial functions (and even relations).
To prove this, we will require Yao’s minimax theorem [27].

Theorem 4.1 (Minimax). Let f be an n-bit partial function. There is a distribution µ over inputs in
Dom( f ) such that all zero-error algorithms for f use at least R0 ( f ) expected queries on µ.

      We call any distribution µ that satisfies the assertion in Yao’s theorem a hard distribution for f .

4.1     Direct sum theorems
We start by defining the m-fold direct sum of a function f , which is simply the function that accepts m
inputs to f and outputs f evaluated on all of them.

Definition 4.2. Let f : Dom( f ) → Z, where Dom( f ) ⊆ Xn , be a partial function with input and output
alphabets X and Z. The m-fold direct sum of f is the partial function f ⊕m : Dom( f )m → Zm such that for
any (x1 , x2 , . . . , xm ) ∈ Dom( f )m , we have

                               f ⊕m (x1 , x2 , . . . , xm ) = ( f (x1 ), f (x2 ), . . . , f (xm )) .     (4.1)

   We can now prove a direct sum theorem for zero-error randomized and more generally ε-error
expected randomized query complexity, although we only require the result about zero-error algorithms.
We prove these results for partial functions, but they also hold for arbitrary relations.

Theorem 4.3 (Direct sum). For any n-bit partial function f and any positive integer m, we have
R0 ( f ⊕m ) = m R0 ( f ). Moreover, if µ is a hard distribution for f given by Theorem 4.1, then µ ⊗m is a
hard distribution for f ⊕m . Similarly, for ε-error randomized algorithms we get Rε ( f ⊕m ) ≥ mRε ( f ).

Proof. The upper bound follows from running the R0 ( f ) algorithm on each of the m inputs to f . By
linearity of expectation, this algorithm solves all m inputs after m R0 ( f ) expected queries.
    We now prove the lower bound. Let A be a zero-error randomized algorithm for f ⊕m that uses T
expected queries when run on inputs from µ ⊗m . We convert A into an algorithm B for f that uses T /m
expected queries when run on inputs from µ.
    Given an input x ∼ µ, the algorithm B generates m − 1 additional “fake” inputs from µ. B then
shuffles these together with x, and runs A on the result. The input to A is then distributed according to
µ ⊗m , so A uses T queries (in expectation) to solve all m inputs. B then reads the solution to the true input
x.
    Note that most of the queries A makes are to fake inputs, so they don’t count as real queries. The
only real queries B has to make happen when A queries x. But since x is shuffled with the other
(indistinguishable) inputs, the expected number of queries A makes to x is the same as the expected
number of queries A makes to each fake input; this must equal T /m. Thus B makes T /m queries to x (in
expectation) before solving it.

                          T HEORY OF C OMPUTING, Volume 14 (5), 2018, pp. 1–27                               9
                                S HALEV B EN -DAVID AND ROBIN KOTHARI

    Since B is a zero-error randomized algorithm for f that uses T /m expected queries on inputs from µ,
we must have T /m ≥ R0 ( f ) by Theorem 4.1. Thus T ≥ m R0 ( f ), as desired.
    The same lower bound proof carries through for ε-error expected query complexity, Rε ( f ), as long
as we use a version of Yao’s theorem for this model. For completeness, we prove this version of Yao’s
theorem in Appendix B.


    Theorem 4.3 is essentially [14, Theorem 2], but our theorem statement looks different since we deal
with expected query complexity instead of worst-case query complexity. From Theorem 4.3, we can also
prove a direct sum theorem for worst-case randomized query complexity since for ε ∈ (0, 1/2),

                            Rε ( f ⊕m ) ≥ Rε ( f ⊕m ) ≥ mRε ( f ) ≥ 2δ m Rε+δ ( f ) ,                  (4.2)

for any δ > 0, where the last inequality used Lemma 2.3.
    For our applications, however, we will need a strengthened version of this theorem, which we call a
threshold direct sum theorem.

Theorem 4.4 (Threshold direct sum). Given an input to f ⊕m sampled from µ ⊗m , we consider solving
only some of the m inputs to f . We say an input x to f is solved if a z-certificate was queried that proves
 f (x) = z. Then any randomized algorithm that takes an expected T queries and solves an expected k of
the m inputs when run on inputs from µ ⊗m must satisfy T ≥ k R0 ( f ).

Proof. We prove this by a reduction to Theorem 4.3. Let A be a randomized algorithm that, when run on
an input from µ ⊗m , solves an expected k of the m instances, and halts after an expected T queries. We
note that these expectations average over both the distribution µ ⊗m and the internal randomness of A.
    We now define a randomized algorithm B that solves the m-fold direct sum f ⊕m with zero error. B
works as follows: given an input to f ⊕m , B first runs A on that input. Then B checks which of the m
instances of f were solved by A (by seeing if a certificate proving the value of f was found for a given
instance of f ). B then runs the optimal zero-error algorithm for f , which makes R0 ( f ) expected queries,
on the instances of f that were not solved by A.
    Let us examine the expected number of queries used by B on an input from µ ⊗m . Recall that a
randomized algorithm is a probability distribution over deterministic algorithms; we can therefore think
of A as a distribution. For a deterministic algorithm D ∼ A and an input x to f ⊕m , we use D(x) to denote
the number of queries used by D on x, and S(D, x) ⊆ [m] to denote the set of inputs to f the algorithm D
solves when run on x. Then by assumption

                        T=     E     E D(x)        and       k=     E     E |S(D, x)| .                (4.3)
                             x∼µ ⊗m D∼A                           x∼µ ⊗m D∼A


Next, let R be the randomized algorithm that uses R0 ( f ) expected queries and solves f on any input. For
an input x to f ⊕m , we write x = x1 x2 . . . xm with xi ∈ Dom( f ). Then the expected number of queries used

                       T HEORY OF C OMPUTING, Volume 14 (5), 2018, pp. 1–27                               10
              R ANDOMIZED Q UERY C OMPLEXITY OF S ABOTAGED AND C OMPOSED F UNCTIONS

by B on input from µ ⊗m can be written as
                                                                                                  !
                               E     E     D(x) + E         E ··· E              ∑         Di (xi )               (4.4)
                             x∼µ ⊗m D∼A             D1 ∼R D2 ∼R       Dm ∼R i∈[m]\S(D,x)
                                                                               !
                          = E E            D(x) +       ∑           E Di (xi )                                    (4.5)
                             ⊗m
                             x∼µ   D∼A              i∈[m]\S(D,x) Di ∼R
                                                                          !
                          ≤ E E            D(x) +       ∑          R0 ( f )                                       (4.6)
                             ⊗m
                             x∼µ   D∼A              i∈[m]\S(D,x)

                          = E E (D(x) + (m − |S(D, x)|) R0 ( f ))                                                 (4.7)
                             ⊗m
                             x∼µ   D∼A

                          = T + (m − k) R0 ( f ) .                                                                (4.8)

   Since B solves the direct sum problem on µ ⊗m , the expected number of queries it uses is at least
m R0 ( f ) by Theorem 4.3. Hence T + (m − k) R0 ( f ) ≥ m R0 ( f ), so T ≥ k R0 ( f ).

4.2    Composition theorems
Using the direct sum and threshold direct sum theorems we have established, we can now prove com-
position theorems for randomized sabotage complexity. We start with the behavior of RS itself under
composition.

Theorem 4.5. Let f and g be partial functions. Then RS( f ◦ g) ≥ RS( f ) RS(g).

Proof. Let A be any zero-error algorithm for ( f ◦ g)sab , and let T be the expected query complexity of A
(maximized over all inputs). We turn A into a zero-error algorithm B for fsab .
    B takes a sabotaged input x for f . It then runs A on a sabotaged input to f ◦ g constructed as follows.
Each 0 bit of x is replaced with a 0-input to g, each 1 bit of x is replaced with a 1-input to g, and each
∗ or † of x is replaced with a sabotaged input to g. The sabotaged inputs are generated from µ, the
hard distribution for gsab obtained from Theorem 4.1. The 0-inputs are generated by first generating
a sabotaged input, and then selecting a 0-input consistent with that sabotaged input. The 1-inputs are
generated analogously.
    This is implemented in the following way. On input x, the algorithm B generates n sabotaged inputs
from µ (the hard distribution for gsab ), where n is the length of the string x. Call these inputs y1 , y2 , . . . , yn .
B then runs the algorithm A on this collection of n strings, pretending that it is an input to f ◦ g, with
the following caveat: whenever A tries to query a ∗ or † in an input yi , B instead queries xi . If xi is 0, B
selects an input from f −1 (0) consistent with yi , and replaces yi with this input. It then returns to A an
answer consistent with the new yi . If xi is 1, B selects a consistent input from f −1 (1) instead. If xi is a ∗
or †, B returns a ∗ or †, respectively.
    Now B only makes queries to x when it finds a ∗ or † in an input to gsab . But this solves that instance
of gsab , which was drawn from the hard distribution for gsab . Thus the query complexity of B is upper
bounded by the number of instances of gsab that can be solved by a T -query algorithm with access to n
instances of gsab . We know from Theorem 4.4 that if A makes T expected queries, the expected number

                          T HEORY OF C OMPUTING, Volume 14 (5), 2018, pp. 1–27                                       11
                                 S HALEV B EN -DAVID AND ROBIN KOTHARI

of ∗ or † entries it finds among y1 , y2 , . . . , yn is at most T / RS(g). Hence the expected number of queries
B makes to x is at most T / RS(g). Thus we have RS( f ) ≤ T / RS(g), which gives T ≥ RS( f ) RS(g).

    Using this we can lower bound the randomized query complexity of composed functions. In the
following, f n denotes the function f composed with itself n times, i. e., f 1 = f and f i+1 = f ◦ f i .

Corollary 4.6. Let f be a partial function. Then R( f n ) ≥ RS( f )n /3.

    This follows straightforwardly from observing that R( f n ) = R1/3 ( f n ) ≥ (1 − 2/3) RS( f n ) (using
Theorem 3.4) and RS( f n ) ≥ RS( f )n (using Theorem 4.5).
    We can also prove a composition theorem for zero-error and bounded-error randomized query
complexity in terms of randomized sabotage complexity. In particular this yields a composition theorem
for R( f ◦ g) when R(g) = Θ(RS(g)).

Theorem 4.7. Let f and g be partial functions. Then Rε ( f ◦ g) ≥ Rε ( f ) RS(g).

Proof. The proof follows a similar argument to the proof of Theorem 4.5. Let A be a randomized
algorithm for f ◦ g that uses T expected queries and makes error ε. We turn A into an algorithm B for f
by having B generate inputs from µ, the hard distribution for gsab , and feeding them to A, as before. The
only difference is that this time, the input x to B is not a sabotaged input. This means it has no ∗ or †
entries, so all the sabotaged inputs that B generates turn into 0- or 1-inputs if A tries to query a ∗ or † in
them.
    Since A uses T queries, by Theorem 4.4, it finds at most T / RS(g) asterisks or obelisks (in expectation).
Therefore, B makes at most T / RS(g) expected queries to x. Since B is correct whenever A is correct, its
error probability is at most ε. Thus Rε ( f ) ≤ T / RS(g), and thus T ≥ Rε ( f ) RS(g).

    Setting ε to 0 yields the following corollary.

Corollary 4.8. Let f and g be partial functions. Then R0 ( f ◦ g) ≥ R0 ( f ) RS(g).

    For the more commonly used R( f ◦ g), we obtain the following composition result.

Corollary 4.9. Let f and g be partial functions. Then R( f ◦ g) ≥ R( f ) RS(g)/10.

    This follows from Lemma 2.4, which gives R1/3 ( f ) ≥ R( f )/10, and Theorem 4.7, since

                        R( f ◦ g) ≥ R1/3 ( f ◦ g) ≥ R1/3 ( f ) RS(g) ≥ R( f ) RS(g)/10 .                  (4.9)

    Finally, we can also show an upper bound composition result for randomized sabotage complexity.

Theorem 4.10. Let f and g be partial functions. Then RS( f ◦ g) ≤ RS( f ) R0 (g). We also have
                                                                         
                                  RS( f ◦ g) = O RS( f ) R(g) log RS( f ) .


                        T HEORY OF C OMPUTING, Volume 14 (5), 2018, pp. 1–27                                 12
              R ANDOMIZED Q UERY C OMPLEXITY OF S ABOTAGED AND C OMPOSED F UNCTIONS

Proof. We describe a simple algorithm for finding a ∗ or † in an input to f ◦ g. Start by running the
optimal algorithm for the sabotage problem of f . This algorithm uses RS( f ) expected queries. Then
whenever this algorithm tries to query a bit, run the optimal zero-error algorithm for g in the corresponding
input to g.
    Now, since the input to f ◦ g that we are given is a sabotaged input, it must be consistent with both a
0-input and a 1-input of f ◦ g. It follows that some of the g inputs are sabotaged, and moreover, if we
represent a sabotaged g-input by ∗ or †, a 0-input to g by 0, and a 1-input to g by 1, we get a sabotaged
input to f . In other words, from the inputs to g we can derive a sabotaged input for f .
    This means that the outer algorithm runs uses an expected RS( f ) calls to the inner algorithm, and
ends up calling the inner algorithm on a sabotaged input to g. Meanwhile, each call to the inner algorithm
uses an expected R0 (g) queries, and will necessarily find a ∗ or † if the input it is run on is sabotaged.
Therefore, the described algorithm will always find a ∗ or †, and its expected running time is RS( f ) R0 (g)
by linearity of expectation and by the independence of the internal randomness of the two algorithms.
    Instead of using a zero-error randomized algorithm for g, we can use a bounded-error randomized
algorithm for g as long as its error probability is small. Since we make O(RS( f )) calls to the inner
algorithm, if we boost the bounded-error algorithm’s success probability to make the error much smaller
than 1/ RS( f ) (costing an additional log RS( f ) factor), we will get a bounded-error algorithm for ( f ◦g)sab .
Since R(( f ◦ g)sab ) is the same as RS( f ◦ g) up to a constant factor (Theorem 3.3),

                                   RS( f ◦ g) = O(RS( f ) R(g) log RS( f )) ,                             (4.10)

as desired.


5    Composition with the index function
We now prove our main result (Theorem 1.1) restated more precisely as follows.

Theorem 1.1 (Precise version). Let f and g be (partial) functions, and let m = Ω(R(g)1.1 ). Then

                    R( f ◦ I NDm ◦ g) = Ω(R( f ) R(g) log m) = Ω(R( f ) R(I NDm ) R(g)) .

    Before proving this, we formally define the index function.

Definition 5.1 (Index function). The index function on m bits, denoted I NDm : {0, 1}m → {0, 1}, is
defined as follows. Let c be the largest integer such that c + 2c ≤ m. For any input x ∈ {0, 1}m , let y
be the first c bits of x and let z = z0 z1 · · · z2c −1 be the next 2c bits of x. If we interpret y as the binary
representation of an integer between 0 and 2c − 1, then the output of I NDm (x) equals zy .

   To prove Theorem 1.1, we also require the strong direct product theorem for randomized query
complexity that was established by Drucker [8].

Theorem 5.2 (Strong direct product). Let f be a partial Boolean function, and let k be a positive integer.
Then any randomized algorithm for f ⊕k that uses at most γ 3 k R( f )/11 queries has success probability at
most (1/2 + γ)k , for any γ ∈ (0, 1/4).

                        T HEORY OF C OMPUTING, Volume 14 (5), 2018, pp. 1–27                                   13
                                 S HALEV B EN -DAVID AND ROBIN KOTHARI

    The first step to proving R( f ◦ I ND ◦ g) = Ω(R( f ) R(I ND) R(g)) is to establish that R(I ND ◦ g) is
essentially the same as RS(I ND ◦ g) if the index gadget is large enough.

Lemma 5.3. Let f be a partial Boolean function and let m = Ω(R( f )1.1 ). Then

                           RS(I NDm ◦ f ) = Ω(R( f ) log m) = Ω(R(I NDm ) R( f )) .                        (5.1)
               ⊕c
Moreover, if find is defined as the index function on c + 2c bits composed with f in only the first c bits, we
           ⊕c             ⊕c
have RS( find ) ≥ RSu ( find ) = Ω(c R( f )) when c ≥ 1.1 log R( f ).

    Before proving Lemma 5.3, let us complete the proof of Theorem 1.1 assuming Lemma 5.3.

Proof of Theorem 1.1. By Corollary 4.9, we have R( f ◦ I NDm ◦ g) ≥ R( f ) RS(I NDm ◦ g)/10. Combining
this with Lemma 5.3 gives R( f ◦ I NDm ◦ g) = Ω(R( f ) R(g) log m), as desired.

    We can now complete the argument by proving Lemma 5.3.

Proof of Lemma 5.3. To understand what the inputs to (I NDm ◦ f )sab look like, let us first analyze the
function I NDm . We can split an input to I NDm into a small index section and a large array section. To
sabotage an input to I NDm , it suffices to sabotage the array element that the index points to (using only a
single ∗ or †). It follows that to sabotage an input to I NDm ◦ f , it suffices to sabotage the input to f at the
array element that the index points to. In other words, we consider sabotaged inputs where the only stars
in the input are in one array cell whose index is the output of the first c copies of f , where c is the largest
integer such that c + 2c ≤ m. Note that c = log m − Θ(1).
     We now convert any RS(I NDm ◦ f ) algorithm into a randomized algorithm for f ⊕c . First, using
Lemma 2.1, we get a 2 RS(I NDm ◦ f ) query randomized algorithm that finds a ∗ or † with probability
1/2 if the input is sabotaged. Next, consider running this algorithm on a non-sabotaged input. It makes
2 RS(I NDm ◦ f ) queries. With probability 1/2, one of these queries will be in the array cell whose index
is the true answer to f ⊕c evaluated on the first cn bits. We can then consider a new algorithm A that runs
the above algorithm for 2 RS(I NDm ◦ f ) queries, then picks one of the 2 RS(I NDm ◦ f ) queries at random,
and if that query is in an array cell, it outputs the index of that cell. Then A uses 2 RS(I NDm ◦ f ) queries
and evaluates f ⊕c with probability at least RS(I NDm ◦ f )−1 /4.
     Next, Theorem 5.2 implies that for any γ ∈ (0, 1/4), either A’s success probability is smaller than
(1/2 + γ)c , or else A uses at least γ 3 c R( f )/11 queries. This means either

              RS(I NDm ◦ f )−1 /4 ≤ (1/2 + γ)c         or      2 RS(I NDm ◦ f ) ≥ γ 3 c R( f )/11 .        (5.2)

Now if we choose γ = 0.01, it is clear that the second inequality in (5.2) yields RS(I NDm ◦ f ) =
Ω(c R( f )) = Ω(R( f ) log m) no matter what m (and hence c) is chosen to be.
    To complete the argument, we show that the first inequality in (5.2) also yields the same. Observe
that the first inequality is equivalent to
                                                           
                               2 c              2 log m−Θ(1)
   RS(I NDm ◦ f ) = Ω                    =Ω                       = Ω(mlog2 (2/1.02) ) = Ω(m0.97 ) . (5.3)
                             1 + 2γ            1 + 2γ

We now have m0.97 = Ω(m0.96 log m) = Ω(R( f )1.1×0.96 log m) = Ω(R( f ) log m), as desired.

                        T HEORY OF C OMPUTING, Volume 14 (5), 2018, pp. 1–27                                  14
             R ANDOMIZED Q UERY C OMPLEXITY OF S ABOTAGED AND C OMPOSED F UNCTIONS

                                 ⊕c
    The lower bound on RSu ( find    ) follows similarly once we makes two observations. First, this argument
                          ⊕c
works equally well for find instead of I NDm ◦ f . Second, sabotaging the array cell indexed by the outputs
                          ⊕c
to the c copies of f in find introduces only one asterisk or obelisk, so the argument above lower bounds
        ⊕c                      ⊕c
RSu ( find ) and not only RS( find ).


6    Relating lifting theorems
In this section we establish Theorem 1.2, which proves that a lifting theorem for zero-error randomized
communication complexity implies one for bounded-error randomized communication complexity.
    To begin, we introduce the two-party index gadget (also used in [11]).

Definition 6.1 (Two-party index gadget). For any integer b > 0, and finite set Y, we define the index
                                 b                                           b
function GI ND : {0, 1}b × Y2 → Y as follows. Let (x, y) ∈ {0, 1}b × Y2 be an input to GI ND . Then if
we interpret x as the binary representation of an integer between 0 and 2b − 1, the function GI ND (x, y)
evaluates to yx , the xth letter of y. We also let Gb be the index function with Y = {0, 1} and let G0b be the
index function with Y = {0, 1, ∗, †}.

    The index gadget is particularly useful in communication complexity because it is “complete” for
functions with a given value of min{|X|, |Y|}. More precisely, any problem F : X × Y → {0, 1} can be
reduced to Gb for b = dlog min{|X|, |Y|}e. To see this, say |X| ≤ |Y| and let |X| = 2b . We now map every
input (x, y) ∈ X × Y to an input (x0 , y0 ) for Gb . Since X has size 2b , we can view x as a string in {0, 1}b
                                                               b
and set x0 = x. The string y0 = y00 y01 · · · y02b −1 ∈ {0, 1}2 is defined as y0x = F(x, y). Hence we can assume
without loss of generality that a supposed lifting theorem for zero-error protocols is proved using the
two-party index gadget of some size.
    Our first step is to lower bound the bounded-error randomized communication complexity of a
function in terms of the zero-error randomized communication complexity of a related function.
                                                                                             b
Lemma 6.2. Let f be an n-bit (partial) Boolean function and let Gb : {0, 1}b × {0, 1}2 → {0, 1} be the
index gadget with b = O(log n). Then

                                                   R0 ( fusab ◦ G0b )
                                                   cc                
                                  cc
                                 R ( f ◦ Gb ) = Ω                       ,                        (6.1)
                                                    log n log log n
                                                                b
where G0b is the index gadget mapping {0, 1}b × {0, 1, ∗, †}2 to {0, 1, ∗, †}.

Proof. We will use a randomized protocol A for f ◦ Gb to construct a zero-error protocol B for fusab ◦ G0b .
Note the given input to fusab ◦ G0b must have a unique copy of G0b that evaluates to ∗ or †, with all other
copies evaluating to 0 or 1. The goal of B is to find this copy and determine if it evaluates to ∗ or †. This
will evaluate fusab ◦ G0b with zero error.
    Note that if we replace all ∗ and † symbols in Bob’s input with 0 or 1, we would get a valid input to
to f ◦ Gb , which we can evaluate using A. Moreover, there is a single special ∗ or † in Bob’s input that
governs the value of this input to f ◦ Gb no matter how we fix the rest of the ∗ and † symbols. Without
loss of generality, we assume that if the special symbol is replaced by 0, the function f ◦ Gb evaluates to
0, and if it is replaced by 1, it evaluates to 1.

                        T HEORY OF C OMPUTING, Volume 14 (5), 2018, pp. 1–27                                 15
                                S HALEV B EN -DAVID AND ROBIN KOTHARI

     We can now binary search to find this special symbol. There are at most n2b asterisks and obelisks in
Bob’s input. We can set the left half to 0 and the right half to 1, and evaluate the resulting input using A.
If the answer is 0, the special symbol is on the left half; otherwise, it is on the right half. We can proceed
to binary search in this way, until we have zoomed in on one gadget that must contain the special symbol.
This requires narrowing down the search space from n possible gadgets to 1, which requires log n rounds.
Each round requires a call to A, times a O(log log n) factor for error reduction. We can therefore find the
right gadget with bounded error, using O(Rcc ( f ◦ Gb ) log n log log n) bits of communication.
     Once we have found the right gadget, we can certify its validity by having Alice send the right index
to Bob, using b bits of communication, and Bob can check that it points to an asterisk or obelisk. Since
we found a certificate with constant probability, we can use Lemma 2.5 to turn this into a zero-error
algorithm. Thus
                                            0
                            Rcc                          cc
                              0 ( f usab ◦ Gb ) = O(b + R ( f ◦ Gb ) log n log log n) .                   (6.2)
Since b = O(log n), we obtain the desired result Rcc            0        cc
                                                  0 ( f usab ◦ Gb ) = O(R ( f ◦ Gb ) log n log log n).

    Equipped with this lemma we can prove the connection between lifting theorems (Theorem 1.2),
stated more precisely as follows.

Theorem 1.2 (Precise version). Suppose that for all partial Boolean functions f on n bits, we have

                                    Rcc
                                     0 ( f ◦ Gb ) = Ω(R0 ( f )/ polylog n)                                    (6.3)

with b = O(log n). Then for all partial functions Boolean functions, we also have

                                    Rcc ( f ◦ G2b ) = Ω(R( f )/ polylog n) .                                  (6.4)

The polylog n loss in the Rcc result is only log n log log2 n worse than the loss in the Rcc
                                                                                          0 hypothesis.

Proof. First we show that for any function f and positive integer c,
                                                    cc ⊕c
                                                    R ( find ◦ G2b )
                                                                     
                                   cc
                                 R ( f ◦ G2b ) = Ω                     .                                      (6.5)
                                                         c log c
                                      ⊕c
To see this, note that we can solve find ◦ G2b by solving the c copies of f ◦ G2b and then examining the
appropriate cell of the array. This uses c Rcc ( f ◦ G2b ) bits of communication, times O(log c) since we
must amplify the randomized protocol to an error of O(1/c).
                                                 ⊕c
    Next, using (6.5) and Lemma 6.2 on Rcc ( find   ◦ G2b ), we get
                                                     ⊕c                                  ⊕c          0 
                                             Rcc ( find ◦ G2b )                Rcc
                                                                                 0 (( f ind )usab ◦ G2b )
                                                                         
                   Rcc ( f ◦ G2b ) = Ω                                =Ω                                  .   (6.6)
                                                  c log c                      c log c log n log log n

From here we want to use the assumed lifting theorem for R0 . However, there is a technicality: the gadget
                                                            ⊕c
G02b is not the standard index gadget, and the function ( find )usab does not have Boolean alphabet. To
remedy this, we use two bits to represent each of the symbols {0, 1, ∗, †}. Using this representation, we
                          ⊕c bin
define a new function ( find )usab on twice as many bits.

                        T HEORY OF C OMPUTING, Volume 14 (5), 2018, pp. 1–27                                    16
             R ANDOMIZED Q UERY C OMPLEXITY OF S ABOTAGED AND C OMPOSED F UNCTIONS

                          ⊕c bin              ⊕c
     We now compare ( find  )usab ◦ Gb to ( find )usab ◦ G02b . Note that the former uses two pointers of size b
to index two bits, while the latter uses one pointer of size 2b to index one symbol in {0, 1, ∗, †} (which is
equivalent to two bits). It’s not hard to see that the former function is equivalent to the latter function
restricted to a promise. This means the communication complexity of the former is smaller, and hence
                                       ⊕c          0                 ⊕c bin
                              Rcc                            cc
                               0 (( f ind )usab ◦ G2b ) = Ω(R0 (( f ind )usab ◦ Gb )) .                      (6.7)

We are now ready to use the assumed lifting theorem for R0 . To be more precise, let’s suppose a lifting
result that states Rcc
                    0 ( f ◦ Gb ) = Ω(R0 ( f )/`(n)) for some function `(n). Thus

                                       ⊕c bin                       ⊕c bin
                              Rcc
                               0 (( f ind )usab ◦ Gb ) = Ω(R0 (( f ind )usab )/`(n)) .                       (6.8)

We note that
                                   ⊕c bin                 ⊕c                      ⊕c
                           R0 (( find )usab ) = Ω(R0 (( find )usab )) = Ω(RSu ( find )) .                    (6.9)
                                              ⊕c
Setting c = 1.1 log R( f ), we have RSu ( find   ) = Ω(c R( f )) by Lemma 5.3. Combining this with (6.7),
(6.8), and (6.9), we get
                                          ⊕c          0
                                 Rcc
                                  0 (( f ind )usab ◦ G2b ) = Ω(c R( f )/`(n)) .                   (6.10)
Combining this with (6.6) yields
                                                                                                   
                                              c R( f )                                 R( f )
             Rcc ( f ◦ G2b ) = Ω                                       =Ω                               .   (6.11)
                                     `(n)c log c log n log log n                `(n) log n log log2 n
This gives the desired lifting theorem for bounded-error randomized communication with polylog n loss
that is at most log n log log2 n worse than the loss in the assumed Rcc
                                                                     0 lifting theorem.



7    Comparison with other lower bound methods
In this section we compare RS( f ) with other lower bound techniques for bounded-error randomized query
complexity. Figure 1 shows the two most powerful lower bound techniques for R( f ), the partition bound
(prt( f )) and quantum query complexity (Q( f )), which subsume all other general lower bound techniques.
The partition bound and quantum query complexity are incomparable, since there are functions for which
the partition bound is larger, e. g., the O R function, and functions for which quantum query complexity is
larger [3]. Another common lower bound measure, approximate polynomial degree (deg)       f is smaller than
both.
    Randomized sabotage complexity (RS) can be much                                   R
larger than the partition bound and quantum query com-
plexity as we now show. We also show that randomized
sabotage complexity is always as large as randomized                         RS       prt      Q
certificate complexity (RC), which itself is larger than
block sensitivity, another common lower bound technique.
                                                                                 RC       deg
Lastly, we also show that R0 ( f ) = O(RS( f )2 log RS( f )),
                                                                                           f

showing that RS is a quadratically tight lower bound, even             Figure 1: Lower bounds on R( f ).
for zero-error randomized query complexity.

                        T HEORY OF C OMPUTING, Volume 14 (5), 2018, pp. 1–27                                   17
                                   S HALEV B EN -DAVID AND ROBIN KOTHARI

7.1    Partition bound and quantum query complexity
We start by showing the superiority of randomized sabotage complexity against the two best lower bounds
for R( f ). Informally, what we show is that any separation between R( f ) and a lower bound measure like
Q( f ), prt( f ), or deg(
                     f f ) readily gives a similar separation between RS( f ) and the same measure.

Theorem 7.1. There exist total functions f and g such that RS( f ) ≥ prt( f )2−o(1) and RS(g) = Ω(Q(g)
                                                                                                e     2.5 ).
                                                   f 4−o(1) .
There also exists a total function h with RS(h) ≥ deg(h)

Proof. These separations were shown with R( f ) in place of RS( f ) in [2] and [3]. To get a lower bound
on RS, we can simply compose I ND with these functions and apply Lemma 5.3. This increases RS to be
the same as R (up to logarithmic factors), but it does not increase prt, deg,
                                                                         f or Q more than logarithmically,
so the desired separations follow.

    As it turns out, we didn’t even need to compose I ND with these functions. It suffices to observe that
they all use the cheat sheet construction, and that an argument similar to the proof of Lemma 5.3 implies
that RS( fCS ) = Ω(R(
                  e    f )) for all f (where fCS denotes the cheat sheet version of f , as defined in [2]). In
particular, cheat sheets can never be used to separate RS from R (by more than logarithmic factors).

7.2    Randomized certificate complexity
Finally, we also show that randomized sabotage complexity upper bounds randomized certificate com-
plexity. To show this, we first define randomized certificate complexity.
     Given a string x, a block is a set of bits of x (that is, a subset of {1, 2, . . . , n}). If B is a block and x is
a string, we denote by xB the string given by flipping the bits specified by B in the string x. If x and xB are
both in the domain of a (possibly partial) function f : {0, 1}n → {0, 1} and f (x) 6= f (xB ), we say that B
is a sensitive block for x with respect to f .
     For a string x in the domain f , the maximum number of disjoint sensitive blocks of x is called the
block sensitivity of x, denoted by bsx ( f ). The maximum of bsx ( f ) over all x in the domain of f is the
block sensitivity of f , denoted by bs( f ).
     A fractionally disjoint set of sensitive blocks of x is an assignment of non-negative weights to the
sensitive blocks of x such that for all i ∈ {1, 2, . . . , n}, the sum of the weights of blocks containing i is
at most 1. The maximum total weight of any fractionally disjoint set of sensitive blocks is called the
fractional block sensitivity of x. This is also sometimes called the randomized certificate complexity of x,
and is denoted by RCx ( f ) [1, 24, 10]. The maximum of this over all x in the domain of f is RC( f ) the
randomized certificate complexity of f .
     Aaronson [1] observed that bsx ( f ) ≤ RCx ( f ) ≤ Cx ( f ). We therefore have

                                   bs( f ) ≤ RC( f ) ≤ C( f ) ≤ R0 ( f ) ≤ D( f ) .                              (7.1)

The measure RC( f ) is also a lower bound for R( f ); indeed, from arguments in [1] it follows that
Rε ( f ) ≥ RC( f )/(1 − 2ε), so R( f ) ≥ RC( f )/3.

Theorem 7.2. Let f : {0, 1}n → {0, 1} be a partial function. Then RS( f ) ≥ RC( f )/4.

                         T HEORY OF C OMPUTING, Volume 14 (5), 2018, pp. 1–27                                      18
             R ANDOMIZED Q UERY C OMPLEXITY OF S ABOTAGED AND C OMPOSED F UNCTIONS

Proof. Let us fix x to be any input that maximizes RCx ( f ) (i. e., RC( f ) = RCx ( f )). We will show that
any zero-error randomized algorithm that solves the sabotage problem for f makes at least RCx ( f )/4
queries, and hence RS( f ) ≥ RCx ( f )/4 = RC( f )/4.
    For our fixed choice of x, let B1 , B2 , . . . Bm be all the (not necessarily disjoint) sensitive blocks of x.
For each i ∈ {1, 2, . . . , m}, let yi be the sabotaged input formed by replacing block Bi in x with ∗ entries.
Note that since each Bi is a sensitive block, each yi is a sabotaged input (i. e., it is consistent with a 0-input
and a 1-input). Hence the optimal zero-error randomized algorithm for the sabotage problem for f when
run on one of the inputs in Y = {y1 , y2 , . . . , ym } will find a ∗ in yi with at most RS( f ) expected queries.
    We now have a randomized algorithm which when run on an input in Y will find a ∗ with at most
RS( f ) queries in expectation. By Markov’s inequality (Lemma 2.1), we know that this algorithm when
run on an input in Y finds a ∗ with probability at least 1/2 after b2 RS( f )c queries.
    In general, this is an adaptive algorithm, which decides which bits to query based on the bits it has
already queried. The next step is to construct a non-adaptive algorithm (i. e., a probability distribution
over input bits) with similar properties. More formally, the probability distribution over input bits should
be such that if we make one query to an input in Y using this probability distribution, we would find a ∗
with probability Ω(1/ RS( f )). In particular, this means repeating this non-adaptive algorithm O(RS( f ))
times yields an algorithm that finds a ∗ in an input from Y with high probability.
    We now use reasoning from [1] to construct the non-adaptive algorithm. For each t between 1 and
T = b2 RS( f )c, let pt (y) be the probability that the adaptive algorithm when run on input y ∈ Y finds a ∗
on query t, conditioned on the previous queries not finding a ∗. Then for any y ∈ Y we have

                                     p1 (y) + p2 (y) + · · · + pT (y) ≥ 1/2 .                               (7.2)

We now construct a non-adaptive algorithm that finds a ∗ on input y with probability

                                              1 T            1
                                               · ∑ pi (y) ≥    .                                            (7.3)
                                              T i=1         2T

Our non-adaptive algorithm first picks a t ∈ [T ] uniformly at random and simulates query t of the adaptive
algorithm on input y assuming that the previous queries have not found a ∗. This query can be simulated
because we know that previous queries have not found a ∗ and hence the answers to these queries are
consistent with x, which is known to us. This non-adaptive algorithm when run on y finds a ∗ with
probability equal to the average of the pt (y), since for a given t ∈ [T ], the probability that it finds a ∗ in y
is pt (y). Hence the non-adaptive algorithm finds a ∗ in any y ∈ Y with probability

                                        1 T            1      1
                                         · ∑ pi (y) ≥    ≥           .                                      (7.4)
                                        T i=1         2T   4 RS( f )

Note that this non-adaptive algorithm does not depend on y, but merely on x, which is fixed.
     Let the probability distribution over inputs bits obtained from this non-adaptive algorithm be
(q1 , q2 , . . . , qn ) (i. e., the algorithm queries bit i with probability qi and ∑ni=1 qi = 1). Now for a sen-
sitive block B j , if y j is the corresponding input in Y , we know that the algorithm when run on y j finds a ∗
with probability at least 1/(4 RS( f )), hence it must query inside B j with probability at least 1/(4 RS( f )),
which gives ∑i∈B j qi ≥ 1/(4 RS( f )).

                        T HEORY OF C OMPUTING, Volume 14 (5), 2018, pp. 1–27                                   19
                                 S HALEV B EN -DAVID AND ROBIN KOTHARI

    For each sensitive block B j , let w j be the weight of B j under the maximum fractional set of disjoint
blocks of x. Then ∑mj=1 w j = RCx ( f ) = RC( f ) and for each bit i, we have ∑ j:i∈B j w j ≤ 1. We then have
                           m                 m             n            n
                 RC( f )             1
                         = ∑ wj ·          ≤ ∑ w j ∑ qi = ∑ qi ∑ w j ≤ ∑ qi · 1 ≤ 1 .                       (7.5)
                4 RS( f ) j=1     4 RS( f ) j=1 i∈B j     i=1 j:i∈B j  i=1

Hence RS( f ) ≥ RC( f )/4.

7.3   Zero-error randomized query complexity
Theorem 7.3. Let f : {0,     n                                                       2
                         p 1} → {0, 1} be a total function. Then R0 ( f ) = O(RS( f ) log RS( f )) or
alternately, RS( f ) = Ω( R0 ( f )/ log R0 ( f )).
Proof. Let A be the RS( f ) algorithm. The idea will be to run A on an input to x for long enough that we
can ensure it queries a bit in every sensitive block of x; this will mean A found a certificate for x. That
will allow us to turn the algorithm into a zero-error algorithm for f .
    Let x be any input, and let b be a sensitive block of x. If we replace the bits of x specified by b with
stars, then we can find a ∗ with probability 1/2 by running A for 2 RS( f ) queries by Lemma 2.1. This
means that if we run A on x for 2 RS( f ) queries, it has at least 1/2 probability of querying a bit in any
given sensitive block of x. If we repeat this k times, we get a 2k RS( f ) query algorithm that queries a bit
in any given sensitive block of x with probability at least 1 − 2−k .
    Now, by [18], the number of minimal sensitive blocks in x is at most RC( f )bs( f ) for a total function f .
Our probability of querying a bit in all of these sensitive blocks is at least 1 − 2−k RC( f )bs( f ) by the union
bound. When k ≥ 1 + bs( f ) log2 RC( f ), this is at least 1/2. Since a bit from every sensitive block is a
certificate, by Lemma 2.5, we can turn this into a zero-error randomized algorithm with expected query
complexity at most 4(1 + bs( f ) log2 RC( f )) RS( f ), which gives R0 ( f ) = O(RS( f ) bs( f ) log RC( f )).
                                                                                     2
   p bs( f ) ≤ RC( f ) = O(RS( f )) by Theorem 7.2, we have R0 ( f ) = O(RS( f ) log RS( f )), or RS( f ) =
Since
Ω( R0 ( f )/ log R0 ( f )).


8     Deterministic sabotage complexity
Finally we look at the deterministic analogue of randomized sabotage complexity. It turns out that
deterministic sabotage complexity (as defined in Definition 3.2) is exactly the same as deterministic query
complexity for all (partial) functions. Since we already know perfect composition and direct sum results
for deterministic query complexity, it is unclear if deterministic sabotage complexity has any applications.
Theorem 8.1. Let f : {0, 1}n → {0, 1} be a partial function. Then DS( f ) = D( f ).
Proof. For any function DS( f ) ≤ D( f ) since a deterministic algorithm that correctly computes f must
find a ∗ or † when run on a sabotaged input, otherwise its output is independent of how the sabotaged bits
are filled in.
    To show the other direction, let D( f ) = k. This means for every k − 1 query algorithm, there are
two inputs x and y with f (x) 6= f (y), such that they have the same answers to the queries made by the
algorithm. If this is not the case then this algorithm computes f (x), contradicting the fact that D( f ) = k.

                        T HEORY OF C OMPUTING, Volume 14 (5), 2018, pp. 1–27                                   20
             R ANDOMIZED Q UERY C OMPLEXITY OF S ABOTAGED AND C OMPOSED F UNCTIONS

Thus if there is a deterministic algorithm for fsab that makes k − 1 queries, there exist two inputs x and
y with f (x) 6= f (y) that have the same answers to the queries made by the algorithm. If we fill in the
rest of the inputs bits with either asterisks or obelisks, it is clear that this is a sabotaged input (since it
can be completed to either x or y), but the purported algorithm for fsab cannot distinguish them. Hence
D( fsab ) ≥ k, which means DS( f ) ≥ D( f ).


Acknowledgements
We thank Mika Göös for finding an error in an earlier proof of Theorem 4.4. We also thank the anonymous
referees of Theory of Computing for their comments.


A     Properties of randomized algorithms
We now provide proofs of the properties described in Section 2, which we restate for convenience.

Lemma A.1 (Markov’s Inequality). Let A be a randomized algorithm that makes T queries in expectation
(over its internal randomness). Then for any δ ∈ (0, 1), the algorithm A terminates within bT /δ c queries
with probability at least 1 − δ .

Proof. If A does not terminate within bT /δ c queries, it must use at least bT /δ c + 1 queries. Let’s say this
happens with probability p. Then the expected number of queries used by A is at least p(bT /δ c+1) (using
the fact that the number of queries used is always non-negative). We then get T ≥ p(bT /δ c + 1) > pT /δ ,
or p < δ . Thus A terminates within T /δ queries with probability greater than 1 − δ .

Lemma A.2 (Amplification). If f is a function with Boolean output and A is a randomized algorithm for
 f with error ε < 1/2, repeating A several times and taking the majority vote of the outcomes decreases
the error. To reach error ε 0 > 0, it suffices to repeat the algorithm 2 ln(1/ε 0 )/(1 − 2ε)2 times.

Proof. Let’s repeat A an odd number of times, say 2k + 1. The error probability of A0 , the algorithm that
takes the majority vote of these runs, is

                        k                                           k        
                           2k + 1 2k+1−i        i    2k+1−k        k     2k + 1
                      ∑ i ε              (1 − ε) ≤ ε        (1 − ε) ∑             ,                       (A.1)
                      i=0                                            i=0    i

which is at most

             ε k+1 (1 − ε)k (22k+1 /2) = ε k+1 (1 − ε)k 4k = ε(4ε(1 − ε))k = ε(1 − (1 − 2ε)2 )k .         (A.2)

It suffices to choose k large enough so that ε(1 − (1 − 2ε)2 )k ≤ ε 0 , or ln ε + k ln(1 − (1 − 2ε)2 ) ≤ ln ε 0 .
Using the inequality ln(1 − x) ≤ −x, it suffices to choose k so that k(1 − 2ε)2 ≥ ln(1/ε 0 ) − ln(1/ε), or

                                                ln(1/ε 0 ) − ln(1/ε)
                                           k≥                        .                                    (A.3)
                                                    (1 − 2ε)2

                        T HEORY OF C OMPUTING, Volume 14 (5), 2018, pp. 1–27                                  21
                                 S HALEV B EN -DAVID AND ROBIN KOTHARI

In particular, we can choose
                              ln(1/ε 0 ) − ln(1/ε)      ln(1/ε 0 )
                                                  
                                                                         ln(1/ε)
                        k=                   2
                                                     ≤            2
                                                                    +1−           .                    (A.4)
                                  (1 − 2ε)             (1 − 2ε)         (1 − 2ε)2

It is not hard to check that 3(1 − 2ε)2 ≤ 2 ln(1/ε) for all ε ∈ (0, 1/2), so we can choose k to be at most

                                                ln(1/ε 0 )
                                                           − 1/2 .                                     (A.5)
                                               (1 − 2ε)2
This means 2k + 1 is at most
                                                    2 ln(1/ε 0 )
                                                                 ,                                     (A.6)
                                                    (1 − 2ε)2
as desired.

Lemma A.3. Let f be a partial function, δ > 0, and ε ∈ [0, 1/2). Then we have
                                                  1 − 2ε             1
                                   Rε+δ ( f ) ≤          Rε ( f ) ≤    Rε ( f ) .
                                                    2δ              2δ
Proof. Let A be the Rε ( f ) algorithm. Let B be the algorithm that runs A for bRε ( f )/αc queries, and if A
doesn’t terminate, outputs 0 with probability 1/2 and 1 with probability 1/2. Then by Lemma 2.1, the
error probability of B is at most α/2 + (1 − α)ε. If we let α = 2δ /(1 − 2ε), then the error probability of
B is at most
                                       δ      (1 − 2ε − 2δ )ε
                                            +                 = ε +δ ,                                 (A.7)
                                     1 − 2ε        1 − 2ε
as desired. The number of queries made by B is at most
                                                   1 − 2ε             1
                                 bRε ( f )/αc ≤           Rε ( f ) ≤    Rε ( f ) .                     (A.8)
                                                     2δ              2δ

Lemma A.4. If f is a partial function, then for all ε ∈ (0, 1/2), we have
                                                          ln(1/ε)
                                        Rε ( f ) ≤ 14              Rε ( f ) .
                                                         (1 − 2ε)2

When ε = 1/3, we can improve this to R( f ) ≤ 10R( f ).
Proof. Repeating an algorithm with error 1/3 three times decreases its error to 7/27, so in particular
R7/27 ( f ) ≤ 3R( f ). Then using Lemma 2.3 with ε + δ = 1/3 and ε = 7/27, we get

                              1 − 14/27               13            39
                 R( f ) ≤                R     ( f ) = R7/27 ( f ) ≤ R( f ) ≤ 10R( f ) .               (A.9)
                            2(1/3 − 7/27) 7/27        4             4
    To deal with arbitrary ε, we need to use Lemma 2.2. It gives us
                                                        2 ln(1/ε 0 )
                                         Rε 0 ( f ) ≤                Rε ( f ) .                       (A.10)
                                                        (1 − 2ε)2

                       T HEORY OF C OMPUTING, Volume 14 (5), 2018, pp. 1–27                               22
              R ANDOMIZED Q UERY C OMPLEXITY OF S ABOTAGED AND C OMPOSED F UNCTIONS

When combined with Lemma 2.3, this gives

                                                      1 − 2ε 0 ln(1/ε 0 )
                                       Rε 0 ( f ) ≤                       Rε 0 ( f ) .                 (A.11)
                                                      ε − ε 0 (1 − 2ε)2

Setting ε = (1 + 4ε 0 )/6 (which is greater than ε 0 if ε 0 < 1/2) gives

                                           27 ln(1/ε 0 )                  ln(1/ε 0 )
                            Rε 0 ( f ) ≤                 Rε 0 ( f ) ≤ 14              Rε 0 ( f ) ,     (A.12)
                                           2(1 − 2ε 0 )2                 (1 − 2ε 0 )2

which gives the desired result (after exchanging ε and ε 0 ).

Lemma A.5. Let A be a randomized algorithm that uses T queries in expectation and finds a certificate
with probability 1 − ε. Then repeating A when it fails to find a certificate turns it into an algorithm that
always finds a certificate (i. e., a zero-error algorithm) that uses at most T /(1 − ε) queries in expectation.

Proof. Let A0 be the algorithm that runs A, checks if it found a certificate, and repeats if it didn’t. Let N1
be the random variable for the number of queries used by A0 . We know that the maximum number of
queries A0 ever uses is the input size; it follows that E(N1 ) converges and is at most the input size.
    Let M1 be the random variable for the number of queries used by A in the first iteration. Let S1 be the
Bernoulli random variable for the event that A fails to find a certificate. Then E(M1 ) = T and E(S1 ) = ε.
Let N2 be the random variable for the number of queries used by A0 starting from the second iteration
(conditional on the first iteration failing). Then

                                                    N1 = M1 + S1 N2 ,                                  (A.13)

so by linearity of expectation and independence,

                      E(N1 ) = E(M1 ) + E(S1 )E(N2 ) = T + εE(N2 ) ≤ T + εE(N1 ) .                     (A.14)

This implies
                                                  E(N1 ) ≤ T /(1 − ε) ,                                (A.15)
as desired.

Lemma A.6. Let f be a partial function. Let A be a randomized algorithm that solves f using at most T
expected queries and with error at most ε. For x, y ∈ Dom( f ) if f (x) 6= f (y) then when A is run on x, it
must query an entry on which x differs from y with probability at least 1 − 2ε.

Proof. Let p be the probability that A queries an entry on which x differs from y when it is run on x.
Let q be the probability that A outputs an invalid output for x given that it doesn’t query a difference
from y. Let r be the probability that A outputs an invalid output for y given that it doesn’t query such a
difference. Since one of these events always happens, we have q + r ≥ 1. Note that A errs with probability
at least (1 − p)q when run on x and at least (1 − p)r when run on y. This means that (1 − p)q ≤ ε and
(1 − p)r ≤ ε. Summing these gives 1 − p ≤ (1 − p)(q + r) ≤ 2ε, so p ≥ 1 − 2ε, as desired.

                        T HEORY OF C OMPUTING, Volume 14 (5), 2018, pp. 1–27                                23
                                 S HALEV B EN -DAVID AND ROBIN KOTHARI

B     Minimax theorem for bounded-error algorithms
We need the following version of Yao’s minimax theorem for Rε ( f ). The proof is similar to other
minimax theorems in the literature, but we include it for completeness.

Theorem B.1. Let f be a partial function and ε ≥ 0. Then there exists a distribution µ over Dom( f )
such that any randomized algorithm A that computes f with error at most ε on all x ∈ Dom( f ) satisfies
Ex∼µ A(x) ≥ Rε ( f ), where A(x) is the expected number of queries made by A on x.

     We note that Theorem B.1 talks only about algorithms that successfully compute f (with error ε) on
all inputs, not just those sampled from µ. An alternative minimax theorem where the error is with respect
to the distribution µ can be found in [25], although it loses constant factors.

Proof. We think of a randomized algorithm as a probability vector over deterministic algorithms; thus
randomized algorithms lie in RN , where N is the number of deterministic decision trees. In fact, the set S
of randomized algorithms forms a simplex, which is a closed and bounded set.
      Let err f ,x (A) := Pr[A(x) 6= f (x)] be the probability of error of the randomized A when run on x. Then
it is not hard to see that err f ,x (A) is a continuous function of A. Define err f (A) := maxx err f ,x (A). Then
err f (A) is also the maximum of a finite number of continuous functions, so it is continuous.
      Next consider the set of algorithms Sε := {A ∈ S : err f (A) ≤ ε}. Since err f (A) is a continuous function
and S is closed and bounded, it follows that Sε is closed and bounded, and hence compact. It is also easy
to check that Sε is convex. Let P be the set of probability distributions over Dom( f ). Then P is also
compact and convex. Finally, consider the function α(A, µ) := Ex∼µ ED∼A D(x) that accepts a randomized
algorithm and a distribution as input, and returns the expected number of queries the algorithm makes on
that distribution. It is not hard to see that α is a continuous function in both variables. In fact, α is linear
in both variables by the linearity of expectation.
      Since Sε and P are compact and convex subsets of the finite-dimensional spaces RN and RDom( f ) ,
respectively, and the objective function α(·, ·) is linear, we can apply Sion’s minimax theorem (see [23]
or [26, Theorem 1.12]) to get

                                    max min α(A, µ) = min max α(A, µ) .                                    (B.1)
                                    µ∈P A∈Sε              A∈Sε µ∈P


The right hand side is simply the worst-case expected query complexity of any algorithm computing f
with error at most ε, which is Rε ( f ) by definition. The left hand side gives us a distribution µ such that
for any algorithm A that makes error at most ε on all x ∈ Dom( f ), the expected number of queries A
makes on µ is at least Rε ( f ).



References
 [1] S COTT A ARONSON: Quantum certificate complexity. J. Comput. System Sci., 74(3):313–322, 2008.
     Preliminary version in CCC’03. [doi:10.1016/j.jcss.2007.06.020, arXiv:quant-ph/0210020] 18, 19

                        T HEORY OF C OMPUTING, Volume 14 (5), 2018, pp. 1–27                                   24
           R ANDOMIZED Q UERY C OMPLEXITY OF S ABOTAGED AND C OMPOSED F UNCTIONS

 [2] S COTT A ARONSON , S HALEV B EN -DAVID , AND ROBIN KOTHARI: Separations in query
     complexity using cheat sheets. In Proc. 48th STOC, pp. 863–876. ACM Press, 2016.
     [doi:10.1145/2897518.2897644, arXiv:1511.01937] 3, 4, 18

 [3] A NDRIS A MBAINIS , M ARTINS KOKAINIS , AND ROBIN KOTHARI: Nearly optimal separations
     between communication (or query) complexity and partitions. In Proc. 31st IEEE Conf. on Compu-
     tational Complexity (CCC’16), pp. 4:1–4:14. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik,
     2016. [doi:10.4230/LIPIcs.CCC.2016.4, arXiv:1512.01210] 4, 17, 18

 [4] A NURAG A NSHU , A LEKSANDRS B ELOVS , S HALEV B EN -DAVID , M IKA G ÖÖS , R AHUL JAIN ,
     ROBIN KOTHARI , T ROY L EE , AND M IKLOS S ANTHA: Separations in communication complexity
     using cheat sheets and information complexity. In Proc. 57st FOCS, pp. 555–564. IEEE Comp. Soc.
     Press, 2016. [doi:10.1109/FOCS.2016.66, arXiv:1605.01142] 4

 [5] B OAZ BARAK , M ARK B RAVERMAN , X I C HEN , AND A NUP R AO: How to compress interactive
     communication. SIAM J. Computing, 42(3):1327–1363, 2013. Preliminary version in STOC’10.
     [doi:10.1137/100811969] 2

 [6] S HALEV B EN -DAVID AND ROBIN KOTHARI: Randomized query complexity of sabotaged and
     composed functions. In Proc. 43rd Internat. Colloq. on Automata, Languages and Program-
     ming (ICALP’16), pp. 60:1–60:14. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 2016.
     [doi:10.4230/LIPIcs.ICALP.2016.60, arXiv:1605.09071] 1

 [7] H ARRY B UHRMAN AND RONALD DE W OLF: Complexity measures and decision tree complexity:
     A survey. Theoret. Computer Sci., 288(1):21–43, 2002. [doi:10.1016/S0304-3975(01)00144-X] 6

 [8] A NDREW D RUCKER: Improved direct product theorems for randomized query complexity. Comput.
     Complexity, 21(2):197–244, 2012. Preliminary version in CCC’11. [doi:10.1007/s00037-012-0043-
     7, arXiv:1005.0644] 13

 [9] T OMÀS F EDER , E YAL K USHILEVITZ , M ONI NAOR , AND N OAM N ISAN: Amortized communica-
     tion complexity. SIAM J. Computing, 24(4):736–750, 1995. [doi:10.1137/S0097539792235864]
     2

[10] J USTIN G ILMER , M ICHAEL S AKS , AND S RIKANTH S RINIVASAN: Composition limits and sepa-
     rating examples for some Boolean function complexity measures. Combinatorica, 36(3):265–311,
     2016. Preliminary version in CCC’13. [doi:10.1007/s00493-014-3189-x, arXiv:1306.0630] 18

[11] M IKA G ÖÖS , T ONIANN P ITASSI , AND T HOMAS WATSON: Deterministic communication
     vs. partition number. In Proc. 56st FOCS, pp. 1077–1088. IEEE Comp. Soc. Press, 2015.
     [doi:10.1109/FOCS.2015.70] 4, 15

[12] P ETER H ØYER , T ROY L EE , AND ROBERT Š PALEK: Negative weights make adversaries stronger.
     In Proc. 39th STOC, pp. 526–535. ACM Press, 2007. [doi:10.1145/1250790.1250867, arXiv:quant-
     ph/0611054] 3

                     T HEORY OF C OMPUTING, Volume 14 (5), 2018, pp. 1–27                        25
                             S HALEV B EN -DAVID AND ROBIN KOTHARI

[13] R AHUL JAIN AND H ARTMUT K LAUCK: The partition bound for classical communication complex-
     ity and query complexity. In Proc. 25th IEEE Conf. on Computational Complexity (CCC’10), pp.
     247–258. IEEE Comp. Soc. Press, 2010. [doi:10.1109/CCC.2010.31, arXiv:0910.4266] 4
[14] R AHUL JAIN , H ARTMUT K LAUCK , AND M IKLOS S ANTHA: Optimal direct sum results for
     deterministic and randomized decision tree complexity. Inform. Process. Lett., 110(20):893–897,
     2010. [doi:10.1016/j.ipl.2010.07.020, arXiv:1004.0105] 2, 10
[15] R AHUL JAIN , T ROY L EE , AND N ISHEETH K. V ISHNOI: A quadratically tight partition bound for
     classical communication complexity and query complexity, 2014. Available at author’s webpage.
     [arXiv:1401.4512] 4
[16] M AURICIO K ARCHMER , R AN R AZ , AND AVI W IGDERSON: Super-logarithmic depth lower
     bounds via the direct sum in communication complexity. Comput. Complexity, 5(3-4):191–204,
     1995. Preliminary version in SCT’91. [doi:10.1007/BF01206317] 2
[17] S HELBY K IMMEL: Quantum adversary (upper) bound. Chicago J. Theoret. Computer Sci., (4),
     2013. [doi:10.4086/cjtcs.2013.004] 2
[18] R AGHAV K ULKARNI AND AVISHAY TAL: On fractional block sensitivity. Chicago J. Theoret.
     Computer Sci., 2016(8), 2016. [doi:10.4086/cjtcs.2016.008] 20
[19] T ROY L EE , R AJAT M ITTAL , B EN W. R EICHARDT, ROBERT Š PALEK , AND M ARIO S ZEGEDY:
     Quantum query complexity of state conversion. In Proc. 52nd FOCS, pp. 344–353. IEEE Comp.
     Soc. Press, 2011. [doi:10.1109/FOCS.2011.75, arXiv:1011.3020] 2
[20] A SHLEY M ONTANARO: A composition theorem for decision tree complexity. Chicago J. Theoret.
     Computer Sci., 2014(6), 2014. [doi:10.4086/cjtcs.2014.006, arXiv:1302.4207] 2
[21] D ENIS PANKRATOV: Direct sum questions in classical communication complexity. Master’s thesis,
     University of Chicago, 2012. Available at advisor’s webpage. 2
[22] B EN W. R EICHARDT: Reflections for quantum query algorithms. In Proc. 22nd Ann.
     ACM-SIAM Symp. on Discrete Algorithms (SODA’11), pp. 560–569. ACM Press, 2011.
     [doi:10.1137/1.9781611973082.44] 2
[23] M AURICE S ION: On general minimax theorems.           Pacific J. Math., 8(1):171–176, 1958.
     [doi:10.2140/pjm.1958.8.171] 24
[24] AVISHAY TAL: Properties and applications of Boolean function composition. In Proc. 4th
     Innovations in Theoret. Computer Sci. Conf. (ITCS’13), pp. 441–454. ACM Press, 2013.
     [doi:10.1145/2422436.2422485] 2, 18
[25] N IKOLAI K. V ERESHCHAGIN: Randomized Boolean decision trees: Several remarks. Theoret.
     Computer Sci., 207(2):329–342, 1998. [doi:10.1016/S0304-3975(98)00071-1] 24
[26] J OHN WATROUS: The Theory of Quantum Information. Cambridge University Press, 2018. Avail-
     able at CUP and at author’s webpage. 24

                     T HEORY OF C OMPUTING, Volume 14 (5), 2018, pp. 1–27                        26
           R ANDOMIZED Q UERY C OMPLEXITY OF S ABOTAGED AND C OMPOSED F UNCTIONS

[27] A NDREW C HI -C HIH YAO: Probabilistic computations: Toward a unified measure of complexity.
     In Proc. 18st FOCS, pp. 222–227. IEEE Comp. Soc. Press, 1977. [doi:10.1109/SFCS.1977.24] 9


AUTHORS

     Shalev Ben-David
     Hartree postdoctoral fellow
     Joint Center for Quantum Information
     and Computer Science
     University of Maryland
     College Park, MD, USA
     shalev umd edu


     Robin Kothari
     Researcher
     Quantum Architectures and Computation group (QuArC)
     Microsoft Research
     Redmond, WA, USA
     robin.kothari microsoft com
     http://www.robinkothari.com


ABOUT THE AUTHORS

     S HALEV B EN -DAVID received his Ph. D. in computer science from M.I.T. in 2017, under
        the advisorship of Scott Aaronson. His research interests include complexity theory and
        quantum computing. He is currently a Hartree postdoctoral fellow at the University of
        Maryland, though this paper was written when he was a student at MIT. He is always
        looking for people to play Hex with.


     ROBIN KOTHARI received his Ph. D. in Computer Science from the University of Waterloo
       in 2014, under the supervision of Andrew Childs and John Watrous. This research was
       performed while he was a postdoctoral associate at the Center for Theoretical Physics at
       the Massachusetts Institute of Technology. He is currently a Researcher in the Quantum
       Architectures and Computation group (QuArC) at Microsoft Research. His research
       interests include quantum algorithms and complexity theory.




                     T HEORY OF C OMPUTING, Volume 14 (5), 2018, pp. 1–27                         27