Authors: Joshua Brody, Jae Tak Kim, Peem Lerdputtipongporn, Hariharan Srinivasulu
License: CC-BY-3.0
Theory of Computing, Volume 19 (11), 2023, pp. 1–14
www.theoryofcomputing.org
A Strong XOR Lemma for Randomized
Query Complexity
Joshua Brody Jae Tak Kim Peem Lerdputtipongporn
Hariharan Srinivasulu
Received August 3, 2020; Revised December 30, 2023; Published December 31, 2023
Abstract. We give a strong direct sum theorem for computing $\mathrm{XOR}_k \circ f$, the XOR
of $k$ instances of the partial Boolean function $f$. Specifically, we show that for every
$f$ and every $k \ge 2$, the randomized query complexity of computing the XOR of $k$
instances of $f$ satisfies $\bar{R}_\varepsilon(\mathrm{XOR}_k \circ f) = \Theta(k\,\bar{R}_{\varepsilon/k}(f))$, where $\bar{R}_\varepsilon(f)$ denotes the expected
number of queries made by the most efficient randomized algorithm computing $f$
with $\varepsilon$ error. This matches the naive success amplification upper bound and answers
a conjecture of Blais and Brody (CCC'19).
As a consequence of our strong direct sum theorem, we give a total function
$g$ for which $R(\mathrm{XOR}_k \circ g) = \Theta(k \log(k) \cdot R(g))$, where $R(f)$ is the number of queries
made by the most efficient randomized algorithm computing $f$ with $1/3$ error. This
answers a question from Ben-David et al. (RANDOM'20).
1 Introduction
We show that XOR admits a strong direct sum theorem for randomized query complexity.
Generally, the direct sum problem asks how the cost of computing a partial function $f$ scales
with the number $k$ of instances of the function that we need to compute simultaneously
ACM Classification: F.1.1, F.1.3
AMS Classification: 68Q09, 68Q10, 68Q17
Key words and phrases: lower bounds, query complexity, direct sum
Β© 2023 Joshua Brody, Jae Tak Kim, Peem Lerdputtipongporn, and Hariharan Srinivasulu
Licensed under a Creative Commons Attribution License (CC-BY). DOI: 10.4086/toc.2023.v019a011
(in parallel). This is a foundational computational problem that has received considerable
attention [9, 2, 13, 14, 10, 6, 8, 7, 3, 4, 5], including a recent paper by Blais and Brody [7],
which showed that expected query complexity obeys a direct sum theorem in a strong sense:
computing $k$ copies of a partial function $f$ with overall error $\varepsilon$ requires $k$ times the cost of
computing $f$ on one input with very low ($\varepsilon/k$) error. This matches the naive success amplification
algorithm, which runs an $(\varepsilon/k)$-error algorithm for $f$ once on each of $k$ inputs and applies a union
bound to get an overall error guarantee of $\varepsilon$.
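To make the naive upper bound concrete, here is a minimal Python sketch (ours, purely illustrative; `solve_f` stands in for any query algorithm that computes $f$ with a given error, and is an assumption of the sketch, not part of [7]):

```python
def xor_of_f_naive(instances, solve_f, eps):
    """Naive success amplification: compute each instance of f with error
    eps/k and XOR the answers.  By a union bound, the probability that any
    of the k answers is wrong is at most k * (eps/k) = eps, so the XOR is
    correct with probability at least 1 - eps."""
    k = len(instances)
    result = 0
    for x in instances:
        result ^= solve_f(x, error=eps / k)  # hypothetical subroutine
    return result
```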
What happens if we do not need to compute $f$ on all instances, but only a function $g \circ f$ of
those instances? Clearly the same success amplification trick (compute $f$ on each input with low
error, then apply $g$ to the answers) works for computing $g \circ f$; however, in principle, computing
$g \circ f$ can be easier than computing each instance of $f$ individually. When computing $g \circ f$
requires success amplification for all $f$, we say that $g$ admits a strong direct sum theorem. Our main
result shows that XOR admits a strong direct sum theorem.
1.1 Query complexity
A query algorithm, also known as a decision tree, computing $f$ is an algorithm $Q$ that takes an
input $x$ to $f$, examines (or queries) bits of $x$, and outputs an answer for $f(x)$. A leaf of $Q$ is a bit
string $b \in \{0,1\}^*$ representing the answers to the queries made by $Q$ on input $x$. Let $\mathrm{leaf}(Q, x)$
denote the leaf of $Q$ reached on input $x$. Naturally, our general goal is to minimize the length of
$b$, i.e., minimize the number of queries needed to compute $f$.
A randomized algorithm $R$ computes a function $f : \{0,1\}^n \to \{0,1\}$ with error $\varepsilon \ge 0$ if for
every input $x \in \{0,1\}^n$, the algorithm outputs the value $f(x)$ with probability at least $1 - \varepsilon$. The
query cost of $R$ is the maximum number of bits of $x$ that it queries, with the maximum taken
over both the choice of input $x$ and the internal randomness of $R$. The $\varepsilon$-error randomized query
complexity of $f$ (also known as the randomized decision tree complexity of $f$) is the minimum query
cost of an algorithm $R$ that computes $f$ with error at most $\varepsilon$. We denote this complexity by
$R_\varepsilon(f)$, and we write $R(f) := R_{1/3}(f)$ to denote the $1/3$-error randomized query complexity of $f$.
Another natural measure for the query cost of a randomized algorithm $R$ is the expected
number of coordinates of an input $x$ that it queries. Taking the maximum expected number
of coordinates queried by $R$ over all inputs yields the expected query cost of $R$. The minimum
expected query cost of an algorithm $R$ that computes a function $f$ with error at most $\varepsilon$ is the
$\varepsilon$-error expected query complexity of $f$, which we denote by $\bar{R}_\varepsilon(f)$. We again write $\bar{R}(f) := \bar{R}_{1/3}(f)$.
Note that $\bar{R}_0(f)$ corresponds to the standard notion of zero-error randomized query complexity of $f$.
1.2 Our results
Our main result is a strong direct sum theorem for XOR.
Theorem 1.1. For every partial function $f : \{0,1\}^n \to \{0,1\}$ and all $\varepsilon > 0$, we have $\bar{R}_\varepsilon(\mathrm{XOR}_k \circ f) =
\Omega(k \cdot \bar{R}_{\varepsilon/k}(f))$.
This answers Conjecture 1 of Blais and Brody [7] in the affirmative. We prove Theorem 1.1 by
proving an analogous result in distributional query complexity. We also allow our algorithms to
abort with a given probability. Let $\mu$ be a distribution on valid inputs for $f$. Let $D^\mu_{\delta,\varepsilon}(f)$ denote
the minimal query cost of a deterministic query algorithm that aborts with probability at most $\delta$
and errs with probability at most $\varepsilon$, where the probability is taken over inputs $x \sim \mu$. Similarly,
let $R_{\delta,\varepsilon}(f)$ denote the minimal query cost of a randomized algorithm that computes $f$ with abort
probability at most $\delta$ and error probability at most $\varepsilon$ for each valid input. (Here the probabilities
are taken over the internal randomness of the algorithm.)
Our main technical result is the following strong direct sum result for $\mathrm{XOR}_k \circ f$ for
distributional algorithms.

Lemma 1.2 (Main Technical Lemma, informally stated). For every partial function $f : \{0,1\}^n \to
\{0,1\}$, every distribution $\mu$ on the set of valid inputs, and every sufficiently small $\delta, \varepsilon > 0$, we have
$$D^{\mu^k}_{\delta,\varepsilon}(\mathrm{XOR}_k \circ f) = \Omega\big(k\, D^\mu_{\delta',\varepsilon'}(f)\big),$$
for $\delta' = \Theta(1)$ and $\varepsilon' = \Theta(\varepsilon/k)$.
In [7], Blais and Brody also gave a total function $g : \{0,1\}^n \to \{0,1\}$ whose $\varepsilon$-error expected
query complexity satisfies $\bar{R}_\varepsilon(g) = \Omega(R(g) \cdot \log \tfrac{1}{\varepsilon})$. We use our strong XOR Lemma together
with this function to show the following.

Corollary 1.3. There exists a total function $g : \{0,1\}^n \to \{0,1\}$ such that
$$R(\mathrm{XOR}_k \circ g) = \Omega(k \log(k) \cdot R(g)).$$
Proof. Let $g : \{0,1\}^n \to \{0,1\}$ be a function guaranteed by [7]. Then, we have
$$R(\mathrm{XOR}_k \circ g) \ge \bar{R}(\mathrm{XOR}_k \circ g) \ge \Omega\big(k \cdot \bar{R}_{1/(3k)}(g)\big) \ge \Omega\big(k \cdot R(g) \cdot \log(3k)\big) = \Omega\big(k \log(k) \cdot R(g)\big),$$
where the second inequality is by Theorem 1.1 and the third inequality is from the query
complexity guarantee of $g$. $\square$
This answers Open Question 1 from a recent paper by Ben-David et al. [5].
1.3 Previous and related work
Jain et al. [10] gave direct sum theorems for deterministic and randomized query complexity.
In particular, Jain et al. show $R_\varepsilon(f^k) \ge \delta \cdot k \cdot R_{\varepsilon+\delta}(f)$. While their direct sum result holds
for randomized query complexity, the lower bound is in terms of the query complexity of
computing $f$ with an increased error of $\varepsilon + \delta$. This weakens the right-hand side of their inequality.
Shaltiel [14] gave a function $f$ such that $D^\mu_{0,\varepsilon}(f^k) \ll k\, D^\mu_{0,\varepsilon}(f)$, thus showing that a similar direct
sum theorem fails to hold for distributional complexity.
Drucker [8] gave a direct product theorem for randomized query complexity, showing that
any algorithm computing $f^k$ using $\alpha k R(f)$ queries for a constant $\alpha < 1$ has success probability
exponentially small in $k$. Drucker also gave the following XOR Lemma, showing that any
algorithm for $\mathrm{XOR}_k \circ f$ whose query count is a sufficiently small multiple of $k R(f)$ has success
probability exponentially close to $1/2$ [8, Theorem 1.3].
Theorem 1.4 (Drucker). Suppose any randomized $q$-query algorithm has success probability $\le 1 - \varepsilon_0$ in
computing the Boolean function $f$ on input $x \sim \mu$ for some input distribution $\mu$. Then, for all $0 < \alpha < 1$,
any randomized algorithm making $\alpha \varepsilon_0 k q$ queries to compute $\mathrm{XOR}_k \circ f$ on input distribution $\mu^k$ ($k$
inputs drawn independently from $\mu$) has success probability at most $\frac{1}{2}\big(1 + [1 - 2\varepsilon_0 + 6\alpha \ln(2/\alpha)\varepsilon_0]^k\big)$.
Drucker's XOR Lemma applies to randomized query complexity $R(\mathrm{XOR}_k \circ f)$, while ours
applies to expected randomized query complexity $\bar{R}(\mathrm{XOR}_k \circ f)$.
Note the $\varepsilon_0$ factor in the query complexity in Drucker's theorem. When $\varepsilon_0$ is a constant close
to $1/2$, Drucker's lower bound is stronger than ours by a large constant factor. However, when
$\varepsilon_0 = o(1)$, his bound degrades significantly. Couched in our notation, Drucker's XOR Lemma
yields $R_\varepsilon(\mathrm{XOR}_k \circ f) = \Omega(\varepsilon_0 k R_{\varepsilon_0}(f))$, for some $\varepsilon_0 = \Theta(\varepsilon/k)$. This simplifies to $R_\varepsilon(\mathrm{XOR}_k \circ f) =
\Omega(\varepsilon R_{\varepsilon/k}(f))$, a loss of a factor of $k$.
As far as we know, it remains open whether this $\varepsilon_0$ factor is needed in the query complexity
lower bound of Drucker's XOR Lemma. However, Shaltiel's counterexample [14] shows that the
$\varepsilon_0$ factor is required for distributional query complexity. This rules out the most direct approach
for proving a tighter XOR Lemma for $R(\mathrm{XOR}_k \circ f)$.
Our paper is most closely related to that of Blais and Brody [7], who give a strong direct sum
theorem for the expected query complexity of computing $k$ copies of $f$ in parallel, for any partial
function $f$, and explicitly conjecture that XOR admits a strong direct sum theorem. Both [7] and
our paper use techniques similar to the work of Molinaro et al. [11, 12], who give strong direct sum
theorems for communication complexity.
Our strong direct sum theorem for XOR is an example of a composition theorem: a lower bound
on the query complexity of functions of the form $g \circ f$. Several recent articles study composition
theorems in query complexity. Bassilakis et al. [1] show that $R(f \circ g) = \Omega(\mathrm{fbs}(f)\, R(g))$, where
$\mathrm{fbs}(f)$ is the fractional block sensitivity of $f$. Ben-David and Blais [3, 4] give a tight lower bound
on $R(f \circ g)$ as a product of $R(g)$ and a new measure they define called $\mathrm{noisyR}(f)$, which
measures the complexity of computing $f$ on noisy inputs. They also characterize $\mathrm{noisyR}(f)$ in
terms of the gap-majority function. Ben-David et al. [5] explicitly consider strong direct sum
theorems for composed functions in randomized query complexity, asking whether the naive
success amplification algorithm is necessary to compute $g \circ f$. They give a partial strong direct
sum theorem, showing that there exists a partial function $f$ such that computing $\mathrm{XOR}_k \circ f$
requires success amplification, even in a model where the abort probability may be arbitrarily
close to 1.¹ Ben-David et al. explicitly ask whether there exists a total function $f$ such that
$R(\mathrm{XOR}_k \circ f) = \Omega(k \log(k)\, R(f))$.
1.4 Our technique
Our technique most closely follows the strong direct sum theorem of Blais and Brody. We start
with a query algorithm that computes $\mathrm{XOR}_k \circ f$ and use it to build a query algorithm for comput-
ing $f$ with low error. To do this, we take an input for $f$ and embed it into an input for $\mathrm{XOR}_k \circ f$.
Given $x \in \{0,1\}^n$, $i \in [k]$, and $y \in \{0,1\}^{k \times n}$, let $y^{(i \leftarrow x)} := (y^{(1)}, \ldots, y^{(i-1)}, x, y^{(i+1)}, \ldots, y^{(k)})$ denote
¹In this query complexity model, called PostBPP, the query algorithm is allowed to abort with any probability
strictly less than 1. When it does not abort, it must output $f(x)$ with probability at least $1 - \varepsilon$.
the input obtained from $y$ by replacing the $i$-th coordinate $y^{(i)}$ with $x$. Note that if $x \sim \mu$ and
$y \sim \mu^k$,² then $y^{(i \leftarrow x)} \sim \mu^k$ for all $i \in [k]$.
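In code, the embedding is just coordinate replacement; the following small Python helper (ours, purely illustrative) mirrors the notation $y^{(i \leftarrow x)}$:

```python
def embed(y, i, x):
    """Return y^(i <- x): the k-tuple y with its i-th coordinate (0-indexed
    here) replaced by x.  If the coordinates of y are drawn i.i.d. from mu
    and x ~ mu independently, the result is again distributed as mu^k."""
    y = list(y)
    y[i] = x
    return tuple(y)
```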
We require the following observation [8, Lemma 3.2].
Lemma 1.5 (Drucker). Let $y \sim \mu^k$ be an input for a query algorithm $Q$, and consider any execution of
queries by $Q$. The distribution of the coordinates of $y$, conditioned on the queries made by $Q$, remains a
product distribution.
In particular, the answers to $f(y^{(i)})$ remain independent bits conditioned on any set of queries
made by the query algorithm. Our first observation is that in order to compute $\mathrm{XOR}_k \circ f(y)$
with high probability, we must be able to compute $f(y^{(i)})$ with very high probability for many
$i$'s. The intuition behind this observation is captured by the following simple fact about the
XOR of independent random bits.
Define the bias of a random bit $B \in \{0,1\}$ as $p(B) := \max_{b \in \{0,1\}} \Pr[B = b]$. Define the
advantage of $B$ as $\mathrm{adv}(B) := 2p(B) - 1$. Note that when $\mathrm{adv}(B) = \delta$, then $p(B) = \frac{1}{2}(1 + \delta)$.
Fact 1.6. Let $B_1, \ldots, B_k$ be independent random bits, and let $\delta_i$ be the advantage of $B_i$. Then,
$$\mathrm{adv}(B_1 \oplus \cdots \oplus B_k) = \prod_{i=1}^{k} \mathrm{adv}(B_i).$$
Proof. For each $i$, let $b_i := \mathop{\mathrm{argmax}}_{b \in \{0,1\}} \Pr[B_i = b]$ and $\delta_i := \mathrm{adv}(B_i)$. Then $\Pr[B_i = b_i] =
\frac{1}{2}(1 + \delta_i)$. We prove Fact 1.6 by induction on $k$. When $k = 1$, there is nothing to prove. For $k = 2$,
note that
$$\begin{aligned}
\Pr[B_1 \oplus B_2 = b_1 \oplus b_2] &= \frac{1}{2}(1 + \delta_1)\cdot\frac{1}{2}(1 + \delta_2) + \frac{1}{2}(1 - \delta_1)\cdot\frac{1}{2}(1 - \delta_2) \\
&= \frac{1}{4}(1 + \delta_1 + \delta_2 + \delta_1\delta_2) + \frac{1}{4}(1 - \delta_1 - \delta_2 + \delta_1\delta_2) \\
&= \frac{1}{2}(1 + \delta_1\delta_2).
\end{aligned}$$
Hence $B_1 \oplus B_2$ has advantage $\delta_1\delta_2$ and the claim holds for $k = 2$. For an induction hypothesis,
suppose that the claim holds for $B_1 \oplus \cdots \oplus B_{k-1}$. Then, setting $B := B_1 \oplus \cdots \oplus B_{k-1}$, by the
induction hypothesis we have $\mathrm{adv}(B) = \prod_{i=1}^{k-1} \mathrm{adv}(B_i)$. Moreover, $B_1 \oplus \cdots \oplus B_k = B \oplus B_k$, and
$$\mathrm{adv}(B_1 \oplus \cdots \oplus B_k) = \mathrm{adv}(B \oplus B_k) = \mathrm{adv}(B)\,\mathrm{adv}(B_k) = \prod_{i=1}^{k} \mathrm{adv}(B_i). \qquad \square$$
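Fact 1.6 is easy to sanity-check by exact enumeration; the following short Python script (ours, not part of the paper) does so for random biases:

```python
import itertools
import random

def adv_of_xor(ps):
    """Exact advantage of the XOR of independent bits, where ps[i] = Pr[B_i = 1]."""
    pr_one = 0.0
    for bits in itertools.product([0, 1], repeat=len(ps)):
        pr = 1.0
        for b, p in zip(bits, ps):
            pr *= p if b == 1 else 1 - p
        if sum(bits) % 2 == 1:
            pr_one += pr
    return abs(2 * pr_one - 1)  # adv(B) = 2*max(p, 1-p) - 1 = |2p - 1|

random.seed(0)
for _ in range(100):
    ps = [random.random() for _ in range(5)]
    product = 1.0
    for p in ps:
        product *= abs(2 * p - 1)            # product of the advantages
    assert abs(adv_of_xor(ps) - product) < 1e-9
```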
Given an algorithm for $\mathrm{XOR}_k \circ f$ that has error $\varepsilon$, it follows that for typical leaves the
advantage of computing $\mathrm{XOR}_k \circ f$ is $\gtrsim 1 - 2\varepsilon$. Fact 1.6 shows that for such leaves, the advantage
of computing $f(y^{(i)})$ for most coordinates $i$ is $\gtrsim (1 - 2\varepsilon)^{1/k} = 1 - \Theta(\varepsilon/k)$. Thus, conditioned on

²We use $\mu^k$ to denote the distribution on $k$-tuples of $\{0,1\}^n$ obtained by sampling each coordinate
independently according to $\mu$.
reaching this leaf of the query algorithm, we could compute $f(y^{(i)})$ with very high probability.
We would like to fix a coordinate $i^*$ such that for most leaves, our advantage in computing $f$
on coordinate $i^*$ is $1 - O(\varepsilon/k)$. There are other complications, namely that (i) our construction
needs to handle aborts gracefully and (ii) our construction must ensure that the algorithm for
$\mathrm{XOR}_k \circ f$ does not query the $i^*$-th coordinate too many times. Our construction identifies a
coordinate $i^*$ and a string $z \in \{0,1\}^{k \times n}$, and on input $x \in \{0,1\}^n$ it emulates a query algorithm
for $\mathrm{XOR}_k \circ f$ on input $z^{(i^* \leftarrow x)}$ and outputs our best guess for $f(x)$ (which is now $f$ evaluated on
coordinate $i^*$ of $z^{(i^* \leftarrow x)}$), aborting when needed, e.g., when the algorithm for $\mathrm{XOR}_k \circ f$ aborts or
when it queries too many bits of $x$. We defer full details of the proof to Section 2.
1.5 Preliminaries and notation
A partial Boolean function on the domain $\{0,1\}^n$ is a function $f : X \to \{0,1\}$ for some subset
$X \subseteq \{0,1\}^n$. Call $X$ the set of valid inputs for $f$. Let $f$ be a partial Boolean function on $\{0,1\}^n$
and $\mu$ a distribution whose support is a subset of the valid inputs. We use $[k]$ to denote the
set $\{1, \ldots, k\}$ and $i \in_R S$ to denote an element $i$ sampled uniformly from a set $S$. Let $\mu^k$
denote the distribution on $k$-tuples of $\{0,1\}^n$ obtained by sampling each coordinate
independently according to $\mu$.
An algorithm $A$ is a $[q, \delta, \varepsilon, \mu]$-distributional query algorithm for $f$ if $A$ is a deterministic
algorithm with query cost $q$ that computes $f$ with error probability at most $\varepsilon$ and abort
probability at most $\delta$ when the input $x$ is drawn from $\mu$.³
Our main theorem is a direct sum result for $\mathrm{XOR}_k \circ f$ for expected randomized query
complexity; however, Lemma 1.2 uses distributional query complexity with aborts. To translate
between the two, we need two results from Blais and Brody [7] that connect the query complexities
in the randomized, expected randomized, and distributional query models.
Fact 1.7 ([7], Proposition 14). For every partial function $f : \{0,1\}^n \to \{0,1\}$, every $0 \le \varepsilon < \frac{1}{2}$, and
every $0 < \delta < 1$,
$$\delta \cdot R_{\delta,\varepsilon}(f) \le \bar{R}_\varepsilon(f) \le \frac{1}{1-\delta} \cdot R_{\delta,(1-\delta)\varepsilon}(f).$$
Fact 1.7 shows that when $\delta = 1 - \Omega(1)$, to achieve a lower bound for $\bar{R}_\varepsilon(f)$, it suffices to lower
bound $R_{\delta,\varepsilon}(f)$. Next, we need the following generalization of Yao's minimax lemma, which
connects randomized and distributional query complexity in the presence of aborts.
Fact 1.8 ([7], Lemma 15). For any $\alpha, \beta > 0$ such that $\alpha + \beta \le 1$, we have
$$\max_\mu D^\mu_{\delta/\alpha,\,\varepsilon/\beta}(f) \le R_{\delta,\varepsilon}(f) \le \max_\mu D^\mu_{\alpha\delta,\,\beta\varepsilon}(f).$$
For simplicity, it might be helpful to consider the simplest case where $\alpha = \beta = \frac{1}{2}$. In this case,
we recover $\max_\mu D^\mu_{2\delta,2\varepsilon}(f) \le R_{\delta,\varepsilon}(f) \le \max_\mu D^\mu_{\delta/2,\varepsilon/2}(f)$. Fact 1.8 shows that to prove a lower

³Note: in the literature, the error probability is sometimes defined as being conditioned on not aborting (e.g., [5]).
We define the error probability without conditioning to match the article [7] most closely related to our work.
bound on $R_{\delta,\varepsilon}(f)$, it suffices to prove a lower bound on distributional complexity (albeit with a
constant factor increase in abort and error probabilities).
We will also use the following convenient facts about expected value.
Fact 1.9 (Law of Conditional Expectations). Let $X$ and $Y$ be random variables. Then, we have
$$\mathrm{E}[X] = \mathrm{E}[\mathrm{E}[X \mid Y]].$$
Fact 1.10 (Markov Inequality for Bounded Variables). Let $X$ be a real-valued random variable with
$0 \le X \le 1$. Suppose that $\mathrm{E}[X] \ge 1 - \varepsilon$. Then, for any $c > 1$ it holds that
$$\Pr[X < 1 - c\varepsilon] < \frac{1}{c}.$$

Proof. Let $Y := 1 - X$. Then, $\mathrm{E}[Y] \le \varepsilon$. By Markov's Inequality we have
$$\Pr[X < 1 - c\varepsilon] = \Pr[Y > c\varepsilon] \le \frac{1}{c}. \qquad \square$$
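As a concrete instance (ours, not from the paper): take $X = 1$ with probability $1 - \varepsilon$ and $X = 0$ with probability $\varepsilon$, so $\mathrm{E}[X] = 1 - \varepsilon$ exactly.

```python
# Illustrative check of Fact 1.10 (ours, not from the paper).
# X = 1 w.p. 1 - eps and X = 0 w.p. eps has E[X] = 1 - eps exactly, and
# Pr[X < 1 - c*eps] = Pr[X = 0] = eps whenever c*eps < 1.
eps, c = 0.02, 25.0
pr_below = eps               # Pr[X < 1 - c*eps] for this distribution
assert pr_below < 1 / c      # 0.02 < 0.04, as Fact 1.10 guarantees
```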
2 Strong XOR lemma
In this section, we prove our main result.
Lemma 2.1 (Formal Restatement of Lemma 1.2). For every partial function $f : \{0,1\}^n \to \{0,1\}$,
every distribution $\mu$ on $\{0,1\}^n$, every $0 \le \delta \le \frac{1}{5}$, and every $0 < \varepsilon \le \frac{1}{200}$, we have
$$D^{\mu^k}_{\delta,\varepsilon}(\mathrm{XOR}_k \circ f) \ge \frac{k}{25}\, D^\mu_{\delta',\varepsilon'}(f),$$
where $\delta' = 0.36 + 3\delta$ and $\varepsilon' = \frac{15000\varepsilon}{k}$.
Proof. Let $q := D^{\mu^k}_{\delta,\varepsilon}(\mathrm{XOR}_k \circ f)$, and suppose that $Q$ is a $[q, \delta, \varepsilon, \mu^k]$-distributional query algorithm
for $\mathrm{XOR}_k \circ f$. Our goal is to construct an $[O(q/k), \delta', \varepsilon', \mu]$-distributional query algorithm for $f$.
Towards that end, for each leaf $\ell$ of $Q$ define
$$\begin{aligned}
b_\ell &:= \mathop{\mathrm{argmax}}_{b \in \{0,1\}} \Pr_{x \sim \mu^k}[\mathrm{XOR}_k \circ f(x) = b \mid \mathrm{leaf}(Q,x) = \ell] \\
p_\ell &:= \Pr_{x \sim \mu^k}[\mathrm{XOR}_k \circ f(x) = b_\ell \mid \mathrm{leaf}(Q,x) = \ell] \\
\gamma_\ell &:= 2p_\ell - 1.
\end{aligned}$$
Call $\gamma_\ell$ the advantage of $Q$ on leaf $\ell$.
The purpose of $Q$ is to compute $\mathrm{XOR}_k \circ f$; however, we will show that $Q$ must additionally
be able to compute $f$ reasonably well on many coordinates of $x$. For any $i \in [k]$ and any leaf $\ell$,
define
$$\begin{aligned}
b_{i,\ell} &:= \mathop{\mathrm{argmax}}_{b \in \{0,1\}} \Pr_{x \sim \mu^k}[b = f(x^{(i)}) \mid \mathrm{leaf}(Q,x) = \ell] \\
p_{i,\ell} &:= \Pr_{x \sim \mu^k}[b_{i,\ell} = f(x^{(i)}) \mid \mathrm{leaf}(Q,x) = \ell] \\
\gamma_{i,\ell} &:= 2p_{i,\ell} - 1.
\end{aligned}$$
If $Q$ reaches leaf $\ell$ on input $y$, then write $Q(y)_i := b_{i,\ell}$; that is, $Q(y)_i$ represents $Q$'s best guess for
$f(y^{(i)})$.
Next, we define some structural characteristics of leaves that we will need to complete the
proof.

Definition 2.2 (Good leaves, good coordinates).
• Call a leaf $\ell$ good if $p_\ell \ge 1 - 50\varepsilon$. Otherwise, call $\ell$ bad.
• Call a leaf $\ell$ good for $i$ if $\gamma_{i,\ell} \ge 1 - 5000\varepsilon/k$. Otherwise, call a leaf $\ell$ bad for $i$.

When a leaf is good for $i$, then $Q$, conditioned on reaching this leaf, computes $f(x^{(i)})$ with
very high probability. Before presenting the main reduction, we give a few simple claims to
help our proof. Our first claim shows that we reach a good leaf with high probability.

Claim 2.3. $\Pr_{x \sim \mu^k}[\,\mathrm{leaf}(Q,x) \text{ is bad} \mid Q(x) \text{ doesn't abort}\,] \le \frac{1}{25}$.
Proof. Conditioned on $Q$ not aborting, it outputs the correct value of $\mathrm{XOR}_k \circ f$ with probability
at least $1 - \frac{\varepsilon}{1-\delta} \ge 1 - 2\varepsilon$. We analyze this error probability by conditioning on which leaf is
reached. Let $\lambda$ be the distribution on $\mathrm{leaf}(Q,x)$ when $x \sim \mu^k$, conditioned on $Q$ not aborting.
Let $L \sim \lambda$. Then, we have:
$$\begin{aligned}
1 - 2\varepsilon &\le \Pr_{x \sim \mu^k}[Q(x) = \mathrm{XOR}_k \circ f(x) \mid Q \text{ doesn't abort}] \\
&= \sum_{\text{leaf } \ell} \Pr_{L \sim \lambda}[L = \ell] \cdot \Pr[Q(x) = \mathrm{XOR}_k \circ f(x) \mid L = \ell] \\
&= \sum_{\ell} \Pr[L = \ell] \cdot p_\ell \\
&= \mathrm{E}_L[p_L].
\end{aligned}$$
Thus, $\mathrm{E}[p_L] \ge 1 - 2\varepsilon$. Recalling that $\ell$ is good if $p_\ell \ge 1 - 50\varepsilon$ and using Fact 1.10 (with $c = 25$),
$L$ is bad with probability at most $\frac{1}{25}$. $\square$
Next, we claim that each good leaf is good for many $i$.

Claim 2.4. Let $\ell$ be any good leaf, and let $I$ be uniform on $[k]$. Then, we have:
$$\Pr_I[\ell \text{ is bad for } I] \le \frac{1}{25}.$$
Proof. Fix a good leaf $\ell$, and let $\beta_\ell := \Pr_I[\ell \text{ is bad for } I]$. Recall that if $\ell$ is good, then $p_\ell \ge 1 - 50\varepsilon$.
Therefore, $\gamma_\ell \ge 1 - 100\varepsilon$. Using $1 + x \le e^x$ and $e^{-2x} \le 1 - x$ (which holds for all $0 \le x \le 1/2$),
we have for any good leaf $\ell$
$$1 - 100\varepsilon \le \gamma_\ell = \prod_{i=1}^{k} \gamma_{i,\ell} \le \left(1 - \frac{5000\varepsilon}{k}\right)^{k\beta_\ell} \le e^{-5000\varepsilon\beta_\ell} \le 1 - 2500\varepsilon\beta_\ell.$$
Rearranging terms, we see that $\beta_\ell \le \frac{1}{25}$. $\square$
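The two elementary inequalities used above can be spot-checked numerically on their stated domains; here is a quick Python verification (ours, not part of the paper):

```python
import math

# Spot-check of the elementary inequalities used in the proof of Claim 2.4.
for i in range(0, 2001):
    x = i / 1000.0
    assert 1 + x <= math.exp(x) + 1e-12       # 1 + x <= e^x for x >= 0
for i in range(0, 501):
    t = i / 1000.0
    assert math.exp(-2 * t) <= 1 - t + 1e-12  # e^{-2t} <= 1 - t on [0, 1/2]
```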
Next, we describe a randomized algorithm $Q'$ for $f$ whose expected query cost, abort
probability, and error probability match the guarantees we want to provide when the input
$x \sim \mu$. We will complete the proof of Lemma 2.1 by fixing the randomness used in $Q'$. Our
algorithm works by independently sampling $z \sim \mu^k$ and $I$ uniformly from $[k]$, embedding $x$ in the
$I$-th coordinate of $z$, and emulating $Q$ on the resulting string.
Algorithm 1 $Q'(x)$
1: Independently sample $I$ uniformly from $[k]$ and $z \sim \mu^k$.
2: $y \leftarrow z^{(I \leftarrow x)}$
3: Emulate algorithm $Q$ on input $y$.
4: Abort
   (i) if $Q$ aborts,
   (ii) if $Q$ reaches a bad leaf,
   (iii) if $Q$ reaches a leaf that is bad for $I$, or
   (iv) if $Q$ queries more than $\frac{25q}{k}$ bits of $x$.
5: Otherwise, output $Q(y)_I$.
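The following Python sketch renders Algorithm 1 under assumed interfaces (all names here are ours: `Q` is a callable emulating the decision tree for $\mathrm{XOR}_k \circ f$ given a query oracle, while `bad_leaf`, `bad_for`, and `best_guess` encode Definition 2.2 and the guesses $b_{i,\ell}$); it is an illustration, not the paper's implementation:

```python
import random

class Abort(Exception):
    """Raised in the four abort cases of Algorithm 1."""

def q_prime(x, Q, mu_sampler, k, q, bad_leaf, bad_for, best_guess):
    """Embed x into a random coordinate of a random k-tuple and emulate Q."""
    i_star = random.randrange(k)              # step 1: I uniform on [k]
    z = [mu_sampler() for _ in range(k)]      # step 1: z ~ mu^k
    z[i_star] = x                             # step 2: y = z^(I <- x)
    budget = [25 * q // k]                    # query cap on bits of x

    def query(i, j):
        """Oracle access to y; counts and caps queries into coordinate I."""
        if i == i_star:
            budget[0] -= 1
            if budget[0] < 0:
                raise Abort()                 # abort case (iv)
        return z[i][j]

    leaf = Q(query)                           # step 3; may raise Abort: case (i)
    if bad_leaf(leaf) or bad_for(leaf, i_star):
        raise Abort()                         # abort cases (ii) and (iii)
    return best_guess(leaf, i_star)           # step 5: output Q(y)_I
```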
Note that the emulation is possible since whenever $Q$ queries the $j$-th bit of $y^{(I)}$, we can
query $x_j$, and we can emulate $Q$ querying a bit of $y^{(i)}$ for $i \ne I$ directly since $z$ is fixed. We claim
that (i) $Q'$ makes at most $\frac{25q}{k}$ queries, (ii) $Q'$ aborts with probability at most $\delta + 0.12$, and (iii)
$Q'$ errs with probability at most $\frac{5000\varepsilon}{k}$.
First, note that $Q'$ makes at most $\frac{25q}{k}$ queries, since it aborts instead of making more queries.
Second, consider the abort probability of $Q'$. Our algorithm aborts if $Q$ aborts, if we reach
a bad leaf, if the leaf we reach is bad for $I$, or if $Q$ queries more than $\frac{25q}{k}$ bits of $y^{(I)}$. Let $\mathcal{E}_1$
be the event that $Q$ aborts on input $y$. Similarly, let $\mathcal{E}_2, \mathcal{E}_3, \mathcal{E}_4$ be the events that $Q$ reaches a
bad leaf, $Q$ reaches a leaf that is bad for $I$, and $Q$ queries more than $\frac{25q}{k}$ bits of $x$, respectively.
Since $x \sim \mu$, $z \sim \mu^k$, and $I$ is uniform on $[k]$, it follows that $y \sim \mu^k$. By the abort guarantees
of $Q$, we have $\Pr[\mathcal{E}_1] \le \delta$. By Claim 2.3 we have $\Pr[\mathcal{E}_2 \mid \overline{\mathcal{E}_1}] \le 1/25$, and by Claim 2.4 we have
$\Pr[\mathcal{E}_3 \mid \overline{\mathcal{E}_1}, \overline{\mathcal{E}_2}] \le 1/25$. Thus, we have $\Pr[\mathcal{E}_1 \vee \mathcal{E}_2 \vee \mathcal{E}_3] \le \delta + \frac{2}{25}$.
Next, for each $i \in [k]$, let $q_i(y)$ denote the number of queries that $Q$ makes to $y^{(i)}$ on
input $y$. The query cost of $Q$ guarantees that for each input $y$, $\sum_{1 \le i \le k} q_i(y) \le q$. Therefore,
for any $y$, at most $\frac{k}{25}$ indices $i \in [k]$ satisfy $q_i(y) \ge \frac{25q}{k}$. Hence, for $I \in_R [k]$, $x \sim \mu$, and
$z \sim \mu^k$, and recalling that $y = z^{(I \leftarrow x)}$, we have $\Pr[\mathcal{E}_4] \le \frac{1}{25}$. By a union bound, we have
$\Pr_{I,z,x}[Q' \text{ aborts on input } y] = \Pr[\mathcal{E}_1 \vee \mathcal{E}_2 \vee \mathcal{E}_3 \vee \mathcal{E}_4] \le \delta + \frac{3}{25} = \delta + 0.12$.
Third, we analyze the error probability of $Q'$. This algorithm errs only when it reaches a leaf
that is good for $I$. Since the leaf is good for $I$ (Definition 2.2), we are correct with probability at
least $p_{I,\ell} = \frac{1+\gamma_{I,\ell}}{2} \ge 1 - \frac{5000\varepsilon}{k}$. Thus, we have $\Pr[Q' \text{ errs}] \le \frac{5000\varepsilon}{k}$.
Letting $X$ be the indicator variable for the event that $Q'$ aborts and $Y = (I, z)$, Fact 1.9 gives
$$\Pr[Q' \text{ aborts}] = \mathrm{E}[X] = \mathrm{E}_{I,z}\big[\mathrm{E}[X \mid I, z]\big] = \mathrm{E}_{I,z}\big[\Pr_x[Q' \text{ aborts}]\big].$$
Thus algorithm $Q'$ is a randomized algorithm that, when given an input $x \sim \mu$, makes at most
$\frac{25q}{k}$ queries and has the following guarantees:
$$\mathrm{E}_{I,z}\big[\Pr_x[Q' \text{ aborts}]\big] = \Pr_{I,x,z}[Q' \text{ aborts}] \le \delta + 0.12, \quad \text{and}$$
$$\mathrm{E}_{I,z}\big[\Pr_x[Q(y)_I \ne f(x)]\big] = \Pr_{I,x,z}[Q(y)_I \ne f(x)] \le \frac{5000\varepsilon}{k}.$$
By Markov's inequality and a union bound, there must be a setting of $(i^*, z^*)$ such that
$\Pr_x[Q' \text{ aborts}] \le 3\delta + 0.36$ and $\Pr_x[Q(y)_{i^*} \ne f(x)] \le \frac{15000\varepsilon}{k}$. Let $Q''$ be a deterministic
algorithm that takes an input $x \sim \mu$ and emulates algorithm $Q'$ with $i^*$ and $z^*$ in place of the
randomly sampled $I, z$. This algorithm queries at most $\frac{25q}{k}$ bits, aborts with probability at most
$3\delta + 0.36$, and errs with probability at most $\frac{15000\varepsilon}{k}$. Thus, it is an $[O(q/k),\, 3\delta + 0.36,\, \frac{15000\varepsilon}{k},\, \mu]$-
distributional algorithm for $f$, as required. $\square$
2.1 Proof of Theorem 1.1
Proof of Theorem 1.1. Define $\varepsilon_0 := 30000\varepsilon$. Let $\mu$ be the input distribution for $f$ achieving
$\max_\mu D^\mu_{1/2,\,\varepsilon_0/k}(f)$, and let $\mu^k$ be the $k$-fold product distribution of $\mu$. By the first inequality of
Fact 1.7 and the first inequality of Fact 1.8, we have
$$\bar{R}_\varepsilon(\mathrm{XOR}_k \circ f) \ge \frac{1}{50}\, R_{1/50,\,\varepsilon}(\mathrm{XOR}_k \circ f) \ge \frac{1}{50}\, D^{\mu^k}_{1/25,\,2\varepsilon}(\mathrm{XOR}_k \circ f).$$
Additionally, by Lemma 2.1 and the second inequalities of Fact 1.7 and Fact 1.8, we have
$$D^{\mu^k}_{1/25,\,2\varepsilon}(\mathrm{XOR}_k \circ f) \ge \frac{k}{120}\, D^\mu_{1/2,\,\varepsilon_0/k}(f) \ge \frac{k}{120}\, R_{2/3,\,4\varepsilon_0/k}(f) \ge \frac{k}{360}\, \bar{R}_{12\varepsilon_0/k}(f).$$
Thus, we have $\bar{R}_\varepsilon(\mathrm{XOR}_k \circ f) = \Omega\big(D^{\mu^k}_{1/25,\,2\varepsilon}(\mathrm{XOR}_k \circ f)\big)$ and $D^{\mu^k}_{1/25,\,2\varepsilon}(\mathrm{XOR}_k \circ f) = \Omega\big(k\,\bar{R}_{12\varepsilon_0/k}(f)\big)$. By
standard success amplification, $\bar{R}_{12\varepsilon_0/k}(f) = \Theta\big(\bar{R}_{\varepsilon/k}(f)\big)$. Putting these together yields
$$\bar{R}_\varepsilon(\mathrm{XOR}_k \circ f) = \Omega\Big(D^{\mu^k}_{1/25,\,2\varepsilon}(\mathrm{XOR}_k \circ f)\Big) = \Omega\Big(k\,\bar{R}_{12\varepsilon_0/k}(f)\Big) = \Omega\Big(k\,\bar{R}_{\varepsilon/k}(f)\Big),$$
hence $\bar{R}_\varepsilon(\mathrm{XOR}_k \circ f) = \Omega\big(k\,\bar{R}_{\varepsilon/k}(f)\big)$, completing the proof. $\square$
References
[1] Andrew Bassilakis, Andrew Drucker, Mika Göös, Lunjia Hu, Weiyun Ma, and Li-Yang Tan:
The power of many samples in query complexity. In Proc. 47th Internat. Colloq. on Automata,
Languages, and Programming (ICALP'20), pp. 9:1–18. Schloss Dagstuhl–Leibniz-Zentrum fuer
Informatik, 2020. [doi:10.4230/LIPIcs.ICALP.2020.9, arXiv:2002.10654, ECCC:TR20-027]

[2] Yosi Ben-Asher and Ilan Newman: Decision trees with boolean threshold queries. J. Comput.
System Sci., 51(3):495–502, 1995. Preliminary version in CCC'95. [doi:10.1006/jcss.1995.1085]

[3] Shalev Ben-David and Eric Blais: A new minimax theorem for randomized algorithms. In
Proc. 61st FOCS, pp. 403–411. IEEE Comp. Soc., 2020. [doi:10.1109/FOCS46700.2020.00045]

[4] Shalev Ben-David and Eric Blais: A tight composition theorem for the randomized query
complexity of partial functions. In Proc. 61st FOCS, pp. 240–246. IEEE Comp. Soc., 2020.
[doi:10.1109/FOCS46700.2020.00031, arXiv:2002.10809]

[5] Shalev Ben-David, Mika Göös, Robin Kothari, and Thomas Watson: When is amplification
necessary for composition in randomized query complexity? In Proc. 24th Internat.
Conf. on Randomization and Computation (RANDOM'20), pp. 28:1–16. Schloss Dagstuhl–
Leibniz-Zentrum fuer Informatik, 2020. [doi:10.4230/LIPICS.APPROX/RANDOM.2020.28,
arXiv:2006.10957]

[6] Shalev Ben-David and Robin Kothari: Randomized query complexity of sabotaged and
composed functions. Theory of Computing, 14(5):1–27, 2018. Preliminary version in ICALP'16.
[doi:10.4086/toc.2018.v014a005, arXiv:1605.09071, ECCC:TR16-087]

[7] Eric Blais and Joshua Brody: Optimal separation and strong direct sum for randomized
query complexity. In Proc. 34th Comput. Complexity Conf. (CCC'19), pp. 29:1–17.
Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 2019. [doi:10.4230/LIPIcs.CCC.2019.29,
arXiv:1908.01020]

[8] Andrew Drucker: Improved direct product theorems for randomized query complexity.
Comput. Complexity, 21(2):197–244, 2012. Preliminary version in CCC'11.
[doi:10.1007/s00037-012-0043-7, arXiv:1005.0644, ECCC:TR10-080]

[9] Russell Impagliazzo, Ran Raz, and Avi Wigderson: A direct product theorem. In Proc.
9th IEEE Conf. Structure in Complexity Theory (SCT'94), pp. 88–96. IEEE Comp. Soc., 1994.
[doi:10.1109/SCT.1994.315814]

[10] Rahul Jain, Hartmut Klauck, and Miklos Santha: Optimal direct sum results for
deterministic and randomized decision tree complexity. Inform. Process. Lett., 110(20):893–
897, 2010. [doi:10.1016/j.ipl.2010.07.020]

[11] Marco Molinaro, David P. Woodruff, and Grigory Yaroslavtsev: Beating the direct
sum theorem in communication complexity with implications for sketching. In Proc. 24th
Ann. ACM–SIAM Symp. on Discrete Algorithms (SODA'13), pp. 1738–1756. SIAM, 2013.
[doi:10.1137/1.9781611973105.125]

[12] Marco Molinaro, David P. Woodruff, and Grigory Yaroslavtsev: Amplification of
one-way information complexity via codes and noise sensitivity. In Proc. 42nd Internat.
Colloq. on Automata, Languages, and Programming (ICALP'15), pp. 960–972. Springer, 2015.
[doi:10.1007/978-3-662-47672-7_78, ECCC:TR15-031]

[13] Noam Nisan, Steven Rudich, and Michael E. Saks: Products and help bits in decision
trees. SIAM J. Comput., 28(3):1035–1050, 1998. Preliminary version in FOCS'94.
[doi:10.1137/S0097539795282444]

[14] Ronen Shaltiel: Towards proving strong direct product theorems. Comput. Complexity,
12(1):1–22, 2003. Preliminary version in CCC'01. [doi:10.1007/s00037-003-0175-x,
ECCC:TR01-009]
AUTHORS
Joshua Brody
Associate Professor
Department of Computer Science
Swarthmore College
Swarthmore, PA, USA
brody cs swarthmore edu
http://cs.swarthmore.edu/~brody
Jae Tak Kim
Google
Mountain View, CA, USA
jkim17 swarthmore edu
https://linkedin.com/in/jae-tak-kim
Peem Lerdputtipongporn
Ph. D. student
Statistics Department
Carnegie Mellon University
Pittsburgh, PA, USA
plerdput andrew cmu edu
https://www.cmu.edu/dietrich/statistics-datascience/people/phd/peem-lerdputtipongporn.html
Hariharan Srinivasulu
Palantir Technologies
New York, NY, USA
srinivasulu hari gmail com
https://www.linkedin.com/in/hari-srinivasulu/
ABOUT THE AUTHORS
Joshua Brody graduated from Dartmouth College in 2010; his advisor was Amit
Chakrabarti. The subject of his thesis was communication complexity. Since
graduating, his research interests have broadened to streaming algorithms,
property testing, and data structure lower bounds. He was introduced to
computer science by his father, who taught him to count on his fingers in binary
when he was four years old. He spends most of his spare time with his children,
who are good at math but donβt seem to share his interest in counting in binary.
Jae Tak Kim graduated from Swarthmore College in 2022 with a B. A. in computer
science. During his undergraduate studies, he worked on research in query
complexity theory and static program analysis. In his free time, he enjoys
climbing, reading, and listening to music. Jae Tak Kim currently works at Google,
where he is a software engineer.
Peem Lerdputtipongporn is a Ph. D. candidate in Statistics and Public Policy at
Carnegie Mellon University. His advisors are David Choi and Nynke Niezink.
His parents named him βPeem,β a short Thai word that means βcommanding
respect,β to offset the fact that he was born underweight. He received a B. A.
in Mathematics and Computer Science from Swarthmore College in 2021. At
Swarthmore, he worked with Professor Joshua Brody on Query Complexity
and Professor Steve Wang on Statistical Paleontology. His current interests are
algorithmic fairness and statistical machine learning.
Hariharan Srinivasulu is a 2022 graduate of Swarthmore College with a B. A. in
Physics and Computer Science. He was named after a Hindu deity, although
he suspects his name was inspired by one of his dadβs favorite musicians. He
was born and brought up in Chennai, India, and moved to the US for his
undergraduate studies. He was mentored by Mike Brown and Joshua Brody
at Swarthmore, and has done research in the areas of Computational Plasma
Physics and Query Complexity. He hopes to enter a career in research in the
future and is interested in areas such as quantum computing and distributed
systems. Hariharan currently works for Palantir Technologies as a software
engineer.