                          Theory of Computing, Volume 18 (21), 2022, pp. 1–32
                                       www.theoryofcomputing.org

                                        Special Issue: CCC 2020



              Hitting Sets Give Two-Sided Derandomization of Small Space
                              Kuan Cheng∗                           William M. Hoza†
                  Received July 19, 2020; Revised June 21, 2022; Published September 20, 2022




       Abstract. A hitting set is a “one-sided” variant of a pseudorandom generator
       (PRG), naturally suited to derandomizing algorithms that have one-sided error. We
       study the problem of using a given hitting set to derandomize algorithms that have
       two-sided error, focusing on space-bounded algorithms. For our first result, we show
       that if there is a log-space hitting set for polynomial-width read-once branching
       programs (ROBPs), then not only does L = RL hold, but L = BPL as well. This
       answers a question raised by Hoza and Zuckerman (SICOMP 2020).
           Next, we consider constant-width ROBPs. We show that if there are log-space
       hitting sets for constant-width ROBPs, then given black-box access to a constant-
       width ROBP 𝑓 , it is possible to deterministically estimate 𝔼[ 𝑓 ] to within ±𝜖 in space
     A conference version of this paper appeared in the Proceedings of the 35th Computational Complexity Conference,
2020 [11].
   ∗ This research was conducted while the author was a postdoctoral researcher at the University of Texas at Austin,
supported by a Simons Investigator Award (#409864, David Zuckerman).
   † Most of this work was done while the author was a graduate student at the University of Texas at Austin,
supported by the NSF GRFP under Grant DGE-1610403 and by a Harrington Fellowship from UT Austin. Part of this
work was done while the author was visiting the Simons Institute for the Theory of Computing.


ACM Classification: F.1.2, F.1.3
AMS Classification: 68Q10, 68Q15, 68Q87
Key words and phrases: derandomization, pseudorandomness, hitting set, branching programs,
space complexity


© 2022 Kuan Cheng and William M. Hoza
Licensed under a Creative Commons Attribution License (CC-BY)                       DOI: 10.4086/toc.2022.v018.a021

      𝑂(log(𝑛/𝜖)). Unconditionally, we give a deterministic algorithm for this problem
      with space complexity 𝑂(log^2 𝑛 + log(1/𝜖)), slightly improving over previous work.
          Finally, we investigate the limits of this line of work. Perhaps the strongest
      reduction along these lines one could hope for would say that for every explicit
      hitting set generator, there is an explicit PRG with similar parameters. In the setting of
      constant-width ROBPs over a large alphabet, we prove that establishing such a strong
      reduction is at least as difficult as constructing a good PRG outright. Quantitatively,
      we prove that if the strong reduction holds, then for every constant 𝛼 > 0, there is
       an explicit PRG for constant-width ROBPs with seed length 𝑂(log^{1+𝛼} 𝑛). Along the
      way, unconditionally, we construct an improved hitting set for ROBPs over a large
      alphabet.


1   Introduction
Suppose some decision problem can be solved by an efficient randomized algorithm. That’s
good, but an efficient deterministic algorithm would be even better. We would therefore like to
deterministically analyze the acceptance probability of the randomized algorithm on a given
input. An ambitious approach to derandomization is to try to design a suitable pseudorandom
generator (PRG).
Definition 1.1. Let ℱ be a class of functions 𝑓 : {0, 1}^𝑛 → {0, 1}. An 𝜖-PRG for ℱ is a function
𝐺 : {0, 1}^𝑠 → {0, 1}^𝑛 such that for every 𝑓 ∈ ℱ , |𝔼[ 𝑓 ] − 𝔼_{𝑋∈{0,1}^𝑠}[ 𝑓 (𝐺(𝑋))]| ≤ 𝜖.
     Let 𝑛 be the number of random bits used by the randomized algorithm, and suppose ℱ is
rich enough to compute the action of the randomized algorithm on its random bits. By iterating
over all “seeds” 𝑥 ∈ {0, 1}^𝑠 and plugging 𝐺(𝑥) into the randomized algorithm, we can estimate
its acceptance probability with additive error at most 𝜖.
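This seed-enumeration step is simple enough to sketch in code (an illustrative toy, not code from the paper; the generator G and test function f below are hypothetical stand-ins):

```python
from itertools import product

def estimate_via_prg(f, G, s):
    """Estimate E[f] by averaging f over the outputs of G on all 2^s seeds."""
    total = 0
    for seed in product([0, 1], repeat=s):
        total += f(G(seed))
    return total / 2 ** s

# Toy instance: the "generator" just repeats its 2-bit seed three times,
# and f reads only the first bit, so the estimate is exact here.
G = lambda seed: seed * 3            # maps {0,1}^2 into {0,1}^6
f = lambda x: x[0]
print(estimate_via_prg(f, G, 2))     # 0.5
```

Enumerating all 2^s seeds is what makes the derandomization deterministic; the point of a short seed is that this loop stays affordable.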
     Unfortunately, designing efficient PRGs has proved to be extremely difficult. Constructing a
hitting set is sometimes less difficult.
Definition 1.2. Let ℱ be a class of functions 𝑓 : {0, 1}^𝑛 → {0, 1}. An 𝜖-hitting set for ℱ is a set
𝐻 ⊆ {0, 1}^𝑛 such that for every 𝑓 ∈ ℱ with 𝔼[ 𝑓 ] ≥ 𝜖, there is some 𝑥 ∈ 𝐻 such that 𝑓 (𝑥) = 1.
    The image of any 𝜖-PRG is clearly an 𝜖′-hitting set for every 𝜖′ > 𝜖. By iterating over all
strings in an 𝜖-hitting set, we can at least distinguish acceptance probability 0 from acceptance
probability ≥ 𝜖. This is already sufficient for derandomizing some algorithms (namely, those
with “one-sided error”). In this paper, we investigate the possibility of using a hitting set in a
nontrivial way to obtain an estimate of the acceptance probability with a small additive error,
just like what a PRG would have provided.
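The one-sided use of a hitting set can likewise be sketched (again an illustrative toy, not code from the paper; the two-point set H is hypothetical):

```python
def hits(f, H):
    """One-sided test: True iff some x in H satisfies f(x) = 1.
    If H is an eps-hitting set for f's class, then a False answer
    certifies E[f] < eps, while a True answer certifies E[f] > 0."""
    return any(f(x) == 1 for x in H)

# A hypothetical two-point hitting set for 3-bit inputs.
H = [(0, 0, 0), (1, 1, 1)]
print(hits(lambda x: x[1], H))   # True
print(hits(lambda x: 0, H))      # False
```

Note what is missing: nothing in this loop distinguishes acceptance probability 𝜖 from acceptance probability 1, which is exactly the gap between one-sided and two-sided derandomization.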
    This possibility was previously studied in the context of derandomizing time-bounded
algorithms. Several proofs have been discovered showing that if there is a polynomial-time
hitting set for size-𝑛 circuits, then P = BPP [3, 9, 4, 13]. In Section 5 we provide yet another
proof of this theorem; our short proof is arguably simpler than all previous proofs. However,
the focus of our paper is derandomizing space-bounded algorithms.


1.1    Derandomizing log-space algorithms
The behavior of a small-space algorithm as a function of its random bits can be modeled by a
read-once¹ branching program (ROBP). A width-𝑤 length-𝑛 ROBP is a directed graph consisting of
𝑛 + 1 layers with 𝑤 vertices per layer. There is a designated “start vertex” 𝑣start in the first layer.
Every vertex not in the last layer has two outgoing edges labeled 0 and 1 leading to the next
layer. An 𝑛-bit input string naturally identifies a path through the graph by reading from left to
right. The program accepts or rejects this string depending on whether the path ends at the
designated “accept vertex” 𝑣 acc in the last layer.
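As a concrete (hypothetical) data representation, an ROBP can be stored as one transition table per layer and evaluated by a single left-to-right pass:

```python
def run_robp(layers, start, accept, x):
    """Evaluate a width-w length-n ROBP on input x.

    layers[i][v] = (v0, v1): from vertex v in layer i, reading bit b
    moves to vertex layers[i][v][b] in layer i + 1."""
    v = start
    for i, b in enumerate(x):
        v = layers[i][v][b]
    return 1 if v == accept else 0

# Width-2 length-2 program computing the XOR of its two input bits:
# vertex 0 means "parity so far is 0", vertex 1 means "parity is 1".
xor_layers = [{0: (0, 1), 1: (1, 0)}, {0: (0, 1), 1: (1, 0)}]
print(run_robp(xor_layers, start=0, accept=1, x=[1, 0]))   # 1
print(run_robp(xor_layers, start=0, accept=1, x=[1, 1]))   # 0
```

The read-once restriction is visible in the code: each bit of x is consumed exactly once, in order.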
    Recall that BPL and RL are the classes of languages that can be decided by randomized
log-space algorithms that always halt with two-sided and one-sided error respectively. A
log-space hitting set² for polynomial-width ROBPs would immediately imply L = RL. For our
first result, we show that such a hitting set would also imply L = BPL.

Theorem 1.3. Assume that for every 𝑛 ∈ ℕ, there is a 1/2-hitting set for width-𝑛 length-𝑛 ROBPs that
can be computed in space 𝑂(log 𝑛). Then L = BPL.


1.2    Motivation: Recent work on hitting sets
Theorem 1.3 is especially interesting in light of recent constructions of improved hitting sets for
ROBPs [8, 19, 10]. The best PRG known for polynomial-width ROBPs is still Nisan’s PRG [25],
which has seed length
                                   𝑂(log^2 𝑛 + log 𝑛 log(1/𝜖)) .

Until recently, Nisan’s PRG also provided the best hitting set for polynomial-width ROBPs.
Using sophisticated and novel techniques, Braverman, Cohen, and Garg obtained a hitting set
with space complexity
                                                Õ(log^2 𝑛 + log(1/𝜖)) ,

which is an improvement when 𝜖 is very small [8].
   Actually, Braverman, Cohen, and Garg constructed something better than a hitting set, called
a weighted pseudorandom generator (WPRG) or alternatively a pseudorandom pseudodistribution
generator.

Definition 1.4 ([8]). Let ℱ be a class of functions 𝑓 : {0, 1}^𝑛 → {0, 1}. An 𝜖-WPRG for ℱ is a
pair (𝐺, 𝜌), where 𝐺 : {0, 1}^𝑠 → {0, 1}^𝑛 and 𝜌 : {0, 1}^𝑠 → ℝ, such that for every 𝑓 ∈ ℱ ,

                                     |𝔼[ 𝑓 ] − 𝔼_{𝑋∈{0,1}^𝑠}[ 𝑓 (𝐺(𝑋)) · 𝜌(𝑋)]| ≤ 𝜖 .

   ¹Because space-bounded algorithms only have read-once access to their random bits, it does not seem possible to
adapt the existing derandomizations of BPP using a hitting set to the BPL case.
   ²When we say “a log-space hitting set,” we mean a family of hitting sets 𝐻𝑛 ⊆ {0, 1}^𝑛 such that given input 𝑛 ∈ ℕ,
the set 𝐻𝑛 can be enumerated in space 𝑂(log 𝑛). For such a family, |𝐻𝑛 | ≤ poly(𝑛).



   A WPRG can be used to estimate 𝔼[ 𝑓 ] to within ±𝜖, provided that 𝐺 and 𝜌 can both be
computed efficiently. The concept of a WPRG generalizes the concept of a PRG, which is the
special case 𝜌(𝑥) ≡ 1. In turn, if (𝐺, 𝜌) is an 𝜖-WPRG, then the image of 𝐺 is an 𝜖′-hitting set for
every 𝜖′ > 𝜖. So a WPRG is intermediate between a hitting set and a genuine PRG.
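Concretely, the estimator derived from a WPRG is a weighted version of the PRG average (toy sketch; the pair (G, rho) below is a placeholder):

```python
from itertools import product

def estimate_via_wprg(f, G, rho, s):
    """A^f = 2^{-s} * sum over seeds x of f(G(x)) * rho(x)."""
    seeds = list(product([0, 1], repeat=s))
    return sum(f(G(x)) * rho(x) for x in seeds) / len(seeds)

# With rho = 1 this collapses to an ordinary PRG estimate; the extra
# freedom of a WPRG is that rho may take values outside [0, 1].
G = lambda x: x * 2
rho = lambda x: 1.0
print(estimate_via_wprg(lambda y: y[0], G, rho, 2))   # 0.5
```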
   After Braverman, Cohen, and Garg’s work [8], Hoza and Zuckerman gave a simpler
construction of an 𝜖-hitting set for polynomial-width ROBPs, with the slightly improved space
complexity 𝑂(log^2 𝑛 + log(1/𝜖)) [19]. Their construction is weaker in that it does not provide a
WPRG. Theorem 1.3 bridges the gap between the two concepts somewhat: by Theorem 1.3, any
generic hitting set can be used for two-sided derandomization, which was the main strength of
a WPRG over a hitting set in the first place.
   (Independently of our work, Chattopadhyay and Liao gave an improved WPRG construction
with seed length Õ(log^2 𝑛) + 𝑂(log(1/𝜖)) [10]. Further WPRG constructions were devised after
the preliminary version of our work [11] appeared, leading to a seed length of 𝑂(log^2 𝑛 +
log(1/𝜖)) [12, 28, 17], which matches the parameters of Hoza and Zuckerman’s hitting set for
polynomial-width ROBPs [19].)

1.3   The constant-width setting
However, there is a weakness of Theorem 1.3. A PRG or a WPRG would provide a black-box
derandomization, whereas the algorithm of Theorem 1.3 is not black-box. This weakness is
especially acute when we consider the constant-width case. Given a constant-width ROBP 𝑓
directly as input, it is trivial to compute 𝔼[ 𝑓 ] with high accuracy, so the algorithm of Theorem 1.3
is meaningless. Nevertheless, constant-width ROBPs can compute many interesting functions,
and it is a major open challenge to design improved PRGs, WPRGs, or hitting sets for
constant-width ROBPs. (For width 2, optimal PRGs are known [7]. For width 3, the current best
PRG has seed length Õ(log 𝑛 log(1/𝜖)) [24]. The best hitting sets for width 3 are superior, with
space complexity Õ(log(𝑛/𝜖)) for small 𝜖 [16] or 𝑂(log 𝑛) for large enough 𝜖 [29]. For width 4,
the state of the art is simply the best results for polynomial-width ROBPs mentioned above
[25, 19, 17].)
    To address this weakness of Theorem 1.3, we abstract the “black-box” feature of PRGs and
WPRGs in the following definition.

Definition 1.5. Let ℱ be a class of functions 𝑓 : {0, 1}^𝑛 → {0, 1}. A deterministic 𝜖-sampler for ℱ
is a deterministic oracle algorithm 𝐴 that outputs a real number 𝐴^𝑓 such that for every 𝑓 ∈ ℱ ,

                                            |𝐴^𝑓 − 𝔼[ 𝑓 ]| ≤ 𝜖 .

    The concept of a deterministic sampler generalizes that of a WPRG, because given a WPRG
(𝐺, 𝜌) with seed length 𝑠, one can set 𝐴^𝑓 = 2^{−𝑠} Σ_{𝑥∈{0,1}^𝑠} 𝑓 (𝐺(𝑥)) 𝜌(𝑥). In the other direction,
deterministic samplers imply hitting sets.

Proposition 1.6. Identify 0 with the constant 0 function on {0, 1}^𝑛 , and assume 0 ∈ ℱ . Let 𝐴 be a
deterministic 𝜖-sampler for ℱ , and let 𝐻 ⊆ {0, 1}^𝑛 be the set of points where 𝐴^0 queries its oracle. Then
for every 𝜖′ > 2𝜖, 𝐻 is an 𝜖′-hitting set for ℱ .


Proof. Let 𝑓 ∈ ℱ satisfy 𝔼[ 𝑓 ] > 2𝜖. Since |𝐴^0 − 0| ≤ 𝜖 and |𝐴^𝑓 − 𝔼[ 𝑓 ]| ≤ 𝜖, we have 𝐴^0 ≠ 𝐴^𝑓 .
Therefore, 𝐴^𝑓 must query 𝑓 at some point 𝑥 ∈ 𝑓^{−1}(1). The first such query must be at a point 𝑥 ∈ 𝐻. □
     The deterministic sampler model captures some prior derandomization algorithms. In
particular, all known derandomizations of BPP using a hitting set [3, 9, 4, 13], including our
new derandomization in Section 5, can be interpreted as constructing deterministic samplers for
circuits. For our second result, we prove the analogous reduction for constant-width ROBPs,
i. e., we show that one can generically “upgrade” log-space hitting sets for such programs into
log-space deterministic samplers. (See Figure 1.)

[Figure 1 diagram: nodes “Distinguishing 𝔼[ 𝑓 ] = 0 vs. 𝔼[ 𝑓 ] ≥ 𝜖,” “Hitting set,” “Estimating
𝔼[ 𝑓 ] ± 𝜖,” “Deterministic sampler,” “WPRG,” and “PRG.” The dashed arrows, labeled
Theorem 1.3 and Theorem 1.7 (const. width), point from the hitting set node to the estimating
and deterministic sampler nodes, respectively.]


Figure 1: The relationships between different derandomization goals. The solid arrows are
implications that are immediate from the definitions and hold for essentially any class ℱ
(possibly with some loss in 𝜖). The dashed arrows are theorems in this paper, holding for ROBPs
specifically.

Theorem 1.7. Assume that for every constant 𝑤, for all 𝑛 ∈ ℕ, there is a 1/2-hitting set for width-𝑤
length-𝑛 ROBPs that can be computed in space 𝑂(log 𝑛). Then for every constant 𝑤, for all 𝑛 ∈ ℕ and all
𝜖 > 0, there is a deterministic 𝜖-sampler for width-𝑤 length-𝑛 ROBPs that runs in space 𝑂(log(𝑛/𝜖)).
    More generally, given a 1/2-hitting set for width-𝑤′ length-𝑛′ ROBPs for suitable values
𝑤′ = 𝑂(𝑤^2) and 𝑛′ = 𝑂(𝑛^2/𝜖), we show how to construct a deterministic 𝜖-sampler for width-𝑤
length-𝑛 ROBPs. If the hitting set has space complexity 𝑠, then our deterministic sampler has
space complexity 𝑂(𝑠 + 𝑤 log(𝑛/𝜖)). See Theorem 3.1 and Corollary 3.7.


    The proof of Theorem 1.7 uses different techniques than that of Theorem 1.3. Because of the
factor of 𝑤 in our deterministic sampler’s space complexity, our sampler becomes meaningless
when 𝑤 is large. Thus, Theorem 1.3 and Theorem 1.7 are incomparable.
    We also obtain a new unconditional deterministic sampler. When 𝜖 is moderate, the best
deterministic sampler for constant-width ROBPs is simply from Nisan’s PRG [25], which gives a
sampler with space complexity 𝑂(log^2 𝑛 + log 𝑛 log(1/𝜖)). When 𝜖 is small, the best prior
deterministic sampler for constant-width ROBPs was from Braverman, Cohen, and Garg’s
WPRG [8] (space complexity Õ(log^2 𝑛 + log(1/𝜖))). The concurrent work by Chattopadhyay and
Liao [10] gives a slightly better WPRG, and hence a slightly better deterministic sampler (space
complexity Õ(log^2 𝑛) + 𝑂(log(1/𝜖))). By applying the reduction underlying Theorem 1.7 to the
hitting set of Hoza and Zuckerman [19], we achieve a slightly better bound.
Theorem 1.8 (Unconditional sampler). For every constant 𝑤, for all 𝑛 ∈ ℕ and all 𝜖 > 0, there is a
deterministic 𝜖-sampler for width-𝑤 length-𝑛 ROBPs running in space 𝑂(log^2 𝑛 + log(1/𝜖)).
    In light of Theorem 1.8, when it comes to deterministic samplers, there is now a slight gap
between the state of the art for polynomial-width ROBPs vs. the state of the art for width-𝑤
ROBPs with 𝑤 a large constant. In other words, Theorem 1.8 is a case where we can take
advantage of narrowness. There is no such gap when it comes to PRGs, WPRGs, or hitting sets.
(As mentioned previously, work subsequent to this paper constructed WPRGs for
polynomial-width ROBPs with seed length 𝑂(log^2 𝑛 + log(1/𝜖)) [12, 28, 17], generalizing Theorem 1.8 and
closing the aforementioned gap between polynomial-width ROBPs and constant-width ROBPs.)

1.4    Implications of a stronger reduction
Theorem 1.7 raises the question of whether we can go even further and upgrade any hitting set
into a genuine PRG. In the time-bounded setting, this is indeed possible via the “hardness vs.
randomness” paradigm.3 Also, in the context of low-degree polynomials, Bogdanov showed
how to convert any hitting set with a certain density property into a PRG [6]. Can a similar
reduction be proven for small-space models?
    We focus on the setting of constant-width ROBPs over a large alphabet. (An ROBP over the
alphabet Σ computes a function 𝑓 : Σ^𝑛 → {0, 1}; each vertex not in the last layer has |Σ| outgoing
edges labeled with the symbols in Σ.) We prove that if for every explicit hitting set in this
setting, there is an explicit PRG with similar parameters, then there is in fact an explicit PRG for
constant-width binary ROBPs with seed length 𝑂(log^{1+𝛼} 𝑛), where 𝛼 > 0 is an arbitrarily small
constant. See Theorem 4.3 for the precise statement.
    Our result is similar to a theorem by Hoza and Umans [18]. Like us, Hoza and Umans
showed that if PRGs are equivalent to a seemingly weaker notion, then the equivalence itself can
be used to construct a good PRG. Hoza and Umans focused on the distinction between PRGs
and non-black-box derandomization, whereas we focus on the distinction between PRGs and
hitting sets.
   ³If for every 𝑛 there is a hitting set for size-𝑛 circuits computable in poly(𝑛) time, then there is a language in E
that requires circuits of size 2^{Ω(𝑛)}. A major achievement in complexity theory was to show that assuming such a
language exists, for every 𝑛, there is a polynomial-time logarithmic-seed PRG for size-𝑛 circuits [21].



1.4.1   Interpretation

Like any conditional theorem, Theorem 4.3 has both a positive and a negative interpretation.⁴
According to the negative interpretation, Theorem 4.3 shows that it would be difficult to establish
a general reduction from PRGs to hitting sets. After all, it’s as difficult as constructing a good
PRG for constant-width ROBPs, which is a challenge that researchers have been struggling with
for decades. In this sense, Theorem 4.3 provides an “excuse” for the fact that Theorem 1.3 and
Theorem 1.7 do not provide genuine PRGs.
    We feel that the negative interpretation is more realistic, but there is also a sensible positive
interpretation. According to the positive interpretation, our work provides a new approach to
constructing improved PRGs or hitting sets for constant-width ROBPs. One “merely” needs to
bridge the gap between deterministic samplers and PRGs. This could be done in one of two
ways. One could improve Theorem 1.7 so that it concludes with a PRG instead of a deterministic
sampler. Alternatively, one could improve the construction of Theorem 4.3 so that rather than
relying on the equivalence of hitting sets and PRGs, it merely relies on the equivalence of hitting
sets and deterministic samplers. (In exchange, presumably the conclusion would merely be a
deterministic sampler rather than a true PRG, but that would still be a breakthrough.)


1.5     Overview of techniques
Let us first fix some notation. Let 𝑈𝑛 denote the uniform distribution over {0, 1}^𝑛 . For two
strings 𝑥, 𝑦, let 𝑥 ◦ 𝑦 denote the concatenation of 𝑥 with 𝑦. Suppose an ROBP 𝑓 is clear from
context. If 𝑢 and 𝑣 are vertices, let 𝑝 𝑢→𝑣 be the probability that a random walk starting at 𝑢
reaches 𝑣. We use the shorthand 𝑝 →𝑣 = 𝑝 𝑣start →𝑣 and 𝑝 𝑢→ = 𝑝 𝑢→𝑣acc . We use 𝑉𝑖 to denote the set
of vertices in the 𝑖-th layer of the ROBP, where 𝑖 ∈ {0, 1, . . . , 𝑛}.


1.5.1   Techniques for Theorem 1.3

We begin by outlining the proof of Theorem 1.3 (on derandomizing BPL). To derandomize BPL,
it suffices to show that given a width-𝑛 length-𝑛 ROBP 𝑓 , one can estimate 𝔼[ 𝑓 ] to within a
small additive error in log space. We do this using a hitting set 𝐻 for width-𝑛^𝑐 length-𝑛^𝑐
ROBPs, where 𝑐 is a large enough constant.
    Each 𝑥 ∈ 𝐻 is a string of length 𝑛^𝑐 . We think of 𝑥 as a list of many shorter strings. Specifically,
for every vertex 𝑣 in 𝑓 , the string 𝑥 provides poly(𝑛) “sample inputs” associated with 𝑣. We
compute the fraction b 𝑝 →𝑣 of those sample inputs that lead to 𝑣. The hope is that

                                            ∀𝑣, 𝑝̂→𝑣 ≈ 𝑝→𝑣 .                                           (1.1)

Of course we cannot directly verify Equation (1.1), since we do not know the values 𝑝→𝑣 . Instead,
our algorithm looks for an 𝑥 ∈ 𝐻 such that the estimates 𝑝̂→𝑣 are locally consistent, i. e., for every

   ⁴Throughout this discussion, we will ignore the issue of alphabet size, to simplify matters. The proof of
Theorem 1.7 does generalize well to the large-alphabet case.



𝑖 ∈ [𝑛] and every 𝑣 ∈ 𝑉𝑖 ,
                                      𝑝̂→𝑣 ≈ Σ_{𝑢∈𝑉𝑖−1} 𝑝̂→𝑢 · 𝑝𝑢→𝑣 .

Having found such an 𝑥 ∈ 𝐻, we output the corresponding value 𝑝̂→𝑣acc .
    To establish the correctness of our algorithm, we must show two assertions. First, if the
estimates 𝑝̂→𝑣 pass the local consistency test, then 𝔼[ 𝑓 ] ≈ 𝑝̂→𝑣acc . Second, there is always a string
𝑥 ∈ 𝐻 that passes the local consistency test.
    To show the first assertion, we bound Σ_{𝑣∈𝑉𝑖} |𝑝̂→𝑣 − 𝑝→𝑣 | by induction on 𝑖. Because of
the structure of the ROBP, the error accumulates mildly, only growing by a factor that is
approximately the size of 𝑓 .
    For the second assertion, we use the hitting property of 𝐻. At first glance, it might seem that
the assertion is immediate. After all, a random 𝑥 certainly passes the local consistency test with
high probability, and the local consistency test can be computed in small space. Unfortunately,
however, that computation involves reading the bits of 𝑥 multiple times, whereas 𝐻 is merely
guaranteed to hit read-once branching programs.
    To deal with this issue, we notice that if 𝑥 were chosen at random, then with high probability,
it would satisfy Equation (1.1). Furthermore, there exists a width-𝑛^𝑐 length-𝑛^𝑐 ROBP 𝑓′
that determines whether its input 𝑥 satisfies Equation (1.1). The values 𝑝→𝑣 are all hard-coded
into 𝑓′. There is no need to algorithmically construct 𝑓′; the mere fact that it exists implies the
existence of an 𝑥 ∈ 𝐻 that satisfies Equation (1.1). Satisfying Equation (1.1) readily implies that
𝑥 also passes the local consistency test, completing the proof of the second assertion.
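To make the outline concrete, here is a toy version of the local consistency test (our simplified reconstruction, not the paper's actual algorithm; in the real algorithm the estimates come from the sample inputs encoded in x ∈ H):

```python
def locally_consistent(layers, estimates, tol):
    """Check that for every layer i >= 1 and vertex v, the estimate
    hat_p(i, v) is close to the sum over u of hat_p(i-1, u) * p(u -> v),
    where the one-step probability p(u -> v) is 0, 1/2, or 1,
    read directly off the edges of the program."""
    n = len(layers)
    for i in range(1, n + 1):
        targets = set(v for u in layers[i - 1] for v in layers[i - 1][u])
        for v in targets:
            pred = sum(
                estimates.get((i - 1, u), 0.0)
                * sum(1 for b in (0, 1) if layers[i - 1][u][b] == v) / 2
                for u in layers[i - 1]
            )
            if abs(estimates.get((i, v), 0.0) - pred) > tol:
                return False
    return True

# Demo on a width-2 XOR program: the exact probabilities pass the test.
xor_layers = [{0: (0, 1), 1: (1, 0)}, {0: (0, 1), 1: (1, 0)}]
exact = {(0, 0): 1.0, (1, 0): 0.5, (1, 1): 0.5, (2, 0): 0.5, (2, 1): 0.5}
print(locally_consistent(xor_layers, exact, tol=1e-9))   # True
```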

1.5.2   Techniques for Theorem 1.7
The proof of Theorem 1.7 (on deterministic samplers) uses different techniques. Let 𝑓 be a
constant-width ROBP. To estimate 𝔼[ 𝑓 ], we use the assumed hitting set 𝐻 and queries to 𝑓 to
infer an approximation of the graph structure of 𝑓 in some sense.
    In more detail, let 𝐻∗ be the set of all prefixes of strings in 𝐻, and let 𝐻𝑖 = 𝐻∗ ∩ {0, 1}^𝑖 . We
use the hitting set 𝐻 in two different ways. First, we identify each prefix 𝑥 ∈ 𝐻∗ with the vertex
that is reached when 𝑓 reads 𝑥 (so prefixes of length 𝑖 correspond to vertices in layer 𝑖). In this
way, we can think of 𝑓 as a program over the vertex set 𝐻∗ , which we have direct access to.
    Second, we use 𝐻 to merge some of these “vertices.” We define an equivalence relation on
{0, 1}^𝑖 by the rule
                           𝑥 ∼ 𝑦 ⇐⇒ ∀𝑧 ∈ 𝐻𝑛−𝑖 , 𝑓 (𝑥 ◦ 𝑧) = 𝑓 (𝑦 ◦ 𝑧) .
Note that equivalence can be tested by making queries to 𝑓 . If two “vertices” 𝑥, 𝑦 ∈ 𝐻𝑖 are
equivalent, then we think of them as a single “merged” vertex. (This merging operation is
similar to a randomized learning algorithm by Gopalan, Klivans, and Meka [15]. Note that their
algorithm is not space-efficient.) The number of distinct equivalence classes – hence the number
of vertices per layer after merging – is bounded by the width of 𝑓 .
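The merge test itself needs only oracle access to f (illustrative sketch; strings are modeled as tuples so that concatenation is tuple addition):

```python
def equivalent(f, x, y, H_suffixes):
    """x ~ y iff no suffix z from the hitting set distinguishes the two
    length-i prefixes, i.e., f(x + z) == f(y + z) for every z."""
    return all(f(x + z) == f(y + z) for z in H_suffixes)

# If f ignores its first bit, the two length-1 prefixes merge:
print(equivalent(lambda s: s[1], (0,), (1,), [(0,), (1,)]))         # True
# For XOR they are distinguished by every suffix:
print(equivalent(lambda s: s[0] ^ s[1], (0,), (1,), [(0,), (1,)]))  # False
```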
    Given the (approximate) graph structure of 𝑓 , we work our way backward through the
program starting with the final layer. We inductively compute the acceptance probability 𝑝𝑣→
from each vertex 𝑣 using the formula 𝑝𝑣→ = (1/2)(𝑝𝑣0→ + 𝑝𝑣1→), where 𝑣0 and 𝑣1 are the two


out-neighbors of 𝑣. This allows us to estimate 𝔼[ 𝑓 ], i. e., the acceptance probability from the
start vertex 𝑝 𝑣start → .
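The backward pass is standard dynamic programming once a graph is in hand; on an explicitly given program it looks like this (a sketch of the exact computation, shown for intuition only, since in the actual algorithm the "vertices" are equivalence classes of prefixes in H∗):

```python
def acceptance_probs(layers, accept):
    """Compute p(v ->) for every vertex by working backward:
    p(v ->) = (p(v0 ->) + p(v1 ->)) / 2, anchored at the last layer."""
    n = len(layers)
    last = set(v for u in layers[n - 1] for v in layers[n - 1][u])
    p = {(n, v): (1.0 if v == accept else 0.0) for v in last}
    for i in range(n - 1, -1, -1):
        for u, (v0, v1) in layers[i].items():
            p[(i, u)] = (p[(i + 1, v0)] + p[(i + 1, v1)]) / 2
    return p

# Width-2 XOR program again: E[f] = p from the start vertex = 1/2.
xor_layers = [{0: (0, 1), 1: (1, 0)}, {0: (0, 1), 1: (1, 0)}]
print(acceptance_probs(xor_layers, accept=1)[(0, 0)])   # 0.5
```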
    For the analysis, we must contend with two types of errors. First, there might be some
vertices in 𝑓 that are never visited when 𝑓 reads strings in 𝐻. Such vertices are effectively lost
when we think of 𝑓 as a program over the vertex set 𝐻∗ . Intuitively, these errors are tolerable,
because the hitting property of 𝐻 implies that for each such vertex 𝑣, the probability of reaching
𝑣 is small, say 𝑝 →𝑣 ≤ 1/poly(𝑛), and hence 𝑣 contributes little to 𝔼[ 𝑓 ]. Second, there might be
pairs of distinct vertices 𝑢, 𝑣 in some layer of 𝑓 such that starting at 𝑢 and reading a string 𝑥 ∈ 𝐻
always leads to the same outcome (accept or reject) as starting at 𝑣 and reading the same string 𝑥.
Such vertices are effectively merged by our algorithm, even though they are distinct. Intuitively,
these errors are also tolerable, because the hitting property of 𝐻 implies that 𝑝 𝑢→ ≈ 𝑝 𝑣→ for
such pairs.
    Let us highlight one of the challenges in implementing the plan described above. In our
approximation of the graph structure of 𝑓 , the vertex set is 𝐻∗ (modulo the equivalence relation
∼), but what exactly are the edges? Intuitively, if we are at a vertex 𝑥 ∈ 𝐻𝑖 and we read a bit
𝑏 ∈ {0, 1}, we would like to move to the vertex 𝑥 ◦ 𝑏. The main difficulty is that such a rule is not
well-defined, because 𝑥 ∼ 𝑦 does not necessarily imply 𝑥 ◦ 𝑏 ∼ 𝑦 ◦ 𝑏. We overcome this difficulty
by showing that it suffices to use whichever representative 𝑥 ∈ 𝐻𝑖 maximizes the acceptance
probability.


1.5.3   Techniques for Theorem 4.3

Recall that to prove Theorem 4.3, we must (conditionally) construct a PRG with seed length
𝑂(log^{1+𝛼} 𝑛), where 𝛼 > 0 is an arbitrarily small constant. For simplicity, in this overview, we
will focus on the case 𝛼 = 1/2, i. e., seed length 𝑂(log^{3/2} 𝑛). Recall also that we are focusing on
the constant-width case.
    The starting point of the construction is the INW PRG, which 𝜖-fools constant-width ROBPs
over the alphabet {0, 1}^𝑡 with seed length 𝑂(𝑡 + log(𝑛/𝜖) log 𝑛) [20]. (Nisan’s PRG [25] does not
achieve the same optimal dependence on 𝑡.) Next, we present a reduction, showing how to
convert a PRG with moderate error into a hitting set with very small threshold (Theorem 4.4).
Hoza and Zuckerman gave a similar reduction [19], but their reduction only applies to binary
ROBPs (the case 𝑡 = 1). Our reduction is based on a more sophisticated variant of a key lemma
in Hoza and Zuckerman’s work [19].
    Applying our new reduction to the INW generator, we unconditionally obtain an improved
hitting set. The best previous hitting sets had space complexity 𝑂(𝑡 + log^2 𝑛 + log(1/𝜖) log 𝑛)
[20] or 𝑂(𝑡 log 𝑛 + log^2 𝑛 + log(1/𝜖)) [19]. Our new hitting set (Corollary 4.8) achieves the “best
of both worlds,” with space complexity 𝑂(𝑡 + log^2 𝑛 + log(1/𝜖)).
    The next step in the proof of Theorem 4.3 is to apply the assumption of Theorem 4.3,
converting our hitting set into a PRG. The final step is to use traditional “seed recycling”
techniques to trade the excellent dependence on 𝜖 for an improved dependence on 𝑛. Briefly,
starting with a length-𝑛 ROBP over the alphabet {0, 1}^𝑡 , we first use a randomized sampler [14]
to reduce the alphabet size to poly(𝑛). Then we divide our length-𝑛 ROBP of interest into blocks

of length 𝑚 = 2^{√(log 𝑛)} . We can fool each chunk to within error 1/poly(𝑛) using a seed of length
𝑂(log^2 𝑚 + log 𝑛) = 𝑂(log 𝑛). Using the randomized sampler again, this allows us to effectively
pay 𝑂(log 𝑛) truly random bits and reduce the length of the branching program by a factor of 𝑚.
After repeating this process √(log 𝑛) times, the length is reduced to a constant, and we have paid
a total of 𝑂(log^{3/2} 𝑛) truly random bits.
    (To achieve seed length 𝑂(log^{1+𝛼} 𝑛), we start the whole process over again and iterate roughly
1/𝛼 times. This iterative strategy is similar to the work of Hoza and Umans [18], but the specific
reductions are different.)
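For 𝛼 = 1/2, the seed-length accounting in the sketch above can be written out explicitly:

```latex
% One round: fool blocks of length m = 2^{\sqrt{\log n}} to error 1/poly(n),
% at a cost of
\log^2 m + \log n = \bigl(\sqrt{\log n}\bigr)^2 + \log n = O(\log n)
% truly random bits. Each round divides the length by m, so the number of
% rounds is
\log_m n = \frac{\log n}{\sqrt{\log n}} = \sqrt{\log n},
% for a total seed length of
\sqrt{\log n} \cdot O(\log n) = O\bigl(\log^{3/2} n\bigr).
```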

1.6     Related work
We have already referenced most of the work related to this paper, such as work on derandomizing
BPP using a hitting set [3, 9, 4, 13]. However, a couple of additional papers deserve mention.

1.6.1 BPL ⊆ ZP∗ L
Our derandomization of BPL given a hitting set is similar to Nisan’s unconditional proof that
BPL ⊆ ZP∗ L [26]. To estimate the acceptance probability of a width-𝑛 length-𝑛 ROBP 𝑓 , Nisan,
like us, interprets a string 𝑥 ∈ {0, 1}^{poly(𝑛)} as a list of sample inputs, which he uses to compute
estimates of 𝑝→𝑣 for each vertex 𝑣. Nisan’s algorithm picks 𝑥 at random, and then in a similar
fashion as our algorithm, performs certain “local tests” at each vertex to verify that the sample
inputs are trustworthy. Nisan’s local tests can be computed in small space given two-way access
to 𝑥, and passing the local tests implies that the estimates are close to the corresponding true
probabilities. Our local consistency test also satisfies these properties, and indeed, one can
obtain an alternative proof that BPL ⊆ ZP∗ L from our analysis. However, a technical point is
that we use fresh samples for each vertex, whereas Nisan uses one set of 𝑛-bit sample inputs
for all the vertices. This crucial distinction is how we are able to ensure the existence of a
polynomial-width read-once branching program that verifies Equation (1.1). Unfortunately, using
fresh samples breaks Nisan’s local tests, hence our new local consistency test.

1.6.2   Deterministically simulating BPL with very low error
Recall that the current best unconditional hitting sets for polynomial-width ROBPs [8, 19, 10,
12, 28, 17] are superior to the state-of-the-art PRGs [25] in the small-𝜖 regime. In this work, we
present an algorithm for estimating the acceptance probability of a BPL algorithm given a hitting
set for ROBPs. Taken together, these two facts draw attention to the problem of estimating the
acceptance probability of a BPL algorithm to within ±𝜖 in the small-𝜖 regime.
    Unfortunately, our work does not imply an improved algorithm for this problem.⁵ The
good news is that Ahmadinejad, Kelner, Murtagh, Peebles, Sidford, and Vadhan recently
tackled this same problem with different techniques [1]. They show how to deterministically
    5To estimate the acceptance probability of a BPL algorithm to within ±𝜖, our algorithm relies on a 1/2-hitting set
for ROBPs of length poly(𝑛/𝜖) rather than an 𝜖-hitting set for ROBPs of length poly(𝑛), so we are not able to get any
advantage from the recent unconditional hitting set constructions [8, 19, 10, 12, 28, 17].


                         T HEORY OF C OMPUTING, Volume 18 (21), 2022, pp. 1–32                                     10
                   H ITTING S ETS G IVE T WO -S IDED D ERANDOMIZATION OF S MALL S PACE

estimate the acceptance probability of a BPL algorithm to within ±𝜖 in space 𝑂(log3/2 𝑛 +
log 𝑛 log log(1/𝜖)) [1].


1.7     Outline of this paper
In Section 2, we present our derandomization of BPL given a hitting set for polynomial-width
ROBPs. In Section 3, we present our deterministic sampler for constant-width ROBPs given a
hitting set. In Section 4, we present our theorem on the limitations of this line of work. Finally,
in Section 5, we present our derandomization of BPP given a hitting set for polynomial-size
circuits.


2     Derandomizing BPL given a hitting set
In this section, we show that the acceptance probability of an arbitrary polynomial-width ROBP
can be approximated within a small bias in small space, given a certain hitting set. Theorem 1.3
will follow from this.

Theorem 2.1. Assume there is a 1/2-hitting set 𝐻 for width-𝑤′ length-𝑛′ ROBPs that can be computed
in space 𝑠. Then the acceptance probability of a given width-𝑤 length-𝑛 ROBP 𝑓 can be approximated
to within ±𝜖 in space 𝑂(𝑠 + log(𝑤𝑛/𝜖)). Here

    𝑤′ = 9⌈𝑤³𝑛² log(𝑤𝑛)/𝜖²⌉ ,    𝑛′ = 5⌈𝑤³𝑛⁴ log(𝑤𝑛)/𝜖²⌉ .

    Strictly speaking, Theorem 2.1 ought to be phrased in terms of families of ROBPs, to make the
space bounds meaningful. That is, we assume there is an algorithm that constructs a 1/2-hitting
set for width-𝑤 length-𝑛 ROBPs, given 𝑤 and 𝑛 as inputs, running in space 𝑠(𝑤, 𝑛). Then given
inputs 𝑓 , 𝜖, Theorem 2.1 should be understood to say that we can estimate 𝔼[ 𝑓 ] to within ±𝜖 in
space 𝑂(𝑠(𝑤′, 𝑛′) + log(𝑤𝑛/𝜖)).
    We are most interested in the case that 𝜖 is a small constant, but we remark that when 𝜖 is very
small, the parameters of Theorem 2.1 could be improved by applying the recent amplification
technique by Ahmadinejad, Kelner, Murtagh, Peebles, Sidford, and Vadhan [1].
    We first give the derandomization and then give the analysis.


2.1     Derandomization based on a local consistency test
We interpret each 𝑥 ∈ {0, 1}^{𝑛′} as a concatenation of 𝑤𝑛 “segments” – one segment for each
vertex in 𝑉1 ∪ · · · ∪ 𝑉𝑛 . For each 𝑖 ∈ [𝑛] and each 𝑣 ∈ 𝑉𝑖 , the segment corresponding to 𝑣
consists of a concatenation of 𝑡 “sample strings” of length 𝑖, where 𝑡 is a power of two satisfying
𝑡 ≥ 4(𝑤𝑛/𝜖)² log(𝑤𝑛). Let 𝑝̂→𝑣 (𝑥) be the fraction of strings that lead to 𝑣 from the start vertex,
among these 𝑡 sample strings for 𝑣. When 𝑥 is clear, we simply write 𝑝̂→𝑣 . Also, for 𝑣 ∈ 𝑉0 ,
we let 𝑝̂→𝑣 = 1 if 𝑣 = 𝑣start and 𝑝̂→𝑣 = 0 otherwise.


    The derandomization conducts a local consistency test Test : {0, 1}^{𝑛′} → {0, 1} for every
𝑥 ∈ 𝐻 as follows. For all 𝑖 ∈ [𝑛] and all 𝑣 ∈ 𝑉𝑖 , check whether

    |𝑝̂→𝑣 − Σ_{𝑢∈𝑉𝑖−1} 𝑝̂→𝑢 · 𝑝𝑢→𝑣| ≤ (1 + Σ_{𝑢∈𝑉𝑖−1} 𝑝𝑢→𝑣) 𝜖′ ,    (2.1)

where 𝜖′ = 𝜖/(2𝑤𝑛). If 𝑥 passes the checks for all 𝑣, then Test(𝑥) = 1; otherwise Test(𝑥) = 0.
    Finally, we find an 𝑥 ∈ 𝐻 that passes Test and output 𝑝̂→𝑣acc (𝑥) as the approximation of 𝔼[ 𝑓 ].
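To make the procedure concrete, here is a Python sketch of the empirical estimates 𝑝̂→𝑣 and the local consistency test on a toy ROBP. It is an illustration only: the function names are ours, the ROBP is a hard-coded parity program, and truly random sample strings stand in for the segments of a hitting-set string 𝑥 ∈ 𝐻.

```python
import random

# Toy width-2, length-3 ROBP computing parity: delta[i][(u, b)] = next state.
w, n = 2, 3
delta = [{(u, b): u ^ b for u in range(w) for b in (0, 1)} for _ in range(n)]

def p_step(i, u, v):
    """One-step transition probability p_{u->v} in layer i (0, 1/2, or 1)."""
    return sum(delta[i][(u, b)] == v for b in (0, 1)) / 2

def run(sample):
    """Walk the ROBP on a bit string from the start vertex; return the end state."""
    s = 0
    for i, b in enumerate(sample):
        s = delta[i][(s, b)]
    return s

t = 4096  # samples per vertex; a power of two, so c/t needs no rounding

def p_hat(samples, i, v):
    """Empirical probability of reaching vertex v in layer i, from v's own samples."""
    if i == 0:
        return 1.0 if v == 0 else 0.0
    return sum(run(x) == v for x in samples[(i, v)]) / t

def local_test(samples, eps_prime):
    """Check inequality (2.1) at every vertex of every layer."""
    for i in range(1, n + 1):
        for v in range(w):
            lhs = abs(p_hat(samples, i, v)
                      - sum(p_hat(samples, i - 1, u) * p_step(i - 1, u, v)
                            for u in range(w)))
            rhs = (1 + sum(p_step(i - 1, u, v) for u in range(w))) * eps_prime
            if lhs > rhs:
                return False
    return True

# Fresh samples for each vertex (here truly random, standing in for a hitting-set string).
random.seed(0)
samples = {(i, v): [[random.getrandbits(1) for _ in range(i)] for _ in range(t)]
           for i in range(1, n + 1) for v in range(w)}
if local_test(samples, eps_prime=0.05):
    estimate = p_hat(samples, n, 1)  # acceptance probability estimate
```

With 𝑡 = 4096 samples per vertex the test passes with overwhelming probability, and the final estimate is close to the true acceptance probability 1/2 of the parity program.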

2.2   Analysis
We now define the “sample verification” function 𝑓′ of 𝑓 . For each 𝑥 ∈ {0, 1}^{𝑛′}, we set 𝑓′(𝑥) = 1
if and only if for every vertex 𝑣 in 𝑓 ,

    |𝑝̂→𝑣 − 𝑝→𝑣| ≤ 𝜖′ .    (2.2)

We stress that our derandomization algorithm does not require computing 𝑓′; we define 𝑓′ only
for the sake of analysis.

Lemma 2.2. 𝑓′ can be computed by a width-𝑤′ length-𝑛′ ROBP.

Proof. For each 𝑖 ∈ [𝑛] and each 𝑣 ∈ 𝑉𝑖 , we construct an ROBP 𝑓′𝑣 which simulates 𝑓 on each
sample string in 𝑣’s segment and counts how many lead to 𝑣. It stores a state of 𝑓 and a counter
value, for a total width of 𝑤 · (𝑡 + 1) and a total length of 𝑖 · 𝑡. 𝑓′𝑣 accepts if and only if the counter
value is in [𝑝→𝑣 𝑡 − (𝜖/2𝑤𝑛)𝑡, 𝑝→𝑣 𝑡 + (𝜖/2𝑤𝑛)𝑡].
    To construct 𝑓′, we take the conjunction of 𝑓′𝑣 over all 𝑣 in 𝑓 . Note that this is a conjunction
of ROBPs over disjoint variables. So 𝑓′ can be computed by an ROBP with width at most
𝑤(𝑡 + 1) + 1 ≤ 9⌈𝑤³𝑛² log(𝑤𝑛)/𝜖²⌉ and length at most 𝑡𝑤 Σ_{𝑖=1}^{𝑛} 𝑖 ≤ 5⌈𝑤³𝑛⁴ log(𝑤𝑛)/𝜖²⌉. □

Lemma 2.3. The acceptance probability of 𝑓′ is at least 1/2.

Proof. By the construction of 𝑓′, for each 𝑣 of 𝑓 , there are 𝑡 uniform random samples. For
each sample string, the probability that it leads to 𝑣 from 𝑣start in 𝑓 is 𝑝→𝑣 . Hence the
expected number of samples leading to 𝑣 from 𝑣start is 𝑝→𝑣 𝑡. So by Hoeffding’s inequality,

    Pr[|𝑝̂→𝑣 𝑡 − 𝑝→𝑣 𝑡| ≥ (𝜖/2𝑤𝑛)𝑡] ≤ 2 · 2^{−2 log(𝑤𝑛)} ≤ 2/(𝑤𝑛)² .

There are 𝑤𝑛 vertices that need to be tested in 𝑓 . (For 𝑣 ∈ 𝑉0 , the estimate 𝑝̂→𝑣 is always
exactly correct.) Thus by a union bound,

    Pr[∀𝑣, |𝑝̂→𝑣 − 𝑝→𝑣| ≤ 𝜖/(2𝑤𝑛)] ≥ 1 − 2/(𝑤𝑛) .

This is at least 1/2 when 𝑛 is at least some large enough constant. So by the definition of 𝑓′, its
acceptance probability is at least 1/2. □
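The parameter arithmetic in this proof can be sanity-checked numerically (the specific values of 𝑤, 𝑛, 𝜖 below are arbitrary): with deviation 𝛿 = 𝜖/(2𝑤𝑛) and 𝑡 the next power of two above 4(𝑤𝑛/𝜖)² log(𝑤𝑛), the exponent 2𝛿²𝑡 is at least 2 log(𝑤𝑛), so the per-vertex failure probability is at most 2/(𝑤𝑛)².

```python
import math

w, n, eps = 10, 100, 0.1          # arbitrary example parameters
delta = eps / (2 * w * n)         # allowed deviation of the empirical fraction
bound = 4 * (w * n / eps) ** 2 * math.log2(w * n)
t = 2 ** math.ceil(math.log2(bound))   # round up to a power of two, as in Section 2.1

exponent = 2 * delta ** 2 * t     # tail bound is at most 2 * 2^(-exponent)
assert exponent >= 2 * math.log2(w * n)
tail = 2 * 2 ** (-exponent)
assert tail <= 2 / (w * n) ** 2   # the per-vertex bound used in the union bound
```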

Lemma 2.4. For every 𝑥 ∈ {0, 1}^{𝑛′}, if 𝑓′(𝑥) = 1 then Test(𝑥) = 1.





Proof. For every 𝑖 ∈ [𝑛] and every 𝑣 ∈ 𝑉𝑖 ,

    𝑝→𝑣 = Σ_{𝑢∈𝑉𝑖−1} 𝑝→𝑢 𝑝𝑢→𝑣 ,    (2.3)

by the structure of ROBPs. So

    |𝑝̂→𝑣 − Σ_{𝑢∈𝑉𝑖−1} 𝑝̂→𝑢 𝑝𝑢→𝑣|
      = |𝑝̂→𝑣 − 𝑝→𝑣 + 𝑝→𝑣 − Σ_{𝑢∈𝑉𝑖−1} 𝑝̂→𝑢 𝑝𝑢→𝑣|
      = |𝑝̂→𝑣 − 𝑝→𝑣 + Σ_{𝑢∈𝑉𝑖−1} 𝑝→𝑢 𝑝𝑢→𝑣 − Σ_{𝑢∈𝑉𝑖−1} 𝑝̂→𝑢 𝑝𝑢→𝑣|    (Equation (2.3))
      ≤ |𝑝̂→𝑣 − 𝑝→𝑣| + Σ_{𝑢∈𝑉𝑖−1} |𝑝→𝑢 − 𝑝̂→𝑢| 𝑝𝑢→𝑣    (Triangle inequality)
      ≤ (1 + Σ_{𝑢∈𝑉𝑖−1} 𝑝𝑢→𝑣) 𝜖′ .    (Equation (2.2)) □




Lemma 2.5. For every 𝑥 ∈ {0, 1}^{𝑛′}, if Test(𝑥) = 1 then |𝑝̂→𝑣acc − 𝑝→𝑣acc| ≤ 𝜖.

Proof. We use induction to show that for the 𝑖-th layer of 𝑓 ,

    Σ_{𝑣∈𝑉𝑖} |𝑝̂→𝑣 − 𝑝→𝑣| ≤ 2𝑤𝑖𝜖′ .




    For the base case, when 𝑖 = 0, it’s trivially true since we set 𝑝̂→𝑣 = 𝑝→𝑣 for each 𝑣 ∈ 𝑉0 . For
the induction case, assume the hypothesis is true for layer 𝑖. Consider layer 𝑖 + 1.
    Σ_{𝑣∈𝑉𝑖+1} |𝑝̂→𝑣 − 𝑝→𝑣|
      = Σ_{𝑣∈𝑉𝑖+1} |𝑝̂→𝑣 − Σ_{𝑢∈𝑉𝑖} 𝑝→𝑢 𝑝𝑢→𝑣|    (Equation (2.3))
      = Σ_{𝑣∈𝑉𝑖+1} |𝑝̂→𝑣 − Σ_{𝑢∈𝑉𝑖} 𝑝̂→𝑢 𝑝𝑢→𝑣 + Σ_{𝑢∈𝑉𝑖} 𝑝̂→𝑢 𝑝𝑢→𝑣 − Σ_{𝑢∈𝑉𝑖} 𝑝→𝑢 𝑝𝑢→𝑣|
      ≤ Σ_{𝑣∈𝑉𝑖+1} |𝑝̂→𝑣 − Σ_{𝑢∈𝑉𝑖} 𝑝̂→𝑢 𝑝𝑢→𝑣| + Σ_{𝑣∈𝑉𝑖+1} |Σ_{𝑢∈𝑉𝑖} 𝑝̂→𝑢 𝑝𝑢→𝑣 − Σ_{𝑢∈𝑉𝑖} 𝑝→𝑢 𝑝𝑢→𝑣|    (Triangle inequality)
      ≤ Σ_{𝑣∈𝑉𝑖+1} |𝑝̂→𝑣 − Σ_{𝑢∈𝑉𝑖} 𝑝̂→𝑢 𝑝𝑢→𝑣| + Σ_{𝑣∈𝑉𝑖+1} Σ_{𝑢∈𝑉𝑖} 𝑝𝑢→𝑣 |𝑝̂→𝑢 − 𝑝→𝑢|    (Triangle inequality)
      ≤ Σ_{𝑣∈𝑉𝑖+1} (1 + Σ_{𝑢∈𝑉𝑖} 𝑝𝑢→𝑣) 𝜖′ + Σ_{𝑣∈𝑉𝑖+1} Σ_{𝑢∈𝑉𝑖} 𝑝𝑢→𝑣 |𝑝̂→𝑢 − 𝑝→𝑢|    (Test(𝑥) = 1)
      = 2𝑤𝜖′ + Σ_{𝑢∈𝑉𝑖} |𝑝̂→𝑢 − 𝑝→𝑢|    (2.4)
      ≤ 2𝑤𝜖′ + 2𝑤𝑖𝜖′    (Induction hypothesis)
      = 2𝑤 · (𝑖 + 1) · 𝜖′ .

Here Equation (2.4) is due to the structure of ROBPs. Note that

    Σ_{𝑣∈𝑉𝑖+1} Σ_{𝑢∈𝑉𝑖} 𝑝𝑢→𝑣 = Σ_{𝑢∈𝑉𝑖} Σ_{𝑣∈𝑉𝑖+1} 𝑝𝑢→𝑣 = 𝑤 .

Also by the same reasoning,

    Σ_{𝑣∈𝑉𝑖+1} Σ_{𝑢∈𝑉𝑖} 𝑝𝑢→𝑣 |𝑝̂→𝑢 − 𝑝→𝑢| = Σ_{𝑢∈𝑉𝑖} Σ_{𝑣∈𝑉𝑖+1} 𝑝𝑢→𝑣 |𝑝̂→𝑢 − 𝑝→𝑢| = Σ_{𝑢∈𝑉𝑖} |𝑝̂→𝑢 − 𝑝→𝑢| .

As a result, for the last layer,

    |𝑝̂→𝑣acc − 𝑝→𝑣acc| ≤ Σ_{𝑣∈𝑉𝑛} |𝑝̂→𝑣 − 𝑝→𝑣| ≤ 2𝑤𝑛𝜖′ = 𝜖 . □
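The identity Σ_𝑣 Σ_𝑢 𝑝𝑢→𝑣 = 𝑤 used in the proof above just says that each of the 𝑤 vertices of layer 𝑖 splits its unit of probability mass between its two outgoing edges. A minimal numeric check, with an arbitrary random layer chosen purely for illustration:

```python
import random

random.seed(1)
w = 5
# One layer of an ROBP: state u sends bit b to state delta[(u, b)].
delta = {(u, b): random.randrange(w) for u in range(w) for b in (0, 1)}
p = {(u, v): sum(delta[(u, b)] == v for b in (0, 1)) / 2
     for u in range(w) for v in range(w)}

# Each row sums to 1 ...
assert all(sum(p[(u, v)] for v in range(w)) == 1.0 for u in range(w))
# ... so the double sum over all u, v equals w, in either order of summation.
assert sum(p.values()) == w
```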

Lemma 2.6. The derandomization runs in space 𝑂(𝑠 + log(𝑤𝑛/𝜖)).

Proof. Since 𝐻 is computable in space 𝑠, for every 𝑥 ∈ 𝐻 we can output any specified bit of it in
space 𝑂(𝑠 + log 𝑛′). So when considering the space for computing Test(𝑥) and 𝑝̂→𝑣 (𝑥), we can
just regard 𝑥 as an input string and only count the working space.


    Given a vertex 𝑣 in 𝑓 , we first consider the space for computing 𝑝̂→𝑣 . By the definition of 𝑝̂→𝑣 ,
we can locate the starting position of the 𝑡 samples for 𝑣, taking space 𝑂(log(𝑤𝑛/𝜖)). From there,
we read the 𝑡 samples one by one. For each sample, we run 𝑓 from 𝑣start to the layer of 𝑣 to
test whether the sample leads to 𝑣. We use a counter 𝑐 to record the number of samples leading
to 𝑣, and then compute 𝑝̂→𝑣 as 𝑐/𝑡. Since 𝑡 is a power of two, we can store this number exactly,
with no rounding errors. So this step takes space 𝑂(log(𝑤𝑛)) + 𝑂(log 𝑡) = 𝑂(log(𝑤𝑛/𝜖)). Thus
the whole computation is in space 𝑂(log(𝑤𝑛/𝜖)).
    Next we consider Test. By the definition of Test, for every 𝑖 ∈ [𝑛] and each vertex 𝑣 ∈ 𝑉𝑖 , we
only need to compute 𝑝̂→𝑣 and Σ_{𝑢∈𝑉𝑖−1} 𝑝̂→𝑢 · 𝑝𝑢→𝑣 and then test Equation (2.1). This again
takes space 𝑂(log(𝑤𝑛/𝜖)). (Note that the algorithm for Test(𝑥) reads the bits of 𝑥 multiple times,
because the value 𝑝̂→𝑣 is used when testing Equation (2.1) for 𝑣 itself and then again when testing
Equation (2.1) for 𝑣’s out-neighbors.)
    So the overall space of the derandomization is 𝑂(𝑠 + log(𝑤𝑛/𝜖)). □

Proof of Theorem 2.1. Given a width-𝑤 length-𝑛 ROBP 𝑓 , by Lemma 2.2, the function 𝑓′ can be
computed by a width-𝑤′ length-𝑛′ ROBP. By Lemma 2.3, the acceptance probability of 𝑓′ is at
least 1/2. Since 𝐻 is a 1/2-hitting set for width-𝑤′ length-𝑛′ ROBPs, there exists 𝑥 ∈ 𝐻 such that
𝑓′(𝑥) = 1. So by Lemma 2.4, there is an 𝑥 ∈ 𝐻 such that Test(𝑥) = 1. Hence we can exhaustively
search through 𝐻 to find an 𝑥 which passes Test. Further, by Lemma 2.5, for each such 𝑥,
|𝑝̂→𝑣acc (𝑥) − 𝑝→𝑣acc | ≤ 𝜖. This shows that the derandomization outputs the desired approximation
of 𝑝→𝑣acc .
    By Lemma 2.6, the derandomization can be done in space 𝑂(𝑠 + log(𝑤𝑛/𝜖)). □

    Theorem 1.3 follows directly from Theorem 2.1, by applying the well-known transformation
between log-space computations and ROBPs.


3    Deterministic samplers for constant-width ROBPs
In this section, we will show how to use hitting sets to construct deterministic samplers
for constant-width ROBPs, thereby proving Theorem 1.7. Most of the work will go toward
establishing the following reduction, which is meaningful even for slightly super-constant
width.
Theorem 3.1. Let 𝑤, 𝑛 ∈ ℕ and let 𝜖 > 0. Assume there is an (𝜖/2𝑛)-hitting set 𝐻 for
width-((𝑤 choose 2) + 1) length-𝑛 ROBPs computable in space 𝑠. Then there is a deterministic
𝜖-sampler for width-𝑤 length-𝑛 ROBPs that runs in space 𝑂(𝑠 + 𝑤 log(𝑛/𝜖)).

    Like Theorem 2.1, Theorem 3.1 technically ought to be phrased in terms of families of ROBPs.
We should also clarify the model of space-bounded oracle algorithm. We assume that the
sampler has write-only access to a “query tape” where it can write down an 𝑛-bit query string
(the query string does not count against the sampler’s space complexity). The sampler can then
enter a special “query” state, which returns the result of the query into the algorithm’s state and
clears the query tape. This natural model was perhaps first studied by Ladner and Lynch [22].


3.1     Setting up the reduction
Toward proving Theorem 3.1, we begin by setting up some notation. For any ROBP 𝑓 and a
string 𝑥 ∈ {0, 1} ≤𝑛 , let 𝑣 𝑓 (𝑥) be the vertex reached when 𝑓 reads 𝑥. Furthermore, define

                                            𝑝 𝑓 (𝑥) = 𝔼[ 𝑓 (𝑥 ◦ 𝑈𝑛−|𝑥| )] ,

i. e., 𝑝 𝑓 (𝑥) = 𝑝 𝑣 𝑓 (𝑥)→ .
    Now, let 𝐻 ⊆ {0, 1}^𝑛 be an 𝜖𝐻 -hitting set for width-((𝑤 choose 2) + 1) ROBPs. For 𝑖 ≤ 𝑛, let 𝐻𝑖 be the
set of 𝑖-bit prefixes of strings in 𝐻, i. e., 𝐻𝑖 = {𝑥1 𝑥2 . . . 𝑥𝑖 : 𝑥 ∈ 𝐻}. One can verify that 𝐻𝑖 is an
𝜖𝐻 -hitting set for width-((𝑤 choose 2) + 1) length-𝑖 ROBPs.
    Let 𝑓 be the width-𝑤 ROBP to which we have oracle access. Let 𝜆 denote the empty string.
Our goal is to estimate 𝑝𝑓 (𝜆). For each 𝑖 ≤ 𝑛, define an equivalence relation ∼ on {0, 1}^𝑖 by the
rule

    𝑥 ∼ 𝑦 ⇐⇒ ∀𝑧 ∈ 𝐻𝑛−𝑖 , 𝑓 (𝑥 ◦ 𝑧) = 𝑓 (𝑦 ◦ 𝑧) .
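For intuition, the relation ∼ can be computed by brute force given oracle access to 𝑓: two prefixes are equivalent iff they produce the same vector of answers over all suffixes drawn from 𝐻𝑛−𝑖. A small Python sketch (the helper name `equiv_classes` is ours, and the brute-force enumeration of all 2^𝑖 prefixes is exactly what the space-efficient sampler of Section 3.3 avoids):

```python
from itertools import product

def equiv_classes(f, n, i, H):
    """Partition {0,1}^i by the rule: x ~ y iff f(x+z) = f(y+z) for all z in H_{n-i}."""
    H_suffixes = sorted({x[:n - i] for x in H})   # H_{n-i}: (n-i)-bit prefixes of H
    classes = {}
    for bits in product("01", repeat=i):
        x = "".join(bits)
        signature = tuple(f(x + z) for z in H_suffixes)
        classes.setdefault(signature, []).append(x)
    return list(classes.values())

# Example: f = AND of 3 bits. With H = {0,1}^3, the 2-bit prefixes split into
# two classes: {"11"} (can still accept) and {"00", "01", "10"} (already doomed).
f = lambda x: int(x == "111")
H = ["".join(b) for b in product("01", repeat=3)]
parts = equiv_classes(f, 3, 2, H)
```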

Lemma 3.2. If 𝑥 ∼ 𝑦, then |𝑝𝑓 (𝑥) − 𝑝𝑓 (𝑦)| < 𝜖𝐻 .

Proof. Let 𝑖 = |𝑥| = |𝑦|. Define 𝑔 : {0, 1}^{𝑛−𝑖} → {0, 1} by

    𝑔(𝑧) = 𝑓 (𝑥 ◦ 𝑧) ⊕ 𝑓 (𝑦 ◦ 𝑧) .

The function 𝑔 can be computed by an ROBP of width (𝑤 choose 2) + 1: we have one state in 𝑔 for each
unordered pair of states in 𝑓 to run the computations 𝑓 (𝑥 ◦ 𝑧), 𝑓 (𝑦 ◦ 𝑧) in parallel, along with
one additional ⊥ state in 𝑔 to indicate that the two computations have converged to the same state.
If |𝑝𝑓 (𝑥) − 𝑝𝑓 (𝑦)| ≥ 𝜖𝐻 , then 𝔼[𝑔] ≥ |𝑝𝑓 (𝑥) − 𝑝𝑓 (𝑦)| ≥ 𝜖𝐻 , so 𝐻𝑛−𝑖 hits 𝑔, hence 𝑥 ≁ 𝑦. □

    Let [𝑥] denote the equivalence class of 𝑥, so [𝑥] ⊆ {0, 1}^{|𝑥|} . Our deterministic sampler will
be based on numbers 𝑝̃𝑓 ([𝑥]) ∈ [0, 1] for each equivalence class [𝑥]. The definition of 𝑝̃𝑓 will
ensure that 𝑝̃𝑓 ([𝑥]) ≈ 𝑝𝑓 (𝑥) for typical values of 𝑥, although there might be some anomalous
values of 𝑥 where 𝑝̃𝑓 ([𝑥]) is far from 𝑝𝑓 (𝑥).
    The definition of 𝑝̃𝑓 ([𝑥]) is inductive. For the base case, when 𝑥 ∈ {0, 1}^𝑛 , define 𝑝̃𝑓 ([𝑥]) =
𝑓 (𝑥). This is well-defined, because 𝑥 ∼ 𝑦 =⇒ 𝑓 (𝑥) = 𝑓 (𝑦). For the inductive step, suppose
𝑥 ∈ {0, 1}^𝑖 with 𝑖 < 𝑛. Define

    𝑝̃𝑓 ([𝑥]) = max_{𝑥′∈𝐻𝑖∩[𝑥]} ( (1/2) 𝑝̃𝑓 ([𝑥′ ◦ 0]) + (1/2) 𝑝̃𝑓 ([𝑥′ ◦ 1]) ) ,    (3.1)

with the convention that 𝑝̃𝑓 ([𝑥]) = 0 if 𝐻𝑖 ∩ [𝑥] = ∅. Our sampler will output6 𝑝̃𝑓 ([𝜆]). (In
Section 3.3, we will explain in more detail how to compute 𝑝̃𝑓 ([𝜆]) in a space-efficient manner.)

   6Actually the sampler’s output differs slightly from 𝑝̃𝑓 ([𝜆]) due to rounding errors.



3.2    Correctness
The upper bound on 𝑝̃𝑓 ([𝑥]) is straightforward:

Claim 3.3. For every 𝑖, for every 𝑥 ∈ {0, 1}^{𝑛−𝑖} ,

    𝑝̃𝑓 ([𝑥]) ≤ 𝑝𝑓 (𝑥) + 𝑖𝜖𝐻 .

Proof. We proceed by induction on 𝑖. In the base case 𝑖 = 0, 𝑝̃𝑓 ([𝑥]) = 𝑓 (𝑥) = 𝑝𝑓 (𝑥). For the
inductive step 𝑖 > 0, we consider two cases. If 𝐻𝑛−𝑖 ∩ [𝑥] = ∅, then 𝑝̃𝑓 ([𝑥]) = 0 and the claim is
trivial. Otherwise, there is some 𝑥′ ∈ 𝐻𝑛−𝑖 ∩ [𝑥] such that

    𝑝̃𝑓 ([𝑥]) = (1/2) 𝑝̃𝑓 ([𝑥′ ◦ 0]) + (1/2) 𝑝̃𝑓 ([𝑥′ ◦ 1])    (Equation (3.1))
             ≤ (1/2) 𝑝𝑓 (𝑥′ ◦ 0) + (1/2) 𝑝𝑓 (𝑥′ ◦ 1) + (𝑖 − 1)𝜖𝐻    (Induction)
             = 𝑝𝑓 (𝑥′) + (𝑖 − 1)𝜖𝐻
             < 𝑝𝑓 (𝑥) + 𝑖𝜖𝐻 .    (Lemma 3.2) □
    The lower bound is a little more subtle. If 𝑢 is a vertex in layer 𝑖 of 𝑓 , we say that 𝑢 is
𝐻-reachable if there is some 𝑥 ∈ 𝐻𝑖 with 𝑣𝑓 (𝑥) = 𝑢. Otherwise, we say that 𝑢 is 𝐻-unreachable.
Let 𝑓̃ be a width-(𝑤 + 1) ROBP obtained from 𝑓 by replacing all 𝐻-unreachable nodes with
reject nodes.7
Claim 3.4. For every 𝑖, for every 𝑥 ∈ {0, 1}^{𝑛−𝑖} ,

    𝑝̃𝑓 ([𝑥]) ≥ 𝑝𝑓̃ (𝑥) .

Proof. We proceed by induction on 𝑖. In the base case 𝑖 = 0, 𝑝̃𝑓 ([𝑥]) = 𝑓 (𝑥) ≥ 𝑓̃ (𝑥) = 𝑝𝑓̃ (𝑥). For
the inductive step 𝑖 > 0, we consider two cases. If 𝑓 visits some 𝐻-unreachable node when it
reads 𝑥, then 𝑝𝑓̃ (𝑥) = 0 and the claim is trivial. Therefore, assume that when 𝑓 reads 𝑥, every
node visited is 𝐻-reachable. Then there is some 𝑥′ ∈ 𝐻𝑛−𝑖 such that 𝑣𝑓 (𝑥) = 𝑣𝑓 (𝑥′). Of course
when 𝑓 reads 𝑥′, every node visited is 𝐻-reachable, so

    𝑣𝑓̃ (𝑥′) = 𝑣𝑓 (𝑥′) = 𝑣𝑓 (𝑥) = 𝑣𝑓̃ (𝑥) .

Therefore,

    𝑝𝑓̃ (𝑥) = 𝑝𝑓̃ (𝑥′) = (1/2) 𝑝𝑓̃ (𝑥′ ◦ 0) + (1/2) 𝑝𝑓̃ (𝑥′ ◦ 1)
           ≤ (1/2) 𝑝̃𝑓 ([𝑥′ ◦ 0]) + (1/2) 𝑝̃𝑓 ([𝑥′ ◦ 1])    (Induction)
           ≤ 𝑝̃𝑓 ([𝑥]) .    (Equation (3.1))

(The last inequality uses the fact that 𝑣𝑓 (𝑥) = 𝑣𝑓 (𝑥′) and hence 𝑥 ∼ 𝑥′.) □
   7More precisely, add an extra node to each layer labeled ⊥. In layers prior to the final layer, both outgoing edges
from ⊥ lead to ⊥, and both outgoing edges from 𝐻-unreachable nodes lead to ⊥. In the final layer, 𝐻-unreachable
nodes and ⊥ are reject nodes.



Corollary 3.5. |𝑝̃𝑓 ([𝜆]) − 𝔼[ 𝑓 ]| ≤ 𝑛 · 𝜖𝐻 .

Proof. By Claim 3.3,

    𝑝̃𝑓 ([𝜆]) ≤ 𝑝𝑓 (𝜆) + 𝑛 · 𝜖𝐻 = 𝔼[ 𝑓 ] + 𝑛 · 𝜖𝐻 .

In the other direction, by Claim 3.4,

    𝑝̃𝑓 ([𝜆]) ≥ 𝑝𝑓̃ (𝜆) = 𝔼[ 𝑓̃ ] .

Define 𝑔 : {0, 1}^𝑛 → {0, 1} by

    𝑔(𝑥) = 1 ⇐⇒ when 𝑓 reads 𝑥, an 𝐻-unreachable node is visited.

Then 𝑔 can be computed by a width-(𝑤 + 1) ROBP by a construction very similar to that
of 𝑓̃ . By construction, 𝑔 rejects every string in 𝐻. Therefore, 𝔼[𝑔] < 𝜖𝐻 . Furthermore,
𝑔(𝑥) = 0 =⇒ 𝑓 (𝑥) = 𝑓̃ (𝑥). Therefore, |𝔼[ 𝑓̃ ] − 𝔼[ 𝑓 ]| < 𝜖𝐻 , so 𝑝̃𝑓 ([𝜆]) > 𝔼[ 𝑓 ] − 𝜖𝐻 . □

3.3    Efficiently computing 𝑝̃𝑓 ([𝜆])

To complete the proof of Theorem 3.1, we just need to show how to efficiently compute 𝑝̃𝑓 ([𝜆]).
This is fairly straightforward from the definitions; the details follow.

Proof of Theorem 3.1. Say a string 𝑥 ∈ 𝐻𝑖 is a representative if it is the lexicographically first element
of [𝑥] ∩ 𝐻𝑖 . Let 𝑥^{(𝑖,1)} , 𝑥^{(𝑖,2)} , . . . be an enumeration of the representatives in 𝐻𝑖 in lexicographic
order. Given 𝑖, 𝑗, and oracle access to 𝑓 , one can compute 𝑥^{(𝑖,𝑗)} in space 𝑂(𝑠).
    Our sampler works its way backward through the branching program, starting at layer 𝑛
and ending with layer 0. The sampler stores data about layer 𝑖 and uses it when processing layer
𝑖 − 1. Specifically, the data stored regarding layer 𝑖 consists of a list of numbers 𝑝𝑖,1 , 𝑝𝑖,2 , . . . ,
with the interpretation 𝑝𝑖,𝑗 = 𝑝̃𝑓 ([𝑥^{(𝑖,𝑗)}]), or rather 𝑝𝑖,𝑗 ≈ 𝑝̃𝑓 ([𝑥^{(𝑖,𝑗)}]) due to rounding error.
    For layer 𝑛, we can compute this value exactly by setting 𝑝𝑛,𝑗 = 𝑓 (𝑥^{(𝑛,𝑗)}). Given these values
for layer 𝑖 + 1, define 𝑝𝑓 (𝑥) for each 𝑥 ∈ {0, 1}^{𝑖+1} by

    𝑝𝑓 (𝑥) = 𝑝𝑖+1,𝑗   if 𝑗 is such that 𝑥^{(𝑖+1,𝑗)} ∼ 𝑥,
    𝑝𝑓 (𝑥) = 0        if no such 𝑗 exists.

(Note that 𝑗 is unique if it exists, and note that 𝑝̃𝑓 ([𝑥]) = 0 if such a 𝑗 does not exist.) Then, we
compute each value 𝑝𝑖,𝑗 by the rule

    𝑝𝑖,𝑗 := max_{𝑥′∈𝐻𝑖∩[𝑥^{(𝑖,𝑗)}]} ( (1/2) 𝑝𝑓 (𝑥′ ◦ 0) + (1/2) 𝑝𝑓 (𝑥′ ◦ 1) ) ,    (3.2)

with the convention that 𝑝𝑖,𝑗 = 0 if 𝐻𝑖 ∩ [𝑥^{(𝑖,𝑗)}] = ∅.
    The sampler performs the arithmetic in Equation (3.2) to within ⌈log(2𝑛/𝜖)⌉ bits of precision.
This ensures that the rounding error is not too large in each step; by induction, |𝑝𝑖,𝑗 − 𝑝̃𝑓 ([𝑥^{(𝑖,𝑗)}])| ≤
𝜖(𝑛 − 𝑖)/(2𝑛). The sampler outputs 𝑝0,1 , which is within 𝜖 of 𝔼[ 𝑓 ] by Corollary 3.5, since 𝜖𝐻 = 𝜖/(2𝑛).



    The number of vertices in each layer of 𝑓 is at most 𝑤, so the number of equivalence classes
in {0, 1} 𝑖 is also at most 𝑤. Therefore, there are at most 𝑤 representatives in 𝐻𝑖 , and hence
there are only 𝑤 numbers 𝑝 𝑖,𝑗 being stored for each layer. Storing those numbers for the layer
currently being processed and the layer most recently processed takes 𝑂(𝑤 log(𝑛/𝜖)) bits of
space, so overall, the space complexity of the sampler is 𝑂(𝑠 + 𝑤 log(𝑛/𝜖)) as claimed. □

    Interestingly, the sampler of Theorem 3.1 can be implemented to be non-adaptive, because it
only queries 𝑓 at strings of the form 𝑥 ◦ 𝑦 or 𝑥 ◦ 𝑏 ◦ 𝑧, where 𝑥 ∈ 𝐻𝑖 , 𝑦 ∈ 𝐻𝑛−𝑖 , 𝑏 ∈ {0, 1}, and
𝑧 ∈ 𝐻𝑛−𝑖−1 .


3.4   Applying the reduction
Proof of Theorem 1.8. Hoza and Zuckerman constructed an (𝜖/2𝑛)-hitting set 𝐻 even for polynomial-
width ROBPs that can be computed in space 𝑂(log² 𝑛 + log(1/𝜖)) [19]. Combining this result
with Theorem 3.1 immediately proves Theorem 1.8. □

    To prove Theorem 1.7, we must first amplify the assumed 1/2-hitting set to get an (𝜖/2𝑛)-hitting
set. This is straightforward, although we must pay a small penalty in terms of width, length,
and cardinality.

Lemma 3.6. Suppose 𝐻 is a 1/2-hitting set for width-(𝑤 + 1) length-(𝑛𝑚) ROBPs. Divide each string
𝑥 ∈ 𝐻 into blocks of length 𝑛, 𝑥 = 𝑥^{(1)} ◦ 𝑥^{(2)} ◦ · · · ◦ 𝑥^{(𝑚)} . Let 𝐻′ = {𝑥^{(𝑖)} : 𝑥 ∈ 𝐻, 𝑖 ∈ [𝑚]}. Then 𝐻′ is
a (1/𝑚)-hitting set for width-𝑤 length-𝑛 ROBPs.

Proof. Let 𝑓 be a width-𝑤 length-𝑛 ROBP with 𝔼[ 𝑓 ] ≥ 1/𝑚. Define 𝑔 : ({0, 1}^𝑛)^𝑚 → {0, 1} by

    𝑔(𝑥^{(1)} ◦ · · · ◦ 𝑥^{(𝑚)}) = ⋁_{𝑖∈[𝑚]} 𝑓 (𝑥^{(𝑖)}) .

Then 𝑔 can be computed by a width-(𝑤 + 1) ROBP. Furthermore,

    𝔼[𝑔] = 1 − (1 − 𝔼[ 𝑓 ])^𝑚 ≥ 1 − (1 − 1/𝑚)^𝑚 > 1/2 .

Therefore, 𝐻 hits 𝑔, hence 𝐻′ hits 𝑓 . □
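The transformation in Lemma 3.6 is purely combinatorial and easy to state in code (a sketch; the name `amplify` is ours):

```python
def amplify(H, n, m):
    """Lemma 3.6: chop each (n*m)-bit string of H into m blocks of n bits.

    If H is a 1/2-hitting set for width-(w+1) length-(n*m) ROBPs, the returned
    set H' is a (1/m)-hitting set for width-w length-n ROBPs.
    """
    assert all(len(x) == n * m for x in H)
    return sorted({x[i * n:(i + 1) * n] for x in H for i in range(m)})

# Example with n = 2, m = 3: one 6-bit string contributes up to three 2-bit blocks.
H_prime = amplify(["010011"], n=2, m=3)   # blocks: "01", "00", "11"
```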

   Combining Lemma 3.6 with Theorem 3.1 proves the following corollary, which immediately
implies Theorem 1.7.

Corollary 3.7. Let 𝑤, 𝑛 ∈ ℕ and let 𝜖 > 0. Let 𝑤′ = 𝑤² + 2 and 𝑛′ = 𝑛 · ⌈2𝑛/𝜖⌉. Assume there is
a (1/2)-hitting set 𝐻 for width-𝑤′ length-𝑛′ ROBPs computable in space 𝑠. Then there is a deterministic
𝜖-sampler for width-𝑤 length-𝑛 ROBPs that runs in space 𝑂(𝑠 + 𝑤 log(𝑛/𝜖)).

                        T HEORY OF C OMPUTING, Volume 18 (21), 2022, pp. 1–32                                  19
                                       K UAN C HENG AND W ILLIAM M. H OZA

4     Using a hypothetical stronger reduction to construct PRGs
To directly compare hitting sets and PRGs, it is convenient to address the strings in the hitting
set using a hitting set generator (HSG).
Definition 4.1. An 𝜖-HSG for ℱ is a function 𝐺 : {0, 1} 𝑠 → {0, 1} 𝑛 such that 𝐺({0, 1} 𝑠 ) is an
𝜖-hitting set for ℱ .
    In our theorem statements so far, we have been somewhat informal with the distinction
between an individual generator vs. a family of generators. Since the result we discuss in this
section is more “meta” than our other results, we will make a precise definition for clarity’s sake.
Definition 4.2. Let 𝑠(𝑛, 𝑡, 𝜖) be a space-constructible8 function. An explicit PRG (HSG) family
for width-𝑤 large-alphabet ROBPs with seed length 𝑠 is a uniform algorithm 𝐺 that takes as
input the parameters 𝑛, 𝑡, 𝜖 and a string 𝑦 ∈ {0, 1} 𝑠(𝑛,𝑡,𝜖) and outputs a string 𝐺 𝑛,𝑡,𝜖 (𝑦) ∈ {0, 1} 𝑡𝑛 ,
interpreted as a length-𝑛 string over the alphabet {0, 1} 𝑡 . The algorithm runs in space 𝑂(𝑠(𝑛, 𝑡, 𝜖)),
and for each fixed 𝑛, 𝑡, 𝜖, we require that 𝐺 𝑛,𝑡,𝜖 is an 𝜖-PRG (𝜖-HSG) for width-𝑤 length-𝑛 ROBPs
over the alphabet {0, 1} 𝑡 .
    The assumption of Theorem 4.3 below says that hitting sets can be upgraded into PRGs with
essentially no loss: the width parameter remains the same, and the seed length only increases
by a constant factor, for any arbitrary setting of 𝑛, 𝑡, 𝜖. This is only for simplicity’s sake. The
proof would still go through even if the parameters deteriorated a little when moving from
hitting sets to PRGs.
Theorem 4.3. Let 𝑤 be a constant. Assume that for every space-constructible function 𝑠(𝑛, 𝑡, 𝜖), if
there exists an explicit HSG family for width-𝑤 large-alphabet ROBPs with seed length 𝑠, then there
exists a space-constructible function 𝑠′(𝑛, 𝑡, 𝜖) = 𝑂(𝑠(𝑛, 𝑡, 𝜖)) and an explicit PRG family for width-𝑤
large-alphabet ROBPs with seed length 𝑠′. Then for every constant 𝛼 > 0, there exists an explicit PRG
family for width-𝑤 large-alphabet ROBPs with seed length 𝑠 𝛼 (𝑛, 𝑡, 𝜖), where

                                       𝑠 𝛼 (𝑛, 𝑡, 𝜖) = 𝑂(𝑡 + log(𝑛/𝜖) log^𝛼 𝑛) .

4.1    From PRGs with moderate error to HSGs with tiny threshold
As outlined in Section 1.5.3, the proof of Theorem 4.3 is based on two reductions. The first
reduction shows how to convert any PRG with inverse polynomial error into an 𝜖-HSG for any
𝜖, and the second reduction shows how to convert a PRG with a good dependence on 𝜖 into a
PRG with a good dependence on 𝑛.
    In this section, we present the first reduction. In the regime 𝑛 ≥ 𝑤, our reduction is a
generalization of Hoza and Zuckerman's reduction [19] to the large-alphabet case 𝑡 ≫ 1.
Theorem 4.4. Let 𝑤, 𝑛, 𝑡 ∈ ℕ and let 𝜖 > 0. Assume there is a (1/(2𝑤³𝑛²))-PRG 𝐺 for width-𝑤 length-𝑛
ROBPs over the alphabet {0, 1} 𝑡 , with seed length and space complexity bounded by 𝑠. Then there
is an 𝜖-hitting set 𝐻 for width-𝑤 length-𝑛 ROBPs over the alphabet {0, 1} 𝑡 , computable in space
𝑂(𝑠 + 𝑡 + log(𝑤𝑛/𝜖)).
    8I. e., given 𝑛, 𝑡, 𝜖, the value 𝑠(𝑛, 𝑡, 𝜖) can be computed in space 𝑂(𝑠(𝑛, 𝑡, 𝜖)).



   (Just like Theorem 2.1 and Theorem 3.1, Theorem 4.4 technically ought to be phrased in
terms of families of ROBPs.)

4.1.1   Construction of the hitting set 𝐻
Our hitting set 𝐻 relies on a hitting set 𝐻rect for combinatorial rectangles [23]. Recall that a
combinatorial rectangle over alphabet Γ of dimension 𝑟 is a function 𝑔 : Γ𝑟 → {0, 1} of the form
𝑔(𝑥 1 , . . . , 𝑥 𝑟 ) = 𝑔1 (𝑥 1 ) ∧ · · · ∧ 𝑔 𝑟 (𝑥 𝑟 ). Without loss of generality, assume 𝜖 < 1/(𝑤²𝑛²) and 𝑠 ≥ 𝑡. The
algorithm to enumerate 𝐻 is as follows.

   1. For all 𝑟 ∈ {1, 2, . . . , ⌊log(1/𝜖)/log(𝑤𝑛)⌋}:

        (a) Let 𝐻rect ⊆ ({0, 1}^𝑠)^(2𝑟−1) be an 𝜖⁴-hitting set for combinatorial rectangles over alphabet
            {0, 1}^𝑠 of dimension 2𝑟 − 1.
        (b) For all sequences (𝑥 1 , 𝑦1 , 𝑥2 , 𝑦2 , . . . , 𝑥 𝑟−1 , 𝑦𝑟−1 , 𝑥 𝑟 ) ∈ 𝐻rect and for all sequences of
            nonnegative integers (𝑛1 , . . . , 𝑛 𝑟 ) satisfying 𝑛1 + 𝑛2 + · · · + 𝑛 𝑟 = 𝑛 − 𝑟 + 1, output the
            (𝑛𝑡)-bit string

                        (𝐺(𝑥 1 )| 𝑛1 𝑡 ) ◦ (𝑦1 | 𝑡 ) ◦ (𝐺(𝑥2 )| 𝑛2 𝑡 ) ◦ (𝑦2 | 𝑡 ) ◦ · · · ◦ (𝑦𝑟−1 | 𝑡 ) ◦ (𝐺(𝑥 𝑟 )| 𝑛𝑟 𝑡 ) .   (4.1)

In Equation (4.1), the notation 𝑦| 𝑡 denotes the 𝑡-bit prefix of the bitstring 𝑦. The key difference
between our construction and Hoza and Zuckerman’s original hitting set construction [19] is
the presence of the strings 𝑦 𝑖 , which do not pass through the PRG 𝐺.
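The enumeration above is mechanical, so it may help to see its skeleton in code. The sketch below (a toy illustration; `G` and `H_rect_for` are hypothetical stand-ins for the assumed PRG and for the Linial–Luby–Saks–Zuckerman rectangle hitting set, which are not implemented here) loops over 𝑟, over 𝐻rect, and over all ways to write 𝑛 − 𝑟 + 1 as an ordered sum 𝑛1 + · · · + 𝑛𝑟, assembling the string of Equation (4.1) each time:

```python
from math import floor, log2

def compositions(total, parts):
    """All tuples of `parts` nonnegative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest

def enumerate_H(G, H_rect_for, n, t, s, eps, w):
    """Sketch of the enumeration of H from Section 4.1.1.

    G maps an s-bit seed (a string) to an (n*t)-bit string, and
    H_rect_for(r) returns a hitting set for dimension-(2r-1) combinatorial
    rectangles over {0,1}^s, as tuples of s-bit strings.  Both are assumed
    inputs here, not implementations of the paper's ingredients."""
    r_max = floor(log2(1 / eps) / log2(w * n))
    for r in range(1, r_max + 1):
        for seq in H_rect_for(r):
            xs, ys = seq[0::2], seq[1::2]  # x_1,...,x_r interleaved with y_1,...,y_{r-1}
            for ns in compositions(n - r + 1, r):
                pieces = []
                for i in range(r):
                    pieces.append(G(xs[i])[: ns[i] * t])  # (G(x_i))|_{n_i t}
                    if i < r - 1:
                        pieces.append(ys[i][:t])          # y_i|_t
                yield "".join(pieces)  # an (n*t)-bit string
```

Each output has length (𝑛1 + · · · + 𝑛𝑟)𝑡 + (𝑟 − 1)𝑡 = (𝑛 − 𝑟 + 1)𝑡 + (𝑟 − 1)𝑡 = 𝑛𝑡 bits, as required.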

4.1.2   Proof of correctness
Hoza and Zuckerman’s reduction was based on a simple structural lemma for ROBPs [19,
Lemma 1]. Toward proving the correctness of 𝐻, we will now prove a new variant of that lemma,
applicable to ROBPs over a large alphabet. For two vertices 𝑢, 𝑣 in an ROBP 𝑓 , write 𝑢 ⤳ 𝑣
if there is an edge from 𝑢 to 𝑣. Let ⤳* be the reflexive transitive closure of ⤳, i. e., 𝑢 ⤳* 𝑣 if
𝑢 = 𝑣 or there is a path from 𝑢 to 𝑣.
     The way to think about the following Lemma 4.5 is to suppose that one is choosing a route
from 𝑢 to 𝑣 acc . Lemma 4.5 suggests two vertices 𝑣 ⤳ 𝑢′ that one could visit on the way. Item 2
says that it is not difficult to find 𝑣. Item 3 says that if one can make it to 𝑢′, it will be quite a bit
easier to find 𝑣 acc from there. Item 4 says that overall, visiting 𝑣 and 𝑢′ is only a mild detour.
     In general, in any ROBP over the alphabet Σ, if 𝑣 ⤳ 𝑢′, then 𝑝 𝑣→𝑢′ ≥ 1/|Σ|. In Hoza and
Zuckerman's lemma [19, Lemma 1], they assume Σ = {0, 1}, and they use the fact that therefore
𝑝 𝑣→𝑢′ ≥ 1/2. At a high level, the reason we need a new structural lemma is that if Σ is large,
𝑝 𝑣→𝑢′ might be small. Indeed, observe that Lemma 4.5 does not guarantee any lower bound on
𝑝 𝑣→𝑢′ .

Lemma 4.5. Let 𝑓 be a width-𝑤, length-𝑛 ROBP over any alphabet. Let 𝑢 be a vertex in 𝑓 , and assume
0 < 𝑝 𝑢→ ≤ 1/(𝑤𝑛). Then there is a pair of vertices (𝑣, 𝑢′) in 𝑓 such that:


   1. 𝑢 ⤳* 𝑣 ⤳ 𝑢′.

   2. 𝑝 𝑢→𝑣 ≥ 1/(𝑤³𝑛²).

   3. 𝑝 𝑢′→ ≥ 𝑤𝑛 · 𝑝 𝑢→ .

   4. 𝑝 𝑢→𝑣 · 𝑝 𝑣→𝑢′ · 𝑝 𝑢′→ ≥ 𝑝 𝑢→ /(𝑤²𝑛).

Proof. Suppose some pair (𝑣, 𝑢′) satisfies Item 1, but it violates Item 4. For such a pair, if we take
a random walk from 𝑢, the probability that we visit 𝑣, 𝑢′, and 𝑣 acc is less than 𝑝 𝑢→ /(𝑤²𝑛). The number
of such pairs is at most 𝑤²𝑛, so by the union bound, when we start at 𝑢 and read random bits,
the probability that we visit any such pair and 𝑣 acc is less than 𝑝 𝑢→ . Therefore, there is some
path from 𝑢 to 𝑣 acc that never visits such a pair.
    Let 𝑢′ be the first vertex along that path that satisfies Item 3. (Such a 𝑢′ exists, because if
nothing else we can let 𝑢′ = 𝑣 acc .) Let 𝑣 be the vertex immediately preceding 𝑢′ in the path.
(This makes sense, because 𝑝 𝑢→ < 𝑤𝑛 · 𝑝 𝑢→ , so 𝑢′ ≠ 𝑢.) This pair clearly satisfies Item 1, Item 3,
and Item 4; all that remains is to verify Item 2. Indeed,

                        𝑝 𝑢→ /𝑝 𝑢→𝑣 ≤ 𝑤²𝑛 · 𝑝 𝑣→𝑢′ · 𝑝 𝑢′→                 (Item 4)
                                    ≤ 𝑤²𝑛 · 𝑝 𝑣→
                                    < 𝑤²𝑛 · 𝑤𝑛 · 𝑝 𝑢→ ,

where the last inequality holds because 𝑢′ is the first vertex in the path satisfying Item 3, and 𝑣
precedes 𝑢′, so 𝑣 must not satisfy Item 3. Rearranging completes the proof. □
Corollary 4.6. Let 0 < 𝜖 ≤ 1/(𝑤𝑛). Let 𝑓 be a width-𝑤, length-𝑛 ROBP over any alphabet with 𝔼[ 𝑓 ] ≥ 𝜖.
Then there is a sequence of vertices

                          𝑣 start = 𝑢1 ⤳* 𝑣 1 ⤳ 𝑢2 ⤳* 𝑣 2 ⤳ · · · ⤳ 𝑢𝑟 ⤳* 𝑣 𝑟 = 𝑣 acc

such that:

   1. For every 𝑖, 𝑝 𝑢𝑖 →𝑣 𝑖 ≥ 1/(𝑤³𝑛²).

   2. 𝑟 ≤ log(1/𝜖)/log(𝑤𝑛).

   3. 𝑝 𝑢1 →𝑣1 · 𝑝 𝑣1 →𝑢2 · 𝑝 𝑢2 →𝑣2 · · · 𝑝 𝑣 𝑟−1 →𝑢𝑟 · 𝑝 𝑢𝑟 →𝑣 𝑟 ≥ 𝜖³.
Proof. We define the sequence inductively, starting with 𝑢1 = 𝑣 start . Assume we've defined
𝑢1 , 𝑣1 , 𝑢2 , 𝑣2 , . . . , 𝑢𝑖 . If 𝑝 𝑢𝑖 → ≥ 1/(𝑤³𝑛²), then set 𝑟 = 𝑖, set 𝑣 𝑖 = 𝑣 acc , and terminate the sequence.
Otherwise, let (𝑣 𝑖 , 𝑢𝑖+1 ) be the vertices provided by plugging 𝑢 = 𝑢𝑖 into Lemma 4.5.
     Item 1 of Lemma 4.5 implies that 𝑢𝑖 ⤳* 𝑣 𝑖 and 𝑣 𝑖 ⤳ 𝑢𝑖+1 . Item 1 is guaranteed by Item 2 of
Lemma 4.5 and the termination condition. By Item 3 of Lemma 4.5, 𝑝 𝑢𝑖+1 → ≥ 𝑤𝑛 · 𝑝 𝑢𝑖 → , which
implies Item 2. Finally, iteratively applying Item 4 of Lemma 4.5 shows that

            𝑝 𝑢1 →𝑣1 · 𝑝 𝑣1 →𝑢2 · 𝑝 𝑢2 →𝑣2 · · · 𝑝 𝑣 𝑟−1 →𝑢𝑟 · 𝑝 𝑢𝑟 →𝑣 𝑟 ≥ 𝑝 𝑢1 → /(𝑤²𝑛)^𝑟 ≥ 𝜖³ ,

i. e., Item 3 holds. □


      We are now ready to complete the proof of correctness of our hitting set 𝐻.

Claim 4.7. If 𝑓 is a width-𝑤 length-𝑛 ROBP over the alphabet {0, 1} 𝑡 with 𝔼[ 𝑓 ] ≥ 𝜖, then 𝑓 −1 (1)∩𝐻 ≠ ∅.

Proof. Let 𝑢1 ⤳* 𝑣 1 ⤳ · · · ⤳ 𝑢𝑟 ⤳* 𝑣 𝑟 be the sequence of vertices guaranteed by Corollary 4.6.
Let 𝑛 𝑖 be the distance from 𝑢𝑖 to 𝑣 𝑖 . Let 𝑔 : ({0, 1}^𝑠)^(2𝑟−1) → {0, 1} be the following combinatorial
rectangle:

  𝑔(𝑥1 , 𝑦1 , 𝑥2 , 𝑦2 , . . . , 𝑥 𝑟−1 , 𝑦𝑟−1 , 𝑥 𝑟 ) = 1 ⇐⇒
              ∀𝑖 ∈ [𝑟], 𝐺(𝑥 𝑖 )| 𝑛 𝑖 𝑡 leads from 𝑢𝑖 to 𝑣 𝑖 and ∀𝑖 ∈ [𝑟 − 1], 𝑦 𝑖 | 𝑡 leads from 𝑣 𝑖 to 𝑢𝑖+1 .

By Item 1 of Corollary 4.6, 𝑝 𝑢𝑖 →𝑣 𝑖 ≥ 1/(𝑤³𝑛²). Since 𝐺 has error 1/(2𝑤³𝑛²),

                               Pr[𝐺(𝑈) leads from 𝑢𝑖 to 𝑣 𝑖 ] ≥ (1/2) · 𝑝 𝑢𝑖 →𝑣 𝑖 .

Therefore, by Item 3 of Corollary 4.6, 𝔼[𝑔] ≥ 𝜖³ · 2^−𝑟 ≥ 𝜖⁴. Therefore, there is some sequence
(𝑥 1 , 𝑦1 , . . . , 𝑦𝑟−1 , 𝑥 𝑟 ) ∈ 𝐻rect that hits 𝑔. By construction, the corresponding element of 𝐻 is
accepted by 𝑓 . □


4.1.3    Efficiency

Proof of Theorem 4.4. To complete the proof of Theorem 4.4, let us analyze the space complexity
of 𝐻. We use an 𝜖⁴-hitting set for combinatorial rectangles 𝐻rect constructed by Linial, Luby,
Saks, and Zuckerman [23]. Specifically, we use their simple “low-dimension” construction [23,
Section 5]. In general, for combinatorial rectangles over alphabet Γ of dimension 𝑟, that hitting
set can be enumerated in space 𝑂(𝑟 + log(|Γ|/𝜖)). In our case, the space complexity bound
becomes 𝑂(𝑠 + log(1/𝜖)).
    The number 𝑟 can be stored using 𝑂(log log(1/𝜖)) bits of space. The integers 𝑛1 , . . . , 𝑛 𝑟 can
be straightforwardly stored using 𝑂(𝑟 log 𝑛) = 𝑂(log(1/𝜖)) bits of space. Thus, overall, the
space complexity of enumerating 𝐻 is 𝑂(𝑠 + log(1/𝜖)). (Recall that we assumed without loss of
generality that 𝜖 < 1/(𝑤²𝑛²) and 𝑠 ≥ 𝑡.) □


4.2     Application: Unconditional improved hitting sets for large-alphabet ROBPs
As outlined in Section 1.5.3, plugging the classic INW generator [20] into the reduction of
Theorem 4.4 already gives something interesting: an improved hitting set for large-alphabet
ROBPs, even of polynomial width.

Corollary 4.8. Let 𝑤, 𝑛, 𝑡 ∈ ℕ and let 𝜖 > 0. There is an 𝜖-hitting set 𝐻 for width-𝑤 length-𝑛 ROBPs
over the alphabet {0, 1} 𝑡 , computable in space 𝑂(𝑡 + log(𝑤𝑛) log 𝑛 + log(1/𝜖)).


4.3    Trading a good dependence on 𝜖 for a good dependence on 𝑛
Recall that to prove Theorem 4.3, we must (conditionally) construct a PRG with a good
dependence on 𝑛. So far, unconditionally, Theorem 4.4 has provided us with an HSG with a
good dependence on 𝜖. The assumption of Theorem 4.3 allows us to convert that HSG into a PRG
with the same seed length, 𝑂(𝑡 + log² 𝑛 + log(1/𝜖)) (for width 𝑤, a constant). In this section, we
show how to convert that PRG into another PRG with seed length 𝑂(𝑡 + log^(3/2) 𝑛 + log(1/𝜖)·√(log 𝑛)),
i. e., we improve the dependence on 𝑛 at the expense of a worse dependence on 𝜖. That follows
from setting 𝛼 = 1/2 in the following more general reduction.
                                                                                                    l             m
Lemma 4.9. Let 𝛼 ∈ (0, 1) be a constant. Let 𝑤, 𝑛, 𝑡 ∈ ℕ and 𝜖 > 0. Define 𝑚 = 2^((log 𝑛)^(1−𝛼)) and
𝑑 = ⌈𝐶 log(𝑛/𝜖)⌉, where 𝐶 is an appropriate constant. Assume there is an (𝜖/(4𝑛))-PRG 𝐺 for width-𝑤
length-𝑚 ROBPs over the alphabet {0, 1}^𝑑 with seed length and space complexity bounded by 𝑠. Then
there is an 𝜖-PRG 𝐺′ for width-𝑤 length-𝑛 ROBPs over the alphabet {0, 1}^𝑡 with seed length and space
complexity 𝑂(𝑡 + 𝑠 · log^𝛼 𝑛 + log(𝑤𝑛/𝜖)).

    As usual, Lemma 4.9 should technically be phrased in terms of families of ROBPs. As
suggested in Section 1.5.3, the proof of Lemma 4.9 is an application of traditional seed-recycling
techniques, similar to classic constructions of PRGs for space-bounded computation [25, 20, 27].
Our construction and analysis are especially similar to Armoni’s work [5].
    One difference is that we use randomized samplers rather than extractors for convenience;
in this respect, our construction is similar to a variant of the INW generator [20] described by
Braverman, Cohen, and Garg [8] as a warm-up to their main construction. In particular, we rely
on the following randomized sampler by Goldreich and Wigderson [14].

Theorem 4.10 ([14, Lemma 6.6]). For all 𝑡 ∈ ℕ, 𝛿 > 0, there exists a function Samp𝑡,𝛿 : {0, 1}^𝑡 ×
{0, 1}^𝑂(log(1/𝛿)) → {0, 1}^𝑡 such that for any9 function 𝑓 : {0, 1}^𝑡 → [0, 1],

                     Pr_𝑥 [ | 𝔼_𝑦 [ 𝑓 (Samp𝑡,𝛿 (𝑥, 𝑦))] − 𝔼[ 𝑓 ] | ≤ 𝛿 ] ≥ 1 − 𝛿 .

Furthermore, given 𝑡, 𝛿, 𝑥, 𝑦 as inputs, Samp𝑡,𝛿 (𝑥, 𝑦) can be computed in space 𝑂(𝑡).

   We will recursively use the following basic PRG, which stretches 𝑡 + 𝑑𝑛 bits to 𝑡𝑛 bits where
𝑑 = 𝑂(log(1/𝛿)). It might be helpful to think of the case 𝑡 = 100𝑑.

Lemma 4.11. Let 𝑡, 𝛿 be arbitrary, and let Samp𝑡,𝛿 : {0, 1}^𝑡 × {0, 1}^𝑑 → {0, 1}^𝑡 be the randomized
sampler of Theorem 4.10. Define 𝐺₀ : {0, 1}^𝑡 × ({0, 1}^𝑑)^𝑛 → ({0, 1}^𝑡)^𝑛 by

                       𝐺₀ (𝑥, 𝑧1 ◦ · · · ◦ 𝑧 𝑛 ) = Samp𝑡,𝛿 (𝑥, 𝑧1 ) ◦ · · · ◦ Samp𝑡,𝛿 (𝑥, 𝑧 𝑛 ) .

Then 𝐺₀ fools width-𝑤 length-𝑛 ROBPs over the alphabet {0, 1}^𝑡 with error 2𝛿𝑤²𝑛.
   9Goldreich and Wigderson analyze the case that 𝑓 is {0, 1}-valued, but the [0, 1]-valued case automatically follows
with only a quadratic loss in 𝛿.
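The shape of the generator in Lemma 4.11 is easy to make concrete. In the sketch below, `toy_samp` is a trivial stand-in for Samp𝑡,𝛿 (the actual Goldreich–Wigderson sampler is an expander-based construction and is not implemented here); the point is only to exhibit the stretch from 𝑡 + 𝑑𝑛 input bits to 𝑡𝑛 output bits:

```python
def toy_samp(x, z):
    """Stand-in for Samp_{t,delta}: maps a t-bit x and a d-bit z (d <= t)
    to a t-bit string by XORing z (zero-padded) into x.  This has the right
    input/output shape but none of the guarantees of Theorem 4.10."""
    z_pad = z.ljust(len(x), "0")
    return "".join(str(int(a) ^ int(b)) for a, b in zip(x, z_pad))

def G0(x, zs, samp=toy_samp):
    """Basic PRG of Lemma 4.11:
    G0(x, z_1 ... z_n) = Samp(x, z_1) ... Samp(x, z_n).
    The seed is t + d*n bits; the output is t*n bits."""
    return "".join(samp(x, z) for z in zs)
```

For instance, with 𝑡 = 4, 𝑑 = 2, and 𝑛 = 3 blocks, the seed has 4 + 2·3 = 10 bits and the output has 4·3 = 12 bits.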



   The proof of Lemma 4.11 is straightforward, and we omit it. When reading the proof of
Lemma 4.9, it might be helpful to keep in mind that all “𝑥” variables are strings of length 𝑡, all
“𝑦” variables are strings of length 𝑠, and all “𝑧” variables are strings of length 𝑑.



Proof of Lemma 4.9. Define 𝑛 𝑖 = 𝑛/𝑚^𝑖 . For simplicity, we ignore rounding issues, i. e., we assume
that 𝑛 𝑖 is an integer and that 𝑚 = 2^((log 𝑛)^(1−𝛼)) exactly. Let 𝛿 = 𝜖/(8𝑤²𝑛), and let 𝑑 = 𝑂(log(𝑤𝑛/𝜖)) be
the length of the second input to the function Samp𝑡,𝛿 of Theorem 4.10. We will recursively
define a sequence of PRGs

                        𝐺 𝑖 : {0, 1}^𝑡 × ({0, 1}^𝑠)^𝑖 × ({0, 1}^𝑑)^(𝑛 𝑖) → ({0, 1}^𝑡)^𝑛 .

The base case 𝑖 = 0 is the basic PRG of Lemma 4.11:

                       𝐺₀ (𝑥, 𝑧1 ◦ · · · ◦ 𝑧 𝑛 ) = Samp𝑡,𝛿 (𝑥, 𝑧1 ) ◦ · · · ◦ Samp𝑡,𝛿 (𝑥, 𝑧 𝑛 ) .

For the inductive step 𝑖 > 0, we define

                  𝐺 𝑖 (𝑥, 𝑦1 ◦ · · · ◦ 𝑦 𝑖 , 𝑧1 ◦ · · · ◦ 𝑧 𝑛 𝑖 )
                   = 𝐺 𝑖−1 (𝑥, 𝑦1 ◦ · · · ◦ 𝑦 𝑖−1 , 𝐺(Samp𝑠,𝛿 (𝑦 𝑖 , 𝑧1 )) ◦ · · · ◦ 𝐺(Samp𝑠,𝛿 (𝑦 𝑖 , 𝑧 𝑛 𝑖 ))) ,


where 𝐺 : {0, 1}^𝑠 → {0, 1}^(𝑑𝑚) is the given PRG. To analyze these generators, let 𝑓 be a width-𝑤
length-𝑛 ROBP over the alphabet {0, 1}^𝑡 . For each 𝑖 and each fixing of 𝑥, 𝑦1 , . . . , 𝑦 𝑖 , define

            𝑔^(𝑥,𝑦1 ,...,𝑦𝑖 ) (𝑧1 ◦ · · · ◦ 𝑧 𝑛 𝑖 ) = 𝑓 (𝐺 𝑖 (𝑥, 𝑦1 ◦ · · · ◦ 𝑦 𝑖 , 𝑧1 ◦ · · · ◦ 𝑧 𝑛 𝑖 ))
            ℎ^(𝑥,𝑦1 ,...,𝑦𝑖 ) (𝑦′1 ◦ · · · ◦ 𝑦′ 𝑛 𝑖+1 ) = 𝑓 (𝐺 𝑖 (𝑥, 𝑦1 ◦ · · · ◦ 𝑦 𝑖 , 𝐺(𝑦′1 ) ◦ · · · ◦ 𝐺(𝑦′ 𝑛 𝑖+1 ))) .

These functions are related to one another by the rules

        ℎ^(𝑥,𝑦1 ,...,𝑦𝑖 ) (𝑦′1 ◦ · · · ◦ 𝑦′ 𝑛 𝑖+1 ) = 𝑔^(𝑥,𝑦1 ,...,𝑦𝑖 ) (𝐺(𝑦′1 ) ◦ · · · ◦ 𝐺(𝑦′ 𝑛 𝑖+1 ))                        (4.2)
        𝑔^(𝑥,𝑦1 ,...,𝑦𝑖 ) (𝑧1 ◦ · · · ◦ 𝑧 𝑛 𝑖 ) = ℎ^(𝑥,𝑦1 ,...,𝑦𝑖−1 ) (Samp𝑠,𝛿 (𝑦 𝑖 , 𝑧1 ) ◦ · · · ◦ Samp𝑠,𝛿 (𝑦 𝑖 , 𝑧 𝑛 𝑖 )) .   (4.3)

This shows by induction on 𝑖 that each 𝑔 function can be computed by a width-𝑤 ROBP over the
alphabet {0, 1}^𝑑 and each ℎ function can be computed by a width-𝑤 ROBP over the alphabet
{0, 1}^𝑠 .
    Let us now show by induction on 𝑖 that 𝐺 𝑖 fools 𝑓 with error (2𝛿𝑤² + 𝜖 𝐺 ) · Σ_{𝑗=0}^{𝑖} 𝑛 𝑗 , where 𝜖 𝐺
is the error of 𝐺. The base case 𝑖 = 0 is already established by Lemma 4.11. For the inductive


step, we have

   𝔼_{𝑥, 𝑦1 ,...,𝑦 𝑖 , 𝑧1 ,...,𝑧 𝑛 𝑖 } [ 𝑓 (𝐺 𝑖 (𝑥, 𝑦1 ◦ · · · ◦ 𝑦 𝑖 , 𝑧1 ◦ · · · ◦ 𝑧 𝑛 𝑖 ))]                    (4.4)
     = 𝔼 [𝑔^(𝑥,𝑦1 ,...,𝑦𝑖 ) (𝑧1 ◦ · · · ◦ 𝑧 𝑛 𝑖 )]
     = 𝔼 [ℎ^(𝑥,𝑦1 ,...,𝑦𝑖−1 ) (Samp𝑠,𝛿 (𝑦 𝑖 , 𝑧1 ) ◦ · · · ◦ Samp𝑠,𝛿 (𝑦 𝑖 , 𝑧 𝑛 𝑖 ))]             (Equation (4.3))
     ≤ 𝔼 [ℎ^(𝑥,𝑦1 ,...,𝑦𝑖−1 ) (𝑦′1 ◦ · · · ◦ 𝑦′ 𝑛 𝑖 )] + 2𝛿𝑤²𝑛 𝑖                                 (Lemma 4.11)
     = 𝔼 [𝑔^(𝑥,𝑦1 ,...,𝑦𝑖−1 ) (𝐺(𝑦′1 ) ◦ · · · ◦ 𝐺(𝑦′ 𝑛 𝑖 ))] + 2𝛿𝑤²𝑛 𝑖                          (Equation (4.2))
     ≤ 𝔼 [𝑔^(𝑥,𝑦1 ,...,𝑦𝑖−1 ) (𝑧1 ◦ · · · ◦ 𝑧 𝑛 𝑖−1 )] + (2𝛿𝑤² + 𝜖 𝐺 ) · 𝑛 𝑖
     = 𝔼 [ 𝑓 (𝐺 𝑖−1 (𝑥, 𝑦1 ◦ · · · ◦ 𝑦 𝑖−1 , 𝑧1 ◦ · · · ◦ 𝑧 𝑛 𝑖−1 ))] + (2𝛿𝑤² + 𝜖 𝐺 ) · 𝑛 𝑖 ,

where in each line the expectation is over all the variables that appear.

The lower bound on the quantity in Equation (4.4) follows the same argument. Let 𝑟 = log^𝛼 𝑛
and 𝐺′ = 𝐺 𝑟 . Then 𝐺′ fools 𝑓 with error

      (2𝛿𝑤² + 𝜖 𝐺 ) · Σ_{𝑖=0}^{𝑟} 𝑛 𝑖 ≤ (2𝛿𝑤² + 𝜖 𝐺 ) · 𝑛 · Σ_{𝑖=0}^{∞} 𝑚^−𝑖 ≤ 2𝑛 · (2𝛿𝑤² + 𝜖 𝐺 ) ≤ 𝜖 .

Furthermore, the seed length of 𝐺′ is 𝑡 + 𝑟𝑠 + 𝑑 as claimed, and the space complexity of 𝐺′ is
clearly also 𝑂(𝑡 + 𝑟𝑠 + 𝑑). □
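For concreteness, the final inequality can be checked by substituting the parameter choices 𝛿 = 𝜖/(8𝑤²𝑛) and 𝜖 𝐺 ≤ 𝜖/(4𝑛) from the statement of Lemma 4.9:

```latex
2n\,(2\delta w^2 + \epsilon_G)
  \le 2n \cdot \frac{2 w^2 \epsilon}{8 w^2 n} + 2n \cdot \frac{\epsilon}{4n}
  = \frac{\epsilon}{2} + \frac{\epsilon}{2}
  = \epsilon .
```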

4.4        Putting things together to prove Theorem 4.3
Proof of Theorem 4.3. We will show by induction on 𝑎 that for each constant 𝑎 ∈ ℕ, there is an
explicit PRG family for width-𝑤 ROBPs with seed length 𝑂(𝑡 + log(𝑛/𝜖) log^(1/𝑎) 𝑛). The base case
𝑎 = 1 holds unconditionally – this is the seed length of the classic INW generator [20].
    For the inductive step, suppose 𝑎 > 1. Let 𝑡, 𝑛, 𝜖 be arbitrary; we will construct an 𝜖-PRG
for width-𝑤 length-𝑛 ROBPs over the alphabet {0, 1}^𝑡 . Define 𝛼 = 1/𝑎. Let 𝑚 = 2^((log 𝑛)^(1−𝛼)) and
𝑑 = ⌈𝐶 log(𝑛/𝜖)⌉, as in Lemma 4.9.
    By induction, there is a (1/(2𝑤³𝑚²))-PRG 𝐺 for width-𝑤 length-𝑚 ROBPs over the alphabet {0, 1}^𝑑 ,
with seed length and space complexity bounded by 𝑂(𝑑 + log^(1+1/(𝑎−1)) 𝑚). Now, by the choice of 𝑚,

                log^(1+1/(𝑎−1)) 𝑚 = (log 𝑛)^((1−1/𝑎)·(1+1/(𝑎−1))) = log 𝑛 ,




so 𝐺 has seed length and space complexity bounded by 𝑂(log(𝑛/𝜖)). Plugging 𝐺 into Theo-
rem 4.4, we get an (𝜖/(4𝑛))-hitting set 𝐻 for width-𝑤 length-𝑚 ROBPs over the alphabet {0, 1}^𝑑 ,
computable in space 𝑂(log(𝑛/𝜖)). Now we use our assumption to convert 𝐻 into a PRG 𝐺′ with
exactly the same parameters. Finally, plugging 𝐺′ into Lemma 4.9 gives the desired PRG. □
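The exponent computation in the inductive step amounts to the identity

```latex
\Bigl(1 - \tfrac{1}{a}\Bigr)\Bigl(1 + \tfrac{1}{a-1}\Bigr)
  = \frac{a-1}{a} \cdot \frac{a}{a-1} = 1,
\qquad \text{so} \qquad
\log^{1 + \frac{1}{a-1}} m
  = \bigl((\log n)^{1 - \frac{1}{a}}\bigr)^{1 + \frac{1}{a-1}}
  = \log n .
```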


5    Derandomizing BPP given a hitting set
In this section, as mentioned in Section 1, we present an alternative proof for the known theorem
that optimal explicit hitting sets for circuits would imply P = BPP.

Theorem 5.1 ([3]). Assume that for every 𝑠, 𝑛 ∈ ℕ, there is a (1/2)-hitting set 𝐻 𝑠,𝑛 for size-𝑠 circuits on 𝑛
input bits that can be computed in time poly(𝑠, 𝑛). Then P = BPP.

Proof. By naïve amplification, we may assume that the randomized algorithm has failure
probability 2^−𝑁 , where 𝑁 is the input length. Let 𝐶 be a size-𝑛 circuit on 𝑛 input bits describing
the action of this algorithm on its random bits, so 𝑛 = poly(𝑁) and we are trying to distinguish
the cases 𝔼[𝐶] ≤ 2^−𝑁 vs. 𝔼[𝐶] ≥ 1 − 2^−𝑁 . Our algorithm accepts if and only if there exists
𝑥 ∈ 𝐻 𝑛^𝑐 ,𝑛 such that for all 𝑦 ∈ 𝐻 3𝑛,𝑛 , 𝐶(𝑥 ⊕ 𝑦) = 1. Here, 𝑐 is a suitable constant that will become
clear later. The runtime is clearly poly(𝑁).
    For the correctness proof, first suppose 𝔼[𝐶] ≤ 2^−𝑁 . For any fixed 𝑥, the function 𝑦 ↦→
¬𝐶(𝑥 ⊕ 𝑦) has expectation at least 1 − 2^−𝑁 and can be computed by a circuit of size 3𝑛. Therefore,
there is some 𝑦 ∈ 𝐻 3𝑛,𝑛 such that 𝐶(𝑥 ⊕ 𝑦) = 0, and hence our algorithm rejects. Conversely,
suppose 𝔼[𝐶] ≥ 1 − 2^−𝑁 . Consider sampling 𝑥 ∈ {0, 1}^𝑛 and 𝑦 ∈ 𝐻 3𝑛,𝑛 uniformly at random.
Since 𝑥 is uniform, 𝔼_{𝑥,𝑦} [¬𝐶(𝑥 ⊕ 𝑦)] ≤ 2^−𝑁 . By Markov's inequality,

              Pr_{𝑥∈{0,1}^𝑛} [ 𝔼_{𝑦∈𝐻 3𝑛,𝑛} [¬𝐶(𝑥 ⊕ 𝑦)] < 2 · 2^−𝑁 ] > 1/2 .

Since 𝐻 3𝑛,𝑛 can be computed in polynomial time, |𝐻 3𝑛,𝑛 | ≤ poly(𝑁). Therefore, when 𝑁 is
sufficiently large,

        𝔼_{𝑦∈𝐻 3𝑛,𝑛} [¬𝐶(𝑥 ⊕ 𝑦)] < 2 · 2^−𝑁 =⇒ 𝔼_{𝑦∈𝐻 3𝑛,𝑛} [¬𝐶(𝑥 ⊕ 𝑦)] = 0 .

Therefore,
                   Pr_{𝑥∈{0,1}^𝑛} [∀𝑦 ∈ 𝐻 3𝑛,𝑛 , 𝐶(𝑥 ⊕ 𝑦) = 1] > 1/2 .

Given input 𝑥, the predicate ∀𝑦 ∈ 𝐻 3𝑛,𝑛 , 𝐶(𝑥 ⊕ 𝑦) = 1 can be computed by a circuit 𝐶′ of size 𝑛^𝑐
for some suitable constant 𝑐. Therefore, there is some 𝑥 ∈ 𝐻 𝑛^𝑐 ,𝑛 such that 𝐶′(𝑥) = 1. □
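The ∃∀ structure of this algorithm can be demonstrated at toy scale. In the sketch below (a hypothetical illustration: the "hitting sets" are simply all of {0,1}^𝑛, which is exhaustive rather than efficient), the algorithm accepts iff some 𝑥 ∈ 𝐻1 satisfies 𝐶(𝑥 ⊕ 𝑦) = 1 for every 𝑦 ∈ 𝐻2:

```python
from itertools import product

def xor_bits(x, y):
    """Coordinatewise XOR of two equal-length bit tuples."""
    return tuple(a ^ b for a, b in zip(x, y))

def derandomized_accept(C, H1, H2):
    """Accept iff there exists x in H1 such that C(x XOR y) = 1 for all
    y in H2 (the algorithm from the proof of Theorem 5.1)."""
    return any(all(C(xor_bits(x, y)) for y in H2) for x in H1)

# Exhaustive toy "hitting sets" over {0,1}^3 (certainly hitting sets,
# though of course not the efficient ones the theorem assumes).
H = list(product([0, 1], repeat=3))
```

When 𝔼[𝐶] = 1 the algorithm clearly accepts, and when 𝔼[𝐶] = 0 it rejects; the substance of the proof is that polynomial-size hitting sets already suffice in the intermediate cases 𝔼[𝐶] ≤ 2^−𝑁 and 𝔼[𝐶] ≥ 1 − 2^−𝑁.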


6    Directions for further research
In this paper, we have shown that hitting sets for RL would derandomize BPL. Constructing
a hitting set is the most natural way to prove L = RL, but there are also other approaches. In


general, does L = RL imply L = BPL? In the polynomial-time setting, the “promise” variant
of this question has been answered in the affirmative, i. e., prP = prRP =⇒ P = BPP [9].
Does prL = prRL imply L = BPL? Or relaxing the challenge even further, does L = NL imply
L = BPL?
     We gave two different algorithms for estimating the expectation of an ROBP given a hitting
set, one suited for 𝑤 = poly(𝑛) (Theorem 2.1) and one suited for 𝑤 = 𝑂(1) (Theorem 3.1). What
about the case 𝑛 = polylog(𝑤), i. e., 𝑤 = 2^(𝑛^Θ(1)) ? Unconditionally, there are optimal hitting
sets known in this regime [2, 19]. Given such an ROBP 𝑓 as input, is it possible to compute
𝔼[ 𝑓 ] ± 1/𝑤 in space 𝑂(log 𝑤)? (The Nisan-Zuckerman PRG [27] achieves seed length 𝑂(log 𝑤) in
this regime, but only for moderate error 𝜖 ≫ 1/𝑤.) An affirmative answer would imply that any
space-𝑠 decision algorithm that uses 𝑛 random bits could be simulated by another space-𝑂(𝑠)
algorithm using only 𝑂(𝑛/𝑠^𝑐 ) random bits, where 𝑐 is an arbitrarily large constant.
     Recently, Meka, Reingold, and Tal constructed a PRG for width-3 ROBPs with seed length
Õ(log 𝑛 · log(1/𝜖)) [24]. This is near-optimal when 𝜖 is not too small, but for 𝜖 = 1/𝑛 it is worse
than Nisan's PRG [25]. On the other hand, there is an explicit hitting set for width-3 ROBPs
with near-optimal seed length Õ(log(𝑛/𝜖)) [16]. Can one construct an explicit deterministic
sampler for width-3 ROBPs with near-optimal space complexity? Unfortunately, to produce a
deterministic sampler for width-3 ROBPs, Theorem 3.1 would require a hitting set for width-4
ROBPs.
     Assuming the existence of a log-space hitting set for polynomial-width ROBPs, is it possible
to construct a log-space deterministic sampler for polynomial-width ROBPs?
     Recall that WPRGs are superior to deterministic samplers (see Figure 1). Is it possible
to improve Theorem 1.7 so that it concludes with a WPRG rather than a mere deterministic
sampler?


7   Acknowledgments
We thank David Zuckerman, Dean Doron, and Pooya Hatami for helpful discussions and for
comments on drafts of this paper. We thank Adam Klivans for bringing his work with Gopalan
and Meka [15] to our attention. We thank anonymous referees for helpful comments.


References
 [1] AmirMahdi Ahmadinejad, Jonathan Kelner, Jack Murtagh, John Peebles, Aaron Sidford,
     and Salil Vadhan: High-precision estimation of random walks in small space. In Proc.
     61st FOCS, pp. 1295–1306. IEEE Comp. Soc., 2020. [doi:10.1109/FOCS46700.2020.00123,
     arXiv:1912.04524] 10, 11

 [2] Miklós Ajtai, János Komlós, and Endre Szemerédi: Deterministic simulation in LOGSPACE.
     In Proc. 19th STOC, pp. 132–140. ACM Press, 1987. [doi:10.1145/28395.28410] 28


 [3] Alexander E. Andreev, Andrea E. F. Clementi, and José D. P. Rolim: A new general
     derandomization method. J. ACM, 45(1):179–213, 1998. Preliminary version in ICALP’96.
     [doi:10.1145/273865.273933] 2, 5, 10, 27

 [4] Alexander E. Andreev, Andrea E. F. Clementi, José D. P. Rolim, and Luca Trevisan: Weak
     random sources, hitting sets, and BPP simulations. SIAM J. Comput., 28(6):2103–2116, 1999.
     Preliminary version in FOCS’97. [doi:10.1137/S0097539797325636, ECCC:TR97-011] 2, 5,
     10

 [5] Roy Armoni: On the derandomization of space-bounded computations. In Proc. 2nd
     Internat. Workshop on Randomization and Computation (RANDOM’98), pp. 47–59. Springer,
     1998. [doi:10.1007/3-540-49543-6_5] 24

 [6] Andrej Bogdanov: Pseudorandom generators for low degree polynomials. In Proc. 37th
     STOC, pp. 21–30. ACM Press, 2005. [doi:10.1145/1060590.1060594] 6

 [7] Andrej Bogdanov, Zeev Dvir, Elad Verbin, and Amir Yehudayoff: Pseudorandom-
     ness for width-2 branching programs. Theory of Computing, 9(7):283–292, 2013.
     [doi:10.4086/toc.2013.v009a007, ECCC:TR09-070] 4

 [8] Mark Braverman, Gil Cohen, and Sumegha Garg: Pseudorandom pseudo-distributions
     with near-optimal error for read-once branching programs. SIAM J. Comput., 49(5):242–299,
     2020. Preliminary version in STOC’18. [doi:10.1137/18M1197734, ECCC:TR17-161] 3, 4, 6,
     10, 24

 [9] Harry Buhrman and Lance Fortnow: One-sided versus two-sided error in probabilistic
     computation. In Proc. 16th Symp. Theoret. Aspects of Comp. Sci. (STACS’99), pp. 100–109.
     Springer, 1999. [doi:10.1007/3-540-49116-3_9] 2, 5, 10, 28

[10] Eshan Chattopadhyay and Jyun-Jie Liao: Optimal error pseudodistributions for read-
     once branching programs. In Proc. 35th Comput. Complexity Conf. (CCC’20), pp. 25:1–27.
     Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 2020. [doi:10.4230/LIPIcs.CCC.2020.25,
     arXiv:2002.07208, ECCC:TR20-069] 3, 4, 6, 10

[11] Kuan Cheng and William M. Hoza: Hitting sets give two-sided derandomization of small
     space. In Proc. 35th Comput. Complexity Conf. (CCC’20), pp. 10:1–25. Schloss Dagstuhl–
     Leibniz-Zentrum fuer Informatik, 2020. [doi:10.4230/LIPIcs.CCC.2020.10, ECCC:TR20-016]
     1, 4

[12] Gil Cohen, Dean Doron, Oren Renard, Ori Sberlo, and Amnon Ta-Shma: Error reduction
     for weighted PRGs against read once branching programs. In Proc. 36th Comput. Complexity
     Conf. (CCC’21), pp. 22:1–17. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 2021.
     [doi:10.4230/LIPIcs.CCC.2021.22, ECCC:TR21-020] 4, 6, 10

[13] Oded Goldreich, Salil Vadhan, and Avi Wigderson: Simplified derandomization of BPP us-
     ing a hitting set generator. In Oded Goldreich, editor, Studies in Complexity and Cryptography,


    pp. 59–67. Springer, 2011. Preliminary version in RANDOM’99. [doi:10.1007/978-3-642-
    22670-0_8, ECCC:TR00-004] 2, 5, 10

[14] Oded Goldreich and Avi Wigderson: Tiny families of functions with random properties: a
     quality-size trade-off for hashing. Random Struct. Algor., 11(4):315–343, 1997. Preliminary
     version in STOC’94. [doi:10.1002/(SICI)1098-2418(199712)11:4<315::AID-RSA3>3.3.CO;2-D,
     ECCC:TR94-002] 9, 24

[15] Parikshit Gopalan, Adam R. Klivans, and Raghu Meka: Learning functions of halfspaces
     using prefix covers. In Proc. 25th Ann. Conf. on Learning Theory (COLT’12), pp. 15.1–15.10.
     PMLR, 2012. 8, 28

[16] Parikshit Gopalan, Raghu Meka, Omer Reingold, Luca Trevisan, and Salil Vadhan:
     Better pseudorandom generators from milder pseudorandom restrictions. In Proc. 53rd
     FOCS, pp. 120–129. IEEE Comp. Soc., 2012. [doi:10.1109/FOCS.2012.77, arXiv:1210.0049,
     ECCC:TR12-123] 4, 28

[17] William M. Hoza: Better pseudodistributions and derandomization for space-bounded
     computation. In Proc. 25th Internat. Conf. on Randomization and Computation (RAN-
     DOM’21), pp. 28:1–23. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 2021.
     [doi:10.4230/LIPIcs.APPROX/RANDOM.2021.28, ECCC:TR21-048] 4, 6, 10

[18] William M. Hoza and Chris Umans: Targeted pseudorandom generators, simulation
     advice generators, and derandomizing logspace. SIAM J. Comput., 51(2):281–304, 2022.
     Preliminary version in STOC’17. [doi:10.1137/17M1145707, arXiv:1610.01199] 6, 10

[19] William M. Hoza and David Zuckerman: Simple optimal hitting sets for small-
     success RL. SIAM J. Comput., 49(4):811–820, 2020. Preliminary version in FOCS’18.
     [doi:10.1137/19M1268707, ECCC:TR18-063] 3, 4, 6, 9, 10, 19, 20, 21, 28

[20] Russell Impagliazzo, Noam Nisan, and Avi Wigderson: Pseudorandomness for network
     algorithms. In Proc. 26th STOC, pp. 356–364. ACM Press, 1994. [doi:10.1145/195058.195190]
     9, 23, 24, 26

[21] Russell Impagliazzo and Avi Wigderson: P = BPP if E requires exponential circuits:
     Derandomizing the XOR lemma. In Proc. 29th STOC, pp. 220–229. ACM Press, 1997.
     [doi:10.1145/258533.258590] 6

[22] Richard E. Ladner and Nancy A. Lynch: Relativization of questions about log space
     computability. Math. Sys. Theory, 10(1):19–32, 1976. [doi:10.1007/BF01683260] 15

[23] Nathan Linial, Michael Luby, Michael Saks, and David Zuckerman: Efficient construction
     of a small hitting set for combinatorial rectangles in high dimension. Combinatorica,
     17(2):215–234, 1997. Preliminary version in STOC’93. [doi:10.1007/BF01200907] 21, 23

[24] Raghu Meka, Omer Reingold, and Avishay Tal: Pseudorandom generators for
     width-3 branching programs. In Proc. 51st STOC, pp. 626–637. ACM Press, 2019.
     [doi:10.1145/3313276.3316319, arXiv:1806.04256, ECCC:TR18-112] 4, 28

[25] Noam Nisan: Pseudorandom generators for space-bounded computation. Combinatorica,
     12(4):449–461, 1992. Preliminary version in STOC’90. [doi:10.1007/BF01305237] 3, 4, 6, 9,
     10, 24, 28

[26] Noam Nisan: On read-once vs. multiple access to randomness in logspace. Theoret.
     Comput. Sci., 107(1):135–144, 1993. Preliminary version in SCT’90.
     [doi:10.1016/0304-3975(93)90258-U] 10

[27] Noam Nisan and David Zuckerman: Randomness is linear in space. J. Comput. System Sci.,
     52(1):43–52, 1996. Preliminary version in STOC’93. [doi:10.1006/jcss.1996.0004] 24, 28

[28] Edward Pyne and Salil Vadhan: Pseudodistributions that beat all pseudorandom gen-
     erators (extended abstract). In Proc. 36th Comput. Complexity Conf. (CCC’21), pp. 33:1–15.
     Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 2021. [doi:10.4230/LIPIcs.CCC.2021.33,
     ECCC:TR21-019] 4, 6, 10

[29] Jiří Šíma and Stanislav Žák: A polynomial-time construction of a hitting set for read-once
     branching programs of width 3. Fundamenta Informaticae, 184(4):307–354, 2021. Prelim-
     inary versions in CSR’11 and SOFSEM’12. [doi:10.3233/fi-2021-2101, arXiv:2101.01151,
     ECCC:TR10-088] 4


AUTHORS

     Kuan Cheng
     Assistant professor
     Center on Frontiers of Computing Studies
     Peking University
     Beijing, China
     ckkcdh hotmail com
     https://sites.google.com/site/ckkcdh/


     William M. Hoza
     Postdoctoral fellow
     Simons Institute for the Theory of Computing
     University of California, Berkeley
     Berkeley, CA, USA
     williamhoza berkeley edu
     https://williamhoza.com/tcs/



ABOUT THE AUTHORS

    Kuan Cheng is an Assistant Professor in the Center on Frontiers of Computing
      Studies at Peking University. He received his Ph. D. from Johns Hopkins University
      in 2019, advised by Xin Li. After that, he was a postdoc at the University of
      Texas at Austin, hosted by David Zuckerman. He truly had a wonderful time doing
      research at both places, which brought many new ideas, experiences, and happy
      collaborations. Kuan has broad interests in theoretical computer science,
      including randomness in computation, coding theory, highly efficient algorithms,
      and learning theory. In his spare time, he enjoys soccer, basketball, and skiing,
      sometimes happily on his own for a long time. He also likes driving to beautiful
      places near where he lives, with no particular destination in mind.


    William M. Hoza is a postdoctoral fellow in Avishay Tal’s group at the University of
       California, Berkeley through the Simons Institute for the Theory of Computing.
       He received his B. S. from the California Institute of Technology in 2016, where
       he received valuable mentorship from Leonard Schulman and Chris Umans. He
       received his Ph. D. from the University of Texas at Austin in 2021, where he was
       advised by David Zuckerman. He is especially interested in pseudorandomness
       and derandomization. As stated on his personal website, he likes pineapple on
       pizza, and he likes the Star Wars prequel trilogy, but he does not like chocolate.




                   T HEORY OF C OMPUTING, Volume 18 (21), 2022, pp. 1–32                        32