Deterministic Approximation of Random Walks in Small Space

Authors Jack Murtagh, Omer Reingold, Aaron Sidford, Salil Vadhan,
Plaintext
                            T HEORY OF C OMPUTING, Volume 17 (4), 2021, pp. 1–35
                                         www.theoryofcomputing.org

                           S PECIAL ISSUE : APPROX-RANDOM 2019



                   Deterministic Approximation of
                   Random Walks in Small Space
     Jack Murtagh∗                  Omer Reingold†                    Aaron Sidford ‡            Salil Vadhan §
             Received February 6, 2020; Revised September 19, 2020; Published September 18, 2021




       Abstract. We give a deterministic, nearly logarithmic-space algorithm that given an
       undirected multigraph G, a positive integer r, and a set S of vertices, approximates the
       conductance of S in the r-step random walk on G to within a factor of 1 + ε, where ε > 0
       is an arbitrarily small constant. More generally, our algorithm computes an ε-spectral
       approximation to the normalized Laplacian of the r-step walk.
           Our algorithm combines the derandomized square graph operation (Rozenman and
       Vadhan, RANDOM’05), which we recently used for solving Laplacian systems in nearly
       logarithmic space (Murtagh et al., FOCS’17), with ideas from (Cheng et al., 2015) which
       gave an algorithm that is time-efficient (while ours is space-efficient) and randomized (while
       ours is deterministic) for the case of even r (while ours works for all r). Along the way, we
       provide some new results that generalize technical machinery and yield improvements over
    A conference version of this paper appeared in the Proceedings of the 23rd International Conference on Randomization and
Computation (RANDOM 2019) [31].
  ∗ Supported by NSF grant CCF-1763299.
  † Supported by NSF grant CCF-1763311.
  ‡ Supported by NSF CAREER Award CCF-1844855.
  § Supported by NSF grant CCF-1763299.



ACM Classification: F.2.2, G.2.2
AMS Classification: 68Q25, 68R10
Key words and phrases: random walks, space complexity, derandomization, expander graphs, spectral
approximation


© 2021 Jack Murtagh, Omer Reingold, Aaron Sidford, and Salil Vadhan
c b Licensed under a Creative Commons Attribution License (CC-BY)                             DOI: 10.4086/toc.2021.v017a004
                JACK M URTAGH , O MER R EINGOLD , A ARON S IDFORD , AND S ALIL VADHAN

      previous work. First, we obtain a nearly linear-time randomized algorithm for computing a
      spectral approximation to the normalized Laplacian for odd r. Second, we define and analyze
      a generalization of the derandomized square for irregular multigraphs and for sparsifying the
      product of two distinct multigraphs. As part of this generalization, we also give a strongly
      explicit construction of expander graphs of every size.


1    Introduction
Random walks provide the most dramatic example of the power of randomized algorithms for solving
computational problems in the space-bounded setting, as they only require logarithmic space (to store the
current state or vertex). In particular, since undirected graphs have polynomial cover time, random walks
give a randomized logspace (RL) algorithm for U NDIRECTED S-T C ONNECTIVITY [2]. Reingold [34]
showed that this algorithm can be derandomized, and hence that U NDIRECTED S-T C ONNECTIVITY is
in deterministic logspace (L). However, Reingold’s algorithm does not match the full power of random
walks on undirected graphs; in particular it does not allow us to approximate properties of the random
walk at lengths below the mixing time.
    In this article, we provide a nearly logarithmic-space algorithm for approximating properties of
arbitrary-length random walks on an undirected graph, in particular the conductance of any set of vertices:
Definition 1.1. Let G = (V, E) be an undirected graph, r a positive integer, and S ⊆ V a set of vertices.
The conductance of S under the r-step random walk on G is defined as
                                         Φr (S) = Pr[Vr 6∈ S|V0 ∈ S],
where V0 ,V1 , . . . ,Vr is a random walk on G started at the stationary distribution Pr[V0 = v] = deg(v)/2|E|.
Theorem 1.2. There is a deterministic algorithm that given an undirected multigraph G on n vertices, a
positive integer r, a set of vertices S, and ε > 0, computes a number Φ
                                                                      e such that

                                   (1 − ε) · Φr (S) ≤ Φ
                                                      e ≤ (1 + ε) · Φr (S)

and runs in space O(log N + (log r) · log(1/ε) + (log r) · log log r), where N is the bit length of the input
graph G (the length of the description of G as a list of edges, in a binary encoding).
     Previously, approximating conductance could be done in O(log3/2 (N/ε) + log log r) space, which
follows from the proof by Saks and Zhou that RL is in L3/2 [37].
    √Two interesting parameter regimes where we improve the Saks-Zhou bound are when r = 1/ε =
2O( log N) , in which case our algorithm runs in space O(log N), or when ε = 1/polylog(N) and r ≤
poly(N), in which case our algorithm runs in space O(log e     N). When r exceeds the poly(N) · log(1/ε)
time for random walks on undirected graphs to mix to within distance ε of the stationary distribution,
the conductance can be approximated in space O(log(N/ε) + log log r) by using Reingold’s algorithm
to find the connected components of G, and the bipartitions of the components that are bipartite and
calculating the stationary probability of S restricted to each of these pieces, which is proportional to the
sum of degrees of vertices in S.
     We prove Theorem 1.2 by providing a stronger result that with the same amount of space it is possible
to compute an ε-spectral approximation to the normalized Laplacian of the r-step random walk on G.

                        T HEORY OF C OMPUTING, Volume 17 (4), 2021, pp. 1–35                                 2
                    D ETERMINISTIC A PPROXIMATION OF R ANDOM WALKS IN S MALL S PACE

Definition 1.3. Let G be an undirected graph with adjacency matrix A, diagonal degree matrix D, and
transition matrix T = AD−1 . The transition matrix for the r-step random walk on G is T r . The normalized
Laplacian of the r-step random walk is the symmetric matrix I − M r for M = D−1/2 AD−1/2 .

    Spectral approximation of the normalized Laplacian is a strong definition that guarantees the two
matrices have similar eigenvalues, and their corresponding graphs have similar cuts and random walk
behavior [39, 9]. Note that the normalized Laplacian can also be expressed as I −M r = D−1/2 (I −T r )D1/2 ,
so it does indeed capture the behavior of r-step random walks on G.1

Theorem 1.4 (Main result). There is a deterministic algorithm that given an undirected multigraph G on
n vertices with normalized Laplacian I − M, a nonnegative integer r, and ε > 0, constructs an undirected
multigraph G e whose normalized Laplacian L  e is an ε-spectral approximation of L = I − M r . That is, for
all vectors v ∈ Rn

                                (1 − ε) · vT Lv ≤ vT Lv
                                                     e ≤ (1 + ε) · vT Lv.

The algorithm runs in space O(log N + (log r) · log(1/ε) + (log r) · log log r), where N is the bit length of
the input graph G.

    Theorem 1.2 follows from Theorem 1.4 by taking v to be D1/2 eS where eS is the characteristic vector
of the set S and normalizing appropriately. (See Section 5).
    Our main technique for proving Theorem 1.4 is the derandomized product, a new generalization of
the derandomized square, which was introduced by Rozenman and Vadhan [36] to give an alternative
proof that U NDIRECTED S-T C ONNECTIVITY is in L. Our main result follows from carefully applying
the derandomized product and analyzing its properties with inequalities from the theory of spectral
approximation. Specifically, our analysis is inspired by the work of Cheng, Cheng, Liu, Peng, and
Teng [14], who studied the approximation of random walks by randomized algorithms running in nearly
linear time. We emphasize that [14] gives a randomized algorithm with high space complexity (but
low time complexity) for approximating properties of even-length walks, while we give a deterministic,
space-efficient algorithm for approximating properties of walks of every length. Interestingly, while the
graphs in our results are all undirected, some of our analyses use techniques for spectral approximation of
directed graphs introduced by Cohen, Kelner, Peebles, Peng, Rao, Sidford, and Vladu [16, 15].
    The derandomized square can be viewed as applying the pseudorandom generator of Impagliazzo,
Nisan, and Wigderson [23] to random walks on labelled graphs. It is somewhat surprising that repeated
derandomized squaring does not blow up the error by a factor proportional to the length of the walk being
derandomized. For arbitrary branching programs, the INW generator does incur error that is linear in the
length of the program. Some special cases such as regular [11, 12, 17] and permutation [17, 40] branching
programs of constant width have been shown to have a milder error growth as a function of the walk
length. Our work adds to this list by showing that properties of random walks of length k on undirected
graphs can be estimated in terms of spectral approximation without error accumulating linearly in k.
    In our previous paper [30], we showed that the Laplacian of the derandomized square of a regular
graph spectrally approximates the Laplacian of the true square, I − M 2 , and this was used in a recursion
from [33] to give a nearly logarithmic-space algorithm for approximately solving Laplacian systems
   1 When G is irregular, the matrix I − T r is not necessarily symmetric. It is a directed Laplacian as defined in [16, 15]. See

Definition 2.5.


                            T HEORY OF C OMPUTING, Volume 17 (4), 2021, pp. 1–35                                               3
                JACK M URTAGH , O MER R EINGOLD , A ARON S IDFORD , AND S ALIL VADHAN

Lx = b. A natural idea to approximate the Laplacian of higher powers, I − M r , is to repeatedly apply
the derandomized square. This raises three challenges, and we achieve our result by showing how to
overcome each:

   1. It is not guaranteed from [30] that repeated derandomized squaring preserves spectral approxima-
      tion. For this, we use ideas from [14] to argue that it does.

   2. When r is not a power of 2, the standard approach would be to write r = b0 + 2 · b1 + . . . + 2z · bz
                                                                        i
      where bi is the i-th bit of r and multiply approximations to M 2 for all i such that bi 6= 0. The
      problem is that multiplying spectral approximations of matrices does not necessarily yield a
      spectral approximation of their product. Our solution is to generalize the derandomized square to
      produce sparse approximations to the product of distinct graphs. In particular, given I − M and
      an approximation I − M  e to I − M k , our derandomized product allows us to combine M and M   e to
                          k+1
      approximate I − M . Although our generalized graph product is defined for undirected graphs,
      its analysis uses machinery for spectral approximation of directed graphs, introduced in [15].

   3. We cannot assume that our graph is regular without loss of generality. In contrast, [34, 36, 30]
      could do so, since adding self-loops does not affect connectivity or solutions to Laplacian systems
      of G, however, it does affect random walks. Our solution is to define and analyze the derandomized
      product for irregular graphs.

     A key element in the derandomized product is a strongly explicit (i. e., neighbor relations can be
computed in time polylog(N) and space O(log N)) construction of expander graphs whose sizes equal the
degrees of the vertices in the graphs being multiplied. Many commonly used strongly explicit expander
families only contain graph sizes that are certain subsets of N such as powers of 2 (Cayley graphs based
on [32] and [5]), perfect squares [28, 18], and other size distributions [35] or are only mildly explicit in
the sense of having running time or parallel work that is poly(N) [26]. It was folklore that one can convert
any sufficiently dense family of expanders (such as the aforementioned constructions) into a family of
expanders of every size, and there were offhand remarks in the literature to this effect (see, e. g., [7], [19]),
albeit without an analysis of expansion or explicitness. In Section 3, we give a self-contained description
and proof of such a reduction, with an analysis in terms of both strong explicitness and spectral expansion.
Subsequent to our work, Goldreich [20] has provided an analysis of the construction previously described
in [19] in terms of vertex expansion and Alon [4] has given a stronger reduction that provides nearly
Ramanujan expanders (i. e., ones with nearly optimal spectral expansion as a function of the degree) of
every size N.
     Many of our techniques are inspired by Cheng, Cheng, Liu, Peng, and Teng [14], who gave two
algorithms for approximating random walks. One is a nearly linear time randomized algorithm for
approximating random walks of even length and another works for all walk lengths r but has a running
time that is quadratic in r, and so only yields a nearly linear time algorithm for r that is polylogarithmic in
the size of the graph. In addition, [24] studied the problem of computing sparse spectral approximations
of random walks but the running time in their work also has a quadratic dependence on r. We extend
these results by giving a nearly linear time randomized algorithm for computing a spectral approximation
to I − M r for all r. This is discussed in Section 5.

                         T HEORY OF C OMPUTING, Volume 17 (4), 2021, pp. 1–35                                  4
                    D ETERMINISTIC A PPROXIMATION OF R ANDOM WALKS IN S MALL S PACE

Subsequent work. Subsequent to [31], the conference version of this paper, Ahmadinejad et al. [1]
strengthened Theorem 1.2 by providing a deterministic algorithm that, when given a graph G, two
vertices s and t, and a positive integer r, estimates the probability that a random walk of length r in G
started at s ends at t to within ±ε and uses space O(log(r · N) · log log(r · N/ε)). Furthermore, their result
applies to the more general setting of Eulerian digraphs.2 In undirected graphs, all nonzero conductances
can be shown to be of magnitude at least 1/poly(N) and can be expressed as a sum of at most N 2 /4
random walk probabilities (entries of powers of the transition matrix). Consequently, with the same
space bound the algorithm of [1] can estimate the conductance of any set S to within a multiplicative
factor of (1 ± 1/poly(N)). While the result of [1] is generally stronger than Theorem 1.2, with a better
dependence on the approximation parameter ε, the algorithm and the analysis in our proof of Theorem 1.2
are considerably simpler than the corresponding items in [1]. The other results in our paper are not
superseded by [1]. In particular, we believe that the derandomized graph product that we introduce to
prove Theorem 1.2 is interesting in its own right. In addition, we fill a gap in the literature by extending
the main result of [14] to compute sparse approximations of I − M r for odd r in nearly linear time, while
their theorem works only for even r.


2     Preliminaries
2.1     Spectral graph theory
Given an undirected multigraph G the Laplacian of G is the symmetric matrix D − A, where D is the
diagonal matrix of vertex degrees and A is the adjacency matrix of G. The transition matrix of the random
walk on G is T = AD−1 . Ti j is the probability that a uniformly random edge from vertex j leads to vertex
i (i. e., the number of edges between j and i divided by the degree of j). The normalized Laplacian of G
is the symmetric matrix I − M = D−1/2 (D − A)D−1/2 . Note that when G is regular, D is a multiple of the
identity and M and T are the same matrix: M = D−1/2 AD−1/2 = AD−1 = T . The transition matrix of the
r-step random walk on G is T r . For all probability distributions π, T r π is the distribution over vertices
that results from picking a random vertex according to π and then running a random walk on G for r
steps. The transition matrix of the r-step random walk on G is related to the normalized Laplacian in the
following way:
                                       I − M r = D−1/2 (I − T r )D1/2 .

For undirected multigraphs, the matrix M = D−1/2 AD−1/2 has real eigenvalues between −1 and 1 and so
I − M r has eigenvalues in [0, 2] and thus is positive semidefinite (PSD). The spectral norm of a real matrix
M, denoted kMk, is the largest singular value of M. That is, the square root of the largest eigenvalue
of M T M. When M is symmetric, kMk equals the largest eigenvalue of M in absolute value. For an
undirected graph G with adjacency matrix A and a positive integer k, we write k · G to denote the graph
with adjacency matrix k · A, i. e., the multigraph G with all edges duplicated to have multiplicity k.
    Given a symmetric matrix L, its Moore-Penrose Pseudoinverse, denoted L† , is the unique matrix with
the same eigenvectors as L such that for each eigenvalue λ of L, the corresponding eigenvalue of L† is
    2 Eulerian digraphs are directed graphs where the in-degree of each vertex equals its out-degree. Undirected graphs are a

special case of Eulerian digraphs.


                            T HEORY OF C OMPUTING, Volume 17 (4), 2021, pp. 1–35                                           5
                  JACK M URTAGH , O MER R EINGOLD , A ARON S IDFORD , AND S ALIL VADHAN

1/λ if λ 6= 0 and 0 otherwise. When L is a Laplacian, we write L†/2 to denote the unique symmetric PSD
matrix square root of the pseudoinverse of L.
   To measure the approximation between graphs we use spectral approximation3 [38]:
Definition 2.1. Let L, Le ∈ Rn×n be symmetric PSD matrices. We say that L
                                                                        e is an ε-approximation of L
         e ≈ε L) if for all vectors v ∈ R
(written L                                n


                                        (1 − ε) · vT Lv ≤ vT Lv
                                                             e ≤ (1 + ε) · vT Lv.

    Note that Definition 2.1 is not symmetric in L and L.
                                                       e Spectral approximation can also be written in
terms of the Loewner partial ordering of PSD matrices:

                                               (1 − ε) · L  L
                                                             e  (1 + ε) · L

where for two matrices A, B, we write A  B if B − A is PSD. Spectral approximation has a number of
useful properties listed in the following proposition.
Proposition 2.2. If W, X,Y, Z ∈ Rn×n are PSD symmetric matrices then:
   1. if X ≈ε Y for ε < 1 then Y ≈ε/(1−ε) X,

   2. if X ≈ε1 Y and Y ≈ε2 Z then X ≈ε1 +ε2 +ε1 ·ε2 Z,

   3. if X ≈ε Y and V is any n × n matrix then V T XV ≈ε V T YV ,

   4. if X ≈ε Y then X + Z ≈ε Y + Z,

   5. if W ≈ε1 X and Y ≈ε2 Z then W +Y ≈max{ε1 ,ε2 } X + Z, and

   6. if X ≈ε Y then c · X ≈ε c ·Y for all nonnegative scalars c
    Following [41, 6, 3], we measure how well-connected a graph is using the second eigenvalue.
Definition 2.3. Let G be a regular undirected graph with transition matrix T . Define
                               kT vk
              λ (G) := max           = 2nd largest absolute value of the eigenvalues of T ∈ [0, 1].
                          v⊥~1 kvk
                          v6=0

1 − λ (G) is called the spectral gap of G.
    The smaller λ (G), the faster a random walk on G converges to the uniform distribution. Graphs G
with λ (G) bounded away from 1 are called expanders. Expanders can equivalently be characterized as
graphs that spectrally approximate the complete graph. This is formalized in the next lemma.
Lemma 2.4. Let H be a c-regular undirected multigraph on n vertices with transition matrix T and let
J ∈ Rn×n be a matrix with 1/n in every entry (i. e., J is the transition matrix of the complete graph with a
self-loop on every vertex). Then λ (H) ≤ λ if and only if I − T ≈λ I − J.
                                                                               e ≈ε L if for all v ∈ Rn , e−ε · vT Lv ≤ vT Lv
   3 In [30], we use an alternative definition of spectral approximation where L                                           e ≤
eε · vT Lv. We find Definition 2.1 more convenient for this paper.


                            T HEORY OF C OMPUTING, Volume 17 (4), 2021, pp. 1–35                                            6
                  D ETERMINISTIC A PPROXIMATION OF R ANDOM WALKS IN S MALL S PACE

Proof. Let v1 , v2 , . . . , vn be the orthonormal eigenvectors of I − T where v1 is the uniform distribution (the
vector containing 1/n in every coordinate). Note that these are also eigenvectors of I − J as (I − J)v1 = ~0
and (I − J)vi = vi for all i 6= 1.
     Let V be a matrix with v1 , . . . , vn as columns. V is an orthogonal matrix so V T V = I. From Propo-
sition 2.2 Part 3, we have I − T ≈λ I − J if and only if V T (I − T )V ≈λ V T (I − J)V . V T (I − T )V is a
diagonal matrix with the eigenvalues 0, 1 − λ2 , . . . 1 − λn of I − T along the diagonal and V T (I − J)V
is a diagonal matrix with the eigenvalues 0, 1, . . . , 1 of I − J along the diagonal. Diagonal matrices
spectrally approximate each other if and only if their diagonal entries satisfy the spectral inequalities, so
V T (I − T )V ≈λ V T (I − J)V if and only if for all i ∈ {2, . . . , n}

                                        (1 − λ ) · 1 ≤ 1 − λi ≤ (1 + λ ) · 1.

The above holds if and only if |λi | ≤ λ for all i ∈ {2, . . . , n}, which is equivalent to λ (H) ≤ λ .

   In [15] Cohen, Kelner, Peebles, Peng, Rao, Sidford, and Vladu introduced a definition of spectral
approximation for asymmetric matrices. Although the results in our paper only concern undirected graphs,
some of our proofs use machinery from the theory of directed spectral approximation.
Definition 2.5 (Directed Laplacian [16, 15]). A matrix L ∈ Rn×n is called a directed Laplacian if Li j ≤ 0
for all i 6= j and Lii = − ∑ j6=i L ji for all i ∈ [n]. The associated directed graph has n vertices and an edge
(i, j) of weight −L ji for all i 6= j ∈ [n] with L ji 6= 0.

Definition 2.6 (Asymmetric Matrix Approximation [15]). Let L e and L be (possibly asymmetric) matrices
                    T
such that U = (L + L )/2 is PSD. We say that L
                                             e is a directed ε-approximation of L if:

   1. ker(U) ⊆ ker(L             e − L)T ), and
                   e − L) ∩ ker((L

   2.   U †/2 (L
               e − L)U †/2       ≤ ε.
                             2

   Below we state some useful lemmas about directed spectral approximation. The first gives an
equivalent formulation of Definition 2.6.
Lemma 2.7 ([15] Lemma 3.5). Let L ∈ Rn×n be a (possibly asymmetric) matrix and let U = (L + LT )/2.
         e is a directed ε-approximation of L if and only if for all vectors x, y ∈ Rn
A matrix L
                                                        ε
                                        xT (L
                                            e − L)y ≤     · (xT Ux + yT Uy).
                                                        2
                                    e is a directed ε-approximation of L and let U = (L + LT )/2
Lemma 2.8 ([15] Lemma 3.6). Suppose L
    e = (L
and U    e+L T          e ≈ε U.
            e )/2. Then U

    Lemma 2.8 says that directed spectral approximation implies the usual notion from Definition 2.1 for
“symmetrized” versions of the matrices L and L.
                                              e In fact, when the matrices L and Le are both symmetric,
the two definitions are equivalent:
Lemma 2.9. Let Le and L be symmetric PSD matrices. Then L
                                                        e is a directed ε-approximation of L if and
        e ≈ε L.
only if L

                         T HEORY OF C OMPUTING, Volume 17 (4), 2021, pp. 1–35                                   7
                JACK M URTAGH , O MER R EINGOLD , A ARON S IDFORD , AND S ALIL VADHAN

Proof. The “only if” direction follows from Lemma 2.8 by observing that U := (L + LT )/2 = L and
e := (L
U     e+L eT )/2 = L.
                    e
                                   e ≈ε L. Equivalently,
   For the “if” direction, suppose L

                                          −ε · L  L
                                                   e − L  ε · L.

Multiplying on the left and right by L†/2 preserves the ordering and yields

                            −ε · L†/2 LL†/2  L†/2 (L
                                                    e − L)L†/2  ε · L†/2 LL†/2 .
                                                                        √
For each nonzero eigenvalue λ of L, L†/2 has corresponding eigenvalue 1/ λ and hence L†/2 LL†/2 ≤ 1.
Since L is symmetric we have U := (L + LT )/2 = L. Combining this with the above gives

                                  U †/2 (L
                                         e − L)U †/2 = L†/2 (L
                                                             e − L)L†/2

                                                      ≤ ε · L†/2 LL†/2
                                                      ≤ ε.

      e ≈ε L by assumption, we must have ker(L) = ker(L).
Since L                                                                        e − L) = ker(L) and
                                                         e It follows that ker(L
     e     T
ker((L − L) ) = ker(L) since L and L are symmetric. Since U = L, we have
                                   e

                                  ker(U) = ker(L)
                                               e − L)
                                         = ker(L
                                                                  e T ).
                                                e − L) ∩ ker((L − L)
                                          = ker(L


2.2     Space-bounded computation
We use a standard model of space-bounded computation where the machine M has a read-only input
tape, a constant number of read/write work tapes, and a write-only output tape. If throughout every
computation on inputs of length at most n, M uses at most s(n) total tape cells on all the work tapes, we
say M runs in space s = s(n). Note that M may write more than s cells (in fact as many as 2O(s) ) but the
output tape is write-only. The following proposition describes the behavior of space complexity when
space bounded algorithms are composed.

Proposition 2.10. Let f1 , f2 be functions that can be computed in space s1 (n), s2 (n) ≥ log n, respectively,
and f1 has output of length at most `1 (n) on inputs of length n. Then f2 ◦ f1 can be computed in space

                                           O(s2 (`1 (n)) + s1 (n)).

      A proof of Proposition 2.10 can be found in [8] (Lemma 4.15).

                        T HEORY OF C OMPUTING, Volume 17 (4), 2021, pp. 1–35                                8
                  D ETERMINISTIC A PPROXIMATION OF R ANDOM WALKS IN S MALL S PACE

2.3    Rotation maps
In the space-bounded setting, it is convenient to use local descriptions of graphs. Such descriptions allow
us to navigate large graphs without loading them entirely into memory. For this we use rotation maps,
functions that describe graphs through their neighbor relations. Rotation maps are defined for graphs
with labeled edges as described in the following definition.

Definition 2.11 ([35]). A two-way labeling of an undirected multigraph G = (V, E) with vertex degrees
(dv )v∈V , is a labeling of the edges in G such that

    1. every edge (u, v) ∈ E has two labels: one in [du ] as an edge incident to u and one in [dv ] as an edge
       incident to v;

    2. for every vertex v ∈ V , the labels of the dv edges incident to v are distinct.

    In [36], two-way labelings are referred to as undirected two-way labelings. Note that every graph has
a two-way labeling where each vertex “names” its neighbors uniquely in some canonical way based on
the order they’re represented in the input. We will describe multigraphs with two-way labelings using
rotation maps:

Definition 2.12 ([35]). Let G be an undirected multigraph on n vertices with a two-way labeling. The
rotation map RotG is defined as follows: RotG (v, i) = (w, j) if the i-th edge to vertex v leads to vertex w
and this edge is the j-th edge incident to w.

    We will use expanders that have efficiently computable rotation maps. We call such graphs strongly
explicit. The usual definition of strong explicitness only refers to time complexity, but we will use it for
both time and space.

Definition 2.13. A family of two-way labeled graphs G = {Gn,c }(n,c) , where Gn,c is a c-regular graph on
n vertices, is called strongly explicit if given n, c, a vertex v ∈ [n] and an edge label a ∈ [c], RotGn,c (v, a)
can be computed in time poly(log(nc)) and space O(log nc).


3     The derandomized product and expanders of all sizes
In this section we introduce our derandomized graph product. The derandomized product generalizes
the derandomized square graph operation that was introduced by Rozenman and Vadhan [36] to give an
alternative proof that U NDIRECTED S-T C ONNECTIVITY is in L. Unlike the derandomized square, the
derandomized product is defined for irregular graphs and produces a sparse approximation to the product
of any two (potentially different) graphs with the same vertex degrees.
    Here, by the “product” of two graphs G0 , G1 , we mean the reversible Markov chain with transitions
defined as follows: from a vertex v, with probability 1/2 take a random step on G0 followed by a random
step on G1 and with probability 1/2 take a random step on G1 followed by a random step on G0 .
    When G0 = G1 = G, this is the same as taking a 2-step random walk on G. Note, however, that when
G is irregular, a 2-step random walk is not equivalent to doing a 1-step random walk on the graph G2
whose edges correspond to paths of length 2 in G. Indeed, even the stationary distribution of the random

                         T HEORY OF C OMPUTING, Volume 17 (4), 2021, pp. 1–35                                  9
                   JACK M URTAGH , O MER R EINGOLD , A ARON S IDFORD , AND S ALIL VADHAN

walk on G2 may be different than on G.4 If A is the adjacency matrix of G, then G2 has adjacency matrix
A2 and thus transition matrix A2 D−1 2 , where D2 is a diagonal matrix whose v’th entry is the sum of the
degrees of the neighbors of vertex v in G. In contrast, the two-step random walk on G has transition
matrix T 2 = (AD−1 )2 , where D is the diagonal matrix of degrees in G. Our goal in the derandomized
product is to produce a relatively sparse graph whose 1-step random walk approximates the 2-step random
walk on G.
    The intuition behind the derandomized product is as follows: rather than build a graph with every
such two-step walk, we use expander graphs to pick a pseudorandom subset of the walks. Specifically,
we first pick b ∈ {0, 1} at random. Then, as before we take a truly random step from v to u in Gb . But
for the second step, we don’t use an arbitrary edge leaving u in Gb̄ , but rather correlate it to the edge on
which we arrived at u using a c-regular expander on deg(u) vertices, where we assume that the vertex
degrees in G0 and G1 are the same. When c < deg(u), the vertex degrees of the resulting two-step graph
will be sparser than without derandomization. However using the pseudorandom properties of expander
graphs, we can argue that the derandomized product is a good approximation of the true product. The
combinatorial description of the derandomized product is as follows. For every vertex u ∈ [n], we place a
bipartite expander between the deg(u) neighbors of u in G0 and the deg(u) neighbors of u in G1 .
Definition 3.1 (Derandomized Product). Let G0 , G1 be undirected multigraphs on n vertices with two-
way labelings and identical vertex degrees d1 , d2 , . . . , dn . Let H = {Hi } be a family of two-way labeled,
c-regular expanders of sizes including d1 , . . . , dn . The derandomized product with respect to H, denoted
G0 p H G1 , is an undirected multigraph on n vertices with vertex degrees 2 · c · d1 , . . . , 2 · c · dn and
rotation map RotG0 p H G1 defined as follows: For v0 ∈ [n], j0 ∈ [dv0 ], a0 ∈ [c], and b ∈ {0, 1} we compute
RotG0 p H G1 (v0 , ( j0 , a0 , b)) as:
    1. let (v1 , j1 ) = RotGb (v0 , j0 ),

    2. let ( j2 , a1 ) = RotHdv ( j1 , a0 ),
                                 1

    3. let (v2 , j3 ) = RotGb̄ (v1 , j2 ),

    4. output (v2 , ( j3 , a1 , b̄)),
where b̄ denotes the bit-negation of b.
    Note that when G0 = G1 the derandomized product generalizes the derandomized square [36] to
irregular graphs, albeit with each edge duplicated twice. To see that G0 p H G1 is undirected, one can
check that RotG0 p H G1 (RotG0 p H G1 (v0 , ( j0 , a0 , b))) = (v0 , ( j0 , a0 , b))).
    Note that Definition 3.1 requires that each vertex i has the same degree di in G0 and G1 , ensuring
that the random walks on G0 , G1 , and G0 p H G1 all have the same stationary distribution. This can be
generalized to the case that there is an integer k such that for each vertex v with degree dv in G1 , v has
degree k · dv in G0 . For this, we can duplicate each edge in G1 k times to match the degrees of G0 and
then apply the derandomized product to the result. In such cases we abuse notation and write G0 p H G1
to mean G0 p H k · G1 .
   4 For example, let G be the graph on two vertices with one edge (u, v) connecting them and a single self-loop on u. Then

[2/3, 1/3] is the stationary distribution of G and [3/5, 2/5] is the stationary distribution of G2 .


                             T HEORY OF C OMPUTING, Volume 17 (4), 2021, pp. 1–35                                       10
                   D ETERMINISTIC A PPROXIMATION OF R ANDOM WALKS IN S MALL S PACE

   In [30] we showed that the derandomized square produces a spectral approximation to the true square.
We now show that the derandomized product also spectrally approximates a natural graph product.

Theorem 3.2. Let G0 , G1 be undirected multigraphs on n vertices with two-way labelings, and normalized
Laplacians I − M0 and I − M1 . Let G0 have vertex degrees d1 , . . . , dn and G1 have vertex degrees d10 , . . . , dn0
where for all i ∈ [n], di = k · di0 for a positive integer k. Let H = {Hi } be a family of two-way labeled,
c-regular expanders with λ (Hi ) ≤ λ for all Hi ∈ H, of sizes including d1 , . . . , dn . Let I − M  e be the
normalized Laplacian of G  e = G0 p H G1 . Then

                                            e ≈λ I − 1 · (M0 M1 + M1 M0 ).
                                         I −M
                                                     2
Proof of Theorem 3.2. Note that k · G1 has the same transition matrix and normalized Laplacian as G1 .
So we can replace G1 with k · G1 and assume k = 1 without loss of generality.
   Since G0 and G1 have the same vertex degrees, we can we write
                              1                              1
                           I − · (M0 M1 + M1 M0 ) = I − D−1/2 · (T0 T1 + T1 T0 )D1/2                                    (3.1)
                              2                              2
where T0 and T1 are the transition matrices of G0 and G1 , respectively.
     Following the proofs in [36] and [30], we can write the transition matrix for the random walk on
G as Te = 12 · (PR0 BR
e                    e 1 Q + PR1 BR
                                 e 0 Q), where each matrix corresponds to a step in the definition of the
derandomized product. The two terms correspond to b = 0 and b = 1 in the derandomized product and,
setting d¯ = ∑i∈[n] di :

    • Q is a d¯ × n matrix that “lifts” a probability distribution over [n] to one over [d] ¯ where the mass on
      each coordinate i ∈ [n] is divided uniformly over the corresponding degree di . That is, Q(u,i),v = 1/di
      if u = v and 0 otherwise where the rows of Q are ordered (1, 1), (1, 2), . . . , (1, d1 ), (2, 1), . . . , (2, d2 ),
      . . . (n, 1), . . . , (n, dn ).

    • R0 and R1 are the d¯ × d¯ symmetric permutation matrices corresponding to the rotation maps of G0
      and G1 , respectively. That is, entry (u, i), (v, j) in Ra is 1 if RotGa (u, i) = (v, j) and 0 otherwise for
      a ∈ {0, 1}.

    • Be is a d¯ × d¯ symmetric block-diagonal matrix with n blocks where block i is the transition matrix
      for the random walk on Hdi ∈ H, the expander in our family with di vertices.

    • P = DQT is the n × d¯ matrix that maps any d-vector  ¯          to an n-vector by summing all the entries
      corresponding to edges incident to the same vertex in G0 and G1 . This corresponds to projecting a
                        ¯ back down to a distribution over [n]. Pv,(u,i) = 1 if u = v and 0 otherwise where
      distribution on [d]
      the columns of P are ordered (1, 1), (1, 2), . . . , (1, d1 ), (2, 1), . . . , (2, d2 ), . . . (n, 1), . . . , (n, dn ).

Likewise, we can write
                                     (T0 T1 + T1 T0 ) = (PR0 JR
                                                             e 1 Q + PR1 JR
                                                                         e 0 Q)                                         (3.2)
where Jeis a d¯× d¯ symmetric block-diagonal matrix with n blocks where block i is Ji , the transition matrix
for the complete graph on di vertices with a self-loop on every vertex. That is, every entry of Ji is 1/di .

                           T HEORY OF C OMPUTING, Volume 17 (4), 2021, pp. 1–35                                            11
                JACK M URTAGH , O MER R EINGOLD , A ARON S IDFORD , AND S ALIL VADHAN

    We will show that
                              1                               1       ¯ 1 + R1 JR
                                                                                ¯ 0 ).
                         Id¯ − · (R0 B̄R1 + R1 B̄R0 ) ≈λ Id¯ − · (R0 JR
                              2                               2
From this the theorem follows by multiplying by D−1/2 P on the left and (D−1/2 P)T = QD1/2 on the right
and applying Proposition 2.2 Part 3. Since D−1/2 PQD1/2 = In , the left-hand side becomes

                                   In − D−1/2 TeD1/2 = In − D
                                                            e −1/2 TeD
                                                                     e 1/2
                                                     = In − M
                                                            e

where De = 2 · c · D is the diagonal matrix of vertex degrees of G.   e By Equation (3.1) and Equation (3.2),
                                      1
the right-hand side becomes In − 2 (M0 M1 + M1 M0 ).
    By Lemma 2.4, each graph in H is a λ -approximation of the complete graph on the same number of
vertices. It follows that Id¯ − Be ≈λ Id¯ − Jebecause the quadratic form of a block diagonal matrix equals the
sum of the quadratic forms of its blocks. By Lemma 2.9 and the fact that Id − Jeis PSD, Id¯ − Be is also a
directed λ -approximation of Id¯ − J.  e So for all vectors x, y ∈ Rd¯ we have

                                        e ≤ λ · (xT (Id¯ − J)x
                               xT (Be − J)y                e + yT (Id¯ − J)y)
                                                                         e
                                             2
                                             λ
                                            ≤ · (xT x + yT y − 2xT Jy).
                                                                   e
                                             2
The first inequality uses Lemma 2.7. We can add the absolute values on the left-hand side since the
right-hand side is always nonnegative (Id − Jeis PSD) and invariant to swapping x with −x. The second
inequality follows from the fact that Jeis PSD and so

                              0 ≤ (x − y)T J(x
                                           e − y) = xT Jx
                                                       e + yT Jy
                                                              e − 2 · xT Jy.
                                                                         e
           ¯
Fix v ∈ Rd and set x = R0 v and y = R1 v. Recall that R0 and R1 are symmetric permutation matrices
and hence R20 = R21 = Id¯. Also note that for all square matrices A and vectors x, xT Ax = xT (A + AT )x/2.
Combining these observations with the above gives
                                                 
           T 1                                e 0 v = vT R0 (Be − J)R
          v      · R0 (Be − J)R
                              e 1 + R1 (Be − J)R                        e 1v
               2
                                                           λ
                                                         ≤ · (vT R20 v + vT R21 v − 2vT R0 JR
                                                                                           e 1 v)
                                                           2
                                                         = λ · (vT v − vT R0 JR
                                                                             e 1 v)
                                                                                             
                                                                T       1  e
                                                         = λ · v I − · R0 JR1 + R1 JR0 v. e
                                                                        2
Rearranging the above shows that
                              1                               1       ¯ 1 + R1 JR
                                                                                ¯ 0 ),
                         Id¯ − · (R0 B̄R1 + R1 B̄R0 ) ≈λ Id¯ − · (R0 JR
                              2                               2
which proves the theorem.

                        T HEORY OF C OMPUTING, Volume 17 (4), 2021, pp. 1–35                               12
                  D ETERMINISTIC A PPROXIMATION OF R ANDOM WALKS IN S MALL S PACE

     Note that for a graph G with normalized Laplacian I − M and transition matrix T , approximating
I − 12 · (M0 M1 + M1 M0 ) as in Theorem 3.2 for M0 = M k0 and M1 = M k1 gives a form of approximation to
random walks of length k1 + k2 on G, as

                                I − T k1 +k2 = D1/2 (I − M k1 +k2 )D−1/2
                                                  1
                                             = I − · D1/2 (M0 M1 + M1 M0 )D−1/2 .
                                                  2
    To apply the derandomized product, we need an expander family H with sizes equal to all of the
vertex degrees. However, existing constructions of strongly explicit expander families only give graphs of
sizes that are subsets of N such as all powers of 2 or all perfect squares. In [36, 30] this was handled by
adding self-loops to make the vertex degrees all equal and matching the sizes of expanders in explicit
families. Adding self-loops was acceptable in those articles because it does not affect connectivity (the
focus of [36]) or the Laplacian (the focus of [30]). However it does affect long random walks (our focus),
so we cannot add self-loops. Instead, we show how to obtain strongly explicit expanders of all sizes. Our
construction works by starting with a strongly explicit expander from one of the existing constructions
and merging vertices to achieve any desired size:

Theorem 3.3. There exists a family of strongly explicit expanders H such that for all n > 1 and λ ∈ (0, 1)
there is a c = poly(1/λ ) and a c-regular graph Hn,c ∈ H on n vertices with λ (Hn,c ) ≤ λ .

Proof. Let H 0 be a c0 -regular expander on n̂ vertices such that n ≤ n̂ ≤ 2n, c0 is a constant independent
of n and λ (H 0 ) ≤ λ 0 < 1/4. H 0 can be constructed using already known strongly explicit constructions
such as [18, 35] followed by squaring the graph a constant number of times to achieve λ 0 < 1/4. We will
construct H as follows: Pair off the first (n̂ − n) vertices with the last (n̂ − n) vertices in H 0 and merge
each pair into a single vertex (which will then have degree 2 · c0 ). To make the graph regular, add c0
self-loops to all of the unpaired vertices. More precisely, given u0 ∈ [n] and i0 ∈ [c] = [2 · c0 ] we compute
RotH (u0 , i0 ) as follows:

   1. If 1 ≤ u0 ≤ n̂ − n [u0 is a paired vertex]:

        (a) If 1 ≤ i0 ≤ c0 , let u = u0 , i = i0 [u0 is the first vertex in pair]
        (b) else let u = n̂ − u0 , i = i0 − c0 [u0 is the second vertex in pair]
        (c) let (v, j) = RotH 0 (u, i)

   2. else (if n̂ − n < u0 ≤ n ) [u0 is an unpaired vertex]

        (a) If 1 ≤ i0 ≤ c0 , let u = u0 , i = i0 , and (v, j) = RotH (u, j) [original edge]
        (b) else let (v, j) = (u0 , i0 ) [new self-loop]

   3.   (a) If v ≤ n, let (v0 , j0 ) = (v, j)
        (b) else let v0 = n̂ − v, j0 = j + c0 .

   4. Output (v0 , j0 )

                          T HEORY OF C OMPUTING, Volume 17 (4), 2021, pp. 1–35                             13
                JACK M URTAGH , O MER R EINGOLD , A ARON S IDFORD , AND S ALIL VADHAN

    Notice from the description above that RotH can be computed in O(log n) space. Next we show that
λ (H) is bounded below 1 by a constant. The theorem then follows by taking the O(log 1/λ )-th power
of H to drive λ (H) below λ . This gives the graph degree c = poly(1/λ ). Since H has constant degree,
edges in H O(log(1/λ )) can be represented in O(log(1/λ )) = O(log c) space as a path of length log(1/λ ) in
H. Hence, computing neighbors in H O(log(1/λ )) can be done in the O(log n) space to compute neighbors
in H plus the O(log n + log c) space to follow a path of length log(1/λ ) in H for a total of O(log(n · c))
space as desired.
    Let A0 be the adjacency matrix of H 0 and K 0 be the n̂× n̂ all ones matrix. Since λ (H 0 ) ≤ λ 0 , Lemma 2.4
implies that
                                      1                       1
                                         · (c0 · I − A0 ) ≈λ 0 · (n̂ · I − K 0 ).
                                      c0                      n̂
Define B to be the n̂ × n matrix such that Bu0 ,u = 1 if and only if vertex u0 ∈ V (H 0 ) was merged into vertex
u ∈ V (H) or vertex u ∈ V (H 0 ) was not merged and is labeled vertex u0 in H. That is, Bu0 ,u = 1 if and only
if u = u0 or n ≤ u = n̂ − u0 . Then the unnormalized Laplacian of the expander after the merging step is
BT (c0 · I − A0 )B. Adding self-loops to a graph does not change its Laplacian. So applying Proposition 2.2
parts 3 and 6 we get
                                      1                          1
                            L(H) = 0 · BT (c0 · I − A0 )B ≈λ 0      · BT (n̂ · I − K)B.
                                     2c                         2n̂
Note that the righthand side is the normalized Laplacian of the graph U that results from starting with
the complete graph on n̂ vertices, merging the same pairs of vertices that are merged in H and adding n̂
self-loops to all of the unmerged vertices for regularity.
    We finish the proof by showing that λ (U) ≤ 1/2 and thus H is a (λ 0 + 1/2 + λ 0 /2)-approximation
of the complete graph by Proposition 2.2 Part 2 and Lemma 2.4. Recalling that λ 0 < 1/4 completes the
proof.
    U has at least n̂ edges between every pair of vertices so we can write its transition matrix Tu as

                                                        1        1
                                              Tu =        · Jn̂ + · E
                                                        2        2
where Jn̂ is the transition matrix of the complete graph on n̂ vertices with self-loops on every vertex and
E is the transition matrix for an n̂-regular multigraph. Since E is the transition matrix of a regular graph,
we have kEk ≤ 1 and thus

                                                      kTu vk
                                       λ (U) = sup
                                                 v⊥~1
                                                       kvk
                                                        1
                                                            · (kJn̂ vk + kEvk)
                                             ≤ sup 2
                                                 v⊥~1
                                                                    kvk
                                                 1      1
                                             ≤     · 0 + · 1,
                                                 2      2

which completes the proof.

                        T HEORY OF C OMPUTING, Volume 17 (4), 2021, pp. 1–35                                  14
                   D ETERMINISTIC A PPROXIMATION OF R ANDOM WALKS IN S MALL S PACE

4      Main result
In this section we prove Theorem 1.4, our main result regarding space bounded computation of the
normalized Laplacian of the r-step random walk on G.

Theorem 1.4 (restated). There is a deterministic algorithm that given an undirected multigraph G on n
vertices with normalized Laplacian I − M, a nonnegative integer r, and ε > 0, constructs an undirected
multigraph G e whose normalized Laplacian L e is an ε-spectral approximation of L = I − M r . That is, for
all vectors v ∈ Rn

                               (1 − ε) · vT Lv ≤ vT Lv
                                                    e ≤ (1 + ε) · vT Lv.

The algorithm runs in space

                               O(log N + (log r) · log(1/ε) + (log r) · log log r),

where N is the bit length of the input graph G.

    The algorithm described below is inspired by techniques used in [14] to approximate random walks
with a randomized algorithm in nearly linear time. Our analyses use ideas from the work of Cohen,
Kelner, Peebles, Peng, Rao, Sidford, and Vladu on directed Laplacian system solvers even though all of
the graphs we work with are undirected.

4.1     Algorithm description and proof overview
Let I − M be the normalized Laplacian of our input and r be the target power. We will first describe an
algorithm for computing I − M r without regard for space complexity and then convert it into a space-
efficient approximation algorithm. The algorithm iteratively approximates larger and larger powers of M.
On a given iteration, we will have computed I − M k for some k < r and we use the following operations
to increase k:

      • Square: I − M k → I − M 2k ,

      • Plus one: I − M k → I − 12 · (M · M k + M k · M) = I − M k+1 .

     Interleaving these two operations appropriately can produce any power r of M, invoking each
operation at most log2 r times. To see this, let bz bz−1 . . . b0 be the bits of r in its binary representation
where b0 is the least significant bit and bz = 1 is the most significant. We are given I − M = I − M bz .
The algorithm will have z iterations and each one will add one more bit from most significant to least
significant to the binary representation of the exponent. So after iteration i we will have I − M bz bz−1 ...bz−i .
     For iterations 1, . . . , z, we read the bits of r from bz−1 to b0 one at a time. On each iteration we start
with some power I − M k . If the corresponding bit is a 0, we square to create I − M 2k (which adds a 0 to
the binary representation of the current exponent) and proceed to the next iteration. If the corresponding
bit is a 1, we square and then invoke a plus one operation to produce I − M 2k+1 (which adds a 1 to the
binary representation of the current exponent). After iteration z we will have I − M r .
     Implemented recursively, this algorithm has log2 r levels of recursion and uses O(log N) space at each
level for the matrix multiplications, where N is the bit length of the input graph. This results in total space

                          T HEORY OF C OMPUTING, Volume 17 (4), 2021, pp. 1–35                                 15
               JACK M URTAGH , O MER R EINGOLD , A ARON S IDFORD , AND S ALIL VADHAN

O(log r · log N), which is more than we want to use (cf. Theorem 1.4). We reduce the space complexity
by replacing each square and plus one operation with the corresponding derandomized product, discussed
in Section 3.
    Theorem 3.2 says that the derandomized product produces spectral approximations to the square and
the plus one operation. Since we apply these operations repeatedly on successive approximations, we
need to maintain our ultimate approximation to a power of I − M. In other words, we need to show that
given Ge such that

                                                 e ≈ε I − M k
                                              I −M

we have:


          e 2 ≈ε I − M 2k ,
   1. I − M


   2. I − 21 · (M M
                  e + MM)
                      e   ≈ε I − M k+1 .


We prove these in Lemma 4.1 and Lemma 4.5. The transitive property of spectral approximation
(Proposition 2.2 Part 2) will then complete the proof of spectral approximation.
    We only know how to prove items 1 and 2 when M k is PSD. This is problematic because M is not
guaranteed to be PSD for arbitrary graphs and so M k may only be PSD when k is even. Simple solutions
like adding self-loops (to make the random walk lazy) are not available to us because loops may affect the
random walk behavior in unpredictable ways. Another attempt would be to replace the plus one operation
in the algorithm with a “plus two” operation


   • Plus two: I − M k → I − 12 · (M 2 · M k + M k · M 2 ) = I − M k+2 .


Interleaving the square and plus two would preserve the positive semidefiniteness of the matrix we’re
approximating and can produce any even power of M. If r is odd, we could finish with one plus one
operation, which will produce a spectral approximation because I − M r−1 is PSD. A problem with this
approach is that the derandomized product is defined only for unweighted multigraphs and M 2 may not
correspond to an unweighted multigraph when G is irregular. (When G is regular, the graph G2 consisting
of paths of length 2 in G does have normalized Laplacian I − M 2 .)
    For this reason we begin the algorithm by constructing an unweighted multigraph G0 whose normal-
ized Laplacian I − M0 approximates I − M 2 and where M0 is PSD. We can then approximate any power
       0
I − M0r using the square and plus one operation and hence can approximate I − M r for any even r (see
Lemma 4.6). For odd powers, we again can finish with a single plus one operation.
    Our main algorithm is presented below. Our input is an undirected two-way labeled multigraph G
with normalized Laplacian I − M, ε ∈ (0, 1), and r = bz bz−1 . . . b1 b0 .

                       T HEORY OF C OMPUTING, Volume 17 (4), 2021, pp. 1–35                            16
                  D ETERMINISTIC A PPROXIMATION OF R ANDOM WALKS IN S MALL S PACE



      Algorithm 4.1
      Input: G with normalized Laplacian I − M, ε ∈ (0, 1), r = bz bz−1 . . . b1 b0
      Output: Gz with normalized Laplacian I − Mz such that I − Mz ≈ε I − M r

         1. Set µ = ε/(32 · z)

         2. Let H be family of expanders of every size such that λ (H) ≤ µ for all H ∈ H.

         3. Construct G0 such that I − M0 ≈ε/(16·z) I − M 2 and M0 is PSD.

         4. For i in {1, . . . , z − 1}

              (a) If bz−i = 0, Gi = Gi−1 p H Gi−1
              (b) Else Gi = (Gi−1 p H Gi−1 ) p H G0

         5. If b0 = 0 (r even), Gz = Gz−1

         6. Else (r is odd), Gz = Gz−1 p H G

         7. Output Gz


    Note that each derandomized product multiplies every vertex degree by a factor of 2 · c. So the degrees
of G, G0 , . . . , Gz are all proportional to one another and the derandomized products in Algorithm 4.1 are
well-defined.

4.2    Proof of main result
In this section we prove Theorem 1.4 by showing that Algorithm 4.1 yields a spectral approximation of
our target power I − M r and can be implemented space-efficiently. First we show that our two operations,
square and plus one, preserve spectral approximation.
                                         e be symmetric matrices such that I − N
Lemma 4.1 (Adapted from [14]). Let N and N                                     e ≈ε I − N and N is
                2       2
PSD, then I − N ≈ε I − N .
              e

   The proof of Lemma 4.1 is adapted from Cheng, Cheng, Liu, Peng, and Teng [14], which uses ideas
from the work of Miller and Peng [29]. We present these arguments here for the sake of completeness.
We begin with three claims.
                     e be symmetric matrices. If I − N
Claim 4.2. Let N and N                               e ≈ε I − N and N is PSD then I + N
                                                                                      e ≈ε I + N

Proof. By definition, the hypothesis is equivalent to

                                 −ε · (I − N)  (I − N)
                                                     e − (I − N)  ε · (I − N),

which is equivalent to
                                 −ε · (I − N)  (I + N)
                                                     e − (I + N)  ε · (I − N).

                          T HEORY OF C OMPUTING, Volume 17 (4), 2021, pp. 1–35                           17
                  JACK M URTAGH , O MER R EINGOLD , A ARON S IDFORD , AND S ALIL VADHAN

Since N is PSD, we have I − N  I + N. Thus we have

                                   −ε · (I + N)  (I + N) − (I + N)
                                                                 e  ε · (I + N),

                           e ≈ε I + N.
which is equivalent to I + N

                    e are symmetric matrices such that I − N
Claim 4.3. If N and N                                      e ≈ε I − N and I + N
                                                                              e ≈ε I + N then
                                                                                            
                                                 I  −Ne                            I −N
                                                                       ≈ε                          .
                                                −Ne  I                            −N  I

Proof. We follow the proof of Lemma 4.4 in [14]. For all vectors x, y and symmetric matrices U,
                T                          
             x           I     −U           x           1
                                                          · (x + y)T (I −U)(x + y) + (x − y)T (I +U)(x − y) .
                                                                                                           
                                                    =
             y          −U      I           y           2

                                          e and our assumptions that I − N
The claim follows by taking U = N and U = N                              e ≈ε I − N and I + N
                                                                                            e ≈ε
I + N.

Claim 4.4. Let N be a symmetric matrix. Then
                                                                          T                           
                                   T            2                  x               I −N                x
                                  x (I − N )x = min                                                            .
                                                          y        y              −N  I                y

Proof. This claim follows from Lemma B.2 in Miller and Peng [29], where it is stated in terms of the
Schur complement. We modify the argument here without appealing to the Schur complement.
                                 T                                
                              x             I −N               x
                                                                           = kxk2 + kyk2 − 2 · hy, Nxi
                              y            −N  I               y
                                                                           ≥ kxk2 + kyk2 − 2 · kyk · kNxk
                                                                           ≥ kxk2 − kNxk2
                                                                           = xT (I − N 2 )x.

The first inequality follows from Cauchy-Schwarz and is tight if and only if y = c · Nx for some scalar c.
The second inequality follows from the fact that z2 − 2 · kNxk · z is minimized at z = kNxk. So equality is
achieved when y = Nx.

    Now we can prove Lemma 4.1.

Proof. Let
                                                                                                                
                                            I −N                                               I  −Ne
                              N :=                                     and         e :=
                                                                                   N                                   .
                                           −N  I                                              −Ne  I

                             T HEORY OF C OMPUTING, Volume 17 (4), 2021, pp. 1–35                                          18
                  D ETERMINISTIC A PPROXIMATION OF R ANDOM WALKS IN S MALL S PACE

                                                         e ≈ε I + N and hence that N
From our assumptions, it follows from Claim 4.2 that I + N                         e ≈ε N by Claim 4.3.
Combining this with Claim 4.4 gives that for all vectors x,
                                                                     T     
                                        T                          x       x
                              (1 − ε) · x (I − N)x ≤ (1 − ε) · e        N e
                                                                  Nx       Nx
                                                           T         
                                                         x            x
                                                   ≤ e          N
                                                                e
                                                        Nx           Nx
                                                                     e
                                                             e 2 )x.
                                                   = xT (I − N

Similarly:
                                                                    T     
                                        T       2                 x        x
                             (1 + ε) · x (I − N )x = (1 + ε) ·         N
                                                                 Nx       Nx
                                                          T         
                                                         x          x
                                                   ≥            N
                                                                e
                                                        Nx          Nx
                                                      T        2
                                                   ≥ x (I − Ne )x.

       e 2 ≈ε I − N 2 .
So I − N

    Next we show that the plus one operation in our algorithm also preserves spectral approximation.

Lemma 4.5. Let N,
               e N1 , and N2 be symmetric matrices with spectral norm at most 1 and suppose that N1
is PSD and commutes with N2 . If I − N
                                     e ≈ε I − N1 then

                                          1 e
                                       I − · (NN2 + N2 N) ≈ε I − N2 N1 .
                                                       e
                                          2

    Even though I − (1/2) · (NN          e and I − N2 N1 are both symmetric matrices (since N1 and
                               e 2 + N2 N)
N2 commute), the proof of Lemma 4.5 uses machinery from the theory of spectral approximation for
asymmetric matrices. This is because the proof first shows that I − N2 N
                                                                       e is an ε-approximation of I − N2 N1
in the asymmetric sense (Definition 2.6). The asymmetric definition is necessary for this claim because
I − N2 N
       e can be an asymmetric matrix, so the usual notion of spectral approximation does not apply. We
are not aware of any proofs of Lemma 4.5 that circumvent this application of asymmetric definitions.

Proof. Since I − N e ≈ε I − N1 and I − N1 is PSD (since kN1 k ≤ 1), Lemma 2.7 and Lemma 2.9 say that
                n
for all x, y ∈ R ,
                                 e − N1 )y ≤ ε · (xT (I − N1 )x + yT (I − N1 )y).
                             xT (N
                                             2
It follows (replacing x with N2 x) that for all vectors x, y ∈ Rn ,

                                 e − N2 N1 )y ≤ ε · (xT N2 (I − N1 )N2 x + yT (I − N1 )y).
                          xT (N2 N
                                                2

                          T HEORY OF C OMPUTING, Volume 17 (4), 2021, pp. 1–35                          19
                 JACK M URTAGH , O MER R EINGOLD , A ARON S IDFORD , AND S ALIL VADHAN

We now claim that N2 (I − N1 )N2  I − N1  I − N2 N1 . If the claim is true then the above implies that for
all vectors x, y ∈ Rn

                                 e − N2 N1 )y ≤ ε · (xT (I − N2 N1 )x + yT (I − N2 N1 )y),
                          xT (N2 N
                                                2
which by Lemma 2.7 implies

                                (I − N2 N1 )†/2 (N2 N
                                                    e − N2 N1 )(I − N2 N1 )†/2 ≤ ε.

Since for all matrices U it is the case that kU +U T k/2 ≤ kUk/2 + kU T k/2 = kUk we have
                                                                   
                             †/2      1 e
                 (I − N2 N1 )      I − · (NN2 + N2 N) − (I − N2 N1 ) (I − N2 N1 )†/2 ≤ ε.
                                                    e
                                      2

Therefore I − 12 · (NN        e is a directed ε-approximation of I − N2 N1 . Since these matrices are
                     e 2 + N2 N)
symmetric (N2 and N1 commute), Lemma 2.9 says that the approximation also holds in the undirected
sense.
     It remains to show that N2 (I − N1 )N2  I − N1  I − N2 N1 . Since N1 and N2 commute, they have the
same eigenvectors [22]. So the inequalities reduce to inequalities of their corresponding eigenvalues. Let
λ1 , λ2 be eigenvalues corresponding to the same eigenvector of N1 , N2 , respectively. Since kN1 k, kN2 k ≤ 1,
we have 1 − λ1 ≥ 0 and |λ2 | ≤ 1, so

                                            λ2 · (1 − λ1 ) · λ2 ≤ 1 − λ1 .

This proves that N2 (I − N1 )N2  I − N1 . Also, since N1 is PSD we have λ1 ≥ 0 and thus

                                                1 − λ1 ≤ 1 − λ2 · λ1

because |λ2 | ≤ 1. Thus I − N1  I − N2 N1 and the proof is complete.

    Setting N1 = M k and N2 = M in Lemma 4.5 shows that the plus one operation preserves spectral
approximation whenever M k is PSD. Recall that the first step in Algorithm 4.1 is to construct a graph G0
with normalized Laplacian I − M0 such that M0 is PSD and I − M0 approximates I − M 2 . We can then
approximate I − M0k for any k using squaring and plus one because M0k will always be PSD. The following
Lemma says that I − M0k spectrally approximates I − M 2k .

Lemma 4.6. Let r be a positive integer with bit length `(r) and A and B be symmetric PSD matrices with
kAk, kBk ≤ 1 such that I − A ≈ε I − B and I − B ≈ε I − A for ε ≤ 1/(2 · `(r)). Then I − Ar ≈2·ε·`(r) I − Br .

Proof. Let bz bz−1 . . . b0 be the bits in the binary representation of r with bz = 1 being the most significant
bit and b0 being the least. For each i ∈ {0, . . . , z}, let ri be the integer with binary representation
bz bz−1 . . . bz−i .
     The proof is by induction on i. As a base case, we have I − Abz ≈ε I − Bbz . We will show for the
inductive step that for each i ∈ {1, . . . , z}, if I − Ari−1 ≈2·(i−1)·ε I − Bri−1 then I − Ari ≈2·i·ε I − Bri . Assume
I − Ari−1 ≈2·(i−1)·ε I − Bri−1 . By Lemma 4.1, we have

                         T HEORY OF C OMPUTING, Volume 17 (4), 2021, pp. 1–35                                       20
                 D ETERMINISTIC A PPROXIMATION OF R ANDOM WALKS IN S MALL S PACE



                                           I − A2·ri−1 ≈2·(i−1)·ε I − B2·ri−1 .

If bi = 0 then 2 · ri−1 = ri so I − Ari ≈2·(i−1)·ε I − Bri . If bi = 1, then by applying Lemma 4.5 we have

                           1
                        I − (A2·ri−1 B + BA2·ri−1 ) ≈2·(i−1)·ε I − B2·ri−1 +1 = I − Bri .
                           2
Applying Lemma 4.5 again (recalling that I − B ≈ε I − A and A is PSD) gives

                               1
                            I − (A2·ri−1 B + BA2·ri−1 ) ≈ε I − A2·ri−1 +1 = I − Ari .
                               2
By Proposition 2.2 Part 1, this implies

                                                     1
                                 I − Ari ≈ε/(1−ε) I − (A2·ri−1 B + BA2·ri−1 ).
                                                     2

Putting this together with Proposition 2.2 Part 2 we have that I − Ari is an (ε/(1 − ε) + 2 · (i − 1) · ε + 2 ·
(i − 1) · ε 2 /(1 − ε))-approximation of I − Bri . Note that

                                          ε + 2 · (i − 1) · ε 2                     2 · ε − 2 · ε2
                      2 · (i − 1) · ε +                         ≤ 2 · (i − 1) · ε +
                                                1−ε                                     1−ε
                                                                = 2·i·ε

where the first inequality follows from our assumption that ε ≤ 1/(2 · `(r)). So I − Ari ≈2·i·ε I − Bri ,
completing the inductive step. This shows that when i = `(r), I − Ar2·`(r)·ε I − Br , as desired.

    Now we can prove Theorem 1.4. We prove the theorem with three lemmas: Lemma 4.7 shows how
to construct the graph G0 needed in Algorithm 4.1, Lemma 4.8 argues that the algorithm produces a
spectral approximation to I − M r , and Lemma 4.9 shows that the algorithm can be implemented in space
O(log N + (log r) · log(1/ε) + (log r) · log log r).


4.2.1   Building G0

Lemma 4.7. There is an algorithm that takes an undirected, unweighted multigraph G with normalized
Laplacian I − M and a parameter ε > 0, and outputs a rotation map RotG0 for an undirected, unweighted
multigraph G0 with a two-way labeling and normalized Laplacian I − M0 such that:

   1. M0 is PSD,

   2. I − M0 ≈ε I − M 2 , and

   3. the algorithm uses space O(log N + log(1/ε)), where N is the bit length of the input graph G.

                        T HEORY OF C OMPUTING, Volume 17 (4), 2021, pp. 1–35                                 21
                JACK M URTAGH , O MER R EINGOLD , A ARON S IDFORD , AND S ALIL VADHAN

Proof. Let δ = 1/d4/εe and t = 1/δ , an integer. Let H be a family of c-regular expanders of every size
from Theorem 3.3, such that for every H ∈ H, λ (H) ≤ δ (and hence c = poly(1/δ )).
    Let G e = G p H G be the derandomized square of G with normalized Laplacian I − M.        e Each vertex v
in G has degree dv = 2 · c · dv , where dv is the degree of v in G. We construct G0 as follows: duplicate
   e               e
every edge of G e to have multiplicity t and then for each vertex v, add dev self-loops. So for each vertex v
in G0 , v has degree (t + 1) · 2 · c · dv and hence G0 has the same stationary distribution as G. Note that we
can write
                                              M0 = (t · M
                                                        e + I)/(t + 1).

                                                            e ≈δ I − M 2 , so I − M
First we show that M0 is PSD. From Theorem 3.2, we have I − M                     e  (1 + δ ) · (I −
  2                        2
M )  (1 + δ ) · I, since M is PSD. Thus M  −δ · I and
                                         e

                                                t · (−δ · I) + I
                                         M0                      0.
                                                      t +1

    Next we prove that I − M0 ≈ε I − M 2

                                       I − M0 = (t/(t + 1)) · (I − M)
                                                                   e
                                                       
                                                    1
                                              =           · (I − M)
                                                                  e
                                                  1+δ
                                                I − M2.

                       e ≈δ I − M 2 , we also have
Observe that since I − M
                                                     
                                                   1
                                       I − M0 =         · (I − M)
                                                               e
                                                  1+δ
                                                     
                                                  1−δ
                                                       · (I − M 2 )
                                                  1+δ
                                               (1 − ε) · (I − M 2 ).

    We can construct a two-way labeling of G in space O(log N) by arbitrarily numbering the edges
incident to each vertex. Computing RotGe involves computing RotG twice and the rotation map of an
expander in H once. For a given vertex degree d in G, RotHd can be computed in space O(log(d · c)) =
O(log N + log(1/ε)). Duplicating the edges and adding self-loops for RotG0 adds at most O(log N +
log(1/ε)) overhead for a total of O(log N + log(1/ε)) space.


4.2.2   Proof of spectral approximation

Lemma 4.8. Let G be an undirected multigraph with normalized Laplacian I − M, r be a positive integer
and ε ∈ (0, 1). Let Gz be the output of Algorithm 4.1 with normalized Laplacian I − Mz . Then

                                             I − Mz ≈ε I − M r .

                        T HEORY OF C OMPUTING, Volume 17 (4), 2021, pp. 1–35                               22
                  D ETERMINISTIC A PPROXIMATION OF R ANDOM WALKS IN S MALL S PACE

Proof. Let bz bz−1 . . . b1 b0 be the binary representation of r. Recall that for the derandomized products in
our algorithm we use a family of c-regular expanders H from Theorem 3.3 such that for every H ∈ H,
λ (H) ≤ µ = ε/(32 · z) (and hence c = poly(1/µ) = poly((log r)/ε)).
    We construct G0 with normalized Laplacian I − M0 as in Lemma 4.7 such that M0 is PSD and
I − M0 ≈ε/(16·z) I − M 2 . By Proposition 2.2 Part 1, and the fact that

                                            ε/(16 · z)          ε
                                                        =
                                          1 − ε/(16 · z) (16 · z) − ε
                                                           ε
                                                        ≤     ,
                                                          8·z

we also have I − M 2 ≈ε/(8·z) I − M0 .
    For each i ∈ {0, . . . z} let ri be the integer with binary representation bz bz−1 . . . bz−i and let I − Mi be
the normalized Laplacian of Gi . We will prove by induction on i that Gi is a (4 · µ · i)-approximation to
                                                                          r
I − M0ri . Thus, Gz−1 is a 4 · µ · (z − 1) ≤ ε/8-approximation to I − M0z−1 .
                                                                                                              r
    The base case is trivial since r0 = 1. For the induction step, suppose that I −Mi−1 ≈4·µ·(i−1) I −M0z−i+1 .
On iteration i, if bz−i = 0, then Gi = Gi−1 p H Gi−1 . So we have
                                                         2
                                          I − Mi ≈µ I − Mi−1
                                                                      2·r
                                                   ≈4·µ·(i−1) I − M0 i−1
                                                   = I − M0ri

where the first approximation uses Theorem 3.2 and the second uses Lemma 4.1. By Proposition 2.2 Part
2 this implies that I − Mi approximates I − M0ri with approximation factor

                                  µ + 4 · µ · (i − 1) + 4 · µ 2 · (i − 1) ≤ 4 · µ · i

where we used the fact that µ < 1/(32 · (i − 1)).
   If bz−i = 1, Gi = (Gi−1 p H Gi−1 ) p H G0 . Let I − Mds be the normalized Laplacian of Gi−1 p H Gi−1 .
                                                                                                    2·r
By the analysis above, I − Mds is a (µ + 4 · µ · (i − 1) + 4 · µ 2 · (i − 1))-approximation of I − M0 i−1 . By
Theorem 3.2 and Lemma 4.5 we have
                                               1
                                 I − Mi ≈µ I − · (Mds M0 + M0 Mds )
                                               2
                                                                       2·r
                                        ≈µ+4·µ·(i−1)+4·µ 2 ·(i−1) I − M0 i−1 M0
                                         = I − M0ri .

Applying Proposition 2.2 Part 2 and noting that µ ≤ 1/(32 · (i − 1)) we get

                                              I − Mi ≈4·µ·i I − M0ri .
                                                      r
    So we conclude that I − Mz−1 ≈ε/8 I − M0z−1 . Furthermore, by Lemma 4.6 we have
                                                  r
                                           I − M0z−1 ≈ε/8 I − M 2·rz−1 .

                         T HEORY OF C OMPUTING, Volume 17 (4), 2021, pp. 1–35                                  23
                JACK M URTAGH , O MER R EINGOLD , A ARON S IDFORD , AND S ALIL VADHAN

By Proposition 2.2 Part 2, and the fact that ε ≤ 1, this gives

                                          I − Mz−1 ≈ε/3 I − M 2·rz−1 .

If b0 = 0 then 2 · rz−1 = r and we are done. If b0 = 1 then we apply one more plus one operation using
our original graph G to form Gz = Gz−1 p H G such that
                                                   1
                                    I − Mz ≈µ I − · (Mz−1 M + MMz−1 )
                                                   2
                                           ≈ε/3 I − M 2·rz−1 +1
                                            = I − Mr .

Applying Proposition 2.2 Part 2 then gives I − Mz ≈ε I − M r .

4.2.3   Analysis of space complexity
Lemma 4.9. Algorithm 4.1 can be implemented so that given an undirected multigraph G, a positive
integer r, and ε ∈ (0, 1), it computes its output Gz in space O(log N + (log r)·log(1/ε) +(log r)·log log r),
where N is the bit length of the input graph G.

Proof. We show how to compute RotGz in space O(log N + (log r) · log(1/ε) + (log r) · log log r). Let
bz bz−1 . . . b0 be the binary representation of r. Following Algorithm 4.1, G0 is constructed with normalized
Laplacian I − M0 ≈ε/(16·z) I − M 2 . From Lemma 4.7, we know RotG0 can be computed in space O(log N +
log(16 · z/ε)) = O(log N + log(1/ε) + log log r). Let d1 , . . . , dn be the vertex degrees in G0 and dmax be
the maximum degree.
     The algorithm is presented to have z iterations, where on iteration i ∈ [z − 1], if bz−i = 0 the deran-
domized product is invoked once, and if bz−i = 1, it is invoked twice. On iteration z it is either invoked
once (b0 = 1) or not at all (b0 = 0). It will be simpler for us to think of each derandomized product
happening in its own iteration. So we will consider τ = z + w = O(log r) iterations where w is the number
of ones in bz−1 , . . . , b0 . On iterations 1, . . . , z − 1, there are z − 1 derandomized square operations and w
plus one operations. The final iteration will either have a plus one operation with the graph G (if b0 = 1)
or no operation.
     We copy the bits of r into memory and expand them into τ bits as follows: for i ∈ {1, . . . z − 1} if
bz−i = 0, record a 0 (corresponding to a derandomized square) and if bz−i = 1, record a 0 followed by a 1
(corresponding to a derandomized square followed by a plus one operation). Finish by just recording
bz at the end. Now we have τ bits t1 , . . . ,tτ in memory where for i < τ, ti = 0 if the i-th derandomized
product in our algorithm is a derandomized square and ti = 1 if the i-th derandomized product is a plus
one with the graph G0 . If tτ = 0, we do no derandomized product on the last iteration and if tτ = 1 we
apply the plus one operation using G instead of G0 as described in the algorithm.
     We also re-number our graphs to be G1 , . . . , Gτ where Gi is the graph produced by following the
derandomized products corresponding to t1 , . . . ,ti . For each i ∈ [τ] and v ∈ [n], vertex v in graph Gi has
degree (2 · c)i · dv because each derandomized product multiplies every vertex degree by a factor of 2 · c.
     Since our graphs can be irregular, the input to a rotation map may have a different length than its
output. To simplify the space complexity analysis, when calling a rotation map, we will pad the edge

                         T HEORY OF C OMPUTING, Volume 17 (4), 2021, pp. 1–35                                  24
                 D ETERMINISTIC A PPROXIMATION OF R ANDOM WALKS IN S MALL S PACE

labels to always have the same length as inputs and outputs to the rotation map. For each graph Gi , we
pad its edge labels to have length `i = dlog2 dmax e + i · dlog2 (2 · c)e.
    Sublogarithmic-space complexity can depend on the model, so we will be explicit about the model we
use. We compute the rotation map of each graph Gi on a multi-tape Turing machine with the following
input/output conventions:

    • Input Description:

         – Tape 1 (read-only): Contains the input G, r, and ε with the head at the leftmost position of
           the tape.
         – Tape 2 (read-write): Contains the input to the rotation map (v0 , k0 ), where v0 ∈ [n] is a vertex
           of Gi , and k0 is the label of an edge incident to v0 padded to have total length `i . The tapehead
           is at the rightmost end of k0 . The rest of the tape may contain additional data.
         – Tape 3 (read-write): Contains the bits t1 , . . . ,tτ with the head pointing at ti .
         – Tapes 4+: (read-write): Blank worktapes with the head at the leftmost position.

    • Output Description:

         – Tape 1: The head should be returned to the leftmost position.
         – Tape 2: In place of (v0 , k0 ), it should contain (v2 , k2 ) = RotGi (v0 , k0 ), where v2 ∈ [n], and k2
           is padded to have total length `i . The head should be at the rightmost position of k2 and the
           rest of the tape should remain unchanged from its state at the beginning of the computation.
         – Tape 3: Contains the bits t1 , . . . ,tτ with the head pointing at ti .
         – Tapes 4+: (read-write): Are returned to the blank state with the heads at the leftmost position.

    Let Space(Gi ) be the space used on tapes other than tape 1 to compute RotGi . We will show that
Space(Gi ) = Space(Gi−1 ) + O(log c). Recalling that Space(G0 ) = O(log N + log(1/ε) + log log r) and
unraveling the recursion gives

                Space(Gz ) = O(log N + log(1/ε) + log log r + τ · log c)
                            = O(log N + log(1/ε) + log log r + log r · log(poly(log r)/ε))
                            = O(log N + (log r) · log(1/ε) + (log r) · log log r)

as desired. Below we prove the recurrence on Space(Gi ) by low level reasoning about rotation maps.
While similar space complexity analysis has appeared before (see [34, 36, 30]), we first provide some
intuition for how the space-efficient implementation of our algorithm works.
    Our ultimate goal is, given a vertex v and edge label e in Gz , compute the e-th neighbor of v in Gz . If
we can achieve this, we can compute the entire graph Gz by enumerating over vertices and edges. From
the way we build Gz out of the derandomized product, neighbor relations in Gz are defined by neighbor
relations in Gz−1 as well as neighbor relations in an expander graph (in accordance with Definition 3.1).
Likewise, neighbor relations in Gz−1 are defined by neighbor relations in Gz−2 as well as neighbor
relations in an expander graph and so on. So we can recursively compute neighbors in graphs with larger

                        T HEORY OF C OMPUTING, Volume 17 (4), 2021, pp. 1–35                                   25
                 JACK M URTAGH , O MER R EINGOLD , A ARON S IDFORD , AND S ALIL VADHAN

subscripts by reducing those computations to neighbors in graphs with smaller subscripts along with
computations on expander graphs.
     A naive implementation of this recursion would use O(log n) space for each level of the O(log n)
levels of recursion, for a total of O(log2 n) space. The key advantage to using rotation maps and the
structure inherent to the derandomized product, is that the space used to compute neighbors in the Gi in
the algorithm can be reused. If we can compute neighbors in Gi−1 , the only additional space needed to
compute neighbors in Gi is the space to store an edge label from a graph in our expander family. This
reasoning leads to the recurrence: Space(Gi ) = Space(Gi−1 ) + O(log c).
     To see how this operates at a low level with rotation maps, we begin with (v0 , k0 ) on Tape 2 (possibly
with additional data) and the tapehead at the far right of k0 . We parse k0 into k0 = ( j0 , a0 , b) where j0 is
an edge label in [(2 · c)i−1 · dv0 ] padded to have length `i−1 , a0 ∈ [c], and b ∈ {0, 1}.
     Note that Gi = Gi−1 p H G0 where for i 6= τ, we have G0 = Gi−1 if ti−1 = 0 and G0 = G0 when ti−1 = 1.
We compute RotGi according to Definition 3.1. We move the head left to the rightmost position of j0 .
If b = 0, we move the third tapehead to ti−1 and recursively compute RotGi−1 (v0 , j0 ) so that Tape 2 now
contains (v1 , j1 , a0 , b) (with j1 padded to have the same length as j0 ). The vertex v1 in the graph Gi−1 has
degree d 0 = (2 · c)i−1 · dv1 so we next compute RotHd0 ( j1 , a0 ) so that (v1 , j2 , a1 , b) is on the tape. Finally
we compute RotG0 (v1 , j2 ) and flip b to finish with (v2 , j3 , a1 , b̄) on the second tape. We then move the
third tapehead to ti . If b = 1 then we just swap the roles of Gi−1 and G0 above.
     So computing RotGi involves computing the rotation maps of Gi−1 , Hd 0 , and G0 each once. Note that
each of the rotation map evaluations occur in succession and can therefore reuse the same space. Clearly
Space(G0 ) ≤ Space(Gi−1 ) because either G0 = Gi−1 or G0 is either G0 or G, both of whose rotation maps
are subroutines in computing RotGi−1 . Computing RotHd0 adds an overhead of at most O(log c) space
to store the additional edge label a0 and the bit b. So we can compute the rotation map of Gτ in space
O(log N + (log r) · log(1/ε) + (log r) · log log r).


5     Corollaries
5.1    Random walks
Our algorithm immediately implies Theorem 1.2, which we prove below.

Theorem 1.2 Restated. There is a deterministic algorithm that given an undirected multigraph G on n
vertices, a positive integer r, a set of vertices S, and ε > 0, computes a number Φ
                                                                                  e such that

                                     (1 − ε) · Φr (S) ≤ Φ
                                                        e ≤ (1 + ε) · Φr (S)

and runs in space O(log N + (log r) · log(1/ε) + (log r) · log log r), where N is the bit length of the input
graph G (the length of the description of G as a list of edges, in a binary encoding).

Proof of Theorem 1.2. Let D be the diagonal degree matrix and I − M be the normalized Laplacian of
G. Let v = D1/2 eS where eS is the characteristic vector of the set S. Let dS be the sum of the degrees of
vertices in S. Then using the fact that I − M r = D−1/2 (I − T r )D1/2 where T is the transition matrix of G

                         T HEORY OF C OMPUTING, Volume 17 (4), 2021, pp. 1–35                                     26
                 D ETERMINISTIC A PPROXIMATION OF R ANDOM WALKS IN S MALL S PACE

gives:

                         1 T                   1
                            · v (I − M r )v = · eTS D1/2 D−1/2 (I − T r )D1/2 D1/2 eS
                         dS                   dS
                                               1
                                            = · eTS DeS − eTS (T r (DeS /dS ))
                                              dS
                                            = 1 − Pr[Vr ∈ S|V0 ∈ S]
                                           = Φr (S)

where the penultimate equality follows from the fact that DeS /dS is the probability distribution over
vertices in S where each vertex has mass proportional to its degree, i. e., the probability distribution
V0 k(V0 ∈ S). Multiplying this distribution by T r gives the distribution of Vr k(V0 ∈ S). Multiplying this
resulting distribution on the left by eTS , sums up the probabilities over vertices in S, which gives the
probability that our random walk ends in S.
    From Theorem 1.4, we can compute a matrix L      e such that Le ≈ε I − M r in space O(log N + (log r) ·
log(1/ε) + (log r) · log log r). It follows from Proposition 2.2, Part 6 and the definition of spectral
approximation that
                                                  1
                                (1 − ε) · Φr (S) ≤ · vT Lv
                                                        e ≤ (1 + ε) · Φr (S).
                                                  dS

    Our algorithm also implies an algorithm for approximating random walk matrix polynomials.

Definition 5.1 (Random walk matrix polynomials). Let G be an undirected multigraph with Laplacian
D−A and let α be a vector of nonnegative scalars α = (α1 , . . . , α` ) such that ∑i∈[`] αi = 1. The normalized
`-degree random walk matrix polynomial of G with respect to α is defined to be:

                                                  `
                                  Lα (G) := I − ∑ αr · M r
                                                 r=1
                                                                    !
                                                          `
                                          = D−1/2 I − ∑ αr · T r D1/2 ,
                                                         r=1


where M = D−1/2 AD−1/2 and T = AD−1

    Random walk matrix polynomials are Laplacian matrices that arise in algorithms for approximating
q-th roots of symmetric diagonally dominant matrices and efficient sampling from Gaussian graphical
models [27, 14, 13].

Corollary 5.2. There is a deterministic algorithm that given an undirected multigraph G, a positive
integer `, a vector of nonnegative scalars α = (α1 , . . . , α` ) that sum to 1, and ε > 0, computes an
ε-approximation to Lα (G) and runs in space O(log N + (log r) · log(1/ε) + (log r) · log log r).

                        T HEORY OF C OMPUTING, Volume 17 (4), 2021, pp. 1–35                                27
                JACK M URTAGH , O MER R EINGOLD , A ARON S IDFORD , AND S ALIL VADHAN

Proof. For each r ∈ [`], let Lr be an ε-approximation of I − M r . Applying Proposition 2.2, Parts 5 and 6
gives
                                       `             `
                                      ∑ αr · Lr ≈ε ∑ αr · I − αr · Mr
                                      r=1           r=1
                                                          `
                                                = I − ∑ αr · M r
                                                         r=1
                                                = Lα (G).
Our algorithm can compute each Lr in space O(log N + (log r) · log(1/ε) + (log r) · log log r). Multiplica-
tion and iterated addition can both be computed in O(log N) space where N is the input bit length [10, 21].
So by the composition of space bounded algorithms (Proposition 2.10), Lα (G) can be approximated in
space O(log N + (log r) · log(1/ε) + (log r) · log log r).

5.2     Odd-length walks in nearly linear time
Our approach to approximating odd-length walks deterministically and space-efficiently also leads to a
new result in the context of nearly linear-time (randomized) spectral sparsification algorithms. Specifically,
we extend the following Theorem of Cheng, Cheng, Liu, Peng, and Teng [14].
Theorem 5.3 ([14]). There is a randomized algorithm that given an undirected weighted graph G
with n vertices, m edges, and normalized Laplacian I − M, even integer r, and ε > 0 constructs an
undirected weighted graph G                             e containing O(n log n/ε 2 ) non-zero entries, in
                            e with normalized Laplacian L
              3       5   4                       r
time O(m · log n · log r/ε ), such that L
                                        e ≈ε I − M with high probability.
      Our approach to approximating odd-length walks can be used to extend Theorem 5.3 to odd r.
Corollary 5.4. There is a randomized algorithm that given an undirected weighted graph G with n
vertices, m edges, and normalized Laplacian I − M, odd integer r, and ε > 0 constructs an undirected
weighted graph G                               e containing O(n log n/ε 2 ) non-zero entries, in time
                   e with normalized Laplacian L
          3      5    4                      r
O(m · log n · log r/ε ), such that L ≈ε I − M with high probability.
                                   e
    Our proof of Corollary 5.4 uses Theorem 5.3 as a black box. So in fact, given G with normalized
Laplacian I − M and any graph G    e whose normalized Laplacian approximates I − M r for even r, we can
produce an approximation to I − M r+1 in time nearly linear in the sparsities of G and G.  e To prove the
corollary, we use the same method used in [33] and [15] for sparsifying two-step walks on undirected
and directed graphs, respectively. The idea is that the graphs constructed from two-step walks can be
decomposed into the union of product graphs: graphs whose adjacency matrices have the form xyT for
vectors x, y ∈ Rn . We use the following fact from [15] that says that product graphs can be sparsified in
time that is nearly-linear in the number of non-zero entries of x and y rather than the number of non-zero
entries in xyT , which may be much larger.
Lemma 5.5 (Adapted from [15] Lemma 3.18). Let x, y be non-negative vectors with kxk1 = kyk1 = r
and let ε ∈ (0, 1). Furthermore, let s denote the total number of non-zero entries in x and y and let
L = diag(y) − 1r · xyT . Then there is an algorithm that in time O(s · log s/ε 2 ) computes a matrix L
                                                                                                     e with
             2
O(s · log s/ε ) non-zeros such that L is a directed ε-approximation of L with high probability.
                                     e

                        T HEORY OF C OMPUTING, Volume 17 (4), 2021, pp. 1–35                               28
                   D ETERMINISTIC A PPROXIMATION OF R ANDOM WALKS IN S MALL S PACE

   After using Lemma 5.5 to sparsify each product graph in our decomposition, we then apply an
additional round of graph sparsification.

Lemma 5.6 ([25]). Given an undirected graph G with n vertices, m edges, and Laplacian L and ε > 0,
there is an algorithm that computes a graph G                   e containing O(n · log n/ε 2 ) non-zero
                                               e with Laplacian L
                         2   2
entries in time O(m · log n/ε ) such that L ≈ε L with high probability.
                                          e

    Now we can prove Corollary 5.4


Proof of Corollary 5.4. Theorem 5.3 says that we can compute a graph G      e with normalized Laplacian
                       2                                    3       5   4                 e ≈ε/8 I − M r−1
I − M with O(n log n/ε ) non-zero entries, in time O(m · log n · log r/ε ), such that I − M
    e
with high probability. By Lemma 4.5 we have

                                           1 e        e ≈ε/8 I − M r .
                                        I − · (MM + M M)                                                         (5.1)
                                           2

                                                                   e spectrally approximates I − M r−1 , the
    Our goal is to sparsify the lefthand side. Note that since I − M
corresponding graphs must have the same stationary distribution and hence proportional vertex degrees.
In other words there is a number k such that for all vertices v ∈ [n] we have degGe (v) = k · degG (v). We
will think of the graph that adds one step to our walk as k · G rather than G because k · G and Ge have the
same degrees and the normalized Laplacian of k · G is the same as the normalized Laplacian of G.
    Let A and A e be the adjacency matrices of k · G and G,
                                                         e respectively and let D be the diagonal matrix of
                                   −1
vertex degrees. Let Q = D − AD A and note that Q is the Laplacian of a weighted directed graph. We
                                      e
will show how to compute a sparse directed approximation of Q and use this to show how to compute a
sparse approximation to the lefthand side of Equation (5.1). Our approach is inspired by similar arguments
from [33, 15]. We decompose Q into n product graphs as follows. For each i ∈ [n] let

                                                    ei,: ) − 1 · A:,i A
                                          Qi = diag(A                 eTi,:
                                                            Di,i

where A ei,: and A:,i denote the i-th row of A   e and the i-th column of A, respectively. Observe that Qi is a
directed Laplacian of a bipartite graph between the neighbors of vertex i in k · G and the neighbors of i
   e and that Q = ∑i∈[n] Qi . Furthermore, each Qi is a product graph and hence can be sparsified using
in G
Lemma 5.5. Set xi = A:,i , yi = A ei,: , ri = Di,i , and let si be the total number of non-zero entries in x and y.
Note that kxi k1 = kyi k1 = ri because k · G and G     e have the same vertex degrees. By Lemma 5.5, for each
i ∈ [n] we can compute a directed ε/8-approximation Q           ei of Qi containing O(si · log si /ε 2 ) entries in time
                2
O(si · log si /ε ). Applying the lemma to each Qi yields Q                     ei , which contains O(m · log m/ε 2 )
                                                                    e = ∑i∈[n] Q
non-zero entries and can be computed in time O(m · log m/ε 2 ) because ∑i∈[n] si = O(m). By Lemma 2.8
we have
                                          1 e eT                 1
                                            · (Qi + Qi ) ≈ε/8 · (Qi + QTi )
                                          2                      2

                          T HEORY OF C OMPUTING, Volume 17 (4), 2021, pp. 1–35                                      29
               JACK M URTAGH , O MER R EINGOLD , A ARON S IDFORD , AND S ALIL VADHAN

for all i ∈ [n] with high probability. It follows from Proposition 2.2 Part 5 that

                                    1 e eT        1            eTi )
                                      · (Q + Q ) = · ∑ (Q ei + Q
                                    2             2 i∈[n]
                                                        1
                                                 ≈ε/8     · ∑ (Qi + QTi )
                                                        2 i∈[n]
                                                     1
                                                 =     · (Q + QT )
                                                     2
with high probability. From Proposition 2.2 Part 3, we then get

                              1 e eT −1/2            1
                         D−1/2 · (Q + Q )D ≈ε/8 D−1/2 · (Q + QT )D−1/2
                              2                      2
                                                 1 e
                                           = I − · (MM + M M)e
                                                 2

with high probability. Applying Lemma 5.6 we can re-sparsify the graph corresponding to D−1/2 21 · (Q e+
  T
Q )D
 e     −1/2                     0                                     0                 2
            to produce a graph G whose normalized Laplacian I − M has O(n · log n/ε ) non-zero entries
and I −M 0 ≈ε/8 D−1/2 21 ·(Q+
                           e Q eT )D−1/2 with high probability. This takes additional time O(m·log2 n/ε 2 )
due to Theorem 1.1 of [25]. Applying Proposition 2.2 Part 2 twice we get that I − M 0 ≈ε I − M r and the
total running time for the procedure was O(m · log3 n · log5 r/ε 4 ).



References
 [1] A MIR M AHDI A HMADINEJAD , J ONATHAN A. K ELNER , JACK M URTAGH , J OHN P EE -
     BLES , A ARON S IDFORD , AND S ALIL VADHAN : High-precision estimation of random
     walks in small space. In Proc. 61st FOCS, pp. 1295–1306. IEEE Comp. Soc., 2020.
     [doi:10.1109/FOCS46700.2020.00123] 5

 [2] ROMAS A LELIUNAS , R ICHARD M. K ARP, R ICHARD J. L IPTON , L ÁSZLÓ L OVÁSZ , AND
     C HARLES R ACKOFF: Random walks, universal traversal sequences, and the complexity of maze
     problems. In Proc. 20th FOCS, pp. 218–223. IEEE Comp. Soc., 1979. [doi:10.1109/SFCS.1979.34]
     2

 [3] N OGA A LON: Eigenvalues and expanders. Combinatorica, 6(2):83–96, 1986. Preliminary version
     in STOC’84. [doi:10.1007/BF02579166] 6

 [4] N OGA A LON: Explicit expanders of every degree and size.                       Combinatorica, 2021.
     [doi:10.1007/s00493-020-4429-x, arXiv:2003.11673] 4

 [5] N OGA A LON , O DED G OLDREICH , J OHAN H ÅSTAD , AND R ENÉ P ERALTA: Simple constructions
     of almost k-wise independent random variables. Random Struct. Algor., 3(3):289–304, 1992. See
     also addendum in vol. 4(1), 1993, pp. 199–120. [doi:10.1002/rsa.3240030308] 4

                       T HEORY OF C OMPUTING, Volume 17 (4), 2021, pp. 1–35                             30
                D ETERMINISTIC A PPROXIMATION OF R ANDOM WALKS IN S MALL S PACE

 [6] N OGA A LON AND V ITALY D. M ILMAN: λ1 , isoperimetric inequalities for graphs, and super-
     concentrators. J. Combin. Theory–B, 38(1):73–88, 1985. [doi:10.1016/0095-8956(85)90092-9]
     6

 [7] N OGA A LON , O DED S CHWARTZ , AND A SAF S HAPIRA: An elementary construc-
     tion of constant-degree expanders. Combin. Probab. Comput., 17(3):319–327, 2008.
     [doi:10.1017/S0963548307008851] 4

 [8] S ANJEEV A RORA AND B OAZ BARAK: Computational Complexity: A modern approach. Cam-
     bridge Univ. Press, 2009. 8

 [9] J OSHUA BATSON , DANIEL A. S PIELMAN , N IKHIL S RIVASTAVA , AND S HANG -H UA T ENG:
     Spectral sparsification of graphs: Theory and algorithms. Comm. ACM, 56(8):87–94, 2013.
     [doi:10.1145/2492007.2492029] 3

[10] PAUL W. B EAME , S TEPHEN A. C OOK , AND H. JAMES H OOVER: Log depth circuits for division
     and related problems. SIAM J. Comput., 15(4):994–1003, 1986. [doi:10.1137/0215070] 28

[11] M ARK B RAVERMAN , A NUP R AO , R AN R AZ , AND A MIR Y EHUDAYOFF: Pseudorandom genera-
     tors for regular branching programs. SIAM J. Comput., 43(3):973–986, 2014. Preliminary version
     in FOCS’10. [doi:10.1137/120875673] 3

[12] J OSHUA B RODY AND E LAD V ERBIN: The coin problem and pseudorandomness for branching
     programs. In Proc. 51st FOCS, pp. 30–39. IEEE Comp. Soc., 2010. [doi:10.1109/FOCS.2010.10] 3

[13] D EHUA C HENG , Y U C HENG , YAN L IU , R ICHARD P ENG , AND S HANG -H UA T ENG: Scalable
     parallel factorizations of SDD matrices and efficient sampling for Gaussian graphical models. 2014.
     [arXiv:1410.5392] 27

[14] D EHUA C HENG , Y U C HENG , YAN L IU , R ICHARD P ENG , AND S HANG -H UA T ENG: Spectral
     sparsification of random-walk matrix polynomials. 2015. [arXiv:1502.03496] 3, 4, 5, 15, 17, 18,
     27, 28

[15] M ICHAEL B. C OHEN , J ONATHAN A. K ELNER , J OHN P EEBLES , R ICHARD P ENG , A NUP R AO ,
     A ARON S IDFORD , AND A DRIAN V LADU: Almost-linear-time algorithms for Markov chains and
     new spectral primitives for directed graphs. In Proc. 49th STOC, pp. 410–419. ACM Press, 2017.
     [doi:10.1145/3055399.3055463] 3, 4, 7, 28, 29

[16] M ICHAEL B. C OHEN , J ONATHAN A. K ELNER , J OHN P EEBLES , R ICHARD P ENG , A ARON
     S IDFORD , AND A DRIAN V LADU: Faster algorithms for computing the stationary distribution,
     simulating random walks, and more. In Proc. 57th FOCS, pp. 583–592. IEEE Comp. Soc., 2016.
     [doi:10.1109/FOCS.2016.69] 3, 7

[17] A NINDYA D E: Pseudorandomness for permutation and regular branching programs. In Proc.
     26th IEEE Conf. on Comput. Complexity (CCC’11), pp. 221–231. IEEE Comp. Soc., 2011.
     [doi:10.1109/CCC.2011.23] 3

                      T HEORY OF C OMPUTING, Volume 17 (4), 2021, pp. 1–35                           31
               JACK M URTAGH , O MER R EINGOLD , A ARON S IDFORD , AND S ALIL VADHAN

[18] O FER G ABBER AND Z VI G ALIL: Explicit constructions of linear-sized superconcentrators. J.
     Comput. System Sci., 22(3):407–420, 1981. Preliminary version in FOCS’79. [doi:10.1016/0022-
     0000(81)90040-4] 4, 13

[19] O DED G OLDREICH: Computational complexity: A conceptual perspective. Cambridge Univ. Press,
     2008. [doi:10.1017/CBO9780511804106] 4

[20] O DED G OLDREICH: On constructing expanders for any number of vertices. In O DED G OLDREICH,
     editor, Computational Complexity and Property Testing, volume 12050 of LNCS, pp. 374–379.
     Springer, 2020. Preliminary version in Author’s home page. [doi:10.1007/978-3-030-43662-9_21]
     4

[21] W ILLIAM H ESSE , E RIC A LLENDER , AND DAVID A. M IX BARRINGTON: Uniform constant-depth
     threshold circuits for division and iterated multiplication. J. Comput. System Sci., 65(4):695–716,
     2002. With 2014 update in JCSS, Vol. 80(2), 496–497. [doi:10.1016/j.jcss.2013.09.002] 28

[22] ROGER A. H ORN AND C HARLES R. J OHNSON: Matrix Analysis. Cambridge Univ. Press, 1990.
     [doi:10.1017/CBO9781139020411] 20

[23] RUSSELL I MPAGLIAZZO , N OAM N ISAN , AND AVI W IGDERSON: Pseudorandomness for network
     algorithms. In Proc. 26th STOC, pp. 356–364. ACM Press, 1994. [doi:10.1145/195058.195190] 3

[24] G ORAV J INDAL , PAVEL KOLEV, R ICHARD P ENG , AND S AURABH S AWLANI: Density independent
     algorithms for sparsifying k-step random walks. In Proc. 20th Internat. Workshop on Approximation
     Algorithms for Combinat. Opt. Probl. (APPROX’17), pp. 14:1–17. Schloss Dagstuhl–Leibniz-
     Zentrum fuer Informatik, 2017. [doi:10.4230/LIPIcs.APPROX-RANDOM.2017.14] 4

[25] R ASMUS K YNG , JAKUB PACHOCKI , R ICHARD P ENG , AND S USHANT S ACHDEVA: A framework
     for analyzing resparsification algorithms. In Proc. 28th Ann. ACM–SIAM Symp. on Discrete
     Algorithms (SODA’17). SIAM, 2017. [doi:10.1137/1.9781611974782.132, arXiv:1611.06940] 29,
     30

[26] Y IN TAT L EE , R ICHARD P ENG , AND DANIEL A. S PIELMAN: Sparsified Cholesky solvers for
     SDD linear systems. 2015. [arXiv:1506.08204] 4

[27] P O - LING L OH AND M ARTIN J. WAINWRIGHT: Structure estimation for discrete graphical models:
     Generalized covariance matrices and their inverses. In Proc. 25th Adv. Neural Info. Proc. Sys.
     (NIPS’12), pp. 2087–2095. Curran Assoc., 2012. NIPS. 27

[28] G RIGORY A. M ARGULIS: Explicit constructions of expanders. Problemy Peredachi Informatsii,
     9(4):71–80, 1973. MathNet.ru. 4

[29] G ARY L. M ILLER AND R ICHARD P ENG: Approximate maximum flow on separable undirected
     graphs. In Proc. 24th Ann. ACM–SIAM Symp. on Discrete Algorithms (SODA’13), pp. 1151–1170.
     SIAM, 2013. [doi:10.1137/1.9781611973105.83] 17, 18

                      T HEORY OF C OMPUTING, Volume 17 (4), 2021, pp. 1–35                           32
                D ETERMINISTIC A PPROXIMATION OF R ANDOM WALKS IN S MALL S PACE

[30] JACK M URTAGH , O MER R EINGOLD , A ARON S IDFORD , AND S ALIL P. VADHAN: Derandomiza-
     tion beyond connectivity: Undirected Laplacian systems in nearly logarithmic space. In Proc. 58th
     FOCS, pp. 801–812. IEEE Comp. Soc., 2017. [doi:10.1109/FOCS.2017.79] 3, 4, 6, 11, 13, 25

[31] JACK M URTAGH , O MER R EINGOLD , A ARON S IDFORD , AND S ALIL P. VADHAN: Deterministic
     approximation of random walks in small space. In Proc. 23rd Internat. Conf. on Randomization
     and Computation (RANDOM’19), pp. 42:1–22. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik,
     2019. [doi:10.4230/LIPIcs.APPROX-RANDOM.2019.42] 1, 5

[32] J OSEPH NAOR AND M ONI NAOR: Small-bias probability spaces: Efficient constructions and
     applications. SIAM J. Comput., 22(4):838–856, 1993. [doi:10.1137/0222053] 4

[33] R ICHARD P ENG AND DANIEL A. S PIELMAN: An efficient parallel solver for SDD linear systems.
     In Proc. 46th STOC, pp. 333–342. ACM Press, 2014. [doi:10.1145/2591796.2591832] 3, 28, 29

[34] O MER R EINGOLD: Undirected connectivity in log-space.          J. ACM, 55(4):17:1–24, 2008.
     [doi:10.1145/1391289.1391291] 2, 4, 25

[35] O MER R EINGOLD , S ALIL VADHAN , AND AVI W IGDERSON: Entropy waves, the zig-zag
     graph product, and new constant-degree expanders. Ann. Math., 155(1):157–187, 2002.
     [doi:10.2307/3062153] 4, 9, 13

[36] E YAL ROZENMAN AND S ALIL VADHAN: Derandomized squaring of graphs. In Proc. 9th Inter-
     nat. Workshop on Randomization and Computation (RANDOM’05), pp. 436–447. Springer, 2005.
     [doi:10.1007/11538462_37] 3, 4, 9, 10, 11, 13, 25

[37] M ICHAEL S AKS AND S HIYU Z HOU: BPH SPACE(S) ⊆ DSPACE(S3/2 ). J. Comput. System Sci.,
     58(2):376–403, 1999. Preliminary version in FOCS’95. [doi:10.1006/jcss.1998.1616] 2

[38] DANIEL A. S PIELMAN AND S HANG -H UA T ENG: Nearly-linear time algorithms for graph parti-
     tioning, graph sparsification, and solving linear systems. In Proc. 36th STOC, pp. 81–90. ACM
     Press, 2004. [doi:10.1145/1007352.1007372] 6

[39] DANIEL A. S PIELMAN AND S HANG -H UA T ENG: Spectral sparsification of graphs. SIAM J.
     Comput., 40(4):981–1025, 2011. [doi:10.1137/08074489X] 3

[40] T HOMAS S TEINKE: Pseudorandomness for permutation branching programs without the group
     theory. Electron. Colloq. Comput. Complexity, TR12-083, 2012. [ECCC] 3

[41] R. M ICHAEL TANNER: Explicit concentrators from generalized N-gons. SIAM J. Alg. Discr.
     Methods, 5(3):287–293, 1984. [doi:10.1137/0605030] 6




                      T HEORY OF C OMPUTING, Volume 17 (4), 2021, pp. 1–35                         33
            JACK M URTAGH , O MER R EINGOLD , A ARON S IDFORD , AND S ALIL VADHAN

AUTHORS

    Jack Murtagh
    School of Engineering & Applied Sciences
    Harvard University
    Cambridge, MA USA
    murtagh.jack g harvard edu
    https://scholar.harvard.edu/jmurtagh/home


    Omer Reingold
    Professor
    Computer Science Department
    Stanford University
    Stanford, CA USA
    reingold stanford edu
    https://engineering.stanford.edu/people/omer-reingold


    Aaron Sidford
    Assistant professor
    Management Science & Engineering
    and Computer Science
    Stanford University
    Stanford, CA USA
    sidford stanford edu
    http://www.aaronsidford.com


    Salil Vadhan
    Professor
    School of Engineering & Applied Sciences
    Harvard University
    Cambridge, MA USA
    salil_vadhan harvard edu
    http://salil.seas.harvard.edu


ABOUT THE AUTHORS

    JACK M URTAGH got his Ph. D. in May 2020 from Harvard University where he was advised
       by Salil Vadhan. As an undergraduate, he studied math at Tufts University. Jack wrote
       his dissertation on spectral graph-theoretic methods for derandomizing space-bounded
       computation.


                   T HEORY OF C OMPUTING, Volume 17 (4), 2021, pp. 1–35                        34
          D ETERMINISTIC A PPROXIMATION OF R ANDOM WALKS IN S MALL S PACE

O MER R EINGOLD is the Rajeev Motwani Professor in Computer Science at Stanford
   University. His research is in the foundations of computer science and most notably in
   computational complexity, cryptography, and the societal impact of computation. Omer
   completed his Ph. D. at the Weizmann Institute under the supervision of Moni Naor.


A ARON S IDFORD is an assistant professor in Management Science and Engineering and
   in Computer Science at Stanford University. He received his Ph. D. from the Electrical
   Engineering and Computer Science Department at the Massachusetts Institute of Tech-
   nology where he was advised by Jonathan Kelner. Aaron’s research interests lie broadly
   in optimization, the theory of computation, and the design and analysis of algorithms.
   He is particularly interested in work at the intersection of continuous optimization, graph
   theory, numerical linear algebra, and data structures.


S ALIL VADHAN is the Vicky Joseph Professor of Computer Science and Applied Mathe-
   matics at the Harvard John A. Paulson School of Engineering & Applied Sciences. He
   received his Ph. D. under the supervision of Shafi Goldwasser at MIT in 1999; the title
   of his dissertation was “A Study of Statistical Zero-Knowledge Proofs.” Other research
   interests include the theory of pseudorandomness the theory and practice of data privacy.
   He enjoys spending leisure time with his wife and two daughters, as well as learning to
   surf in the cold waters of New England.




                T HEORY OF C OMPUTING, Volume 17 (4), 2021, pp. 1–35                             35