Span-Program-Based Quantum Algorithm for Evaluating Formulas

Authors Ben Reichardt, Robert {\v S}palek,
Plaintext
                              T HEORY OF C OMPUTING, Volume 8 (2012), pp. 291–319
                                           www.theoryofcomputing.org




      Span-Program-Based Quantum Algorithm
              for Evaluating Formulas
                                  Ben W. Reichardt                Robert Špalek
                   Received: September 22, 2008; revised: October 5, 2011; published: July 7, 2012.




         Abstract: We give a quantum algorithm for evaluating formulas over an extended gate set,
         including all two- and three-bit binary gates (e. g., NAND, 3-majority). The algorithm is
         optimal on read-once formulas for which each gate’s inputs are balanced in a certain sense.
             The main new tool is a correspondence between a classical linear-algebraic model of
         computation, “span programs,” and weighted bipartite graphs. A span program’s evaluation
         corresponds to an eigenvalue-zero eigenvector of the associated graph. A quantum computer
         can therefore evaluate the span program by applying spectral estimation to the graph.
             For example, the classical complexity of evaluating the balanced ternary majority formula
         is unknown, and the natural generalization of randomized alpha-beta pruning is known to be
         suboptimal. In contrast, our algorithm generalizes the optimal quantum AND-OR formula
         evaluation algorithm and is optimal for evaluating the balanced ternary majority formula.

ACM Classification: F.1.1, F.2.2
AMS Classification: 81P68, 68Q05, 68Q12
Key words and phrases: span programs, quantum computation, query complexity, formula evaluation

1      Introduction
A formula ϕ on gate set S and of size N is a tree with N leaves, such that each internal node is a gate
from S on its children. The read-once formula evaluation
                                                       √    problem is to evaluate ϕ(x) given oracle access
to the input string x = x1 x2 . . . xN . An optimal, O( N)-query quantum algorithm is known to evaluate
“approximately balanced” formulas over the gates S = {AND, OR, NOT} [4]. We develop an optimal
quantum algorithm for evaluating balanced formulas over a greatly extended gate set. The notion of

    2012 Ben W. Reichardt and Robert Špalek
    Licensed under a Creative Commons Attribution License                              DOI: 10.4086/toc.2012.v008a013
                                       B EN W. R EICHARDT AND ROBERT Š PALEK

                                   G(ρ1 ) G(ρk ) G(¬ρ1 ) G(¬ρk )                   G(ρ1 )
                                                                                            . . . G(ρ )              G(ρ1 ) G(ρ2 ) G(ρ3 )
                 ρ1                                                           ρk                          ρ1 ρ2 ρ3
  ρ       G(ρ)
                       ... ρ   k
                                       ...           ...           ρ1
                                                                        ...                          k



 NOT                  EQUALk                                            ORk                                 MAJ3




Figure 1: To convert a formula ϕ to the corresponding graph G(ϕ), we recursively apply substitution
rules starting at the root to convert each gate into a gadget subgraph. Some of the rules are shown here,
except with the edge weights not indicated. The dangling edges at the top and bottom of each gadget are
the input edges and output edge, respectively. To compose two gates, the output edge of one is identified
with an input edge of the next (see Figure 2).

balance is technical, but includes as a special case layered formulas in which gates at the same depth are
all equal.
     The idea of our algorithm is to consider a weighted graph G(ϕ) obtained by replacing each gate of the
formula ϕ with a small gadget subgraph, and possibly also duplicating subformulas. Figure 1 has several
examples. We relate the evaluation of ϕ to the presence or absence of small-eigenvalue eigenvectors
of the weighted adjacency matrix AG(ϕ) that are supported on the root vertex of G(ϕ). The quantum
algorithm runs spectral estimation to either detect these eigenvectors or not, and therefore to evaluate ϕ.
     As a special case, for example, our algorithm implies:

Theorem 1.1. A balanced ternary majority (MAJ3 ) formula of depth d, on N = 3d inputs, can be
evaluated by a quantum algorithm with bounded error using O(2d ) oracle queries, which is optimal.
                                                             d
                    √ d of evaluating this formula is Ω((5/2) ), and the previous best quantum algorithm,
The classical complexity
from [4], uses O( 5 ) queries.
    The graph gadgets themselves are derived from “span programs” [26], establishing a new connection
between span programs and quantum algorithms. The span program computational model has been used
previously in classical complexity theory to prove lower bounds on formula size [26, 5] and monotone
span programs are related to linear secret-sharing schemes [8]. (Most, though not all [1], applications are
over finite fields, whereas we use the definition over C.) Work subsequent to this paper has shown that
span programs are equivalent to quantum query algorithms [35].

Classical and quantum background The formula evaluation problem has been well-studied in the
classical computer model. Classically, the case S = {NAND} is best understood. A formula with only
NAND gates is equivalent to one with alternating levels of AND and OR gates, a so-called “AND-OR
formula,” also known as a read-once function or a two-player game tree. Using randomized alpha-
beta pruning, one can compute√       the value of a balanced binary AND-OR formula with zero error in
expected time O(N 2  log [(1+ 33)/4] ) = O(N 0.753... ) [42, 40], and this is optimal even for bounded-error
algorithms [41]. Other formulas have different query complexities; for example, evaluating a single
OR gate with N inputs needs Θ(N) oracle queries. The best general lower bound of Ω(N 0.51 ) [22] has
recently been improved to Ω(N 0.746 ) [2].
    If we allow the use of a√quantum computer with coherent oracle access to the input, however, then
the situation is simpler; Θ( N) queries are necessary and sufficient to evaluate any {AND, OR, NOT}

                           T HEORY OF C OMPUTING, Volume 8 (2012), pp. 291–319                                                      292
              S PAN -P ROGRAM -BASED Q UANTUM A LGORITHM FOR E VALUATING F ORMULAS

formula with bounded error. The general lower bound is due to [6]. For the upper  √     bound, in one extreme
case, Grover search [20, 21] evaluates an OR gate of degree N using O( N) oracle queries. In the
other extreme case, Farhi, Goldstone and Gutmann devised a breakthrough algorithm for evaluating the
depth-(log2 N) balanced                                          1/2+o(1) queries [17, 15]. Ambainis et al. [4]
                     √ binary AND-OR formula using N
improved this to O( N) queries, even for “approximately balanced” formulas, and also extended the
                                                                1
algorithm to arbitrary {AND, OR, NOT} formulas                  2 +o(1) queries and time after a preprocessing
                                                  √ with N
step. More recent work has given a general O( N)-query upper bound [37, 36].
    This paper shows other nice features of the formula evaluation problem in the quantum computer
model. Classically, with the exception of {NAND}, {NOR} and a few trivial cases like {PARITY}, most
gate sets are poorly understood. In 1986, Boppana asked the complexity of evaluating the balanced,
depth-d ternary majority (MAJ3 ) function [40], and today the complexity is only known to lie between
Ω((5/2)d ) and O((2.6537 . . .)d ) [25, 31]. In particular, the naïve generalization of randomized alpha-beta
pruning—recursively evaluate two random immediate subformulas and then the third if they disagree—
runs in expected time O((8/3)d ) and is suboptimal. This suggests that the balanced ternary majority
function is significantly different from the balanced k-ary NAND function, for which randomized alpha-
beta pruning is known to be optimal [40]. In contrast, we show that the optimal quantum algorithm
of [4] does extend to give an optimal O(2d )-query algorithm for evaluating the balanced ternary majority
formula. Moreover, the algorithm also generalizes to a significantly larger gate set S.


Subsequent work The graph gadgets we use are derived from constant-size span programs, but in the
conference version of this article [39] we speculated that “larger span programs could directly give useful
new quantum algorithms.” Later work has borne this out. Based on span programs, quantum algorithms
have been developed for matrix rank [9], various subgraph-detection and related problems [10, 45, 29, 12],
graph collision [18], the k-distinctness problem under a certain promise [11], and further formula-
evaluation problems [36, 38, 37, 44, 27]. For formula evaluation, the quantum query complexity of
arbitrary read-once formulas has been characterized, with no gate set restrictions or balance condition
needed.
    In fact, span programs have been shown to be equivalent to quantum query algorithms [35, 37, 30].
Formally, the general adversary lower bound on quantum query complexity exactly equals the least span
program witness size, which via an algorithm upper bounds query complexity to within a constant factor.
(The general adversary bound is a semi-definite program, which can be solved to find an optimal span
program for an arbitrary gate; therefore, we have removed most of the hand-constructed examples present
in the conference version of this article.)
    Although the quantum query complexity of read-once formulas is settled, the quantum time complexity
is not. The problem is that operations between queries to the input are not necessarily efficiently
implementable. Our formula-evaluation algorithm is both query optimal and time efficient. Its strict
balance condition has been relaxed, while keeping a time-efficient algorithm, in [38]. Compared to
this follow-up work, our proof is more technical, but our conclusions are also stronger. The analysis
in [38, 35] is based on showing that the output vertex of the graph has at most a small constant squared
overlap on the span of the small-eigenvalue eigenspaces. Such an “effective spectral gap” can be proved
relatively easily by studying the eigenvalue-zero eigenvectors. By contrast, in this article we prove a true
spectral gap, i. e., that the output vertex has no overlap on small-eigenvalue eigenspaces. This requires

                        T HEORY OF C OMPUTING, Volume 8 (2012), pp. 291–319                               293
                               B EN W. R EICHARDT AND ROBERT Š PALEK

studying eigenvalue-λ eigenvectors for small λ 6= 0, which is not easy (see Section 4.2.2 below). For the
quantum algorithm, an effective spectral gap suffices for applying phase estimation, and only leads to a
slightly higher error rate.

Organization In Section 2, we briefly recall the adversary lower bounds on quantum query complexity.
    We introduce span programs and present optimal span programs for all three-bit Boolean functions in
Section 3. A span program corresponds to a weighted bipartite graph, and its evaluation corresponds to
eigenvalue-zero eigenvectors of the graph (Theorem 3.5). This theorem provides useful intuition.
    We develop a quantitative version of Theorem 3.5 in Section 4. We bound from below the overlap of
the eigenvalue-zero eigenvector with a known starting state. This lower bound will imply completeness
of our quantum algorithm. To show soundness of the algorithm, we also analyze small-eigenvalue
eigenvectors in order to prove a spectral gap around zero. Essentially, we solve the eigenvalue equations
in terms of the eigenvalue λ , and expand a series around λ = 0. The results for small-eigenvalue and
eigenvalue-zero eigenvectors are closely related, and are unified using a new complexity measure we
term “span program witness size.”
    In Section 5, we present a quantum algorithm for evaluating formulas over the gate set S of all
three-bit Boolean functions. On formulas that are “adversary-balanced” (Definition 2.3), the algorithm is
optimal.


2    Adversary lower bounds on quantum query complexity
The query complexity or decision-tree complexity of a function f : D → E, with D ⊆ {0, 1}n , is the least
number of input bits one must look at in order to evaluate f on a worst-case input. The bounded-error
quantum query complexity, Q( f ), is defined similarly except input queries are allowed to be made in
quantum superposition and the output is allowed to be wrong with probability at most 1/3. For a formal
definition, see, e. g., the textbook [33].
    To show the optimality of our formula-evaluation algorithm, we will use the general adversary bound:

Definition 2.1 ([24]). Let f : {0, 1}k → {0, 1}. Define the general adversary bound Adv± ( f ) by
                                              
                              Adv± ( f ) = max kΓk : max kΓ ◦ Di k ≤ 1 .                            (2.1)
                                            Γ           i

The maximum is over all 2k × 2k symmetric matrices Γ satisfying hx|Γ|yi = 0 if f (x) = f (y). Γ ◦ Di
denotes the entrywise matrix product with Di , a zero-one-valued matrix defined by hx|Di |yi = 1 if and
only if xi 6= yi .

Theorem 2.2 ([24]). The bounded-error quantum query complexity of f , Q( f ), is Ω(Adv± ( f )).

    The general adversary bound strengthens the adversary bound [3, 7], which is defined as in Equa-
tion (2.1) but with Γ required to have non-negative entries.
    To match the lower bound of Theorem 2.2, our goal will be to use O(Adv± (ϕ)) queries to evaluate ϕ.
We will prove the optimality of our algorithm only for formulas satisfying a strict balance condition:

                       T HEORY OF C OMPUTING, Volume 8 (2012), pp. 291–319                           294
               S PAN -P ROGRAM -BASED Q UANTUM A LGORITHM FOR E VALUATING F ORMULAS

Definition 2.3. A formula ϕ is adversary-balanced if for every gate, the general adversary lower bounds
of its input subformulas are equal.

    This balance condition is motivated by the following composition result:

Theorem 2.4 ([24]). Let f = g ◦ (h1 , . . . , hk ), where Adv± (h1 ) = · · · = Adv± (hk ) and ◦ denotes function
composition. Then Adv± ( f ) ≥ Adv± (g)Adv± (h1 ).

    Subsequent work, using the equivalence between the general adversary bound and span programs, has
shown this to be an equality [35]. If ϕ is adversary-balanced, then by Theorem 2.4 Adv± (ϕg ) is at least
the product of the gate adversary bounds along any non-self-intersecting path χ from g up to an input,
Adv± (ϕg ) ≥ ∏h∈χ Adv± (h). Note that Adv± (¬ f ) = Adv± ( f ), so NOT gates can be inserted anywhere
in an adversary-balanced formula.


3    Span programs and eigenvalue-zero graph eigenvectors
A span program P is a linear-algebraic way of specifying a function fP . In this section, we will associate P
to a certain structured collection of weighted bipartite graphs GP (x), in such a way that on input x,
 fP (x) = 1 if and only if there exists an eigenvalue-zero eigenvector of the adjacency matrix AGP (x)
supported on a distinguished output node (Theorem 3.5).
     In turn, this will imply that specifying a span program P for a function f directly gives a quantum
algorithm for evaluating f , or for evaluating formulas including f as a gate (Section 5). The algorithm
works by spectral estimation on AGP (x) . Its running time depends on the span program’s “witness size,”
a new complexity measure we define that bounds the squared length of the above eigenvalue-zero
eigenvectors.

Definition 3.1 (Span program [26]). A span program P = (n, d, |ti,Vfree , {V j,b }) consists of a “target”
vector |ti ∈ Cd and finite sets Vfree and V1,0 ,V1,1 , . . . ,Vn,0 ,Vn,1 of “input” vectors from Cd .
    To P corresponds a Boolean function fP : {0, 1}n → {0, 1}, defined by fP (x) = 1 if and only if |ti lies
                                      S
in the span of the vectors in Vfree ∪ nj=1 V j,x j .

    For manipulating a span program, it is convenient to index the input vectors so Vfree = {vi : i ∈ Ifree }
                                                                                        S
and V j,b = {vi : i ∈ I j,b }, for disjoint index sets Ifree and I j,b . Let I = Ifree ∪ j,b I j,b . Furthermore, let A
be the matrix whose columns are the input vectors, i. e., in standard ket notation A = ∑i∈I |vi ihi|. For an
                                         S
input x ∈ {0, 1}n , let I(x) = Ifree ∪ nj=1 I j,x j be the indices of the available input vectors, and let Π(x) be
the projection onto the corresponding columns of A, i. e., Π(x) = ∑i∈I(x) |iihi|.

Definition 3.2 (Witness size). Consider a span program P, and a vector s ∈ [0, ∞)n . Define S =
             √
∑ j,b;i∈I j,b s j |iihi|. Define P’s witness size with costs s as

                                        wsizes (P) = max n wsizes (P, x) ,
                                                       x∈{0,1}


where wsizes (P, x) is given by:

                          T HEORY OF C OMPUTING, Volume 8 (2012), pp. 291–319                                      295
                                  B EN W. R EICHARDT AND ROBERT Š PALEK

      • If fP (x) = 1, then |ti ∈ Range(AΠ(x)), so there is a witness |wi ∈ C|I| satisfying AΠ(x)|wi = |ti.
        Then wsizes (P, x) is the minimum squared length of any such witness, weighted by the costs s:

                                       wsizes (P, x) =          min            kS|wik2 .                (3.1)
                                                         |wi: AΠ(x)|wi=|ti


                                / Range(AΠ(x)). Therefore there is a witness |w0 i ∈ Cd satisfying ht|w0 i = 1
      • If fP (x) = 0, then |ti ∈
                    †   0
        and Π(x)A |w i = 0. Then

                                       wsizes (P, x) =        min           kSA† |w0 ik2 .              (3.2)
                                                         |w0 i: ht|w0 i=1
                                                         Π(x)A† |w0 i=0


    When the subscript s is omitted, the costs are taken to be uniform, s = ~1 = (1, 1, . . . , 1), e. g.,
wsize(P) = wsize~1 (P). In this case, note that S = 1 − ∑i∈Ifree |iihi|.
    Observe that the witness size is invariant under the application of any invertible linear transformation
to the target and input vectors. Moreover, since the witness size does not charge for using vectors in
Span(Vfree ), we may eliminate the free input vectors. (See [35, Lemma 4.12, Prop. 4.10].)

Lemma 3.3. For any span program P, we may by an invertible linear transformation on the vectors
in Cd change the target vector to (1, 0, 0, . . . , 0). By projecting all vectors onto the space orthogonal to
the span of Vfree , we may assume Vfree = 0.
                                          / These operations change neither fP nor the witness size.

3.1     Span program as an adjacency matrix
Definition 3.4 (Graphs GP and GP (x)). For a span program P, let GP be the complex-weighted, bipartite
graph with biadjacency matrix                           
                                          BGP = |ti A .                                         (3.3)
That is, GP has 1 + |I| column vertices and d row vertices. The first column vertex, called the “output
vertex,” has incident edges weighted by the coordinates of |ti. Each of the remaining |I| column vertices,
or “input vertices,” has incident edges weighted by an input vector. The adjacency matrix of GP is
                                                            
                                                   0 BGP
                                         AGP =                  .                                   (3.4)
                                                  B†GP    0

    For an input x ∈ {0, 1}n , let GP (x) be the same graph, but with weight-one dangling edges attached to
the input vertices of I r I(x). That is, the biadjacency matrix of GP (x) is
                                                                 
                                                    |ti      A
                                         BGP (x) =                   .                                (3.5)
                                                     0 1 − Π(x)

Theorem 3.5. Let P be a span program, and x an input. Then there is a vector in the kernel of BGP (x)
                                                                                                          
supported on the output vertex if and only if fP (x) = 1. There is a vector in the kernel of A† , 1 − Π(x) ,
the first d coordinates of which having nonzero inner product with |ti, if and only if fP (x) = 0.

                          T HEORY OF C OMPUTING, Volume 8 (2012), pp. 291–319                             296
              S PAN -P ROGRAM -BASED Q UANTUM A LGORITHM FOR E VALUATING F ORMULAS

Proof. A vector (aO , |wi) ∈ C × C|I| is in the kernel of BGP (x) if and only if aO |ti + A|wi = 0 and
(1 − Π(x))|wi = 0. The second condition says that the support of |wi contains only vertices corresponding
to indices in I(x). The first condition says that the corresponding linear combination of input vectors
is −aO |ti. When aO 6= 0, the existence of a |wi satisfying these conditions is equivalent to fP (x) = 1.
     Next, we have that fP (x) = 0 if and only if |ti ∈  / Range(AΠ(x)), which in turn holds if and only
if there is a vector |w i ∈ C with ht|w i 6= 0 and Π(x)A† |w0 i = 0.
                          0     d           0
                                                                       This is equivalent to the existence
of a vector (|w0 i, |w00 i) ∈ Cd × C|I| in the kernel of A† , 1 − Π(x) with ht|w0 i =
                                                                                    6 0 (for i ∈ I r I(x), set
hi|w00 i = −hi|A† |w0 i).

Example 3.6. The span program P with n = 3, d = 2, |ti = (1, 0), Vfree = V1,0 = V2,0 = V3,1 = 0,   / V1,1 =
{(1, α)}, V2,1 = {(1, β )}, V3,0 = {(1, γ)} computes the function MAJ3 (x1 , x2 , x̄3 ), provided α, β , γ are
distinct and nonzero. The biadjacency matrix BGP (101) is
                                                              
                                               1 1 1         1
                                              0 α β         γ 
                                                              
                                              0 0 0         0 
                                                              .
                                              0 0 1         0 
                                               0 0 0         1

3.2   Dual span program
Observe that attaching a dangling weight-one edge to the output vertex complements the conditions of
Theorem 3.5. Therefore a length-one path can be thought of as a NOT gate gadget. The construction of
GP (x) from GP is also based on adding NOT gates to false inputs.

Definition 3.7 (Dual span program). For a span program P, define its dual P† by complementing the
output and all inputs with dangling weight-one edges, and also switching each input vector set V j,b
with V j,1−b .

Example 3.8. For distinct, nonzero α, β , γ, the span program

                                                      > x̄1 x̄2 x3
                                                 z}|{          
                                            1        10 0 0 0
                                          0       1 α 1 0 0
                                    |ti =        
                                          0, A =  1 β 0 1 0
                                                                  

                                            0        1γ 0 0 1

computes ¬ MAJ3 (x1 , x2 , x̄3 ). Here, we have introduced the notation that > labels input vectors in Vfree ,
and input vectors in V j,1 or V j,0 are indicated by the positive or negative literals x j or x̄ j , respectively;
thus, e. g., V3,1 = {(0, 0, 0, 1)}. This span program is dual to the span program in Example 3.6.

   From Definition 3.2, it is immediate that the dual span program P† satisfies fP† = ¬ fP and wsize(P† ) =
wsize(P). (See also [35, Lemma 4.1].)
   Alternative constructions of dual span programs are given in [16, 34].

                         T HEORY OF C OMPUTING, Volume 8 (2012), pp. 291–319                                 297
                                      B EN W. R EICHARDT AND ROBERT Š PALEK


                                                                                                      x1    x2 x3

                                                                                          x1 x2 x3
                   x1 x2 > x3 x4 x5
                                
           1        1 1 1 0 0 0
                                                                                                       x4           x4
         0      α β γ 0 0 0 
   |ti =        
         0, A = 0 0 1 1 1 1
                                   

           0        0 0 0 α β γ
                         (a)                                        (b)                              (c)


Figure 2: (a) A composed span program that computes the function MAJ3 (x1 , x2 , MAJ3 (x3 , x4 , x5 )),
provided α, β , γ are distinct and nonzero. (b) The associated composed graph, with labeled input vertices.
(c) Graph for MAJ3 (x1 , x2 , x3 ) ⊕ x4 , evaluated on input x = 1100. The span program for PARITY is from
Section 3.4. The eigenvalue-zero eigenvector promised by Theorem 3.5 is supported on the black vertices.

           Gate g              Adv± (g)                          Gate g                           Adv ±
                                                                                                     √ (g)
             0                    0                          x1 ∨ (x2 ⊕ x3 )                          5
             x1                   1                           x1 ⊕ x2 ⊕ x3                        p  3
                                 √                                                                      √
          x1 ∧ x2                  2                   (x1 ∧ x2 ∧ x3 ) ∨ (x̄1 ∧ x̄2 )              3+ 3
          x1 ⊕ x2                √2            IF-THEN-ELSE(x) = (x1 ∧ x2 ) ∨ (x̄1 ∧ x3 )            2
        x1 ∧ x2 ∧ x3             √3              MAJ3 (x) = (x1 ∧ x2 ) ∨ ((x1 ∨ x2 ) ∧ x3 )          2√
       x1 ∨ (x2 ∧ x3 )             3
                                   √          EQUAL3 (x) = (x1 ∧ x2 ∧ x3 ) ∨ (x̄1 ∧ x̄2 ∧ x̄3 )    3/√ 2
       x1 ⊕ (x2 ∧ x3 )          1+ 2          EXACT2 of 3 (x) = MAJ3 (x) ∧ (x̄1 ∨ x̄2 ∨ x̄3 )         7


Table 1: Inequivalent binary gates on up to three bits x = x1 x2 x3 ∈ {0, 1}3 , and their adversary bounds.


3.3   Span program composition
Definition 3.9 (Composed graph and span program). Consider span program P = (n, d, |ti,Vfree , {V j,b })
and span programs Q j and Q†j for j = 1, . . . , n. The composed graph is defined by attaching GP to copies
of the GQ j and GQ†j graphs as follows: each input vertex associated to V j,1 should be identified with the
output vertex of a copy of GQ j ; and each input vertex associated to V j,0 should be identified with the
output vertex of a copy of GQ†j . The composed span program is the span program corresponding to this
composed graph.

    The composed span program computes the composed function fP ◦ ( fQ1 , . . . , fQn ). Two examples are
given in Figure 2. Span program composition is studied in more detail in [35, Sec. 4.2].

3.4   Examples: Span programs for three-bit gates
In order to make the above formalism more concrete, we now present span programs for every function
{0, 1}3 → {0, 1}. Up to permutation of inputs and complementation of some or all inputs or output, there

                               T HEORY OF C OMPUTING, Volume 8 (2012), pp. 291–319                                  298
              S PAN -P ROGRAM -BASED Q UANTUM A LGORITHM FOR E VALUATING F ORMULAS

are fourteen inequivalent three-bit functions. Table 1 lists these functions together with their general
adversary bounds, as computed by [23]. For all three-bit functions, the general adversary bound and the
adversary bound are equal. Based on our algorithm in Section 5 below, it will follow that the general
adversary bound is a lower bound on span program witness size. Therefore the span programs we give
here are optimal.
Theorem 3.10. For every three-bit function g : {0, 1}3 → {0, 1}, there exists a span program P with fP = g,
such that the witness size of P with uniform costs is
                                           wsize(P) = Adv± (g) .                                          (3.6)

Proof. We begin by giving optimal span programs for the functions MAJ3 , following Example 3.6, and
AND (equivalent to OR). We construct optimal span programs for the other functions by combining these
two basic span programs.
Claim 3.11. Let PMAJ3 be the span program from Example 3.6 with (α, β , γ) = (1, e2πi/3 , e−2πi/3 ). Then
if MAJ3 (x) = 0, wsize(PMAJ3 , x) ≤ 6, while if MAJ3 (x) = 1, wsize(PMAJ3 , x) ≤ 2/3.

Proof. Substitute PMAJ3 into Definition
                                   √       3.2. Some of the witnesses are |w0000 i = (1, 0), |w0100 i = (1, −1),
and |w110 i = (e−iπ/6 , eiπ/6 , 0)/ 3, |w111 i = (1, 1, 1)/3. The other witness vectors are symmetrical.
                                                                                                            √
    The cases MAJ3 (x) = 0 or 1 can be balanced by scaling the input vectors by a factor of 1/ 3,
yielding a span program with witness size two.
    For functions b and b0 on disjoint inputs, the general adversary bound satisfies Adv± (b ⊕ b0 ) =
Adv± (b) + Adv± (b0 ) and Adv± (b ∨ b0 ) = (Adv± (b)2 + Adv± (b0 )2 )1/2 [6, 24]; we obtain matching upper
bounds for span program witness size.
Claim 3.12. Let s ∈ (0, ∞)n and let P be the span program with target vector |ti = 1, and input vectors
                                               
                                      √s 
                                            j
                              V j,1 = q            and V j,0 = Vfree = 0/ .
                                     4         
                                         ∑i s2i
                                                   q
Then fP is the n-bit OR function, and wsizes (P) = ∑ j s2j .

Proof. On input x = 0n , the unique witness |w0 i = 1 satisfies
                                                                     r
                                 ht|w0 i = 1   and kSA† |w0 ik2 =      ∑ s2j .
                                                                           j
                                                             q
                                                                        √
On input x = 10n−1 , the unique witness |wi, with coefficient 4 ∑i s2i / s1 on the V1,1 input vector, satisfies
                                                                   r
                                   A|wi = |ti and      kS|wik2 =      ∑ s2j .
                                                                       j

The other inputs with Hamming weight one are symmetrical, and the witness size only decreases for
larger Hamming weight inputs.

                         T HEORY OF C OMPUTING, Volume 8 (2012), pp. 291–319                               299
                                    B EN W. R EICHARDT AND ROBERT Š PALEK

      DeMorgan’s laws allow for converting ANDs into ORs. For example, a span program P for x1 ∧ x2
has
                               1       √ √
                    |ti = q           ( s1 , s2 ) ,    V j,1 = {(1, 0)} and V j,2 = {(0, 1)} .
                          4
                            s21 + s22

                     q
Then wsizes (P) =     s21 + s22 .
   Other span programs can be written by combining the above span programs for AND and for OR.
Our span program for PARITY is based on the identity x1 ⊕ x2 = (x1 ∧ x̄2 ) ∨ (x̄1 ∧ x2 ), as follows:

                                                             x1 ∧ x̄2 x̄1 ∧ x2
                                        |ti = (1) , A = (       1        1       ).

Here, we have introduced a convenient shorthand notation of labeling input vectors by AND gates on
subsets of the input bits (such input vectors are termed “grouped input vectors” in [39]). This should be
interpreted as indicating a composition with an unweighted span program for the AND gate, i. e., as the
span program
                                                        > x1 x̄2 > x̄1 x2
                                                                      
                                             1           1 0 0 1 0 0
                                           0         1 1 0 0 0 0 
                                                                       .                                       (3.7)
                                     |ti =           
                                            0  , A = 1 0 1 0 0 0 
                                                                         
                                           0         0 0 0 1 1 0 
                                             0           0 0 0 1 0 1

Claim 3.13. The span program P from Equation (3.7) has fP (x) = x1 ⊕ x2 and wsizes (P) = s1 + s2 .


   The proof is an easy calculation.
   Optimal span programs for most three-bit functions can be constructed using Claims 3.11, 3.12
and 3.13, e. g., the function EXACT2 of 3 (x1 , x2 , x3 ) = MAJ3 (x1 , x2 , x3 ) ∧ (x̄1 ∨ x̄2 ∨ x̄3 ), so there is a span
program with witness size at most

                                    q                              √
                                     wsize(MAJ3 )2 + wsize(OR3 )2 = 7 .


The only exceptions are (x1 ∧ x2 ∧ x3 ) ∨ (x̄1 ∧ x̄2 ) and the equality function. For the former function, use
the span program
                                                     x1 ∧ x2 x3 x̄1 ∧ x̄2
                                                                       
                                            1          α1 0 α1 α2
                                     |ti =     , A=
                                            0          α2 1         0

                           T HEORY OF C OMPUTING, Volume 8 (2012), pp. 291–319                                      300
               S PAN -P ROGRAM -BASED Q UANTUM A LGORITHM FOR E VALUATING F ORMULAS
          q                   p√
where α1 = 4 1 + √13 and α2 =   3 − 1. This expands out to

                                                   > x1 x2 x3 > x̄2 x̄3
                                                                     
                                         1        α1 0 0 0 α1 α2 0 0
                                       0       α 0 0 1 0 0 0 
                                                2                    
                                       0       1 1 0 0 0 0 0  .
                                               
                                 |ti =   , A =                       
                                                                        
                                       0       1 0 1 0 0 0 0 
                                       0       0 0 0 0 1 1 0 
                                         0         0 0 0 0 1 0 1
                        p   √
The witness size is      3 + 3, matching the adversary bound.1 Finally,
Claim 3.14. Using the condensed notation introduced for PARITY, an optimal span program for the
equality function, EQUALn (x) = (∧nj=1 x j ) ∨ (∧nj=1 x̄ j ), is given by

                                                           ∧ j x j ∧ j x̄ j
                                                          √        √
                                         |ti = (1) , A = ( n − 1 4 n − 1) .
                                                          4


                                     √
This span program has witness size n/ n − 1, which equals the adversary bound.2

Proof. The unweighted AND span program has witness size n on input x = 1n , and 1/(n√ − |x|) otherwise.
                                                               n                    4
    For the above√span program, therefore, on input x = 1 , the witness |wi = (1/ n − 1, 0) gives a
witness size of n/ n − 1, where the factor of n is from the AND gate. The input x = 0n is symmetrical.
    On a false input x, |w0 i = (1) gives a witness size of
                                                           
                                    √            1       1        n
                                      n−1            +        ≤√     .
                                              n − |x| |x|        n−1



    This completes the proof of Theorem 3.10.

    Many more optimal span programs, for various four-bit functions, can be found in an earlier arXiv
version of this article [39].
   1 A calculation gives that the optimal adversary matrix Γ in Definition 2.1 comes from the matrix

                        √                           q                                  q
                     1 a √3
                                             1
                                                             √ 1/2                  √         √ 
                     1 a √3 ,   where a =    2 (5 −    13 − 6 3)    ,    b = 21 − 1 − 3 + 2(8 + 3) ,
                     b0 3

and the rows correspond to inputs 011, 101, 110, the columns to inputs 000, 001, 111.
   2 The optimal adversary matrix Γ comes from the 2 × 2n matrix

                                                                         
                                                         1 1 ··· a a ··· ,
                                                         a a ··· 1 1 ···

where the rows correspond to inputs 0n and 1n , and the columns correspond to inputs of Hamming weight 1 and n − 1, and
a = 1/(n − 1).


                            T HEORY OF C OMPUTING, Volume 8 (2012), pp. 291–319                                    301
                                  B EN W. R EICHARDT AND ROBERT Š PALEK

4     Small-eigenvalue analysis of composed span programs
Definition 4.1 (Formula graph and span program). Given span programs for each gate in a formula ϕ,
span program P(ϕ) is defined as their composition according to the formula. Let G(ϕ, x) be the composed
graph on input x, G(ϕ, x) = GP(ϕ) (x)—except with length-two, weight-one paths attached to each input
vertex in I(x).
    Theorem 3.5 establishes that ϕ(x) = 1 if and only if G(ϕ, x) has an eigenvalue-zero eigenvector
supported on the output vertex. (Attaching length-two paths to the I(x) inputs clearly does not change
this property.) Thus if ϕ(x) = 0, the graph has a spectral gap centered on zero. The goal of this section
is to make Theorem 3.5 more quantitative, so that a phase estimation-based quantum algorithm for
evaluating ϕ can be constructed and analyzed.
    Since G(ϕ, x) is a bipartite graph, it suffices to study eigenvectors with non-negative eigenvalues.
Let λ ≥ 0 and assume that |ψλ i is a unit-normalized eigenvalue-λ eigenvector of the adjacency matrix
AG(ϕ,x) —except possibly not satisfying the eigenvector constraint at the output vertex. The output vertex
constraint is the only constraint that changes if we compose ϕ as an input to another span program, so
leaving it out facilitates proofs by induction in the formula depth. Let aO be the coefficient of |ψλ i on the
output vertex and bO be the coefficient of AG(ϕ,x) |ψλ i on the output vertex.
      • For λ = 0, we will place a lower bound on the achievable |aO |2 when ϕ(x) = 1. This will enable
        the algorithm to detect if ϕ(x) = 1 by starting a quantum walk at the output vertex.
      • For small enough eigenvalues λ > 0, we will show that the evaluation ϕ(x) corresponds to the
        ratio of the two output coefficients aO and bO . Let rO = aO /bO . If ϕ(x) = 1, then rO is large and
        negative, roughly of order −1/λ . If ϕ(x) = 0, then rO is small and positive, of order λ . It will
        follow that if ϕ(x) = 0, then there is a significant spectral gap centered on zero of eigenvectors
        with any support on the output vertex. This spectral gap will prevent the algorithm from outputting
        false positives.
    These two cases are considered in Sections 4.1 and 4.2 below. In our main theorem, Theorem 4.11
in Section 4.3, we will reimpose the output vertex constraint. For small enough λ > 0, it will give a
contradiction when ϕ(x) = 0, implying that the graph has a significant spectral gap.
    Assume for simplicity that the span programs used in ϕ, including dual span programs, have no free
input vectors, i. e., Vfree = 0.
                              / By Lemma 3.3, this assumption is without loss of generality. The span
program P(ϕ) will still have free input vectors at vertices between gates (as shown, e. g., in Figure 2).
We do not eliminate these free inputs, since doing so would make the graph dense, preventing an efficient
implementation of a quantum walk.

4.1     The case λ = 0
We begin the λ = 0 analysis by defining two subformula complexity measures. For each node v in a
formula ϕ, let Pv be the associated span program, and let
                                                                                      1
                  π(v) = max ∏ wsize(Pw ) ,                    σ (v) = 1 + max ∑          ,             (4.1)
                            ξ   w∈ξ                                          ξ   w∈ξ
                                                                                     π(w)


                         T HEORY OF C OMPUTING, Volume 8 (2012), pp. 291–319                             302
                S PAN -P ROGRAM -BASED Q UANTUM A LGORITHM FOR E VALUATING F ORMULAS

where the maximizations are both over simple paths in ϕ from v through its inputs to a leaf. The leaf itself
contributes +1 to σ (v). Let π(ϕ), σ (ϕ) be the same quantities evaluated at the root of ϕ. The second
quantity, σ (ϕ), accounts for amplitude on free inputs, and will be a constant in our ultimate application.

Lemma 4.2. If ϕ(x) = 1, then AG(ϕ,x) has an eigenvalue-zero eigenvector |ψ0 i with output coefficient aO
satisfying
                                       k|ψ0 ik2
                                                ≤ π(ϕ)σ (ϕ) .                                      (4.2)
                                        |aO |2
Proof. The proof is by induction in the depth of the formula.
    If the depth is zero, i. e., ϕ consists of a single variable x j or its negation, then since ϕ(x) = 1, G(ϕ, x)
is just a length-two path, with eigenvector |ψ0 i = (1, 0, −1). Thus |aO |2 = 12 k|ψ0 ik2 and π(ϕ) = 1,
σ (ϕ) = 2.
    Now for the induction step, assume that for each available input i of the base span program P we have
                                                      (i)
constructed an eigenvalue-zero eigenvector |ψ0 i of the corresponding subformula graph that satisfies the
induction hypothesis:
                                                  (i)
                                             k|ψ0 ik2
                                                  (i)
                                                          ≤ π(vi )σ (vi ) ,
                                               |aO |2
where vi is the gate in ϕ corresponding to input i. Let |wi be an optimal witness for P, achieving the
                                                                                                    (i)
minimum in Equation (3.1). Let aO = −1 and scale the ith eigenvector by the coefficient hi|wi/aO ,
combining them to give |ψ0 i, an eigenvalue-zero eigenvector for G(ϕ, x). Then

                            k|ψ0 ik2                    |hi|wi|2 (i)
                                  2
                                     = k|ψ0 ik2 = 1 + ∑ (i) k|ψ0 ik2
                             |aO |                    i |aO | 2

                                                                                      (i)
                                                                     2             k|ψ0 ik2
                                                      ≤ 1 + k|wik · max               (i)
                                                                           i        |aO |2
                                                      ≤ 1 + wsize(P) · max π(vi ) · max σ (vi )
                                                                               i              i
                                                      = π(ϕ)σ (ϕ) .

4.2    The case λ > 0
Now consider λ > 0. Similar to the proof of Lemma 4.2, our goal will be to construct an eigenvalue-λ
eigenvector |ψλ i for G(ϕ, x) inductively, based on joining together appropriately scaled eigenvectors for
the input subformula graphs.
     Let P be the root span program, with unit-length target vector |ti and matrix of input vectors A.
                                                             (i)
Assume that for each input i, we have constructed states |ψλ i satisfying the eigenvalue-λ constraints for
all the ith subgraph’s vertices except at the output vertex. Let ai and bi be the output coefficients for the
ith subgraph. Assume that there exists an ri 6= 0 such that ai = ri bi .3
   3 In Definition 4.1 we attached length-two paths to vertices in I(x) entirely so that this r exists even when input i is a single
                                                                                               i
literal, not another span program, and so that no special case in the analysis is needed.


                             T HEORY OF C OMPUTING, Volume 8 (2012), pp. 291–319                                               303
                                     B EN W. R EICHARDT AND ROBERT Š PALEK

                                                                      aO aI
                                                                         z}|{
                                                                             
                           a1                                          1 ho|    bO
                                                              BGP =  0       o
                                bO
                                        � �� �
                                                                       : C      bC
                                aO                                     0


Figure 3: The graph GP with coefficients aO , bO , |aI i, |bI i, |bC i indicated, in the case |ti = (1, 0, . . . , 0).


    Let ho| = ht|A and let C = (1 − |tiht|)A, so A = C + |tiho|. Let |aI i and |bI i denote the vectors of
scaled output coefficients for the input subgraphs, and let |bC i be the other row-vertex coefficients of
|ψλ i in the subgraph GP . The notation is meant to be suggestive: O for “output,” I for “input” and C for
“constraints.” Figure 3 places these coefficients on GP . From Equation (3.3), the additional eigenvalue-λ
eigenvector equations imposed by composition with GP are

                                         λ |bC i = C|aI i ,                                                    (4.3a)
                                           λ bO = ho|aI i + aO ,                                               (4.3b)
                                                                       †
                                          λ |aI i = |bI i + |oibO +C |bC i .                                   (4.3c)

Here, we have left off the eigenvalue-λ constraint at the output vertex, λ aO = bO , since this constraint
will be modified by later composition steps, as in (4.3c) for the input subgraphs. Letting r = ∑i ri |iihi|,
our goal is to solve Equations (4.3) with the subgraph output coefficients constrained by

                                                    |aI i = r|bI i .                                             (4.4)

    Our main technical theorem is:

Theorem 4.3. Fix an input x to ϕ, and let x̃ be the input to the root span program P. For each input
index i, let si = ri /λ if the input evaluates to 0 and let si = −ri−1 /λ if it evaluates to 1. Assume that
each si > 0.
     Then there exist εP , δP > 0, constants depending only on the span program P, such that, provided
0 < λ ≤ εP min{1, 1/ maxi si }, there exists an rO 6= 0 such that any solution to Equations (4.3) and (4.4)
                                                                          −1
satisfies aO = rO bO . Moreover, if sO is defined as be either rO /λ or −rO  /λ , depending on whether ϕ(x)
is 0 or 1, respectively, sO satisfies

                            0 < sO < wsize(P, x̃) · max si · (1 + δP λ 2 max s2i ) + δP .                        (4.5)
                                                        i                   i

    The proof of this theorem has two main parts. First, in Section 4.2.1 below, we will derive several
alternative expressions for span program witness size. Second, in Section 4.2.2, we will compute the
Taylor series of the inverse of a certain matrix, in order to solve Equations (4.3) and (4.4). The lowest-
order term in the solution will be related to the witness size of P, and we will bound the higher-order
terms.
    We will rely on the following notion of a matrix pseudoinverse:

                          T HEORY OF C OMPUTING, Volume 8 (2012), pp. 291–319                                     304
              S PAN -P ROGRAM -BASED Q UANTUM A LGORITHM FOR E VALUATING F ORMULAS

Definition 4.4. For a matrix M with singular-value decomposition M = ∑k mk |kihk0 | with all mk > 0 and
sets of orthonormal vectors {|ki}, {|k0 i}, let M + = ∑k m−1  0
                                                          k |k ihk|, the Moore-Penrose pseudoinverse of M.

    If M is invertible, then M + = M −1 . In general, MM + = ∑k |kihk| is the orthogonal projection onto
the range of M. If |bi is in the range of M, then
                                     M + |bi = arg min|xi:M|xi=|bi k|xik .

A convenient identity, immediate from the singular-value decomposition, is M + = (M † M)+ M † =
M † (MM † )+ . The norms of pseudoinverses can be bounded using the following claim:
Claim 4.5. For matrices A and B with Range(B† ) ⊆ Range(A) (i. e., B = BAA+ ), kA(BA)+ k ≤ kB+ k
and k(BA)+ k ≤ kA+ kkB+ k.
                                                      
Proof. Since A(BA)+ = A(BA)+ BB+ = A(BA)+ (BA)A+ B+ and the bracketed term is a projection,
kA(BA)+ k ≤ kB+ k. The second claimed bound follows, since (BA)+ = A+ · A(BA)+ .

4.2.1   Equivalent expressions for the witness size
Lemma 4.6. For S any positive-definite, diagonal matrix, let
                                   
                                    min|wi:AΠ|wi=|ti kS|wik2 if fP (x) = 1,
                   wsizeS (P, x) =                         † 0 2                                    (4.6)
                                    min|w0 i:ht|w0 i=1 kSA |w ik if fP (x) = 0.
                                                   ΠA† |w0 i=0

Then if fP (x) = 1,

                      wsizeS (P, x) =        min         |ho|ΠS−1 |wi|−2                            (4.7)
                                        |wi:hw|Π|wi=1
                                         CΠS−1 |wi=0

                                     = k(AΠS−1 )+ |tik2
                                     = k(Π − (CΠS−1 )+CΠS−1 )S−1 |oik−2

and, if fP (x) = 0, letting Π = 1 − Π, ∆ = CΠ(CΠ)+ and ∆ = 1 − ∆,

                      wsizeS (P, x) =         min         |ht|w0 i|−2                               (4.8)
                                        |w0 i:kSA† |w0 ik=1
                                           ΠA† |w0 i=0
                                                               
                                     = k 1 + (Π(AS)+ AS − 1)+ Π (AS)+ |tik−2
                                     = k(1 − (∆CS)+ ∆CS)S(1 −C† (ΠC† )+ )|oik2 .
Moreover,
        |w∗S i = arg min|wi:AΠ|wi=|ti kS|wik2 = S−1 (AΠS−1 )+ |ti has norm k|w∗S ik = O(1) and
        |w0∗                              † 0 2             † +          † +       †   † +
          S i = arg min|w0 i:ht|w0 i=1 kSA |w ik = |ti − (ΠC ) Π|oi − (SC ∆) S(1 −C (ΠC ) )|oi
                       ΠA† |w0 i=0

has norm k|w0∗
            S ik = O(1).


                        T HEORY OF C OMPUTING, Volume 8 (2012), pp. 291–319                           305
                                       B EN W. R EICHARDT AND ROBERT Š PALEK

Proof. Assume fP (x) = 1. Since |ti is in the range of AΠ,

                                             min           k|wik2 = k(AΠS−1 )+ |tik2 .
                                       |wi:AΠS−1 |wi=|ti

This is the second of the three expressions for wsizeS (P, x) in Equation (4.7). The other two expressions
follow by basic geometry: in general, for a vector |vi ∈ Rm and a subspace V ,

                                 min         k|wik2 =         min         |hv|wi|−2 = kΠV |vik−2 ,
                           |wi∈V :hv|wi=1                  w∈V :k|wik=1

where ΠV is the orthogonal projection onto V . Substitute |vi = S−1 Π|oi and V = Ker(CΠS−1 ) to
complete the proof of Equation (4.7).
   Next assume fP (x) = 0. As above,

                                       min         kSA† |w0 ik2 =         min         |ht|w0 i|−2 .
                                 |w0 i:ht|w0 i=1                    |w0 i:kSA† |w0 ik=1
                                  ΠA† |w0 i=0                           ΠA† |w0 i=0

Now, without loss of generality, |ti ∈ Range(A) = Range(AS), since otherwise fP is false on every
input. Therefore, ht|w0 i = ht|(SA† )+ (SA† )|w0 i = ht|(SA† )+ |wi if |wi = SA† |w0 i. We want to find the
length-one vector |wi that is in the range of SA† and also of Π, and that maximizes |ht|(SA† )+ |wi|2 . The
answer is clearly the normalized projection of (AS)+ |ti onto the intersection Range(SA† ) ∩ Range(Π).
In general, given two projections Π1 and Π2 , the projection onto the intersection of their ranges can be
written 1 − (Π1 Π2 − 1)+ (Π1 Π2 − 1). Substituting Π1 = Π and Π2 = (AS)+ AS gives the second claimed
expression.
    Finally, we show that

                       min           kSA† |w0 ik2 = k(1 − (∆CS)+ ∆CS)S(1 −C† (ΠC† )+ )|oik2 .
                   |w0 i:ht|w0 i=1
                    ΠA† |w0 i=0

Since fP (x) = 0, |ti does not lie in the span of the available input vectors, |ti ∈
                                                                                   / Range(AΠ), or equivalently
                    †                                          0
Π|oi ∈ Range(ΠC ). Therefore, there exists a vector |w i = |ti + |bC i that is orthogonal to the span of
the columns of AΠ and has inner product one with |ti. Any such |bC i has the form

                                              |bC i = −(ΠC† )+ Π|oi + ∆|vi ,

where |vi is an arbitrary vector with ht|vi = 0. We want to choose |vi to minimize the squared length of

                                      ΠSA† |w0 i = SΠ(|oi +C† |bC i)
                                                    = S(1 −C† (ΠC† )+ )|oi + SC† ∆|vi .

The answer is clearly the squared length of S(1 −C† (ΠC† )+ )|oi projected to the orthogonal complement
of the range of SC† ∆, as claimed. This corresponds to setting |vi = −(SC† ∆)+ S(1 −C† (ΠC† )+ )|oi.
    The norms of |w∗S i and |w0∗
                               S i are bounded using Claim 4.5.

    It is immediate that wsizes (P, x) is monotone increasing in each input complexity si . In the other
direction, we can show the following estimates:

                           T HEORY OF C OMPUTING, Volume 8 (2012), pp. 291–319                             306
             S PAN -P ROGRAM -BASED Q UANTUM A LGORITHM FOR E VALUATING F ORMULAS

Lemma 4.7. Let S and T be any positive-definite diagonal matrices. Then

                                     wsizeS√1+T (P, x) ≤ wsizeS (P, x) · (1 + kT k) ,                    (4.9)
                                                                                  2
                                     wsize√S2 +T 2 (P, x) ≤ wsizeS (P, x) + O(kT k ) .                  (4.10)

Proof. Equation
           √    (4.9) is immediate from the definition in Equation (4.6). To derive Equation (4.10), first
note that k S2 + T 2 |vik2 = kS|vik2 + kT |vik2 . Then when fP (x) = 1,

             min       (kS|wik2 + kT |wik2 ) ≤ kS|w∗S ik2 + kT |w∗S ik2 = wsizeS (P, x) + O(kT k2 ) ,
         |wi:AΠ|wi=|ti

where we have used that
                                              |w∗S i = arg min kS|wik2
                                                       |wi:AΠ|wi=|ti

has norm k|w∗S ik = O(1) by Lemma 4.6. The argument when fP (x) = 0 is similar:

                       min       (kSA† |w0 ik2 + kTA† |w0 ik2 ) ≤ wsizeS (P, x) + kTA† |w0∗  2
                                                                                         S ik ,
                   |w0 i:ht|w0 i=1
                    ΠA† |w0 i=0

where
                                             |w0∗               † 0 2
                                               S i = arg min kSA |w ik
                                                      |w0 i:ht|w0 i=1
                                                       ΠA† |w0 i=0

has O(1) norm by Lemma 4.6.

4.2.2   Proof of Theorem 4.3
Our proof will require two further matrix identities. First is a special case of the Woodbury matrix
inversion formula [19]: for matrices A and B of the correct sizes, if A and 1 + B† A−1 B are invertible, then
so is A + BB† , and
                          (A + BB† )−1 = A−1 − A−1 B(1 + B† A−1 B)−1 B† A−1 .                         (4.11)
Multiply the right-hand side by A + BB† to verify the identity. Second is the following claim:

Claim 4.8. Let A < 0, B  0 be positive semi-definite and positive definite matrices, respectively, of the
same size, and let ∆ = AA+ and ∆ = 1 − ∆ be projections onto the range of A and its complement. Denote
by B−1
    ∆
       the pseudoinverse (∆B∆)+ . Assume that A+ < A+ (B − BB∆−1 B)A+ . Then B − A is invertible and
                                                                    +
                   (B − A)−1 = B−1
                                ∆
                                   − (1 − B∆−1 B) A − ∆(B − BB−1
                                                              ∆
                                                                 B)∆ (1 − BB−1
                                                                            ∆
                                                                               ).                       (4.12)
                                                         √              √
Proof. By the assumption A+ < A+ (B − BB∆−1 B)A+ , 1 − A+ (B − BB−1∆
                                                                      B)  A+ is invertible on ∆.
To finish, multiply the claimed expression for (B − A) by B − A, and simplify using B∆−1 B = ∆ +
                                                      −1

B∆−1 B∆.

    With the necessary technical tools in place, we are ready to begin our proof.

                           T HEORY OF C OMPUTING, Volume 8 (2012), pp. 291–319                            307
                                   B EN W. R EICHARDT AND ROBERT Š PALEK

Proof of Theorem 4.3. Let Π = ∑i∈I(x̃) |iihi| be the projection onto the available inputs i, and let

                                                 Π=      ∑ |iihi|
                                                       i∈IrI(x̃)


be the projection onto the other inputs. With r = ∑i ri |iihi| and s = ∑i si |iihi|, we have

                                                    1
                                              s=      (−Πr−1 + Πr) .
                                                    λ
Let
                                                                    1
                                r̃ = (r−1 − λ )−1    and     s̃ =     (−Πr̃−1 + Πr̃) .
                                                                    λ
Whereas ri is the ratio between ai and the amplitude bi arriving from deeper in the formula, −r̃i can be
seen to be the ratio between ai and the amplitude arriving from shallower vertices. We first claim that r̃
and s̃ are well-defined:
                                         √                  √
Lemma 4.9. Assume that 0 < λ ≤ 1/ 2 and 0 < si λ ≤ 1/ 2 for all i ∈ I. Then r̃ = (r−1 − λ )−1 exists.
Moreover, for each input i, si < s̃i ≤ si + 1.

Proof. If input i ∈ I(x̃), then using 0 < si = −1/(λ ri ), r̃i = (ri−1 −λ )−1 exists, and s̃i = −1/(λ r̃i ) = si +1.
If input i ∈/ I(x̃), then using ri = si λ < 1, again r̃i exists, and s̃i = r̃i /λ = si /(1 − si λ 2 ). It follows that
si < s̃i ≤ si + 1.

      Next we solve Equations (4.3) and (4.4) to derive an exact expression for rO :

Lemma 4.10. Any solution to Equations (4.3) and (4.4) has aO = rO bO , where
                                                                           
                               rO = λ + ho| r̃ − λ r̃C† (λ 2 + λCr̃C† )−1Cr̃ |oi ,                            (4.13)

provided that r̃ = (r−1 − λ )−1 exists and λ 2 + λCr̃C† is invertible.

Proof. Substituting Equations (4.4) and (4.3a) into (4.3c), and rearranging terms gives

                                                  1   
                                         λ − r−1 − C†C |aI i = |oibO .
                                                  λ
From Equation (4.3b),
                                                                   
                                                             −1 1 † −1
                          aO = λ bO − ho|aI i = λ bO + ho| r̃ + C C    |oibO .
                                                                λ

Equation (4.13) then follows by the Woodbury identity, Equation (4.11), provided that r̃ and (1 +
(1/λ )Cr̃C† )−1 exist.

                          T HEORY OF C OMPUTING, Volume 8 (2012), pp. 291–319                                    308
              S PAN -P ROGRAM -BASED Q UANTUM A LGORITHM FOR E VALUATING F ORMULAS

             √
    Let S = s̃, y = CS−1 Π, ȳ = CSΠ, Y = yy† and Ȳ = ȳȳ† . Observe that kȲ k = O(ks̃k) and also, since
Y + = (y† )+ y+ , by Claim 4.5 kY + k = O(ks̃k). Let

                                     X = λ 2 + λCr̃C† = λ 2 (1 + Ȳ ) −Y .

Provided that X is invertible, we have from Equation (4.13) that
                                                   1                                                  !
           1                                         ho|S−1 y† X −1 yS−1 |oi + λ 3 ho|Sȳ† X −1 ȳS|oi
   rO = λ − ho|S−2 Π|oi + λ ho|S2 Π|oi −           λ                                                  .   (4.14)
           λ
                                                  −λ ho|S−1 y† X −1 ȳS|oi + ho|Sȳ† X −1 yS−1 |oi

   To evaluate X −1 , we will apply Claim 4.8 with A = Y and B = λ 2 (1 + Ȳ ). By a singular-value
decomposition ∆ȳ = ∑k mk |kihk0 |, we obtain

                                                           m2k
                                   ȳ† (1 + Ȳ )−1 ȳ = ∑      2
                                                                 |k0 ihk0 | ≺ 1 .
                                                ∆
                                                       k 1 + m k

Therefore, using our bounds on kY + k and kȳk, provided that λ ks̃k is at most a sufficiently small constant,
                  √                 √               √                               √
              1 − A+ (B − BB−1 ∆
                                 B) A+ = 1 − λ 2 Y + (1 + Ȳ − Ȳ (1 + Ȳ )∆−1Ȳ ) Y +  0 .

By Claim 4.8, X = B − A is invertible.
   Substitute Equation (4.12) and the series expansion
                                       + ∞ √ h√               √ i j√
                    A − ∆(B − BB∆−1 B)∆ = ∑ A+ A+ (B − BB∆−1 B) A+   A+
                                                 j=0

into Equation (4.14), and collect terms with like powers of λ . The lowest-order term is
                   1             1                         1
                  − ho|S−2 Π|oi + ho|S−1 y† A+ yS−1 |oi = − k(1 − y+ y)S−1 Π|oik2
                   λ             λ                         λ
since y† A+ y = y† (y† )+ y+ y = y+ y. Including the other terms gives

                         1                                                                
                  rO = −   k(1 − y+ y)S−1 Π|oik2 + λ 1 + hv|V |vi + k(y† )+ S−1 |oik2
                         λ
                         ∞                                  j/2 +                         2              (4.15)
                       + ∑ λ 3+2 j y+ (1 + ȳV ȳ† )(y† )+      y ȳV |vi − (y† )+ S−1 |oi
                           j=0


where we have let V = 1− ȳ† (1+ Ȳ )−1
                                     ∆
                                        ȳ  0 and |vi = (SΠ− ȳ† (y† )+ S−1 )|oi. Observe that all coefficients,
except the first, are non-negative, and that the coefficient of λ 3+2 j is of order O(ks̃k3+2 j ).
   When ϕ(x) = fP (x̃) = 1, the first term in (4.15) is −1/(λ wsizeS (P, x̃)), using the last expression of
Equation (4.7) for wsizeS (P, x̃). Therefore,

                                 −λ ks̃krO ≥ wsize(P)−1 − O(λ 2 ks̃k2 ) > 0 ,

                         T HEORY OF C OMPUTING, Volume 8 (2012), pp. 291–319                                 309
                                B EN W. R EICHARDT AND ROBERT Š PALEK

so sO = −1/(λ rO ) satisfies
                                                                                 
                           wsizeS (P, x̃) ≤ sO ≤ wsizeS (P, x̃) 1 + O(λ 2 ks̃k2 ) .                  (4.16)

In particular, since each s̃i ≤ si + 1 (Lemma 4.9), by Lemma 4.7,
                                                                   
                                sO ≤ wsize(P, x̃)ksk 1 + O(λ 2 ksk2 + O(1) .

   Assume then that ϕ(x) = fP (x̃) = 0, i. e., Π|oi ∈ Range(ΠC† ). The first term in (4.15) is zero.
Consider the second term. Using ΠC† (ΠC† )+ |oi = Π|oi and (y† )+ S−1 |oi = (ΠS−1C† )+ S−1 Π|oi =
(ΠC† )+ |oi,

                          |vi = (SΠ − ȳ† (ΠC† )+ )|oi + SΠ(1 −C† (ΠC† )+ )|oi
                               = S(1 −C† (ΠC† )+ )|oi .

Furthermore, from the singular-value decomposition of M = ∆ȳ, one infers that

             V = 1 − M † (1 + MM † )−1 M = (1 − M + M) + M + (1 − (1 + MM † )−1 )(M † )+ .

Substituting this into Equation (4.15), the second term becomes
                                                                                               
                       +
        λ k(1 − (∆ȳ) (∆ȳ))|vik2 + 1 + k(ΠC† )+ |oik2 + k(1 − (1 + ∆Ȳ ∆)−1 )1/2 (ȳ† ∆)+ |vik2 .

                                                                       +
Moreover, as ∆CΠ = 0, ∆ȳ = ∆CSΠ = ∆CS. Therefore, k(1 − (∆ȳ) (∆ȳ))|vik2 = wsizeS (P, x̃), using the
last expression of Equation (4.8) for wsizeS (P, x̃). Also,

                               k(1 − (1 + ∆Ȳ ∆)−1 )1/2 (ȳ† ∆)+ |vik2 = O(1) ,
                                +
since k(ȳ† ∆)+ |vik = O(kS(∆ȳ) k) = O(1) (Claim 4.5). Thus sO = rO /λ satisfies

                       wsizeS (P, x̃) + 1 ≤ sO ≤ wsizeS (P, x̃) + O(1) + O(λ 2 ks̃k3 )
                                                                                                     (4.17)
                                               ≤ wsizes (P, x̃) + O(λ 2 ksk3 ) + O(1) ,

where we have used ks̃k ≤ ksk + 1 (Lemma 4.9) and wsizeS (P, x̃) ≤ wsizes (P, x̃) + O(1) (Lemma 4.7).
   Thus in either case ϕ(x) = 1 or ϕ(x) = 0 we have found
                                                                   
                          0 < sO ≤ wsize(P, x̃)ksk 1 + O(λ 2 ksk2 ) + O(1) ,

as desired. This concludes the proof of Theorem 4.3.

4.3   Span program spectral analysis of ϕ
To study the spectra of the composed graphs G(ϕ, x) of Definition 4.1, and argue that in the case ϕ(x) = 0
there is an Ω(1/π(ϕ)) spectral gap, we will solve the recursion set up by Theorem 4.3:

                        T HEORY OF C OMPUTING, Volume 8 (2012), pp. 291–319                            310
              S PAN -P ROGRAM -BASED Q UANTUM A LGORITHM FOR E VALUATING F ORMULAS

Theorem 4.11. Let ϕ be a formula. For each node v, let Pv be the associated span program and assume
that wsize(Pv ) ≥ 1. Let π(v), σ (v) be defined as in Equation (4.1), and let σ̄ (v) be given by

                                            σ̄ (v) = max ∑ π(w)2 ,                                      (4.18)
                                                     ξ   w∈ξ

where the maximization is over simple paths in ϕ from v through its inputs to a leaf. Let π(ϕ), σ (ϕ), σ̄ (ϕ)
be the same quantities evaluated at the root of ϕ. Let C = max{σ (ϕ), σ̄ (ϕ)/π(ϕ)2 }. For a constant span
program P, let δP , εP be the constants from Theorem 4.3. Let δ = max{1, maxv δPv } and ε = 1/(8C3 δ 2 ).
    For an input x, let G(ϕ, x) be the composed graph from Definition 4.1, and let G̃(ϕ, p        x) be the
same except with the weights of edges to the final output vertex scaled by a factor of f = 1/ Cπ(ϕ).
(Equivalently, scale the target vector of the root span program by the same factor f .) Then,

    • If ϕ(x) = 1, there exists an eigenvalue-zero eigenvector |ψ̃0 i of the adjacency matrix AG̃(ϕ,x) with
      at least half of its weight on the output vertex, i. e., |ãO |2 /k|ψ̃0 ik2 ≥ 1/2.

    • If ϕ(x) = 0, then AG̃(ϕ,x) does not have any eigenvalue-λ eigenvectors with nonzero output coeffi-
      cients ãO or b̃O for

                                         1       np                            o
                               |λ | ≤        min   ε/δ , min{1, min εPv }/(2δC) .
                                        π(ϕ)                     v


Proof. If ϕ(x) = 1, then by Lemma 4.2, G(ϕ, x) has an eigenvalue-zero eigenvector |ψ0 i with output
coefficient aO satisfying
                                   |aO |2         1
                                          2
                                            ≥            ≥ f2.
                                  k|ψ0 ik     π(ϕ)σ (ϕ)
Let |ψ̃0 i be the same vector, except with output coefficient ãO = 1f aO . Then |ψ̃0 i is an eigenvalue-zero
eigenvector of G̃(ϕ, x), and
                                                     1
                               |ãO |2                 |a |2
                                                     f2 O                 1
                                        =    1
                                                                     ≥        .
                              k|ψ̃0 ik2      f2
                                                         2
                                                − 1 |aO | + k|ψ0 ik 2   2 − f2

    Now assume that ϕ(x) = 0. By Theorem 3.5, there does not exist any eigenvalue-zero eigenvector
with ãO 6= 0. Also b̃O = 0 necessarily at λ = 0 by the constraint λ ãO = f b̃O .
    For λ 6= 0, we need to solve the recursion from Theorem 4.3, then impose the final output vertex
constraint. The base cases are: at a false input, by the eigenvalue-λ constraint for a path of length one,
ai = λ bi , so si = 1; and at a true input, by the eigenvalue-λ constraints for a path of length two, similarly
si = 1/(1 − λ 2 ). From Equation (4.5), we have that provided |λ | ≤ εP min{1, 1/ maxi si },

                             sO < wsize(P) · max si · (1 + δP λ 2 max s2i ) + δP .
                                                 i                   i

The following claim solves this recursion, somewhat conservatively, assuming that |λ | is sufficiently
small.

                         T HEORY OF C OMPUTING, Volume 8 (2012), pp. 291–319                               311
                                  B EN W. R EICHARDT AND ROBERT Š PALEK

Claim 4.12. Define a quantity s(v) recursively by
                           (
                            2                                                 if v is a leaf of ϕ,
                  s(v) =                                         2                                                   (4.19)
                            wsize(Pv ) maxc s(c) 1 + ε max c s(c)
                                                        π(ϕ)   2     +δ         otherwise,

where in the second case the maximizations are over v’s inputs in ϕ. Then s(v) ≤ 2δCπ(v).

Proof. For brevity, let wv = wsize(Pv ) ≥ 1. Define a quantity s0 (v) recursively by

                                            s0 (v) = wv max s0 (c) + δ ,
                                                           c

for an internal node v, and s0 at a leaf is 2. Then by induction, one finds s0 (v) ≤
                                                                                    δ π(v)σ (v). (For the in-
duction step, use s0 (v)/δ ≤ wv maxc π(c)σ (c) + 1 ≤ π(v) maxc σ (c) + 1/π(v) = π(v)σ (v).) Moreover,
from their definitions, clearly s(v) ≥ s0 (v) always.
    Let D = 2δC and assume by induction that s(c) ≤ Dπ(c) for each input c. Then we have from
Equation (4.19)
                                                                                    
                       wv maxc s(c) + δ         wv maxc s(c) 1 + wv maxδ c s0 (c)           maxc ss(c)    0
                                                                                                   0 (c) s (v)
              s(v) ≤                    2   ≤                          2                ≤                  2     .
                       1 − εD2 max c π(c)
                                π(ϕ)2
                                                                 π(v)
                                                         1 − εD2 π(ϕ) 2
                                                                                                    π(v)
                                                                                            1 − εD2 π(ϕ) 2



The above inequality places a recursive bound on the ratio s(v)/s0 (v). Solve it, using (1 − a)(1 − b) ≥
1 − a − b for ab ≥ 0, to find
                                                   s0 (v)
                                        s(v) ≤           σ̄ (v)
                                                                 .
                                               1 − εD2 π(ϕ)    2


Using σ̄ (v) ≤ σ̄ (ϕ) ≤ Cπ(ϕ)2 and ε = 1/(2D2C), therefore s(v) ≤ 2s0 (v) ≤ 2δCπ(v), as desired.
                                      p
    Therefore, provided that |λ | ≤ ε/ maxv δPv /π(ϕ) and |λ | ≤ minv εPv /(2δCπ(ϕ)), we obtain that
either aO = bO = 0 or the final output ratio sO = aO /(λ bO ) satisfies 0 < sO < 2δCπ(ϕ). Here aO and bO
are the output coefficients for the unscaled graph G(ϕ, x); the same solution works in G̃(ϕ, x), except
with output coefficients ãO = aO / f and b̃O = bO .
    We have not yet used the eigenvector equation at p    the output vertex, √λ ãO = f b̃O . Combining this
equation with f ãO /(λ b̃O ) < 2δCπ(ϕ) implies |λ | > f / 2δCπ(ϕ) = 1/( 2δCπ(ϕ)). This contradicts
the assumption |λ | ≤ 1/(2δCπ(ϕ)). Therefore, the adjacency matrix of G̃P (x) cannot have an eigenvalue-
λ eigenvector with either output coefficient ãO or b̃O nonzero.

Proposition 4.13. Let P be a finite set of span programs each with witness size greater than one,
and let ϕ be a formula using only the associated gates. Then in the statement of Theorem 4.11, the
parameters
     p     C and δ can be bounded above by a function only of P, independent of ϕ, and similarly ε and
min{ ε/δ , min{1, minv εPv }/(2δC)} can be bounded below by a positive function of P. In particular,
when ϕ(x) = 0, AG̃(ϕ,x) has an Ω(1/π(ϕ)) spectral gap for eigenvectors with ãO or b̃O nonzero.


                         T HEORY OF C OMPUTING, Volume 8 (2012), pp. 291–319                                           312
              S PAN -P ROGRAM -BASED Q UANTUM A LGORITHM FOR E VALUATING F ORMULAS

Proof. Since P is finite, maxv∈ϕ δPv ≤ maxP∈P δP < ∞ and minv∈ϕ εPv ≥ minP∈P εP > 0, bounds indepen-
dent of ϕ. Since C = max{σ (ϕ), σ̄ (ϕ)/π(ϕ)2 }, it remains to show that σ (ϕ) and σ̄ (ϕ)/π(ϕ)2 can each
be bounded above by a function only of P.
    Recall from Equation (4.1) that σ (ϕ) = 1 + maxξ ∑v∈ξ 1/π(v), where the maximization is over
simple paths ξ from the root to a leaf. Note that for a node v ∈ ϕ, having a child c, π(v) ≥ wsize(Pv )π(c).
Therefore, for any fixed path ξ , the sum ∑w∈ξ 1/π(w) is term-wise dominated by the geometric series
∑∞j=0 1/(minP∈P wsize(P)) j < ∞.
    From Equation (4.18), σ̄ (ϕ)/π(ϕ)2 = maxξ ∑w∈ξ π(w)2 /π(ϕ)2 . The first term in the series, for the
root of ϕ, is π(ϕ)2 /π(ϕ)2 = 1, and the ratio between the term for a node v and for its child c is bounded
by π(v)2 /π(c)2 ≤ 1/wsize(Pv )2 . Therefore σ̄ (ϕ)/π(ϕ)2 < ∑∞j=0 1/(minP∈P wsize(P))2 j < ∞.

   In the following section, we will use this spectral gap, together with the completeness condition in
Theorem 4.11, to develop an O(π(ϕ))-query quantum algorithm for evaluating ϕ.


5    Quantum formula-evaluation algorithm
Our quantum algorithm for evaluating ϕ will run a quantum walk starting at the output vertex of the
graph G̃(ϕ, x) from Theorem 4.11. In brief, Szegedy [43] has determined a correspondence between
random walks and quantum walks, that can be reformulated as a correspondence between continuous-time,
Hamiltonian-based quantum walks and discrete-time quantum walks. This correspondence relates the
eigenvectors of the adjacency matrix AG̃(ϕ,x) to those of a discrete-time quantum walk, and relates the
eigenvalues according to λ ↔ e±i arccos λ . If ϕ(x) = 1, then the initial state will have large overlap with
walk eigenvectors with eigenvalues e±iπ/2 = ±i, whereas if ϕ(x) = 0, then the walk will have a spectral
gap around phases ±π/2. These two cases can therefore be distinguished by phase estimation [28, 32].
    Our original algorithm from earlier versions of this paper [39] has since been generalized. We
therefore only state the main theorem here, and refer the reader to [35] for a proof.
Theorem 5.1 ([35, Theorem 9.1]). Let G = (V, E) be a complex-weighted graph with Hermitian weighted
adjacency matrix AG ∈ L(CV ) satisfying hv|AG |vi ≥ 0 for all v ∈ V . Let Vinput be a subset of degree-one
                                                                                         F
vertices of G whose incident edges have weight one, and partition Vinput as Vinput = j∈{1,...,n},b∈{0,1} V j,b .
For x ∈ {0, 1}n , define G(x) from G by deleting all edges to vertices in ∪nj=1V j,x j . Let AG(x) ∈ L(CV ) be
the weighted adjacency of matrix of G(x). For any Λ ≥ 0, let ΠΛ (x) be the orthogonal projection onto
the span of those eigenvectors of AG(x) with eigenvalues at most Λ in magnitude.
     Let f : D → {0, 1}, with D ⊆ {0, 1}n . Let µ ∈ V r Vinput , ε = Ω(1) and Λ > 0. Assume that for
all inputs x with f (x) = 1, kΠ0 (x)|µik2 ≥ ε, and that for all x with f (x) = 0, kΠΛ (x)|µik2 ≤ ε/2.
Then f can be evaluated by a quantum algorithm with error probability at most 1/3 using at most
Q = O(k abs(AG )k/Λ) input queries, where abs(AG ) is the entry-wise absolute value of AG . Moreover,
if the maximum degree of a vertex in GP is d, then the time complexity of the algorithm for evaluating
                                               O(1)
 fP is at most a factor of (log d) log(Q log d)      worse, after classical preprocessing and assuming
constant-time coherent access to the preprocessed string.
    Note that Theorem 4.11 shows a spectral gap in the case ϕ(x) = 0, whereas the soundness condition
in Theorem 5.1 requires only the weaker assumption that the initial state have small overlap on the span

                         T HEORY OF C OMPUTING, Volume 8 (2012), pp. 291–319                               313
                                B EN W. R EICHARDT AND ROBERT Š PALEK

of the small-eigenvalue eigenvectors, or an “effective spectral gap.”
    Our formula-evaluation algorithm follows from Theorems 4.11 and 5.1:

Theorem 5.2. Let S be the set of all three-bit Boolean functions. There exists a quantum algorithm
that evaluates an adversary-balanced formula ϕ(x) over S using O(Adv± (ϕ)) input queries. After
efficient classical preprocessing independent of the input x, and assuming O(1)-time coherent access to
the preprocessed classical string, the running time of the algorithm is Adv± (ϕ)(log Adv± (ϕ))O(1) .

Proof. By Theorem 4.11 and Proposition 4.13, the algorithm of Theorem 5.1 has query complexity
O(π(ϕ)). By Equation (4.1), π(ϕ) equals the maximum product of the span program witness sizes along
a simple path from the root to a leaf. By Theorem 3.10, for optimal span programs this equals the product
of the general adversary bounds of the gates along the path. By Theorem 2.4, the general adversary bound
Adv± (ϕ) is at least the same product, so π(ϕ) = O(Adv± (ϕ)).
    During preprocessing, look up the optimal span programs to determine the input-independent weighted
edges of G(ϕ, x). Then factor the graph as required for implementing a discrete-time quantum walk. In
general, for each vertex, store the unitary for one step of the quantum walk from that vertex. However,
structured formulas may allow closed-form solutions, e. g., no preprocessing is needed for the balanced
MAJ3 formula of Theorem 1.1. The maximum degree d is a constant, since the span programs all have
constant size.


6    Extensions and open problems
Our formula-evaluation algorithm implies that the general adversary bound lower bounds the least span
program witness size:

Corollary 6.1. The general adversary bound lower bounds the least span program witness size for any
total Boolean function f :
                                     inf wsize(P) ≥ Adv± ( f ) .                              (6.1)
                                       P: fP = f

Proof. Fix f and let f d be the depth d balanced formula in which every gate is f . Let P be any span
program with fP = f . Then by Theorems 2.2 and 2.4, the quantum query complexity of evaluating f d
satisfies Q( f d ) ≥ Adv± ( f d ) ≥ Adv± ( f )d , whereas by Theorems 4.11 and 5.1, Q( f d ) = O(wsize(P)d ),
where the hidden constant may depend on f but not on d. Equation (6.1) follows by letting d tend to
infinity.

    Subsequent work has shown that this inequality is tight; the least span program witness size equals
the general adversary bound for any Boolean functions, total or partial [35]. Furthermore, by giving an
algorithm for evaluating arbitrary span programs, and not only the repeatedly composed span programs
that arise in formula evaluation, it has been shown that the general adversary bound equals the bounded-
error quantum query complexity, up to constant factors [35, 37, 30].
    Although the relationship between span programs, the general adversary bound and quantum query
complexity has been resolved, there remain numerous open problems, including most of the problems
from our earlier work [4]. Specifically, for evaluating formulas, the optimal quantum query algorithm

                        T HEORY OF C OMPUTING, Volume 8 (2012), pp. 291–319                              314
             S PAN -P ROGRAM -BASED Q UANTUM A LGORITHM FOR E VALUATING F ORMULAS

may or may not also have a time-efficient implementation. Time-efficient and query-optimal or nearly
query-optimal algorithms for formula evaluation have been studied further in [38, 36]. In [38], it is shown
that our exact balance condition can be relaxed to allow a constant-factor imbalance in the adversary
bounds of the inputs to any gate, and the optimal quantum query algorithm can still be implemented
time efficiently. In [36], it is shown using a new span program composition technique that for arbitrary
AND-OR formulas, without any balance condition, there exists a time-efficient evaluation algorithm that
uses only O(log n) times as many queries as the query-optimal algorithm. Is this tradeoff between query-
optimality and time-efficiency a general phenomenon? Can classical formula rebalancing results [13, 14]
allow further relaxed balanced conditions? Finally, when can the preprocessing step of our algorithm be
eliminated?
    Span programs are also a promising tool for designing new quantum algorithms for problems beyond
evaluating formulas.

Acknowledgements
We thank Troy Lee for pointing out that our Definition 3.1 corresponds to span programs. We thank
Andrew Childs, Sean Hallgren, Cris Moore, David Yonge-Mallo and Shengyu Zhang for helpful conver-
sations.
    B.R. received support from NSF Grants CCF-0524828 and PHY-0456720, and from ARO Grant
W911NF-05-1-0294. R.Š. received support from NSF Grant CCF-0524837 and ARO Grant DAAD
19-03-1-0082.


References
 [1] E RIC A LLENDER , ROBERT B EALS , AND M ITSUNORI O GIHARA: The complexity of matrix rank
     and feasible systems of linear equations. Comput. Complexity, 8(2):99–126, 1999. Preliminary
     version in STOC’96. [doi:10.1007/s000370050023] 292

 [2] K AZUYUKI A MANO: Bounding the randomized decision tree complexity of read-once boolean
     functions. In Proc. 22nd Ann. ACM-SIAM Symp. on Discrete Algorithms (SODA’11), pp. 1729–1744.
     ACM Press, 2011. [ACM:2133169] 292

 [3] A NDRIS A MBAINIS: Polynomial degree vs. quantum query complexity. J. Comput. System
     Sci., 72(2):220–238, 2006. Preliminary version in FOCS’03. [doi:10.1016/j.jcss.2005.06.006,
     arXiv:quant-ph/0305028] 294

 [4] A NDRIS A MBAINIS , A NDREW M. C HILDS , B EN W. R EICHARDT, ROBERT Š PALEK , AND
     S HENGYU Z HANG: Any AND-OR formula of size N can be evaluated in time N 1/2+o(1) on a
     quantum computer. SIAM J. Comput., 39(6):2513–2530, 2010. Preliminary version in FOCS’07.
     [doi:10.1137/080712167] 291, 292, 293, 314

 [5] L ÁSZLÓ BABAI , A NNA G ÁL , AND AVI W IGDERSON: Superpolynomial lower bounds for mono-
     tone span programs. Combinatorica, 19(3):301–319, 1999. Preliminary version in STOC’96.
     [doi:10.1007/s004930050058] 292

                        T HEORY OF C OMPUTING, Volume 8 (2012), pp. 291–319                            315
                              B EN W. R EICHARDT AND ROBERT Š PALEK

 [6] H OWARD BARNUM AND M ICHAEL S AKS: A lower bound on the quantum query complexity of
     read-once functions. J. Comput. System Sci., 69(2):244–258, 2004. [doi:10.1016/j.jcss.2004.02.002,
     arXiv:quant-ph/0201007] 293, 299

 [7] H OWARD BARNUM , M ICHAEL S AKS , AND M ARIO S ZEGEDY: Quantum query complexity and
     semi-definite programming. In Proc. 18th IEEE Conf. on Computational Complexity (CCC’03), pp.
     179–193. IEEE Comp. Soc. Press, 2003. [doi:10.1109/CCC.2003.1214419] 294

 [8] A MOS B EIMEL , A NNA G ÁL , AND M IKE PATERSON: Lower bounds for monotone span
     programs.    Comput. Complexity, 6(1):29–45, 1996. Preliminary version in FOCS’95.
     [doi:10.1007/BF01202040] 292

 [9] A LEKSANDRS B ELOVS: Span-program-based quantum algorithm for the rank problem. Technical
     report, 2011. [arXiv:1103.0842] 293

[10] A LEKSANDRS B ELOVS: Span programs for functions with constant-sized 1-certificates: extended
     abstract. In Proc. 44th STOC. ACM Press, 2012. [doi:10.1145/2213977.2213985, arXiv:quant-
     ph/0611054] 293

[11] A LEKSANDRS B ELOVS AND T ROY L EE: Quantum algorithm for k-distinctness with prior knowl-
     edge on the input. Technical report, 2011. [arXiv:1108.3022] 293

[12] A LEKSANDRS B ELOVS AND B EN W. R EICHARDT: Span programs and quantum algorithms for
     st-connectivity and claw detection. Technical report, 2012. [arXiv:1203.2603] 293

[13] M ARIA L UISA B ONET AND S AMUEL R. B USS: Size-depth tradeoffs for Boolean formulae. Inform.
     Process. Lett., 49(3):151–155, 1994. [doi:10.1016/0020-0190(94)90093-0] 315

[14] NADER H. B SHOUTY, R ICHARD C LEVE , AND WAYNE E BERLY: Size-depth tradeoffs for al-
     gebraic formulas. SIAM J. Comput., 24(4):682–705, 1995. Preliminary version in FOCS’91.
     [doi:10.1137/S0097539792232586] 315

[15] A NDREW M. C HILDS , R ICHARD C LEVE , S TEPHEN P. J ORDAN , AND DAVID L. YONGE -M ALLO:
     Discrete-query quantum algorithm for NAND trees. Theory of Computing, 5(1):119–123, 2009.
     [doi:10.4086/toc.2009.v005a005] 293

[16] RONALD C RAMER AND S ERGE F EHR: Optimal black-box secret sharing over arbitrary abelian
     groups. In 22nd Ann. Internat. Cryptology Conf. (CRYPTO’02), pp. 272–287. Springer, 2002.
     [doi:10.1007/3-540-45708-9_18] 297

[17] E DWARD FARHI , J EFFREY G OLDSTONE , AND S AM G UTMANN: A quantum algo-
     rithm for the Hamiltonian NAND tree.  Theory of Computing, 4(1):169–190, 2008.
     [doi:10.4086/toc.2008.v004a008] 293

[18] D MITRY G AVINSKY AND T SUYOSHI I TO: A quantum query algorithm for the graph collision
     problem. Technical report, 2012. [arXiv:1204.1527] 293

                       T HEORY OF C OMPUTING, Volume 8 (2012), pp. 291–319                         316
            S PAN -P ROGRAM -BASED Q UANTUM A LGORITHM FOR E VALUATING F ORMULAS

[19] G ENE H. G OLUB AND C HARLES F. VAN L OAN: Matrix Computations. Johns Hopkins, Baltimore,
     3rd edition, 1996. 307

[20] L OV K. G ROVER: A fast quantum mechanical algorithm for database search. In Proc. 28th STOC,
     pp. 212–219. ACM Press, 1996. [doi:10.1145/237814.237866, arXiv:quant-ph/9605043] 293

[21] L OV K. G ROVER: Tradeoffs in the quantum search algorithm. Phys. Rev. A, 66:052314, 2002.
     [doi:10.1103/PhysRevA.66.052314, arXiv:quant-ph/0201152] 293

[22] R AFI H EIMAN AND AVI W IGDERSON: Randomized vs. deterministic decision tree complexity
     for read-once Boolean functions. Comput. Complexity, 1:311–329, 1991. Preliminary version in
     Structure in Complexity Theory’91. [doi:10.1007/BF01212962] 292

[23] P ETER H ØYER , T ROY L EE , AND ROBERT Š PALEK: Source codes of semidefinite programs for
     ADV± . [Link], 2006. 299

[24] P ETER H ØYER , T ROY L EE , AND ROBERT Š PALEK: Negative weights make adversaries stronger.
     In Proc. 39th STOC, pp. 526–535. ACM Press, 2007. [doi:10.1145/1250790.1250867, arXiv:quant-
     ph/0611054] 294, 295, 299

[25] T. S. JAYRAM , R AVI K UMAR , AND D. S IVAKUMAR: Two applications of information complexity.
     In Proc. 35th STOC, pp. 673–682. ACM Press, 2003. [doi:10.1145/780542.780640] 293

[26] M AURICIO K ARCHMER AND AVI W IGDERSON: On span programs. In Proc. 8th IEEE
     Conf. on Structure in Complexity Theory, pp. 102–111. IEEE Comp. Soc. Press, 1993.
     [doi:10.1109/SCT.1993.336536] 292, 295

[27] S HELBY K IMMEL: Quantum adversary (upper) bound. Technical report, 2011. [arXiv:1101.0797]
     293

[28] A. Y U . K ITAEV: Quantum measurements and the Abelian stabilizer problem. Technical report,
     1995. [arXiv:quant-ph/9511026] 313

[29] T ROY L EE , F RÉDÉRIC M AGNIEZ , AND M IKLOS S ANTHA: A learning graph based quantum query
     algorithm for finding constant-size subgraphs. Technical report, 2011. [arXiv:1109.5135] 293

[30] T ROY L EE , R AJAT M ITTAL , B EN W. R EICHARDT, ROBERT Š PALEK , AND M ARIO S ZEGEDY:
     Quantum query complexity of state conversion. In Proc. 52nd FOCS, pp. 344–353. IEEE Comp.
     Soc. Press, 2011. [doi:10.1109/FOCS.2011.75, arXiv:1011.3020] 293, 314

[31] F RÉDÉRIC M AGNIEZ , A SHWIN NAYAK , M IKLOS S ANTHA , AND DAVID X IAO: Improved
     bounds for the randomized decision tree complexity of recursive majority. In Proc. 38th In-
     ternat. Colloq. on Automata, Languages and Programming (ICALP’11), pp. 317–329. Springer,
     2011. [doi:10.1007/978-3-642-22006-7_27] 293

[32] DANIEL NAGAJ , PAWEL W OCJAN , AND YONG Z HANG: Fast amplification of QMA. Quantum Inf.
     Comput., 9(11):1053–1068, 2011. [ACM:2012106, arXiv:0904.1549] 313

                      T HEORY OF C OMPUTING, Volume 8 (2012), pp. 291–319                     317
                              B EN W. R EICHARDT AND ROBERT Š PALEK

[33] M ICHAEL A. N IELSEN AND I SAAC L. C HUANG: Quantum computation and quantum information.
     Cambridge University Press, Cambridge, 2000. 294

[34] V ENTZISLAV N IKOV, S VETLA N IKOVA , AND BART P RENEEL: On the size of monotone span
     programs. In 4th Internat. Conf. on Security in Communication Networks (SCN’04), pp. 249–262.
     Springer, 2004. [doi:10.1007/978-3-540-30598-9_18] 297

[35] B EN W. R EICHARDT: Span programs and quantum query complexity: The general adversary bound
     is nearly tight for every boolean function (70 pp.). Technical report, 2009. Extended abstract in
     FOCS’09. [arXiv:0904.2759] 292, 293, 295, 296, 297, 298, 313, 314

[36] B EN R EICHARDT: Faster quantum algorithm for evaluating game trees. In Proc. 22nd Ann. ACM-
     SIAM Symp. on Discrete Algorithms (SODA’11), pp. 546–559. ACM Press, 2011. [ACM:2133079,
     arXiv:0907.1623] 293, 315

[37] B EN R EICHARDT: Reflections for quantum query algorithms. In Proc. 22nd Ann. ACM-SIAM
     Symp. on Discrete Algorithms (SODA’11), pp. 560–569. ACM Press, 2011. [ACM:2133080,
     arXiv:1005.1601] 293, 314

[38] B EN W. R EICHARDT: Span-program-based quantum algorithm for evaluating unbalanced formulas.
     6th Conf. Theory of Quantum Computation, Communication and Cryptography (TQC), 2011.
     [arXiv:0907.1622] 293, 315

[39] B EN W. R EICHARDT AND ROBERT Š PALEK: Span-program-based quantum algorithm for evaluat-
     ing formulas. In Proc. 40th STOC, pp. 103–112. ACM Press, 2008. [doi:10.1145/1374376.1374394,
     arXiv:0710.2630v3] 293, 300, 301, 313

[40] M ICHAEL E. S AKS AND AVI W IGDERSON: Probabilistic Boolean decision trees and the com-
     plexity of evaluating game trees. In Proc. 27th FOCS, pp. 29–38. IEEE Comp. Soc. Press, 1986.
     [doi:10.1109/SFCS.1986.44] 292, 293

[41] M IKLOS S ANTHA: On the Monte Carlo boolean decision tree complexity of read-once formulae.
     Random Structures Algorithms, 6(1):75–88, 1995. Preliminary version in Structure in Complexity
     Theory’91. [doi:10.1002/rsa.3240060108] 292

[42] M ARC S NIR: Lower bounds on probabilistic linear decision trees. Theoret. Comput. Sci., 38:69–82,
     1985. [doi:10.1016/0304-3975(85)90210-5] 292

[43] M ARIO S ZEGEDY: Quantum speed-up of Markov chain based algorithms. In Proc. 45th FOCS, pp.
     32–41. IEEE Comp. Soc. Press, 2004. [doi:10.1109/FOCS.2004.53, arXiv:quant-ph/0401053] 313

[44] B OHUA Z HAN , S HELBY K IMMEL , AND AVINATAN H ASSIDIM: Super-polynomial quantum
     speed-ups for boolean evaluation trees with hidden structure. In Innovations in Theoretical Com-
     puter Science 2012 (ITCM’12), pp. 249–265. ACM Press, 2012. [doi:10.1145/2090236.2090258,
     arXiv:1101.0796] 293

                       T HEORY OF C OMPUTING, Volume 8 (2012), pp. 291–319                         318
            S PAN -P ROGRAM -BASED Q UANTUM A LGORITHM FOR E VALUATING F ORMULAS

[45] Y ECHAO Z HU: Quantum query complexity of subgraph containment with constant-sized certificates.
     Technical report, 2011. [arXiv:1109.4165] 293


AUTHORS

      Ben Reichardt
      Department of Electrical Engineering
      University of Southern California, Los Angeles, CA
      ben.reichardt usc edu
      http://www-bcf.usc.edu/~breichar


      Robert Špalek
      Google, Mountain View, CA
      spalek google com
      http://www.ucw.cz/~robert


ABOUT THE AUTHORS

      B EN R EICHARDT graduated from UC Berkeley in 2006, advised by Umesh Vazirani. He
         studies algorithms and fault-tolerance schemes for quantum computers, and quantum
         cryptography.


      ROBERT Š PALEK obtained his Ph. D. in Theoretical Computer Science from CWI, Amster-
        dam in 2006, under the supervision of Harry Buhrman. His thesis focused on quantum
        algorithms and quantum query lower bounds, and these are his main research interests till
        today. He spent one year as a postdoc at UC Berkeley. Since 2007, he has been working
        as a software engineer in the search quality team at Google. He grew up in the Czech
        Republic, and now lives in the San Francisco Bay Area with his wife Raina and two little
        sons who take most of his time. His hobbies include reading, photography, and hiking.




                      T HEORY OF C OMPUTING, Volume 8 (2012), pp. 291–319                           319