Approximating the AND-OR Tree

Authors Alexander A. Sherstov,
Plaintext
                         T HEORY OF C OMPUTING, Volume 9 (20), 2013, pp. 653–663
                                       www.theoryofcomputing.org




               Approximating the AND-OR Tree
                                             Alexander A. Sherstov∗
                   Received February 12, 2013; Revised May 25, 2013; Published June 19, 2013




       Abstract: The approximate degree of a Boolean function f is the least degree of a real
       polynomial that approximates f within 1/3 at every point. We prove that the function
       Vn Wn
         i=1 j=1 xi j , known as the AND-OR tree, has approximate degree Ω(n). This lower bound
       is tight and closes a line of research on the problem, the best previous bound being Ω(n0.75 ).
                                                                                             √
       More generally, we prove that the function m
                                                      V Wn
                                                       i=1 j=1 xi j has approximate degree Ω( mn),
       which is tight. The same lower bound was obtained independently by Bun and Thaler (2013)
       using related techniques.

ACM Classification: F.0, F.1.3
AMS Classification: 68Q05, 68Q87
Key words and phrases: AND-OR tree, polynomial approximation, polynomial representations of
Boolean functions, approximate degree

1     Introduction
Over the past two decades, representations of Boolean functions by real polynomials have played
an important role in theoretical computer science. The surveys [7, 31, 10, 32, 1] provide a fairly
comprehensive overview of this body of work. Several kinds of representation [24, 23, 5, 7, 25] have
been studied, depending on the intended application. For our purposes, a real polynomial p represents a
Boolean function f : {0, 1}n → {0, 1} if
                                                                        1
                                                    | f (x) − p(x)| ≤
                                                                        3
for every x ∈ {0, 1}n , In other words, we are interested in the pointwise approximation of Boolean
functions by real polynomials. The least degree of a real polynomial that approximates f pointwise within
    ∗ Supported by NSF CAREER award CCF-1149018.



© 2013 Alexander A. Sherstov
c b Licensed under a Creative Commons Attribution License (CC-BY)                 DOI: 10.4086/toc.2013.v009a020
                                          A LEXANDER A. S HERSTOV

1/3 is called the approximate degree of f , denoted g  deg( f ). The constant 1/3 is chosen for aesthetic
reasons and can be replaced by any other in (0, 1/2) without affecting the theory in any way.
    The formal study of the approximate degree began in 1969 with the seminal work of Minsky and
Papert [23], who famously proved that the parity function in n variables cannot be approximated by
a polynomial of degree less than n. Since then, the approximate degree has been used to solve a vast
array of problems in complexity theory and algorithm design. The earliest use of the approximate degree
was to prove circuit lower bounds and oracle separations of complexity classes [27, 41, 5, 20, 21, 35].
Over the past decade, the approximate degree has been used many times to prove tight lower bounds
on quantum query complexity, e. g., [6, 9, 2, 18]. The approximate degree has enabled remarkable
progress [12, 28, 11, 36, 29, 32] in communication complexity, with complete resolutions of difficult
open problems. The results listed up to this point are of negative character, i. e., they are lower bounds in
relevant computational models. More recently, the approximate degree has found important algorithmic
applications. In computational learning theory, the approximate degree has been used to obtain the
fastest known algorithms for PAC-learning DNF formulas [42, 19] and read-once formulas [4] and the
fastest known algorithm for agnostically learning disjunctions [17]. Another well-known use of the
approximate degree is an algorithm for approximating the inclusion-exclusion formula based on its initial
terms [22, 16, 33, 43].
    These applications motivate the study of the approximate degree as a complexity measure in its own
right. As one would expect, methods of approximation theory have been instrumental in determining the
approximate degree for specific Boolean functions of interest [8, 25, 40, 2, 3, 33, 39]. In addition, quantum
query algorithms have been used to prove upper bounds on the approximate degree [15, 43, 4, 30], and
duality-based methods have yielded lower bounds [26, 34, 38]. Nevertheless, our understanding of this
complexity measure remains fragmented, with few general results available [25, 39].
    The limitations of known techniques are nicely illustrated by the so-called AND-OR tree,
                                                         n _
                                                         ^ n
                                               f (x) =             xi j .
                                                         i=1 j=1

Despite its seeming simplicity, it has been a frustrating function to analyze. Its approximate degree has
been studied for the past 19 years [25, 40, 15, 3, 34] and has been recently re-posed as an open problem
by Aaronson [1]. Table 1 gives a quantitative summary of this line of research. The best lower and upper
bounds prior to this paper were Ω(n0.75 ) and O(n), respectively. Our contribution is to close this gap by
improving the lower bound to Ω(n). We obtain the following more general result.

                                                     Vm Wn
Theorem 1.1 (Main result). The function f (x) =           i=1      j=1 xi j has approximate degree
                                                          √
                                             deg( f ) = Ω( mn) .
                                             g

      This lower bound is tight for all m and n, by the results of Høyer et al. [15].

1.1     Proof overview
The problem of approximating a given function f pointwise to within error ε by polynomials of degree
at most d can be viewed as a search for a point in the intersection of two convex sets, namely, the

                       T HEORY OF C OMPUTING, Volume 9 (20), 2013, pp. 653–663                          654
                                      A PPROXIMATING THE AND-OR T REE

                       Bound                      Reference
                       O(n)                       Høyer, Mosca, and de Wolf [15]
                         √
                       Ω( n)                      Nisan and Szegedy [25]
                         √
                       Ω( n log n)                Shi [40]
                       Ω(n0.66... )               Ambainis [3]
                       Ω(n0.75 )                  Sherstov [34]
                       Ω(n)                       This paper


                           Table 1: Approximate degree of the AND-OR tree.


ε-neighborhood of f and the set of polynomials of degree at most d. As a result, the nonexistence of
an approximating polynomial for f is equivalent to the existence of a so-called dual polynomial for f ,
whose defining properties are orthogonality to degree-d polynomials and large inner product with f .
Geometrically, the dual polynomial is a separating hyperplane for the two convex sets in question.
    Our proof is quite short (barely longer than a page). We view f (x) = m
                                                                         V Wn
                                                                          i=1 j=1 xi j as the component-
wise composition of the functions ANDm and ORn . We use the dual polynomial for ORn to prove the
existence of an operator L with the following properties:

   (i) L linearly maps functions {0, 1}m×n → [−1, 1] to functions {0, 1}m → [−1, 1];
                                                                                      √
  (ii) L decreases the degree of the function to which it is applied by a factor of Ω( n);

 (iii) L f ≈ ANDm pointwise.

The existence of L directly implies our main result. Indeed, for any polynomial p that approximates
                                               √
 f pointwise, the polynomial Lp has degree Ω( n) times smaller and approximates ANDm pointwise;
                                                                      √
since the latter approximation task is known [25] to require degree Ω( m), the claimed lower bound of
    √
Ω( mn) on the degree of p follows.
    What makes the construction of L possible is the following very special property of any dual
polynomial for ORn : it maintains the same sign on OR−1  n (0) and has almost half of its `1 norm there.
We call such dual polynomials one-sided. This property was proved several years ago by Gavinsky and
the author in [14], where it was used to obtain lower bounds for nondeterministic and Merlin-Arthur
communication protocols.

1.2   Independent work by Bun and Thaler
                                                                            √
In an upcoming paper, Bun and Thaler [13] independently prove an Ω( mn) lower bound on the
                              Vm Wn
approximate degree of f (x) = i=1 j=1 xi j . The proof in [13] and ours are both based on the fact that
                                                                                                   √
ORn has a one-sided dual polynomial. The two papers differ in how they use this fact to prove an Ω( mn)
lower bound on the approximate degree. The treatment in this paper is a combination of the dual view

                    T HEORY OF C OMPUTING, Volume 9 (20), 2013, pp. 653–663                         655
                                         A LEXANDER A. S HERSTOV

(one-sided dual polynomial for ORn ) and the primal view (construction of an approximating polynomial
for ANDm ). The treatment in [13] is a refinement of [34] and uses exclusively the dual view (construction
of a dual polynomial for f using dual polynomials for ANDm and ORn ). In our opinion, the proof in
this paper has the advantage of being shorter and simpler. On the other hand, the approach in [13] has
the advantage of giving an explicit dual polynomial for f , which is of interest because explicit dual
polynomials have found several uses in communication complexity [32].


2     Preliminaries
For a function f : X → R on a finite set X, we let k f k∞ = maxx∈X | f (x)|. The total degree of a multivariate
real polynomial p : Rn → R is denoted deg p. We use the terms degree and total degree interchangeably
in this paper. For a function f : X → R on a finite set X ⊂ Rn , the ε-approximate degree degε ( f ) of f is
defined as the least degree of a real polynomial p with k f − pk∞ ≤ ε. Throughout this paper, we will
work with the ε-approximate degree for a small constant ε > 0. For Boolean functions f : X → {0, 1},
the choice of constant 0 < ε < 1/2 affects the quantity degε ( f ) by at most a constant factor:

                                   c deg1/3 ( f ) ≤ degε ( f ) ≤ C deg1/3 ( f ) ,                        (2.1)

where c = c(ε) and C = C(ε) are positive constants. By convention, one studies ε = 1/3 as the canonical
                                            deg( f ) = deg1/3 ( f ). A dual characterization [36, 38] of the
case and reserves for it the special symbol g
approximate degree is as follows.

Fact 2.1. Let f : X → R be given, for a finite set X ⊂ Rn . Then degε ( f ) ≥ d if and only if there exists a
function ψ : X → R such that

                                               ∑ |ψ(x)| = 1 ,
                                               x∈X

                                               ∑ ψ(x) f (x) > ε ,
                                               x∈X

and

                                               ∑ ψ(x)p(x) = 0
                                               x∈X

for every polynomial p of degree less than d.

   We adopt the usual definitions of the Boolean functions ANDn , ORn : {0, 1}n → {0, 1}. Their approx-
imate degree was determined by Nisan and Szegedy [25].

Theorem 2.2 (Nisan and Szegedy). The functions ANDn and ORn obey
                                                              √
                           deg1/3 (ANDn ) = deg1/3 (ORn ) = Θ( n) .

    By combining the above two theorems, Gavinsky and the author [14, Thm. 5.1] obtained the following
result, which plays a key role in this paper.

                     T HEORY OF C OMPUTING, Volume 9 (20), 2013, pp. 653–663                               656
                                      A PPROXIMATING THE AND-OR T REE

Theorem 2.3 (Gavinsky and Sherstov). Fix any constant 0 < ε < 1. Then there exists a constant
δ = δ (ε) > 0 and a real function ψ : {0, 1}n → R such that

                                                     ∑ |ψ(x)| = 1 ,                                (2.2)
                                                x∈{0,1}n
                                                                        1−ε
                                               ψ(0, 0, . . . , 0) < −       ,                      (2.3)
                                                                         2

and

                                                     ∑ ψ(x)p(x) = 0                                (2.4)
                                                x∈{0,1}n
                                            √
for every polynomial p of degree less than δ n.

      For the sake of completeness, we include the proof.
                                                                                              √
Proof of Theorem 2.3 (adapted from [14]). Recall from Theorem 2.2 that deg1/3 (ORn ) = Ω( n). There-
                                         √
fore, (2.1) shows that deg 1−ε (ORn ) ≥ δ n for a sufficiently small constant δ = δ (ε) > 0. Now the dual
                            2
characterization of the approximate degree (Fact 2.1) provides a function ψ : {0, 1}n → R that obeys
(2.2), (2.4), and
                                                                          1−ε
                                               ∑ ψ(x)ORn (x) >             2
                                                                              .                    (2.5)
                                             x∈{0,1}n

It remains to verify (2.3):

                   ψ(0, 0, . . . , 0) =     ∑ ψ(x)(1 − ORn (x))
                                          x∈{0,1}n

                                    =−         ∑ ψ(x)ORn (x)                      by (2.4)
                                            x∈{0,1}n
                                            1−ε
                                    <−                                            by (2.5).
                                             2
    For probability distributions µ and λ on finite sets X and Y , respectively, we let µ × λ denote
the probability distribution on X ×Y given by (µ × λ )(x, y) = µ(x)λ (y). The support of a probability
distribution µ is defined to be supp µ = {x : µ(x) > 0}.


3      Main Result
We are now in a position to prove our main result.
                                                        Vm Wn
Theorem 3.1. The Boolean function f (x) =                  i=1   j=1 xi j obeys
                                                                  √
                                                 deg1/3 ( f ) = Ω( mn) .                           (3.1)


                      T HEORY OF C OMPUTING, Volume 9 (20), 2013, pp. 653–663                        657
                                            A LEXANDER A. S HERSTOV

Proof. Let ε be an absolute constant to be named later, 0 < ε < 1. Then by Theorem 2.3, there exists a
constant δ = δ (ε) > 0 and a function ψ : {0, 1}n → R that obeys (2.2)–(2.4). Let µ be the probability
distribution on {0, 1}n given by µ(x) = |ψ(x)|. Let µ0 and µ1 be the probability distributions induced
by µ on the sets {x : ψ(x) < 0} and {x : ψ(x) > 0}, respectively. Since ∑x∈{0,1}n ψ(x) = 0, the sets
{x : ψ(x) < 0} and {x : ψ(x) > 0} are weighted equally by µ. As a consequence,

                                                       1    1
                                                    µ = µ1 + µ0 ,                                      (3.2)
                                                       2    2
                                                       1    1
                                                    ψ = µ1 − µ0 .                                      (3.3)
                                                       2    2
   Consider the linear operator L that maps functions φ : ({0, 1}n )m → R to functions Lφ : {0, 1}m → R
according to

                                    (Lφ )(z) =        E ···           E       φ (x1 , . . . , xm ) .
                                                    x1 ∼µz1         xm ∼µzm

Fix a real polynomial p with

                                                      k f − pk∞ ≤ ε .                                  (3.4)

Claim 3.2.    kANDm − L f k∞ < ε.
                       √
Claim 3.3.    deg p ≥ δ n deg Lp.
   Before settling the claims, we finish the proof of the theorem. The linearity of L yields

                        kANDm − Lpk∞ ≤ kANDm − L f k∞ + kL( f − p)k∞ < 2ε ,
                                       |   {z       } |       {z   }
                                                              <ε                             ≤ε

where we have used (3.4) and Claim 3.2 in bounding the marked quantities. For ε = 1/6, we arrive at
                                                √
kANDm − Lpk∞ ≤ 1/3 and therefore deg Lp = Ω( m) by Theorem 2.2. Now Claim 3.3 implies that
          √
deg p = Ω( mn).

Proof of Claim 3.2. By (2.3), we have ψ(x) > 0 only when ORn (x) = 1. Hence supp µ1 ⊆ OR−1
                                                                                        n (1) and

                                                                                   m
                            (L f )(1, 1, . . . , 1) =         E        [ f ] = ∏ E [ORn ] = 1 .
                                                        µ1 ×···×µ1                      µ1
                                                                                  i=1

   It remains to prove that |(L f )(z)| < ε for every z 6= (1, 1, . . . , 1). We have
                                                         m                        m
                    (L f )(z) =        E         [ f ] = ∏ E [ORn ] = ∏(1 − µzi (0, 0, . . . , 0)) ,
                                  µz1 ×···×µzm                µzi
                                                        i=1                      i=1

whence

                                       0 ≤ (L f )(z) ≤ 1 − µ0 (0, 0, . . . , 0) .                      (3.5)


                     T HEORY OF C OMPUTING, Volume 9 (20), 2013, pp. 653–663                            658
                                    A PPROXIMATING THE AND-OR T REE

We know from (2.3) that ψ(0, 0, . . . , 0) < −(1 − ε)/2, which means in particular that (0, 0, . . . , 0) ∈
supp µ0 . Therefore

                      µ0 (0, 0, . . . , 0) = 2µ(0, 0, . . . , 0) = 2|ψ(0, 0, . . . , 0)| > 1 − ε ,

where the first step uses (3.2). By (3.5), we conclude that 0 ≤ (L f )(z) < ε.

Proof of Claim 3.3. By the linearity of L, it suffices to consider factored polynomials p of the form
p(x) = ∏mi=1 pi (xi,1 , xi,2 , . . . , xi,n ). In this case we have the convenient formula
                                                             m
                                              (Lp)(z) = ∏ E [pi ] .
                                                                  µzi
                                                            i=1
                                                        √
By (2.4) and (3.3), polynomials pi of degree less than δ n obey Eµ0 [pi ] = Eµ1 [pi ] and therefore do not
contribute to the degree of Lp. As a result,
                                                           √     deg p
                                  deg Lp ≤ |{i : deg pi ≥ δ n}| ≤ √ .
                                                                 δ n
    Using the pattern matrix method [36], one can immediately translate the main result of this paper into
lower bounds on communication complexity. For example, it follows that the two-party communication
                                                                                      √
problem f (x, y) = m
                   V Wn
                     i=1 j=1 (xi j ∧ yi j ) has bounded-error quantum complexity Ω( mn), regardless of
prior entanglement. We refer the interested reader to [36, 38, 37] for further details and applications.


Acknowledgments
The author is thankful to Mark Bun, Justin Thaler, and the anonymous reviewers for their valuable
feedback on an earlier version of this manuscript.


References
 [1] S COTT A ARONSON: The polynomial method in quantum and classical computing. In Proc. 49th
     FOCS, p. 3. IEEE Comp. Soc. Press, 2008. [doi:10.1109/FOCS.2008.91] 653, 654
 [2] S COTT A ARONSON AND YAOYUN S HI: Quantum lower bounds for the collision and the ele-
     ment distinctness problems. J. ACM, 51(4):595–605, 2004. Preliminary version in STOC’02.
     [doi:10.1145/1008731.1008735] 654
 [3] A NDRIS A MBAINIS: Polynomial degree and lower bounds in quantum complexity: Colli-
     sion and element distinctness with small range. Theory of Computing, 1(3):37–46, 2005.
     [doi:10.4086/toc.2005.v001a003] 654, 655
 [4] A NDRIS A MBAINIS , A NDREW M. C HILDS , B EN W. R EICHARDT, ROBERT Š PALEK , AND
     S HENGYU Z HANG: Any AND-OR formula of size N can be evaluated in time N 1/2+o(1) on a
     quantum computer. SIAM J. Comput., 39(6):2513–2530, 2010. Preliminary version in FOCS’07.
     [doi:10.1137/080712167] 654

                     T HEORY OF C OMPUTING, Volume 9 (20), 2013, pp. 653–663                           659
                                     A LEXANDER A. S HERSTOV

 [5] JAMES A SPNES , R ICHARD B EIGEL , M ERRICK L. F URST, AND S TEVEN RUDICH: The expressive
     power of voting polynomials. Combinatorica, 14(2):135–148, 1994. Preliminary version in
     STOC’91. [doi:10.1007/BF01215346] 653, 654
 [6] ROBERT B EALS , H ARRY B UHRMAN , R ICHARD C LEVE , M ICHELE M OSCA , AND RONALD DE
     W OLF: Quantum lower bounds by polynomials. J. ACM, 48(4):778–797, 2001. Preliminary version
     in FOCS’98. [doi:10.1145/502090.502097] 654
 [7] R ICHARD B EIGEL: The polynomial method in circuit complexity. In Proc. 8th IEEE Conf.
     on Structure in Complexity Theory (SCT’93), pp. 82–95. IEEE Comp. Soc. Press, 1993.
     [doi:10.1109/SCT.1993.336538] 653
 [8] R ICHARD B EIGEL: Perceptrons, PP, and the polynomial hierarchy. Comput. Complexity, 4(4):339–
     349, 1994. Preliminary version in SCT’92. [doi:10.1007/BF01263422] 654
 [9] H ARRY B UHRMAN , R ICHARD C LEVE , RONALD DE W OLF, AND C HRISTOF Z ALKA: Bounds for
     small-error and zero-error quantum algorithms. In Proc. 40th FOCS, pp. 358–368. IEEE Comp.
     Soc. Press, 1999. [doi:10.1109/SFFCS.1999.814607] 654
[10] H ARRY B UHRMAN AND RONALD DE W OLF: Complexity measures and decision tree complexity:
     A survey. Theoret. Comput. Sci., 288(1):21–43, 2002. [doi:10.1016/S0304-3975(01)00144-X] 653
[11] H ARRY B UHRMAN , N IKOLAI K. V ERESHCHAGIN , AND RONALD DE W OLF: On computation and
     communication with small bias. In Proc. 22nd IEEE Conf. on Computational Complexity (CCC’07),
     pp. 24–32. IEEE Comp. Soc. Press, 2007. [doi:10.1109/CCC.2007.18] 654
[12] H ARRY B UHRMAN AND RONALD DE W OLF: Communication complexity lower bounds by
     polynomials. In Proc. 16th IEEE Conf. on Computational Complexity (CCC’01), pp. 120–130.
     IEEE Comp. Soc. Press, 2001. [doi:10.1109/CCC.2001.933879] 654
[13] M ARK B UN AND J USTIN T HALER: Dual lower bounds for approximate degree and Markov-
     Bernstein inequalities. In Proc. 40th Internat. Colloq. on Automata, Languages and Programming
     (ICALP’13). Springer, 2013. To appear. Preprint available at arXiv. 655, 656
[14] D MITRY G AVINSKY AND A LEXANDER A. S HERSTOV: A separation of NP and coNP
     in multiparty communication complexity.       Theory of Computing, 6(1):227–245, 2010.
     [doi:10.4086/toc.2010.v006a010] 655, 656, 657
[15] P ETER H ØYER , M ICHELE M OSCA , AND RONALD DE W OLF: Quantum search on bounded-error
     inputs. In Proc. 30th Internat. Colloq. on Automata, Languages and Programming (ICALP’03), pp.
     291–299. Springer, 2003. [doi:10.1007/3-540-45061-0_25] 654, 655
[16] J EFF K AHN , NATHAN L INIAL , AND A LEX S AMORODNITSKY: Inclusion-exclusion: Exact and
     approximate. Combinatorica, 16(4):465–477, 1996. [doi:10.1007/BF01271266] 654
[17] A DAM TAUMAN K ALAI , A DAM R. K LIVANS , Y ISHAY M ANSOUR , AND ROCCO A. S ERVEDIO:
     Agnostically learning halfspaces. SIAM J. Comput., 37(6):1777–1805, 2008. Preliminary version in
     FOCS’05. [doi:10.1137/060649057] 654

                   T HEORY OF C OMPUTING, Volume 9 (20), 2013, pp. 653–663                       660
                                A PPROXIMATING THE AND-OR T REE

[18] H ARTMUT K LAUCK , ROBERT Š PALEK , AND RONALD DE W OLF: Quantum and classical strong
     direct product theorems and optimal time-space tradeoffs. SIAM J. Comput., 36(5):1472–1493,
     2007. Preliminary version in FOCS’04. [doi:10.1137/05063235X] 654
                                                                              1/3
[19] A DAM R. K LIVANS AND ROCCO A. S ERVEDIO: Learning DNF in time 2Õ(n ) . J. Comput. System
     Sci., 68(2):303–318, 2004. Preliminary version in STOC’01. [doi:10.1016/j.jcss.2003.07.007] 654

[20] M ATTHIAS K RAUSE AND PAVEL P UDLÁK: On the computational power of depth-2 circuits with
     threshold and modulo gates. Theoret. Comput. Sci., 174(1-2):137–156, 1997. Preliminary version
     in STOC’94. [doi:10.1016/S0304-3975(96)00019-9] 654

[21] M ATTHIAS K RAUSE AND PAVEL P UDLÁK: Computing Boolean functions by polynomials and
     threshold circuits. Comput. Complexity, 7(4):346–370, 1998. Preliminary version in FOCS’95.
     [doi:10.1007/s000370050015] 654

[22] NATHAN L INIAL AND N OAM N ISAN: Approximate inclusion-exclusion. Combinatorica,
     10(4):349–365, 1990. Preliminary version in STOC’90. [doi:10.1007/BF02128670] 654

[23] M ARVIN L. M INSKY AND S EYMOUR A. PAPERT: Perceptrons: An Introduction to Computational
     Geometry. MIT Press, Cambridge, Mass., 1969. 653, 654

[24] S ABURO M UROGA: Threshold Logic and Its Applications. John Wiley & Sons, New York, 1971.
     653

[25] N OAM N ISAN AND M ARIO S ZEGEDY: On the degree of Boolean functions as real poly-
     nomials. Comput. Complexity, 4(4):301–313, 1994. Preliminary version in STOC’92.
     [doi:10.1007/BF01263419] 653, 654, 655, 656

[26] RYAN O’D ONNELL AND ROCCO A. S ERVEDIO: New degree bounds for polynomial threshold func-
     tions. Combinatorica, 30(3):327–358, 2010. Preliminary version in STOC’03. [doi:10.1007/s00493-
     010-2173-3] 654

[27] R AMAMOHAN PATURI AND M ICHAEL E. S AKS: Approximating threshold circuits by ratio-
     nal functions. Inform. and Comput., 112(2):257–272, 1994. Preliminary version in FOCS’90.
     [doi:10.1006/inco.1994.1059] 654

[28] A LEXANDER A. R AZBOROV: Quantum communication complexity of symmetric predicates.
     Izvestiya: Mathematics, 67(1):145–159, 2003. [doi:10.1070/IM2003v067n01ABEH000422] 654

[29] A LEXANDER A. R AZBOROV AND A LEXANDER A. S HERSTOV: The sign-rank of AC0 . SIAM J.
     Comput., 39(5):1833–1855, 2010. Preliminary version in FOCS’08. [doi:10.1137/080744037] 654

[30] B EN W. R EICHARDT: Reflections for quantum query algorithms. In Proc. 22nd Ann. ACM-SIAM
     Symp. on Discrete Algorithms (SODA’11), pp. 560–569. ACM Press, 2011. SIAM. [ACM:2133080]
     654

                   T HEORY OF C OMPUTING, Volume 9 (20), 2013, pp. 653–663                      661
                                     A LEXANDER A. S HERSTOV

[31] M ICHAEL E. S AKS: Slicing the hypercube. In K EITH WALKER, editor, Surveys in Combinatorics,
     1993, volume 187 of London Mathematical Society Lecture Note Series, pp. 211–256. Cambridge
     Univ. Press, 1993. [doi:10.1017/CBO9780511662089.009] 653

[32] A LEXANDER A. S HERSTOV: Communication lower bounds using dual polynomials. Bulletin of
     the EATCS, 95:59–93, 2008. EATCS. 653, 654, 656

[33] A LEXANDER A. S HERSTOV: Approximate inclusion-exclusion for arbitrary symmetric functions.
     Comput. Complexity, 18(2):219–247, 2009. Preliminary version in CCC’08. [doi:10.1007/s00037-
     009-0274-4] 654

[34] A LEXANDER A. S HERSTOV: The intersection of two halfspaces has high threshold degree. In Proc.
     50th FOCS, pp. 343–362. IEEE Comp. Soc. Press, 2009. [doi:10.1109/FOCS.2009.18] 654, 655,
     656

[35] A LEXANDER A. S HERSTOV: Separating AC0 from depth-2 majority circuits. SIAM J. Comput.,
     38(6):2113–2129, 2009. Preliminary version in STOC’07. [doi:10.1137/08071421X] 654

[36] A LEXANDER A. S HERSTOV: The pattern matrix method. SIAM J. Comput., 40(6):1969–2000,
     2011. Preliminary version in FOCS’08. [doi:10.1137/080733644] 654, 656, 659

[37] A LEXANDER A. S HERSTOV: The multiparty communication complexity of set disjointness. In
     Proc. 44th STOC, pp. 525–548. ACM Press, 2012. [doi:10.1145/2213977.2214026] 659

[38] A LEXANDER A. S HERSTOV: Strong direct product theorems for quantum communication and
     query complexity. SIAM J. Comput., 41(5):1122–1165, 2012. Preliminary version in STOC’11.
     [doi:10.1137/110842661] 654, 656, 659

[39] A LEXANDER A. S HERSTOV: Making polynomials robust to noise. Theory of Computing, 9(18):593–
     615, 2013. Preliminary version in STOC’12. [doi:10.4086/toc.2013.v009a018] 654

[40] YAOYUN S HI: Approximating linear restrictions of Boolean functions. Manuscript. Available at
     author’s home page, 2002. 654, 655

[41] K AI -Y EUNG S IU , V WANI P. ROYCHOWDHURY, AND T HOMAS K AILATH: Rational approximation
     techniques for analysis of neural networks. IEEE Trans. Inform. Theory, 40(2):455–466, 1994.
     Preliminary version in NIPS’92. [doi:10.1109/18.312168] 654

[42] J UN TARUI AND TATSUIE T SUKIJI: Learning DNF by approximating inclusion-exclusion formulae.
     In Proc. 14th IEEE Conf. on Computational Complexity (CCC’99), pp. 215–221. IEEE Comp. Soc.
     Press, 1999. [doi:10.1109/CCC.1999.766279] 654

[43] RONALD DE W OLF: A note on quantum algorithms and the minimal degree of ε-error polynomials
     for symmetric functions. Quantum Inf. Comput., 8(10):943–950, 2008. QIC. [ACM:2016989] 654




                   T HEORY OF C OMPUTING, Volume 9 (20), 2013, pp. 653–663                      662
                              A PPROXIMATING THE AND-OR T REE

AUTHOR

    Alexander A. Sherstov
    Assistant Professor
    University of California, Los Angeles
    sherstov cs ucla edu
    http://www.cs.ucla.edu/~sherstov


ABOUT THE AUTHOR

    A LEXANDER A. S HERSTOV completed his Ph. D. in 2009 at the University of Texas at
       Austin. After a postdoc at Microsoft Research, Alexander joined UCLA as an assistant
       professor of computer science. His research interests include computational complexity
       theory, computational learning theory, and quantum computing.




                 T HEORY OF C OMPUTING, Volume 9 (20), 2013, pp. 653–663                        663