                              Theory of Computing, Volume 7 (2011), pp. 147–153
                                           www.theoryofcomputing.org

                                                            NOTE



                          The Influence Lower Bound
                            Via Query Elimination
                                      Rahul Jain∗                Shengyu Zhang†
                                     Received: March 4, 2011; published: July 19, 2011.



         Abstract: We give a simple proof, via query elimination, of a result due to O’Donnell,
         Saks, Schramm, and Servedio, which shows a lower bound on the zero-error expected query
         complexity of a function f in terms of the maximum influence of any variable of f . Our
         lower bound also applies to the two-sided error expected query complexity of f , and it
         allows an immediate extension which can be used to prove stronger lower bounds for some
         functions.

ACM Classification: F.1.3
AMS Classification: 68Q17
Key words and phrases: randomized query complexity, influence of variables

1      Introduction
Query complexity measures the hardness of computing a function f by the minimum number of input
variables one needs to read before knowing the function's value. For ε ≥ 0 and a distribution λ on the
inputs, a k-expected-query randomized algorithm has λ-distributional error ε if its expected number of
queries (over inputs drawn from λ and its random coins) is at most k, and its expected error (over inputs
drawn from λ and its random coins) is at most ε. The ε-error λ-distributional expected query complexity
of f, denoted D_ε^λ(f), is the minimum k such that there exists a k-expected-query randomized algorithm
with λ-distributional error ε.
     ∗ Supported by the internal grants of The Centre of Quantum Technologies.
     † Partially supported by Research Grants Council of Hong Kong S.A.R. (Project numbers: CUHK419309 and CUHK418710.)



    © 2011 Rahul Jain and Shengyu Zhang
    Licensed under a Creative Commons Attribution License                                  DOI: 10.4086/toc.2011.v007a010

    The influence of a variable is another important quantity which measures the importance of the
variable to the function’s value (on average over other variables). Both query complexity and influence
are well-studied subjects; see [1] for a survey of the former (with many other complexity measures)
and [11, 4] for surveys of the latter (and Fourier analysis on Boolean functions).

Definition 1.1 (Influence). Let f : X^n → Z be a function, and let the X_i's and Y_i's (for i = 1, . . . , n)
be random variables independently and identically distributed according to µ on X. Let X = X_1 · · · X_n,
and for each i ∈ [n], let X^i denote the random variable X_1 . . . X_{i−1} Y_i X_{i+1} . . . X_n. The influence
of variable i on f with respect to µ is defined as

                                       inf_i(f, µ) = Pr[f(X) ≠ f(X^i)] .

The maximum influence of f with respect to µ is defined as

                                       inf_max(f, µ) = max_i inf_i(f, µ) .

    For Boolean functions, the influence as defined above is half of that in [12].
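As a concrete aid (our illustration, not part of the original paper), the resampling definition above can be computed exactly by enumeration when µ is uniform on {0, 1}; the functions `maj3` (majority of three bits) and `par4` (parity of four bits) are illustrative examples chosen here:

```python
from itertools import product

def influence(f, n, i):
    """Exact inf_i(f, mu) for mu uniform on {0, 1}: Pr[f(X) != f(X^i)],
    where X^i replaces coordinate i of X with a fresh uniform bit Y_i."""
    total = changed = 0
    for x in product((0, 1), repeat=n):
        for y in (0, 1):                      # fresh sample Y_i
            xi = x[:i] + (y,) + x[i + 1:]     # the resampled input X^i
            total += 1
            changed += f(x) != f(xi)
    return changed / total

maj3 = lambda x: int(sum(x) >= 2)   # majority of 3 bits
par4 = lambda x: sum(x) % 2         # parity of 4 bits

print([influence(maj3, 3, i) for i in range(3)])  # [0.25, 0.25, 0.25]
print([influence(par4, 4, i) for i in range(4)])  # [0.5, 0.5, 0.5, 0.5]
```

Note that resampling Y_i equals X_i with probability 1/2 under the uniform distribution, so these values are half the familiar "flip" influences, matching the remark above about the factor of two relative to [12].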
    For any randomized algorithm P and any fixed input distribution (which is clear from the context), let
ε(P) denote the error probability (over random coins and inputs) of P, and δi (P) denote the probability
(over random coins and inputs) that P queries input variable i. In [12], O’Donnell, Saks, Schramm and
Servedio proved the following:

Theorem 1.2 ([12]). Let f : {−1, +1}^n → {−1, +1}. Let µ be a distribution on {−1, +1}, and let X be
drawn from µ^{⊗n}. Then for any zero-error randomized query algorithm P (querying X),

                               ∑_{i=1}^n δ_i(P) inf_i(f, µ) ≥ Var[f]/2 ,

where Var[f] = E[f^2] − E[f]^2 is the variance of f(X). In particular, let P be a randomized algorithm
with no error and D_0^{µ^{⊗n}}(f) expected queries. Then,

                     D_0^{µ^{⊗n}}(f) = ∑_{i=1}^n δ_i(P) ≥ Var[f] / (2 · inf_max(f, µ)) .

    Recently Lee [9] gave another proof of this fact.
    Together with another bound for monotone functions [13],

                     D_0^{µ_p^{⊗n}}(f) ≥ (∑_i inf_i(f, µ_p))^2 · log_2(1/(p(1 − p))) ,

where µ_p is the distribution on {−1, +1} with −1 picked with probability p, Theorem 1.2 gives a lower
bound of Ω(n^{2/3}) for the randomized query complexity of all nontrivial monotone Boolean functions
invariant under a transitive group of permutations (on the variables). For such functions we let p be the
“critical threshold,” namely the bias at which the function takes value 1 with probability exactly one
half. Observe that all the variables have the same influence; thus, depending on whether this influence is
large or small, one of the two bounds can be applied to yield the Ω(n^{2/3}) lower bound. This in particular


reproduces the Ω(|V|^{4/3}) lower bound for all monotone graph properties in [5], which is O(log^{1/3}(|V|))
shy of the current record [2].
    In this paper we give a new proof of Theorem 1.2, arguably shorter and simpler than both previous
ones [12, 9]. In fact, we prove a stronger statement that applies to the two-sided error case.

Theorem 1.3. Let f : X^n → Z be a function, µ be a distribution on X, ε ≥ 0, and X be drawn from µ^{⊗n}.
Then for any randomized query algorithm P (querying X),

                     ∑_{i=1}^n δ_i(P) inf_i(f, µ) ≥ 1 − f_max − ε(P) ,                              (1)

where f_max = max_{z∈Z} Pr[f(X) = z]. In particular, let P be an algorithm with expected error ε and
expected queries D_ε^{µ^{⊗n}}(f). Then,

                     D_ε^{µ^{⊗n}}(f) = ∑_{i=1}^n δ_i(P) ≥ (1 − f_max − ε) / inf_max(f, µ) .          (2)


    It is easily seen that 1 − f_max ≥ Var[f]/4 for Boolean functions f; therefore the above theorem implies
Theorem 1.2 (up to a factor of 2) as a special case.
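The inequality 1 − f_max ≥ Var[f]/4 can be verified in two lines; writing p = Pr[f(X) = 1] (a step not spelled out in the paper):

```latex
% Verification of 1 - f_max >= Var[f]/4 for f taking values in {-1,+1}.
% Let p = Pr[f(X) = 1], so E[f] = 2p - 1 and f^2 = 1.
\[
  \mathrm{Var}[f] \;=\; 1 - (2p-1)^2 \;=\; 4p(1-p),
  \qquad
  f_{\max} \;=\; \max(p,\,1-p),
\]
\[
  1 - f_{\max} \;=\; \min(p,\,1-p)
  \;\ge\; \min(p,\,1-p)\cdot\max(p,\,1-p)
  \;=\; p(1-p)
  \;=\; \frac{\mathrm{Var}[f]}{4},
\]
% where the inequality uses max(p, 1-p) <= 1.
```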
    We present the proof of Theorem 1.3 in the next section. The original proof in [12] used a hybrid
argument on decision trees, and Lee [9] gave a martingale-based proof. Our approach proceeds by
query elimination: we can save one query without increasing the error by more than max_i inf_i(f, µ) and
eventually eliminate all queries to obtain a zero-query algorithm, which must have a large error probability
on a hard distribution. This bounds from below the number of queries of the original algorithm. The
analysis of the increase in error due to eliminating one query is quite simple and follows from the union
bound (applied just once) and the observation that X^i is identically distributed to X.
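The elimination step can be watched numerically. The sketch below (our illustration, not the paper's) takes a zero-error decision tree for 3-bit majority whose first query is x_1, replaces that query's answer with a fresh uniform bit, and confirms by exhaustive enumeration that the resulting error is at most ε(P) + inf_1; in this particular case the bound is tight:

```python
from itertools import product

def maj3(x):
    """Majority of three bits."""
    return int(sum(x) >= 2)

# P: a zero-error decision tree for maj3 whose first query is x1.
# P': skip that query and pretend the answer is a fresh uniform bit y.
# Since P is zero-error, ans(P, X^1) = f(X^1), so the error of P' is
# exactly Pr[f(X^1) != f(X)] = inf_1(f, uniform).
pairs = [(x, y) for x in product((0, 1), repeat=3) for y in (0, 1)]
err_P = 0.0  # the original algorithm makes no errors
err_Pprime = sum(maj3((y,) + x[1:]) != maj3(x) for x, y in pairs) / len(pairs)
inf_1 = sum(maj3((y,) + x[1:]) != maj3(x) for x, y in pairs) / len(pairs)

assert err_Pprime <= err_P + inf_1   # the error-increase bound, Equation (3)
print(err_Pprime, inf_1)             # 0.25 0.25
```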
    We can improve the lower bound by considering a function g, which is close to f but could potentially
have smaller inf_max, a method often referred to as smoothing. As is the case with the rectangle bound and
the discrepancy bound in communication complexity and query complexity, where the smoothed versions
can prove strong lower bounds [7, 14, 15, 10, 8, 6, 3], this smoothed influence lower bound also gives
stronger bounds than Equation (2) for some functions.
    Let g : X^n → Z be a function such that Pr[f(X) ≠ g(X)] ≤ δ, where X is drawn from µ^{⊗n} as above
and δ ≥ 0. It is easily noted that an algorithm for f with average error under µ^{⊗n} at most ε also works as
an algorithm for g with average error under µ^{⊗n} at most ε + δ. Therefore D_ε^{µ^{⊗n}}(f) ≥ D_{ε+δ}^{µ^{⊗n}}(g), and we
have the following corollary.

Corollary 1.4. Let f : X^n → Z be a function, µ be a distribution on X, and ε, δ ≥ 0. Let X be drawn
from µ^{⊗n}. Let g : X^n → Z be a function such that Pr[f(X) ≠ g(X)] ≤ δ. Then

         D_ε^{µ^{⊗n}}(f) ≥ D_{ε+δ}^{µ^{⊗n}}(g) ≥ (1 − max_{z∈Z} Pr[g(X) = z] − ε − δ) / inf_max(g, µ) .


    Note that there are functions f with large inf_max which are close to some other function g with small
inf_max. For example, Tribes is the OR of s ≈ n/log_2 n AND gates, each on t ≈ log_2 n − log_2 log_2 n variables.
The parameters s, t are set to make approximately half the inputs take the value 1. It is well known that
for this function all influences satisfy inf_i = Θ(log n/n), where the distribution is uniform on all inputs.
Let g be Tribes, and obtain f from g by picking a subset S = {0, 1} × S′ of inputs, where S′ ⊆ {0, 1}^{n−1}
has density δ, and defining

                                     f(x) = x_1    if x ∈ S,
                                     f(x) = g(x)   otherwise,

where x_1 denotes the first bit of x. Then the first variable has influence Ω(δ), so applying the
bound of Equation (2) yields only a constant lower bound for the distributional query complexity of f.
As g is δ-close to f and inf_max(g) = Θ(log n/n), the above corollary yields the stronger lower bound of
Θ(n/ log n).
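The symmetry claim (all Tribes influences equal) can be checked by brute force on a toy instance with two tribes of width two; the parameters here are chosen for enumerability (our example) and do not match the asymptotic setting above:

```python
from itertools import product

def tribes(x, width=2):
    """Tribes on len(x) bits: OR of ANDs over consecutive blocks of `width` bits."""
    return int(any(all(x[j:j + width]) for j in range(0, len(x), width)))

def influence(f, n, i):
    """Resampling influence of variable i under the uniform distribution."""
    pairs = [(x, y) for x in product((0, 1), repeat=n) for y in (0, 1)]
    return sum(f(x[:i] + (y,) + x[i + 1:]) != f(x) for x, y in pairs) / len(pairs)

infs = [influence(tribes, 4, i) for i in range(4)]
print(infs)  # [0.1875, 0.1875, 0.1875, 0.1875] -- all equal by symmetry
```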

Remark 1.5.

    1. Our proof does not need to assume that the distributions of the different variables are the same.
       The proof goes through and the bound applies analogously as long as these distributions are
       independent.

    2. The paper [12] also extends Theorem 1.2 to the general metric case, where the influence is defined
       as
                                        inf_i^d(f, µ) = E[d(f(X), f(X^i))] .

       Our proof also extends to this general case. To see this, it suffices to note that all we use is the
       triangle inequality, which holds for any metric space.



2    Proof of main result
Proof of Theorem 1.3. Equation (2) is an easy corollary of Equation (1). For Equation (1), suppose that P
makes at most k queries for any input and any coins. We will prove the statement by induction on k. The
base case k = 0 is trivial, since an algorithm that does not make any query succeeds with probability at
most f_max. For general k > 0, we first show the statement when P is deterministic (assuming the
induction hypothesis on randomized query algorithms). Suppose X_i is the first query of P; without loss of
generality we can assume that P does not query X_i afterwards. We will exhibit a randomized algorithm
P′ making at most k − 1 queries with ε(P′) ≤ ε(P) + inf_i(f, µ). In P′ we do not make this query, but
assume the answer to this query to be Y_i, where Y_i is distributed according to µ and is independent of X.
From here on P′ proceeds identically to P.
    By construction the maximum number of queries made by P′ (over coins and inputs) is at most
k − 1. Let ans(P, X) represent the answer of algorithm P on input X. Recall the definition of X^i from


Definition 1.1. Since ans(P, X^i) ≠ f(X) implies either ans(P, X^i) ≠ f(X^i) or f(X^i) ≠ f(X), we have

    ε(P′) = Pr[ans(P, X^i) ≠ f(X)]
          ≤ Pr[ans(P, X^i) ≠ f(X^i)] + Pr[f(X^i) ≠ f(X)]       (from the union bound)
          = Pr[ans(P, X) ≠ f(X)] + Pr[f(X^i) ≠ f(X)]           (since X is identically distributed to X^i)
          = ε(P) + inf_i(f, µ) .                                                                    (3)

Therefore (below we abbreviate inf_i(f, µ) = inf_i),

    ∑_{i′} δ_{i′}(P) inf_{i′} = 1 · inf_i + ∑_{i′≠i} δ_{i′}(P) inf_{i′}
                              = inf_i + ∑_{i′≠i} δ_{i′}(P′) inf_{i′}        (since X is identically distributed to X^i)
                              = inf_i + ∑_{i′=1}^n δ_{i′}(P′) inf_{i′}      (since P′ does not query X_i)
                              ≥ inf_i + 1 − f_max − ε(P′)                   (by the induction hypothesis)
                              ≥ inf_i + 1 − f_max − ε(P) − inf_i            (from Equation (3))
                              = 1 − f_max − ε(P) .

    Now let us show Equation (1) when P is a randomized algorithm. We can think of P as invoking a
deterministic algorithm P_j with probability p_j, where each P_j makes at most k queries on any input.
Then (assuming Equation (1) for all deterministic algorithms making at most k queries)

    ∑_i δ_i(P) inf_i = ∑_i ∑_j p_j δ_i(P_j) inf_i = ∑_j p_j ∑_i δ_i(P_j) inf_i
                     ≥ 1 − f_max − ∑_j p_j · ε(P_j) = 1 − f_max − ε(P) .
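To make Equation (1) concrete, the following sketch (our illustration, not from the paper) verifies it exhaustively for 3-bit majority under the uniform distribution, using the zero-error tree that queries x_1 and x_2 always and x_3 only when they disagree:

```python
from itertools import product

def maj3(x):
    """Majority of three bits."""
    return int(sum(x) >= 2)

n = 3
inputs = list(product((0, 1), repeat=n))

# Query probabilities delta_i for the zero-error tree that queries
# x1 and x2 on every input, and x3 only when x1 != x2.
delta = [1.0, 1.0, sum(x[0] != x[1] for x in inputs) / len(inputs)]

def influence(i):
    """Resampling influence inf_i(maj3, uniform) by exact enumeration."""
    pairs = [(x, y) for x in inputs for y in (0, 1)]
    return sum(maj3(x[:i] + (y,) + x[i + 1:]) != maj3(x) for x, y in pairs) / len(pairs)

f_max = max(sum(maj3(x) == z for x in inputs) / len(inputs) for z in (0, 1))
lhs = sum(delta[i] * influence(i) for i in range(n))

print(lhs, 1 - f_max)          # 0.625 0.5
assert lhs >= 1 - f_max - 0.0  # Equation (1) with eps(P) = 0
```

Since the tree is zero-error, Equation (2) also gives D_0 ≥ (1 − 0.5)/0.25 = 2 expected queries, which the tree's 2.5 expected queries indeed satisfies.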




3    Acknowledgment
We thank Ronald de Wolf for detailed and helpful comments on an earlier draft of the paper. An earlier
version of this paper proved only Equation (2); Ryan O'Donnell and an anonymous referee pointed out
that the same argument easily extends to show Equation (1), yielding the current presentation.


References
 [1] Harry Buhrman and Ronald de Wolf: Complexity measures and decision tree complexity:
     A survey. Theoret. Comput. Sci., 288(1):21–43, 2002. [doi:10.1016/S0304-3975(01)00144-X]


 [2] Amit Chakrabarti and Subhash Khot: Improved lower bounds on the randomized complexity
     of graph properties. In Proc. 28th Intern. Colloq. Autom. Lang. Program. (ICALP), number 2076
     in Lecture Notes in Comput. Sci., pp. 285–296. Springer, 2001. [doi:10.1007/3-540-48224-5_24]

 [3] Amit Chakrabarti and Oded Regev: An optimal lower bound on the communication
     complexity of Gap-Hamming-Distance. In Proc. 43rd STOC, pp. 51–60. ACM Press, 2011.
     [doi:10.1145/1993636.1993644]

 [4] Ronald de Wolf: A Brief Introduction to Fourier Analysis on the Boolean Cube. Number 1 in
     Graduate Surveys. Theory of Computing Library, 2008. [doi:10.4086/toc.gs.2008.001]

 [5] Péter Hajnal: An Ω(n^{4/3}) lower bound on the randomized complexity of graph properties.
     Combinatorica, 11(2):131–143, 1991. [doi:10.1007/BF01206357]

 [6] Rahul Jain and Hartmut Klauck: The partition bound for classical communication com-
     plexity and query complexity. In Proc. 25th IEEE Conf. Comput. Complexity (CCC), pp. 247–258.
     IEEE Comp. Soc. Press, 2010. [doi:10.1109/CCC.2010.31]

 [7] Hartmut Klauck: Lower bounds for quantum communication complexity. SIAM J. Comput.,
     37(1):20–46, 2007. [doi:10.1137/S0097539702405620]

 [8] Hartmut Klauck: A strong direct product theorem for Disjointness. In Proc. 42nd STOC, pp.
     77–86. ACM Press, 2010. [doi:10.1145/1806689.1806702]

 [9] Homin Lee: Decision trees and influence: An inductive proof of the OSSS inequality. Theory of
     Computing, 6(1):81–84, 2010. [doi:10.4086/toc.2010.v006a004]

[10] Troy Lee and Shengyu Zhang: Composition theorems in communication complexity. In Proc.
     37th Intern. Colloq. Autom. Lang. Program. (ICALP), number 6198 in Lecture Notes in Comput.
     Sci., pp. 475–489. Springer, 2010. [doi:10.1007/978-3-642-14165-2_41]

[11] Ryan O'Donnell: Some topics in analysis of Boolean functions. In Proc. 40th STOC, pp.
     569–578. ACM Press, 2008. [doi:10.1145/1374376.1374458]

[12] Ryan O'Donnell, Michael E. Saks, Oded Schramm, and Rocco A. Servedio: Every
     decision tree has an influential variable. In Proc. 46th FOCS, pp. 31–39. IEEE Comp. Soc. Press,
     2005. [doi:10.1109/SFCS.2005.34]

[13] Ryan O'Donnell and Rocco Servedio: Learning monotone decision trees in polynomial
     time. SIAM J. Comput., 37(3):827–844, 2007. [doi:10.1137/060669309]

[14] Alexander Sherstov: The pattern matrix method for lower bounds on quantum communication.
     In Proc. 40th STOC, pp. 85–94. ACM Press, 2008. [doi:10.1145/1374376.1374392]

[15] Yaoyun Shi and Yufan Zhu: The quantum communication complexity of block-composed
     functions. Quantum Inf. Comput., 9(5&6):444–460, 2009. http://www.rintonpress.com/
     journals/qic/, http://arxiv.org/pdf/0710.0095v4/.



AUTHORS

    Rahul Jain
    Assistant professor
    National University of Singapore
    rahul comp nus edu sg
    http://comp.nus.edu.sg/~rahul


    Shengyu Zhang
    Assistant professor
    The Chinese University of Hong Kong
    syzhang cse cuhk edu hk
    http://www.cse.cuhk.edu.hk/~syzhang


ABOUT THE AUTHORS

    R AHUL JAIN obtained his Ph. D. in computer science from the Tata Institute of Fundamental
       Research, Mumbai, India in 2003. He was a postdoctoral fellow for two years at the
       University of California, Berkeley (2004-2006) and for two years at the Institute for
       Quantum Computing (IQC), University of Waterloo, Canada (2006-2008). In 2008, he
       joined NUS as an Assistant Professor in the Computer Science Department with cross
       appointment with CQT. His research interests are in the areas of information theory,
       quantum computation, cryptography, communication complexity, and computational
       complexity theory.


    S HENGYU Z HANG received his B. S. in Mathematics at Fudan University in 1999, his
       M. S. in Computer Science at Tsinghua University in 2002, and his Ph. D. in Computer
       Science at Princeton University in 2006. After working in NEC Laboratories America
       for a summer, and in California Institute of Technology for two years as a postdoctoral
       researcher, he joined The Chinese University of Hong Kong as an assistant professor.
       His main research interests are complexity theories in various randomized and quantum
       models.



