                                 Theory of Computing, Volume 4 (2008), pp. 129–135
                                             http://theoryofcomputing.org




    The One-Way Communication Complexity of Hamming Distance

                T. S. Jayram        Ravi Kumar∗        D. Sivakumar∗

        Received: January 23, 2008; revised: September 12, 2008; published: October 11, 2008.




        Abstract: Consider the following version of the Hamming distance problem for ±1 vectors
        of length n: the promise is that the distance is either at least n/2 + √n or at most n/2 − √n,
        and the goal is to find out which of these two cases occurs. Woodruff (Proc. ACM-SIAM
        Symposium on Discrete Algorithms, 2004) gave a linear lower bound for the randomized
        one-way communication complexity of this problem. In this note we give a simple proof
        of this result. Our proof uses a simple reduction from the indexing problem and avoids the
        VC-dimension arguments used in the previous paper. As shown by Woodruff (loc. cit.),
        this implies an Ω(1/ε²)-space lower bound for approximating frequency moments within
        a factor 1 + ε in the data stream model.

ACM Classification: F.2.2
AMS Classification: 68Q25
Key words and phrases: One-way communication complexity, indexing, Hamming distance

1     Introduction
The Hamming distance H(x, y) between two vectors x and y is defined to be the number of positions i
such that x_i ≠ y_i. Let GapHD denote the following promise problem: the input consists of ±1 vectors
x and y of length n, together with the promise that either H(x, y) ≤ n/2 − √n or H(x, y) ≥ n/2 + √n, and
the goal is to find out which of these two cases occurs. In the one-way communication model [6], Alice
gets x, Bob gets y, and Alice sends a single message to Bob, using which Bob outputs the desired answer.
    ∗ Part of the work done while the author was at the IBM Almaden Research Center.



Authors retain copyright to their work and grant Theory of Computing unlimited rights
to publish the work electronically and in hard copy. Use of the work is permitted as
long as the author(s) and the journal are properly acknowledged. For the detailed
copyright statement, see http://theoryofcomputing.org/copyright.html.

© 2008 T. S. Jayram, Ravi Kumar, and D. Sivakumar                                          DOI: 10.4086/toc.2008.v004a006

We will also allow the protocols to be randomized, in which case both Alice and Bob have access to a
public random string, and the correct answer must be output with probability at least 2/3. The cost of
such a protocol is the maximum number of bits communicated by Alice over all inputs. The randomized
one-way communication complexity of GapHD is the cost of the minimum-cost one-way protocol for
GapHD.
    Woodruff [11] showed a tight Ω(n) lower bound for GapHD and used it to obtain an Ω(1/ε²)-space
lower bound for approximating frequency moments within a factor 1 + ε in the data stream model.
In this note we show a simpler proof of the linear lower bound for GapHD; our proof uses an easy
reduction from the indexing problem and avoids the VC-dimension arguments in [11]. We will present
two different reductions: the first reduction uses Rademacher sums and the second reduction treats the
indexing problem from a geometric viewpoint.
    After we discussed our proof with Woodruff, he obtained [12] two alternative proofs of the
Ω(n) lower bound for GapHD. The first proof [12, Section 4.3] is similar in spirit to ours, and presents
a reduction from the indexing problem. The second proof [12, Section 4.4] establishes the Ω(n) lower
bound for GapHD under the uniform product distribution of inputs, using combinatorial and information-
theoretic arguments. For the more general version of the gap Hamming problem where the goal is to
distinguish between H(x, y) ≤ d and H(x, y) ≥ d(1 + ε), an algorithm of Kushilevitz, Ostrovsky, and
Rabani [7] yields an O(1/ε²) upper bound (independent of d). Finally, the multi-round randomized
communication complexity of GapHD is still open; we conjecture an Ω(n) lower bound.


2    Main result
In this note we give a simple proof of the following result.
Theorem 1 (Woodruff). The randomized one-way communication complexity of GapHD is linear in the
length of the input.
Proof. We begin by recalling the indexing problem: Alice gets a set T ⊆ [n], Bob gets an element
i ∈ [n], and the goal is to compute whether i ∈ T . We know that this has an Ω(n) lower bound in the
one-way communication model (e. g., see [5] or [2] for a sharp bound in terms of the error probability;
for completeness, we include a self-contained elementary proof of this result in Section 3).
     Let Alice’s input be T ⊆ [n] and Bob’s input be i ∈ [n]. Transform T to a vector u ∈ {−1, +1}^n by
setting u_j = −1 if j ∈ T and u_j = +1 if j ∉ T. Let e_i denote the standard 0-1 basis vector corresponding
to Bob’s input.
     Alice and Bob will use public randomness to realize an instance x, y ∈ {−1, +1}^N of GapHD, for
some N to be specified later, as follows. Pick N i.i.d. vectors r_1, r_2, . . . , r_N in R^n, where the distribution
µ of each r_k will be specified later. Define x_k := sgn(⟨u, r_k⟩) and y_k := sgn(⟨e_i, r_k⟩) for all k. Since
r_k ∈ {−1, +1}^n, we have sgn(⟨e_i, r_k⟩) = r_ik. Also note that

        H(x, y) = |{k : sgn(⟨u, r_k⟩) ≠ sgn(⟨e_i, r_k⟩)}| = |{k : sgn(⟨u, r_k⟩) ≠ r_ik}| .
    We will show that if r ∼ µ,

        Pr[sgn(⟨u, r⟩) ≠ r_i]   ≥ 1/2 + c/√n   if u_i = −1,
                                ≤ 1/2 − c/√n   if u_i = +1,                          (2.1)


for some constant c > 0.
    We will use the following version of Chernoff’s bound (e. g., see [8]):

Chernoff’s bound. Let X_1, X_2, . . . , X_N be N i.i.d. 0-1 random variables and X = Σ_{k=1}^{N} X_k. Then

        Pr[X − E[X] > ε] ≤ e^{−2ε²/N}   and   Pr[X − E[X] < −ε] ≤ e^{−2ε²/N} .
    Set N = 4n/c² and ε = √N. By Chernoff’s bound, with probability at least 2/3, we have that either
H(x, y) ≥ N/2 + √N if u_i = −1, or H(x, y) ≤ N/2 − √N if u_i = +1. Therefore, given a protocol for GapHD,
we have a protocol for the indexing problem. Since the indexing problem has a linear lower bound and
N = O(n), this proves the linear lower bound for GapHD.
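As a sanity check (not part of the proof), the reduction can be simulated end to end once µ is instantiated as the uniform distribution on {−1, +1}^n, the "Rademacher" choice analyzed below. The parameters here (n = 11, N = 4000) and the set T are illustrative choices for speed, not the N = 4n/c² of the proof:

```python
import random

def gap_hd_reduction(T, i, n, N, seed=0):
    """Simulate the reduction from indexing to GapHD.

    Alice holds T, a subset of range(n); Bob holds an index i; both see
    the shared random vectors r_1, ..., r_N, each uniform in {-1,+1}^n.
    Returns the Hamming distance H(x, y) of the realized GapHD instance.
    """
    rng = random.Random(seed)
    u = [-1 if j in T else +1 for j in range(n)]
    dist = 0
    for _ in range(N):
        r = [rng.choice((-1, +1)) for _ in range(n)]
        x_k = 1 if sum(uj * rj for uj, rj in zip(u, r)) > 0 else -1  # sgn(<u, r_k>)
        y_k = r[i]                                                   # sgn(<e_i, r_k>) = r_ik
        if x_k != y_k:
            dist += 1
    return dist

n, N = 11, 4000          # n odd, so <u, r> is never 0
T = {0, 2, 3, 7}
for i in range(n):
    H = gap_hd_reduction(T, i, n, N, seed=42)
    # Bob decides membership by comparing H(x, y) against N/2
    assert (H > N / 2) == (i in T)
```

With these small parameters the per-coordinate bias c/√n is large relative to the sampling noise, so the comparison against N/2 recovers membership reliably.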
   We now establish (2.1) by giving two different proofs.

Rademacher sums: Assume that n is odd. Let µ be the uniform distribution over the vectors in
{−1, +1}^n and let r ∼ µ. Note that since both u and r are ±1 vectors and n is odd, ⟨u, r⟩ ≠ 0. Write
⟨u, r⟩ = u_i r_i + w, where w := Σ_{j≠i} u_j r_j. Note that w is independent of r_i. Fix a value for w; there are two
cases to consider:

    • If w ≠ 0, then |w| ≥ 2, since w is a sum of an even number of ±1 values. Therefore, sgn(⟨u, r⟩) =
      sgn(w), implying that

          Pr[sgn(⟨u, r⟩) ≠ r_i | w ≠ 0] = Pr[sgn(w) ≠ r_i | w ≠ 0] = 1/2 .                  (2.2)

    • If w = 0, then sgn(⟨u, r⟩) = u_i r_i. Using the independence of w and r_i, we obtain

          Pr[sgn(⟨u, r⟩) ≠ r_i | w = 0] = Pr[u_i r_i ≠ r_i | w = 0]
                                        = Pr[u_i r_i ≠ r_i]
                                        = 1 if u_i = −1,   0 if u_i = +1.                   (2.3)

Now w is the sum of n − 1 i.i.d. random variables, each of which is distributed uniformly in {−1, +1}.
Since n is odd, assuming it is large enough, Pr[w = 0] ≥ c_0/√n for some constant c_0 > 0 (by Stirling’s
formula). Combining this with (2.2) and (2.3), and letting c = c_0/2, we obtain (2.1).
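The Stirling-formula estimate can be checked exactly for small odd n: here w is a sum of n − 1 uniform ±1 variables, so Pr[w = 0] = C(n−1, (n−1)/2) / 2^(n−1). The constant 0.5 used below as c_0 is an illustrative choice, not one computed in the text:

```python
from math import comb, sqrt

def prob_w_zero(n):
    """Exact Pr[w = 0], where w is a sum of n-1 independent
    uniform +/-1 variables (n odd, so n-1 is even)."""
    assert n % 2 == 1
    return comb(n - 1, (n - 1) // 2) / 2 ** (n - 1)

# Check a c_0 / sqrt(n) lower bound with the illustrative constant c_0 = 0.5.
for n in range(3, 402, 2):
    assert prob_w_zero(n) >= 0.5 / sqrt(n)
```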

Geometry: The key idea is to view u and e_i as vectors in Euclidean space and apply the inner product
protocol given in [5]. This protocol uses the technique of [4], which arose in the context of rounding
the solution of a semi-definite program. For the sake of completeness, we sketch this argument. Define
µ such that r ∼ µ is a uniformly chosen n-dimensional unit vector. Let r′ be the projection of r onto the
plane spanned by u and e_i. Then by rotational symmetry, the direction of r′ is uniform in this plane. If û
denotes the unit vector in the direction of u, then it follows¹ that

        Pr[sgn(⟨u, r⟩) ≠ r_i] = arccos(⟨û, e_i⟩)/π = (1/π) · arccos(u_i/√n) .               (2.4)

    ¹ Note that the events ⟨u, r⟩ = 0 and ⟨e_i, r⟩ = 0 have probability measure zero; if either happens, Alice and Bob will abandon
the protocol; this adds a negligible quantity to the error probability.



Now, for any z ∈ [−1, 1], arccos(z) = π/2 − arcsin(z). Since sin(z) ≤ z for z ≥ 0, we have arcsin(z) ≥ z for
0 ≤ z ≤ 1. Substituting in (2.4), we conclude

        Pr[sgn(⟨u, r⟩) ≠ sgn(⟨e_i, r⟩)]   ≥ 1/2 + 1/(π√n)   if u_i = −1,
                                          ≤ 1/2 − 1/(π√n)   if u_i = +1,

as required.
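These bounds can be verified numerically from (2.4) alone, comparing arccos(∓1/√n)/π against 1/2 ± 1/(π√n) over a range of n:

```python
from math import acos, pi, sqrt

def disagreement_prob(u_i, n):
    """Right-hand side of (2.4): (1/pi) * arccos(u_i / sqrt(n))."""
    return acos(u_i / sqrt(n)) / pi

# The arcsin(z) >= z step gives a 1/(pi*sqrt(n)) advantage on each side.
for n in range(2, 1001):
    assert disagreement_prob(-1, n) >= 0.5 + 1 / (pi * sqrt(n))
    assert disagreement_prob(+1, n) <= 0.5 - 1 / (pi * sqrt(n))
```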

Remark. The geometric approach shown above uses an infinite amount of randomness, which is not
part of the standard model. The important point, however, is that the spaces of inputs and messages are
finite; therefore, the lower bound for indexing, and consequently for GapHD, continues to hold.
Alternatively, one can also prove the above bounds using a finite amount of randomness, by considering
finite-precision versions of the random vectors (as was done in [5]; see also [9]).


3     Indexing lower bound
In this section, we present an elementary and self-contained proof of an Ω(n) lower bound on the ran-
domized one-way communication complexity of the indexing problem. A more general result, involving
the VC-dimension and shatter coefficients, appears in [5, 2]. The proof is intended to be elementary,
and therefore the constants are not chosen optimally. The proof borrows ideas from [3]; the use
of error-correcting codes for data stream lower bounds also appears in [1].
     The indexing problem may also be thought of as a problem on strings: Alice is given a string
x ∈ {0, 1}n and Bob is given an index i ∈ [n], and the goal is for Bob to learn xi , after receiving a single
message from Alice. Alice and Bob are both allowed to be randomized. From the easy direction of
Yao’s min-max principle, it follows that if there is a randomized communication protocol that solves
every instance of indexing with probability of error at most δ , then for any distribution µ on the inputs
for indexing, there is a deterministic protocol that works correctly with probability at least 1 − δ when
the inputs are chosen according to µ. Furthermore, if the randomized protocol is one-way, then so
is the resulting deterministic protocol. The rest of this section presents an ensemble of distributions
µ = µ(δ) = {µ_n} (for suitably small δ > 0) such that for some absolute constant ε > 0 and for sufficiently
large n, in any deterministic communication protocol that works correctly on all but a δ fraction of inputs
from µ_n, Alice must communicate at least εn bits to Bob.
     An (n, d)-error-correcting code is a collection C of strings in {0, 1}^n such that for any distinct pair
of strings x, y ∈ C, H(x, y) ≥ d, where H denotes the Hamming distance. It is well known² that there are
constants α > 0, ∆ > 0 such that for all sufficiently large n, there is an (n, ∆n)-error-correcting code C_n
of size 2^{αn}. The distribution µ_n we will use picks a string x for Alice uniformly from the collection
C_n, and an index i ∈ [n] for Bob uniformly at random. Let 0 < δ ≤ ∆/4.
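The existence claim can be checked on a small scale with a greedy, Gilbert–Varshamov-style construction (our illustration, not the Justesen codes or probabilistic argument cited in the footnote); the parameters below are illustrative:

```python
from itertools import product

def greedy_code(n, d):
    """Greedily collect strings of length n with pairwise
    Hamming distance >= d (Gilbert-Varshamov-style)."""
    code = []
    for x in product((0, 1), repeat=n):
        if all(sum(a != b for a, b in zip(x, y)) >= d for y in code):
            code.append(x)
    return code

# For n = 12, d = 4 the greedy code already has size >= 2^(n/4) = 8,
# i.e., alpha >= 1/4 at this small scale.
C = greedy_code(12, 4)
assert len(C) >= 2 ** (12 // 4)
assert all(sum(a != b for a, b in zip(x, y)) >= 4
           for i, x in enumerate(C) for y in C[:i])
```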
     Let Π denote a deterministic one-way communication protocol that succeeds with probability at
least 1 − δ when the inputs are drawn according to µ_n, and in which the number of bits sent by Alice
   ² For example, Justesen codes (see [10]) are a simple and explicit construction. It can be shown via the probabilistic method,
using Chernoff bounds, that there exists α such that for any ∆ < 1/2 and sufficiently large n, there is an (n, ∆n)-error-correcting
code of size 2^{αn}. For our proof, the explicitness of the code is not necessary.



is at most c. Since Π is a one-way protocol, there is a pair of functions A : {0, 1}^n → {0, 1}^c and
B : {0, 1}^c × [n] → {0, 1} that describe the actions of Alice and Bob in the protocol Π. First note that
if x, y ∈ C_n and x ≠ y, then since H(x, y) ≥ ∆n ≥ 4δn, under the uniform distribution of i ∈ [n] we have
x_i ≠ y_i with probability at least 4δ. Now suppose x, y ∈ C_n, x ≠ y, and A(x) = A(y) = z. Then, for at
least 4δn distinct values of i, B(z, i) agrees with precisely one of x_i and y_i; equivalently, for at least 2δn
distinct values of i, either B(z, i) ≠ x_i or B(z, i) ≠ y_i. Since Π(x, i) = B(z, i) = Π(y, i), we have
Pr_i[Π(x, i) ≠ x_i] ≥ 2δ or Pr_i[Π(y, i) ≠ y_i] ≥ 2δ. Since this is true for any pair of strings in C_n that
agree under the map A(·), it follows that for any z ∈ {0, 1}^c, for all but at most one x satisfying A(x) = z,
we have Pr_i[Π(x, i) ≠ x_i] ≥ 2δ. If the total error of the protocol under the distribution µ_n is no more
than δ, we must have

        ((2^{αn} − 2^c)/2^{αn}) · 2δ ≤ δ ,

whence it follows that c ≥ αn − 1.
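The pigeonhole step, that Bob's answer vector for a fixed message z can be close to at most one codeword, is just the triangle inequality, and can be checked directly on a small code (the greedy construction here is our illustration, with arbitrary parameters):

```python
import random
from itertools import product

def hamming(x, y):
    return sum(a != b for a, b in zip(x, y))

def greedy_code(n, d):
    """Strings of length n with pairwise Hamming distance >= d, chosen greedily."""
    code = []
    for x in product((0, 1), repeat=n):
        if all(hamming(x, y) >= d for y in code):
            code.append(x)
    return code

n, d = 10, 4
C = greedy_code(n, d)
rng = random.Random(1)
for _ in range(100):
    # b plays the role of Bob's answer vector (B(z, 1), ..., B(z, n)) for one message z
    b = tuple(rng.randint(0, 1) for _ in range(n))
    close = [x for x in C if hamming(x, b) < d / 2]
    # Triangle inequality: two codewords each within distance < d/2 of b
    # would be at distance < d from each other, contradicting the code's distance.
    assert len(close) <= 1
```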

Acknowledgments
We thank Laci Babai and the anonymous reviewers for many useful comments.


References
 [1] N. Alon, Y. Matias, and M. Szegedy: The space complexity of approximating the frequency
     moments. J. Comput. System Sci., 58(1):137–147, 1999. [JCSS:10.1006/jcss.1997.1545].

 [2] Z. Bar-Yossef, T. S. Jayram, R. Kumar, and D. Sivakumar: Information theory methods
     in communication complexity. In Proc. 17th Annual IEEE Conf. Computational Complexity,
     pp. 93–102. IEEE, 2002. [CCC:10.1109/CCC.2002.1004344].

 [3] Z. Bar-Yossef, R. Kumar, and D. Sivakumar: Reductions in streaming algorithms, with
     an application to counting triangles in graphs. In Proc. 13th ACM-SIAM Symp. Discrete Algorithms
     (SODA), pp. 623–632. SIAM, 2002. [SODA:545381.545464].

 [4] M. X. Goemans and D. P. Williamson: Improved approximation algorithms for maximum
     cut and satisfiability problems using semidefinite programming. J. ACM, 42(6):1115–1145, 1995.
     [JACM:10.1145/227683.227684].

 [5] I. Kremer, N. Nisan, and D. Ron: On randomized one-round communication complexity.
     Comput. Complexity, 8(1):21–49, 1999. [Springer:w5pccfda9jbhpgdj].

 [6] E. Kushilevitz and N. Nisan: Communication Complexity. Cambridge University Press,
     1997.

 [7] E. Kushilevitz, R. Ostrovsky, and Y. Rabani: Efficient search for approximate
     nearest neighbor in high dimensional spaces. SIAM J. Comput., 30(2):457–474, 2000.
     [SICOMP:10.1137/S0097539798347177].


 [8] C. McDiarmid: Probabilistic Methods for Algorithmic Discrete Mathematics, volume 16 of
     Algorithms and Combinatorics. Springer-Verlag, Berlin, 1998.

 [9] I. Newman: Private vs. common random bits in communication complexity. Inform. Process.
     Lett., 39(2):67–71, 1991. [IPL:10.1016/0020-0190(91)90157-D].

[10] J. H. van Lint: Introduction to Coding Theory. Springer, 1999.

[11] D. Woodruff: Optimal space lower bounds for all frequency moments. In Proc.
     15th Annual ACM-SIAM Symp. Discrete Algorithms (SODA), pp. 167–175. SIAM, 2004.
     [SODA:982792.982817].

[12] D. Woodruff: Efficient and Private Distance Approximation in the Communication and
     Streaming Models. PhD thesis, MIT, 2007.


AUTHORS

      T. S. Jayram [About the author]
      IBM Almaden Research Center
      650 Harry Rd
      San Jose, CA 95120
      jayram almaden ibm com
      http://www.almaden.ibm.com/cs/people/jayram


      Ravi Kumar [About the author]
      Yahoo! Research
      701 First Ave
      Sunnyvale, CA 94089
      ravikumar yahoo-inc com
      http://research.yahoo.com/Ravi Kumar


      D. Sivakumar [About the author]
      Google Inc.
      1600 Amphitheatre Parkway
      Mountain View, CA 94043
      siva google com
      http://globofthoughts.blogspot.com





ABOUT THE AUTHORS

   T. S. Jayram (aka Jayram Thathachar) has been at IBM Research since 1998. He
      obtained his Bachelors degree in Computer Science from IIT Madras, Chennai, and his
      Masters and Ph. D. in Computer Science and Engineering from the University of
      Washington, Seattle, under the supervision of Paul Beame. He now manages the Algorithms
      and Computation group at IBM Almaden. His primary research interests are in massive
      data sets and computational complexity.


   Ravi Kumar has been at Yahoo! Research since July 2005. Prior to this, he was a research
      staff member at the IBM Almaden Research Center in the Computer Science Principles
      and Methodologies group. He obtained his Ph. D. in Computer Science from Cornell
      University under Ronitt Rubinfeld in December 1997. His primary interests are web
      algorithms, algorithms for large data sets, and theory of computation.


   D. Sivakumar (Siva) is interested in theoretical computer science, organizing the world’s
      information, and everything in between. He has been at Google Research since 2005,
      before which he was on the Research Staff at the IBM Almaden Research Center, San
      Jose, CA. Before finding his home in the pleasant environs of northern California, he
      spent a few years in the steam chamber of Houston, TX, as a member of the faculty of
      Computer Science at the University of Houston, and a few years in the freezer that is
      Buffalo, NY, where he received his Ph. D. in Computer Science from the State University
      of New York under the wise tutelage of Ken Regan, Jin-Yi Cai, and Alan Selman.


