On The Hardness of Approximate and Exact (Bichromatic) Maximum Inner Product

Authors Lijie Chen,

License CC-BY-3.0

                           THEORY OF COMPUTING, Volume 16 (4), 2020, pp. 1–50
                                        www.theoryofcomputing.org

                                        SPECIAL ISSUE: CCC 2018



 On The Hardness of Approximate and Exact
   (Bichromatic) Maximum Inner Product
                                                      Lijie Chen∗
                   Received July 1, 2018; Revised May 28, 2019; Published September 23, 2020




       Abstract. In this paper we study the (Bichromatic) Maximum Inner Product Problem
       (Max-IP), in which we are given sets A and B of vectors, and the goal is to find a ∈ A
       and b ∈ B maximizing inner product a · b. Max-IP is a basic question and serves as the
       base problem in the recent breakthrough of [Abboud et al., FOCS 2017] on hardness of
       approximation for polynomial-time problems. It is also used (implicitly) in the argument for
       hardness of exact ℓ2-Furthest Pair (and other important problems in computational geometry)
       in poly-loglog dimensions in [Williams, SODA 2018]. We have three main results regarding
       this problem.

           • Characterization of Multiplicative Approximation. First, we study the best multi-
             plicative approximation ratio for Boolean Max-IP in subquadratic time. We show that,
             for Max-IP with two sets each consisting of n vectors from {0, 1}d , there is an n2−Ω(1) -
             time multiplicative t-approximation algorithm when t = (d/ log n)Ω(1) . We also show
             this is conditionally optimal, as a (d/ log n)o(1) -approximation algorithm would refute
       SETH. A similar characterization is also achieved for additive approximation to Max-IP.
   ∗ Supported by an Akamai Fellowship.



ACM Classification: F.1.3
AMS Classification: 68Q17
Key words and phrases: Maximum Inner Product, SETH, hardness of approximation in P, fine-grained
complexity, Hopcroft’s problem, Chinese Remainder Theorem


© 2020 Lijie Chen
Licensed under a Creative Commons Attribution License (CC-BY)                  DOI: 10.4086/toc.2019.v016a004

            • 2^{O(log* n)}-dimensional Hardness for Exact Max-IP Over The Integers. Second, we
              revisit the hardness of solving Max-IP exactly for vectors with integer entries. We show
              that, under SETH, for Max-IP with sets of n vectors from Z^d for some d = 2^{O(log* n)},
              every exact algorithm requires n^{2−o(1)} time. With the reduction from [Williams, SODA
              2018], it follows that ℓ2-Furthest Pair and Bichromatic ℓ2-Closest Pair in dimension
              2^{O(log* n)} require n^{2−o(1)} time.
            • Connection with NP · UPP Communication Protocols. Last, we establish a con-
              nection between conditional lower bounds for exact Max-IP with integer entries and
              NP · UPP communication protocols for Set-Disjointness, parallel to the connection
              between conditional lower bounds for approximate Max-IP and MA communication
              protocols for Set-Disjointness.

             The lower bound in our first result is a direct corollary of the new MA protocol for
        Set-Disjointness introduced in [Rubinstein, STOC 2018], and our algorithms utilize the
        polynomial method and simple random sampling. Our second result follows from a new
        dimensionality self-reduction from the Orthogonal Vectors problem for n vectors from {0, 1}^d
        to n vectors from Z^ℓ where ℓ = 2^{O(log* d)}, dramatically improving the previous reduction in
        [Williams, SODA 2018]. The key technical ingredient is a recursive application of the Chinese
        Remainder Theorem.
             As a by-product we obtain an MA communication protocol for Set-Disjointness with
        complexity O(√(n log n log log n)), slightly improving the O(√n log n) bound [Aaronson and
        Wigderson, TOCT 2009], and approaching the Ω(√n) lower bound [Klauck, CCC 2003].
             Moreover, we show that (under SETH) one can apply the O(√n) BQP communication
        protocol for Set-Disjointness to prove near-optimal hardness of approximation for Max-IP
        with vectors in {−1, 1}^d. This answers a question from [Abboud et al., FOCS 2017] in the
        affirmative.


1     Introduction
Maximum Inner Product Search is a fundamental similarity search problem in which one wants to maintain
a collection of vectors S, and answer queries of the form: given a new vector q, find the vector in S
most correlated with q (or an approximation to it). This problem is closely related to another
fundamental problem, nearest neighbor search, in which one needs to maintain a collection of
points, and find the nearest neighbor for each query point (or an approximation to it).
    In this paper we consider a natural offline version of Maximum Inner Product Search, in which one
is only required to compute the maximum correlation between S and all queries, and all queries are given
in advance.1

Definition 1.1 (Bichromatic Maximum Inner Product (Max-IP)). For n, d ∈ N, the Max-IPn,d problem is
    1 Clearly, this offline version is weaker than the online similarity search counterpart, so lower bounds for it automatically
imply lower bounds for its online version.



defined as: given two sets A, B, each consisting of n vectors from {0, 1}^d, compute

                                             OPT(A, B) := max_{a∈A, b∈B} a · b.

   We use Z-Max-IPn,d (resp. R-Max-IPn,d ) to denote the same problem, but with A, B being sets of vectors
from Z^d (resp. R^d).

1.1     Motivation and background
1.1.1    Hardness of approximate Max-IP
A natural brute-force algorithm solves Max-IP in O(n2 · d) time. Assuming SETH2 , there is no n2−Ω(1) -
time algorithm for Max-IPn,d when d = ω(log n) [73].
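The O(n^2 · d) brute-force algorithm just mentioned can be written in a few lines (a minimal Python sketch; the function name is ours):

```python
from itertools import product

def max_ip(A, B):
    """Exact Max-IP by brute force: maximize a . b over a in A, b in B, in O(n^2 * d) time."""
    return max(sum(x * y for x, y in zip(a, b)) for a, b in product(A, B))

A = [(1, 1, 0), (0, 1, 1)]
B = [(1, 1, 1), (1, 0, 0)]
print(max_ip(A, B))  # 2, attained e.g. by (1,1,0) . (1,1,1)
```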
    Despite Max-IP being one of the most central problems in similarity search, with numerous applica-
tions [48, 43, 15, 64, 65, 68, 17, 16, 18, 60, 69, 71, 14, 50, 11, 70, 34, 33], before the recent breakthrough
by Abboud, Rubinstein and Williams [5] it was unclear whether there could even be a near-linear-time
1.1-approximation algorithm.3
    In [5], a framework for proving inapproximability results for problems in P (the distributed PCP
framework) is established, from which the following theorem follows:

Theorem 1.2 (Abboud, Rubinstein, Williams 2017). Assuming SETH, no n2−Ω(1) -time algorithm for
Max-IP_{n,n^{o(1)}} can achieve a multiplicative 2^{(log n)^{1−o(1)}}-approximation.

     Theorem 1.2 is an exciting development for hardness of approximation in P, implying important
inapproximability results for a host of problems including Bichromatic LCS Closest Pair Over Permuta-
tions, Approximate Regular Expression Matching, and Diameter in Product Metrics [5]. However, we
still do not have a complete understanding of the approximation hardness of Max-IP. For instance,
consider the following two concrete questions:

Question 1. Is there a multiplicative (log n)-approximation n2−Ω(1) -time algorithm for Max-IPn,log2 n ?
What about a multiplicative 2-approximation algorithm for Max-IPn,log2 n ?

Question 2. Is there an additive (d/ log n)-approximation n2−Ω(1) -time algorithm for Max-IPn,d ?

    We note that the lower bound from [5] cannot answer Question 1. Tracing the details of their proofs,
one can see that it only shows approximation hardness for dimension d = log^{ω(1)} n. Question 2, concerning
additive approximation, is not addressed by [5] at all. Given the importance of Max-IP, it is natural to ask:

         For what ratios r do n2−Ω(1) -time r-approximation algorithms exist for Max-IP?

    Does the best-possible approximation ratio (in n2−Ω(1) time) relate to the dimensionality, and if so, in
what way?
   2 SETH (Strong Exponential Time Hypothesis) states that for every ε > 0 there is a k such that k-SAT cannot be solved in
O((2 − ε)^n) time [47].
   3 See [5] for a thorough discussion of the state of affairs on hardness of approximation in P before their work.




    In an important recent work, Rubinstein [67] improved the distributed PCP construction in a crucial
way, yielding more refined lower bounds on approximate Max-IP. He then used these refined lower
bounds to establish the SETH-hardness of (1 + o(1))-approximate nearest neighbor search.
    Building on Rubinstein’s technique, in this paper we provide full characterizations, determining the
essentially optimal multiplicative and additive approximations to Max-IP, under SETH.


1.1.2    Hardness of exact Z-Max-IP

Recall that from [73], there is no n2−Ω(1) -time algorithm for exact Boolean Max-IPn,ω(log n) . Since in
real-life applications of similarity search, one often deals with real-valued data instead of just Boolean
data, it is natural to ask about Z-Max-IP (which is certainly a special case of R-Max-IP): what is the
maximum d such that Z-Max-IPn,d can be solved exactly in n2−Ω(1) time?
    Besides being interesting in its own right, Z-Max-IP also has reductions to ℓ2-Furthest Pair and to
Bichromatic ℓ2-Closest Pair. Hence, lower bounds for Z-Max-IP imply lower bounds for these two
famous problems in computational geometry (see [75] for a discussion of this topic).
    Prior to our work, it was implicitly shown in [75] that:

Theorem 1.3 ([75]). Assuming SETH, there is no n2−Ω(1) -time algorithm for Z-Max-IPn,ω((log log n)2 )
with vectors of O(log n)-bit entries.

     However, the best known algorithm for Z-Max-IP runs in n2−Θ(1/d) time [58, 10, 78],4 hence there is
still a gap between the lower bound and the best known upper bounds. To confirm that these algorithms are
in fact optimal, we would like to prove a lower bound in ω(1) dimensions.
     In this paper, we significantly strengthen the previous lower bound from ω((log log n)^2) dimensions
to 2^{O(log* n)} dimensions. (2^{O(log* n)} is an extremely slowly growing function. See the Preliminaries for its
formal definition.)
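To see just how slowly this bound grows, here is a small Python sketch (names ours) of the iterated logarithm log* n, i. e., the number of times log2 must be applied before the value drops to at most 1:

```python
import math

def log_star(n):
    """Iterated logarithm: how many times log2 is applied until the value is <= 1."""
    count = 0
    while n > 1:
        n = math.log2(n)
        count += 1
    return count

print(log_star(16))          # 3: 16 -> 4 -> 2 -> 1
print(log_star(2 ** 65536))  # 5: even this astronomically large input gives a tiny value
```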


1.1.3    Fine-grained complexity and communication complexity
One intriguing aspect of the distributed PCP framework is that it makes use of the Õ(√n) MA commu-
nication protocol for Set-Disjointness [1]. Several follow-up works [51, 67] explored this connection
further, and settled the hardness of approximation for several fundamental problems (under SETH).
    Given the success of the interplay between these two seemingly unrelated fields, it is natural to seek
more results from it. In particular, it was asked in [5] whether the O(√n) BQP communication protocol for
Set-Disjointness can be utilized.
    In this paper, we answer the question affirmatively by showing that the BQP communication protocol
implies hardness of approximation for {−1, 1}-Max-IP.5 Moreover, we also establish a connection
between Z-Max-IP lower bounds and NP · UPP communication protocols for Set-Disjointness, which
suggests a new perspective on our results on Z-Max-IP.
  4 [10, 78] are for ℓ2-Furthest Pair or Bichromatic ℓ2-Closest Pair. They also work for Z-Max-IP as there are reductions from
Z-Max-IP to these two problems, see [75] or Lemma 4.6 and Lemma 4.7.
  5 That is, Max-IP with sets A and B each consisting of n vectors from {−1, 1}^d.




1.2     Hypothesis assumed in this paper
We use OVn,d to denote the Orthogonal Vectors problem: given two sets of vectors A, B each consisting
of n vectors from {0, 1}d , determine whether there is an a ∈ A and b ∈ B such that a · b = 0.6 Similarly,
we use Z-OVn,d to denote the same problem except that A and B consist of vectors from Zd (this problem
is also called Hopcroft’s problem).
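For reference, OV itself has a trivial O(n^2 · d)-time algorithm, which the conjecture below asserts is essentially optimal for d = c log n. A minimal Python sketch (names ours):

```python
from itertools import product

def has_orthogonal_pair(A, B):
    """Brute-force OV: decide whether some a in A and b in B satisfy a . b = 0."""
    return any(all(x * y == 0 for x, y in zip(a, b)) for a, b in product(A, B))

A = [(1, 0, 1), (0, 1, 1)]
B = [(0, 1, 0), (1, 1, 0)]
print(has_orthogonal_pair(A, B))  # True: (1,0,1) . (0,1,0) = 0
```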
    All our results are based on the following widely used conjecture about OV:

Conjecture 1.4 (Orthogonal Vectors Conjecture (OVC) [73, 8]). For every ε > 0, there exists a c ≥ 1
such that OVn,d requires n2−ε time when d = c log n.

    OVC is a plausible conjecture as it is implied by the popular Strong Exponential Time Hypothesis [47,
29] on the time complexity of solving k-SAT [73, 76].


1.3     Our results on {0, 1}-Max-IP
The first main result of our paper characterizes when there is a truly subquadratic-time (n2−Ω(1) -time, for
some universal constant hidden in the big-Ω) multiplicative t-approximation algorithm for Max-IP, and
characterizes the best-possible additive approximations as well. Let P be an optimization problem (either
a minimization or a maximization problem). We begin with formal definitions of these two standard types
of approximation:

      • We say A is a multiplicative t-approximation algorithm for P if, for all instances I, A outputs
        a value ÕPT(I) such that ÕPT(I) ∈ [OPT(I), OPT(I) · t], where OPT(I) is the answer to P on
        instance I.

      • We say A is an additive t-approximation algorithm for P if, for all instances I, A outputs a value
        ÕPT(I) such that |ÕPT(I) − OPT(I)| ≤ t.

      • To avoid ambiguity, we call an algorithm computing OPT(I) exactly an exact algorithm for P.
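In code, these two guarantees amount to simple predicates (an illustrative Python sketch; names ours):

```python
def is_multiplicative_t_approx(approx, opt, t):
    """Multiplicative guarantee as defined above: approx lies in [opt, opt * t]."""
    return opt <= approx <= opt * t

def is_additive_t_approx(approx, opt, t):
    """Additive guarantee: |approx - opt| <= t."""
    return abs(approx - opt) <= t

print(is_multiplicative_t_approx(12, 10, 2))  # True: 12 is in [10, 20]
print(is_additive_t_approx(12, 10, 1))        # False: |12 - 10| > 1
```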



1.3.1    Characterizations of hardness of multiplicative approximations to Max-IP

In the multiplicative case, our characterization (formally stated below) basically says that there is a
multiplicative t-approximation n2−Ω(1) -time algorithm for Max-IPn,d if and only if t = (d/ log n)Ω(1) .
Note that in the following theorem we require d = ω(log n), since in the case of d = O(log n), there are
n2−ε -time algorithms for exact Max-IPn,d [14, 13].

Theorem 1.5. Let ω(log n) < d < no(1) and t ≥ 2.7 The following holds.
  6 Here we use the bichromatic version of OV instead of the monochromatic one for convenience, as they are equivalent.
  7 Note that t and d are both functions of n; for simplicity, we assume throughout this paper that they are computable in no(1) time.




   1. There is an n2−Ω(1) -time multiplicative t-approximation algorithm for Max-IPn,d if

                                                         t = (d/ log n)Ω(1) ,

       and under SETH (or OVC), there is no n2−Ω(1) -time multiplicative t-approximation algorithm for
       Max-IPn,d if
                                            t = (d/ log n)o(1) .

   2. Moreover, let
                                        ε = min{ log t / log(d/ log n), 1 }.

      There are deterministic multiplicative t-approximation algorithms for Max-IPn,d running in time
                                        O(n^{2−Ω(ε)} · polylog(n)).

Remark 1.6. We remark that our algorithm still gets a non-trivial speed-up over the brute force algorithm
as long as ε = ω(log log n/ log n).

Comparison with [11]. We remark that in [11], subquadratic-time algorithms for multiplicative t-
approximation to Max-IP are given when t = d^{Ω(1)}.8 Our algorithms achieve a slightly better approximation
ratio t = (d/ log n)^{Ω(1)}, which matches the conditional lower bound. Moreover, our algorithms work in
a more general setting not handled by [11]: the algorithms in Theorem 1.5 in fact work when the sets
consist of non-negative reals (i. e., for R+ -Max-IP).
Corollary 1.7. Assuming ω(log n) < d < no(1) and letting
                                        ε = min{ log t / log(d/ log n), 1 },

there is a deterministic multiplicative t-approximation algorithm for R+ -Max-IPn,d running in time
                                        O(n^{2−Ω(ε)} · polylog(n)).

    The lower bound of Theorem 1.5 is a direct corollary of the new improved MA protocols for Set-
Disjointness from [67], which are based on Algebraic Geometry codes. Together with the framework
of [5], that MA protocol implies a reduction from OV to approximate Max-IP.
    Our upper bounds are applications of the polynomial method [74, 6]: defining appropriate sparse
polynomials for approximating Max-IP on small groups of vectors, and using fast matrix multiplication
to speed up the evaluation of these polynomials on many pairs of points.
    Via the known reduction from Max-IP to LCS-Pair in [5], we also obtain a more refined lower bound
for approximating the LCS Closest Pair problem (defined below).
   8 They in fact consider the data structure version of a more fine-grained version of this problem, where there is an additional
parameter s such that one wants to decide whether OPT(A, B) ≤ t · s or ≥ s.



Definition 1.8 (LCS Closest Pair). The LCS-Closest-Pairn,d problem is: given two sets A, B each consist-
ing of n strings from Σd (Σ is a finite alphabet), determine
                                                  max LCS(a, b),
                                                a∈A,b∈B

where LCS(a, b) is the length of the longest common subsequence of strings a and b.
Corollary 1.9 (Improved Inapproximability for LCS-Closest-Pair). Assuming SETH (or OVC), for
every t ≥ 2, computing a multiplicative t-approximation to LCS-Closest-Pairn,d requires n2−o(1) time, if
d = t^{ω(1)} · log^5 n.

1.3.2    Characterizations of hardness of additive approximations to Max-IP
Our characterization for additive approximations to Max-IP says that there is an additive t-approximation
n2−Ω(1) -time algorithm for Max-IPn,d if and only if t = Ω(d).
Theorem 1.10. Letting ω(log n) < d < no(1) and 0 ≤ t ≤ d, the following holds:
   1. There is an n2−Ω(1) -time additive t-approximation algorithm for Max-IPn,d if
                                                          t = Ω(d),
        and under SETH (or OVC), there is no n2−Ω(1) -time additive t-approximation algorithm for
        Max-IPn,d if
                                              t = o(d).
   2. Moreover, letting ε = t/d, there is an
                                        O(n^{2−Ω(ε^{1/3}/ log ε^{−1})})-
      time additive t-approximation randomized algorithm for Max-IPn,d when ε = ω(log^6 log n/ log^3 n).
   The lower bound above is already established in [67], while the upper bound works by reducing the
problem to the d = O(log n) case via random-sampling coordinates, and solving the reduced problem via
known methods [14, 13].
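The random-sampling step can be illustrated as follows (a simplified Python sketch, not the paper's exact algorithm; names ours): sample k coordinates uniformly and rescale, so the sampled inner product concentrates around a · b within additive error roughly d/√k, by a Chernoff bound.

```python
import random

def sampled_inner_product(a, b, k, rng):
    """Estimate a . b by sampling k coordinates uniformly with replacement and rescaling.
    The additive error is O(d / sqrt(k)) with high probability (Chernoff bound)."""
    d = len(a)
    return d * sum(a[i] * b[i] for i in (rng.randrange(d) for _ in range(k))) / k

rng = random.Random(0)
d = 10_000
a = [rng.randint(0, 1) for _ in range(d)]
b = [rng.randint(0, 1) for _ in range(d)]
exact = sum(x * y for x, y in zip(a, b))
estimate = sampled_inner_product(a, b, 4000, rng)
print(abs(estimate - exact) <= 0.1 * d)  # True with overwhelming probability
```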
Remark 1.11. We remark that the lower bounds for approximate Max-IP are direct corollaries of
the new MA protocols for Set-Disjointness in [67]. Our main contribution is the complementary upper
bounds, showing that these lower bounds are indeed tight assuming SETH.

All-Pair-Max-IP. Finally, we remark that our algorithms (with slight adaptations) also work for the
following stronger problem9 : All-Pair-Max-IPn,d , in which we are given two sets A and B each consisting
of n vectors from {0, 1}d , and for each x ∈ A we must compute
                                   OPT(x, B) := max_{y∈B} x · y.

We say an algorithm is a multiplicative (resp. additive) t-approximation algorithm for All-Pair-Max-IP
if it computes a corresponding approximate answer for each of the n values OPT(x, B).
  9 Since All-Pair-Max-IP is stronger than Max-IP, lower bounds for Max-IP automatically apply to All-Pair-Max-IP.




Corollary 1.12. Suppose ω(log n) < d < no(1) , and let
                εM := min{ log t / log(d/ log n), 1 }    and    εA := min(t, d)/d.

There is an n^{2−Ω(εM)} · polylog(n)-time multiplicative t-approximation algorithm and an n^{2−Ω(εA^{1/3}/ log εA^{−1})}-
time additive t-approximation algorithm for All-Pair-Max-IPn,d , when εA = ω(log^6 log n/ log^3 n).

1.4     BQP communication protocols and approximate {−1, 1}-Max-IP
Making use of the O(√n)-degree approximate polynomial for OR [27, 77], we also give a completely
different proof of the hardness of multiplicative approximation to {−1, 1}-Max-IP. Note that the answer
to {−1, 1}-Max-IP can be negative, so we consider the variant of maximizing the unsigned inner product
here; that is, we want to approximate max_{a∈A,b∈B} |a · b| (this variant is also studied in [11]). The lower
bound from this approach is inferior to Theorem 1.5: in particular, it does not yield a characterization.10
    It was asked in [5] whether one can make use of the O(√n) BQP communication protocol for
Set-Disjointness [28] to prove conditional lower bounds. Indeed, that quantum communication protocol
is based on the O(√n)-time quantum query algorithm for OR (Grover’s algorithm [42]), which induces
the needed approximate polynomial for OR. Hence, the following theorem in some sense answers their
question in the affirmative.

Theorem 1.13 (Informal). Assuming SETH (or OVC), no n2−Ω(1) -time algorithm for

                                                 {−1, 1}-Max-IPn,no(1)

can achieve multiplicative no(1) -approximation.

      The full statement can be found in Theorem 7.1 and Theorem 7.2.

1.5     Our results on Z-Max-IP

1.5.1    Hardness of exact Z-Max-IP in dimension 2^{O(log* n)}
Now we turn to our results on Z-Max-IP. We show that Z-Max-IP is hard to solve in n^{2−Ω(1)} time, even
with 2^{O(log* n)}-dimensional vectors.

Theorem 1.14. Assuming SETH (or OVC), there is a constant c such that any exact algorithm for
Z-Max-IPn,d in dimension d = c^{log* n} requires n^{2−o(1)} time, with vectors of O(log n)-bit entries.

    As direct corollaries of the above theorem, using reductions implicit in [75], we also conclude
hardness for ℓ2-Furthest Pair and Bichromatic ℓ2-Closest Pair under SETH (or OVC) in dimension
2^{O(log* n)}.
  10 We also remark that, in particular, the hardness of approximate {−1, 1}-Max-IP follows from the hardness of approximate
{0, 1}-Max-IP. We include the proof via BQP communication protocols here as it is an interesting alternative proof.



Theorem 1.15 (Hardness of ℓ2-Furthest Pair in Dimension c^{log* n}). Assuming SETH (or OVC), there is a
constant c such that ℓ2-Furthest Pair in dimension c^{log* n} requires n^{2−o(1)} time, with vectors of O(log n)-bit
entries.

Theorem 1.16 (Hardness of Bichromatic ℓ2-Closest Pair in Dimension c^{log* n}). Assuming SETH (or
OVC), there is a constant c such that Bichromatic ℓ2-Closest Pair in dimension c^{log* n} requires n^{2−o(1)}
time, with vectors of O(log n)-bit entries.

    The above lower bounds on ℓ2-Furthest Pair and Bichromatic ℓ2-Closest Pair are in sharp contrast
with the case of ℓ2-Closest Pair, which can be solved in 2^{O(d)} · n log^{O(1)} n time [23, 53, 38].

1.5.2   Improved dimensionality reduction for OV and Hopcroft’s problem
Our hardness of Z-Max-IP is established by a reduction from Hopcroft’s problem, whose hardness is in
turn derived from the following significantly improved dimensionality reduction for OV.

Lemma 1.17 (Improved Dimensionality Reduction for OV). Let 1 ≤ ℓ ≤ d. There is an
                              O(n · ℓ^{O(6^{log* d} · (d/ℓ))} · poly(d))-time

reduction from OVn,d to ℓ^{O(6^{log* d} · (d/ℓ))} instances of Z-OVn,ℓ+1 , with vectors of entries with bit-length
O(d/ℓ · log ℓ · 6^{log* d}).

Comparison with [75]. Compared to the old construction in [75], our reduction here is more efficient
when ℓ is much smaller than d (which is the case we care about). That is, in [75], OVn,d can be reduced
to d^{d/ℓ} instances of Z-OVn,ℓ+1 , while our improved reduction yields (ℓ^{6^{log* d}})^{d/ℓ} instances. So, for example,
when ℓ = 7^{log* d}, the old reduction yields d^{d/7^{log* d}} = n^{ω(1)} instances (recall that d = c log n for an arbitrary
constant c), while our improved one yields only n^{o(1)} instances, each with 2^{O(log* n)} dimensions.
     From Lemma 1.17, the following theorem follows in the same way as in [75].
Theorem 1.18 (Hardness of Hopcroft’s Problem in Dimension c^{log* n}). Assuming SETH (or OVC), there
is a constant c such that Z-OVn,c^{log* n} with vectors of O(log n)-bit entries requires n^{2−o(1)} time.

1.5.3   Connection between Z-Max-IP lower bounds and NP · UPP communication protocols
We also show a new connection between Z-Max-IP and a special type of communication protocol. Let us
first recall the Set-Disjointness problem.

Definition 1.19 (Set-Disjointness). Let n ∈ N. In Set-Disjointness (DISJn ), Alice holds a vector X ∈
{0, 1}n , Bob holds a vector Y ∈ {0, 1}n , and they want to determine whether X ·Y = 0.

    In [5], the hardness of approximate Max-IP is established via a connection to MA communication
protocols (in particular, an MA communication protocol with small communication complexity for
Set-Disjointness). Our lower bound for (exact) Z-Max-IP can also be connected to similar NP · UPP
protocols (note that MA = NP · promiseBPP).


    Formally, we define NP · UPP protocols as follows.11
Definition 1.20. For a problem Π with two n-bit strings x, y as inputs (Alice holds x and Bob holds y),
we say a communication protocol is an (m, ℓ)-efficient NP · UPP communication protocol if the following
holds:
    • There are three parties Alice, Bob and Merlin in the protocol.
    • Merlin sends Alice and Bob an advice string z of length m, which is a function of x and y.
    • Given y and z, Bob sends Alice ℓ bits, and Alice decides whether to accept.12 They have an unlimited
      supply of private random coins (not public, which is important) during their conversation. The
      following conditions hold:

           – If Π(x, y) = 1, then there is an advice z from Merlin such that Alice accepts with probability
             ≥ 1/2.
           – Otherwise, for all possible advice strings from Merlin, Alice accepts with probability < 1/2.

    Moreover, we say the protocol is (m, ℓ)-computational-efficient, if in addition the probability distri-
butions of both Alice’s and Bob’s behavior can be computed in poly(n) time given their input and the
advice.
    Our new reduction from OV to Max-IP actually implies an efficient NP · UPP protocol for Set-
Disjointness.
Theorem 1.21. For every 1 ≤ α ≤ n, there is an
                        (α · 6^{log* n} · (n/2^α), O(α))-computational-efficient

NP · UPP communication protocol for DISJn .
    For example, when α = 3 log* n, Theorem 1.21 implies there is an (o(n), O(log* n))-computational-
efficient NP · UPP communication protocol for DISJn . Moreover, we show that if the protocol of
Theorem 1.21 can be improved a little (for example, by removing the 6^{log* n} term), we would obtain the
desired hardness for Z-Max-IP in dimension ω(1).
Theorem 1.22. Assuming SETH (or OVC), if there is an increasing and unbounded function f such that
for every 1 ≤ α ≤ n, there is an
                                         (n/ f (α), α) -computational-efficient
NP · UPP communication protocol for DISJn , then Z-Max-IPn,ω(1) requires n2−o(1) time with vectors of
polylog(n)-bit entries. The same holds for ℓ2-Furthest Pair and Bichromatic ℓ2-Closest Pair.
  11 Here are some comments on the name NP · UPP. Roughly speaking, a UPP protocol is a private-coin randomized
communication protocol in which Alice and Bob accept with probability > 1/2 if and only if the answer is yes. For a
communication protocol class D, NP · D denotes a new class of protocols resembling a Merlin-Arthur game: a prover (who
knows the inputs of Alice and Bob) first sends a proof to both Alice and Bob; then Alice and Bob run a prescribed D protocol
on their inputs and the proof to decide whether they accept the proof or not. The protocol needs to satisfy soundness and
completeness conditions similar to those of an ordinary MA protocol.
  12 Our simplification here is justified by the fact that one-way UPP communication protocols are equivalent to their seemingly
more powerful two-way analogs [63].



1.6    Improved MA protocols for Set-Disjointness
Finally, we also obtain a new MA protocol for Set-Disjointness, which improves on the previous
O(√n · log n) protocol in [1] and is closer to the Ω(√n) lower bound of [54]. Like the protocol in [1], our
new protocol also works for the following slightly harder problem, Inner Product.

Definition 1.23 (Inner Product). Let n ∈ N. In Inner Product (IPn ), Alice holds a vector X ∈ {0, 1}n , Bob
holds a vector Y ∈ {0, 1}n , and they want to compute X ·Y .

Theorem 1.24. There is an MA protocol for DISJ_n and IP_n with communication complexity

                                    O(√(n log n log log n)).

   In [67], the author asked whether the MA communication complexity of DISJ_n (IP_n) is Θ(√n) or
Θ(√n · log n). Our result makes progress on that question by showing that the true complexity lies between
Ω(√n) and O(√(n log n log log n)).


1.7    Intuition for dimensionality self-reduction for OV
The 2^{O(log* n)} factor in Lemma 1.17 is not common in theoretical computer science,13 and our new
reduction for OV is considerably more complicated than the polynomial-based construction from [75].
Hence, it is worth discussing the intuition behind Lemma 1.17, and the reason why we get a factor of
2^{O(log* n)}.
A direct approach based on the Chinese Remainder Theorem. We first discuss a direct reduction
based on the Chinese Remainder Theorem (CRT) (see Theorem 2.5 for a formal statement). The CRT says
that given a collection of distinct primes q_1, . . . , q_b and a collection of integers r_1, . . . , r_b, there exists a
unique integer t = CRR({r_j}; {q_j}) such that t ≡ r_j (mod q_j) for each j ∈ [b] and 0 ≤ t < ∏_{j=1}^b q_j (CRR
stands for Chinese Remainder Representation).
     Now, let b, ℓ ∈ N. Suppose we would like to have a dimensionality reduction ϕ from {0,1}^{b·ℓ} to
Z^ℓ. We can partition an input x ∈ {0,1}^{b·ℓ} into ℓ blocks, each of length b, and represent each block via
the CRT: that is, for a block z ∈ {0,1}^b, we map it into a single integer ϕ_block(z) := CRR({z_j}; {q_j}), and
the concatenation of ϕ_block over all blocks of x is ϕ(x) ∈ Z^ℓ.
     The key idea here is that, for z, z' ∈ {0,1}^b, ϕ_block(z) · ϕ_block(z') (mod q_j) is simply z_j · z'_j. That is,
the multiplication of the two integers ϕ_block(z) and ϕ_block(z') simulates the coordinate-wise multiplication
of the two vectors z and z'!
     Therefore, if we make all primes q_j larger than ℓ, we can in fact determine x · y from ϕ(x) · ϕ(y), by
looking at ϕ(x) · ϕ(y) (mod q_j) for each j. That is,

                        x · y = 0 ⇔ ϕ(x) · ϕ(y) ≡ 0 (mod q_j) for every j ∈ [b].
  13 Other examples include an O(2^{O(log* n)} · n^{4/3})-time algorithm for Z-OV_{n,3} [59], O(2^{O(log* n)} · n log n)-time algorithms (Fürer's
algorithm and its modifications) for Fast Integer Multiplication [39, 36, 44], and an old O(n^{d/2} · 2^{O(log* n)})-time algorithm for
Klee's measure problem [30].


                                                      Lijie Chen

   Hence, let V be the set of all integers 0 ≤ v ≤ ℓ · (∏_{j=1}^b q_j)^2 such that v ≡ 0 (mod q_j) for every j ∈ [b]. We
have
                                        x · y = 0 ⇔ ϕ(x) · ϕ(y) ∈ V.

     The reduction is completed by constructing a Z-OV instance for each v ∈ V: we append corresponding
values to make ϕ_A(x) = [ϕ(x), −1] and ϕ_B(y) = [ϕ(y), v] (this step is from [75]).
     Note that a nice property of ϕ is that each ϕ(x)_i only depends on the i-th block of x, and the mapping
is the same on each block (ϕ_block); we call this the block mapping property.
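The direct reduction above can be sketched in a few lines of Python (an illustrative toy we added, not the paper's construction; the block length b, block count ℓ and the primes are small values of our choosing):

```python
from functools import reduce

# Toy parameters: b = 3 bits per block, ell = 4 blocks, and b distinct
# primes, each larger than ell so the per-prime inner products (at most
# ell) never wrap around.
b, ell = 3, 4
qs = [5, 7, 11]
Q = reduce(lambda a, c: a * c, qs)  # product of the primes

def crr(rs):
    """Unique 0 <= t < Q with t = r_j (mod q_j) for all j
    (brute force, fine for toy moduli)."""
    return next(t for t in range(Q)
                if all(t % q == r for r, q in zip(rs, qs)))

def phi(x):
    """phi: {0,1}^(b*ell) -> Z^ell, encoding each length-b block by CRR."""
    return [crr(x[i * b:(i + 1) * b]) for i in range(ell)]

def recover_ip(x, y):
    """Recover x . y from phi(x) . phi(y): the residue mod q_j is the
    inner product of the j-th bits across all blocks."""
    D = sum(u * v for u, v in zip(phi(x), phi(y)))
    return sum(D % q for q in qs)

x = [1, 0, 1,  0, 1, 0,  1, 1, 0,  0, 0, 1]
y = [1, 1, 0,  0, 1, 1,  0, 1, 0,  1, 0, 1]
true_ip = sum(a * c for a, c in zip(x, y))
assert recover_ip(x, y) == true_ip
assert (recover_ip(x, y) == 0) == (true_ip == 0)
```

In particular, x · y = 0 holds exactly when every residue D mod q_j is zero, which is the equivalence displayed above.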
Analysis of the direct reduction. To continue building intuition, let us analyze the above reduction. The
size of V is the number of Z-OV_{n,ℓ+1} instances we create, and |V| ≥ ∏_{j=1}^b q_j. These primes q_j have to
be all distinct, and it follows that ∏_{j=1}^b q_j is b^{Θ(b)}. Since we want to create at most n^{o(1)} instances (or
n^ε for arbitrarily small ε), we need to set b ≤ log n / log log n. Moreover, to base our hardness on OVC,
which deals with c log n-dimensional vectors, we need to set b · ℓ = d = c · log n for an arbitrary constant
c. Therefore, we must have ℓ ≥ log log n, and the above reduction only obtains the same hardness result
as [75].
Key observation: “Most space modulo q_j” is actually wasted. To improve the above reduction, we
need to make |V| smaller. Our key observation about ϕ is that the primes q_j are mostly larger
than b ≫ ℓ, yet ϕ(x) · ϕ(y) mod q_j ∈ {0, 1, . . . , ℓ} for all these q_j. Hence, “most space modulo q_j” is
actually wasted.
Make more “efficient” use of the “space”: recursive reduction. Based on the previous observation,
we want to use the “space modulo q_j” more efficiently. It is natural to consider a recursive reduction.
We will require all our primes q_j to be larger than b. Let b_m be a very small integer compared to b, and let
ψ : {0,1}^{b_m·ℓ} → Z^ℓ, with a set V_ψ and a block mapping ψ_block, be a similar reduction on much smaller
inputs: for x, y ∈ {0,1}^{b_m·ℓ}, x · y = 0 ⇔ ψ(x) · ψ(y) ∈ V_ψ. We also require here that ψ(x) · ψ(y) ≤ b for
every x and y.
     For an input x ∈ {0,1}^{b·ℓ} and a block z ∈ {0,1}^b of x, our key idea is to partition z again into b/b_m
“micro” blocks, each of size b_m. For a block z in x, let z^1, . . . , z^{b/b_m} be its b/b_m micro blocks. We map
z into an integer ϕ_block(z) := CRR({ψ_block(z^j)}_{j=1}^{b/b_m}; {q_j}_{j=1}^{b/b_m}).
     Now, given two blocks z, w ∈ {0,1}^b, we can see that

                  ϕ_block(z) · ϕ_block(w) ≡ ψ_block(z^j) · ψ_block(w^j)  (mod q_j).

     For two inputs x, y ∈ {0,1}^{b·ℓ} and (i, j) ∈ [ℓ] × [b/b_m], we use x_{i,j} ∈ {0,1}^{b_m} (y_{i,j}) to denote the j-th
micro block in the i-th block of x (y, respectively). We also define x[j] ∈ {0,1}^{b_m·ℓ} as the concatenation of
x_{1,j}, x_{2,j}, . . . , x_{ℓ,j} (y[j] is defined similarly). Then we have

                  ϕ(x) · ϕ(y) ≡ ∑_{i=1}^ℓ ψ_block(x_{i,j}) · ψ_block(y_{i,j})  (mod q_j)
                              = ψ(x[j]) · ψ(y[j]).              (ψ(x[j]) · ψ(y[j]) ≤ b < q_j)

    Hence, for every j ∈ [b/b_m], we can determine whether x[j] · y[j] = 0 from whether ϕ(x) · ϕ(y)
(mod q_j) ∈ V_ψ, and therefore also determine whether x · y = 0 from ϕ(x) · ϕ(y).
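To make the recursion concrete, here is a minimal two-level instantiation in Python (an illustration we added, with toy sizes and primes of our own choosing; the actual construction balances parameters far more carefully so that ψ(x) · ψ(y) ≤ b, which we do not attempt here):

```python
# Two-level CRR reduction sketch. The inner map psi_block handles micro
# blocks of b_m bits; the outer map phi_block packs b/b_m micro-images
# into one integer. All parameters are toy values chosen for illustration.
b_m, ell = 2, 2
b = 4                              # so b / b_m = 2 micro blocks per block
micro_primes = [3, 5]              # b_m primes, each > ell
P = 3 * 5
outer_primes = [397, 401]          # b/b_m primes, each > ell * (P - 1)**2

def crr(rs, qs):
    """Incremental CRT: the unique t in [0, prod qs) with t = r (mod q)."""
    t, M = 0, 1
    for r, q in zip(rs, qs):
        t += M * (((r - t) * pow(M, -1, q)) % q)
        M *= q
    return t

def psi_block(z):                  # {0,1}^b_m -> Z
    return crr(z, micro_primes)

def phi_block(z):                  # {0,1}^b -> Z
    micro = [psi_block(z[k * b_m:(k + 1) * b_m]) for k in range(b // b_m)]
    return crr(micro, outer_primes)

def phi(x):                        # {0,1}^(b*ell) -> Z^ell
    return [phi_block(x[i * b:(i + 1) * b]) for i in range(ell)]

x = [1, 0, 0, 1,  0, 1, 1, 0]
y = [1, 1, 0, 1,  1, 1, 0, 0]
D = sum(u * v for u, v in zip(phi(x), phi(y)))
# Decode twice: mod q_j isolates psi(x[j]) . psi(y[j]); mod p_m then
# isolates the bit-level inner products, which sum to x . y.
recovered = sum((D % q) % p for q in outer_primes for p in micro_primes)
assert recovered == sum(a * c for a, c in zip(x, y))
```

The primes were picked so that neither decoding level wraps around, mirroring the conditions ψ(x) · ψ(y) < q_j and (per-prime inner product) ≤ ℓ < p_m in the text.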


    We can now observe that |V| ≤ b^{Θ(b/b_m)}, smaller than before; thus we get an improvement, depending
on how large b_m can be. Clearly, the reduction ψ can itself be constructed from even smaller reductions,
and after recursing Θ(log* n) times, we can switch to the direct construction discussed before. By a
straightforward (but tedious) calculation, we can derive Lemma 1.17.
High-level explanation of the 2^{O(log* n)} factor. Ideally, we want to have a reduction from OV to Z-OV
with only ℓ^{O(b)} instances; in other words, we want |V| = ℓ^{O(b)}. The reason we need to pay an extra
2^{O(log* n)} factor in the exponent is as follows.
    In our reduction, |V| is at least ∏_{j=1}^{b/b_m} q_j, which is also the bound on each coordinate of the reduction:
ϕ(x)_i equals a CRR encoding of a vector with respect to {q_j}_{j=1}^{b/b_m}, whose value can be as large as
∏_{j=1}^{b/b_m} q_j − 1. That is, all we want is to control the upper bound on the coordinates of the reduction.
    Suppose we are constructing an “outer” reduction ϕ : {0,1}^{b·ℓ} → Z^ℓ from the “micro” reduction
ψ : {0,1}^{b_m·ℓ} → Z^ℓ with coordinate upper bound L_ψ (i.e., ψ(x)_i ≤ L_ψ), and let L_ψ = ℓ^{κ·b_m}. (That is, κ is the
extra factor compared to the ideal case.) Recall that we have to ensure q_j > ψ(x) · ψ(y) to make our
construction work, and therefore we have to set each q_j larger than L_ψ^2.
    Then the coordinate upper bound for ϕ becomes L_ϕ = ∏_{j=1}^{b/b_m} q_j ≥ (L_ψ)^{2·b/b_m} = ℓ^{2κ·b}. Therefore, we
can see that after one recursion, the “extra factor” κ at least doubles. Since our recursion proceeds in
Θ(log* n) rounds, we have to pay an extra 2^{O(log* n)} factor in the exponent.


1.8    Related work

SETH-based conditional lower bound. SETH is one of the most fruitful conjectures in Fine-
Grained Complexity. There are numerous conditional lower bounds based on it for problems in P
across different areas, including dynamic data structures [61, 7, 45, 55, 3, 46, 41], computational
geometry [24, 37, 75, 67], pattern matching [8, 22, 21, 25, 26], and graph algorithms [66, 40, 9, 56]. See [72]
for a recent survey on SETH-based lower bounds (and more).

Communication Complexity and conditional hardness. The connection between communication
protocols (in various models) for Set-Disjointness and SETH dates back at least to [62], in which it is
shown that a sublinear computationally efficient protocol for the 3-party Number-On-Forehead Set-
Disjointness problem would refute SETH. It is also worth mentioning that Abboud and Rubinstein's
result [4] builds on the Õ(log n) IP communication protocol for Set-Disjointness in [1]. Making use of the
IP communication protocol for low-space computation, [31] establishes an equivalence class for
LCS-Closest-Pair.
    In [32], Σ_2 communication protocols are utilized to show the subquadratic-time equivalence of
OV_{n,O(log n)}, Max-IP_{n,O(log n)}, Approximate Bichromatic Closest Pair and several other problems.

Distributed PCP. Using Algebraic Geometry codes (AG codes), [67] obtains a better MA protocol,
which in turn improves the efficiency of the previous distributed PCP construction of [5]. He then shows
the n^{2−o(1)}-time hardness for (1 + o(1))-approximation to Bichromatic Closest Pair and o(d)-additive
approximation to Max-IP_{n,d} with this new technique.
    [51] uses the Distributed PCP framework to derive inapproximability results for k-Dominating Set
under various assumptions. In particular, building on the techniques of [67], it is shown that under SETH,
k-Dominating Set has no (log n)^{1/poly(k,e(ε))}-approximation algorithm running in n^{k−ε} time.14
    [52] also utilizes AG codes and the polynomial method to show hardness results for Exact and Approximate
Monochromatic Closest Pair and Approximate Monochromatic Maximum Inner Product.
Hardness of approximation in P. Making use of Chebyshev embeddings, [11] prove a 2^{Ω(√(log n/log log n))}
inapproximability lower bound on {−1, 1}-Max-IP. [2] take an approach different from Distributed
PCP, and show that under certain complexity assumptions, LCS does not have a deterministic (1 + o(1))-
approximation in n^{2−ε} time. They also establish a connection with circuit lower bounds and show that the
existence of such a deterministic algorithm implies that E^NP does not have non-uniform linear-size Valiant
Series-Parallel circuits. In [4], this is improved: any constant-factor approximation deterministic
algorithm for LCS in n^{2−ε} time implies that E^NP does not have non-uniform linear-size NC^1 circuits.
See [5] for more related results on hardness of approximation in P.

Organization of the paper
In Section 2, we introduce the needed preliminaries for this paper. In Section 3, we prove our characteri-
zations for approximate Max-IP and other related results. In Section 4, we prove 2^{O(log* n)}-dimensional
hardness for Z-Max-IP and other related problems. In Section 5, we establish the connection between
NP · UPP communication protocols and SETH-based lower bounds for exact Z-Max-IP. In Section 6, we
present the O(√(n log n log log n)) MA protocol for Set-Disjointness.


2      Preliminaries
We begin by introducing some notation. For an integer d, we use [d] to denote the set of integers from 1
to d. For a vector u, we use ui to denote the i-th element of u.
    We use log(x) to denote the logarithm of x with respect to base 2 and ln(x) to denote the natural
logarithm of x.
    In our arguments, we use the iterated logarithm function log*(n), which is defined recursively as
follows:

                    log*(n) := 0 if n ≤ 1;   log*(n) := log*(log n) + 1 if n > 1.
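The definition transcribes directly into Python (a small sanity check we added, using base-2 logarithms per the convention above):

```python
import math

def log_star(n):
    """Iterated logarithm: how many times log must be applied until n <= 1."""
    if n <= 1:
        return 0
    return log_star(math.log2(n)) + 1

# log* grows extremely slowly:
assert log_star(2) == 1
assert log_star(16) == 3          # 16 -> 4 -> 2 -> 1
assert log_star(65536) == 4       # 65536 -> 16 -> 4 -> 2 -> 1
```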

2.1     Fast rectangular matrix multiplication
Similarly to previous algorithms based on the polynomial method, our algorithms make use of fast
rectangular matrix multiplication.

Theorem 2.1 ([57]). There is an N 2+o(1) -time algorithm for multiplying two matrices A and B with size
N × N α and N α × N, where α > 0.31389.

Theorem 2.2 ([35]). There is an N 2 · polylog(N)-time algorithm for multiplying two matrices A and B
with size N × N α and N α × N, where α > 0.172.
    14 Here e : R+ → N is some function.




2.2     Number theory
Here we recall some facts from number theory. In our reduction from OV to Z-OV, we will apply the
famous prime number theorem, which supplies a good estimate of the number of primes smaller than a
given number. See, e.g., [19] for a reference.

Theorem 2.3 (Prime Number Theorem). Let π(n) be the number of primes ≤ n. We have

                                   lim_{n→∞} π(n)/(n/ln n) = 1.

      From a simple calculation, we have the following lemma.

Lemma 2.4. There are at least 10n distinct primes in [n + 1, n²] for all sufficiently large n.

Proof. For sufficiently large n, by the prime number theorem, the number of primes in [n + 1, n²] is

                         π(n²) − π(n) ∼ n²/(2 ln n) − n/ln n ≫ 10n.
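This estimate is easy to confirm empirically with a sieve (a quick numeric check we added; n = 300 is an arbitrary test value):

```python
def primes_up_to(N):
    """Sieve of Eratosthenes; returns a boolean table is_prime[0..N]."""
    is_prime = [True] * (N + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, int(N ** 0.5) + 1):
        if is_prime[p]:
            for multiple in range(p * p, N + 1, p):
                is_prime[multiple] = False
    return is_prime

n = 300
table = primes_up_to(n * n)
count = sum(table[n + 1:])        # number of primes in [n+1, n^2]
assert count >= 10 * n            # Lemma 2.4 holds for this n
```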

      Next we recall the Chinese remainder theorem, and Chinese remainder representation.

Theorem 2.5. Given d pairwise co-prime integers q_1, q_2, . . . , q_d and d integers r_1, r_2, . . . , r_d, there is
exactly one integer 0 ≤ t < ∏_{i=1}^d q_i such that

                              t ≡ r_i  (mod q_i) for every i ∈ [d].

We call this t the Chinese remainder representation (or the CRR encoding) of the r_i (with respect to the
q_i). We also write
                                   t = CRR({r_i}; {q_i})
for convenience. We sometimes omit the sequence {q_i} for simplicity, when it is clear from the context.
    Moreover, t can be computed in time polynomial in the total number of bits of all the given integers.
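The constructive claim can be sketched as the standard incremental CRT computation (an illustration we added; each step lifts one modulus, so the arithmetic stays polynomial in the total bit length):

```python
def crr(rs, qs):
    """Chinese remainder representation: the unique t with
    0 <= t < prod(qs) and t = r_i (mod q_i) for every i.
    Built incrementally, lifting one modulus at a time."""
    t, M = 0, 1                        # invariant: t = r_i (mod q_i) for all
    for r, q in zip(rs, qs):           # moduli processed so far, and t < M
        k = ((r - t) * pow(M, -1, q)) % q   # solve t + M*k = r (mod q)
        t += M * k
        M *= q
    return t

t = crr([2, 3, 2], [3, 5, 7])
assert t % 3 == 2 and t % 5 == 3 and t % 7 == 2
assert 0 <= t < 3 * 5 * 7          # t == 23
```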

2.3     Communication complexity
In this paper we will make use of a certain kind of MA protocol, which we call (m, r, ℓ, s)-efficient
protocols.15

Definition 2.6. We say an MA protocol is (m, r, ℓ, s)-efficient for a communication problem if, in the
protocol:

      • There are three parties, Alice, Bob and Merlin; Alice holds input x and Bob holds input y.

      • Merlin sends an advice string z of length m to Alice, which is a function of x and y.

      • Alice and Bob jointly toss r coins to obtain a random string w of length r.

      • Given y and w, Bob sends Alice a message of length ℓ.

      • After that, Alice decides whether to accept or not.

           – When the answer is yes, Merlin has exactly one advice string that makes Alice always accept.
           – When the answer is no, or when Merlin sends a wrong advice string, Alice accepts with
             probability at most s.
   15 Our notation here is adopted from [51]. They also defined similar k-party communication protocols, while we only discuss
2-party protocols in this paper.

2.4     Derandomization
We make use of expander graphs to reduce the number of random coins needed in one of our communica-
tion protocols. We abstract the following result for our use here.
Theorem 2.7 (see, e.g., Theorem 21.12 and Theorem 21.19 in [20]). Let m be an integer. There is a
universal constant c_1 such that for every ε < 1/2, there is a poly(log m, log ε^{−1})-time computable function
F : {0,1}^{log m + c_1·log ε^{−1}} → [m]^{c_1·log ε^{−1}}, such that for every set B ⊆ [m] of size at least m/2,

               Pr_{w ∈ {0,1}^{log m + c_1·log ε^{−1}}} [a ∉ B for every a ∈ F(w)] ≤ ε,

where a ∈ F(w) means a is one of the elements in the sequence F(w).


3      Hardness of Approximate Max-IP
In this section we prove our characterizations of approximate Max-IP.

3.1     The multiplicative case
We begin with the proof of Theorem 1.5. In Lemma 3.2 we construct the desired approximation algorithm,
and in Corollary 3.4 we prove the lower bound.
    First we need the following simple lemma, which says that the k-th root of the sum of the k-th powers
of non-negative reals gives a good approximation to their maximum.
Lemma 3.1. Let S be a set of non-negative real numbers, k be an integer, and x_max := max_{x∈S} x. We have

                         (∑_{x∈S} x^k)^{1/k} ∈ [x_max, x_max · |S|^{1/k}].

Proof. Since
                         ∑_{x∈S} x^k ∈ [x_max^k, |S| · x_max^k],

the lemma follows directly by taking the k-th root of both sides.
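A quick numeric illustration of the lemma (the set S and the power k are arbitrary values of ours):

```python
S = [3.0, 1.5, 7.0, 0.2, 6.9]
k = 8
estimate = sum(x ** k for x in S) ** (1 / k)
x_max = max(S)
# The k-th root of the power sum sandwiches the maximum:
assert x_max <= estimate <= x_max * len(S) ** (1 / k)
```

Note that the relative error |S|^{1/k} shrinks as k grows, which is exactly the knob the algorithm below turns.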



Lemma 3.2. Assuming ω(log n) < d < n^{o(1)} and letting

                         ε = min(log t / log(d/log n), 1),

there are multiplicative t-approximation deterministic algorithms for R+-Max-IP_{n,d} running in time16

               O(n^{2+o(1) − 0.31/(ε^{−1} + 0.31/2)}) = O(n^{2+o(1)−Ω(ε)})

or time
               O(n^{2 − 0.17/(ε^{−1} + 0.17/2)} · polylog(n)) = O(n^{2−Ω(ε)} · polylog(n)).

Proof. Let d = c · log n. From the assumption, we have c = ω(1) and ε = min(log t/log c, 1). When
log t > log c, we simply use a multiplicative c-approximation algorithm instead; hence in the following
we assume log t ≤ log c. We begin with the first algorithm.
Construction and analysis of the Power of Sum Polynomial P_r(z). Let r be a parameter to be specified
later and z be a vector from (R+)^d. We define a polynomial P_r(z) as

                                   P_r(z) := (∑_{i=1}^d z_i)^r.

      Let E := {(e_1, e_2, . . . , e_d) | ∑_{i=1}^d e_i = r, the e_i are non-negative integers}.
      We have
                         |E| = (r+d−1 choose d−1) = (r+d−1 choose r).

      For each e ∈ E, we define z^e := ∏_{i=1}^d z_i^{e_i}. Now, by expanding the polynomial, we can write P_r(z)
as
                                   P_r(z) = ∑_{e∈E} c_e · z^e,

where the c_e are the corresponding coefficients. Then consider P_r(x, y) := P_r(x_1·y_1, x_2·y_2, . . . , x_d·y_d);
plugging in z_i := x_i · y_i, it can be written as

                                   P_r(x, y) = ∑_{e∈E} c_e · x^e · y^e,

where x^e and y^e are defined similarly to z^e.
Construction and analysis of the Batch Evaluation Polynomial P_r(X, Y). Now, let X and Y be two
sets, each consisting of b = t^{r/2} vectors from {0,1}^d. We define17

                    P_r(X, Y) := ∑_{x∈X, y∈Y} P_r(x, y) = ∑_{x∈X, y∈Y} (x · y)^r.

  16 In the following we assume a real RAM model of computation for simplicity.
  17 We remark that similar polynomials are also used in [12] to give a simple algorithm for solving the light bulb problem.




       By Lemma 3.1 (applied to the b² = t^r terms (x · y)^r), we have

                    P_r(X, Y)^{1/r} ∈ [OPT(X, Y), OPT(X, Y) · t],

where OPT(X, Y) := max_{x∈X, y∈Y} x · y.
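Both steps so far (batching b = t^{r/2} vectors per side and taking the r-th root) can be sanity-checked by brute force (an illustrative Python sketch with toy parameters t, r, d of our choosing):

```python
import random
random.seed(1)

t, r, d = 2, 4, 8
b = t ** (r // 2)                  # batch size, so (b*b) ** (1/r) == t
X = [[random.randint(0, 1) for _ in range(d)] for _ in range(b)]
Y = [[random.randint(0, 1) for _ in range(d)] for _ in range(b)]

dot = lambda u, v: sum(a * c for a, c in zip(u, v))
OPT = max(dot(x, y) for x in X for y in Y)
P = sum(dot(x, y) ** r for x in X for y in Y)   # P_r(X, Y)

estimate = P ** (1 / r)
# One number summarizes the whole b x b batch up to a factor of t:
assert OPT <= estimate + 1e-9 and estimate <= OPT * t + 1e-9
```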
Embedding into Rectangular Matrix Multiplication. Now, for x, y ∈ {0,1}^d, we define the mappings
φ_x(x) and φ_y(y) as
                    φ_x(x) := (c_{e^1} · x^{e^1}, c_{e^2} · x^{e^2}, . . . , c_{e^m} · x^{e^m})
and
                              φ_y(y) := (y^{e^1}, y^{e^2}, . . . , y^{e^m}),
where m = |E| and e^1, e^2, . . . , e^m is an enumeration of all vectors in E.
    From the definitions, it follows that

                              φ_x(x) · φ_y(y) = P_r(x, y)

for every x, y ∈ {0,1}^d.
     Then for each X and Y, we map them into m-dimensional vectors φ_X(X) and φ_Y(Y) simply by
summation:
                    φ_X(X) := ∑_{x∈X} φ_x(x)   and   φ_Y(Y) := ∑_{y∈Y} φ_y(y).

We can see that
     φ_X(X) · φ_Y(Y) = (∑_{x∈X} φ_x(x)) · (∑_{y∈Y} φ_y(y)) = ∑_{x∈X} ∑_{y∈Y} P_r(x, y) = P_r(X, Y).

     Given two sets A, B, each consisting of n vectors from {0,1}^d, we split A into n/b sets A_1, A_2, . . . , A_{n/b}
of size b, and split B in the same way. Then we construct a matrix M_A (M_B) of size n/b × m,
such that the i-th row of M_A (M_B) is the vector φ_X(A_i) (φ_Y(B_i)). After that, the evaluation of P_r(A_i, B_j) for
all integers i, j ∈ [n/b] reduces to computing the matrix product M_A · M_B^T. After obtaining all the
P_r(A_i, B_j), we simply compute their maximum, whose r-th root gives us a t-approximate answer to the
original problem.
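The embedding and summation steps can be verified exactly over the integers: by the multinomial theorem, the coefficients c_e are multinomial coefficients. The following Python sketch (our illustration, with toy d and r) checks that φ_X(X) · φ_Y(Y) = P_r(X, Y):

```python
import random
from itertools import combinations
from math import factorial, prod

d, r = 4, 3

def exponent_vectors(d, r):
    """All e = (e_1..e_d) of non-negative integers summing to r (stars and bars)."""
    for bars in combinations(range(r + d - 1), d - 1):
        parts, prev = [], -1
        for bar in bars:
            parts.append(bar - prev - 1)
            prev = bar
        parts.append(r + d - 2 - prev)
        yield tuple(parts)

E = list(exponent_vectors(d, r))

def c(e):                          # multinomial coefficient r! / (e_1! ... e_d!)
    m = factorial(r)
    for ei in e:
        m //= factorial(ei)
    return m

mono = lambda v, e: prod(vi ** ei for vi, ei in zip(v, e))
phi_x = lambda x: [c(e) * mono(x, e) for e in E]
phi_y = lambda y: [mono(y, e) for e in E]

random.seed(2)
X = [[random.randint(0, 1) for _ in range(d)] for _ in range(3)]
Y = [[random.randint(0, 1) for _ in range(d)] for _ in range(3)]

# Sum the per-vector embeddings, one vector per side ...
Phi_X = [sum(col) for col in zip(*map(phi_x, X))]
Phi_Y = [sum(col) for col in zip(*map(phi_y, Y))]

# ... and a single inner product equals the whole batch sum P_r(X, Y).
lhs = sum(a * bb for a, bb in zip(Phi_X, Phi_Y))
rhs = sum(sum(xi * yi for xi, yi in zip(x, y)) ** r for x in X for y in Y)
assert lhs == rhs
```

Arranged as rows of the matrices M_A and M_B, these m-dimensional vectors are exactly what gets fed into the rectangular matrix product.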
Analysis of the running time. Finally, we specify the parameter r and analyze the time
complexity. In order to utilize the fast matrix multiplication algorithm from Theorem 2.1, we need to have

                                   m ≤ (n/b)^{0.313};

then our running time is simply (n/b)^{2+o(1)} = n^{2+o(1)}/b².
     We are going to set r = k · log n/log c, and our choice of k will satisfy k = Θ(1) and r ≤ d. We have

          m ≤ (e · (r + d)/r)^r ≤ (e · 2d/r)^r ≤ (2c log n · e / (k · log n/log c))^{k·log n/log c},

and therefore
               log m ≤ k · (log n/log c) · (log(2c log c/k) + log e).


Since c = ω(1) and k = Θ(1), we have

                              log m ≤ (1 + o(1)) · k log n = k log n + o(log n).

Plugging in, we have

                m ≤ (n/b)^{0.313}
           ⇐ log m ≤ 0.313 · (log n − log b)
           ⇐ k log n ≤ 0.31 · (log n − log b)
           ⇐ 0.31 · (r/2) · log t + k log n ≤ 0.31 log n                       (b = t^{r/2})
           ⇐ (log n/log c) · k · log t · (0.31/2) + k log n ≤ 0.31 log n       (r = k · log n/log c)
           ⇐ k · (1 + (log t/log c) · (0.31/2)) ≤ 0.31
           ⇐ k = 0.31 / (1 + (log t/log c) · (0.31/2)) = 0.31 / (1 + (0.31/2) · ε).

     Note that since ε ∈ [0, 1], k is indeed Θ(1).
     Finally, with our choice of k specified, our running time is n^{2+o(1)}/b² = n^{2+o(1)}/t^r.
     By a simple calculation,

                log t^r = r · log t
                        = (k · log n/log c) · log t
                        = log n · (log t/log c) · (0.31 / (1 + (0.31/2) · ε))
                        = log n · (0.31ε / (1 + (0.31/2) · ε))
                        = log n · (0.31 / (ε^{−1} + 0.31/2)).

     Hence, our running time is

                n^{2+o(1)}/t^r = n^{2+o(1) − 0.31/(ε^{−1} + 0.31/2)}

as stated.
The second algorithm. The second algorithm proceeds in exactly the same way, except that we apply
Theorem 2.2 instead; hence the constant 0.31 is replaced by 0.17.

    The lower bound follows directly from the new MA protocol for Set-Disjointness in [67]. We present
an explicit proof here for completeness.
    Before proving the lower bound, we need the following reduction from OV to approximate Max-IP.


Lemma 3.3 (Implicit in Theorem 4.1 of [67]). There is a universal constant c_1 such that, for every integer
c, real ε ∈ (0, 1] and τ ≥ 2, OV_{n,c log n} can be reduced to n^ε Max-IP_{n,d} instances (A_i, B_i) for i ∈ [n^ε], such
that:

       • d = τ^{poly(c/ε)} · log n.

       • Letting T = c log n · τ^{c_1}, if there are a ∈ A and b ∈ B such that a · b = 0, then there exists an i such
         that OPT(A_i, B_i) ≥ T.

       • Otherwise, for every i ∈ [n^ε] we must have OPT(A_i, B_i) ≤ T/τ.

    The reduction above follows directly from the new MA communication protocols in [67], together
with the use of expander graphs to reduce the number of random coins. A proof of the lemma is deferred
to Section 3.5.
    Now we are ready to show the lower bound on approximate Max-IP.

Corollary 3.4. Assuming SETH (or OVC), and letting d = ω(log n) and t ≥ 2, no n^{2−Ω(1)}-time algorithm
for Max-IP_{n,d} can achieve a multiplicative t-approximation if

                                   t = (d/log n)^{o(1)}.
Proof. Let c = d/log n. Note that t = c^{o(1)} (recall that t and d are functions of n).
    Suppose for contradiction that there is an n^{2−ε'}-time multiplicative t(n)-approximation algorithm A
for Max-IP_{n,d}, for some ε' > 0.
    Let ε = ε'/2. Now, for every constant c_2, we apply the reduction in Lemma 3.3 with τ = t to reduce an
OV_{n,c_2 log n} instance to n^ε Max-IP_{n, t^{poly(c_2/ε)}·log n} instances. Since t^{poly(c_2/ε)} = t^{O(1)} and t = c^{o(1)}, it follows
that for sufficiently large n, t^{O(1)} · log n = c^{o(1)} · log n = o(d). This in turn implies that for sufficiently large
n, n^ε calls to A are enough to solve the OV_{n,c_2 log n} instance.
    Therefore, we can solve OV_{n,c_2 log n} in n^{2−ε'} · n^ε = n^{2−ε} time for all constants c_2, contradicting
OVC.

      Finally, the correctness of Theorem 1.5 follows directly from Lemma 3.2 and Corollary 3.4.

3.2     The additive case
In this subsection we prove Theorem 1.10. We proceed similarly to the multiplicative case by
establishing the algorithm first.
    The algorithm is actually very simple: we apply the following algorithm from [13].

Lemma 3.5 (Implicit in Theorem 5.1 of [13]). Assuming ε = ω(log^6 log(d log n)/log^3 n), there is an

                         n^{2 − Ω(ε^{1/3}/log(d/(ε log n)))}-time

additive ε·d-approximation randomized algorithm for Max-IP_{n,d}.



Lemma 3.6. Let ε = min(t, d)/d. There is an

                                     O(n^{2−Ω(ε^{1/3}/log ε^{−1})})-

time, additive t-approximation randomized algorithm for Max-IP_{n,d} when ε = ω(log^6 log n/log^3 n).
Proof. When t > d the problem becomes trivial, so we can assume t ≤ d, and now t = ε · d.
    Let ε_1 = ε/2 and c_1 be a constant to be specified later. Given a Max-IP_{n,d} instance with two sets A and B each consisting of n vectors from {0,1}^d, we create another Max-IP_{n,d_1} instance with sets Ã, B̃ and d_1 = c_1·ε_1^{−2}·log n as follows:

   • Pick d_1 indices i_1, i_2, i_3, ..., i_{d_1} ∈ [d], where each i_k for k ∈ [d_1] is an independent uniform random number in [d].

   • Then we construct Ã from A by reducing each a ∈ A to ã = (a_{i_1}, a_{i_2}, ..., a_{i_{d_1}}) ∈ {0,1}^{d_1}, and B̃ from B in the same way.
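To make the sampling step concrete, here is a minimal Python sketch (the names `sample_coordinates` and `opt` are ours, purely for illustration, not from the paper):

```python
import random

def sample_coordinates(A, B, d1, seed=0):
    """Coordinate sampling from the proof of Lemma 3.6: pick d1
    independent uniform indices in [d] (with repetition) and keep
    only those coordinates of every vector in A and B."""
    rng = random.Random(seed)
    d = len(A[0])
    idx = [rng.randrange(d) for _ in range(d1)]
    A_tilde = [[a[i] for i in idx] for a in A]
    B_tilde = [[b[i] for i in idx] for b in B]
    return A_tilde, B_tilde

def opt(A, B):
    """Exact OPT(A, B): the maximum inner product over all pairs."""
    return max(sum(x * y for x, y in zip(a, b)) for a in A for b in B)
```

On a toy instance one can check that the relative optimum OPT(Ã, B̃)/d_1 tracks OPT(A, B)/d; the Chernoff-plus-union-bound argument that follows quantifies exactly this.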

    Note that for each a ∈ A and b ∈ B, by a Chernoff bound, we have

                             Pr[ |ã·b̃/d_1 − a·b/d| ≥ ε_1 ] < 2e^{−2·d_1·ε_1^2} = 2n^{−2·c_1}.

By setting c_1 = 2, the above probability is smaller than 1/n^3.
   Hence, by a simple union bound, with probability at least 1 − 1/n, we have

                                                |ã·b̃/d_1 − a·b/d| ≤ ε_1

for every a ∈ A and b ∈ B. That is, this reduction changes the "relative inner product" (a·b/d or ã·b̃/d_1) of each pair by at most ε_1. Hence, the maximum of the "relative inner product" also changes by at most ε_1, and we have |OPT(A, B)/d − OPT(Ã, B̃)/d_1| ≤ ε_1.
                                                                          e B)/d
    Then we apply the algorithm in Lemma 3.5 on the instance with sets Ã and B̃ with error ε = ε_1 to obtain an estimate Õ, and our final answer is simply (Õ/d_1)·d.
    From the guarantee of Lemma 3.5, we have |OPT(Ã, B̃)/d_1 − Õ/d_1| ≤ ε_1, and therefore

                                      |OPT(A, B)/d − Õ/d_1| ≤ 2ε_1 = ε,

from which the correctness of our algorithm follows directly.
   For the running time, note that the reduction part runs in linear time O(n·d), and the rest takes

                                 n^{2−Ω(ε_1^{1/3}/log(d_1/(ε_1 log n)))} = n^{2−Ω(ε^{1/3}/log ε^{−1})}

time.

   The lower bound was already established in [67]; we show that it follows from Lemma 3.3 here for completeness.


Lemma 3.7 (Theorem 4.1 of [67]). Assuming SETH (or OVC), and letting d = ω(log n) and t > 0, there is no n^{2−Ω(1)}-time additive t-approximation randomized algorithm for Max-IP_{n,d} if

                                                   t = o(d).
Proof. Recall that t and d are both functions of n. Suppose for contradiction that there is an n^{2−ε'}-time additive t(n)-approximation algorithm A for Max-IP_{n,d} for some ε' > 0.
     Let ε = ε'/2. Now, for every constant c_2, we apply the reduction in Lemma 3.3 with τ = 2 to reduce an OV_{n,c_2 log n} instance to n^ε Max-IP_{n,d_1} instances, where d_1 = 2^{poly(c_2/ε)}·log n = O(log n). In addition, from Lemma 3.3, to solve the OV_{n,c_2 log n} instance, we only need to compute additive T/6 = Ω(log n) = Ω(d_1)-approximations to these Max-IP instances obtained via the reduction.
     This can be done via n^ε calls to A as follows: for each Max-IP_{n,d_1} instance I we get, we duplicate each coordinate d/d_1 times (note that d = ω(log n) ≫ d_1 = O(log n), and for simplicity we assume d_1 | d), to obtain a Max-IP_{n,d} instance I_new such that OPT(I_new) = (d/d_1)·OPT(I). Then A can be used to estimate OPT(I_new) within an additive error t = o(d). Scaling its estimate by d_1/d, it can also be used to estimate OPT(I) within an additive error o(d_1) = o(log n) ≤ T/6 for sufficiently large n.
     Therefore, we can solve OV_{n,c_2 log n} in n^{2−ε'}·n^ε = n^{2−ε} time for all constants c_2, contradicting OVC.
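The coordinate-duplication step above admits a one-line sketch (a minimal illustration; `duplicate_coordinates` is our name):

```python
def duplicate_coordinates(vectors, times):
    """Repeat every coordinate `times` times, as in the proof of
    Lemma 3.7. Every inner product, and hence OPT, scales by exactly
    `times`, so an additive-o(d) estimate for the padded instance
    scales back to an additive-o(d1) estimate for the original."""
    return [[v for v in vec for _ in range(times)] for vec in vectors]
```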

      Finally, the correctness of Theorem 1.10 follows directly from Lemma 3.6 and Lemma 3.7.

3.3     Adaptation to All-Pair-Max-IP
Now we sketch the adaptation of our algorithms to work for the All-Pair-Max-IP problem.

Reminder of Corollary 1.12 Suppose ω(log n) < d < n^{o(1)}, and let

                       ε_M := min( log t / log(d/log n), 1 )  and  ε_A := min(t, d)/d.

There is an n^{2−Ω(ε_M)}·polylog(n)-time multiplicative t-approximation algorithm and an n^{2−Ω(ε_A^{1/3}/log ε_A^{−1})}-time additive t-approximation algorithm for All-Pair-Max-IP_{n,d}, when ε_A = ω(log^6 log n/log^3 n).

Proof sketch. Note that the algorithm in Lemma 3.5 from [13] actually works for All-Pair-Max-IP_{n,d}. Hence, we can simply apply that algorithm after the coordinate sampling phase, and obtain an additive t-approximation algorithm for All-Pair-Max-IP_{n,d}.
     For the multiplicative t-approximation algorithm, suppose we are given two sets, A and B, of n vectors each, from {0,1}^d. Instead of partitioning each of them into n/b subsets (the notation used here is the same as in the proof of Lemma 3.2), we only partition B into n/b subsets, B_1, B_2, ..., B_{n/b}, of size b each, and calculate P_r(x, B_i) := Σ_{y∈B_i} P_r(x, y) for every x ∈ A and i ∈ [n/b] using a similar reduction to rectangular matrix multiplication as in Lemma 3.2. Note that here we are multiplying an n × m matrix and an m × (n/b) matrix, and this can be reduced to b instances of multiplication of an (n/b) × m matrix and an m × (n/b) matrix, so our running time becomes n^2/b·polylog(n) instead of n^2/b^2·polylog(n).
     By a similar analysis, these can be done in n^{2−Ω(ε_M)}·polylog(n) time, and then we can compute the multiplicative t-approximate answers for the given All-Pair-Max-IP_{n,d} instance.
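The splitting into square-ish block products can be sketched with numpy (illustration only: the actual speedup in the corollary comes from fast rectangular matrix multiplication, which plain numpy does not implement; `blocked_product` is our name):

```python
import numpy as np

def blocked_product(M, N, b):
    """Multiply an (n x m) matrix M by an (m x (n/b)) matrix N as b
    products of an (n/b) x m block of M with N, mirroring the proof
    sketch of Corollary 1.12. Assumes b divides n."""
    n = M.shape[0]
    rows = n // b
    return np.vstack([M[i * rows:(i + 1) * rows] @ N for i in range(b)])
```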



3.4     Improved hardness for LCS-Closest Pair problem
We finish this section with the proof of Corollary 1.9. First we abstract the reduction from Max-IP to
LCS-Closest-Pair in [5] here.

Lemma 3.8 (Implicit in Theorem I.10 in [5]). For every real t ≥ 2 and integer n, computing a multiplicative t-approximation to Max-IP_{n,d} reduces to computing a multiplicative (t/2)-approximation to LCS-Closest-Pair_{n,O(d^3 log^2 n)} in O(n·poly(d, log n)) time.

      Now we are ready to prove Corollary 1.9 (restated below for convenience).

Reminder of Corollary 1.9 Assuming SETH (or OVC), for every t ≥ 2, computing a multiplicative t-approximation to LCS-Closest-Pair_{n,d} requires n^{2−o(1)} time, if d = t^{ω(1)}·log^5 n.

Proof. From Corollary 3.4, assuming SETH (or OVC), for every t ≥ 2, computing a multiplicative 2t-approximation to Max-IP_{n,d} requires n^{2−o(1)} time if d = t^{ω(1)}·log n. Then from Lemma 3.8, we immediately have that computing a multiplicative t-approximation to LCS-Closest-Pair_{n,d^3·log^2 n} = LCS-Closest-Pair_{n,t^{ω(1)}·log^5 n} requires n^{2−o(1)} time.

3.5     A proof of Lemma 3.3
Finally, we present a proof of Lemma 3.3, which is implicit in [67].
    We need the following efficient MA protocol for Set-Disjointness from [67], which is also used
in [51].18

Lemma 3.9 (Theorem 3.2 of [67]). For every α and m, there is an (m/α, log_2 m + O(1), poly(α), 1/2)-efficient MA protocol for DISJ_m.

    We want to reduce the error probability while keeping the total number of random coins relatively low. To achieve this, we use an expander graph (Theorem 2.7) to prove the following lemma.

Lemma 3.10. For every α, m and ε < 1/2, there is an (m/α, log_2 m + O(log ε^{−1}), poly(α)·log ε^{−1}, ε)-efficient MA protocol for DISJ_m.
Proof. Let c_1 and F : {0,1}^{log m + c_1·log ε^{−1}} → [m]^{c_1·log ε^{−1}} be the corresponding constant and function from Theorem 2.7, and let Π be the (m/α, log_2 m, poly(α), 1/2)-efficient MA protocol for DISJ_m from Lemma 3.9. Set q = c_1·log ε^{−1}; our new protocol Π_new works as follows:

      • Merlin still sends the same advice to Alice as in Π.

      • Alice and Bob jointly toss r = log m + q coins to get a string w ∈ {0,1}^r. Then we let w_1, w_2, ..., w_q be the sequence corresponding to F(w). Each of them can be interpreted as log m bits.

      • Bob sends Alice q messages, where the i-th message m_i corresponds to Bob's message in Π when the random bits are w_i.
  18 The protocol in [51] also works for the k-party number-in-hand model.




   • After that, Alice decides whether to accept or not as follows:

         – If for every i ∈ [q], Alice would accept Bob’s message mi with random bits wi in Π, then
           Alice accepts.
         – Otherwise, Alice rejects.

    It is easy to verify that the advice length, message length and number of random coins satisfy our requirements.
    For the error probability, note that when the two sets are disjoint, the same advice as in Π leads Alice to accept. Otherwise, if the advice from Merlin is wrong or the two sets intersect, then half of the random strings in {0,1}^{log m} lead Alice to reject in Π. Hence, from Theorem 2.7, with probability at least 1 − ε, at least one of the random strings w_i leads Alice to reject, which completes the proof.
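For intuition, the acceptance rule can be sketched as repeated one-sided verification. In this toy sketch the strings w_1, ..., w_q come from an arbitrary iterable; in the lemma they are derived from a single expander walk (Theorem 2.7) rather than independent coins, which is what saves randomness. `accepts_one_round` is a hypothetical callback standing in for Alice's decision in Π on one random string:

```python
def amplified_accept(accepts_one_round, random_strings):
    """One-sided error amplification as in the proof of Lemma 3.10:
    Alice accepts iff she would accept in Pi on every derived string
    w_1, ..., w_q. Disjoint inputs are always accepted; for
    intersecting inputs each w_i rejects with probability >= 1/2, so
    q strings drive the acceptance error down geometrically."""
    return all(accepts_one_round(w) for w in random_strings)
```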

    Finally, we prove Lemma 3.3 (restated below).

Reminder of Lemma 3.3 There is a universal constant c_1 such that, for every integer c, real ε ∈ (0,1] and τ ≥ 2, OV_{n,c log n} can be reduced to n^ε Max-IP_{n,d} instances (A_i, B_i) for i ∈ [n^ε], such that:

   • d = τ^{poly(c/ε)}·log n.

   • Letting T = c log n·τ^{c_1}, if there are a ∈ A and b ∈ B such that a·b = 0, then there exists an i such that OPT(A_i, B_i) ≥ T.

   • Otherwise, for every i we must have OPT(A_i, B_i) ≤ T/τ.


Proof. The reduction is exactly the same as in [5]; we recap it here for completeness.
    Set α = c/ε, m = c·log n and ε = 1/τ, and let Π be the (m/α, log_2 m + O(log ε^{−1}), poly(α)·log ε^{−1}, ε)-efficient MA protocol for Set-Disjointness from Lemma 3.10.
    Now, we first enumerate all 2^{m/α} = 2^{ε·log n} = n^ε possible advice strings, and create a Max-IP instance for each of the advice strings.
    For a fixed advice ψ ∈ {0,1}^{ε·log n}, we create a Max-IP instance with sets A_ψ and B_ψ as follows. We use a ∘ b to denote the concatenation of the strings a and b.
    Let r = log_2 m + c_1·log ε^{−1}, where c_1 is the constant hidden in the big-O notation in Lemma 3.10, and ℓ = poly(α)·log ε^{−1}. Let m_1, m_2, ..., m_{2^ℓ} be an enumeration of all strings in {0,1}^ℓ.
   • For each a ∈ A and each string w ∈ {0,1}^r, we create a vector a^w ∈ {0,1}^{2^ℓ}, such that a^w_i indicates whether, given advice ψ and randomness w, Alice accepts message m_i (1 for acceptance, 0 for rejection). Let the concatenation of all these a^w be a_ψ. Then A_ψ is the set of all these a_ψ for a ∈ A.

   • For each b ∈ B and each string w ∈ {0,1}^r, we create a vector b^w ∈ {0,1}^{2^ℓ}, such that b^w_i = 1 if Bob sends the message m_i given advice ψ and randomness w, and b^w_i = 0 otherwise. Let the concatenation of all these b^w be b_ψ. Then B_ψ is the set of all these b_ψ for b ∈ B.


    We can see that for a ∈ A and b ∈ B, a_ψ·b_ψ is precisely the number of random strings leading Alice to accept the message from Bob given advice ψ, when Alice and Bob hold a and b respectively. Therefore, let T = 2^r = c log n·τ^{c_1}. From the properties of the protocol Π, we can see that:

    • If there are a ∈ A and b ∈ B such that a·b = 0, then there is a ψ ∈ {0,1}^{ε·log n} such that a_ψ·b_ψ ≥ T.

    • Otherwise, for every a ∈ A, b ∈ B and advice ψ ∈ {0,1}^{ε·log n}, a_ψ·b_ψ ≤ T/τ.

This completes the proof.
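The vector construction above can be sketched in Python; `alice_accepts(w, i)` and `bob_message(w)` are hypothetical callbacks standing in for the behavior of Π under a fixed advice ψ:

```python
def protocol_to_vectors(alice_accepts, bob_message, coins, num_messages):
    """Build a_psi and b_psi as in the proof of Lemma 3.3: for each
    random string w, Alice contributes the 0/1 indicator of which
    messages she would accept, and Bob contributes the one-hot
    indicator of the message he would send. The inner product of the
    two concatenations then counts exactly the random strings on
    which Alice accepts Bob's actual message."""
    a_vec, b_vec = [], []
    for w in coins:
        a_vec += [int(alice_accepts(w, i)) for i in range(num_messages)]
        b_vec += [int(bob_message(w) == i) for i in range(num_messages)]
    return a_vec, b_vec
```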


4    Hardness of Exact Z-Max-IP, Hopcroft’s problem and more
In this section we show hardness of Hopcroft's problem, exact Z-Max-IP, ℓ_2-Furthest Pair and Bichromatic ℓ_2-Closest Pair. Essentially, our results follow from the framework of [75], in which it is shown that hardness of Hopcroft's problem implies hardness of the other three problems, and is implied by dimensionality reduction for OV.

    [Figure: OV_{n,c log n} → Z-OV_{n,2^{O(log* n)}} → Z-Max-IP_{n,2^{O(log* n)}} → ℓ_2-furthest_{n,2^{O(log* n)}} and Bichrom.-ℓ_2-closest_{n,2^{O(log* n)}}]

                           Figure 1: A diagram for all reductions in this section.


The organization of this section

In Section 4.1, we prove the improved dimensionality reduction for OV. In Section 4.2, we establish the hardness of Hopcroft's problem in dimension 2^{O(log* n)} with the improved reduction. In Section 4.3, we show that Hopcroft's problem can be reduced to Z-Max-IP, and thus establish the hardness of the latter. In Section 4.4, we show that Z-Max-IP can be reduced to ℓ_2-Furthest Pair and Bichromatic ℓ_2-Closest Pair, so the hardness of the last two problems follows. In Section 4.5, we show that to construct a better dimensionality reduction for OV, it suffices to show the existence of such reductions, instead of constructing them algorithmically. See Figure 1 for a diagram of all reductions covered in this section.
    The reductions in Section 4.2, Section 4.3 and Section 4.4 are all from [75] (either explicitly or implicitly); we make them explicit here for ease of exposition and to keep the paper self-contained.


4.1     Improved dimensionality reduction for OV
We begin with the improved dimensionality reduction for OV. The following theorem is one of the
technical cores of this paper, which makes use of the CRR encoding (see Theorem 2.5) recursively.

Theorem 4.1. Let b, ℓ be two sufficiently large integers. There is a reduction ψ_{b,ℓ} : {0,1}^{b·ℓ} → Z^ℓ and a set V_{b,ℓ} ⊆ Z, such that for every x, y ∈ {0,1}^{b·ℓ},

                                       x·y = 0 ⇔ ψ_{b,ℓ}(x)·ψ_{b,ℓ}(y) ∈ V_{b,ℓ}

and
                                             0 ≤ ψ_{b,ℓ}(x)_i < ℓ^{6^{log*(b)}·b}

for every x ∈ {0,1}^{b·ℓ} and i ∈ [ℓ]. Moreover, the computation of ψ_{b,ℓ}(x) takes poly(b·ℓ) time, and the set V_{b,ℓ} can be constructed in

                                            O( ℓ^{O(6^{log*(b)}·b)} · poly(b·ℓ) )

time.

Remark 4.2. We didn’t make much effort to minimize the base 6 above to keep the calculation clean, it
can be replaced by any constant > 2 with a tighter calculation.

Proof. We construct our reduction recursively. The parameter ℓ stays the same throughout the proof, so in the following we write ψ_b (resp. V_b) instead of ψ_{b,ℓ} (resp. V_{b,ℓ}) for simplicity.

Direct CRR for small b: When b < ℓ, we use a direct Chinese remainder representation of numbers. We pick b distinct primes q_1, q_2, ..., q_b in [ℓ+1, ℓ^2] (they are guaranteed to exist by Lemma 2.4), and use them for our CRR encoding.
   For x ∈ {0,1}^{b·ℓ}, we partition it into ℓ equal-sized subvectors, and use x^i to denote the i-th subvector. That is, x^i is the subvector of x from the ((i−1)·b+1)-th bit to the (i·b)-th bit.
   Then we define ψ_b(x) as

                    ψ_b(x) := ( CRR({x^1_j}_{j=1}^b), CRR({x^2_j}_{j=1}^b), ..., CRR({x^ℓ_j}_{j=1}^b) ).

That is, the i-th coordinate of ψ_b(x) is the CRR encoding of the i-th subvector x^i with respect to the primes q_j.
    Now, for x, y ∈ {0,1}^{b·ℓ}, note that for j ∈ [b],

                             ψ_b(x)·ψ_b(y) ≡ Σ_{i=1}^ℓ CRR({x^i_j}_{j=1}^b) · CRR({y^i_j}_{j=1}^b)   (mod q_j)
                                           ≡ Σ_{i=1}^ℓ x^i_j · y^i_j   (mod q_j).



    Since the sum Σ_{i=1}^ℓ x^i_j·y^i_j is in [0, ℓ], and q_j > ℓ, we can see

                                  Σ_{i=1}^ℓ x^i_j·y^i_j = 0 ⇔ ψ_b(x)·ψ_b(y) ≡ 0 (mod q_j).

    Therefore, x·y = Σ_{j=1}^b Σ_{i=1}^ℓ x^i_j·y^i_j = 0 is equivalent to

                                            ψ_b(x)·ψ_b(y) ≡ 0   (mod q_j)

for every j ∈ [b].
    Finally, we have 0 ≤ ψ_b(x)_i < Π_{j=1}^b q_j < ℓ^{2b} ≤ ℓ^{6^{log*(b)}·b}. Therefore

                                           ψ_b(x)·ψ_b(y) < ℓ^{6^{log*(b)}·2b+1},

and we can set V_b to be the set of all integers in [0, ℓ^{6^{log*(b)}·2b+1}] that are 0 modulo each q_j; it is easy to see that

                                          x·y = 0 ⇔ ψ_b(x)·ψ_b(y) ∈ V_b

for every x, y ∈ {0,1}^{b·ℓ}.
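The base-case encoding admits a short Python sketch; `crr` is the standard Chinese-remainder encoding, and the helper names (and the toy primes below, chosen for readability rather than from the interval [ℓ+1, ℓ^2]) are ours:

```python
from math import prod

def crr(bits, primes):
    """The unique value in [0, prod(primes)) whose residue modulo
    primes[j] is bits[j], via the Chinese remainder theorem."""
    M = prod(primes)
    x = 0
    for b_j, q in zip(bits, primes):
        M_q = M // q
        x += b_j * M_q * pow(M_q, -1, q)  # pow(M_q, -1, q): inverse mod q
    return x % M

def psi(x, primes, ell):
    """psi_b for the base case b < ell: coordinate i is the CRR
    encoding of the i-th length-b subvector of x."""
    b = len(primes)
    return [crr(x[i * b:(i + 1) * b], primes) for i in range(ell)]
```

On a toy instance with ℓ = 2 and primes (5, 7, 11), the residue of ψ_b(x)·ψ_b(y) modulo q_j recovers Σ_i x^i_j·y^i_j, exactly as in the congruence above; x·y = 0 holds iff every residue vanishes.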

Recursive construction for larger b: When b ≥ ℓ, suppose the theorem holds for all integers b' < b. Let b_m be the number such that (we ignore the rounding issue here and pretend that b_m is an integer for simplicity)

                                             ℓ^{6^{log*(b_m)}·b_m} = b.
    Then we pick b/b_m primes p_1, p_2, ..., p_{b/b_m} in [b^2·ℓ, (b^2·ℓ)^2], and use them as our reference primes in the CRR encodings.
    For x ∈ {0,1}^{b·ℓ}, as before, we partition x into ℓ equal-sized subvectors x^1, x^2, ..., x^ℓ, where x^i consists of the ((i−1)·b+1)-th bit of x to the (i·b)-th bit of x. Then we partition each x^i again into b/b_m micro groups, each of size b_m. We use x^{i,j} to denote the j-th micro group of x^i after the partition.
    Now, we use x^{[j]} to denote the concatenation of the vectors x^{1,j}, x^{2,j}, ..., x^{ℓ,j}. That is, x^{[j]} is the concatenation of the j-th micro group in each of the ℓ subvectors. Note that x^{[j]} ∈ {0,1}^{b_m·ℓ}, and can be seen as a smaller instance, on which we can apply ψ_{b_m}.
    Our recursive construction then goes in two steps. In the first step, we make use of ψ_{b_m}, and transform each b_m-size micro group into a single number in [0, b). This step transforms x from a vector in {0,1}^{b·ℓ} into a vector S(x) in Z^{(b/b_m)·ℓ}. In the second step, we use a similar CRR encoding as in the base case to encode S(x), to get our final reduced vector in Z^ℓ.
    S(x) is simply

                              S(x) := ( ψ_{b_m}(x^{[1]})_1, ψ_{b_m}(x^{[2]})_1, ..., ψ_{b_m}(x^{[b/b_m]})_1,
                                        ψ_{b_m}(x^{[1]})_2, ψ_{b_m}(x^{[2]})_2, ..., ψ_{b_m}(x^{[b/b_m]})_2,
                                        ...,
                                        ψ_{b_m}(x^{[1]})_ℓ, ψ_{b_m}(x^{[2]})_ℓ, ..., ψ_{b_m}(x^{[b/b_m]})_ℓ ).



   That is, we apply ψ_{b_m} to all the x^{[j]}, and shrink all the corresponding micro groups in x into integers. Again, we partition S = S(x) into ℓ equal-sized groups S^1, S^2, ..., S^ℓ.
   Then we define ψ_b(x) as

                 ψ_b(x) := ( CRR({S^1_j}_{j=1}^{b/b_m}), CRR({S^2_j}_{j=1}^{b/b_m}), ..., CRR({S^ℓ_j}_{j=1}^{b/b_m}) ).

    In other words, the i-th coordinate of ψ_b(x) is the CRR representation of the number sequence S^i, with respect to our primes {p_j}_{j=1}^{b/b_m}.
    Now, note that for x, y ∈ {0,1}^{b·ℓ}, x·y = 0 is equivalent to x^{[j]}·y^{[j]} = 0 for every j ∈ [b/b_m], which is further equivalent to

                                             ψ_{b_m}(x^{[j]})·ψ_{b_m}(y^{[j]}) ∈ V_{b_m}

for every j ∈ [b/b_m], by our assumption on ψ_{b_m}.
    Since 0 ≤ ψ_{b_m}(x^{[j]})_i, ψ_{b_m}(y^{[j]})_i < b for every x, y ∈ {0,1}^{b·ℓ}, i ∈ [ℓ] and j ∈ [b/b_m], we also have ψ_{b_m}(x^{[j]})·ψ_{b_m}(y^{[j]}) < b^2·ℓ, and therefore we can assume that V_{b_m} ⊆ [0, b^2·ℓ).
    For every x, y ∈ {0,1}^{b·ℓ} and j ∈ [b/b_m], we have

                          ψ_b(x)·ψ_b(y) ≡ Σ_{i=1}^ℓ CRR({S(x)^i_j}_{j=1}^{b/b_m}) · CRR({S(y)^i_j}_{j=1}^{b/b_m})   (mod p_j)
                                        ≡ Σ_{i=1}^ℓ S(x)^i_j · S(y)^i_j   (mod p_j)
                                        ≡ Σ_{i=1}^ℓ ψ_{b_m}(x^{[j]})_i · ψ_{b_m}(y^{[j]})_i   (mod p_j)
                                        ≡ ψ_{b_m}(x^{[j]}) · ψ_{b_m}(y^{[j]})   (mod p_j).

   Since p_j ≥ b^2·ℓ, we can determine ψ_{b_m}(x^{[j]})·ψ_{b_m}(y^{[j]}) from ψ_b(x)·ψ_b(y) by taking it modulo p_j. Therefore x·y = 0 is equivalent to

                              (ψ_b(x)·ψ_b(y) mod p_j) ∈ V_{b_m}  for every j ∈ [b/b_m].

    Finally, recall that we have

                                                     ℓ^{6^{log*(b_m)}·b_m} = b.

    Taking the logarithm of both sides, we have

                                              6^{log*(b_m)} · b_m · log ℓ = log b.


    Then we can upper bound ψ_b(x)_i by

                 ψ_b(x)_i < Π_{j=1}^{b/b_m} p_j
                          < (b^2·ℓ)^{2·(b/b_m)}                                                                     (b ≥ ℓ)
                          ≤ 2^{6·(b/b_m)·log b}
                          ≤ 2^{6·(b/b_m)·6^{log*(b_m)}·b_m·log ℓ}
                          ≤ ℓ^{6·6^{log*(b_m)}·b}
                          ≤ ℓ^{6^{log*(b)}·b}.                  (b_m ≤ log b, so log*(b_m) + 1 ≤ log*(log b) + 1 = log*(b))

    Therefore, we can set V_b as the set of all integers t in [0, ℓ^{6^{log*(b)}·2b+1}) such that

                                       (t mod p_j) ∈ V_{b_m}  for every j ∈ [b/b_m].

And it is easy to see that this V_b satisfies our requirement.
    Finally, it is easy to see that the straightforward way of constructing ψ_b(x) takes O(poly(b·ℓ)) time, and we can construct V_b by enumerating all possible values of ψ_b(x)·ψ_b(y) and checking each of them in O(poly(b·ℓ)) time. Since there are at most ℓ^{O(6^{log*(b)}·b)} such values, V_b can be constructed in

                                            O( ℓ^{O(6^{log*(b)}·b)} · poly(b·ℓ) )

time, which completes the proof.

    Now we prove Lemma 1.17; we recap its statement here for convenience.

Reminder of Lemma 1.17 Let 1 ≤ ℓ ≤ d. There is an

                              O( n · ℓ^{O(6^{log* d}·(d/ℓ))} · poly(d) )-time

reduction from OV_{n,d} to ℓ^{O(6^{log* d}·(d/ℓ))} instances of Z-OV_{n,ℓ+1}, with vectors of entries with bit-length O((d/ℓ)·log ℓ·6^{log* d}).

Proof. The proof is exactly the same as the proof of Lemma 1.1 in [75], with different parameters. We recap it here for convenience.
    Given two sets A' and B', each consisting of n vectors from {0,1}^d, we apply ψ_{d/ℓ,ℓ} to each of the vectors in A' (resp. B') to obtain a set A (resp. B) of vectors from Z^ℓ. From Theorem 4.1, there is a (u, v) ∈ A' × B' such that u·v = 0 if and only if there is a (u, v) ∈ A × B such that u·v ∈ V_{d/ℓ,ℓ}.
    Now, for each element t ∈ V_{d/ℓ,ℓ}, we construct two sets A_t and B_t of vectors from Z^{ℓ+1} such that there is a (u, v) ∈ A × B with u·v = t if and only if there is a (u, v) ∈ A_t × B_t with u·v = 0. We construct A_t as the collection of all vectors u_A = [u, 1] for u ∈ A, and B_t as the collection of all vectors v_B = [v, −t] for v ∈ B. It is easy to verify that this reduction has the properties we want.


    Note that there are at most ℓ^{O(6^{log* d}·(d/ℓ))} numbers in V_{d/ℓ,ℓ}, so we obtain that many Z-OV_{n,ℓ+1} instances. And from Theorem 4.1, the reduction takes

                                     O( n · ℓ^{O(6^{log* d}·(d/ℓ))} · poly(d) )

time.
   Finally, the bit-length of the reduced vectors is bounded by

                              log( ℓ^{O(6^{log* d}·(d/ℓ))} ) = O( (d/ℓ) · log ℓ · 6^{log* d} ),

which completes the proof.
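The per-target construction in the proof above can be sketched as follows (a minimal illustration; `shift_instances` is our name):

```python
def shift_instances(A, B, V):
    """For each target t in V, build (A_t, B_t) as in the proof of
    Lemma 1.17: since [u, 1] . [v, -t] = u . v - t, some pair in
    A x B has inner product t iff some pair in A_t x B_t is
    orthogonal."""
    for t in V:
        yield t, [u + [1] for u in A], [v + [-t] for v in B]
```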

A transformation from nonuniform constructions to uniform constructions. The proof of Theorem 4.1 works recursively. In one recursive step, we reduce the construction of ψ_{b,ℓ} to the construction of ψ_{b_m,ℓ}, where b_m ≤ log b. Applying this reduction log* n times, we get a sufficiently small instance on which we can switch to a direct CRR construction.
    An interesting observation here is that after applying the reduction only three times, the block length parameter becomes b' ≤ log log log b. Such a b' is so small that we can actually use brute force to find the "optimal" construction ψ_{b',ℓ} in b^{o(1)} time instead of recursing deeper. Hence, to find a construction better than Theorem 4.1, we only need to prove the existence of such a construction. See Section 4.5 for details.

4.2   Improved hardness for Hopcroft’s problem
In this subsection we prove Theorem 1.18 using our new dimensionality reduction from Lemma 1.17; we recap its statement here for completeness.

Reminder of Theorem 1.18 (Hardness of Hopcroft's problem in c^{log* n} dimensions) Assuming SETH (or OVC), there is a constant c such that Z-OV_{n,c^{log* n}} with vectors of O(log n)-bit entries requires n^{2−o(1)} time.

Proof. The proof here follows roughly the same lines as the proof of Theorem 1.1 in [75].
    Let c be an arbitrary constant and d := c·log n. We show that an algorithm A solving Z-OV_{n,ℓ+1} where ℓ = 7^{log* n} in O(n^{2−δ}) time for some δ > 0 can be used to construct an O(n^{2−δ+o(1)})-time algorithm for OV_{n,d}, contradicting OVC.
    We simply invoke Lemma 1.17; note that we have

                         log( ℓ^{O(6^{log* d}·(d/ℓ))} ) = log ℓ · O( 6^{log* d} · (d/ℓ) )
                                                        = O( log* n · 6^{log* n} · c·log n / 7^{log* n} )
                                                        = O( log* n · (6/7)^{log* n} · c·log n )
                                                        = o(log n).



Therefore, the reduction takes O(n · ℓ^{O(6^{log* d}·(d/ℓ))} · poly(d)) = n^{1+o(1)} time. It reduces an OV_{n,d} instance to n^{o(1)} instances of Z-OV_{n,ℓ+1}, whose vectors have bit length o(log n) as calculated above. We simply solve all these n^{o(1)} instances using A, and this gives us an O(n^{2−δ+o(1)})-time algorithm for OV_{n,d}, which completes the proof.

4.3     Hardness for Z-Max-IP
Now we move to hardness of exact Z-Max-IP.

Theorem 4.3 (Implicit in Theorem 1.2 of [75]). There is an O(poly(d) · n)-time algorithm which reduces a Z-OV_{n,d} instance into a Z-Max-IP_{n,d²} instance.

Proof. We remark here that this reduction is implicitly used in the proof of Theorem 1.2 in [75]; we abstract it here only for our exposition.
    Given a Z-OV_{n,d} instance with sets A and B, consider the following polynomial P(x, y), where x, y ∈ Z^d.

                 P(x, y) = −(x · y)² = Σ_{i, j ∈ [d]} −(x_i · y_i) · (x_j · y_j) = Σ_{i, j ∈ [d]} −(x_i · x_j) · (y_i · y_j).


   It is easy to see that whether there is an (x, y) ∈ A × B such that x · y = 0 is equivalent to whether the
maximum value of P(x, y) is 0.
   Now, for each x ∈ A and y ∈ B, we identify [d²] with [d] × [d] and construct x̃, ỹ ∈ Z^{d²} such that

                                     x̃_{(i, j)} = x_i · x_j   and   ỹ_{(i, j)} = −y_i · y_j .

Then we have x̃ · ỹ = P(x, y). Hence, let Ã be the set of all these vectors x̃, and B̃ be the set of all these vectors ỹ. Whether there is an (x, y) ∈ A × B such that x · y = 0 is equivalent to whether OPT(Ã, B̃) = 0, and our reduction is completed.
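As a sanity check (ours, not part of [75]), the tensoring reduction above is short enough to run directly; the function names below are our own, for illustration only.

```python
# Sketch of the reduction in Theorem 4.3: each d-dimensional integer vector is
# mapped to a d^2-dimensional one so that the inner product of the images
# equals P(x, y) = -(x . y)^2.

def tensor_x(x):
    return [xi * xj for xi in x for xj in x]

def tensor_y(y):
    return [-yi * yj for yi in y for yj in y]

def reduce_zov_to_zmaxip(A, B):
    return [tensor_x(x) for x in A], [tensor_y(y) for y in B]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# tensor_x(x) . tensor_y(y) == -(x . y)^2 <= 0, with equality exactly when
# x . y == 0; so the maximum inner product is 0 iff an orthogonal pair exists.
A = [[1, -2, 0], [3, 1, 1]]
B = [[2, 1, 5], [0, 0, 7]]
At, Bt = reduce_zov_to_zmaxip(A, B)
opt = max(dot(xt, yt) for xt in At for yt in Bt)
has_orth = any(dot(x, y) == 0 for x in A for y in B)
assert (opt == 0) == has_orth
```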

      Now, Theorem 1.14 (restated below) is just a simple corollary of Theorem 4.3 and Theorem 1.18.

Reminder of Theorem 1.14 Assuming SETH (or OVC), there is a constant c such that every exact algorithm for Z-Max-IP_{n,d} for d = c^{log* n} requires n^{2−o(1)} time, with vectors of O(log n)-bit entries.


4.3.1    A dimensionality reduction for Max-IP
The reduction ψ_{b,ℓ} from Theorem 4.1 actually does more: for x, y ∈ {0,1}^{b·ℓ}, from ψ_{b,ℓ}(x) · ψ_{b,ℓ}(y) we can in fact determine the inner product x · y itself, not only whether x · y = 0. Formally, we have the following corollary.

Corollary 4.4. Let b, ℓ be two sufficiently large integers. There is a reduction ψ_{b,ℓ} : {0,1}^{b·ℓ} → Z^ℓ and b·ℓ + 1 sets V^0_{b,ℓ}, V^1_{b,ℓ}, . . . , V^{b·ℓ}_{b,ℓ} ⊆ Z, such that for every x, y ∈ {0,1}^{b·ℓ},

                         x · y = k ⇔ ψ_{b,ℓ}(x) · ψ_{b,ℓ}(y) ∈ V^k_{b,ℓ}     for every 0 ≤ k ≤ b·ℓ,


and
                                              0 ≤ ψ_{b,ℓ}(x)_i < ℓ^{6^{log* b}·b}

for every x ∈ {0,1}^{b·ℓ} and i ∈ [ℓ]. Moreover, the computation of ψ_{b,ℓ}(x) takes poly(b·ℓ) time, and the sets V^k_{b,ℓ} can be constructed in O(ℓ^{O(6^{log* b}·b)} · poly(b·ℓ)) time.

    Combining Corollary 4.4 with Theorem 4.3, we can in fact derive a similar dimensionality self-
reduction from Max-IP to Z-Max-IP.

Corollary 4.5. Let 1 ≤ ℓ ≤ d. There is an

                                 O(n · ℓ^{O(6^{log* d}·(d/ℓ))} · poly(d))-time

reduction from Max-IP_{n,d} to d · ℓ^{O(6^{log* d}·(d/ℓ))} instances of Z-Max-IP_{n,(ℓ+1)²}, with vectors of entries with bit-length O(d/ℓ · log ℓ · 6^{log* d}).

Proof sketch. Let b = d/ℓ (assume ℓ divides d here for simplicity), and let A and B be the sets in the given Max-IP_{n,d} instance. We proceed similarly as in the case for OV.
     For each k from 0 to d, we construct the set V^k_{b,ℓ} as specified in Corollary 4.4. Then there is an (x, y) ∈ A × B such that x · y = k if and only if there is an (x, y) ∈ A × B such that ψ_{b,ℓ}(x) · ψ_{b,ℓ}(y) ∈ V^k_{b,ℓ}. Using exactly the same reduction as in Lemma 1.17, we can in turn reduce this into ℓ^{O(6^{log* b}·b)} instances of Z-OV_{n,ℓ+1}.
     Applying Theorem 4.3, by solving all these (d + 1) · ℓ^{O(6^{log* b}·b)} Z-Max-IP_{n,(ℓ+1)²} instances, we can determine whether there is an (x, y) ∈ A × B such that x · y = k for every k, from which we can compute the answer to the Max-IP_{n,d} instance.

4.4   Hardness for ℓ₂-Furthest Pair and Bichromatic ℓ₂-Closest Pair
Now we turn to the proofs of hardness of ℓ₂-Furthest Pair and Bichromatic ℓ₂-Closest Pair. The two reductions below are slight adaptations of the reductions in the proofs of Theorem 1.2 and Corollary 2.1 in [75].

Lemma 4.6. Assuming d = n^{o(1)}, there is an O(poly(d) · n)-time algorithm which reduces a Z-Max-IP_{n,d} instance into an instance of ℓ₂-Furthest Pair on 2n points in R^{d+2}. Moreover, if the Z-Max-IP instance consists of vectors of O(log n)-bit entries, so does the ℓ₂-Furthest Pair instance.

Proof. Let A, B be the sets in the Z-Max-IPn,d instance, and k be the smallest integer such that all vectors
from A and B consist of (k · log n)-bit entries.
    Let W be n^{C·k} where C is a large enough constant. Given x ∈ A and y ∈ B, we construct points

                       x̃ = (x, √(W − ‖x‖²), 0)    and    ỹ = (−y, 0, √(W − ‖y‖²)).

That is, we append two corresponding values to the end of the vectors x and −y.


    Now, we can see that for x1 , x2 ∈ A, the squared distance between their reduced points is

                               ‖x̃₁ − x̃₂‖² ≤ ‖x₁ − x₂‖² + W ≤ 4 · d · n^{2k} + W.

Similarly we have
                                         ‖ỹ₁ − ỹ₂‖² ≤ 4 · d · n^{2k} + W

for y1 , y2 ∈ B.
    Next, for x ∈ A and y ∈ B, we have

          ‖x̃ − ỹ‖² = ‖x̃‖² + ‖ỹ‖² − 2 · x̃ · ỹ = 2 ·W + 2 · (x · y) ≥ 2 ·W − 2 · d · n^{2k} > 4 · d · n^{2k} +W,

where the last inequality holds when we set C to be 5.
      Putting everything together, we can see the ℓ₂-furthest pair among all points x̃ and ỹ must be a pair (x̃, ỹ) with x ∈ A and y ∈ B. And maximizing ‖x̃ − ỹ‖ is equivalent to maximizing x · y, which proves the correctness of our reduction. Furthermore, when k is a constant, the reduced instance only needs vectors with O(k) · log n = O(log n)-bit entries.

Lemma 4.7. Assuming d = n^{o(1)}, there is an O(poly(d) · n)-time algorithm which reduces a Z-Max-IP_{n,d} instance into an instance of Bichromatic ℓ₂-Closest Pair on 2n points in R^{d+2}. Moreover, if the Z-Max-IP instance consists of vectors of O(log n)-bit entries, so does the Bichromatic ℓ₂-Closest Pair instance.

Proof. Let A, B be the sets in the Z-Max-IPn,d instance, and k be the smallest integer such that all vectors
from A and B consist of (k · log n)-bit entries.
   Let W be n^{C·k} where C is a large enough constant. Given x ∈ A and y ∈ B, we construct points

                       x̃ = (x, √(W − ‖x‖²), 0)    and    ỹ = (y, 0, √(W − ‖y‖²)).

That is, we append two corresponding values to the end of the vectors x and y. And our reduced instance is to find the closest pair between the set Ã (consisting of all these x̃ where x ∈ A) and the set B̃ (consisting of all these ỹ where y ∈ B).
     Next, for x ∈ A and y ∈ B, we have

                            ‖x̃ − ỹ‖² = ‖x̃‖² + ‖ỹ‖² − 2 · x̃ · ỹ = 2 ·W − 2 · (x · y).

    Hence minimizing ‖x̃ − ỹ‖ where x ∈ A and y ∈ B is equivalent to maximizing x · y, which proves the
correctness of our reduction. Furthermore, when k is a constant, the reduced instance only needs vectors
with O(k) · log n = O(log n)-bit entries.
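The two embeddings of Lemmas 4.6 and 4.7 are easy to sanity-check numerically; the following sketch (ours, with illustrative toy parameters in place of n^{C·k}) verifies the identities ‖x̃ − ỹ‖² = 2W + 2(x·y) and ‖x̃ − ỹ‖² = 2W − 2(x·y) for the furthest-pair and closest-pair variants respectively.

```python
import math

# Sanity check for the embeddings of Lemmas 4.6 and 4.7 (toy parameters).
def embed(v, W, side):
    # 'left' appends (sqrt(W - |v|^2), 0); 'right' appends (0, sqrt(W - |v|^2)).
    slack = math.sqrt(W - sum(t * t for t in v))
    return v + [slack, 0.0] if side == 'left' else v + [0.0, slack]

def sqdist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

x, y = [3, -1, 2], [1, 4, -2]
W = 10 ** 6  # plays the role of n^{C k}: much larger than any |x . y|
ip = sum(a * b for a, b in zip(x, y))

# Furthest-pair variant: x -> (x, sqrt(W-|x|^2), 0), y -> (-y, 0, sqrt(W-|y|^2)).
fx = embed(x, W, 'left')
fy = embed([-t for t in y], W, 'right')
assert abs(sqdist(fx, fy) - (2 * W + 2 * ip)) < 1e-6

# Closest-pair variant: same embedding, but y is not negated.
cx = embed(x, W, 'left')
cy = embed(y, W, 'right')
assert abs(sqdist(cx, cy) - (2 * W - 2 * ip)) < 1e-6
```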

   Now Theorem 1.15 and Theorem 1.16 are simple corollaries of Lemma 4.6, Lemma 4.7 and Theo-
rem 1.14.


4.5     Nonuniform to uniform transformation for dimensionality reduction for OV
Finally, we discuss the transformation from a nonuniform construction to a uniform one for dimensionality reduction for OV. In order to state our result formally, we need to introduce some definitions.

Definition 4.8 (Nonuniform Reduction). Let b, ℓ, κ ∈ N. We say a function ϕ : {0,1}^{b·ℓ} → Z^ℓ together with a set V ⊆ Z is a (b, ℓ, κ)-reduction, if the following holds:

      • For every x, y ∈ {0,1}^{b·ℓ},
                                               x · y = 0 ⇔ ϕ(x) · ϕ(y) ∈ V.

      • For every x ∈ {0,1}^{b·ℓ} and i ∈ [ℓ],
                                                     0 ≤ ϕ(x)_i < ℓ^{κ·b}.

    Similarly, let τ be an increasing function. We say a function family {ϕ_{b,ℓ}}_{b,ℓ} together with a set family {V_{b,ℓ}}_{b,ℓ} is a τ-reduction family, if for every integer b and ℓ, (ϕ_{b,ℓ}, V_{b,ℓ}) is a (b, ℓ, τ(b))-reduction.
    Moreover, if for all integers b and ℓ ≤ log log log b, there is an algorithm A which computes ϕ_{b,ℓ}(x) in poly(b) time given b, ℓ and x ∈ {0,1}^{b·ℓ}, and constructs the set V_{b,ℓ} in O(ℓ^{O(τ(b)·b)} · poly(b)) time given b and ℓ, then we call (ϕ_{b,ℓ}, V_{b,ℓ}) a uniform-τ-reduction family.

Remark 4.9. The reason we assume ℓ to be so small compared to b is that in our applications we only care about very small ℓ, and that greatly simplifies the notation. From Theorem 4.1, there is a uniform-6^{log* b}-reduction family, and a better uniform-reduction family implies better hardness for Z-OV and other related problems as well (Lemma 1.17, Theorem 4.3, Lemma 4.7 and Lemma 4.6).

      Now we are ready to state our nonuniform to uniform transformation result formally.

Theorem 4.10. Let τ be an increasing function such that τ(n) = O(log log log n). If there is a τ-reduction family, then there is a uniform-O(τ)-reduction family.

Proof sketch. The construction in Theorem 4.1 is recursive: it constructs the reduction ψ_{b,ℓ} from a much smaller reduction ψ_{b_m,ℓ}, where b_m ≤ log b. In the original construction, it takes log* b recursions to make the problem instance sufficiently small so that a direct construction can be used. Here we only apply the reduction three times. First let us abstract the following lemma from the proof of Theorem 4.1.
Lemma 4.11 (Implicit in Theorem 4.1). Let b, ℓ, b_m, κ ∈ N, and suppose ℓ^{κ·b_m} = b and there is a (b_m, ℓ, κ)-reduction (ϕ, V′). Then the following holds:

      • There is a (b, ℓ, 6 · κ)-reduction (ψ, V).

      • Given (ϕ, V′), for every x ∈ {0,1}^{b·ℓ}, ψ(x) can be computed in poly(b · ℓ) time, and V can be constructed in O(ℓ^{O(κ·b)} · poly(b · ℓ)) time.

      Now, let b, ` ∈ N. We are going to construct our reduction as follows.
      Let b₁ be the number such that
                                                ℓ^{τ(b)·6²·b₁} = b.


Similarly, we set b₂ and b₃ so that

                                   ℓ^{τ(b)·6·b₂} = b₁    and    ℓ^{τ(b)·b₃} = b₂.

We can calculate from the above that b₃ ≤ log log log b.
    From the assumption that there is a τ-reduction family, there is a (b₃, ℓ, τ(b₃))-reduction (ϕ_{b₃,ℓ}, V_{b₃,ℓ}), which is also a (b₃, ℓ, τ(b))-reduction, as τ is increasing. Note that we can assume ℓ ≤ log log log b and τ(b) ≤ log log log b from the assumption. Now we simply use a brute-force algorithm to find (ϕ_{b₃,ℓ}, V_{b₃,ℓ}). There are

                                             ℓ^{τ(b)·b₃·ℓ·2^{b₃·ℓ}} = b^{o(1)}

possible functions from {0,1}^{b₃·ℓ} to {0, . . . , ℓ^{τ(b₃)·b₃} − 1}^ℓ. Given such a function ϕ, one can check in poly(2^{b₃·ℓ}) = b^{o(1)} time whether one can construct a corresponding set V to obtain our (b₃, ℓ, τ(b))-reduction.
    Applying Lemma 4.11 three times, one obtains a (b, ℓ, O(τ(b)))-reduction (ψ, V). And since ϕ_{b₃,ℓ} can be found in b^{o(1)} time, together with Lemma 4.11, we obtain a uniform-O(τ)-reduction family.
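For intuition, the brute-force step can be carried out literally at toy scale. The sketch below is our own illustration, with tiny parameters of our choosing (b = 1, ℓ = 2, and entry bound R = 3 standing in for ℓ^{τ(b₃)·b₃}): it enumerates all candidate maps ϕ and checks whether some set V makes (ϕ, V) separate orthogonal from non-orthogonal pairs.

```python
from itertools import product

# Toy version of the brute-force search in Theorem 4.10: look for a map
# phi: {0,1}^(b*l) -> {0,...,R-1}^l and a set V with
#     x . y == 0  <=>  phi(x) . phi(y) in V.
b, l, R = 1, 2, 3
inputs = list(product([0, 1], repeat=b * l))

def dot(u, v):
    return sum(a * c for a, c in zip(u, v))

def find_reduction():
    codewords = list(product(range(R), repeat=l))
    # A candidate phi assigns one codeword to each of the 2^(b*l) inputs.
    for assignment in product(codewords, repeat=len(inputs)):
        phi = dict(zip(inputs, assignment))
        good = {dot(phi[x], phi[y]) for x in inputs for y in inputs if dot(x, y) == 0}
        bad = {dot(phi[x], phi[y]) for x in inputs for y in inputs if dot(x, y) != 0}
        if good.isdisjoint(bad):
            return phi, good  # take V := the inner products of orthogonal pairs
    return None

result = find_reduction()
assert result is not None
phi, V = result
for x in inputs:
    for y in inputs:
        assert (dot(x, y) == 0) == (dot(phi[x], phi[y]) in V)
```

One valid solution (among others the search may return first) maps (0,0) to (0,0), (0,1) to (1,0), (1,0) to (0,1), and (1,1) to (2,2), with V = {0}.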

   Finally, we give a direct corollary of Theorem 4.10: the existence of an O(1)-reduction family implies hardness of Z-OV, Z-Max-IP, ℓ₂-Furthest Pair and Bichromatic ℓ₂-Closest Pair in dimension ω(1).

Corollary 4.12. If there is an O(1)-reduction family, then for every ε > 0, there exists a c ≥ 1 such that Z-OV, Z-Max-IP, ℓ₂-Furthest Pair and Bichromatic ℓ₂-Closest Pair in dimension c with O(log n)-bit entries require n^{2−ε} time.

Proof sketch. Note that since the hardness of Z-OV implies the hardness of the other three, we only need to consider Z-OV here.
    From Theorem 4.10 and the assumption, there exists a uniform-O(1)-reduction family. Proceeding similarly as in Lemma 1.17 with the uniform-O(1)-reduction, we obtain a better dimensionality self-reduction from OV to Z-OV. Then exactly the same argument as in Theorem 1.18 with different parameters gives us the required lower bound.


5 NP · UPP communication protocol and exact hardness for Z-Max-IP
We note that the inapproximability results for (Boolean) Max-IP are established via a connection to the MA communication protocol for Set-Disjointness [5]. In light of this, in this section we view our reduction from OV to Z-Max-IP (Lemma 1.17 and Theorem 4.3) from the perspective of communication complexity.
    We observe that, in fact, our reduction can be understood as an NP · UPP communication protocol for Set-Disjointness. Moreover, we show that if we can get a slightly better NP · UPP communication protocol for Set-Disjointness, then we would be able to prove that Z-Max-IP is hard even for ω(1) dimensions (and also ℓ₂-Furthest Pair and Bichromatic ℓ₂-Closest Pair).


5.1 NP · UPP communication protocol for Set-Disjointness
First, we rephrase the results of Lemma 1.17 and Theorem 4.3 in a more convenient way for our use here.
Lemma 5.1 (Rephrasing of Lemma 1.17 and Theorem 4.3). Let 1 ≤ ℓ ≤ d, and m = ℓ^{O(6^{log* d}·(d/ℓ))}. There exists a family of functions

                                ψ^i_{Alice}, ψ^i_{Bob} : {0,1}^d → R^{(ℓ+1)²}

for i ∈ [m] such that:

    • when x · y = 0, there is an i such that ψ^i_{Alice}(x) · ψ^i_{Bob}(y) ≥ 0;

    • when x · y > 0, for every i, ψ^i_{Alice}(x) · ψ^i_{Bob}(y) < 0;

    • all ψ^i_{Alice}(x) and ψ^i_{Bob}(y) can be computed in poly(d) time.

    We also need the standard connection between UPP communication protocols and sign-rank [63]
(see also Chapter 4.11 of [49]).
    Recall that for a function F : X × Y → {0,1}, a UPP protocol for F is a private-coin randomized communication protocol such that: Alice and Bob hold x ∈ X and y ∈ Y respectively; F(x, y) = 1 if and only if Alice and Bob accept with probability > 1/2. The cost of the protocol is the maximum number of bits communicated over all pairs (x, y) ∈ X × Y and Alice's and Bob's corresponding private random coins.

Lemma 5.2 (Equivalence of sign-rank and UPP communication protocol (Theorem 3 of [63])). The following holds.

    • If there is a d-cost UPP protocol for F, then for ℓ = d + 1 there are mappings ψ^X : X → R^{2^ℓ} and ψ^Y : Y → R^{2^ℓ} such that for every (x, y) ∈ X × Y:

         – if F(x, y) = 1, ψ^X(x) · ψ^Y(y) ≥ 0;
         – otherwise, ψ^X(x) · ψ^Y(y) < 0.

    • Conversely, if there are mappings ψ^X : X → R^{2^ℓ} and ψ^Y : Y → R^{2^ℓ} satisfying the above conditions, then for d = ℓ + 1 there is a d-cost UPP protocol for F.

   From the above lemmas, we immediately get the needed communication protocol and prove Theo-
rem 1.21 (restated below for convenience).

Reminder of Theorem 1.21 For every 1 ≤ α ≤ n, there is an

                        (α · 6^{log* n} · (n/2^α), O(α))-computational-efficient

NP · UPP communication protocol for DISJ_n.

Proof sketch. We set α = log ℓ here. Given the function families {ψ^i_{Alice}}, {ψ^i_{Bob}} from Lemma 5.1, Merlin just sends the index i ∈ [m] and the rest follows from Lemma 5.2.


5.2     Slightly better protocols imply hardness in dimension ω(1)
Finally, we show that if we have a slightly better NP · UPP protocol for Set-Disjointness, then we can show that Z-Max-IP requires n^{2−o(1)} time even for ω(1) dimensions (and so do ℓ₂-Furthest Pair and Bichromatic ℓ₂-Closest Pair). We restate Theorem 1.22 here for convenience.

Reminder of Theorem 1.22 Assuming SETH (or OVC), if there is an increasing and unbounded function
f such that for every 1 ≤ α ≤ n, there is an
                                      (n/ f (α), α) -computational-efficient
NP · UPP communication protocol for DISJ_n, then Z-Max-IP_{n,ω(1)} requires n^{2−o(1)} time with vectors of polylog(n)-bit entries. The same holds for ℓ₂-Furthest Pair and Bichromatic ℓ₂-Closest Pair.

Proof. Suppose otherwise: there is a constant ε₁ > 0 and an algorithm A for Z-Max-IP_{n,d} running in n^{2−ε₁} time for all constants d. (Note that by Lemma 4.6 and Lemma 4.7, we only need to consider Z-Max-IP here.)
    Now, letting c be an arbitrary constant, we are going to construct an n^{2−Ω(1)}-time algorithm for OV_{n,c log n}, contradicting OVC.
    Let ε = ε₁/2, and let α be the first number such that c/f(α) < ε. Note that α is also a constant. Consider the (c log n/f(α), α)-computational-efficient NP · UPP protocol Π for DISJ_{c log n}, and let A, B be the two sets in the OV_{n,c log n} instance. Our algorithm via reduction works as follows:

      • There are 2^α possible messages in {0,1}^α; let m₁, m₂, . . . , m_{2^α} be an enumeration of them.

      • We first enumerate all possible advice strings from Merlin in Π. There are 2^{c log n/f(α)} ≤ 2^{ε·log n} = n^ε such strings; let φ ∈ {0,1}^{ε·log n} be such an advice string.

           – For each x ∈ A, let ψ_{Alice}(x) ∈ R^{2^α} be the vector of probabilities that Alice accepts each message from Bob. That is, ψ_{Alice}(x)_i is the probability that Alice accepts the message m_i, given its input x and the advice φ.

           – Similarly, for each y ∈ B, let ψ_{Bob}(y) ∈ R^{2^α} be the vector of probabilities that Bob sends each message. That is, ψ_{Bob}(y)_i is the probability that Bob sends the message m_i, given its input y and the advice φ.

           – Then, for each x ∈ A and y ∈ B, ψ_{Alice}(x) · ψ_{Bob}(y) is precisely the probability that Alice accepts at the end when Alice and Bob hold x and y respectively and the advice is φ. Now we let A_φ be the set of all the vectors ψ_{Alice}(x), and B_φ be the set of all the vectors ψ_{Bob}(y).

      • If there is a φ such that OPT(A_φ, B_φ) ≥ 1/2, then we output yes, and otherwise output no.

   From the definition of Π, it is straightforward to see that the above algorithm solves OV_{n,c·log n}. Moreover, notice that from the computational-efficiency property of Π, the reduction itself works in n^{1+ε} · polylog(n) time, and all the vectors in A_φ and B_φ have at most polylog(n) bits of precision, which means OPT(A_φ, B_φ) can be computed by a call to Z-Max-IP_{n,2^α} with vectors of polylog(n)-bit entries.
   Hence, the final running time of the above algorithm is bounded by n^ε · n^{2−ε₁} = n^{2−ε} (2^α is still a constant), which contradicts the OVC.
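The key identity used in the reduction, that the acceptance probability of a one-round private-coin protocol factors as the inner product of Alice's accept-probability vector and Bob's message-distribution vector, can be checked on a toy protocol. Everything below (the protocol itself and all names) is our own illustration, not a protocol from this paper.

```python
import random

# Toy one-round protocol: Bob holds y in {0,1}^2 and sends a random index i with
# y_i = 1 (or a uniform index if y = 0); Alice accepts message i iff x_i = 0.
# psi_bob(y)[i]   = Pr[Bob sends message i]
# psi_alice(x)[i] = Pr[Alice accepts on message i]  (deterministic here: 0 or 1)

def psi_bob(y):
    ones = [i for i, t in enumerate(y) if t == 1]
    support = ones if ones else list(range(len(y)))
    p = [0.0] * len(y)
    for i in support:
        p[i] = 1.0 / len(support)
    return p

def psi_alice(x):
    return [1.0 if t == 0 else 0.0 for t in x]

def simulate(x, y, trials=200000, seed=0):
    # Monte-Carlo estimate of the protocol's acceptance probability.
    rng = random.Random(seed)
    pa, pb = psi_alice(x), psi_bob(y)
    acc = 0.0
    for _ in range(trials):
        i = rng.choices(range(len(pb)), weights=pb)[0]
        if rng.random() < pa[i]:
            acc += 1
    return acc / trials

x, y = [0, 1], [1, 1]
exact = sum(a * b for a, b in zip(psi_alice(x), psi_bob(y)))  # inner product
assert abs(simulate(x, y) - exact) < 0.01
```

Here the inner product evaluates to 0.5 (Alice accepts exactly when Bob sends index 0, which happens with probability 1/2), matching the simulation.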


6    Improved MA protocols
In this section we prove Theorem 1.24 (restated below for convenience).

Reminder of Theorem 1.24 There is an MA protocol for DISJ_n and IP_n with communication complexity

                                    O(√(n log n log log n)).


    To prove Theorem 1.24, we need the following intermediate problem.
Definition 6.1 (The Inner Product Modulo p Problem (IP^p_n)). Let p and n be two positive integers. In IP^p_n, Alice and Bob are given vectors X and Y in {0,1}^n respectively and they want to compute X · Y (mod p).
    Note that IP_n and IP^p_n are not Boolean functions, so we need to generalize the definition of an MA protocol. In an MA protocol for IP_n, Merlin sends the answer directly to Alice together with a proof to convince Alice and Bob. The correctness condition becomes that for the right answer X · Y, Merlin has a proof such that Alice and Bob will accept with high probability (like 2/3). And the soundness condition becomes that for the wrong answers, every proof from Merlin will be rejected with high probability.
    We are going to use the following MA protocol for IP^p_n, which is a slight adaptation of the protocol from [67].
Lemma 6.2 (Implicit in Theorem 3.1 of [67]). For a sufficiently large prime q and integers T and n, there is an

                       (O(n/T · log q), log n + O(1), O(T · log q), 1/2)-efficient

MA protocol for IP^q_n.
    Now we are ready to prove Theorem 1.24.
Proof of Theorem 1.24. Since an IP_n protocol trivially implies a DISJ_n protocol, we only need to consider IP_n in the following.
    Now, let x be the number such that x^x = n. For convenience we are going to pretend that x is an integer. It is easy to see that x = Θ(log n/log log n). Then we pick 10x distinct primes p₁, p₂, . . . , p_{10x} in [x + 1, x²]. (We can assume that n is large enough to make x satisfy the requirement of Lemma 2.4.) Let T be a parameter. We use Π_{p_i} to denote the (O(n/T · log p_i), log n + O(1), O(T · log p_i), 1/2)-efficient MA protocol for IP^{p_i}_n.
    Our protocol for IPn works as follows:
    • Merlin sends Alice all the advice strings from the protocols Π p1 , Π p2 , . . . , Π p10x , together with a
      presumed inner product 0 ≤ z ≤ n.
    • Note that Π_{p_i} contains the presumed value of X · Y (mod p_i); Alice first checks whether z is consistent with all these Π_{p_i}, and rejects immediately if it is not.
    • Alice and Bob jointly toss O(log(10x)) coins to pick a uniform random number i* ∈ [10x], and then they simulate Π_{p_{i*}}. That is, they pretend they are the Alice and Bob in the protocol Π_{p_{i*}}, with the advice from Merlin in Π_{p_{i*}} (which Alice does have).



Correctness. Let X, Y ∈ {0,1}^n be the vectors of Alice and Bob. If X · Y = z, then by the definition of the protocols Π_{p_i}, Alice always accepts with the correct advice from Merlin. Otherwise, let d = X · Y ≠ z. We are going to analyze the probability that we pick a "good" p_{i*}, that is, one that does not divide |d − z|. Since p_i > x for all the p_i and x^x > n ≥ |d − z|, |d − z| cannot be a multiple of more than x primes among the p_i. Therefore, with probability at least 0.9, our pick of p_{i*} is good. And in this case, from the definition of the protocols Π_{p_i}, Alice and Bob reject afterward with probability at least 1/2, as d (mod p_{i*}) differs from z (mod p_{i*}). In summary, when X · Y ≠ z, Alice rejects with probability at least 0.9/2 = 0.45, which finishes the proof of correctness.
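The counting argument is easy to illustrate numerically at a small scale (x = 100 below is our own toy stand-in for Θ(log n/log log n)): the interval [x + 1, x²] contains at least 10x primes, and since each exceeds x, no difference D with 0 < |D| ≤ x^x = n can be divisible by more than x of them.

```python
import math

def primes_in(lo, hi):
    # Simple sieve of Eratosthenes restricted to [lo, hi].
    sieve = [True] * (hi + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(hi ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i:: i] = [False] * len(sieve[i * i:: i])
    return [p for p in range(lo, hi + 1) if sieve[p]]

x = 100  # toy stand-in for Theta(log n / log log n)
ps = primes_in(x + 1, x * x)[:10 * x]
assert len(ps) == 10 * x  # [x+1, x^2] contains enough primes at this scale

# If more than x of these primes divided some D with 0 < |D| <= x^x, then D
# would be at least the product of x+1 primes, each larger than x -- impossible:
assert math.prod(ps[: x + 1]) > x ** x
# Hence a uniformly random p_i misses a wrong answer with probability <= x/(10x) = 0.1.
```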

Complexity. Now, note that the total advice length is

        O(n/T · Σ_{i=1}^{10x} log p_i) = O(n/T · log Π_{i=1}^{10x} x²) = O(n/T · log x^{20x}) = O(n/T · log n).

And the communication complexity between Alice and Bob is bounded by

                                  O(T · log x²) = O(T · log log n).

    Setting T = √(n log n/log log n) balances the above two quantities, and we obtain the needed MA protocol for IP_n.


7    Hardness of Approximate {−1, 1}-Max-IP via approximate
     polynomial for OR
In this section we apply the O(√n)-degree approximate polynomial for OR [27, 77] to show hardness of approximate {−1,1}-Max-IP. We first give a reduction from OV to approximate {−1,1}-Max-IP.
Theorem 7.1. Let ε ∈ (0, 1). An OV_{n,d} instance with sets A, B reduces to a {−1,1}-Max-IP_{n,d₁} instance with sets Ã and B̃, such that:

    • d₁ = (d choose ≤ O(√(d log 1/ε)))³ · 2^{O(√(d log 1/ε))} · ε^{−1}, in which the notation (n choose ≤ m) denotes Σ_{i=0}^{m} (n choose i).

    • There is an integer T > ε^{−1} such that if there is an (a, b) ∈ A × B such that a · b = 0, then OPT(Ã, B̃) ≥ T.

    • Otherwise, |OPT(Ã, B̃)| ≤ T · ε.

    • Moreover, the reduction takes n · poly(d₁) time.

      We remark here that the above reduction fails to achieve a characterization: setting ε = 1/2 and d = c log n for an arbitrary constant c, we have d₁ = 2^{Õ(√(log n))}, much larger than log n. Another interesting difference between the above theorem and Lemma 3.3 (the reduction from OV to approximate Max-IP) is that Lemma 3.3 reduces one OV instance to many Max-IP instances, while the above reduction only reduces it to one {−1,1}-Max-IP instance.


Proof of Theorem 7.1.
Construction and analysis of the polynomial P_ε(z). By [27, 77], there is a polynomial P_ε : {0,1}^d → R such that:

    • P_ε is of degree D = O(√(d log 1/ε)).

    • For every z ∈ {0,1}^d, P_ε(z) ∈ [0, 1].

    • Given z ∈ {0,1}^d, if OR(z) = 0, then P_ε(z) ≥ 1 − ε, otherwise P_ε(z) ≤ ε.

    • P_ε can be constructed in time polynomial in its description size.

   Now, let us analyze P_ε further. For a set S ⊆ [d], let χ_S : {0,1}^d → R be χ_S(z) := Π_{i∈S} (−1)^{z_i}. Then we can write P_ε as

                                     P_ε := Σ_{S⊆[d], |S|≤D} χ_S · ⟨χ_S, P_ε⟩,

where ⟨χ_S, P_ε⟩ is the inner product of χ_S and P_ε, defined as ⟨χ_S, P_ε⟩ := E_{x∈{0,1}^d} χ_S(x) · P_ε(x).
   Let c_S = ⟨χ_S, P_ε⟩. From the definition it is easy to see that c_S ∈ [−1, 1].
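The Fourier expansion used here is easy to verify computationally for small d. The sketch below is our own illustration, with d = 3 and the exact OR function standing in for P_ε: it computes the coefficients ⟨χ_S, f⟩ and recovers f exactly from them.

```python
from itertools import combinations, product

d = 3
points = list(product([0, 1], repeat=d))
subsets = [frozenset(c) for r in range(d + 1) for c in combinations(range(d), r)]

def chi(S, z):
    # character chi_S(z) = prod_{i in S} (-1)^{z_i}
    return (-1) ** sum(z[i] for i in S)

def coeff(S, f):
    # <chi_S, f> = E_{z in {0,1}^d} chi_S(z) * f(z)
    return sum(chi(S, z) * f(z) for z in points) / len(points)

f = lambda z: 1.0 if any(z) else 0.0  # OR, standing in for P_eps
c = {S: coeff(S, f) for S in subsets}

# Every function on {0,1}^d equals the sum of its Fourier components.
for z in points:
    recon = sum(c[S] * chi(S, z) for S in subsets)
    assert abs(recon - f(z)) < 1e-9
```

(For the low-degree P_ε of the proof, the sum would be truncated to |S| ≤ D; here the exact OR needs all subsets.)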
Discretization of the polynomial P_ε. Note that P_ε(z) has real coefficients; we need to turn it into another polynomial with integer coefficients first.
    Let M := (d choose ≤ D) = Σ_{i=0}^{D} (d choose i). Consider the following polynomial P̂_ε:

                                         P̂_ε := Σ_{S⊆[d],|S|≤D} ⌊c_S · 2M/ε⌋ · χ_S .

   We can see that |P̂_ε(z)/(2M/ε) − P_ε(z)| ≤ ε for every z ∈ {0,1}^d, and we let ĉ_S := ⌊c_S · M · 2/ε⌋ for convenience.
Simplification of the polynomial P̂_ε. P̂_ε(z) is expressed over the basis consisting of the χ_S. We need to turn it into a polynomial over the standard basis.
    For each S ⊆ [d], the character χ_S can also be written as

                             χ_S(z) = Π_{i∈S} (−1)^{z_i} = Π_{i∈S} (1 − 2z_i) = Σ_{T⊆S} (−2)^{|T|} z_T ,

where z_T := Π_{i∈T} z_i. Plugging this into the expression of P̂_ε, we have

                           P̂_ε(z) = Σ_{T⊆[d],|T|≤D} ( Σ_{S⊆[d],|S|≤D,T⊆S} ĉ_S ) · (−2)^{|T|} z_T .

    Set

                                      c̃_T := ( Σ_{S⊆[d],|S|≤D,T⊆S} ĉ_S ) · (−2)^{|T|} ;



the above simplifies to
                                                 P̂_ε(z) = Σ_{T⊆[d],|T|≤D} c̃_T · z_T .


Properties of the polynomial P̂_ε. Let us summarize some properties of P̂_ε for now. First we need a bound on |c̃_T|. We can see |ĉ_S| ≤ M · 2/ε, and by a simple calculation we have

                                                   |c̃_T| ≤ M² · 2^D · 2/ε.

    Let B = M² · 2^D · 2/ε for convenience. For x, y ∈ {0,1}^d, consider P̂_ε(x, y) := P̂_ε(x₁y₁, x₂y₂, . . . , x_d y_d) (that is, plugging in z_i = x_i y_i). We have

                                          P̂_ε(x, y) = Σ_{T⊆[d],|T|≤D} c̃_T · x_T · y_T ,

where x_T := Π_{i∈T} x_i and y_T is defined similarly. Moreover, we have

    • If x · y = 0, then P̂_ε(x, y) ≥ (2M/ε) · (1 − 2ε).

    • If x · y ≠ 0, then |P̂_ε(x, y)| ≤ (2M/ε) · 2ε.

The reduction. Now let us construct the reduction; we begin with some notation. For two vectors a and b,
we use a ◦ b to denote their concatenation. For a vector a and a real τ, we use a · τ to denote the vector
obtained by multiplying each coordinate of a by τ. Let sgn(τ) be the sign function, which outputs 1
when τ > 0, −1 when τ < 0, and 0 when τ = 0. For τ ∈ {−B, −B + 1, . . . , B}, we use eτ ∈ {−1, 0, 1}^B
to denote the vector whose first |τ| coordinates are sgn(τ) and whose remaining coordinates are zero. We
also use 1 to denote the all-ones vector of length B.
    Let T1, T2, . . . , TM be an enumeration of all subsets T ⊆ [d] such that |T| ≤ D. We define

                     ϕx(x) := ◦_{i=1}^{M} (e_{c̃Ti} · xTi)    and    ϕy(y) := ◦_{i=1}^{M} (1 · yTi).

And we have

            ϕx(x) · ϕy(y) = ∑_{i=1}^{M} (e_{c̃Ti} · 1) · (xTi · yTi) = ∑_{i=1}^{M} c̃Ti · xTi · yTi = P̂ε(x, y).
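The gadget eτ simply encodes an integer coefficient as a {−1, 0, 1}-vector whose inner product with the all-ones vector recovers τ, so that each concatenated block contributes one term c̃Ti · xTi · yTi. A brute-force sketch of both facts (using a small stand-in bound B and random stand-in coefficients, not the actual c̃T values):

```python
import random

# A small stand-in bound; in the reduction B = M^2 * 2^D * 2/eps bounds the |c~_T|.
B = 10

def e_vec(tau, B):
    """Vector in {-1,0,1}^B whose first |tau| entries are sgn(tau) and the rest zero."""
    sgn = 1 if tau > 0 else (-1 if tau < 0 else 0)
    return [sgn] * abs(tau) + [0] * (B - abs(tau))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

ones = [1] * B

# e_tau is built exactly so that e_tau . 1 = tau.
for tau in range(-B, B + 1):
    assert dot(e_vec(tau, B), ones) == tau

# Concatenating one block (e_{c_i} * u_i on one side, 1 * v_i on the other) per
# monomial computes the weighted sum  sum_i c_i * u_i * v_i.
random.seed(0)
c = [random.randint(-B, B) for _ in range(5)]   # stand-ins for c~_{T_i}
u = [random.randint(0, 1) for _ in range(5)]    # stand-ins for the monomials x_{T_i}
v = [random.randint(0, 1) for _ in range(5)]    # stand-ins for y_{T_i}
phi_x = [s * ui for ci, ui in zip(c, u) for s in e_vec(ci, B)]
phi_y = [s * vi for vi in v for s in ones]
assert dot(phi_x, phi_y) == sum(ci * ui * vi for ci, ui, vi in zip(c, u, v))
print("block encoding verified")
```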

   To move from {−1, 0, 1} to {−1, 1}, we use the following carefully designed reductions ψx, ψy :
{−1, 0, 1} → {−1, 1}^4, such that

                      ψx(−1) = ψy(−1) = (−1, −1, −1, −1),        ψx(0) = (−1, −1, 1, 1),

                          ψy(0) = (1, −1, 1, −1),    and    ψx(1) = ψy(1) = (1, 1, 1, 1).
    It is easy to check that for a, b ∈ {−1, 0, 1}, we have ψx (a) · ψy (b) = 4 · (a · b).
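The claimed identity ψx(a) · ψy(b) = 4 · (a · b) can be checked exhaustively over the nine pairs (a, b) ∈ {−1, 0, 1}²; a minimal verification:

```python
from itertools import product

# The two gadget mappings from {-1, 0, 1} into {-1, 1}^4, as defined above.
psi_x = {-1: (-1, -1, -1, -1), 0: (-1, -1, 1, 1), 1: (1, 1, 1, 1)}
psi_y = {-1: (-1, -1, -1, -1), 0: (1, -1, 1, -1), 1: (1, 1, 1, 1)}

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Exhaustive check of psi_x(a) . psi_y(b) = 4ab over all nine input pairs.
for a, b in product([-1, 0, 1], repeat=2):
    assert dot(psi_x[a], psi_y[b]) == 4 * a * b
print("psi gadget verified on all of {-1,0,1} x {-1,0,1}")
```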
                                                                                                  ⊗(B·M)
    Hence, composing the above two reductions, we get our desired reductions φx = ψx                       ◦ ϕx and
        ⊗(B·M)                                 d                           4B·M
φy = ψy        ◦ ϕy such that for x, y ∈ {0, 1} , φx (x), φy (y) ∈ {−1, 1}      and φx (x) · φy (y) = 4 · Pbε (x, y).

                                                 L IJIE C HEN

   Finally, given an OVn,d instance with two sets A and B, we construct two sets Ã and B̃, such that Ã
consists of all the vectors φx(x) for x ∈ A, and B̃ consists of all the vectors φy(y) for y ∈ B.
   Then we can see that Ã and B̃ consist of n vectors from {−1, 1}^{d1}, where

          d1 = 4B · M = M³ · 2^D · 8/ε = (d choose ≤ O(√(d log 1/ε)))³ · 2^{O(√(d log 1/ε))} · ε⁻¹

as stated.
    It is not hard to see that the above reduction takes n · poly(d1) time. Moreover, if there is an (x, y) ∈ A × B
such that x · y = 0, then OPT(Ã, B̃) ≥ (8M/ε) · (1 − 2ε); otherwise, OPT(Ã, B̃) ≤ (8M/ε) · 2ε. Setting the ε
above to be 1/3 times the ε in the statement finishes the proof.

    With Theorem 7.1, we are ready to prove our hardness results on {−1, 1}-Max-IP.

Theorem 7.2. Assume SETH (or OVC). Letting α : N → R be any function of n such that α(n) = n^{o(1)},
there is another function β satisfying β(n) = n^{o(1)} and an integer T > α (β and T depend on α), such
that there is no n^{2−Ω(1)}-time algorithm for {−1, 1}-Max-IP_{n,β(n)} distinguishing the following two cases:

    • OPT(A, B) ≥ T (A and B are the sets in the {−1, 1}-Max-IP instance).

    • |OPT(A, B)| ≤ T /α(n).

Proof. Let k = log α(n)/log n; since α(n) = n^{o(1)}, we have k = o(1). Setting d = c log n, where c is an
arbitrary constant, and ε = α⁻¹ in Theorem 7.1, an OV_{n,c log n} instance reduces to computing an
α(n)-approximation to a {−1, 1}-Max-IP_{n,d1} instance with sets A and B, where

   d1 = (c log n choose ≤ O(√(ck) · log n))³ · 2^{O(√(ck) · log n)} ≤ (√(c/k))^{O(√(ck) · log n)} · 2^{O(√(ck) · log n)} = n^{O(log(c/k) · √(ck))}.

Now let β = n^{k^{1/3}} and let T be the integer specified by Theorem 7.1. Since k = o(1), we have β = n^{o(1)}.
Suppose for contradiction that there is an n^{2−Ω(1)}-time algorithm distinguishing whether OPT(A, B) ≥ T or
|OPT(A, B)| ≤ T/α(n). Then for any constant c, O(log(c/k) · √(ck)) ≤ k^{1/3} for sufficiently large n, which
means d1 ≤ β(n) for sufficiently large n; hence there is an n^{2−Ω(1)}-time algorithm for OV_{n,c log n} by
Theorem 7.1, contradicting OVC.


8    Future work
We end our paper by discussing a few interesting research directions.

    1. The most important future direction from this work is to further improve the dimensionality
       reduction for OV. It is certainly weird to consider 2^{O(log* n)} to be the right answer for the limit
       of the dimensionality reduction. This term seems to follow from the limitation of our recursive
       number-theoretical construction, and not from the nature of the problem itself. We conjecture that
       there should be a dimensionality reduction to ω(1) dimensions with a more direct construction.


       One possible direction is to combine the original polynomial-based construction from [75] with
       our new number-theoretical one. These two approaches seem completely different, so a clever
       combination of them might prove our conjecture.

    2. In order to prove ω(1)-dimensional hardness for ℓ2-Furthest Pair and Bichromatic ℓ2-Closest
       Pair, we can also bypass the OV dimensionality-reduction approach by proving ω(1)-dimensional
       hardness for Z-Max-IP directly. One possible way to approach this question is to start from
       the NP · UPP communication-protocol connection as in Section 5 (applying Theorem 1.22), and
       (potentially) draw connections from some known UPP communication protocols.

    3. We have seen an efficient reduction from Z-OV to Z-Max-IP which blows up the dimension only
       quadratically. Is there a similar reduction from Z-Max-IP back to Z-OV? Are Z-Max-IP and Z-OV
       equivalent?

                                                                          e √log n) factor from
   4. By making use of the new AG-code based MA protocols, we can shave a O(
                                                           √
      the communication complexity, can we obtain an O( n) MA communication protocol matching
      the lower bound for DISJn ? It seems new ideas are required.
      Since our MA protocol works for both DISJ and IP, and IP does seem to be a harder problem. It
                                                                                             √
      may be better to find an MA protocol only works for DISJ. It is worth noting that an O( n) AMA
      communication protocol for DISJ is given by [67], which doesn’t work for IP.

    5. Can the dependence on ε in the algorithms from Theorem 1.5 be further improved? Is it possible to
       apply ideas from the n^{2−1/Ω̃(√c)}-time algorithm for Max-IP_{n,c log n} from [13]?

    6. For the complexity of computing a 2-multiplicative-approximation to Max-IP_{n,c log n}, Theorem 1.5
       implies that there is an algorithm running in n^{2−1/O(log c)} time, the same as the best known
       algorithm for OV_{n,c log n} [6]. Is this just a coincidence? Or are there some connections between
       these two problems?

   7. We obtain a connection between hardness of Z-Max-IP and NP · UPP communication protocols for
      Set-Disjointness. Can we get similar connections from other NP · C type communication protocols
      for Set-Disjointness? Some candidates include NP · SBP and NP · promiseBQP (QCMA).


Acknowledgment
I would like to thank Ryan Williams for introducing the problem to me, for countless encouragement and
helpful discussions during this work, and for many comments on a draft of this paper. In particular, the
idea of improving the OV dimensionality self-reduction using the CRT (the direct CRT-based approach)
was introduced to me by Ryan Williams.
    I am grateful to Virginia Vassilevska Williams, Kaifeng Lyu and Peilin Zhong for helpful discussions
and suggestions. I would like to thank Aviad Rubinstein for sharing a manuscript of his paper, and for
pointing out that the O(√(n log n log log n)) MA protocol also works for Inner Product.


References
 [1] Scott Aaronson and Avi Wigderson: Algebrization: A new barrier in complexity theory.
     ACM Trans. Comput. Theory, 1(1):2:1–2:54, 2009. Preliminary version in STOC’08.
     [doi:10.1145/1490270.1490272] 4, 11, 13

 [2] Amir Abboud and Arturs Backurs: Towards hardness of approximation for polynomial
     time problems. In Proc. 8th Symp. Innovations in Theoretical Computer Science
     (ITCS’17), pp. 11:1–11:26. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 2017.
     [doi:10.4230/LIPIcs.ITCS.2017.11] 14

 [3] Amir Abboud and Søren Dahlgaard: Popular conjectures as a barrier for dynamic
     planar graph algorithms. In Proc. 57th FOCS, pp. 477–486. IEEE Comp. Soc., 2016.
     [doi:10.1109/FOCS.2016.58, arXiv:1605.03797] 13

 [4] Amir Abboud and Aviad Rubinstein: Fast and deterministic constant factor approximation
     algorithms for LCS imply new circuit lower bounds. In Proc. 9th Symp. Innovations in Theoretical
     Computer Science (ITCS’18), pp. 35:1–35:14. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik,
     2018. [doi:10.4230/LIPIcs.ITCS.2018.35] 13, 14

 [5] Amir Abboud, Aviad Rubinstein, and R. Ryan Williams: Distributed PCP theorems
     for hardness of approximation in P. In Proc. 58th FOCS, pp. 25–36. IEEE Comp. Soc., 2017.
     [doi:10.1109/FOCS.2017.12, arXiv:1706.06407] 3, 4, 6, 8, 9, 13, 14, 23, 24, 35

 [6] Amir Abboud, R. Ryan Williams, and Huacheng Yu: More applications of
     the polynomial method to algorithm design. In Proc. 26th Ann. ACM-SIAM Symp. on Discrete
     Algorithms (SODA’15), pp. 218–230. ACM Press, 2015. [doi:10.1137/1.9781611973730.17] 6, 43

 [7] Amir Abboud and Virginia Vassilevska Williams: Popular conjectures imply strong
     lower bounds for dynamic problems. In Proc. 55th FOCS, pp. 434–443. IEEE Comp. Soc., 2014.
     [doi:10.1109/FOCS.2014.53, arXiv:1402.0054] 13

 [8] Amir Abboud, Virginia Vassilevska Williams, and Oren Weimann: Consequences
     of faster alignment of sequences. In Proc. 41st Internat. Colloq. on Automata, Languages and
     Programming (ICALP’14), pp. 39–51. Springer, 2014. [doi:10.1007/978-3-662-43948-7_4] 5, 13

 [9] Amir Abboud, Virginia Vassilevska Williams, and Huacheng Yu: Matching triangles
     and basing hardness on an extremely popular conjecture. SIAM J. Comput., 47(3):1098–1122, 2018.
     Preliminary version in STOC’15. [doi:10.1137/15M1050987] 13

[10] Pankaj K. Agarwal, Herbert Edelsbrunner, Otfried Schwarzkopf, and Emo
     Welzl: Euclidean minimum spanning trees and bichromatic closest pairs. Discrete Comput.
     Geom., 6(3):407–422, 1991. Preliminary version in SoCG’90. [doi:10.1007/BF02574698] 4

[11] Thomas Dybdahl Ahle, Rasmus Pagh, Ilya P. Razenshteyn, and Francesco
     Silvestri: On the complexity of inner product similarity join. In Proc. 35th ACM Symp. Principles



     of Database Sys. (PODS’16), pp. 151–164. ACM Press, 2016. [doi:10.1145/2902251.2902285,
     arXiv:1510.02824] 3, 6, 8, 14

[12] Josh Alman: An illuminating algorithm for the light bulb problem. In 2nd Symp. on Simplicity in
     Algorithms (SOSA’19), pp. 2:1–2:11, 2019. [doi:10.4230/OASIcs.SOSA.2019.2, arXiv:1810.06740]
     17

[13] Josh Alman, Timothy M. Chan, and R. Ryan Williams: Polynomial representations of
     threshold functions and algorithmic applications. In Proc. 57th FOCS, pp. 467–476. IEEE Comp.
     Soc., 2016. [doi:10.1109/FOCS.2016.57, arXiv:1608.04355] 5, 7, 20, 22, 43

[14] Josh Alman and R. Ryan Williams: Probabilistic polynomials and Hamming nearest neigh-
     bors. In Proc. 56th FOCS, pp. 136–150. IEEE Comp. Soc., 2015. [doi:10.1109/FOCS.2015.18,
     arXiv:1507.05106] 3, 5, 7

[15] Alexandr Andoni and Piotr Indyk: Near-optimal hashing algorithms for approximate
     nearest neighbor in high dimensions. Comm. ACM, 51(1):117–122, 2008. Conference version in
     FOCS’06. [doi:10.1145/1327452.1327494] 3

[16] Alexandr Andoni, Piotr Indyk, Thijs Laarhoven, Ilya P. Razenshteyn, and Ludwig
     Schmidt: Practical and optimal LSH for angular distance. In Adv. in Neural Inform. Proc. Systems
     (NIPS’15), pp. 1225–1233. MIT Press, 2015. [arXiv:1509.02897] 3

[17] Alexandr Andoni, Piotr Indyk, Huy Lê Nguyễn, and Ilya P. Razenshteyn: Beyond
     locality-sensitive hashing. In Proc. 25th Ann. ACM-SIAM Symp. on Discrete Algorithms (SODA’14),
     pp. 1018–1028. ACM Press, 2014. [doi:10.1137/1.9781611973402.76, arXiv:1306.1547] 3

[18] Alexandr Andoni and Ilya P. Razenshteyn: Optimal data-dependent hashing for
     approximate near neighbors. In Proc. 47th STOC, pp. 793–801. ACM Press, 2015.
     [doi:10.1145/2746539.2746553, arXiv:1501.01062] 3

[19] Tom M. Apostol: Introduction to Analytic Number Theory. Springer, 2013. [doi:10.1007/978-1-
     4757-5579-4] 15

[20] Sanjeev Arora and Boaz Barak: Computational Complexity – A Modern Approach. Cam-
     bridge Univ. Press, 2009. 16

[21] Arturs Backurs and Piotr Indyk: Which regular expression patterns are hard to match?
     In Proc. 57th FOCS, pp. 457–466. IEEE Comp. Soc., 2016. [doi:10.1109/FOCS.2016.56,
     arXiv:1511.07070] 13

[22] Arturs Backurs and Piotr Indyk: Edit distance cannot be computed in strongly subquadratic
     time (unless SETH is false). SIAM J. Comput., 47(3):1087–1097, 2018. Preliminary version in
     STOC’15. [doi:10.1137/15M1053128, arXiv:1412.0348] 13

[23] Jon Louis Bentley and Michael Ian Shamos: Divide-and-conquer in multidimensional
     space. In Proc. 8th STOC, pp. 220–230. ACM Press, 1976. [doi:10.1145/800113.803652] 9


[24] Karl Bringmann: Why walking the dog takes time: Fréchet distance has no strongly sub-
     quadratic algorithms unless SETH fails. In Proc. 55th FOCS, pp. 661–670. IEEE Comp. Soc., 2014.
     [doi:10.1109/FOCS.2014.76, arXiv:1404.1448] 13

[25] Karl Bringmann, Allan Grønlund, and Kasper Green Larsen: A dichotomy for
     regular expression membership testing. In Proc. 58th FOCS, pp. 307–318. IEEE Comp. Soc., 2017.
     [doi:10.1109/FOCS.2017.36, arXiv:1611.00918] 13

[26] Karl Bringmann and Marvin Künnemann: Multivariate fine-grained complexity of longest
     common subsequence. In Proc. 29th Ann. ACM-SIAM Symp. on Discrete Algorithms (SODA’18),
     pp. 1216–1235. ACM Press, 2018. [doi:10.1137/1.9781611975031.79, arXiv:1803.00938] 13

[27] Harry Buhrman, Richard Cleve, Ronald de Wolf, and Christof Zalka: Bounds for
     small-error and zero-error quantum algorithms. In Proc. 40th FOCS, pp. 358–368. IEEE Comp.
     Soc., 1999. [doi:10.1109/SFFCS.1999.814607, arXiv:cs/9904019] 8, 39, 40

[28] Harry Buhrman, Richard Cleve, and Avi Wigderson: Quantum vs. classical
     communication and computation. In Proc. 30th STOC, pp. 63–68. ACM Press, 1998.
     [doi:10.1145/276698.276713, arXiv:quant-ph/9802040] 8

[29] Chris Calabro, Russell Impagliazzo, and Ramamohan Paturi: The complexity of
     satisfiability of small depth circuits. In 4th Internat. Workshop on Parameterized and Exact
     Computation (IWPEC’09), pp. 75–85. Springer, 2009. [doi:10.1007/978-3-642-11269-0_6] 5

[30] Timothy M. Chan: A (slightly) faster algorithm for Klee’s measure problem. Comput. Geom.,
     43(3):243–250, 2010. Preliminary version in SoCG’08. [doi:10.1016/j.comgeo.2009.01.007] 11

[31] Lijie Chen, Shafi Goldwasser, Kaifeng Lyu, Guy N. Rothblum, and Aviad Rubin-
     stein: Fine-grained complexity meets IP = PSPACE. In Proc. 30th Ann. ACM-SIAM Symp. on
     Discrete Algorithms (SODA’19), pp. 1–20. ACM Press, 2019. [doi:10.1137/1.9781611975482.1,
     arXiv:1805.02351] 13

[32] Lijie Chen and R. Ryan Williams: An equivalence class for orthogonal vectors. In Proc.
     30th Ann. ACM-SIAM Symp. on Discrete Algorithms (SODA’19), pp. 21–40. ACM Press, 2019.
     [doi:10.1137/1.9781611975482.2, arXiv:1811.12017] 13

[33] Tobias Christiani: A framework for similarity search with space-time tradeoffs using locality-
     sensitive filtering. In Proc. 28th Ann. ACM-SIAM Symp. on Discrete Algorithms (SODA’17), pp.
     31–46. ACM Press, 2017. [doi:10.1137/1.9781611974782.3, arXiv:1605.02687] 3

[34] Tobias Christiani and Rasmus Pagh: Set similarity search beyond minhash. In Proc. 49th
     STOC, pp. 1094–1107. ACM Press, 2017. [doi:10.1145/3055399.3055443, arXiv:1612.07710] 3

[35] Don Coppersmith: Rapid multiplication of rectangular matrices. SIAM J. Comput., 11(3):467–
     471, 1982. [doi:10.1137/0211037] 14


[36] Svyatoslav Covanov and Emmanuel Thomé: Fast integer multiplication using generalized
     Fermat primes. Math. Comput., 88(317):1449–1477, 2019. [doi:10.1090/mcom/3367] 11

[37] Roee David, Karthik C. S., and Bundit Laekhanukit: On the complexity of closest pair
     via polar-pair of point-sets. SIAM J. Discrete Math., 33(1):509–527, 2019. Preliminary version in
     SoCG’18. [doi:10.1137/17M1128733, arXiv:1608.03245] 13

[38] Martin Dietzfelbinger, Torben Hagerup, Jyrki Katajainen, and Martti Pentto-
     nen: A reliable randomized algorithm for the closest-pair problem. J. Algorithms, 25(1):19–51,
     1997. [doi:10.1006/jagm.1997.0873] 9

[39] Martin Fürer: Faster integer multiplication. SIAM J. Comput., 39(3):979–1005, 2009. Prelimi-
     nary version in STOC’07. [doi:10.1137/070711761] 11

[40] Jiawei Gao, Russell Impagliazzo, Antonina Kolokolova, and R. Ryan Williams:
     Completeness for first-order properties on sparse structures with algorithmic applications.
     ACM Trans. Algorithms, 15(3):23:1–23:35, 2018. Preliminary version in SODA’17.
     [doi:10.1145/3196275] 13

[41] Isaac Goldstein, Tsvi Kopelowitz, Moshe Lewenstein, and Ely Porat: Conditional
     lower bounds for space/time tradeoffs. In Proc. 15th Internat. Workshop on Algorithms and
     Data Structures (WADS’17), pp. 421–436. Springer, 2017. [doi:10.1007/978-3-319-62127-2_36,
     arXiv:1706.05847] 13

[42] Lov K. Grover: A fast quantum mechanical algorithm for database search. In Proc. 28th STOC,
     pp. 212–219. ACM Press, 1996. [doi:10.1145/237814.237866, arXiv:quant-ph/9605043] 8

[43] Sariel Har-Peled, Piotr Indyk, and Rajeev Motwani: Approximate nearest neighbors:
     Towards removing the curse of dimensionality. Theory of Computing, 8(14):321–350, 2012.
     Preliminary versions in STOC’98 and FOCS’01. [doi:10.4086/toc.2012.v008a014] 3

[44] David Harvey, Joris van der Hoeven, and Grégoire Lecerf: Even faster integer
     multiplication. J. Complexity, 36(C):1–30, 2016. [doi:10.1016/j.jco.2016.03.001, arXiv:1407.3360]
     11

[45] Monika Henzinger, Sebastian Krinninger, Danupon Nanongkai, and Thatchaphol
     Saranurak: Unifying and strengthening hardness for dynamic problems via the online
     matrix-vector multiplication conjecture. In Proc. 47th STOC, pp. 21–30. ACM Press, 2015.
     [doi:10.1145/2746539.2746609, arXiv:1511.06773] 13

[46] Monika Henzinger, Andrea Lincoln, Stefan Neumann, and Virginia Vassilevska
     Williams: Conditional hardness for sensitivity problems. In Proc. 8th Symp. Innovations in
     Theoretical Computer Science (ITCS’17), pp. 26:1–26:31. Schloss Dagstuhl–Leibniz-Zentrum fuer
     Informatik, 2017. [doi:10.4230/LIPIcs.ITCS.2017.26, arXiv:1703.01638] 13


[47] Russell Impagliazzo and Ramamohan Paturi: On the complexity of k-SAT. J. Comput.
     System Sci., 62(2):367–375, 2001. Preliminary version in CCC’99. [doi:10.1006/jcss.2000.1727] 3,
     5

[48] Piotr Indyk and Rajeev Motwani: Approximate nearest neighbors: Towards remov-
     ing the curse of dimensionality. In Proc. 30th STOC, pp. 604–613. ACM Press, 1998.
     [doi:10.1145/276698.276876] 3

[49] Stasys Jukna: Boolean Function Complexity: Advances and Frontiers. Volume 27. Springer,
     2012. [doi:10.1007/978-3-642-24508-4] 36

[50] Matti Karppa, Petteri Kaski, and Jukka Kohonen: A faster subquadratic algorithm for
     finding outlier correlations. ACM Trans. Algorithms, 14(3):31:1–31:26, 2018. Preliminary version
     in SODA’16. [doi:10.1145/3174804, arXiv:1510.03895] 3

[51] Karthik C. S., Bundit Laekhanukit, and Pasin Manurangsi: On the parameterized
     complexity of approximating dominating set. J. ACM, 66(5):33:1–33:38, 2019. Preliminary version
     in STOC’18. [doi:10.1145/3325116, arXiv:1711.11029] 4, 13, 15, 23

[52] Karthik C. S. and Pasin Manurangsi: On closest pair in Euclidean metric: Monochromatic is
     as hard as bichromatic. Combinatorica, 2020. Preliminary version in ITCS’19. [doi:10.1007/s00493-
     019-4113-1, arXiv:1812.00901] 14

[53] Samir Khuller and Yossi Matias: A simple randomized sieve algorithm for the closest-pair
     problem. Inform. and Comput., 118(1):34–37, 1995. [doi:10.1006/inco.1995.1049] 9

[54] Hartmut Klauck: Rectangle size bounds and threshold covers in communication complexity.
     In Proc. 18th Conf. Computational Complexity (CCC’03), pp. 118–134. IEEE Comp. Soc., 2003.
     [doi:10.1109/CCC.2003.1214415, arXiv:cs/0208006] 11

[55] Tsvi Kopelowitz, Seth Pettie, and Ely Porat: Higher lower bounds from the 3SUM
     conjecture. In Proc. 27th Ann. ACM-SIAM Symp. on Discrete Algorithms (SODA’16), pp. 1272–
     1287. ACM Press, 2016. [doi:10.1137/1.9781611974331.ch89, arXiv:1407.6756] 13

[56] Robert Krauthgamer and Ohad Trabelsi: Conditional lower bounds for all-pairs max-
     flow. ACM Trans. Algorithms, 14(4):42:1–42:15, 2018. Preliminary version in ICALP’17.
     [doi:10.1145/3212510, arXiv:1702.05805] 13

[57] François Le Gall and Florent Urrutia: Improved rectangular matrix multiplication using
     powers of the Coppersmith-Winograd tensor. In Proc. 29th Ann. ACM-SIAM Symp. on Discrete
     Algorithms (SODA’18), pp. 1029–1046. ACM Press, 2018. [doi:10.1137/1.9781611975031.67,
     arXiv:1708.05622] 14

[58] Jiří Matoušek: Efficient partition trees. Discrete Comput. Geom., 8(3):315–334, 1992. Prelimi-
     nary version in SoCG’91. [doi:10.1007/BF02293051] 4


[59] Jiří Matoušek: Range searching with efficient hierarchical cuttings. Discrete Comput. Geom.,
     10(2):157–182, 1993. Preliminary version in SoCG’92. [doi:10.1007/BF02573972] 11

[60] Behnam Neyshabur and Nathan Srebro: On symmetric and asymmetric LSHs for inner
     product search. In Proc. 32nd Int. Conf. on Machine Learning (ICML’15), pp. 1926–1934, 2015.
     MLR Press. [arXiv:1410.5518] 3

[61] Mihai Pătraşcu: Towards polynomial lower bounds for dynamic problems. In Proc. 42nd STOC,
     pp. 603–610. ACM Press, 2010. [doi:10.1145/1806689.1806772] 13

[62] Mihai Pătraşcu and R. Ryan Williams: On the possibility of faster SAT algorithms. In Proc.
     21st Ann. ACM-SIAM Symp. on Discrete Algorithms (SODA’10), pp. 1065–1075. ACM Press, 2010.
     [doi:10.1137/1.9781611973075.86] 13

[63] Ramamohan Paturi and Janos Simon: Probabilistic communication complexity. J. Com-
     put. System Sci., 33(1):106–123, 1986. Preliminary version in FOCS’84. [doi:10.1016/0022-
     0000(86)90046-2] 10, 36

[64] Ali Rahimi and Benjamin Recht: Random features for large-scale kernel machines. In Adv. in
     Neural Inform. Proc. Systems (NIPS’07), pp. 1177–1184. MIT Press, 2007. 3

[65] Parikshit Ram and Alexander G. Gray: Maximum inner-product search using cone trees.
     In Proc. 18th Internat. Conf. on Knowledge Discovery and Data Mining (KDD’12), pp. 931–939.
     ACM Press, 2012. [doi:10.1145/2339530.2339677] 3

[66] Liam Roditty and Virginia Vassilevska Williams: Fast approximation algorithms for
     the diameter and radius of sparse graphs. In Proc. 45th STOC, pp. 515–524. ACM Press, 2013.
     [doi:10.1145/2488608.2488673] 13

[67] Aviad Rubinstein: Hardness of approximate nearest neighbor search. In Proc. 50th STOC, pp.
     1260–1268. ACM Press, 2018. [doi:10.1145/3188745.3188916, arXiv:1803.00904] 4, 6, 7, 11, 13,
     19, 20, 21, 22, 23, 38, 43

[68] Anshumali Shrivastava and Ping Li: Asymmetric LSH (ALSH) for sublinear time maximum
     inner product search (MIPS). In Adv. in Neural Inform. Proc. Systems (NIPS’14), pp. 2321–2329.
     MIT Press, 2014. [arXiv:1405.5869] 3

[69] Anshumali Shrivastava and Ping Li: Asymmetric minwise hashing for indexing binary inner
     products and set containment. In Proc. 24th Int. World Wide Web Conf. (WWW’15), pp. 981–991,
     2015. [doi:10.1145/2736277.2741285] 3

[70] Christina Teflioudi and Rainer Gemulla: Exact and approximate maximum inner product
     search with LEMP. ACM Trans. Database Syst., 42(1):5:1–5:49, 2016. [doi:10.1145/2996452] 3

[71] Gregory Valiant: Finding correlations in subquadratic time, with applications to learning
     parities and the closest pair problem. J. ACM, 62(2):13:1–13:45, 2015. Preliminary version in
     FOCS’12. [doi:10.1145/2728167] 3


[72] Virginia Vassilevska Williams: On some fine-grained questions in algorithms and complex-
     ity. In Proc. Internat. Congr. of Mathematicians (ICM 2018), volume 4, pp. 3447–3487. World
     Scientific, 2019. [doi:10.1142/9789813272880_0188] 13

[73] R. Ryan Williams: A new algorithm for optimal 2-constraint satisfaction and its implica-
     tions. Theoret. Comput. Sci., 348(2–3):357–365, 2005. Preliminary version in ICALP’04.
     [doi:10.1016/j.tcs.2005.09.023] 3, 4, 5

[74] R. Ryan Williams: Faster all-pairs shortest paths via circuit complexity. SIAM J. Com-
     put., 47(5):1965–1985, 2018. Preliminary version in STOC’14. [doi:10.1137/15M1024524,
     arXiv:1312.6680] 6

[75] R. Ryan Williams: On the difference between closest, furthest, and orthogonal pairs: Nearly-
     linear vs barely-subquadratic complexity. In Proc. 29th Ann. ACM-SIAM Symp. on Discrete
     Algorithms (SODA’18), pp. 1207–1215. ACM Press, 2018. [doi:10.1137/1.9781611975031.78,
     arXiv:1709.05282] 4, 8, 9, 11, 12, 13, 25, 29, 30, 31, 32, 43

[76] R. Ryan Williams and Huacheng Yu: Finding orthogonal vectors in discrete structures. In
     Proc. 25th Ann. ACM-SIAM Symp. on Discrete Algorithms (SODA’14), pp. 1867–1877. ACM Press,
     2014. [doi:10.1137/1.9781611973402.135] 5

[77] Ronald de Wolf: A note on quantum algorithms and the minimal degree of ε-error polynomials
     for symmetric functions. Quantum Info. Comput., 8(10):943–950, 2008. [doi:10.26421/QIC8.10,
     arXiv:0802.1816] 8, 39, 40

[78] Andrew Chi-Chih Yao: On constructing minimum spanning trees in k-dimensional spaces and
     related problems. SIAM J. Comput., 11(4):721–736, 1982. [doi:10.1137/0211059] 4


AUTHOR

     Lijie Chen
     Ph. D. student
     MIT, Cambridge, MA
     lijieche@mit.edu
     http://www.mit.edu/~lijieche/


ABOUT THE AUTHOR

     Lijie Chen got an undergraduate degree from Tsinghua University in 2017, and is now
         a Ph. D. student at MIT. His advisor is Ryan Williams. He likes turning Red Bull into
         theorems. While not doing that, he enjoys playing music games.



