                           THEORY OF COMPUTING, Volume 16 (5), 2020, pp. 1–34
                                        www.theoryofcomputing.org




                           On Axis-Parallel Tests for
                            Tensor Product Codes
                Alessandro Chiesa∗                      Peter Manohar†                 Igor Shinkar‡
                Received August 30, 2018; Revised July 22, 2020; Published September 25, 2020




       Abstract. Many low-degree tests examine the input function via its restrictions to random
       hyperplanes of a certain dimension. Examples include the line-vs-line (Arora, Sudan 2003),
       plane-vs-plane (Raz, Safra 1997), and cube-vs-cube (Bhangale, Dinur, Navon 2017) tests.
           In this paper we study tests that only consider restrictions along axis-parallel hyperplanes,
       which have been studied by Polishchuk and Spielman (1994) and Ben-Sasson and Sudan
       (2006). While such tests are necessarily “weaker,” they work for a more general class
       of codes, namely tensor product codes. Moreover, axis-parallel tests play a key role in
       constructing LTCs with inverse polylogarithmic rate and short PCPs (Polishchuk, Spielman
       1994; Ben-Sasson, Sudan 2008; Meir 2010). We present two results on axis-parallel tests.
           (1) Bivariate low-degree testing with low agreement. We prove an analogue of the
       Bivariate Low-Degree Testing Theorem of Polishchuk and Spielman in the low-agreement
       regime, albeit for much larger fields. Namely, for the tensor product of the Reed–Solomon
     A preliminary version of this paper appeared in the Proceedings of the 21st International Workshop on Randomization and
Computation (RANDOM’17).
   ∗ Supported by the Center for Long-Term Cybersecurity at UC Berkeley.
   † Supported by the NSF Graduate Research Fellowship Program (under Grant No. DGE1745016); and the ARCS Foundation.

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not
necessarily reflect the views of the National Science Foundation.
   ‡ Supported by NSERC Discovery Grant.



ACM Classification: E.4, F.2.2
AMS Classification: 68P30, 94B05, 94B25
Key words and phrases: error-correcting codes, tensor product codes, locally testable codes, low-degree
testing, extremal graph theory


© 2020 Alessandro Chiesa, Peter Manohar, and Igor Shinkar
Licensed under a Creative Commons Attribution License (CC-BY)                             DOI: 10.4086/toc.2020.v016a005

      code with itself, we prove that for sufficiently large fields, the 2-query variant of the axis-
      parallel line test (row-vs-column test) works for arbitrarily small agreement. Prior analyses of
      axis-parallel tests assumed high agreement, and no results for such tests in the low-agreement
      regime were known.
          Our proof technique deviates significantly from that of Polishchuk and Spielman, which
      relies on algebraic methods such as Bézout’s Theorem, and instead leverages a fundamental
      result in extremal graph theory by Kővári, Sós, and Turán. To our knowledge, this is the first
      time this result is used in the context of low-degree testing.
          (2) Improved robustness for tensor product codes. Robustness is a strengthening of local
      testability that underlies many applications. We prove that the axis-parallel hyperplane test
       for the m-th tensor power of a linear code with block length n and distance d is Ω(d^m/n^m)-
       robust. This improves on a theorem of Viderman (2012) by a factor of 1/poly(m). While
      the improvement is not large, we believe that our proof is a notable simplification compared
      to prior work.


1    Introduction
Locally testable codes (LTCs) are error-correcting codes for which, given an input word, one can verify
whether the word belongs to or is far from the code by inspecting the word in a few random locations.
LTCs have been studied extensively in different contexts, including program checking, interactive proofs,
and probabilistically checkable proofs (PCPs) [18, 33, 6, 5, 31, 25]. Goldreich and Sudan [25] describe
LTCs as “combinatorial counterparts of the complexity theoretic notion of PCPs,” motivating the study of
these objects separately.

LTC constructions The first constructions of LTCs were algebraic in nature, and relied on multivariate
polynomials. Starting with the seminal paper by Blum, Luby, and Rubinfeld [18], there has been much
work on such algebraic LTCs by way of results on linearity testing and low-degree testing in numerous
settings [18, 8, 13, 7, 1, 17]. Many other constructions [29, 36, 27] further optimize parameters of these
codes, including rate, distance, and the number of queries made by the tester.
     Ben-Sasson and Sudan [11] suggested a combinatorial approach to construct LTCs starting from any
linear code by (i) applying the tensor product operation [37, 38] to the code, and (ii) testing the resulting
code via the axis-parallel hyperplane test. We now discuss both.
     The tensor product of a linear code C ⊆ F^n with itself, denoted C^2, is the code in F^{n^2} consisting of
all 2-dimensional matrices whose n rows and n columns are codewords in C; similarly, the m-th tensor
power of C, denoted C^m, is the code in F^{n^m} consisting of all m-dimensional tensors M whose restriction
to any (m − 1)-dimensional axis-parallel hyperplane is a codeword in C^{m−1}. For example, the code of
evaluations of all m-variate polynomials of individual degree at most r is the m-th tensor power of the
code of evaluations of all univariate polynomials of degree at most r.
     The axis-parallel hyperplane test for the code C^m works as follows: given a word M, sample a random
axis-parallel hyperplane and check if the restriction of M to this hyperplane is a codeword in C^{m−1}. This
natural test extends ideas of axis-parallel line tests used in early PCP constructions [6, 5, 2] to arbitrary
tensor product codes.
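To make the definitions above concrete, here is a minimal Python sketch (not from the paper) of the axis-parallel line test for m = 2, instantiated with the single-parity-check code over F_2; the code C, the matrix `good`, and all function names are illustrative choices for this example only.

```python
# Illustrative sketch (not from the paper): the axis-parallel line test for
# C^2 over F_2, where C is the [3, 2] single-parity-check code.
import random

n = 3

def in_C(v):
    # C = all length-n binary words of even weight (one parity check).
    return sum(v) % 2 == 0

def in_C2(M):
    # M lies in the tensor code C^2 iff every row and every column lies in C.
    return all(in_C(row) for row in M) and all(in_C(col) for col in zip(*M))

def axis_parallel_line_test(M):
    # Sample a random axis-parallel line (a row or a column) and test membership in C.
    i = random.randrange(n)
    line = M[i] if random.randrange(2) == 0 else [M[j][i] for j in range(n)]
    return in_C(line)

good = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
assert in_C2(good) and axis_parallel_line_test(good)  # codewords always pass
```

A codeword of C^2 passes every invocation of the test, while a word far from C^2 is rejected with noticeable probability over the choice of line.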


      We study two aspects of the axis-parallel hyperplane test for tensor product codes.

(1) Low-agreement regime All of the aforementioned papers study the axis-parallel hyperplane test in
the “high-agreement regime,” in which the given input word is within the unique decoding radius of the
tensor product code. What can be said about the “low-agreement regime,” in which the given input word
may be as far as the list-decoding radius? This setting is more challenging because one wishes to deduce
that a given word has some noticeable global correlation with a codeword (or a short list of codewords) by
only assuming that a noticeable fraction of local views of the test are local views of (potentially different)
codewords.
    Results in the low-agreement regime are known for other tests, such as tests for the Hadamard code
[7] and the long code [26] as well as random non-axis-parallel hyperplane tests in various dimensions
[32, 3, 30]. Moreover, these have applications to PCP constructions and hardness of approximation.
However, to our knowledge, prior to our work, no results have been known for the low-agreement regime
of axis-parallel tests.

(2) Robustness Ben-Sasson and Sudan [11] analyze the axis-parallel hyperplane test via the notion of
robustness, a stronger notion of local testability borrowed from the PCP literature [10, 21]. Informally, a
test for a code is robust if, given any input that is far from the code, the local view of the test is also far
from an accepting view on average. For example, the axis-parallel hyperplane test is robust if, given any
M that is far from C^m, the restriction of M to a random hyperplane is far from C^{m−1} on average.
     Robustness thus relates the global distance to the expected local view distance and, as shown in [11],
facilitates query reduction via a natural way to compose tests. This notion has also found applications to
proof composition in the setting of PCPs [10]. These articles have motivated the study of the robustness of
the axis-parallel hyperplane test for tensor product codes, establishing both positive results [22, 14, 15, 35]
and limitations [34, 20, 24].
     Despite significant progress, robustness results for the axis-parallel hyperplane test seem to be far
from tight. The best known relation between the global distance and the local distance is due to Viderman
[35], but no examples that come anywhere close to his proven bound are known.


2      Main results
We present two main results about tests for tensor product codes. First, we prove an analogue of the
Bivariate Low-Degree Testing Theorem of Polishchuk and Spielman [31] in the low-agreement regime,
albeit for much larger fields. Second, we improve on the robustness of the hyperplane test for testing the
tensor product code C^m, for m ≥ 3. We now discuss our results.


2.1     Bivariate low-degree testing in the low-agreement regime
One of the applications of locally testable codes is constructing PCPs, where it is often desirable to reduce
the number of queries made by the test. Typically this is done by increasing the alphabet size so that each
“large” symbol bundles together several “small” symbols from different locations of the given word. This


bundling now introduces a consistency problem, because two large symbols may in principle disagree
about the same location in the word.
    For example, in [32, 3, 30, 16] the test has access to (alleged) restrictions of a low-degree polynomial
to all lines, planes, cubes, or other low-degree manifolds. The test samples several queries that intersect,
and checks that their answers are consistent on the intersection. These works establish that if the test
accepts with probability above a certain threshold, then the restrictions are close to the restrictions of
some low-degree polynomial.
    We study this problem in a modified setting, where the test only has access to axis-parallel restrictions.
Restricting the test in this way makes its task more difficult, but doing so provides other advantages. First,
axis-parallel restrictions are sometimes the only natural restrictions, such as when testing the m-th tensor
power of a general linear code C (more generally, one may consider restrictions to all (m − 1)-dimensional
axis-parallel hyperplanes). Second, having fewer restrictions enables more efficiency, e. g., it facilitates
the construction of short PCPs [31, 12].
    Indeed, for this very reason, Polishchuk and Spielman [31] study the above problem for bivariate
polynomials, where m = 2 and C is the degree-r Reed–Solomon code, i. e., the code of all evaluations
p : F^2 → F of bivariate polynomials of individual degree at most r. The test has access to a table of row
polynomials and a table of column polynomials, and the goal is to check if these tables are consistent
with the restrictions of some global bivariate polynomial of individual degree r. That is, for each row y,
the test gets a polynomial R(·, y) : F → F of degree at most r in the first variable x. Analogously, for each
column x, the test gets a polynomial C(x, ·) : F → F of degree at most r in the second variable y. The test
works as follows: pick a random (x, y) ∈ F^2 , read the row and column polynomials through this point,
and accept if and only if the two polynomials agree at (x, y).
    Clearly, if all the row polynomials and column polynomials are restrictions of a bivariate polynomial
of individual degree r, then the test always accepts. They prove that, conversely, if the test accepts with
probability close to 1, then the given polynomials are “close” to being restrictions (to axis-parallel lines)
of some low-degree bivariate polynomial, as written below. In the statement, we say that a bivariate
polynomial in variables x and y has degree (a, b) if the degree in x is at most a and that in y is at most
b. This means that the table of row polynomials, R(x, y), has degree (r, n) and the table of column
polynomials, C(x, y), has degree (n, r), where n is the size of the table.
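The row-vs-column test just described can be sketched in a few lines of Python. In this toy instance (our illustration, not the paper's construction) the prime p, the global polynomial Q, and the representation of the tables as value oracles are all hypothetical choices; the tables are honest restrictions of Q, so the test accepts everywhere.

```python
# Toy instance of the row-vs-column test over F_p (illustrative sketch).
p, r = 13, 2

def Q(x, y):
    # A hypothetical global polynomial of degree (2, 2) over F_13.
    return (1 + 2 * x * y + 3 * x**2 * y**2) % p

# Honest tables: for each row y, the row polynomial R(., y); for each
# column x, the column polynomial C(x, .), both restrictions of Q.
R = {y: (lambda x, y=y: Q(x, y)) for y in range(p)}
C = {x: (lambda y, x=x: Q(x, y)) for x in range(p)}

def row_vs_column_test(R, C, x, y):
    # Read the row polynomial and the column polynomial through (x, y)
    # and accept iff they agree at that point.
    return R[y](x) == C[x](y)

# Consistent tables pass at every point of F_p x F_p.
assert all(row_vs_column_test(R, C, x, y) for x in range(p) for y in range(p))
```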

Theorem 2.1 ([31]). Let F be a field and X,Y ⊆ F subsets of size n := |X| = |Y |. Let R(x, y) be a
polynomial of degree (r, n) and C(x, y) a polynomial of degree (n, r) such that

                                 Pr_{(x,y)∈X×Y} [C(x, y) = R(x, y)] = 1 − γ^2

for some γ > 0. If n > 2γn + 2r, then there exists a polynomial Q(x, y) of degree (r, r) such that

                            Pr_{(x,y)∈X×Y} [C(x, y) = R(x, y) = Q(x, y)] ≥ 1 − 2γ^2 .

   The theorem above assumes that n > 2γn + 2r, which means that γ^2 < (1/2 − r/n)^2 < 1/4. In other
words, it requires the row polynomials and column polynomials to agree on (at least) more than three
quarters of the points in X ×Y . A slight improvement in the parameters of this theorem is shown in [9].
However, their result still requires the polynomials to agree on a large fraction of the points in X ×Y . But


what, if anything, can be said if we only assume that they agree, for example, on more than a 0.1-fraction
of those points?
     There are several results on low-degree testing that show that, even if we only assume that the test
accepts with noticeable probability (for the row-vs-column test this probability equals the agreement
between row and column polynomials), one can still prove the existence of a short list of polynomials
that “explain” most of this probability, and this in turn has applications to constructing PCPs with small
errors (see, e. g., [32, 3, 30]).
     Our next result gives a positive answer to the question above, stating that even in the low-agreement
regime, we can still deduce some structure about the polynomials R and C, assuming that the order of the
field is sufficiently large.
Theorem 2.2. Fix r ∈ N, and let δ , ε ∈ R be such that δ ≥ ε > 0. Let F be a field of order n, where
n > exp(Ω((r/ε) log(1/ε))). Let R(x, y) be a polynomial of degree (r, n) and C(x, y) a polynomial of
degree (n, r) such that
                                 Pr_{(x,y)∈F^2} [C(x, y) = R(x, y)] = δ .

Then there exist t = O(1/ε) polynomials Q1 (x, y), . . . , Qt (x, y) of degree (r, r) such that

                      Pr_{(x,y)∈F^2} [∃i ∈ [t] C(x, y) = R(x, y) = Qi (x, y)] ≥ δ − ε .

     We remark that Theorem 2.2 holds in general for C^2, where C ⊆ F^n is any linear code with minimal
distance ≥ n − r such that n > exp(Ω((r/ε) log(1/ε))). In particular, this means that the minimal distance
of C is at least n − O(log n). See the paragraph Beyond polynomials on page 6 for details.
     Note that in the above theorem, δ is the agreement probability, while γ^2 in Theorem 2.1 is the
disagreement probability. Also, both δ and ε can be sub-constant. To our knowledge, this is the first
result that analyzes the row-vs-column test in the low-agreement regime.
     The row-vs-column test and its higher-dimensional analogues underlie many known PCP constructions
[6, 5, 31, 12]. However, in all these constructions the low-degree tests are only analyzed in the high-
agreement regime. We believe that analyzing the test in the low-agreement regime may imply short PCP
constructions with small (sub-constant) soundness. A weakness of the result stated in Theorem 2.2 is the
requirement that the field must be very large, which restricts us from getting PCPs with polynomial-length
proofs. Nonetheless, we consider Theorem 2.2 as a promising first step in this direction. More generally,
our result suggests that the low-agreement regime for tensor product codes merits further study.
     To prove the theorem we leverage a fundamental result in extremal graph theory by Kővári, Sós, and
Turán. To our knowledge, this is the first time this result is used in the context of low-degree testing. See
Subsection 3.1 below for a high-level description of our proof.
     We observe that in the above theorem, for δ ≥ ε sufficiently small, it is necessary to have a list of
at least (δ /ε)/polylog(δ /ε) polynomials in order to explain all but ε of the agreements of R and C. In
particular, if δ is a small constant and ε > 0 is sufficiently small, then the bound in Theorem 2.2 is tight
up to a polylog(1/ε) factor.
Proposition 2.3. Fix r ∈ N, and let δ , ε ∈ R be such that 1/12 > δ ≥ ε > 0. Let F be a field of order n,
where
                                    n > Ω((r/ε) · (δ /ε) · polylog(1/ε)) .


Then there exists a polynomial R(x, y) of degree (0, n) and a polynomial C(x, y) of degree (n, 0) such that

                              Pr_{(x,y)∈F^2} [C(x, y) = R(x, y)] ∈ [δ , 2δ ] ,

but for any t < (δ /ε)/polylog(δ /ε) and polynomials Q1 (x, y), . . . , Qt (x, y) of degree (r, r) it holds that

        Pr_{(x,y)∈F^2} [∃i ∈ [t] s.t. C(x, y) = R(x, y) = Qi (x, y)] < Pr_{(x,y)∈F^2} [C(x, y) = R(x, y)] − ε .

    Thus, unlike other results in this area [32, 3, 16] where the list size depends only on δ , the list size
must grow as a function of ε in our setting. The reason for this difference comes from the restriction of
our test, which considers only axis-parallel lines, as opposed to arbitrary lines (or planes or cubes) used
in the other works. In particular, in our setting, once a point (x, y) is chosen, the lines going through
this point are fixed by the design of the test, while in the aforementioned papers there are many lines (or
planes or cubes) going through this point, and the queries to perform are chosen at random conditioned
on the chosen point.

Beyond polynomials While the proof in [31] relies on polynomials (a key step is Bézout’s Theorem),
we rely on combinatorial techniques, so that our Theorem 2.2 holds in general for C^2, where C ⊆ F^n is
any linear code with minimal distance ≥ n − r such that n > exp(Ω((r/ε) log(1/ε))). In particular, this
means that the minimal distance of C is at least n − O(log n). The row-vs-column test is now given two
matrices R, C ∈ F^{n×n} such that every row of R is in C, and every column of C is in C. If

                               Pr_{(x,y)∈[n]^2} [R(x, y) = C(x, y)] = δ ,

then there exist t = O(1/ε) codewords Q1 , . . . , Qt ∈ C^2 , such that

                    Pr_{(x,y)∈[n]^2} [∃i ∈ [t] s.t. R(x, y) = C(x, y) = Qi (x, y)] ≥ δ − ε .

    In this context it is worth mentioning that there has been a lot of work on the robustness of the axis-
parallel line test for tensor products of linear codes, proving both positive results [11, 22] and negative
ones [34, 20, 24]. We find it quite remarkable that this result holds for the tensor product C^2 of an arbitrary
code C, albeit with very high distance, as the closely related notion of robustness does not hold for the
tensor product C^2 of an arbitrary code C. Finally, in the high-agreement regime there is a correspondence
between the robustness of the axis-parallel line test and the soundness of the row-vs-column test (the
matrix is given as a collection of lines rather than explicitly).1 Yet this correspondence does not hold in
the low-agreement regime. Consider a matrix M whose rows are codewords chosen uniformly at random
independently from each other: assuming that F is sufficiently large, the average relative agreement of a
row/column of M with some codeword is approximately 1/2 + k/(2n), where k is the dimension of the
   1 Let M ∈ F^{n×n} be such that the average relative agreement of a row/column of M with some codeword is 1 − ε. One can verify

that by considering the closest codewords in each row and in each column, the obtained table of row/column codewords passes
the row-vs-column test with probability at least 1 − 2ε. Therefore, there exists a tensor codeword that agrees with most of the
rows and most of the columns, which in turn implies its agreement with M.



code. This is because when reading a row the test reads a codeword of C, and when reading a column,
with high probability the view agrees with some codeword on approximately k coordinates. On the other
hand, M is typically far from a tensor codeword, e. g., it agrees with any tensor codeword on at most a
k/n fraction of the points.
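The example above can be simulated directly. In the sketch below (our illustration; the field size 13, the degree-1 Reed–Solomon code, and the fixed seed are arbitrary choices) the rows of M are independent random codewords, and we measure the best agreement of each row and column with the code.

```python
# Simulation (illustrative): rows of M are independent random codewords of
# the degree-1 Reed-Solomon code over F_13 (so k = 2); we measure the
# average best agreement of a row/column of M with the code.
import random

p, k = 13, 2
codewords = [tuple((a + b * x) % p for x in range(p))
             for a in range(p) for b in range(p)]   # evaluations of a + b*x

random.seed(0)
M = [random.choice(codewords) for _ in range(p)]              # random rows
cols = [tuple(M[y][x] for y in range(p)) for x in range(p)]

def agreement_with_code(v):
    # Fraction of coordinates on which v agrees with its closest codeword.
    return max(sum(a == b for a, b in zip(v, c)) for c in codewords) / p

avg = sum(agreement_with_code(l) for l in list(M) + cols) / (2 * p)
# Rows agree fully; each column agrees with some codeword on >= k points
# (interpolate through any k of its coordinates), so the average is at
# least 1/2 + k/(2*p).
assert avg >= 0.5 + k / (2 * p)
```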

Open problems We raise two questions on the low-agreement regime of axis-parallel line tests.
      • Smaller block length. Our result (Theorem 2.2) assumes that the block length n of the code is
        exponential in the degree r. Can one prove a similar result for smaller block length, e. g., n = poly(r)?
      • Higher dimensions. Polishchuk and Spielman [31] explain that their result (in the high-acceptance
        regime) also holds in higher dimensions, where now the test is given a table of low-degree
        polynomials for each axis-parallel line in F^m and works as follows: pick a random p ∈ F^m , read
        the polynomials along the m axis-parallel lines through p, and check that all polynomials agree on
        p. Can one prove a high-dimensional analogue of Theorem 2.2? Namely, is it true that if this test
        accepts with probability δ > 0, then there is a short list of low-degree polynomials that explain
        most of the agreements?
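The m-dimensional point-vs-lines test described in the second question can be sketched as follows (a toy Python illustration; the field size p, the polynomial f, and the honest line tables are assumptions made for this example):

```python
# Sketch of the higher-dimensional axis-parallel point-vs-lines test
# (illustrative; the line tables here are honest restrictions of f).
import random

p, m = 7, 3

def f(pt):
    # A hypothetical m-variate polynomial of low individual degree.
    return (pt[0] + pt[1] * pt[2]) % p

def line_table(axis, rest):
    # The table entry for the axis-parallel line along `axis` through the
    # remaining coordinates `rest`: a univariate function of the free coordinate.
    def g(t):
        pt = list(rest)
        pt.insert(axis, t)
        return f(tuple(pt))
    return g

def point_vs_lines_test():
    pt = tuple(random.randrange(p) for _ in range(m))
    # Read the m axis-parallel lines through pt and check they agree at pt.
    vals = [line_table(a, pt[:a] + pt[a + 1:])(pt[a]) for a in range(m)]
    return len(set(vals)) == 1

assert all(point_vs_lines_test() for _ in range(50))  # honest tables always pass
```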

2.2     Improved robustness for the axis-parallel hyperplane test
We study the robustness of the axis-parallel hyperplane test for the tensor product code C^m ⊆ F^{n^m}, for an
arbitrary linear code C with minimal distance d and block length n over the field F. Let H be the test that,
given a word M ∈ F^{n^m}, samples a random (m − 1)-dimensional axis-parallel hyperplane H and checks
if M|H ∈ C^{m−1}. For a word M ∈ F^{n^m}, we define δ (M) to be the relative distance of the word M to the
code C^m and ρ(M) to be E_H [δ (M|H ,C^{m−1})], the expected local distance of M. The test H is α-robust if
ρ(M) ≥ α · δ (M) for every word M ∈ F^{n^m}. The “strength” of the test increases with α, so the goal is to
establish the largest α for which this inequality holds.
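These definitions can be checked by brute force on a toy code. The following sketch (our illustration, using m = 2 for tractability even though the theorems below require m ≥ 3) computes δ(M) and ρ(M) exactly for the single-parity-check code over F_2 and reports the empirical robustness of the axis-parallel line test.

```python
# Brute-force computation of delta(M) and rho(M) for the axis-parallel line
# test on C^2, where C is the [3, 2] single-parity-check code over F_2.
# (Toy sketch with m = 2; the theorems in the text require m >= 3.)
import itertools

n = 3
code = [c for c in itertools.product((0, 1), repeat=n) if sum(c) % 2 == 0]
tensor = [M for M in itertools.product(code, repeat=n)
          if all(sum(col) % 2 == 0 for col in zip(*M))]  # rows and columns in C

def rel_dist(u, v):
    return sum(a != b for a, b in zip(u, v)) / len(u)

def delta(M):
    # Relative distance of M to the tensor code C^2.
    return min(sum(a != b for ru, rv in zip(M, N) for a, b in zip(ru, rv))
               for N in tensor) / n**2

def rho(M):
    # Expected relative distance of a random axis-parallel line of M to C.
    lines = list(M) + [tuple(col) for col in zip(*M)]
    return sum(min(rel_dist(l, c) for c in code) for l in lines) / len(lines)

# Empirical robustness: smallest ratio rho(M)/delta(M) over words outside C^2.
alpha = min(rho(M) / delta(M)
            for M in itertools.product(itertools.product((0, 1), repeat=n), repeat=n)
            if delta(M) > 0)
assert alpha > 0  # rho(M) = 0 forces every line into C, i.e., M is in C^2
```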

What is known There are two main pieces of prior work that study the robustness of the test H for
general m. We state the results of these articles, starting with that of Ben-Sasson and Sudan [11].
Theorem 2.4 ([11]). Let C ⊆ F^n be a linear code with minimal distance d. For

                             m ≥ 3        and        ((d − 1)/n)^m ≥ 7/8 ,

the test H is α-robust for C^m with α = 2^{−16}.
The above theorem is limited in that the proved robustness is small and, moreover, only provides a
guarantee when C has a very large distance. Viderman [35] shows that this condition on the distance is
not necessary in order to show some robustness guarantee.
Theorem 2.5 ([35]). Let C ⊆ F^n be a linear code with minimal distance d. For m ≥ 3, the test H is
α-robust for C^m with
                                       α = (1/(2m^2)) · (d/n)^m .



The above theorem, the state of the art in this setting, improves on the previous one as (i) even if

                                        ((d − 1)/n)^m ≥ 7/8 ,

the robustness provided by Theorem 2.5 is larger than that provided by Theorem 2.4 for m ≤ 169; (ii) a
robustness guarantee is provided for any choice of m, d, n (as long as m ≥ 3).

Our result We present a simpler proof of Theorem 2.5, which also achieves an m^2 improvement in
the robustness by showing that the hyperplane test is Ω(d^m/n^m)-robust. This improved value for the
robustness appears more “natural,” because d^m/n^m is the distance of the code C^m.

Theorem 2.6. Let C ⊆ F^n be a linear code with minimal distance d. For m ≥ 3, the test H is α-robust
for C^m with
                                       α = (1/12) · (d/n)^m .

Tight or not? The test H has been studied in several papers, and all resulting analyses, including ours,
have an exponential dependence on m in the robustness. Yet, there is no evidence indicating that this
dependence is necessary. Perhaps a “dream” result of constant robustness, for all codes C and m ≥ 3, is
possible. We present some observations that may suggest that this dependence is not necessary.

   • Under certain conditions on M, we can prove that

                          ρ(M) ≥ max( 1/(m + c) , c′ · d^m/n^m ) · δ (M)

      for constants c, c′ > 0. These two expressions are incomparable, as we can set the parameters
      m, d, n to make either expression bigger than the other. (See Claim 8.1.)

   • The guarantees of Theorem 2.4, Theorem 2.5, and Theorem 2.6 all degrade as d^m/n^m decreases. In
     particular, the proven value of α in all these cases tends to 0 as d/n tends to 0. However, for any
     code C we can prove that
                                     δ (M) ≤ ρ(M) + (n − k)/n .

     In particular, we show that if C is an MDS code, then

                                     δ (M) ≤ ρ(M) + (d − 1)/n

      for all M. (See Corollary 8.5.)

We thus think that determining the optimal robustness of H is an intriguing open problem:


      What is the optimal robustness of the hyperplane test H?
      Can one prove that
                                   α = Ω( max( 1/m , d^m/n^m ) ) ,
      or even α = Ω(1), for all codes?

In [11], [35], and our result, the proof shows that when ρ(M) is below some threshold (related to the
code’s unique decoding radius), then δ (M) is also small. However, when ρ(M) is not below this threshold,
the analysis says nothing about δ (M), and naively uses δ (M) ≤ 1 to prove robustness in this regime.
We believe that progress on understanding the optimal robustness of H hinges on understanding what
techniques (if any) can be used to bound δ (M) in terms of ρ(M) for a larger range of ρ(M).


2.3    Roadmap
The rest of this paper is organized as follows. Section 3 describes the techniques used to prove our results.
Section 4 introduces preliminaries for Theorem 2.2. Section 5 provides the proof of Theorem 2.2 and
Proposition 2.3 (which justifies the list size in Theorem 2.2). Section 6 introduces basic notations and
definitions for Theorem 2.6. Section 7 provides the proof of Theorem 2.6.


3     Techniques
We give an overview of the proof techniques behind Theorem 2.2 and Theorem 2.6.


3.1    Theorem 2.2: bivariate testing in the low-agreement regime
Polishchuk and Spielman [31] prove their result (Theorem 2.1) using the following approach. Given R
and C (as in the theorem) such that Pr_{x,y} [R(x, y) = C(x, y)] > 1 − δ , they define an “error polynomial” E
that equals 0 for all (x, y) such that R(x, y) ≠ C(x, y). Since the fraction of points where R(x, y) ≠ C(x, y)
is small, E is a low-degree polynomial. However, in the low-agreement regime that we consider, the
degree of E is rather large, which seems to preclude their approach. In particular, a key step based on
Bézout’s Theorem in their proof appears to break down.
     We take a completely different approach, which relies on a combinatorial statement from extremal
graph theory. Given R and C such that Pr_{x,y} [R(x, y) = C(x, y)] = δ , we define A ∈ {0, 1}^{n×n} to be the
“agreement matrix”: A(x, y) = 1 if and only if R(x, y) = C(x, y). By the assumption it follows that A has
at least δ n^2 ones. By invoking the Kővári–Sós–Turán Theorem (which may be thought of as an
analogue of Turán’s Theorem for bipartite graphs) it follows that there are some S, T ⊆ [n] such that
|S|, |T | ≥ Ω(log(n)) ≫ r and A|S×T ≡ 1. Since the rows of R and the columns of C are polynomials of
degree r, we deduce that there exists a unique polynomial Q of degree (r, r) such that for all (x, y) ∈ S × T
it holds that R(x, y) = C(x, y) = Q(x, y).
     The argument above may appear to be good progress toward our goal. However, there is a total of
≈ δ n^2 ones in A, and the rectangle S × T has side length O(log(n)), i. e., tiny compared to n. This means
that the progress is actually rather small!
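This first step can be visualized with a small brute-force search for an all-ones rectangle in an agreement matrix (illustrative Python only; the matrix A and the side length are toy choices, and the actual argument uses the Kővári–Sós–Turán theorem purely existentially rather than an exhaustive search):

```python
# Illustrative sketch: in a dense 0/1 agreement matrix, find a small all-ones
# rectangle S x T by brute force (the Kővári–Sós–Turán theorem guarantees one
# of side ~log n whenever the matrix has a constant fraction of ones).
import itertools

def find_rectangle(A, s):
    # Look for s rows that are all 1 on some common set of s columns.
    n = len(A)
    for S in itertools.combinations(range(n), s):
        common = [y for y in range(n) if all(A[x][y] for x in S)]
        if len(common) >= s:
            return list(S), common[:s]
    return None

# A toy agreement matrix with half of its entries equal to 1.
A = [[1 if (x + y) % 2 == 0 else 0 for y in range(6)] for x in range(6)]
S, T = find_rectangle(A, 2)
assert all(A[x][y] for x in S for y in T)  # A restricted to S x T is all ones
```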


     Nevertheless, we can now set A|S×T to be zero, and repeat the same argument again, thus covering
all but a small fraction of ones of A with small rectangles. However, this raises a new problem. Each
rectangle S × T found in the previous step can be very small, and so there are potentially many different
polynomials Q that explain the agreements of R and C. Our next goal is therefore to “stitch” these
rectangles together to show that, in fact, there is only a small number of distinct polynomials. We do so
by “making the rectangles larger,” as we now explain.
     Consider a rectangle S × T from the first step, and let t′ ∈ F \ T . Note that if there are at least r + 1
points s′ ∈ S such that A(s′ ,t′ ) = 1, then the row polynomial R(·,t′ ) is uniquely defined by these r + 1 points,
and hence A(s,t′ ) = 1 for all s ∈ S. Therefore, we can increase T by adding t′ to it. On the other hand, if
there are fewer than r + 1 such points s′ ∈ S, then we may disregard these points, as they amount to only
a small fraction of the points (since |S| ≫ r). Thus, on a typical rectangle S × T , we can go from size
O(log(n)) × O(log(n)) to size roughly O(log(n)) × Ω(n).
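The enlarging step can be sketched as follows; the matrix A, the rectangle S × T, and the indexing A[row][column] are all hypothetical stand-ins for the objects in the text:

```python
def grow_columns(A, S, T, r):
    """Sketch of the rectangle-enlarging step above: every column t' outside T
    with at least r + 1 agreements inside the row-set S is added to T (the text
    argues, via uniqueness of the degree-r row polynomials, that such a column
    then agrees with all of S).  A is a 0/1 agreement matrix, A[row][column]."""
    n = len(A)
    grown = set(T)
    for t_new in range(n):
        if t_new not in grown and sum(A[s][t_new] for s in S) >= r + 1:
            grown.add(t_new)
    return grown
```

A quick usage example: if S × T is an all-ones 5 × 10 rectangle and one extra column agrees with r + 1 = 3 rows of S, that column gets absorbed, while a column with only 2 agreements is disregarded.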
     In the last step, we show that if we have many rectangles of size O(log(n)) × Ω(n) then it is possible to
“stitch” them together using the fact that if we have two rectangles S1 × T1 and S2 × T2 with corresponding
polynomials Q1 and Q2 such that |T1 ∩ T2 | > r, then Q1 ≡ Q2 . This follows from the fact that if two
univariate polynomials of degree r agree on more than r points, then they are equal. We then use the
inclusion-exclusion principle and some simple inequalities to show that we cannot have more than 2/ε
subsets Ti ⊆ [n] of size at least εn such that |Ti ∩ T j | ≤ r for all i ≠ j.
     The full proof of Theorem 2.2 is provided in Section 5.

3.2   Theorem 2.6: improved robustness for the hyperplane test
Our goal is to prove that the axis-parallel hyperplane test H is α-robust for

                                α = (1/12) · (d/n)^m .

We prove this statement via a careful combination of the approaches taken by [11] and [35]. Specifically,
we analyze ρ(M) and δ (M) by studying the following combinatorial object: the inconsistency graph G
of the hyperplane test H, which we now informally describe.
    The test H has access to a word M ∈ F^(n^m) , allegedly in Cm . For any axis-parallel hyperplane H,
we denote by gH the closest codeword to M|H in Cm−1 (breaking ties by picking an arbitrary closest
codeword). The vertex set of the graph G is the set of (m − 1)-dimensional axis-parallel hyperplanes,
which are the local views of the test. There is an edge between two different hyperplanes H and H′ if
gH and gH′ disagree on the intersection of the hyperplanes, H ∩ H′ . (See Definition 7.1 for details.) In
other words, the graph has an edge between two planes if the local codewords assigned to the planes
are inconsistent. The graph G that we study is similar to the inconsistency graph analyzed in [11]. The
difference is that, for some threshold parameter τ, the graph used in [11] adds an edge from H to every
other H′ in the graph if δ (M|H , gH ) > τ.
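As a toy instantiation (not the paper’s setting), take m = 2 and let C be the length-n repetition code, so that the closest codeword to a row or column is simply the constant vector of its majority symbol; the inconsistency graph then looks as follows:

```python
from collections import Counter

def majority(vals):
    # Closest constant (repetition-code) word: the most common symbol,
    # with ties broken arbitrarily by Counter.
    return Counter(vals).most_common(1)[0][0]

def inconsistency_graph(M):
    """Toy inconsistency graph for m = 2 and the repetition code: vertices are
    the rows and columns of M, and row i / column j are joined by an edge when
    their local (constant) codewords disagree at the intersection point (i, j)."""
    n = len(M)
    g_row = [majority(M[i]) for i in range(n)]
    g_col = [majority([M[i][j] for i in range(n)]) for j in range(n)]
    edges = {(('row', i), ('col', j))
             for i in range(n) for j in range(n) if g_row[i] != g_col[j]}
    return g_row, g_col, edges
```

For instance, corrupting a single full row of an otherwise constant matrix makes that row’s local codeword disagree with every column’s, producing a star of edges centered at that row.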
    First, we show that if G has a large independent set I, then there is a codeword f in Cm that agrees
with the local codewords gH on every hyperplane H in I. For an independent set I, we define Ib to be
the set of i ∈ [n] such that the hyperplane {p ∈ [n]^m : p_b = i} is in I. A key property of tensor product
codes is the unique extension property, which we formally state later on as Claim 6.2. Using the unique
extension property of tensor product codes, we show that if there are two axes b1 and b2 where Ib1 and
Ib2 both have at least n − d + 1 planes, then there is a word f in Cm where f |H = gH for every H in the
independent set. Without loss of generality assume b1 = 1 and b2 = 2. Intuitively, we fill in the restricted
hypercube in F^(I1 × I2 × [n]^(m−2)) with the values of the closest codewords to M|H for each H in the independent
set. Since the independent set is large, the restricted hypercube is large enough so that we can extend the
partially filled-in hypercube to a unique codeword f in Cm . The uniqueness of the extension implies that
 f |H = gH for every H in I.
      Next, we analyze the structure of G to show that every edge is adjacent to a vertex of degree at least
(m − 2)d/2. The key point is that two different C^(m−2) codewords must disagree on at least d^(m−2) points,
and these points have a particular structure. For two distinct C^(m−2) codewords, we prove that on each
of the m − 2 remaining axes there must be at least d planes, parallel to that axis, that contain points of
disagreement. If not, then using the unique extension property we show that the two codewords must be
equal, which is a contradiction. For any edge (H, H′ ), this gives us a total of (m − 2)d planes that disagree
with at least one of gH and gH′ on H ∩ H′ , which shows that deg(H) + deg(H′ ) is at least (m − 2)d.
Therefore, at least one of H and H′ has degree at least (m − 2)d/2. As an immediate consequence, the
set of planes of degree at least (m − 2)d/2, which we denote by L, is a vertex cover, and the set of planes
not in L is an independent set I.
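The cover-versus-independent-set split used here is generic and can be sketched directly; the vertex and edge representations below are hypothetical:

```python
def high_degree_cover(vertices, edges, theta):
    """If every edge has an endpoint of degree >= theta, then the set L of
    vertices of degree >= theta is a vertex cover, and its complement I is an
    independent set -- the split used in the argument above.  The function
    asserts the degree premise while computing the split."""
    deg = {v: 0 for v in vertices}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    L = {v for v in vertices if deg[v] >= theta}
    I = set(vertices) - L
    # Sanity check of the premise: no edge lies entirely inside I.
    assert all(u in L or v in L for u, v in edges)
    return L, I
```

On a star graph, for example, the center alone forms the cover L and the leaves form the independent set I.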
      With some algebraic manipulation, we relate the size of this vertex cover to the expected local distance
ρ(M). By expressing ρ(M) as a sum over pairs of intersecting planes, we show that
                ρ(M) ≥ (1 / (n^m · m(m − 1))) · ∑_{(H,H′) : H∩H′ ≠ ∅} ∆|_{H∩H′} (gH , gH′ ) .

This allows us to express the robustness of the test H in terms of the size of the vertex cover L.
    Similar to the analysis of [35], we break up the proof into two cases. If |L| is somewhat large, then

                                ρ(M) ≥ (1/12) · (d/n)^m ,

and the theorem follows immediately because δ (M) is at most 1 anyway. If |L| is small, then the
corresponding independent set has two axes where |Ib | ≥ n − d + 1. Therefore, there is a global codeword
f that is consistent with all the hyperplanes in the independent set. We use this fact to show that δ (M)
must be small when ρ(M) is small, which concludes the proof.
    The full proof of Theorem 2.6 is provided in Section 7.


4     Preliminaries for Theorem 2.2
4.1   Low-degree polynomials
We will use the following lemmas about low-degree polynomials in the proof of Theorem 2.2. These are
standard interpolation lemmas, and direct proofs can be found in [31].
Lemma 4.1. Let S, T ⊆ F be two sets, each of size at least r + 1. Let Q1 (x, y), Q2 (x, y) be two polynomials
of degree (r, r) such that Q1 (x, y) = Q2 (x, y) for all (x, y) ∈ S × T . Then Q1 ≡ Q2 .


Lemma 4.2. Let S, T ⊆ F be two sets each of size at least r + 1. Suppose that there is a polynomial
R(x, y) of degree (r, n), and a polynomial C(x, y) of degree (n, r) such that R(x, y) = C(x, y) for all
(x, y) ∈ S × T . Then, there exists a polynomial Q(x, y) of degree (r, r) such that Q(x, y) = C(x, y) = R(x, y)
for all (x, y) ∈ S × T .

Corollary 4.3. Let S, T ⊆ F be two sets of sizes |S| ≥ r + 2 and |T | ≥ r + 2, and let (x0 , y0 ) ∈ S × T .
Suppose that there is a polynomial R(x, y) of degree (r, n), and a polynomial C(x, y) of degree (n, r) such
that R(x, y) = C(x, y) for all (x, y) ∈ S × T \ {(x0 , y0 )}. Then C(x0 , y0 ) = R(x0 , y0 ).
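A minimal sketch of these interpolation facts, over a small prime field standing in for F: values on an (r + 1) × (r + 1) grid determine a degree-(r, r) polynomial everywhere, which is exactly the uniqueness behind Lemma 4.1:

```python
p = 97   # a small prime; all arithmetic below is over GF(p)

def lagrange_basis(points, x):
    """Values at x of the Lagrange basis polynomials through `points` (mod p)."""
    vals = []
    for i, xi in enumerate(points):
        num = den = 1
        for j, xj in enumerate(points):
            if i != j:
                num = num * (x - xj) % p
                den = den * (xi - xj) % p
        vals.append(num * pow(den, p - 2, p) % p)  # den^(p-2) = den^(-1) mod p
    return vals

def grid_interpolate(S, T, f):
    """The unique polynomial of degree (|S|-1, |T|-1) agreeing with f on S x T,
    returned as an evaluation oracle: Q(x, y) = sum f(s, t) L_s(x) L_t(y)."""
    def Q(x, y):
        Lx, Ly = lagrange_basis(S, x), lagrange_basis(T, y)
        return sum(f(s, t) * Lx[i] * Ly[j]
                   for i, s in enumerate(S) for j, t in enumerate(T)) % p
    return Q
```

Interpolating a degree-(1, 1) polynomial from a 2 × 2 grid, for instance, recovers it at every point of the field.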

4.2    The Kővári–Sós–Turán theorem
We first define the density of a binary matrix.

Definition 4.4. Let A ∈ {0, 1}^(k×ℓ) be a binary matrix. Define the density of A to be

                                δ (A) = ( ∑_{i∈[k], j∈[ℓ]} A_{i,j} ) / (k · ℓ) .

We say that A is τ-dense if δ (A) ≥ τ.

   In the proof of Theorem 2.2 we will use a result due to Kővári, Sós, and Turán [28], which states that
any sufficiently dense binary matrix contains a large submatrix where every entry is 1.

Theorem 4.5 (Kővári, Sós, Turán). Let N, M, t, s be natural numbers that satisfy N ≥ s and M ≥ t ≥ s,
and let A ∈ {0, 1}^(N×M) be a binary matrix. If A is ( ((t − 1)/M)^(1/s) + s/N )-dense, then there are S ⊆ [N]
and T ⊆ [M] of sizes |S| = s and |T | = t such that A|S×T ≡ 1.

Remark 4.6. The Kővári–Sós–Turán theorem is usually stated as saying that any sufficiently dense
bipartite graph contains a large bipartite clique. It is clear, however, that the matrix formulation above is
equivalent by associating a bipartite graph with its adjacency matrix, where the rows correspond to the
vertices on the left, and the columns correspond to the vertices on the right.
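Theorem 4.5 can be spot-checked by brute force on small matrices. For N = M = 9 and s = t = 2 the density threshold is (1/9)^(1/2) + 2/9 = 5/9, i. e., 45 ones, and every 9 × 9 binary matrix with that many ones indeed contains a 2 × 2 all-ones submatrix; the random sampling below is only a sanity check, not a proof:

```python
import itertools
import random

def has_all_ones_submatrix(A, s, t):
    """Brute force: does A contain an s x t all-ones submatrix?"""
    N, M = len(A), len(A[0])
    for rows in itertools.combinations(range(N), s):
        # Columns that are 1 on every chosen row.
        cols = [j for j in range(M) if all(A[i][j] for i in rows)]
        if len(cols) >= t:
            return True
    return False

def kst_density(N, M, s, t):
    """The density threshold of Theorem 4.5."""
    return ((t - 1) / M) ** (1 / s) + s / N
```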


5     Proof of Theorem 2.2
The key step in the proof of Theorem 2.2 is the following lemma.

Lemma 5.1 (Key lemma). Fix ε > 0. Let F be a field of order n, and suppose that
                                n > exp( Ω( (r/ε) · log(1/ε) ) ) .
Then there are t ≤ 2/ε polynomials Q1 , . . . , Qt each of degree (r, r), and subsets S1 , . . . , St , B1 , . . . , Bt ⊆ F
such that:

    1. For all i ∈ [t] and (x, y) ∈ Si × Bi it holds that C(x, y) = R(x, y) = Qi (x, y).

    2. The sets Si are pairwise disjoint.



   3. | ∪_{i∈[t]} (Si × Bi ) | / |F|^2 ≥ δ − 3ε, where δ = Pr[C(x, y) = R(x, y)].
      Before proving Lemma 5.1, let us see how it immediately implies Theorem 2.2.

Proof of Theorem 2.2 using Lemma 5.1. Let ε > 0, and apply Lemma 5.1 with ε/3. By Lemma 5.1, for
some t ≤ 2/(ε/3) = 6/ε there are disjoint subsets S1 × B1 , . . . , St × Bt ⊆ F^2 such that

                                | ∪_{i∈[t]} (Si × Bi ) | / |F|^2 ≥ δ − ε ,

and for all i ∈ [t] and (x, y) ∈ Si × Bi it holds that R(x, y) = C(x, y) = Qi (x, y). This implies that

        Pr_{(x,y)∈F^2} [∃i ∈ [t] s.t. R(x, y) = C(x, y) = Qi (x, y)] ≥ Pr[ (x, y) ∈ ∪_{i∈[t]} (Si × Bi ) ] ,

which is at least δ − ε, as required.

      We devote the rest of this section to proving Lemma 5.1.

5.1     Proof of Lemma 5.1
Let n = |F|, and define the binary matrix A ∈ {0, 1}^(n×n) where A(x, y) = 1 if C(x, y) = R(x, y) and
A(x, y) = 0 otherwise. Note that by the assumption of Theorem 2.2, we have

                                ( ∑_{x,y∈[n]} A(x, y) ) / n^2 = δ ,

i. e., the matrix A is δ -dense.

5.1.1    Step 1
In the first step we apply Theorem 4.5 iteratively to show that there exists a collection of disjoint sets
S1 , . . . , Su ⊆ [n] with |Si | ≥ r/ε such that for most points (x, y) it holds that if A(x, y) = 1, then x ∈ ∪_i Si ,
and for each i ∈ [u] there exists Ti ⊆ [n] of size |Ti | ≥ r/ε such that A|Si ×Ti ≡ 1.
     For the rest of the proof we let k = ⌈r/ε⌉.

Claim 5.2. Let n, r ∈ N, δ > ε > 0, and let k ∈ N. Let A ∈ {0, 1}^(n×n) be a δ -dense matrix as above,
and suppose that n > 2k^2 (1/ε)^(k+1) . Then, there exist u ∈ N and two sequences Si ⊆ [n], Ti ⊆ [n] with
i = 1, . . . , u satisfying the following conditions.

   1. The sets Si are pairwise disjoint.

   2. |Si | = |Ti | = k.

   3. A(x, y) = 1 for every (x, y) ∈ Si × Ti and i ∈ [u].


   4. ∑_{(x,y) ∈ ([n] \ ∪_i Si ) × [n]} A(x, y) < εn^2 .
Proof. We will use Theorem 4.5 to find a submatrix of A of size k × k whose entries are all 1s. By the
choice of k and the assumption that n is sufficiently large we have that

        (ε − k/n)^k = ε^k · (1 − k/(εn))^k > ε^k · (1 − k^2/(εn)) > ε^k / 2 > (k − 1)/(εn) ,

and hence
                                ε > ( (k − 1)/(εn) )^(1/k) + k/n .
Hence, since A is δ -dense, we have

                        δ (A) ≥ δ ≥ ε > ( (k − 1)/n )^(1/k) + k/n .
Therefore, by Theorem 4.5 there exist S1 ⊆ [n], T1 ⊆ [n] each of size |S1 | = |T1 | = k such that A|S1 ×T1 ≡ 1.
     Next, we remove the rows contained in S1 from A, and apply the same argument again. Let M1 =
[n] \ S1 and define A1 to be the (n − k) × n submatrix of A whose rows are indexed by M1 . Note that if

                                ∑_{x∈M1 , y∈[n]} A1 (x, y) > εn^2

then δ (A1 ) ≥ εn/|M1 |, and thus we have

                        δ (A1 ) ≥ εn/(n − k) > ε > ( (k − 1)/|M1 | )^(1/k) + k/n .
Therefore, we can apply Theorem 4.5 again, and find S2 ⊆ M1 and T2 ⊆ [n] of size |S2 | = |T2 | = k such
that A|S2 ×T2 ≡ 1.
     We repeat the same argument again, for each i ≥ 2 defining the subset Mi = Mi−1 \ Si−1 , and
letting Ai = A|Mi ×[n] . Note that if

                                ∑_{x∈Mi , y∈[n]} A(x, y) ≥ εn^2

then |Mi | ≥ εn, and

                        δ (Ai ) ≥ εn/|Mi | ≥ ε > ( (k − 1)/|Mi | )^(1/k) + k/n .
Therefore, by Theorem 4.5 there exist Si ⊆ Mi and Ti ⊆ [n] of size |Si | = |Ti | = k such that A|Si ×Ti ≡ 1.
   We stop the process after u iterations when

                                ∑_{x∈Mu , y∈[n]} A(x, y) < εn^2 .

By the definition of the sets Si and Ti , this gives us the subsets with the desired properties.
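The iterative argument of Claim 5.2 can be sketched as a peeling loop; the brute-force submatrix search below stands in for Theorem 4.5 and is only practical for toy sizes:

```python
import itertools

def peel_rectangles(A, k, eps):
    """Sketch of the iteration in the proof above: repeatedly find a k x k
    all-ones submatrix (here by exhaustive search, standing in for Theorem 4.5),
    record it, delete its rows, and stop once the surviving rows carry fewer
    than eps * n^2 ones.  Returns the disjoint row-sets S_i with their T_i."""
    n = len(A)
    alive = set(range(n))          # rows not yet removed (the set M_i)
    found = []
    while sum(A[x][y] for x in alive for y in range(n)) >= eps * n * n:
        hit = None
        for S in itertools.combinations(sorted(alive), k):
            cols = [y for y in range(n) if all(A[x][y] for x in S)]
            if len(cols) >= k:
                hit = (set(S), set(cols[:k]))
                break
        if hit is None:            # cannot occur while the density bound holds
            break
        found.append(hit)
        alive -= hit[0]
    return found
```

On a matrix whose ones form two disjoint 2 × 2 blocks, the loop peels off exactly those two rectangles and then stops.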

    By the assumption |F| = n > exp(Ω((r/ε) log(1/ε))) in Theorem 2.2 and the choice of k = ⌈r/ε⌉,
we have n > 2k^2 (1/ε)^(k+1) . Therefore, we can apply Claim 5.2 to A to get the sets Si and Ti as in the
claim.


5.1.2     Step 2
Next, we find subsets Bi such that Ti ⊆ Bi ⊆ [n] for the sets Ti from the previous step, so that most points
(x, y) satisfying A(x, y) = 1 are contained in ∪_{i=1}^u (Si × Bi ).

Claim 5.3. Let {Si }_{i=1}^u be the sets from Claim 5.2. For each i ∈ [u] define

                                Bi = { y0 ∈ [n] : ∑_{x∈Si } A(x, y0 ) ≥ r + 1 } .

Then

   1. ∑_{i∈[u]} ∑_{x∈Si , y∈[n]\Bi } A(x, y) ≤ εn^2 .

   2. A(x, y) = 1 for every (x, y) ∈ Si × Bi and i ∈ [u].

Proof. The first item is by the choice of k ≥ r/ε. For each i ∈ [u] and y ∈ [n] \ Bi , at most r of the
entries A(x, y) with x ∈ Si are ones, which is less than an ε fraction of |Si | = k; hence the total number of
such ones is less than εn^2 . Formally, we have

                ∑_{i∈[u]} ∑_{x∈Si , y∈[n]\Bi } A(x, y) ≤ ∑_{i∈[u]} ∑_{y∈[n]\Bi } r ≤ u · n · r ≤ εn^2 ,

where the last inequality uses the fact that u ≤ n/k, and k ≥ r/ε.
     To prove the second item, we use Corollary 4.3. Suppose that A(x0 , y0 ) = 0 for some x0 ∈ Si and
y0 ∈ Bi . By the assumption on Bi , it holds that |{x ∈ Si : A(x, y0 ) = 1}| ≥ r + 1. Let S = {x0 } ∪ {x ∈
Si : A(x, y0 ) = 1}, and let T = {y0 } ∪ Ti , so that A(x, y) = 1 for all (x, y) ∈ S × T \ {(x0 , y0 )}. Recall
that, by definition of A, R(x, y) = C(x, y) for all such (x, y), and hence, by Corollary 4.3 we also have
R(x0 , y0 ) = C(x0 , y0 ), and thus A(x0 , y0 ) = 1.

Note that the ones not covered by ∪_i (Si × Bi ) are the ≤ εn^2 ones omitted in Claim 5.2 and the ≤ εn^2 ones
disregarded in the proof of Claim 5.3 above. Let us also disregard all the Si and the Bi such that |Bi | ≤ εn,
and consider only the remaining subsets. Note that the union of all the Si × Bi with |Bi | ≤ εn can contain
at most εn^2 ones. Redefining u to be the number of remaining sets, we get two collections of subsets
{Si ⊆ [n], Bi ⊆ [n]}_{i=1}^u such that:

   1. The sets Si are pairwise disjoint.

   2. |Bi | > εn for all i ∈ [u].
   3. | ∪_{i=1}^u (Si × Bi ) | ≥ (δ − 3ε) n^2 .

   4. A(x, y) = 1 for every (x, y) ∈ Si × Bi and i ∈ [u].

In particular, by Lemma 4.2 for each i = 1, . . . , u there is a polynomial Pi of degree (r, r) such that
R(x, y) = C(x, y) = Pi (x, y) for all (x, y) ∈ Si × Bi .


5.1.3    Step 3
Next, we observe that if two sets Bi , B j from the previous step have large intersection, then the corre-
sponding polynomials Pi and Pj are equal.

Claim 5.4. Suppose that |Bi ∩ B j | ≥ r + 1 for some i ≠ j ∈ [u]. Then Pi = Pj and Bi = B j .

Proof. Denote B = Bi ∩ B j . Note that, for each y ∈ B, the polynomials Pi (·, y) and R(·, y) both have
degree r in x and agree on all |Si | = k > r + 1 values of x ∈ Si , and hence Pi (x, y) = R(x, y) for all x ∈ [n].
In particular, Pi (x, y) = R(x, y) = C(x, y) = Pj (x, y) for all (x, y) ∈ S j × B. Therefore, Pi |S j ×B ≡ Pj |S j ×B ,
and thus Pi ≡ Pj by Lemma 4.1. Applying Corollary 4.3, we conclude that
Pi (x, y) = Pj (x, y) = C(x, y) = R(x, y) for all (x, y) ∈ (Si ∪ S j ) × (Bi ∪ B j ). This implies that Bi = B j , as
required.
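The stitching behind Claim 5.4 can be sketched as a greedy grouping of the sets B_i; the inputs below are hypothetical, and the single-pass grouping relies on the claim’s guarantee that sets with a large intersection carry the same polynomial:

```python
def merge_by_intersection(Bs, r):
    """Sketch of the stitching in Claim 5.4: two sets B_i, B_j with
    |B_i ∩ B_j| >= r + 1 carry the same polynomial, so they are merged into
    one group.  (The claim even shows B_i = B_j; the greedy grouping below is
    only an illustration on hypothetical input sets.)"""
    groups = []
    for B in Bs:
        for g in groups:
            if len(g & B) >= r + 1:
                g |= B
                break
        else:
            groups.append(set(B))
    return groups
```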

5.1.4    Completing the proof
In the last step we will show that there is a short list of t ≤ 2/ε polynomials Q1 , . . . , Qt such that each of
the Pi is in fact equal to one of the Q j . Indeed, denote the number of distinct sets Bi by t. By Claim 5.4, if
Bi ≠ B j then |Bi ∩ B j | ≤ r, and thus by the inclusion-exclusion principle we have

        n ≥ | ∪_{i=1}^t Bi | ≥ ∑_{i=1}^t |Bi | − ∑_{i≠j} |Bi ∩ B j | ≥ t · εn − (t choose 2) · r ,

where in the last inequality we used the bound |Bi | > εn for all i. The original u rectangles are disjoint
k × k squares, so it must hold that uk ≤ n. Since k ≥ r/ε, this implies that ur ≤ εn. However, it also must
hold that t ≤ u, since merging rectangles can only decrease the total number of rectangles. Therefore,
tr ≤ εn. Combining this with the first inequality, we get that

        n ≥ t · εn − (t choose 2) · r ≥ t · (εn − tr/2) ≥ t · (εn − εn/2) = t · εn/2 =⇒ 2/ε ≥ t .

Therefore t ≤ 2/ε, as required.
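The final chain of inequalities can also be verified mechanically with exact rational arithmetic, over a (hypothetical) grid of parameters:

```python
from fractions import Fraction

# Exact-arithmetic sanity check of the counting above: whenever both
#   t * r <= eps * n   and   n >= t * eps * n - (t choose 2) * r
# hold simultaneously, it must follow that t <= 2 / eps.
for n in (1000, 10000):
    for eps in (Fraction(1, 100), Fraction(1, 20), Fraction(1, 10), Fraction(1, 4)):
        for r in (1, 2, 5):
            for t in range(1, 151):
                if t * r <= eps * n and n >= t * eps * n - Fraction(t * (t - 1), 2) * r:
                    assert t <= 2 / eps
```

Using Fraction avoids floating-point rounding near the boundary cases t ≈ 2/ε.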

5.2     Proof of Proposition 2.3
We prove that in Theorem 2.2 it is necessary to have a list of (δ /ε)/polylog(δ /ε) polynomials in order
to explain all but ε of the agreements of R and C.
     The idea of the proof is the following. Fix N elements of F arbitrarily, and (for simplicity of notation)
identify these elements with 1, . . . , N. We define an infinite sequence of positive numbers (ai )i≥1 such that
the series ∑_i ai converges, and we choose a normalization constant C so that (i) ∑_{i=1}^N (C · ai )^2 = δ , and
(ii) for all ε > 0 and t = (δ /ε)/polylog(δ /ε) it holds that ∑_{i=1}^t (C · ai )^2 < δ − ε. Choose a collection
of N disjoint sets Ai ⊆ F of size |Ai | = C · ai · |F|.² Then, we define row polynomials R(x, y) that are
constant on every row y, and column polynomials C(x, y) that are constant on every column x, so that
   ² We assume for simplicity that C · ai · |F| are all integers.




R(x, y) = C(x, y) = i if and only if (x, y) ∈ Ai × Ai . That is, R(x, y) ≡ i for all y ∈ Ai , and C(x, y) ≡ i for
all x ∈ Ai . In particular, we have R(x, y) = C(x, y) if and only if (x, y) ∈ ∪_i (Ai × Ai ), and hence

                        Pr_{(x,y)∈F^2} [R(x, y) = C(x, y)] = ∑_{i=1}^N (C · ai )^2 .

Proposition 2.3 follows rather immediately by the convergence properties of the partial sums ∑_{i=1}^N ai^2
and the definition of R and C.
     We now proceed to the formal proof. We start by examining the two sequences {c1 (N)}N∈N and
{c2 (N)}N∈N defined as follows. For all i ≥ 1 let
                                ai = 1 / ( i · (log2 (i + 1))^2 ) ,

and let c1 (N) = ∑_{i=1}^N ai and c2 (N) = ∑_{i=1}^N ai^2 . Note that both c1 (N) and c2 (N) are non-decreasing and
uniformly bounded³ sequences, and hence both converge as N → ∞. We first show that

                                1 ≥ c2 (N)/c1 (N)^2 > 1/3
for every N. The ratio is clearly at most 1, as
                        c1 (N)^2 = ( ∑_{i=1}^N ai )^2 ≥ ∑_{i=1}^N ai^2 = c2 (N)

for every N. To show the other direction, we can compute that c2 (N) ≥ 1 and c1 (N) ≤ 1.63 for all N ≥ 1.
It follows that
                                c2 (N)/c1 (N)^2 > 1/1.63^2 > 1/3
for every N.
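These numeric claims about c1(N) and c2(N) are easy to spot-check by computing the partial sums directly; the loop below is a finite check up to N = 10^5, not a proof for all N:

```python
import math

# Numeric spot-check of the bounds above: compute the partial sums c1(N),
# c2(N) directly and verify 1 >= c2(N)/c1(N)^2 > 1/3 along the way.
def a(i):
    return 1.0 / (i * math.log2(i + 1) ** 2)

c1 = c2 = 0.0
for i in range(1, 100001):
    ai = a(i)
    c1 += ai
    c2 += ai * ai
    assert 1.0 >= c2 / c1**2 > 1.0 / 3.0

# a(1) = 1 already gives c2 >= 1, and the partial sums of c1 stay below 1.63.
assert c2 >= 1.0 and c1 < 1.63
```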
    Let δ , ε be such that 1/12 > δ ≥ ε > 0, and set N = ⌈log(1/ε)/ε⌉. Redefine c1 := c1 (N) and
c2 := c2 (N). From the previous calculations, we have that
                                δ < 1/12 < c2 /(4c1^2) < 1 .
Let F be a field of order
                |F| > 3c2 r/(aN^2 δ ) = 3c2 r N^2 (log2 (N + 1))^4 / δ ≥ 3c2 log^6 (1/ε) · r / (δ · ε^2 ) .
   ³ The partial sum ∑_{i=1}^N ai can be upper bounded by

 a1 + a2 + ∑_{i=3}^N 1/(i (log2 i)^2 ) ≤ a1 + a2 + ∫_2^N 1/(x (log2 x)^2 ) dx = a1 + a2 + [ −ln^2 (2)/ln(x) ]_2^N = a1 + a2 − ln^2 (2)/ln(N) + ln(2) ≤ a1 + a2 + ln(2),

and the partial sum ∑_{i=1}^N ai^2 is upper bounded by ∑_{i=1}^N ai .



In particular, we have

                                |F| > 3c2 r/(aN^2 δ ) ≥ 3c2 r/(ai^2 δ )
for all i ∈ [N].
     We now show that there are disjoint subsets A1 , . . . , AN ⊆ F of size |Ai | = ⌈ √(δ /c2 ) · ai |F| ⌉. We have
that
                ∑_{i=1}^N |Ai | = ∑_{i=1}^N ⌈ √(δ /c2 ) · ai |F| ⌉ ≤ (4/3) · √(δ /c2 ) · ∑_{i=1}^N ai |F| ,
where the last inequality holds since

                                √(δ /c2 ) · ai |F| ≥ (3/ai ) · √(c2 /δ ) ≥ 3
for every i ∈ [N], by our assumption on |F|. Since δ < c2 /(4c1^2 ), it follows that ∑_{i=1}^N |Ai | < (2/3)|F|, and
hence such sets Ai exist.
    We now construct R(x, y) and C(x, y) that satisfy the conditions in the proposition. Fix N elements of
F arbitrarily, and (for simplicity of notation) identify these elements with 1, . . . , N. For every y ∈ Ai , set
the row polynomial R(x, y) equal to i, and for every x ∈ Ai , set the column polynomial C(x, y) equal to i.
For all y ∉ ∪_i Ai let R(x, y) = N + 1, and for all x ∉ ∪_i Ai let C(x, y) = N + 2. Note that R(x, y) = C(x, y) if
and only if x, y ∈ Ai for some i. In particular,
                Pr_{x,y} [R(x, y) = C(x, y)] = ∑_{i=1}^N ( |Ai |/|F| )^2 = ∑_{i=1}^N ( ⌈ √(δ /c2 ) · ai |F| ⌉ / |F| )^2 .

The latter expression is at least

                        ∑_i ( √(δ /c2 ) · ai )^2 = ∑_i (δ /c2 ) · ai^2 = δ

and at most

                        ∑_i ( (4/3) · √(δ /c2 ) · ai )^2 < 2 · ∑_i (δ /c2 ) · ai^2 = 2δ ,

and hence R and C satisfy the requirement that

                        Pr_{x,y} [R(x, y) = C(x, y)] = ∑_{i=1}^N ( |Ai |/|F| )^2 ∈ [δ , 2δ ] .
By the Polynomial Identity Lemma,⁴ any non-constant polynomial of individual degree r (and therefore
total degree 2r) can be equal to i on at most 2r · |Ai | points in each Ai × Ai box, and hence agrees with
both R and C on at most 2r ∑ |Ai | ≤ 2r|F| points in total. Since
                                |F| > 3c2 r/(ai^2 δ )
   ⁴ This is also known as the Schwartz–Zippel Lemma; we refer to [4], Section 3.1, for a discussion of the history of the lemma.




for every i, it follows that |Ai |^2 > 2r|F| for every i ∈ [N], and therefore the t polynomials Qi that
maximize the probability Prx,y [∃i ∈ [t] s.t. R(x, y) = C(x, y) = Qi (x, y)] must be the constant polynomials
Q1 ≡ 1, . . . , Qt ≡ t.
    Since
                        Pr_{x,y} [R(x, y) = C(x, y) = Qi (x, y)] = ( |Ai |/|F| )^2 ≥ (δ /c2 ) · ai^2
for each i ∈ [N], we have
                                                                                            N                 2
                                                                                                       |Ai |
          Pr[∃i ∈ [t] s.t. R(x, y) = C(x, y) = Qi (x, y)] = Pr[R(x, y) = C(x, y)] − ∑
          x,y                                                        x,y
                                                                                           i=t+1       |F|
                                                                                            N
                                                                                                 δ 2
                                                                 ≤ Pr[R(x, y) = C(x, y)] − ∑        ai .
                                                                                           i=t+1 c2
                                                                     x,y


Note that
                                ∑_{i=t+1}^∞ ai^2 = Θ( 1/(t · polylog(t)) ) ,

and since N > log(1/ε)/ε it holds that

                                ∑_{i=N}^∞ (δ /c2 ) · ai^2 < (δ /c2 ) · ε < ε .

Thus,

        ∑_{i=t+1}^N (δ/c^2) · a_i^2 ≥ Ω(δ/(t · polylog(t))) − ε .

This implies that

        Pr_{x,y}[∃i ∈ [t] s.t. R(x, y) = C(x, y) = Q_i(x, y)] < Pr_{x,y}[R(x, y) = C(x, y)] − Ω(δ/(t · polylog(t))) + ε .

Therefore, if the left hand side of the above is at least Prx,y [R(x, y) = C(x, y)] − ε, then we have that
t > (δ /ε)/polylog(δ /ε), as required.


6     Preliminaries for Theorem 2.6
6.1     Linear codes
A linear code C ⊆ Fn is a linear subspace of the vector space Fn. Each codeword w in C is a string of
length n, which is the block length of the code. The dimension of the code dim(C) is the dimension of C
as a vector subspace of Fn. For any two words w and v in Fn, the Hamming distance between w and v, denoted
by ∆(w, v), is the number of indices i where wi ≠ vi. Formally, ∆(w, v) = |{i ∈ [n] : wi ≠ vi}|. The
relative distance between w and v is δ(w, v) = ∆(w, v)/n, which is the fraction of points where w and v
disagree. For any subset S of [n], we will define ∆|S(w, v) to be |{i ∈ S : wi ≠ vi}|, which is the Hamming

                        Theory of Computing, Volume 16 (5), 2020, pp. 1–34
                       Alessandro Chiesa, Peter Manohar, and Igor Shinkar

distance between w and v on the subset S. Similarly, δ|S(w, v) = ∆|S(w, v)/|S|. The distance d of a
code C is the minimum Hamming distance between any two distinct codewords of C, i. e., d = d(C) =
min_{w≠v∈C} ∆(w, v). For any w in Fn, the distance from w to C is defined as ∆(w,C) = min_{v∈C} ∆(w, v),
and the relative distance is defined similarly. For any subset S ⊆ [n], the distance from w to C on S is
∆|S (w,C) = minv∈C ∆|S (w, v). We will write δ (w) instead of δ (w,C) when the code is clear from the
context.
     Linear codes have a unique extension property.
Claim 6.1 (Unique Extension). Let C ⊆ Fn be a code with block length n and distance d. Let I be a subset
of [n] of size at least n − d + 1. Let C′ be the restriction of the code C to the subset I. Then, for every
codeword w ∈ C′ there exists a unique v ∈ C such that v|I = w.
Proof. By definition, for every w in C′ there must exist at least one v in C such that v|I = w. Suppose
there exist v1 and v2 such that v1|I = v2|I = w. Then ∆(v1, v2) ≤ n − |I| ≤ d − 1, as v1 and v2
agree on I. Since v1 and v2 are codewords and ∆(v1, v2) < d, it must be that v1 = v2. Therefore, the codeword
w has a unique extension to C.
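As a concrete illustration of Claim 6.1 (a sketch of ours, not code from the paper), the snippet below works with a Reed–Solomon code over GF(7): since the code is MDS, any n − d + 1 = k coordinates determine the codeword, and the unique extension is computed by Lagrange interpolation. All parameter choices here are arbitrary.

```python
p = 7                      # field size (prime, so GF(p) arithmetic is mod p)
k = 3                      # dimension: codewords are evals of degree-<k polys
n, d = p, p - k + 1        # block length and distance; n - d + 1 = k

def encode(coeffs):
    """Evaluate the polynomial with the given coefficients at 0..p-1."""
    return [sum(c * pow(x, e, p) for e, c in enumerate(coeffs)) % p
            for x in range(p)]

def extend(I, values):
    """Lagrange-interpolate the unique codeword agreeing with `values` on I."""
    def poly_at(x):
        total = 0
        for i, yi in zip(I, values):
            num, den = 1, 1
            for j in I:
                if j != i:
                    num = num * (x - j) % p
                    den = den * (i - j) % p
            # pow(den, p - 2, p) is the modular inverse of den (Fermat)
            total = (total + yi * num * pow(den, p - 2, p)) % p
        return total
    return [poly_at(x) for x in range(n)]

w = encode([2, 5, 1])                      # some codeword
I = [0, 3, 6]                              # |I| = n - d + 1 = 3 coordinates
assert extend(I, [w[i] for i in I]) == w   # unique extension recovers w
```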

6.2    Tensor product codes
For two linear codes C1 ⊆ Fn1 and C2 ⊆ Fn2, the tensor product of C1 and C2, denoted by C1 ⊗ C2, is
the linear code in Fn1·n2 whose codewords are the n1 × n2 matrices M ∈ Fn1×n2 in which every column is
a codeword of C1 and every row is a codeword of C2. In particular, we have dim(C1 ⊗C2) = dim(C1) ·
dim(C2 ) and d(C1 ⊗ C2 ) = d(C1 ) · d(C2 ). We then extend this definition inductively to m ≥ 3 codes
Ci ⊆ Fni , setting C1 ⊗ · · · ⊗ Cm := (C1 ⊗ · · · ⊗ Cm−1 ) ⊗ Cm . So C1 ⊗ · · · ⊗ Cm has block length n1 · · · nm ,
distance d(C1 ) · · · d(Cm ), and dimension dim(C1 ) · · · dim(Cm ).
     Furthermore, f ∈ C1 ⊗ · · · ⊗Cm if and only if it can be written as an m-dimensional n1 × n2 × · · · × nm
tensor, where the entries are values in F, and each line parallel to the i-th axis is in Ci . It is easy to see that
 f is in C1 ⊗ · · · ⊗Cm if and only if for each i, the restriction of f to any (m − 1)-dimensional axis-parallel
hyperplane perpendicular to the i-th axis is in C1 ⊗ · · · ⊗Ci−1 ⊗Ci+1 ⊗ · · · ⊗Cm .
     If C1 = · · · = Cm = C then we call the product C1 ⊗ · · · ⊗Cm the m-th tensor power of C and denote it
by Cm . It is worth noting that the fractional distance of the code Cm is (d/n)m , so the fractional distance
of the code decays exponentially in m.
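The row/column characterization above can be checked directly. The sketch below (illustrative only; the two small binary generator matrices are arbitrary choices of ours) verifies that an outer product of codewords lies in C1 ⊗ C2, and that flipping a single entry breaks membership.

```python
import itertools

def span(gen):
    """All F2-linear combinations of the generator rows."""
    words = set()
    for coeffs in itertools.product([0, 1], repeat=len(gen)):
        w = tuple(sum(c * g[j] for c, g in zip(coeffs, gen)) % 2
                  for j in range(len(gen[0])))
        words.add(w)
    return words

C1 = span([(1, 1, 0), (0, 1, 1)])          # a [3,2] binary code
C2 = span([(1, 1, 1, 1), (1, 0, 1, 0)])    # a [4,2] binary code

def in_tensor(M):
    cols_ok = all(tuple(row[j] for row in M) in C1 for j in range(len(M[0])))
    rows_ok = all(tuple(row) in C2 for row in M)
    return cols_ok and rows_ok

u, v = (1, 1, 0), (1, 0, 1, 0)             # u in C1, v in C2
M = [[ui * vj for vj in v] for ui in u]    # outer product u ⊗ v
assert in_tensor(M)                         # rank-1 tensors are codewords
M[0][0] ^= 1                                # flip one entry
assert not in_tensor(M)                     # the flip breaks a row and a column
```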
     Tensor product codes have a unique extension property that will be used many times in the proof of
Theorem 2.6.
Claim 6.2 (Unique Extension for Tensor Product Codes). Let {Cb}_{b=1}^m be codes with block length nb and
distance db. Let Ib ⊆ [nb] be a set of size at least nb − db + 1, and let C′b be the projection of Cb to Ib. Then
for every w ∈ C′ = C′1 ⊗ · · · ⊗ C′m there exists a unique v in C = C1 ⊗ · · · ⊗ Cm such that v|I1×···×Im = w.
Proof. By Claim 6.1, for all b ∈ [m] the projection map πb : Cb → C′b is bijective. We can extend πb to
a bijective map from the hybrid code C′1 ⊗ · · · ⊗ C′b−1 ⊗ Cb ⊗ · · · ⊗ Cm to C′1 ⊗ · · · ⊗ C′b ⊗ Cb+1 ⊗ · · · ⊗ Cm:
for any v in the first hybrid code, define πb(v) = v|I1×···×Ib×[nb+1]×···×[nm], which is the projection of v to
Ib along the b-th axis and the identity map along every other axis. Each πb is still a bijection, and so the
composition of maps π = πm ◦ πm−1 ◦ · · · ◦ π1 is a bijection from C to C′, which proves the
claim.


6.3     Locally testable codes and robust tests
A q-query test T for a code C ⊆ Fn is a probabilistic algorithm that, given oracle access to a word w ∈ Fn ,
makes q (non-adaptive) queries to w and then accepts or rejects. Informally, C is locally testable if there
is a test T that accepts (with probability 1) whenever w is in C, and rejects (say with probability at least
0.5) when w is far from C.
     The expected local view distance ρT(w) of T on a word w is the average, over the local views of T,
of the distance of the local view of w to an accepting view. Instead of analyzing the local testability of Cm
directly, we consider a stronger notion of local testability called robustness, which was introduced in [10] and adapted to
locally testable codes in [11]. The test T is α-robust if ρT(w) ≥ α · δ(w,C) for every word w ∈ Fn. The
“strength” of the test increases with α, so the goal is to establish the largest α for which this inequality
holds.
     For formal definitions and more details on the subject see, e. g., [23].

6.4     The axis-parallel hyperplane test
Given a linear code C, we define below the axis-parallel hyperplane test for the tensor code Cm . An
(m − 1)-dimensional axis-parallel hyperplane H ⊆ [n]m is a set of the form H = {p ∈ [n]m : pb = i} for
some b ∈ [m] and i ∈ [n]. Therefore, there are nm such (m − 1)-dimensional axis-parallel hyperplanes in
total, and each such hyperplane can be specified by the pair (b, i). We will use the notation (b, i) to refer
to the hyperplane H = {p ∈ [n]m : pb = i}.
Definition 6.3. Let C be a linear code, and let Cm be the m-th tensor power of C. The axis-parallel
hyperplane test H for Cm is the test that, given a word M ∈ F^{n^m}, samples a random (m − 1)-dimensional
axis-parallel hyperplane H and checks whether M|H ∈ Cm−1.
    For M ∈ F^{n^m} and an axis-parallel hyperplane H in [n]m, we define gH to be the closest Cm−1 codeword
to M|H. If this codeword is not unique, then we break ties by picking an arbitrary closest codeword.
Using this notation, the expected local view distance ρ(M) can be expressed as

        ρ(M) = EH[δ|H(M, Cm−1)] = EH[δ|H(M, gH)] ,

where the expectation is taken over all axis-parallel hyperplanes H.
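For intuition, ρ(M) can be computed by brute force at toy parameters. The following sketch (ours, not the paper's code) does so for the [3, 1] repetition code with m = 2, where the axis-parallel hyperplanes are simply the rows and columns of M.

```python
n, m = 3, 2
C = [(0, 0, 0), (1, 1, 1)]          # [3,1] repetition code; here C^(m-1) = C

def dist(u, v):
    return sum(a != b for a, b in zip(u, v))

def rho(M):
    # local views: row i is the hyperplane (1, i), column j is (2, j)
    views = [tuple(M[i]) for i in range(n)]
    views += [tuple(M[i][j] for i in range(n)) for j in range(n)]
    # average over hyperplanes of the relative distance of the view to C^(m-1)
    return sum(min(dist(v, w) for w in C) for v in views) / (len(views) * n)

M = [[0, 0, 0], [0, 0, 0], [0, 1, 0]]    # one entry flipped from the zero word
assert abs(rho(M) - 2 / 18) < 1e-12      # row 2 and column 1 each at distance 1
```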
Definition 6.4. The test H is α-robust if ρ(M) ≥ α · δ(M, Cm) for every word M ∈ F^{n^m}, where δ(M, Cm)
is the relative distance of the word M to the code Cm, and ρ(M) is the expected local view distance of M.
      Note that the robustness α of the test H is at most 1.
Lemma 6.5. The robustness of the axis-parallel hyperplane test H is α ≤ 1.
Proof. Let f be any Cm codeword such that δ(M) = δ(M, f). Then

        δ(M) = δ(M, f) = (1/(nm)) · ∑_H δ|H(M, f) ≥ (1/(nm)) · ∑_H δ|H(M, gH) = ρ(M) ,

since gH is at least as close to M|H as f|H is, because f|H ∈ Cm−1. Thus α ≤ ρ(M)/δ(M) ≤ 1.

      In Section 9 we discuss the difference between robustness and soundness for the test H.


7     Proof of Theorem 2.6
We prove Theorem 2.6. We start by recalling some notation. Let C be a linear code with distance d and
block length n over F, and let Cm be the m-th tensor power of C, for some m ≥ 3. Let M be the input to
the test H, which is the evaluation table of a function [n]m → F. Define gH to be the closest Cm−1
codeword to M|H, where ties are broken by picking an arbitrary closest codeword. We will view M as fixed
throughout the analysis.
    Our goal is to show that ρ(M) ≥ α · δ(M), for

        α = (1/12) · (d/n)^m .

In order to do that we assume that ρ(M) is small, and use this to deduce that δ (M) is also small. The
assumption that ρ(M) is small means that the restrictions M|H are close on average to gH , and the proof
uses this assumption to “stitch” together the gH into a global codeword f that is close to M.
    We begin by defining the inconsistency graph G. The graph G has a vertex for each hyperplane, and
an edge between two hyperplanes H and H′ if they have nonempty intersection and their respective
local codewords gH and gH′ are inconsistent, i. e., they disagree on some point p in their intersection
H ∩ H′.

Definition 7.1 (Inconsistency Graph). The inconsistency graph G of the test H is a graph where V is the
set of hyperplanes, and E = {(H, H′) : ∃p ∈ H ∩ H′ s.t. gH(p) ≠ gH′(p)}.

     The proof will be divided into several steps. First, we will show that if G contains a large independent
set, namely a large set of planes which are all consistent with each other, then there is a global codeword
 f that stitches together all of the local codewords gH for every H in the independent set. Then, we will
show that every edge in G is adjacent to a vertex of (somewhat) large degree. This will imply that the set
of vertices that have large degree is a vertex cover, and its complement is an independent set. We will then
give an upper bound on the number of vertices with large degree in terms of ρ(M), and therefore if ρ(M)
is small then the vertex cover is small and hence the independent set is large. Using these components we
will conclude the proof.
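The inconsistency graph can be materialized directly at toy parameters. The sketch below (ours; m = 2, so the "hyperplanes" are rows and columns, and gH is a brute-force closest codeword) builds the edge set of G for the [3, 1] repetition code.

```python
n = 3
C = [(0, 0, 0), (1, 1, 1)]              # [3,1] repetition code

def closest(view):                       # g_H: an arbitrary closest codeword
    return min(C, key=lambda w: sum(a != b for a, b in zip(view, w)))

def inconsistency_graph(M):
    g = {("row", i): closest(tuple(M[i])) for i in range(n)}
    g.update({("col", j): closest(tuple(M[i][j] for i in range(n)))
              for j in range(n)})
    # row i and column j intersect in the single point (i, j); an edge means
    # the two local codewords disagree there
    return {(("row", i), ("col", j))
            for i in range(n) for j in range(n)
            if g[("row", i)][j] != g[("col", j)][i]}

M = [[1, 1, 1], [1, 1, 1], [0, 0, 1]]
# row 2 decodes to 000 while every column decodes to 111, giving 3 edges
assert len(inconsistency_graph(M)) == 3
```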

7.1   Step 1: the case of a large independent set
We will show that if G has a large independent set I, then there is an f in Cm that agrees with gH on
H for every H in I. In other words, f is the codeword of Cm that stitches together all of the gH in the
independent set. The proof relies on Claim 6.2.

Lemma 7.2 (Interpolation). If G has an independent set I of size |I| > (m − 1)(n − d) + n, then there
exists f in Cm such that f |H = gH for every H ∈ I.

Our proof of this lemma is similar to the proof of a different lemma in [11].

Proof. Define Ib to be the set of i ∈ [n] such that the plane (b, i) is in I. Since |I| > (m − 1)(n − d) + n,
there must exist b1 ≠ b2 such that |Ib1| and |Ib2| are at least n − d + 1, as otherwise |I| = ∑_{b=1}^m |Ib| ≤



(m − 1)(n − d) + n. Without loss of generality assume b1 = 1 and b2 = 2. Let S = I1 × I2 × [n]m−2 and
define g : S → F by setting g(p) = gH(p) for every p ∈ S, where H is any plane in I of the form (1, i) or (2, j) such
that p ∈ H. Note that g is well defined since all the planes in I are consistent with each other, as I is an
independent set.
      We claim that g ∈ C|I1 ⊗ C|I2 ⊗ Cm−2. This is because for any plane H = (1, i) with i ∈ I1 it holds that g|H ∈ C|I2 ⊗ Cm−2,
as g|H equals gH with the second axis restricted to I2. This means that for every axis b ≠ 1, 2
and for every line ℓb parallel to the b-th axis it holds that g|ℓb ∈ C. Also, for every line ℓ2 parallel to the
second axis we have that g|ℓ2 ∈ C|I2, because we took a Cm−1 codeword and restricted it to the subset
I2. By symmetry we can repeat the same argument, swapping axis 1 and axis 2, and hence for
every line ℓ1 parallel to the first axis it must hold that g|ℓ1 ∈ C|I1. Thus, g ∈ C|I1 ⊗ C|I2 ⊗ Cm−2. Since
|I1| and |I2| are at least n − d + 1, we can apply Claim 6.2 to the code C|I1 ⊗ C|I2 ⊗ Cm−2 to extend g to a
unique codeword f ∈ Cm.
      We still need to show that f|H = gH for every H ∈ I. By the definition of Cm we have f|H ∈ Cm−1.
There are three cases. If H is of the form (1, i) with i ∈ I1, then f agrees with gH on the subset I2 × [n]m−2 of H, because
gH|I2×[n]m−2 = g|I2×[n]m−2 = f|I2×[n]m−2. Similarly, if H = (2, j) with j ∈ I2, then f agrees with gH on a subset of the form
I1 × [n]m−2, and if H ∈ I is a plane of any other axis, then f agrees with gH on a subset of the form I1 × I2 × [n]m−3. In all three
cases, since |I1| and |I2| are at least n − d + 1, by Claim 6.2 there is a unique codeword w ∈ Cm−1
that equals f|H (equivalently, gH) on that subset of H. But f|H is in Cm−1, so by the uniqueness of the extension it
follows that f|H = gH.


7.2   Step 2: the structure of G
We will now show that every edge (H, H′) in G is adjacent to a vertex of large degree. The proof uses the
structure of Cm to show that if two planes disagree on a point, they must disagree on many points, and
these points have a certain structure. Using the structure of these points, we find (m − 2)d planes that
intersect H ∩ H′ in at least one point where gH and gH′ disagree, and therefore each of these new planes
must be adjacent to at least one of H and H′.

Lemma 7.3. If (H, H′) ∈ E, then deg(H) + deg(H′) ≥ (m − 2)d.

A similar lemma appears in [11], but the graph they consider is different from ours.

Proof. Without loss of generality assume that H = (1, i) and H′ = (2, j). Fix k ∈ {3, . . . , m}. Let Ik be the
set of indices ℓ ∈ [n] such that the plane (k, ℓ) is adjacent to neither H nor H′. Suppose |Ik| ≥ n − d + 1.
Then gH|Ik×[n]m−3 = gH′|Ik×[n]m−3, since for each ℓ ∈ Ik both gH and gH′ agree with g(k,ℓ) on the triple
intersection H ∩ H′ ∩ (k, ℓ). Since |Ik| ≥ n − d + 1, by Claim 6.2 gH|Ik×[n]m−3 can be extended to a
unique w ∈ Cm−2, and so w = gH|H∩H′. Similarly, gH′|Ik×[n]m−3 can be extended to a unique v ∈ Cm−2,
and so v = gH′|H∩H′. However, since both gH|H∩H′ and gH′|H∩H′ agree on Ik × [n]m−3, the uniqueness
of the extension implies that they are equal, contradicting the fact that (H, H′) is an edge of the graph.
Therefore, |Ik| ≤ n − d for every k. This means that for each fixed k there are at least d planes (k, ℓ) such
that gH and gH′ disagree on the intersection of all three planes. Since gH and gH′ disagree there, g(k,ℓ) can agree
with at most one of them, so at least one of (H, (k, ℓ)) and (H′, (k, ℓ)) is an edge. This holds for at least d
planes for every k, which is a total of (m − 2)d planes. Therefore, deg(H) + deg(H′) ≥ (m − 2)d.


    Thus, for every edge (H, H′) at least one of H and H′ has degree ≥ (m − 2)d/2, so we deduce the following
corollary.
Corollary 7.4 (Vertex Cover). The set L of vertices of degree ≥ (m − 2)d/2 is a vertex cover.

7.3   Step 3: relating the expected local distance to the vertex cover
We now relate the set of vertices of large degree to the expected local view distance of the test H. The
main idea is to put the expression for ρ(M) into a particular form, and then apply the triangle inequality
to express ρ(M) as a sum over edges in the graph. Using a simple relation between |L| and |E|, the lemma
follows.
Lemma 7.5. Let L be the set of vertices of degree ≥ (m − 2)d/2 as in Corollary 7.4. Then

        ρ(M) ≥ (m − 2)/(4(m − 1)) · d^{m−1}/n^{m−1} · |L|/(nm) .
Proof. By definition,

        ρ(M) = (1/(n^m·m)) · ∑_H ∆(M|H, gH) .

For any H = (b, i),

        ∆(M|H, gH) = (1/(m − 1)) · ∑_{c∈[m]\{b}} ∑_{j∈[n]} ∆|H∩(c,j)(M, gH) = (1/(m − 1)) · ∑_{H′: H∩H′≠∅} ∆|H∩H′(M, gH) .

This is because for any point p ∈ H and for any axis c ≠ b, the point p is in the intersection H ∩ (c, j) for
exactly one j. Therefore,

        ρ(M) = (1/(n^m·m)) · ∑_H ∆(M|H, gH) = (1/(n^m·m(m − 1))) · ∑_H ∑_{H′: H∩H′≠∅} ∆|H∩H′(M, gH) .

Every pair (H, H′) with H ∩ H′ ≠ ∅ appears exactly twice in the sum, contributing ∆|H∩H′(M, gH) and
∆|H∩H′(M, gH′) to the sum. Therefore,

        ρ(M) = (1/(n^m·m(m − 1))) · ∑_{(H,H′): H∩H′≠∅} ( ∆|H∩H′(M, gH) + ∆|H∩H′(M, gH′) )
             ≥ (1/(n^m·m(m − 1))) · ∑_{(H,H′): H∩H′≠∅} ∆|H∩H′(gH, gH′)
             = (1/(n^m·m(m − 1))) · ∑_{(H,H′)∈E} ∆|H∩H′(gH, gH′) ,

where the inequality is the triangle inequality, and the last equality holds because (H, H′) ∉ E implies
∆|H∩H′(gH, gH′) = 0 by definition. Fix (H, H′) ∈ E. The local codewords gH
and gH′ are both in Cm−1, so gH|H∩H′ and gH′|H∩H′ are both Cm−2 codewords. In particular, since
∆|H∩H′(gH, gH′) > 0, they are distinct codewords, and so ∆|H∩H′(gH, gH′) ≥ d^{m−2}.

Therefore,

        ρ(M) ≥ (1/(n^m·m(m − 1))) · ∑_{(H,H′)∈E} ∆|H∩H′(gH, gH′) ≥ |E|·d^{m−2}/(n^m·m(m − 1)) .



Since L is the set of vertices of degree ≥ (m − 2)d/2,

        2|E| = ∑_H deg(H) ≥ ∑_{H∈L} deg(H) ≥ |L|·(m − 2)d/2 ,

and hence |E| ≥ |L|·(m − 2)d/4. Thus,

        ρ(M) ≥ |E|·d^{m−2}/(n^m·m(m − 1)) ≥ (m − 2)·|L|·d^{m−1}/(4·n^m·m(m − 1)) = (m − 2)/(4(m − 1)) · d^{m−1}/n^{m−1} · |L|/(nm) .
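As a sanity check (ours, not part of the proof), Lemma 7.5 can be verified exhaustively in the smallest nontrivial case, the [2, 1] repetition code (n = d = 2) with m = 3, where the bound reads ρ(M) ≥ |L|/48.

```python
import itertools

n, m = 2, 3
SQ = [(0,) * 4, (1,) * 4]                # the two C^2 codewords, flattened
points = list(itertools.product(range(n), repeat=m))
planes = [(b, i) for b in range(m) for i in range(n)]

def view(M, b, i):                       # restriction of M to hyperplane (b, i)
    return tuple(M[p] for p in points if p[b] == i)

for bits in itertools.product([0, 1], repeat=n ** m):
    M = dict(zip(points, bits))
    g = {H: min(SQ, key=lambda w: sum(x != y for x, y in zip(view(M, *H), w)))
         for H in planes}
    rho = sum(sum(x != y for x, y in zip(view(M, *H), g[H])) for H in planes) \
          / (len(planes) * n ** (m - 1))
    # each g_H here is a constant codeword, so two planes on different axes
    # are inconsistent exactly when their constants differ
    deg = {H: sum(1 for H2 in planes if H2[0] != H[0] and g[H2] != g[H])
           for H in planes}
    L = [H for H in planes if deg[H] >= (m - 2) * 2 / 2]   # threshold (m-2)d/2
    assert rho >= len(L) / 48            # Lemma 7.5 with these parameters
```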

7.4     Putting things together
We are now ready to prove Theorem 2.6. The result follows from straightforward applications of the
previous steps.

Proof of Theorem 2.6. Suppose first that

        ρ(M) ≥ (m − 2)/(4m) · d^m/n^m .

Then, since δ(M) ≤ 1, we trivially have

        ρ(M) ≥ (m − 2)/(4m) · d^m/n^m ≥ (m − 2)/(4m) · d^m/n^m · δ(M) .
    Now, suppose that

        ρ(M) < (m − 2)/(4m) · d^m/n^m .

Then by Lemma 7.5 we have

        (m − 2)/(4m) · d^m/n^m > ρ(M) ≥ (m − 2)/(4(m − 1)) · d^{m−1}/n^{m−1} · |L|/(nm) ,

and so |L| < (m − 1)d. For every f in Cm, using the triangle inequality we have

        δ(M) ≤ δ(M, f) = (1/(nm)) · ∑_H δ|H(M, f) ≤ (1/(nm)) · ∑_H δ|H(M, gH) + (1/(nm)) · ∑_H δ|H(gH, f) .

Recalling that

        ρ(M) = (1/(nm)) · ∑_H δ|H(M, gH) ,

we get that

        δ(M) ≤ ρ(M) + (1/(nm)) · ∑_H δ|H(gH, f) .

Since L is a vertex cover, the set L̄ := V \ L is an independent set. Since |L| < (m − 1)d,

        |L̄| = nm − |L| > nm − (m − 1)d = (m − 1)(n − d) + n .


By Lemma 7.2, there exists f* ∈ Cm such that f*|H = gH for every H ∈ V \ L. Thus,

        δ(M) ≤ ρ(M) + (1/(nm)) · ∑_H δ|H(gH, f*) = ρ(M) + (1/(nm)) · ∑_{H∈L} δ|H(gH, f*) ≤ ρ(M) + |L|/(nm) .

By Lemma 7.5,

        ρ(M) ≥ (m − 2)/(4(m − 1)) · d^{m−1}/n^{m−1} · |L|/(nm) .

Therefore,

        |L|/(nm) ≤ (4(m − 1)·n^{m−1}/((m − 2)·d^{m−1})) · ρ(M) ,
and so

                                       4(m − 1)nm−1
                                                   
                        |L|                                            1
         δ (M) ≤ ρ(M) +     ≤ ρ(M) 1 +          m−1
                                                      =⇒ ρ(M) ≥            m−1 δ (M) .
                        nm             (m − 2)d                 1 + 4(m−1)nm−1       (m−2)d

Thus, ∀M, ρ(M) ≥ αδ (M), for
                                                                     
                                                       1    m − 2 d m
                                 α = min             m−1 ,         m
                                                                       .
                                           1 + 4(m−1)nm−1 4m n
                                                     (m−2)d

Since m ≥ 3, we have that

        1/(1 + 4(m − 1)·n^{m−1}/((m − 2)·d^{m−1})) ≥ 1/(1 + 8·n^{m−1}/d^{m−1}) ≥ d^{m−1}/(9·n^{m−1})

and

        (m − 2)/(4m) · d^m/n^m ≥ (1/12) · d^m/n^m .

Therefore,

        α ≥ min{ d^{m−1}/(9·n^{m−1}) , (1/12) · d^m/n^m } = (1/12) · d^m/n^m .
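The final minimum can be sanity-checked numerically (our check, not part of the proof): since d ≤ n, the term (1/12) · d^m/n^m is always the smaller of the two.

```python
# check min{d^(m-1)/(9 n^(m-1)), (1/12)(d/n)^m} = (1/12)(d/n)^m for d <= n
for m in range(3, 9):
    for n in range(2, 13):
        for d in range(1, n + 1):
            a = (d / n) ** (m - 1) / 9
            b = (d / n) ** m / 12
            # b/a = (d/n) * 9/12 <= 3/4, so b is always the minimum
            assert min(a, b) == b
```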

8    Other results
Below we prove several results that are incomparable to Theorem 2.6.
    We have already shown in Theorem 2.6 that H is robust with α ≥ (1/12) · (d/n)^m. Most of the proof was
dedicated to analyzing the test when the number of vertices of large degree, |L|, was less than (m − 1)d. In
this same regime, we can prove an incomparable value for α. Specifically, we can show that for every M
such that |L| < (m − 1)d it holds that

        ρ(M) ≥ (1/(m + c)) · δ(M) ,
where c is a constant.


Claim 8.1. If |L| < (m − 1)d, then ρ(M) ≥ (1/(m + c)) · δ(M), for c = 32/9. Combined with Theorem 2.6, this
implies that

        ρ(M) ≥ max{ 1/(m + c) , (1/12) · (d/n)^m } · δ(M)

when |L| < (m − 1)d.
Proof. Let I be the set of planes that are not in L. By the assumption |L| < (m − 1)d, we have |I| >
(m − 1)(n − d) + n, and thus, by Lemma 7.2, there exists f ∈ Cm such that f|H = gH for all H ∈ I.
    Let K = {p ∈ [n]m : p ∉ H for all H ∈ I} be the set of points that are not contained in any plane in I.
For each b ∈ [m] let Ib = {i ∈ [n] : (b, i) ∈ I}. It is clear that |I| = ∑_{b∈[m]} |Ib|, and we can rewrite K as
K = {p ∈ [n]m : pb ∉ Ib for all b ∈ [m]}. Therefore,

        |K| = ∏_{b=1}^m (n − |Ib|) ≤ (n − (1/m)·∑_{b=1}^m |Ib|)^m = n^m · (1 − |I|/(nm))^m = n^m · (|L|/(nm))^m ,

where the inequality uses the AM-GM inequality. Now we show that δ(M, f) ≤ (m + c) · ρ(M). We start
by writing δ(M, f) as follows:

        δ(M, f) = (1/n^m) · |{p : M(p) ≠ f(p)}| = (1/n^m) · |{p ∈ K : M(p) ≠ f(p)}| + (1/n^m) · |{p ∉ K : M(p) ≠ f(p)}| .
The first term is upper bounded by |K|/n^m, and so it is at most (|L|/(nm))^m. In order to bound the second
term, note that for all p ∉ K there exists a plane Hp ∈ I such that p ∈ Hp, and thus f(p) = gHp(p).
Therefore,

        (1/n^m) · |{p ∉ K : M(p) ≠ f(p)}| = (1/n^m) · |{p ∉ K : M(p) ≠ gHp(p)}|
                                          ≤ (1/n^m) · |{p ∈ [n]m : M(p) ≠ gHp(p)}|
                                          ≤ (1/n^m) · ∑_{p∈[n]m} |{H : p ∈ H, M(p) ≠ gH(p)}|
                                          = m · ρ(M) .
This implies that

        δ(M, f) ≤ (|L|/(nm))^m + m · ρ(M) .

Next, using the bound |L| < (m − 1)d in the assumption of the claim, as well as the bound

        |L|/(nm) ≤ ρ(M) · (4(m − 1)/(m − 2)) · (n^{m−1}/d^{m−1})

from Lemma 7.5, we get that

        δ(M, f) ≤ ((m − 1)d/(nm))^{m−1} · ρ(M) · (4(m − 1)/(m − 2)) · (n^{m−1}/d^{m−1}) + m · ρ(M)
                = ( (1 − 1/m)^m · 4m/(m − 2) + m ) · ρ(M) .


For m ≥ 3 we get that δ (M) ≤ (m + 32/9)ρ(M), as required.
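The constant c = 32/9 can be sanity-checked numerically (our check): the coefficient (1 − 1/m)^m · 4m/(m − 2) is maximized at m = 3, where it equals exactly (2/3)^3 · 12 = 32/9, and it decreases toward 4/e as m grows.

```python
# the coefficient (1 - 1/m)^m * 4m/(m - 2) from the last display, for m >= 3
vals = {m: (1 - 1 / m) ** m * 4 * m / (m - 2) for m in range(3, 1000)}
assert abs(vals[3] - 32 / 9) < 1e-9          # maximum, attained at m = 3
assert max(vals.values()) <= 32 / 9 + 1e-9   # never exceeded afterwards
```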

Remark 8.2. In fact, by a slightly modified argument (writing ρ(M) as a sum over intersections of
k planes) we can prove that for |L| < (m − 1)d it holds that

        δ(M) ≤ ρ(M) · ( k + ck · n^{m−k}/d^{m−k} ) ,

where ck is a constant for each fixed k ∈ [m]. The proof of Theorem 2.6 used k = 1.
   We can also show that when |L| < (m − 1)d, we get robustness α = 1 up to an additive term of
d/n. Note that d is the distance of the code, so when d = Θ(n) the additive term is not small.
Claim 8.3. If |L| < (m − 1)d, then δ(M) ≤ ρ(M) + d/n.
Proof. In the proof of Theorem 2.6 we showed that if |L| < (m − 1)d, then

        δ(M) ≤ ρ(M) + |L|/(nm) ≤ ρ(M) + (m − 1)d/(nm) ≤ ρ(M) + d/n .
    Next, we observe that for any linear code C of dimension k we have a similar claim without the constraint
on |L|, if we replace d/n with (n − k)/n.
Claim 8.4. For any linear code C ⊆ Fn of dimension k, it holds that δ(M) ≤ ρ(M) + (n − k)/n uncondi-
tionally for all M ∈ F^{n^m}.
Proof. Define vb = ∑H=(b,i) ∆|H (M, gH ), and without loss of generality assume that v1 ≤ v2 ≤ · · · ≤ vm .
Observe that
                                                                1 m
                                                    ρ(M) =          ∑ vb .
                                                               nm m b=1

Let S ⊆ [n] be any set of k coordinates that are pivotal for C, i. e., for any values given to the
coordinates in S there exists a unique codeword in C that agrees with these values on S.^5 Let us identify S
with the corresponding (1, i) planes. Since S is a pivotal set for C, there exists f ∈ C^m such that f|_H = g_H
for every H ∈ S. Therefore,
\begin{align*}
\delta(M) \le \delta(M, f) &= \frac{1}{n^m} \sum_{H=(1,i)} \Delta|_H(M, f)
  \;\le\; \frac{1}{n^m} \sum_{H \in S} \Delta|_H(M, f) + \frac{1}{n^m} (n - |S|) \, n^{m-1} \\
&= \frac{1}{n^m} \sum_{H \in S} \Delta|_H(M, g_H) + \frac{n-k}{n}
  \;\le\; \frac{1}{n^m} \, v_1 + \frac{n-k}{n} \\
&\le \frac{1}{n^m \cdot m} \sum_{b=1}^{m} v_b + \frac{n-k}{n}
  \;=\; \rho(M) + \frac{n-k}{n}.
\end{align*}

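The bound of Claim 8.4 can be sanity-checked numerically. The following sketch is our own illustration, not part of the paper: it takes C to be the binary repetition code of length n = 3 (dimension k = 1), so C^2 consists of the all-zeros and all-ones 3×3 matrices, and exhaustively verifies δ(M) ≤ ρ(M) + (n − k)/n over all binary M.

```python
import itertools

n, k = 3, 1                                 # C = {000, 111}: repetition code
C = {(0, 0, 0), (1, 1, 1)}
# The tensor code C^2 consists of the all-zeros and all-ones 3x3 matrices.
tensor = [tuple((b,) * n for _ in range(n)) for b in (0, 1)]

def delta(M):
    """Fractional Hamming distance of M to the tensor code C^2."""
    return min(sum(M[i][j] != f[i][j] for i in range(n) for j in range(n))
               for f in tensor) / n**2

def rho(M):
    """Average fractional distance of the rows and columns of M to C."""
    lines = list(M) + [tuple(M[i][j] for i in range(n)) for j in range(n)]
    dist = lambda l: min(sum(a != b for a, b in zip(l, c)) for c in C)
    return sum(dist(l) for l in lines) / (2 * n * n)

# Exhaustive check of Claim 8.4 over all 2^9 binary matrices M.
ok = all(
    delta(M) <= rho(M) + (n - k) / n
    for bits in itertools.product((0, 1), repeat=n * n)
    for M in [tuple(tuple(bits[i * n + j] for j in range(n)) for i in range(n))]
)
print(ok)  # True
```

For this toy code the bound is far from tight (here δ(M) never exceeds 4/9, while the right-hand side is at least 2/3), but the check confirms the inequality holds unconditionally.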
    In particular, if C is an MDS code then k = n − d + 1, and so we get the following corollary.
Corollary 8.5. If C is an MDS code, then δ(M) ≤ ρ(M) + (d − 1)/n unconditionally for all M ∈ F^{n^m}.
   ^5 Note that S exists since C is a linear subspace of F^n.



                            T HEORY OF C OMPUTING, Volume 16 (5), 2020, pp. 1–34                           28
                        O N A XIS -PARALLEL T ESTS FOR T ENSOR P RODUCT C ODES

9    Robustness vs. soundness
In Theorem 2.6, we study the robustness of the test H. We now compare the robustness of H to its
soundness, and prove a somewhat tight bound on the soundness of H.
    Robustness is a stronger guarantee than soundness, as Pr[HM rejects] ≥ ρ(M). Thus, if we show that
ρ(M) ≥ αδ (M), then we can deduce that Pr[HM rejects] ≥ αδ (M), which upper bounds the soundness
error. However, the converse is not true.
    In fact, unlike for robustness, we can prove a somewhat tight result on the soundness of H, stated
below. Note that the query complexity of H is n^{m−1}, which is N^{1−1/m}, where N := n^m is the block length of
C^m, the code being tested.

Claim 9.1. Let ε_0 = ((n − d)(m − 1) + n)/(nm). If Pr[HM accepts] = ε > ε_0, then δ(M) ≤ (1 − ε)^m.

Proof. Observe that
\[
\Pr[H^M \text{ accepts}] = \varepsilon = \frac{|I|}{nm} \,,
\]
where I is the set of planes H such that M|_H ∈ C^{m−1}. The condition ε > ε_0 implies that |I| > (n − d)(m −
1) + n, so by Lemma 7.2 there exists an f ∈ C^m such that f|_H = M|_H for every H ∈ I. Let K = {p : p ∉ H for all H ∈ I}. By
the same logic as in the proof of Claim 8.1,

\[
\delta(M) \le \delta(M, f) \le \frac{|K|}{n^m} \le \left(1 - \frac{|I|}{nm}\right)^{m} = (1 - \varepsilon)^m.
\]

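For concreteness, here is a numeric instance of Claim 9.1 (the parameter values n = 8, d = 4, m = 3 are our own, chosen only for illustration):

```latex
\varepsilon_0 = \frac{(n-d)(m-1) + n}{nm}
             = \frac{4 \cdot 2 + 8}{8 \cdot 3}
             = \frac{2}{3},
\qquad\text{so } \varepsilon = \tfrac{3}{4} > \varepsilon_0
\;\Longrightarrow\;
\delta(M) \le \bigl(1 - \tfrac{3}{4}\bigr)^{3} = \tfrac{1}{64}.
```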
   We remark that the above claim is somewhat tight, as there are functions where Pr[HM accepts] = ε
and δ(M) = Ω((1 − ε)^m), and furthermore some threshold is necessary, as we now explain.

    • Necessity of ε. Fix ε, and let S be a subset of [n] of size (1 − ε)n. Let M be a word in F^{[n]^m} obtained
      from some codeword of C^m by corrupting all values on the set S^m to be uniformly random. It is
      clear that with high probability δ(M) = Ω(|S|^m/n^m) = Ω((1 − ε)^m), and that the test H will accept
      if it reads the plane (b, i) for any b ∈ [m] and i ∉ S. Furthermore, with high probability the test H
      will reject when it reads a plane (b, i) for any b ∈ [m] and i ∈ S. Therefore, Pr[HM accepts] = 1 − |S|/n = ε.

    • Necessity of ε_0. Some threshold is necessary, as we can get an acceptance probability of 1/m
      “for free” by setting each M|_{(1,i)} to be a random C^{m−1} codeword. Then, the test H will accept with
      probability 1 when it reads a (1, i) plane, and will reject with high probability when it reads a (b, i)
      plane for b ≠ 1. However, M will be very far from C^m, so having an acceptance probability of 1/m
      says little about δ(M).

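The construction in the first bullet above can be made concrete. The following sketch is our own illustration (with m = 2, the binary repetition code standing in for C, and a worst-case rather than random corruption so the outcome is deterministic); it realizes acceptance probability exactly ε while δ(M) = (1 − ε)^m.

```python
n = 10
eps = 0.4                                   # target acceptance probability
S = set(range(int(round((1 - eps) * n))))   # |S| = (1 - eps) * n = 6
# Start from the all-zeros codeword of C^2 and corrupt all of S x S to ones.
M = [[1 if (i in S and j in S) else 0 for j in range(n)] for i in range(n)]

def in_C(word):                             # C = binary repetition code
    return len(set(word)) == 1

rows = [tuple(r) for r in M]
cols = [tuple(M[i][j] for i in range(n)) for j in range(n)]
# A plane (b, i) with i not in S avoids the corrupted block and accepts;
# a plane with i in S sees mixed values and rejects.
accept_prob = sum(in_C(p) for p in rows + cols) / (2 * n)

# C^2 = {all-zeros, all-ones}, so delta is easy to compute exactly.
ones = sum(map(sum, M))
delta = min(ones, n * n - ones) / n**2

print(accept_prob, delta)  # 0.4 0.36, i.e., eps and (1 - eps)**2
```

Here the 8 uncorrupted rows and columns (out of 2n = 20 planes) accept, giving acceptance probability 1 − |S|/n = ε, while M is at distance |S|²/n² = (1 − ε)² from C².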

Acknowledgements
The authors thank Eli Ben-Sasson, Oded Goldreich, and Madhu Sudan for helpful discussions, and the
anonymous referees for their very useful comments. This work was supported in part by the Center for
Long-Term Cybersecurity at UC Berkeley.


References
 [1] N OGA A LON , TALI K AUFMAN , M ICHAEL K RIVELEVICH , S IMON L ITSYN , AND DANA RON:
     Testing Reed-Muller codes. IEEE Trans. Inform. Theory, 51(11):4032–4039, 2005. Preliminary
     version in RANDOM’03. [doi:10.1109/TIT.2005.856958] 2

 [2] S ANJEEV A RORA AND S HMUEL S AFRA: Probabilistic checking of proofs: a new characterization
     of NP. J. ACM, 45(1):70–122, 1998. Preliminary version in FOCS’92. [doi:10.1145/273865.273901]
     2

 [3] S ANJEEV A RORA AND M ADHU S UDAN: Improved low-degree testing and its applications. Combi-
     natorica, 23(3):365–426, 2003. Preliminary version in STOC’97. [doi:10.1007/s00493-003-0025-0]
     3, 4, 5, 6

 [4] V IKRAMAN A RVIND , P USHKAR S. J OGLEKAR , PARTHA M UKHOPADHYAY, AND S. R AJA:
     Randomized polynomial-time identity testing for noncommutative circuits. Theory of Computing,
     15(7):1–36, 2019. [doi:10.4086/toc.2019.v015a007] 18

 [5] L ÁSZLÓ BABAI , L ANCE F ORTNOW, L EONID A. L EVIN , AND M ARIO S ZEGEDY: Checking
     computations in polylogarithmic time. In Proc. 23rd STOC, pp. 21–32. ACM Press, 1991.
     [doi:10.1145/103418.103428] 2, 5

 [6] L ÁSZLÓ BABAI , L ANCE F ORTNOW, AND C ARSTEN L UND: Non-deterministic exponential time
     has two-prover interactive protocols. Comput. Complexity, 1:3–40, 1991. Preliminary version in
     FOCS’90. [doi:10.1007/BF01200056] 2, 5

 [7] M IHIR B ELLARE , D ON C OPPERSMITH , J OHAN H ÅSTAD , M ARCOS A. K IWI , AND M ADHU
     S UDAN: Linearity testing in characteristic two. IEEE Trans. Inform. Theory, 42(6):1781–1795,
     1996. Preliminary version in FOCS’95. [doi:10.1109/18.556674] 2, 3

 [8] M ICHAEL B EN -O R , D ON C OPPERSMITH , M IKE L UBY, AND RONITT RUBINFELD: Non-Abelian
     homomorphism testing, and distributions close to their self-convolutions. Random Structures
     Algorithms, 32(1):49–70, 2008. Preliminary version in RANDOM’04. [doi:10.1002/rsa.20182] 2

 [9] E LI B EN -S ASSON , A LESSANDRO C HIESA , DANIEL G ENKIN , AND E RAN T ROMER: On the
     concrete efficiency of probabilistically-checkable proofs. In Proc. 45th STOC, pp. 585–594. ACM
     Press, 2013. [doi:10.1145/2488608.2488681] 4

[10] E LI B EN -S ASSON , O DED G OLDREICH , P RAHLADH H ARSHA , M ADHU S UDAN , AND S ALIL P.
     VADHAN: Robust PCPs of proximity, shorter PCPs, and applications to coding. SIAM J. Comput.,
     36(4):889–974, 2006. Preliminary version in STOC’04. [doi:10.1137/S0097539705446810] 3, 21

[11] E LI B EN -S ASSON AND M ADHU S UDAN: Robust locally testable codes and products of codes.
     Random Structures Algorithms, 28(4):387–402, 2006. Preliminary version in RANDOM’04.
     [doi:10.1002/rsa.20120, arXiv:cs/0408066] 2, 3, 6, 7, 9, 10, 21, 22, 23


[12] E LI B EN -S ASSON AND M ADHU S UDAN: Short PCPs with polylog query complexity. SIAM J.
     Comput., 38(2):551–607, 2008. Preliminary version in STOC’05. [doi:10.1137/050646445] 4, 5

[13] E LI B EN -S ASSON , M ADHU S UDAN , S ALIL VADHAN , AND AVI W IGDERSON: Randomness-
     efficient low degree tests and short PCPs via epsilon-biased sets. In Proc. 35th STOC, pp. 612–621.
     ACM Press, 2003. [doi:10.1145/780542.780631] 2

[14] E LI B EN -S ASSON AND M ICHAEL V IDERMAN: Tensor products of weakly smooth codes are
     robust. Theory of Computing, 5(12):239–255, 2009. Preliminary version in RANDOM’08.
     [doi:10.4086/toc.2009.v005a012] 3

[15] E LI B EN -S ASSON AND M ICHAEL V IDERMAN: Composition of semi-LTCs by two-wise tensor
     products. Comput. Complexity, 24(3):601–643, 2015. Preliminary version in RANDOM’09.
     [doi:10.1007/s00037-013-0074-8] 3

[16] A MEY B HANGALE , I RIT D INUR , AND I NBAL L IVNI NAVON: Cube vs. cube low degree test.
     In Proc. 8th Innovations in Theoret. Comp. Sci. (ITCS’17), pp. 40:1–40:31. Schloss Dagstuhl–
     Leibniz-Zentrum fuer Informatik, 2017. [doi:10.4230/LIPIcs.ITCS.2017.40, arXiv:1612.07491] 4,
     6

[17] A RNAB B HATTACHARYYA , S WASTIK KOPPARTY, G RANT S CHOENEBECK , M ADHU S UDAN ,
     AND DAVID Z UCKERMAN: Optimal testing of Reed-Muller codes. In Property Testing, pp. 269–275,
     2010. Preliminary version in FOCS’10. [doi:10.1007/978-3-642-16367-8_19, arXiv:0910.0641] 2

[18] M ANUEL B LUM , M ICHAEL L UBY, AND RONITT RUBINFELD: Self-testing/correcting with appli-
     cations to numerical problems. J. Comput. System Sci., 47(3):549–595, 1993. Preliminary version
     in STOC’90. [doi:10.1016/0022-0000(93)90044-W] 2

[19] A LESSANDRO C HIESA , P ETER M ANOHAR , AND I GOR S HINKAR: On axis-parallel tests for
     tensor product codes. In Proc. 21st Internat. Workshop on Randomization and Computa-
     tion (RANDOM’17), pp. 472–481. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 2017.
     [doi:10.4230/LIPIcs.APPROX-RANDOM.2017.39]

[20] D ON C OPPERSMITH AND ATRI RUDRA: On the robust testability of product of codes. Electron.
     Colloq. on Comput. Complexity (ECCC), 2005. [ECCC:TR05-104] 3, 6

[21] I RIT D INUR AND O MER R EINGOLD: Assignment testers: Towards a combinatorial proof of
     the PCP theorem. SIAM J. Comput., 36:975–1024, 2006. Preliminary version in FOCS’04.
     [doi:10.1137/S0097539705446962] 3

[22] I RIT D INUR , M ADHU S UDAN , AND AVI W IGDERSON: Robust local testability of tensor products
     of LDPC codes. In Proc. 10th Internat. Workshop on Randomization and Computation
     (RANDOM’06), pp. 304–315. Springer, 2006. [doi:10.1007/11830924_29] 3, 6

[23] O DED G OLDREICH: Short locally testable codes and proofs: A survey in two parts. In Property
     Testing, pp. 65–104. Springer, 2010. [doi:10.1007/978-3-642-16367-8_6] 21


[24] O DED G OLDREICH AND O R M EIR: The tensor product of two good codes is not necessarily
     robustly testable. Inform. Process. Lett., 112(8-9):351–355, 2012. [doi:10.1016/j.ipl.2012.01.007]
     3, 6

[25] O DED G OLDREICH AND M ADHU S UDAN: Locally testable codes and PCPs of almost-linear length.
     J. ACM, 53(4):558–655, 2006. Preliminary version in FOCS’02. [doi:10.1145/1162349.1162351] 2

[26] J OHAN H ÅSTAD: Some optimal inapproximability results. J. ACM, 48(4):798–859, 2001. Prelimi-
     nary version in STOC’97. [doi:10.1145/502090.502098] 3

[27] S WASTIK KOPPARTY, O R M EIR , N OGA RON -Z EWI , AND S HUBHANGI S ARAF: High-rate locally-
     correctable and locally-testable codes with sub-polynomial query complexity. J. ACM, 64(2):11:1–
     11:42, 2017. Preliminary version in STOC’16. [doi:10.1145/3051093, arXiv:1504.05653] 2

[28] TAMÁS KŐVÁRI , V ERA T. S ÓS , AND PÁL T URÁN: On a problem of K. Zarankiewicz. Colloquium
     Mathematicae, 3:50–57, 1954. Accessible at the Hungarian Acad. Sci. 12

[29] O R M EIR: Combinatorial construction of locally testable codes. SIAM J. Comput., 39(2):491–544,
     2009. Preliminary version in STOC’08. [doi:10.1137/080729967] 2

[30] DANA M OSHKOVITZ AND R AN R AZ: Sub-constant error low degree test of almost-linear size.
     SIAM J. Comput., 38(1):140–180, 2008. Preliminary version in STOC’06. [doi:10.1137/060656838]
     3, 4, 5

[31] A LEXANDER P OLISHCHUK AND DANIEL A. S PIELMAN: Nearly-linear size holographic proofs.
     In Proc. 26th STOC, pp. 194–203. ACM Press, 1994. [doi:10.1145/195058.195132] 2, 3, 4, 5, 6, 7,
     9, 11

[32] R AN R AZ AND S HMUEL S AFRA: A sub-constant error-probability low-degree test, and a sub-
     constant error-probability PCP characterization of NP. In Proc. 29th STOC, pp. 475–484. ACM
     Press, 1997. [doi:10.1145/258533.258641] 3, 4, 5, 6

[33] RONITT RUBINFELD AND M ADHU S UDAN: Robust characterizations of polynomi-
     als with applications to program testing. SIAM J. Comput., 25(2):252–271, 1996.
     [doi:10.1137/S0097539793255151] 2

[34] PAUL VALIANT: The tensor product of two codes is not necessarily robustly testable. In Proc. 9th
     Internat. Workshop on Randomization and Computation (RANDOM’05), pp. 472–481. Springer,
     2005. [doi:10.1007/11538462_40] 3, 6

[35] M ICHAEL V IDERMAN: A combination of testability and decodability by tensor products.
     Random Structures Algorithms, 46(3):572–598, 2013. Preliminary version in RANDOM’12.
     [doi:10.1002/rsa.20498, arXiv:1105.5806] 3, 7, 9, 10, 11

[36] M ICHAEL V IDERMAN: Strong LTCs with inverse poly-log rate and constant soundness. In Proc.
     54th FOCS, pp. 330–339. IEEE Comp. Soc. Press, 2013. [doi:10.1109/FOCS.2013.43] 2


[37] JACK K EIL W OLF: On codes derivable from the tensor product of check matrices. IEEE Trans.
     Inform. Theory, 11(2):281–284, 1965. [doi:10.1109/TIT.1965.1053771] 2

[38] JACK K EIL W OLF AND B ERNARD E LSPAS: Error-locating codes – a new concept in error control.
     IEEE Trans. Inform. Theory, 9(2):113–117, 1963. [doi:10.1109/TIT.1963.1057813] 2


AUTHORS

     Alessandro Chiesa
     Assistant professor
     Department of Electrical Engineering and Computer Science
     UC Berkeley
     Berkeley, CA, USA
     alexch berkeley edu
     https://people.eecs.berkeley.edu/~alexch/


     Peter Manohar
     Ph. D. student
     Computer Science Department
     Carnegie Mellon University
     Pittsburgh, PA, USA
     pmanohar cs cmu edu
     https://www.cs.cmu.edu/~pmanohar


     Igor Shinkar
     Assistant professor
     School of Computing Science
     Simon Fraser University
     Burnaby, B.C. Canada
     ishinkar sfu ca
     https://www.cs.sfu.ca/~ishinkar/


ABOUT THE AUTHORS

     A LESSANDRO C HIESA graduated from MIT in 2014; his advisor was Silvio Micali. He was
        a postdoc at ETH Zurich in 2014–2015. Since then, he has been a faculty member
        at the University of California, Berkeley. He is interested in all kinds of proof systems,
        information-theoretic and cryptographic.





P ETER M ANOHAR is a Ph. D. student at Carnegie Mellon University, advised by Venkat
    Guruswami. He graduated from UC Berkeley with a B. S. in EECS in 2019, where he
    was advised by Professors Alessandro Chiesa and Ren Ng. He is broadly interested
    in complexity theory and cryptography, specifically in topics such as probabilistically
    checkable proofs, property testing, coding theory, and sum-of-squares.


I GOR S HINKAR is an assistant professor in the School of Computing Science at Simon Fraser
    University, Canada. He obtained his Ph. D. in Computer Science from the Weizmann
    Institute of Science, Israel in 2014 under the supervision of Irit Dinur. He spent two years
    as a postdoc at the Courant Institute of Mathematical Sciences, NYU, and two years
    as a postdoc at UC Berkeley. His research is in theoretical computer science, discrete
    mathematics, probability, and the interplay between them.



