                          Theory of Computing, Volume 16 (18), 2020, pp. 1–68
                                       www.theoryofcomputing.org




             Relaxed Locally Correctable Codes
               Tom Gur∗                Govind Ramnarayan†                      Ron D. Rothblum‡
                Received December 2, 2017; Revised July 31, 2020; Published December 30, 2020




       Abstract. Locally decodable codes (LDCs) and locally correctable codes (LCCs) are error-
       correcting codes in which individual bits of the message and codeword, respectively, can
       be recovered by querying only a few bits from a noisy codeword. These codes have found
       numerous applications both in theory and in practice.
           A natural relaxation of LDCs, introduced by Ben-Sasson et al. (SICOMP, 2006), allows
       the decoder to reject (i. e., refuse to answer) in case it detects that the codeword is corrupt.
       They call such a decoder a relaxed decoder and construct a constant-query relaxed LDC
       with nearly linear block length, whereas the state-of-the-art (full-fledged) LDCs in the
       constant-query regime have superpolynomial block length.
           We consider an analogous relaxation for local correction. Thus, a relaxed local corrector
       reads only few bits from a (possibly) corrupt codeword and either recovers the desired bit of
       the codeword, or rejects in case it detects corruption.
           We give constructions of relaxed LCCs in two regimes, where the first optimizes the
       query complexity and the second optimizes the rate.
     An extended abstract of this paper appeared in the Proceedings of the 9th Innovations in Theoretical Computer Science
Conference (ITCS 2018).
   ∗ Supported by the UKRI Future Leaders Fellowship.
   † Parts of this research were done while the author was supported by NSF grant number 1218547. Supported by Army
Research Office grant number W911NF1910217.
   ‡ Parts of this research were done while the author was at MIT and Northeastern University. Partially supported by NSF
MACS - CNS-1413920, Simons Investigator award Agreement Dated 6-5-12, the Cybersecurity and Privacy Institute at
Northeastern University and a Milgrom family grant.


ACM Classification: Theory of Computation: Error-correcting codes
AMS Classification: 68R01, 68R99
Key words and phrases: codes, PCPs, locally correctable codes


© 2020 Tom Gur, Govind Ramnarayan, and Ron D. Rothblum
Licensed under a Creative Commons Attribution License (CC-BY)                                DOI: 10.4086/toc.2020.v016a018

         1. Constant Query Complexity: A relaxed LCC with polynomial block length whose
            corrector only reads a constant number of bits of the codeword. In contrast, the best
            upper bound on the block length achieved by known q-query (full-fledged) LCCs, for
            constant q, is exp(O(n^(1/(q−1)))).

         2. Constant Rate: A relaxed LCC with constant rate (i. e., linear block length) with
            quasi-polylogarithmic query complexity (specifically, exp(O((log log n)^2))). In contrast,
            the best known upper bound on the query complexity of constant-rate (full-fledged)
            LCCs is exp(Õ(√(log n))) (Kopparty et al., STOC’16). Our construction, however, is
            based on a locally testable code constructed by Kopparty et al., which we show to
            allow for relaxed local correction with better parameters.




Contents
1   Introduction                                                                                        3
    1.1 Our results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
    1.2 Technical overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
    1.3 Related work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
    1.4 Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

2   Preliminaries                                                                                    12
    2.1 Error-correcting codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

3   Definitions: local correcting and its relaxations                                                     13

4   Constant-query RLCC                                                                                   14
    4.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .     15
    4.2 Composition of relaxed LCC and PCPPs . . . . . . . . . . . . . . . . . . . . . . . . . .          20
    4.3 Exponential-length, constant-query, strongly canonical and self-correctable PCPP . . . .          28
    4.4 Polynomial-length, polylog-query, strongly canonical, self-correctable and robust PCPP            31
    4.5 Putting it together . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .   42

5   Constant-rate RLCC                                                                                    43
    5.1 Analysis of tensoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .     45
    5.2 Analysis of distance amplification . . . . . . . . . . . . . . . . . . . . . . . . . . . . .      47
    5.3 Concatenation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .     57
    5.4 Putting it all together: final construction and parameters . . . . . . . . . . . . . . . . . .    57





1     Introduction
Error-correcting codes, dating back to the seminal work of Shannon [47] and Hamming [34], are used to
reliably transmit data over noisy channels and to store data. Roughly speaking, error-correcting codes are
injective functions that take a message and output a codeword, in which the message is encoded with
extra redundancy, with the property that even if some of the symbols in the codeword are corrupted, the
message is still recoverable. Some of the main parameters of codes are their block length, which is the
length of a codeword as a function of n, the message length; the rate, which is the message length divided
by the block length; and the relative distance, which is the minimal relative Hamming distance of every
pair of distinct codewords.
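
For concreteness, these three parameters can be computed directly for a toy code. The following sketch (purely illustrative, not a code from this paper) uses a 3-fold repetition code with message length 2:

```python
from itertools import product

# Toy example: the 3-fold repetition code over {0,1} with message length 2.
# Encoding: repeat each message bit three times -> block length 6.
def encode(msg):
    return tuple(b for b in msg for _ in range(3))

messages = list(product([0, 1], repeat=2))
codewords = [encode(m) for m in messages]

k = 2                      # message length
n = len(codewords[0])      # block length
rate = k / n

def hamming(u, v):
    return sum(a != b for a, b in zip(u, v))

# Relative distance: minimum relative Hamming distance over distinct codewords.
dist = min(hamming(u, v) / n
           for u in codewords for v in codewords if u != v)

print(rate, dist)  # rate = 1/3, relative distance = 1/2
```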
    Locally decodable codes (LDCs) and locally correctable codes (LCCs) are error-correcting codes that
admit highly efficient procedures for recovering small amounts of data. More specifically, in an LDC, a
single symbol of the message can be recovered by only reading a few bits from a noisy codeword. An
LCC has the same property but with respect to recovering bits of the codeword itself (rather than the
message).
    Locally decodable codes and locally correctable codes have had a profound impact in various areas
of theoretical computer science including cryptography, PCPs, hardness of approximation, interactive
proofs, private information retrieval, program checking, and databases. (See [54] and the more recent
[38] for surveys on locally decodable and correctable codes.) While these codes have found numerous
uses in theory and practice, one significant downside is that current constructions require adding a large
amount of redundancy. Specifically, to decode or correct with a constant number of queries, the current
state-of-the-art LDCs have superpolynomial block length (Yekhanin [53], Efremenko [17]) and the best
upper bound on the block length achieved by known constant-query LCCs is1 exp(c_q · n^(1/(q−1))), where
c_q ∼ q log^2 q for q queries.
    Motivated by this, Ben-Sasson et al. [7] defined a natural relaxation of local decoding, for which they
could achieve a dramatically better block length. Roughly speaking, their relaxation allows the decoder to
abort in case of failure, while still requiring the decoder to successfully decode non-corrupted codewords
(in particular, this prevents the decoder from always aborting). Moreover, in the constant query regime,
such codes can be transformed to codes, with similar parameters, that are guaranteed to successfully
decode on a constant fraction of message bits.
    Thus, a relaxed local decoder for a code C gets oracle access to a string w that is relatively close
to some codeword c = C(x) and an index i ∈ [|x|]. The decoder should make only a few queries to w and
satisfy the following:

    1. If the string w = c (i. e., w is an uncorrupted codeword), the relaxed decoder must always output x_i.

    2. Otherwise, with high probability, the decoder should either output x_i or a special “abort” symbol ⊥
       (indicating the decoder detected an error and is unable to decode).2

    1 These are Reed–Muller codes over a constant-size alphabet and with constant degree (but large dimension).
   2 The actual definition in [7] also requires that for a constant fraction of the coordinates, the decoder decodes correctly (i. e.,
does not output ⊥) with high probability. However, they later show that this additional condition follows from Conditions (1)
and (2) above when the decoder has O(1) query complexity. See further discussion in Remark 1.3.



    The additional freedom introduced by allowing the decoder to abort turns out to be extremely
useful. Using the notion of PCPs of proximity (PCPP), which they also introduce3 and construct,
Ben-Sasson et al. obtain relaxed locally decodable codes (RLDCs) with constant query complexity and
nearly linear block length.
    In this article we extend the relaxation of Ben-Sasson et al. to LCCs and define the analogous notion
of relaxed LCCs as follows: We say that a code C : Σ^k → Σ^n is a relaxed LCC with query complexity q if
there exists a corrector that has oracle access to a string w ∈ Σ^n, which is close to some codeword c ∈ C,
and also gets as explicit input an index i ∈ [n]. The algorithm makes at most q queries to the string w, and
satisfies the following:

   1. If w = c (i. e., w was not corrupted), then the corrector always outputs c_i.

   2. Otherwise, with high probability, the corrector either outputs c_i, or a special “abort” symbol ⊥.
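
To make this definition concrete, consider a toy relaxed local corrector for the T-fold repetition code (an illustrative sketch, not one of the constructions in this paper; the parameters T and Q are arbitrary). Position i of the codeword is a copy of base position i mod n; the corrector samples Q random copies and aborts on any disagreement:

```python
import random

T = 64   # number of repetitions (illustrative)
Q = 10   # number of queries (illustrative)

def corrector(w, n, i, rng):
    """Toy relaxed local corrector for the T-fold repetition of an n-symbol
    word. Outputs the common value if all sampled copies of the underlying
    position agree, and ⊥ (represented by None) otherwise."""
    p = i % n
    views = [w[rng.randrange(T) * n + p] for _ in range(Q)] + [w[i]]
    return views[0] if len(set(views)) == 1 else None  # None stands for ⊥

rng = random.Random(0)
base = [0, 1, 1, 0, 1]
c = base * T                                   # the uncorrupted codeword

# Condition 1: on an uncorrupted codeword, the corrector always outputs c_i.
assert all(corrector(c, 5, i, rng) == c[i] for i in range(len(c)))

# Condition 2: after corrupting a symbol, the output is c_i or ⊥ (w.h.p.).
w = list(c)
w[7] ^= 1
assert corrector(w, 5, 7, rng) in (c[7], None)
```

The exact soundness guarantee depends on T, Q, and the corruption rate; the point is only to illustrate the contract of the two conditions above.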

    The remarkable savings achieved by Ben-Sasson et al. raise the question: can relaxed locally
correctable codes achieve similar savings in block length over current constructions of locally correctable
codes? We answer this question in the affirmative by constructing relaxed LCCs with significantly better
parameters than those of the state-of-the-art (full-fledged) LCCs.

1.1    Our results
In this paper we construct relaxed locally correctable codes in two different parameter regimes: the first,
which we view as our main technical contribution, focuses on the constant query complexity regime,
whereas the second, which is easier to prove given previous work, focuses on constant rate.

Constant-query RLCC. Our first result is a relaxed LCC which requires only O(1) queries and has a
polynomial block length.

Theorem 1.1 (Constant-query relaxed LCC, informally stated). There exists a relaxed LCC C : {0, 1}^k →
{0, 1}^n with constant relative distance, constant query complexity, and block length n = poly(k). Further-
more, C is a linear code.

    Theorem 1.1 shows that in the constant-query regime, relaxed LCCs can achieve polynomial block
length, while the upper bound on the block length achieved by the best known (full-fledged) LCCs with
constant query complexity is exp(exp(Õ((log n)^ε))), where ε = 1/⌊log_2 q⌋ for q queries [17].
    The proof of Theorem 1.1 heavily relies on a certain type of PCPs of proximity (PCPPs) that we
construct. We elaborate on our PCPP constructions in Section 1.1.1 below.
    We remark that the specific block length in Theorem 1.1 is roughly quartic (i. e., fourth power) in the
message length. Constructing a constant-query RLCC with a shorter block length (let alone a (nearly)
linear one, as known for relaxed LDCs) is an interesting open problem.4
   3 The equivalent notion of assignment tester was introduced independently by Dinur and Reingold [16].
   4 Following the initial publication of this work, the aforementioned open problem was recently resolved by Chiesa, Gur, and
Shinkar [12]. See Section 1.3 for details.



Constant-rate RLCC. Our second main result is a construction of a relaxed LCC with constant rate
and quasi-polylogarithmic query complexity.
Theorem 1.2 (Constant-rate relaxed LCC, informally stated). There exists a relaxed LCC C : {0, 1}^k →
{0, 1}^n with constant relative distance, query complexity exp(O((log log n)^2)), and constant rate (i. e.,
block length n = O(k)). Furthermore, C is a linear code and has distance–rate tradeoff approaching the
Zyablov bound.5
    This result shows that in the constant-rate regime, there exist relaxed LCCs with query complexity
exp(O((log log n)^2)), whereas the upper bound on the number of queries in the state-of-the-art construction
of constant-rate (full-fledged) LCCs, due to Kopparty et al. [37], is exp(O(√(log n) · log log n)).
    Actually, our construction is essentially identical to one of the constructions of [37].6 Our main
insight in proving Theorem 1.2 is that their code allows for relaxed local correction with much better
parameters.7 Just like in [37], as an intermediate result we first construct a large-alphabet code with the
same asymptotic query complexity that approaches the Singleton bound, which may also be of interest
(see Theorem 5.2). As a secondary contribution, we also provide a modular presentation for the distance
amplification step, which is the main step in [37] (and is originally due to Alon, Edmonds and Luby [1]).
Remark 1.3. As mentioned in Footnote 2, the original definition of RLDC [7] includes a third condition,
which requires that the decoder must successfully decode a constant fraction of the coordinates. More
precisely, for every w ∈ Σ^n that is close to some codeword c = C(x), there exists a set I_w ⊆ [k] of size
Ω(k) such that for every i ∈ I_w, with high probability, the decoder D outputs x_i (rather than outputting ⊥).
    Ben-Sasson et al. showed that every RLDC with constant query complexity that satisfies the first two
conditions, can be transformed into an RLDC with similar parameters that satisfies the third condition
as well. We remark that this transformation also applies to RLCCs with constant query complexity.
However, for superconstant query complexity (as in Theorem 1.2), the same transformation only guarantees
successful decoding of a constant fraction of coordinates if the codeword is corrupted on a subconstant
fraction of its coordinates (i. e., the fraction roughly corresponds to the reciprocal of the query complexity).
Remark 1.4. Both our constant-query and constant-rate RLCCs are systematic.8 Hence they are auto-
matically also relaxed locally decodable codes (i. e., RLDCs). In particular, the code from Theorem 1.2
is also the first construction of a relaxed locally decodable code in the constant-rate regime, with query
complexity exp(O((log log n)^2)).

1.1.1 PCP constructions
PCPs of proximity (PCPP), first studied by Ben-Sasson et al. [7] and by Dinur and Reingold [16] were
originally introduced to facilitate PCP composition. Beyond their usefulness in PCP constructions,
PCPPs have proved to be extremely useful in coding theory as well. Indeed, PCPPs lie at the heart
   5 For the Zyablov bound [55], see [32, Chap. 10.2].
   6 Interestingly, our construction is inspired by the [37] construction of a locally testable code, rather than their locally
correctable code.
   7 We were informed that a similar observation about the [37] code has been made recently and independently in an unpublished
paper by Hemenway, Ron-Zewi, and Wootters [35].
   8 Recall that a code is systematic if the first part of every codeword is the original message.




of several constructions of LTCs [27], relaxed LDCs [7, 26], universal LTCs [25, 24], as well as in our
construction of relaxed LCCs (specifically in Theorem 1.1).
    Loosely speaking, a PCPP is a proof system that allows for probabilistic verification of approximate
decision problems by querying only a small number of locations in both the statement and the proof.
(In contrast, a standard PCP verifier reads the entire statement, and probabilistically verifies an exact
decision problem, by querying only a small number of locations in the proof.) Similarly to the scenario in
property testing, the soundness guarantee provided by PCPPs is that the PCPP verifier is only required
to reject statements that are “far” (in Hamming distance) from being correct.
    In this paper we provide new constructions of PCPPs that play a crucial role in our constant-query
relaxed LCC construction. The PCPPs that we construct are for verifying membership in affine subspaces
(rather than general languages in P or NP), since this is all that we need for our RLCC constructions.
More specifically, we shall construct PCPPs that are: linear, robust, self-correctable, and admit strongly
canonical soundness. We discuss these properties in more detail next (see Section 4.2.5 for precise
definitions). We remark that our PCPP construction is inspired by the construction of linear-inner
proof-systems (LIPS) by Goldreich and Sudan [27].

Linearity. We call a PCPP proof-system linear if it satisfies two conditions. First, the prescribed proof
π for any statement x must be a linear function of the statement. Second, to decide whether to accept,
the PCPP verifier only checks that the values that it reads from the input and PCPP proof lie in an
affine subspace. Put differently, the verifier’s decision predicate is computable by a conjunction of linear
functions. We remark that in the literature [8, 41], the term “linear PCPP” sometimes refers only to the
latter of the two requirements but here we also insist that the proof be a linear function of the statement.
     We use linearity both to assure that our resulting codes are linear codes, as well as to facilitate
composition with other PCPPs. We note that standard PCPs are typically inherently non-linear (since
they are designed for general languages in P or in NP). However, in our context we are only trying to
verify membership in affine spaces and so it is reasonable to expect to have linear PCPPs.

Robustness. The notion of robust PCPPs, introduced by Ben-Sasson et al. [7], refers to PCPP systems
whose verifier, roughly speaking, is not only required to reject statements that are far from valid, but
whose local view (i. e., the answers to the queries made by the verifier) must also be far from any
local view that would have caused the verifier to accept. Robustness plays a key role in enabling PCP
composition. While this condition holds trivially for verifiers with constant query complexity, in our
construction we will also consider verifiers with superconstant query complexity, for which achieving
robustness is non-trivial.

Self-correctability. In a self-correctable PCPP system, the proof oracle admits a local correction
procedure that allows for local recovery of individual bits of a moderately corrupted PCPP proof. In
other words, the proof string itself is a locally correctable code. The self-correctability of the PCPP
oracles allows us to include them as part of an RLCC’s codeword.
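
A classical instance of such self-correction is the Hadamard code, where the bit ⟨x, a⟩ can be recovered from two uniformly distributed queries via ⟨x, a⟩ = ⟨x, a ⊕ r⟩ ⊕ ⟨x, r⟩. The following sketch (illustrative only, over a small domain; not the PCPP of Section 4) demonstrates the idea:

```python
import random

m = 4
x = (1, 0, 1, 1)

def ip(u, v):                      # inner product mod 2
    return sum(a & b for a, b in zip(u, v)) % 2

points = [tuple((i >> j) & 1 for j in range(m)) for i in range(2 ** m)]
had = {a: ip(x, a) for a in points}            # Hadamard encoding of x

def self_correct(w, a, rng):
    """Recover bit a of a mildly corrupted Hadamard word via
    <x, a> = <x, a + r> + <x, r> for a random shift r (two queries)."""
    r = rng.choice(points)
    a_plus_r = tuple(ai ^ ri for ai, ri in zip(a, r))
    return w[a_plus_r] ^ w[r]

rng = random.Random(1)
w = dict(had)
for a in rng.sample(points, 2):    # corrupt a 1/8 fraction of positions
    w[a] ^= 1

a = points[5]
votes = [self_correct(w, a, rng) for _ in range(99)]
assert max(set(votes), key=votes.count) == had[a]  # majority recovers the bit
```

Each invocation is correct with probability at least 1 − 2δ for corruption rate δ, since both queried positions are individually uniform.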

Strongly canonical soundness. The notion of PCPPs with strongly canonical soundness, introduced
by Goldreich and Sudan [27], requires that correct inputs (i. e., that reside in the target language) have a


canonical proof and the PCPP verifier is required to reject “wrong” (i. e., non-canonical) proofs, even for
correct statements. In more detail, these PCPPs satisfy two additional requirements: (1) canonicity: for
every true statement there exists a unique canonical proof that the verifier is required to always accept, and
(2) strongly canonical soundness: the verifier is required to reject any pair (x, π) of statement and proof
with probability that is roughly proportional to its distance from a true statement and its corresponding
canonical proof.
     We are now ready to state our results on PCPs of proximity with the aforementioned properties. The
first construction has exponential length and constant query complexity, whereas the second construction,
whose proof is significantly more involved, has polynomial length and polylogarithmic query complexity.
     Our first result is a variant of the Hadamard PCPP, with exponential length but constant query
complexity.

Theorem 1.5 (Informally stated, see Theorem 4.26). There exists a linear, self-correctable, strongly
canonical PCPP for membership in affine subspaces, with query complexity O(1) and exponential length
(in the size of the statement).
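
At the core of Hadamard-based proof systems lies the Blum–Luby–Rubinfeld linearity test, which checks π(a) ⊕ π(b) = π(a ⊕ b) at random pairs. The sketch below (illustrative; not the actual verifier of Theorem 4.26) shows the test accepting a Hadamard codeword and rejecting a random table:

```python
import random

m = 4
points = [tuple((i >> j) & 1 for j in range(m)) for i in range(2 ** m)]

def blr_test(pi, trials, rng):
    """BLR linearity test: accept iff pi(a) ^ pi(b) == pi(a ^ b) on random
    pairs. A Hadamard codeword (a linear function) always passes; a function
    far from linear is rejected with probability related to its distance."""
    for _ in range(trials):
        a, b = rng.choice(points), rng.choice(points)
        ab = tuple(u ^ v for u, v in zip(a, b))
        if pi[a] ^ pi[b] != pi[ab]:
            return False
    return True

rng = random.Random(2)
x = (1, 1, 0, 1)
linear = {a: sum(xi & ai for xi, ai in zip(x, a)) % 2 for a in points}
assert blr_test(linear, 50, rng)             # honest proof always accepted

far = {a: rng.randrange(2) for a in points}  # a random (far-from-linear) table
assert not blr_test(far, 50, rng)            # rejected with high probability
```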

    Our second result is a variant of the [5] PCP, which has polylogarithmic query complexity and
polynomial length.

Theorem 1.6 (Informally stated, see Theorem 4.27). There exists a linear, self-correctable, Ω(1)-robust,
strongly canonical PCPP for membership in affine subspaces, with query complexity polylog(n) and
poly(n) length, for statements of length n.

1.2     Technical overview
The techniques used for our two constructions are quite different. We first outline the constant-query
result, which is more complex, in Section 1.2.1 and then outline the constant-rate result in Section 1.2.2.

1.2.1    Constant-query relaxed LCC
The starting point for our construction is the [7] construction of relaxed locally decodable codes (RLDC),
which we review next.9 In their construction, each codeword has two parts: the first part provides the
distance, and the second enables relaxed local decodability. More specifically, they construct an RLDC
C0 whose codewords consist of the following two equal-length parts: (1) repetitions of a codeword
C(x), where C : {0, 1}^k → {0, 1}^n is some systematic code with constant distance and rate, (2) for every
message bit in C(x), they add a PCPP, which is a proof that x_i is indeed the ith bit of C(x).10
    We remark that the repetitions in the first part of the code are simply meant to ensure distance (as the
PCPP proof strings are not necessarily a code with good distance). To decode, the relaxed decoder for C0
invokes the PCPP verifier to check that the i’th bit of the first part is indeed C(x)i and outputs it, unless
the verifier rejects, in which case the relaxed decoder may return ⊥.
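
The decoding procedure just described can be summarized in schematic form. In the sketch below, `parse` and `pcpp_verifier` are hypothetical stand-ins for the codeword layout and the constant-query PCPP verifier; this outlines the control flow only, and is not an implementation of [7]:

```python
BOT = None  # stands for the abort symbol ⊥

def relaxed_decode(w, i, parse, pcpp_verifier, rng):
    """Schematic relaxed decoder. `parse` splits w into the repetition part
    and the list of per-coordinate PCPP proofs; `pcpp_verifier` is a
    constant-query verifier for the claim "bit i of the encoded message is b".
    Both are hypothetical stand-ins for the components described above."""
    reps, proofs = parse(w)
    block = rng.choice(reps)   # a random copy of the base codeword C(x)
    claimed = block[i]         # candidate value for x_i (C is systematic)
    if pcpp_verifier(reps, proofs[i], i, claimed, rng):
        return claimed         # accepted: output the decoded bit
    return BOT                 # rejected: corruption detected, abort
```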
    9 We describe the simpler variant of the [7] construction, which achieves nearly quadratic block length. We remark that [7]
also present a more involved construction that achieves nearly linear block length.
   10 Actually, our presentation differs slightly from that of [7]. Their construction contains an additional part that consists of
repetitions of the original message. However, when using a systematic code C, this addition is not necessary.



     Ben-Sasson et al. show that this code is indeed a relaxed LDC. However, in general, it will not
necessarily be a relaxed LCC. Specifically, it is unclear how to correct bits that are part of the PCPP
proof strings. Simply appending even more PCPPs to deal with the original ones will not do since we
will also need to correct those. Moreover, it is worth pointing out that even if the PCPP proof strings had
some internal self-correction mechanism, this would still not suffice since each PCPP proof string by
itself is very short (as compared to the entire codeword) and could therefore be entirely corrupted.
     Before proceeding to cope with this difficulty, we first suggest a different perspective on the [7]
construction, which is inspired by the highly influential and useful notion of PCP composition [3].
Specifically, we think of the [7] construction as a composition of the code C, which is trivially locally
decodable with n queries, with a constant-query PCPP. This composition yields a relaxed LDC with
query complexity O(1), at a moderate increase in block length (which comes from appending all of the
PCPP proof strings).
     We shall adopt the composition perspective, and use it to construct a relaxed locally correctable code,
by introducing a technique for composing a (possibly relaxed) LCC C with a special type of PCP of
proximity (PCPP). The result of the composition is a relaxed LCC C0 which basically inherits the query
complexity of the PCPP (and with a moderate overhead in block length).
     Similarly to the relaxed LDCs of [7], the codewords of C0 are constructed by taking repetitions of a
codeword of a (possibly relaxed) robust11 RLCC C and concatenating it with many PCPP proof strings.
Specifically, for each set of queries that the relaxed local corrector for C would like to make, we write
down a PCPP proof that this set of queries would make the corrector output the symbol that is in the
position being corrected (had it made those queries). We shall refer to the first part of C0 , which contains
repetitions of C, as the core of C0 , and refer to the second as the PCPP part.
     Observe that the foregoing approach allows us to locally correct bits of the core of C0 . The relaxed
corrector for the composed RLCC takes the queries made by the old corrector as input, and uses the PCPP
verifier to test if the old corrector would have accepted.12 However, we shall need a more sophisticated
machinery to correct the PCPP part of C0 (indeed this is exactly the challenge that we faced when trying
to follow the [7] approach). This will be achieved by ensuring that the PCPPs that we use have strong
properties.
     In particular, we shall employ the foregoing composition strategy while using the PCPPs of Sec-
tion 1.1.1, which admit canonical proofs, strongly canonical soundness, and self-correctability. Recall
that a PCPP is said to have strongly canonical soundness if every valid input has a canonical proof that it
accepts, and any pair of statement and proof are rejected with probability proportional to their distance
from a true statement and its corresponding canonical proof. In addition, recall that a canonical PCPP
is said to be self-correctable if the canonical proof strings form a locally correctable code (i. e., if it is
possible to locally recover individual bits of a noisy PCPP oracle).
     Suppose that we want to correct a bit that lies in the PCPP part of a purported codeword of C0 . If this
bit is in a PCPP oracle that is not too corrupted, we can simply use the PCPP’s self-corrector to recover
the bit. However, as pointed out before, this naive attempt to self-correct fails when the entire proof string
  11 We define robust RLCCs formally in Section 4.1.2.
  12 Even for this to work, we need to ensure that the original RLCC is robust, in the sense that with high probability the
corrector’s view (i. e., the answers to its queries) is far from answers that would make it output an incorrect value. Otherwise,
we have no guarantee that the PCPP verifier will see the error.



is corrupted. This can easily happen when the proof strings, each of which refers to a single possible
query set of the original corrector, are short relative to the size of the entire codeword.
    Thus, our main challenge is to detect whether the given PCPP proof string was (possibly entirely)
corrupted. We observe that if on the one hand, the PCPP oracle we wish to correct is heavily corrupted,
while on the other hand, the statement to which the PCPP refers (i. e., the queries that the corrector for C
makes) is not heavily corrupted, then the proof is far from the prescribed canonical proof. The strongly
canonical soundness guarantees that in such case the PCPP verifier will detect the corruption and reject.
Thus, we are left with the case that both the PCPP oracle and the statement that it refers to are heavily
corrupted.
    To detect this deviation, we choose a random point in the foregoing statement and read it directly.
Since that point is in the core of the code, and we have already described the procedure for correcting in
the core, we can also correct this point and compare the corrected value with the symbol that we read
directly. Since we have assumed that the statement was heavily corrupted, the value that we read directly
will likely be different than what the corrector returns, in which case we can reject.
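
The case analysis above can be summarized schematically. All helper procedures below (`verify`, `self_correct_proof`, `correct_core`) are hypothetical stand-ins for the components described in the text, not concrete constructions:

```python
BOT = None  # the abort symbol ⊥

def correct_pcpp_bit(statement, proof, j, verify, self_correct_proof,
                     correct_core, rng):
    """Schematic corrector for bit j of one PCPP proof string. `verify`,
    `self_correct_proof` and `correct_core` stand in for the strongly
    canonical PCPP verifier, the proof's local corrector, and the
    (already described) corrector for the core of the code."""
    # If the proof is far from canonical while the statement is not heavily
    # corrupted, strongly canonical soundness makes this reject w.h.p.
    if not verify(statement, proof, rng):
        return BOT
    # Cross-check a random point of the statement against the core: if both
    # the statement and the proof were heavily corrupted, the directly read
    # value will likely disagree with the corrected one.
    p = rng.randrange(len(statement))
    v = correct_core(p, rng)
    if v is BOT or v != statement[p]:
        return BOT
    # Otherwise the proof is at most mildly corrupted; self-correct bit j.
    return self_correct_proof(proof, j, rng)
```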
    Equipped with this composition theorem, we can now construct our code. By applying the composition
theorem to the low-degree extension code, of suitable parameters, and the PCPP given in Theorem 1.5,
we can already construct a constant-query RLCC with quasi-polynomial block length. We note that this is
already a considerable savings when compared to the current best (full-fledged) LCCs. However, to obtain
polynomial block length, we shall perform two compositions with different PCPPs (in direct analogy to
the first proof of the PCP theorem [2]).
    As in the quasi-polynomial result mentioned above, our starting point is the low-degree extension
code. Under a suitable parameterization, this code is known to be a robust (full-fledged) LCC with nearly
linear block length and polylogarithmic query complexity. We shall first compose it with the polynomial
length, polylogarithmic query, strongly canonical, self-correctable and robust PCPP of Theorem 1.6.
Since the foregoing PCPP is robust, this composition yields a robust RLCC with polynomial block length
and slightly sublogarithmic query complexity. Finally, we compose yet again with the exponential length,
constant query, strongly canonical, self-correctable PCPP from Theorem 1.5, which yields an RLCC with
constant query complexity.
    Each of our two composition steps introduces at least a quadratic overhead to the block length. This
is because our composition of an RLCC with a PCPP appends a PCPP proof-string for every pair (i, ρ)
of coordinate i to be corrected from the base code and random string ρ of the underlying corrector with
respect to the point i.13 Since we apply two such composition steps, we get a code with roughly n ≈ k^4
block length.

1.2.2    Constant-rate RLCC
For the constant-rate construction, we build on the recent breakthrough construction of locally testable14
and correctable codes of Kopparty et al. [37]. Interestingly, we will actually focus on the [37] construction
  13 In contrast, in standard PCP composition, one only appends an inner PCP proof-string for every random string ρ of the
outer PCP. Thus, as long as the randomness complexity of the outer PCP is minimal, it is possible to achieve close to constant
multiplicative overhead when composing.
  14 Recall that a locally testable code [27] is a code for which one can test, using a sublinear number of queries, whether a
given string is a codeword or far from such.


                           T HEORY OF C OMPUTING, Volume 16 (18), 2020, pp. 1–68                                            9
                           T OM G UR , G OVIND R AMNARAYAN , AND RON D. ROTHBLUM

of locally testable codes (rather than correctable ones), even though our own goal is to construct (relaxed) locally correctable codes.¹⁵
    Kopparty et al. construct locally testable codes with query complexity exp(O((log log n)²)) by taking an iterative approach, similar to that of Meir [40]. They start off with a code of dimension poly log(n)
(which is trivially locally testable, by reading the entire codeword) and gradually increase its block
length, while (almost) preserving the local testability and maintaining the rate of the code close to 1. This
amplification step is achieved by combining two transformations on codes:

   1. Code tensoring: this transformation squares the block length (which is good since we want to obtain block length n) and the rate (which is not too bad since our rate is close to 1). The main negative effect is that this transformation also squares the distance.

   2. Distance amplification: remarkably, this transformation fixes the loss in distance caused by the
      tensoring step, without harming the rate or local testability too much.
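To get a feel for how these two transformations interact, here is a toy numeric simulation (entirely ours; the fixed restored distance and the 0.999 rate-loss factor are illustrative assumptions, not the actual bounds from [37]):

```python
# Toy simulation (ours) of the iterative amplification of Kopparty et al.:
# tensoring squares the block length, rate, and distance; distance
# amplification then restores the distance, at a mild rate loss (the 0.999
# factor below is an illustrative assumption, not the bound from [37]).
n, rate, dist = 2 ** 2, 0.999, 0.1
target = 2 ** 32
steps = 0
while n < target:
    n, rate, dist = n ** 2, rate ** 2, dist ** 2   # code tensoring
    dist = 0.1                                     # distance amplification
    rate *= 0.999                                  # assumed mild rate loss
    steps += 1

# Reaching block length n takes only ~log log n iterations, and the rate
# stays bounded away from 0.
assert steps == 4 and n == target
assert rate > 0.9
```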

     As noted above, in their work, Kopparty et al. also construct a locally correctable code, albeit only with query complexity exp(Õ(√log n)). The reason why their LCC construction does not match the parameters of their LTC construction is that the tensoring step, used in their construction of locally testable codes, is not known to preserve local correctability.¹⁶
     Our key observation is that tensoring does preserve relaxed local correctability. Recall that the tensor of a code C : F^k → F^n is the code C² : F^{k²} → F^{n²} that consists of all strings c ∈ F^{n²} that, viewed as n × n matrices, have the property that each row and each column is a codeword of C.
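The row/column characterization of the tensor code can be seen concretely in a toy example (ours; a tiny parity code over F₂ stands in for C):

```python
# Toy illustration (ours) of the tensor code C² for a small linear code C.
# C : F₂² → F₂³ maps (x1, x2) to (x1, x2, x1 + x2); C² encodes a 2 × 2
# message matrix M as a 3 × 3 matrix whose rows and columns all lie in C.

def encode(x):                       # the base code C
    return [x[0], x[1], (x[0] + x[1]) % 2]

def in_code(v):                      # membership in C: v is determined by v[:2]
    return encode(v[:2]) == v

def encode_tensor(M):                # encode the rows, then encode the columns
    rows = [encode(row) for row in M]
    cols = [encode([rows[0][j], rows[1][j]]) for j in range(3)]
    return [[cols[j][i] for j in range(3)] for i in range(3)]

W = encode_tensor([[1, 0], [1, 1]])
assert all(in_code(W[i]) for i in range(3))                         # rows in C
assert all(in_code([W[i][j] for i in range(3)]) for j in range(3))  # columns in C
```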
     Suppose that C is a (relaxed) LCC with query complexity q. We want to show that C² is also a (relaxed) LCC with query complexity roughly q. Let w ∈ F^{n²} be a (possibly) corrupt codeword of C². Thus w, which we also view as an n × n matrix, is close to some codeword c ∈ F^{n²}. Given an index (i, j) ∈ [n] × [n] to correct, a natural approach is to apply the (relaxed) local corrector of C to the ith row of w, with respect to the index j.
     If it were the case that the ith row of w were close to the ith row of c, we would be done. However, the ith row of w constitutes only a 1/n fraction of w, and so it could potentially be entirely corrupt. Let us therefore assume that the ith rows of c and w (almost) entirely disagree.
     To detect that this is the case, our corrector chooses at random a few columns J ⊂ [n]. On the one hand, since the ith rows of w and c disagree almost everywhere, with high probability it will be the case that w_{i,j′} ≠ c_{i,j′} for some j′ ∈ J. On the other hand, since j′ is just a random column, with high probability the j′th columns of w and c are close.
     Given this, a natural approach is to have our corrector robustly read the (i, j′)th entry of w for every j′ ∈ J, by applying the (relaxed) local corrector of C to each column j′. In the likely case that it chooses a j′ such that the j′th columns of w and c are close, with high probability the corrector will either return c_{i,j′} or ⊥. If our corrector sees ⊥ it can immediately reject (since this would never happen for an exact codeword). Otherwise, if our corrector sees the value c_{i,j′}, it can compare this value with w_{i,j′} (by
  ¹⁵ This may not be surprising, since the notions of relaxed correctability and testability are closely related. In particular, as observed in [25], every RLDC (analogously, RLCC) is roughly equivalent to a code C such that for every coordinate i and value b, the subcode obtained by fixing the ith bit to b (i. e., {C(x) : C(x)_i = b}) is locally testable.
  ¹⁶ It can be shown that tensoring at most squares the query complexity for local correction. However, the [37] iterative approach cannot afford such an overhead in each iteration.



explicitly reading the (i, j′)th entry of w). By the above analysis, these values will differ (with high probability), in which case our corrector can also reject.
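The corrector just described can be sketched as follows (a simplified sketch, ours; `correct_C` stands for an assumed relaxed corrector of the base code C, and the number of sampled columns is illustrative):

```python
import random

BOT = "⊥"  # abort symbol

def relaxed_correct_tensor(W, i, j, n, correct_C, samples=8):
    """Sketch (ours) of the relaxed corrector for C² at entry (i, j).

    W is the n x n word; correct_C(word, t) is an assumed relaxed corrector
    for the base code C, returning the t-th symbol or BOT.
    """
    # Try to correct entry (i, j) along the i-th row.
    answer = correct_C(W[i], j)
    if answer == BOT:
        return BOT
    # Cross-check the i-th row against a few random columns: correct entry
    # (i, j2) along column j2 and compare it with the raw value W[i][j2].
    for j2 in random.sample(range(n), min(samples, n)):
        column = [W[t][j2] for t in range(n)]
        val = correct_C(column, i)
        if val == BOT or val != W[i][j2]:
            return BOT   # evidence that the i-th row is heavily corrupted
    return answer

# On an exact codeword, reading directly is a valid base corrector, and the
# relaxed corrector must return the true entry (Condition 1 of an RLCC).
W = [[1, 0, 1], [1, 1, 0], [0, 1, 1]]       # toy 3 x 3 word
read = lambda word, t: word[t]
assert relaxed_correct_tensor(W, 0, 2, 3, read) == 1
```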
    To calculate the overall query complexity of the resulting code, we need to account for the overhead introduced by both the tensoring and the distance amplification steps. Assuming that C is (relaxed) locally correctable up to distance δ_R, the tensoring step increases the query complexity by only an O(1/δ_R) factor. Each distance amplification step increases the query complexity by roughly a polylog(n) factor. Thus, since we need roughly log log n iterations to reach block length n, the overall query complexity is exp(O((log log n)²)).
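In symbols, the tally of the final step is (a schematic calculation, ours):

```latex
\underbrace{\prod_{t=1}^{O(\log\log n)} \operatorname{polylog}(n)}_{\text{one factor per iteration}}
  \;=\; (\log n)^{O(\log\log n)}
  \;=\; \exp\!\bigl(O((\log\log n)^2)\bigr).
```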


1.3   Related work
Subsequent to the initial publication of this work [30], the open problem raised in Section 1.1 was resolved by Chiesa, Gur, and Shinkar [12], who improved the block length in Theorem 1.1 from nearly quartic to nearly linear. More precisely, they showed that for every α > 0, there exists an O(1)-query RLCC with block length n = k^{1+α}, matching the parameters of the state-of-the-art constructions of RLDCs. Recent work [29, 13] shows lower bounds for RLCCs, which match the improved parameters of [12] nearly tightly.
     A similar notion to RLDCs called Locally Decode/Reject Codes (LDRCs) arose in the beautiful work
of Moshkovitz and Raz [42] on constructing two-query PCPs with subconstant error. These are similar
to RLDCs in that they are codes with a decoder that is permitted to reject if it sees errors. However, it
is important to note that the two notions differ in a few significant ways and are overall incomparable.
First, LDRCs decode a k-tuple of coordinates jointly, rather than a single coordinate. Second, LDRCs
have a “list-decoding” guarantee—namely, that the decoding agrees with one message in a short list of
messages—as opposed to RLDCs, which provide unique decoding (but up to a smaller radius). Finally,
LDRCs only need to work with high probability over the choice of k-tuple, while RLDCs must decode or
reject with high probability for every coordinate. See [42, Section 2] for the formal definition of LDRCs
and a comparison to RLDCs.
     A closely related notion is that of decodable PCPs (dPCPs), first introduced by Dinur and Harsha [15] with the aim of obtaining a modular and simpler proof of the [42] result. A dPCP is a PCP oracle, encoding an NP-witness, which allows for list decoding of individual bits of the NP witness it encodes.
Dinur and Harsha provided constructions of such dPCPs as well as a composition theorem for dPCPs.
     The notion of locally recoverable codes (LRCs) [28], which recently received much attention in the
context of data storage (e. g., [49, 10, 6, 33]), is also closely related to that of locally correctable codes.
Here, the goal is to correct a single error locally, obtaining dramatically better parameters than those
obtainable via locally correctable codes, which can tolerate a linear number of errors.
     Additionally, in a recent paper, Goldreich and Gur [25] introduced the notion of universal locally testable codes (universal-LTCs), which can be thought of as generalizing the notion of relaxed LDCs. A universal-LTC C : {0, 1}^k → {0, 1}^n for a family of functions F = { f_i : {0, 1}^k → {0, 1} }_{i∈[M]} is a code such that for every i ∈ [M] and b ∈ {0, 1}, membership in the subcode {C(x) : f_i(x) = b} can be locally tested. As was shown in [25], universal-LTCs with respect to the family of dictator functions
(i. e., functions of the form f(x) = x_i) are roughly equivalent to RLDCs. While universal-LTCs do not capture the notion of RLCC, one can consider a natural generalization of universal-LTCs in which the functions are defined on the codeword rather than the message space, i. e., with respect to a family of functions F = { f_i : {0, 1}^n → {0, 1} }_{i∈[M]}. Such a generalization captures the notion of RLCC; however, it is not clear whether the results in [25] apply to this setting. We leave this question to future work.
    Finally, we remark that relaxed LDCs have been used in the context of interactive proofs of proximity [31] and property testing [11].


1.4     Organization
In Section 2 we provide standard definitions and notations. In Section 3 we formally define local
correcting, and the relaxation that we introduce. In Section 4 we give our first construction, an RLCC with
constant query complexity, while in Section 5 we give our second construction, an RLCC with constant
rate.


2     Preliminaries
We denote the relative distance, over alphabet Σ, between two strings x ∈ Σ^n and y ∈ Σ^n by

                                  ∆(x, y) := |{i ∈ [n] : x_i ≠ y_i}| / n.

If ∆(x, y) ≤ ε, we say that x is ε-close to y, and otherwise we say that x is ε-far from y. Similarly, we denote the relative distance of x from a non-empty set S ⊆ Σ^n by ∆(x, S) := min_{y∈S} ∆(x, y). If ∆(x, S) ≤ ε, we say that x is ε-close to S, and otherwise we say that x is ε-far from S.
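Concretely (a toy illustration, ours, over the binary alphabet):

```python
# Relative distance between two strings, and distance from a set (ours).
def rel_dist(x, y):
    assert len(x) == len(y)
    return sum(a != b for a, b in zip(x, y)) / len(x)

def dist_to_set(x, S):
    return min(rel_dist(x, y) for y in S)

assert rel_dist("0110", "0100") == 0.25        # one of four positions differs
S = {"0000", "1111"}
assert dist_to_set("0001", S) == 0.25          # "0001" is 0.25-close to S
```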


2.1     Error-correcting codes
Let k < n be positive integers and let Γ, Σ be alphabets. A code C : Γk → Σn is an injective mapping from
messages of length k (over the alphabet Γ) to codewords of length n (over the alphabet Σ). Typically it
will be the case that Γ = Σ, in which case we simply say that the code is over the alphabet Σ. We refer to
n as the block length of the code (which we think of as a function of k). The rate of the code is defined
as k/n. The relative distance of the code is the minimum, over all pairs of distinct messages x, y ∈ Γk ,
of ∆ (C(x),C(y)). We shall sometimes slightly abuse notation and use C to denote the set of all of its
codewords {C(x)}x∈Γk ⊂ Σn .
    Let F be a finite field (which we think of as an alphabet). We say that a code C : Fk → Fn is a linear
code if it is a linear map from Fk to Fn . In this case the set C of codewords is a subspace of Fn .


2.1.1    Code concatenation

Code concatenation is an operation on codes that is commonly used to reduce alphabet size. Fix alphabets Σ, Ξ, and Γ. Fix an “outer” code C : Σ^k → Ξ^n with distance δ_C and rate r_C, and an “inner” code D : Ξ → Γ^r with distance δ_D and rate r_D. The concatenation of C with D is the code C′ : Σ^k → Γ^{r·n} in which each x ∈ Σ^k is first encoded with C, and then each symbol of the resulting codeword is encoded using the code D. The relative distance of C′ is at least δ_C · δ_D and the rate is r_C · r_D.
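The concatenation operation can be sketched as follows (a toy example, ours; the specific outer and inner codes are chosen only for illustration):

```python
# Toy concatenation sketch (ours): outer code C over alphabet Ξ = {0,1,2,3}
# (3-fold repetition), inner code D encoding each Ξ-symbol as two bits plus
# a parity bit.

def outer_encode(x):                 # C : Σ¹ → Ξ³
    return [x[0]] * 3

def inner_encode(sym):               # D : Ξ → Γ³ with Γ = {0, 1}
    bits = [(sym >> 1) & 1, sym & 1]
    return bits + [bits[0] ^ bits[1]]

def concat_encode(x):                # C′: encode with C, then each symbol with D
    return [b for sym in outer_encode(x) for b in inner_encode(sym)]

assert concat_encode([2]) == [1, 0, 1, 1, 0, 1, 1, 0, 1]

# Distances multiply: C has relative distance 1 (repetition), D has 2/3,
# and indeed these two distinct messages disagree on ≥ 2/3 of the 9 bits.
d = sum(a != b for a, b in zip(concat_encode([2]), concat_encode([1]))) / 9
assert d >= 2 / 3
```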


    Let F and G be finite fields such that G is an extension field of F; that is, G ≅ F^m for some m ∈ N. Let C : F^k → G^n and D : G → F^r be error-correcting codes that are F-linear (where we identify G with F^m). Then the code obtained by concatenating C and D is F-linear.
    We will often concatenate our codes with good binary linear codes, that is, binary linear codes with
constant rate and distance.
Lemma 2.1 ([36]). There exist (explicit) binary linear codes with constant rate and distance.


3      Definitions: local correcting and its relaxations
First, we define the notion of (full-fledged) local correctability for codes.
Definition 3.1 (Locally Correctable Codes (LCCs)). Let C ⊆ Σ^n be an error-correcting code with relative distance δ. We say that C is locally correctable if there exists a constant δ_radius < δ/2, which we call the correcting radius, and a randomized polynomial-time algorithm M that gets oracle access to a string w ∈ Σ^n and explicit input i ∈ [n]. We require that if w is δ_radius-close to some codeword c, then M^w(i) = c_i with probability at least 2/3. Furthermore, if w is a codeword, we require that M^w(i) = c_i with probability 1.¹⁷
    The query complexity of M is the maximal number of queries that M makes for any input i and w. (Note that the constant 2/3 can be amplified as usual by repeating the process multiple times and outputting the majority symbol.)
    Following Ben-Sasson et al. [7], in this article we consider a relaxation of LCCs, with the aim of constructing more efficient codes. Loosely speaking, the relaxation allows the corrector to output a special abort symbol ⊥, which indicates that it is unsure how to correct.
Definition 3.2 (Relaxed Locally Correctable Codes (RLCCs)). Let C ⊆ Σ^n be an error-correcting code. We say that C is relaxed locally correctable (RLCC) if there exists a constant δ_radius ∈ (0, 1], which we call the correcting radius, and a randomized polynomial-time algorithm M, which gets as input oracle access to w ∈ Σ^n and explicit access to an index i ∈ [n], such that the following two conditions hold.

   1. If w ∈ C, then M^w(i) = w_i with probability 1.

   2. If w is δ_radius-close to some codeword c ∈ C, then

                                  Pr[ M^w(i) ∈ {c_i, ⊥} ] ≥ 1/2,

      where ⊥ ∉ Σ is a special abort symbol.
    Some remarks about Definition 3.2 are in order. First, we note that every LCC is, in particular, a
relaxed LCC. Second, one might worry that Condition 2, which allows the corrector to state that it does
not know how to decode, could allow the trivial decoder that simply responds with ⊥ to everything.
However, Condition 1 prevents this, by stipulating that the corrector must always succeed when given a
valid codeword.
    We conclude this section with two additional remarks.
   ¹⁷ Requiring perfect correctness for exact codewords is non-standard, but it holds for state-of-the-art LCCs. We include this requirement here since it is required for RLCCs.



Remark 3.3 (RLCC Error Reduction). Note that the probability of error in Definition 3.2 can be reduced by repeating the test t times (with independent coin tosses). If all the tests agree on a symbol σ ≠ ⊥, then the amplified corrector outputs σ, and otherwise it outputs ⊥. For the amplified corrector to make a mistake, all t iterations must err, which happens with probability at most 2^−t.
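The amplified corrector of this remark can be sketched as follows (ours):

```python
# Sketch (ours) of the error reduction of Remark 3.3: run the corrector t
# times and output σ only if all runs agree on the same symbol σ ≠ ⊥.
BOT = "⊥"

def amplify(corrector, t):
    def amplified(w, i):
        answers = {corrector(w, i) for _ in range(t)}
        if len(answers) == 1 and BOT not in answers:
            return answers.pop()
        return BOT
    return amplified

always_a = lambda w, i: "a"
assert amplify(always_a, 10)("word", 0) == "a"

flaky = iter(["a", BOT, "a"])                  # one of three runs aborts
assert amplify(lambda w, i: next(flaky), 3)("word", 0) == BOT
```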

Remark 3.4 (Success Rate). Lastly, we mention a third condition that appears in the definition of relaxed
local decoders in [7], which states that there is a constant fraction of coordinates i ∈ [n] for which
Pr[M(w, i) = ci ] ≥ 2/3—namely, for which the relaxed local decoder successfully retrieves the symbol
(and does not reject).
    In the scenario for which our relaxed local corrector makes a constant number of queries, this property
can be shown to follow directly from the two conditions in Definition 3.2, via a transformation that is
analogous to the one in [7].


4      Constant-query RLCC
In this section we construct a relaxed locally correctable code with constant query complexity and
polynomial length.

Theorem 4.1. For every ε > 0, there exists a (linear) relaxed LCC C : {0, 1}^k → {0, 1}^n with constant relative distance, constant query complexity, and block length n = k^{4+ε}.

    The key tool in the proof of Theorem 4.1 is a composition technique which shows that by combining a (relaxed) locally correctable code that is “robust” with a suitable PCP of proximity (PCPP), one obtains an RLCC with improved query complexity.¹⁸ More specifically, we need a general-purpose PCPP that
has the following two properties:

     1. Canonical Soundness: Each input x ∈ L , where L is the target language, has a specific “canonical”
        PCPP proof string. Loosely speaking, the requirement is that even inputs that are in the language
        are rejected if the proof that is provided is not the canonical one. Canonical soundness was first
        defined by Goldreich and Sudan [27].

     2. Self Correction: We require that the set of canonical PCPP proof strings forms an error correcting
        code (i. e., it has distance), and furthermore, that this code is locally correctable. In other words,
        symbols in the PCPP proof string can be reconstructed even if the proof string is slightly corrupted,
        using few queries.

   Our composition is similar to (and inspired by) the composition of PCPs, and especially the approach advocated by Ben-Sasson et al. [7], who compose any “robust” PCP with a PCP of proximity (PCPP).
   For a high-level overview of our construction, see Section 1.2.1.
   ¹⁸ Loosely speaking, a PCPP is similar to a PCP except that the verifier has oracle access both to the proof and to the main input. The soundness requirement is only that the verifier rejects inputs that are far from the language.



Section organization. In Section 4.1 we introduce several important definitions that will be used
throughout this section, including the various strong notions of PCPPs that we use. In Section 4.2 we
state and prove our composition theorem. We then proceed to construct the two PCPPs that we will
use. In Section 4.3 we construct a “Hadamard-like” PCPP (which has exponential length but constant query complexity), and then in Section 4.4 we construct a “[5]-like” PCPP (with polynomial length and polylogarithmic query complexity). Finally, in Section 4.5 we put it all together and prove Theorem 4.1.


4.1     Preliminaries
In this section we provide the required preliminaries for our construction. Specifically, we define strongly
canonical PCPs of proximity with self-correctable oracles, and review basic properties of the low-degree
extension code.


4.1.1    Canonical self-correctable PCPP

We begin by recalling the definition of PCPs of proximity, which, loosely speaking, are PCPs in which the verifier is only allowed to make a small number of queries to both the statement and the proof, and in which soundness only guarantees that, with high probability, an accepted statement is close to a correct one.

Definition 4.2 (Probabilistically Checkable Proof of Proximity (PCPP) [7, 16]). A probabilistically
checkable proof of proximity (PCPP) for a set S ⊆ {0, 1}n consists of a probabilistic verifier V that
gets access to an input oracle x ∈ {0, 1}n and to a proof oracle π. The verifier is required to satisfy the
following properties:

      • Completeness: For every x ∈ S, there exists a proof π such that when V is given access to the
        input oracle x and proof oracle π it accepts with probability 1.

      • Soundness: For every x 6∈ S and every proof string π, when V is given access to the input oracle x
        and proof oracle π it rejects with probability at least Ω(∆ (x, S)).

The query complexity q of the PCPP is the maximal number of queries that V makes, to both the input
and proof oracles, on any input x ∈ {0, 1}n and proof π. Similarly, the randomness complexity r of the
PCPP is the maximal number of random bits that V uses given any input oracle x ∈ {0, 1}n and proof
oracle π.

      In our constructions it will be important for us to use PCPPs that are linear in the following sense:

Definition 4.3 (Linear PCPP). We say that a PCPP (P, V) is linear if it satisfies the following two
conditions:

   1. The mapping P from inputs x to PCPP proof strings is a linear function.

   2. The verifier V’s checks amount to checking that the answers lie in some affine subspace.


    We say that a PCPP has a linear verifier V if it satisfies Condition 2 above: that is, the decision predicate D can be expressed as a conjunction of linear functions (i. e., a linear system).
    We shall actually need a stronger notion of PCPPs that have strong soundness with respect to a canonical proof. Loosely speaking, strongly canonical PCPPs are PCPPs with two additional requirements: (1) canonicity: for every true statement there exists a unique proof (called the canonical proof) that the verifier is required to accept, and any other proof (even for a correct statement) must be rejected; and (2) strong soundness: the PCPP verifier is required to be proximity oblivious, namely, it rejects any pair of a statement and a proof with probability related to its distance from a true statement and its corresponding canonical proof.
Definition 4.4 (Strongly Canonical PCPP [27]). A PCPP for a set S, with verifier V, is said to be strongly canonical if there exists a (deterministic) function P : S → {0, 1}* that maps each input x ∈ S to a canonical proof π = P(x) such that

    • Completeness (with respect to P): For every x ∈ S, when V is given access to the input oracle x and proof oracle π = P(x) it accepts with probability 1.

    • Strong Soundness (with respect to canonical proofs): For every x ∈ {0, 1}^n and every proof oracle π, when V is given access to the input oracle x and proof oracle π it rejects with probability Ω(µ), where

                         µ := min_{x′∈S} max{ ∆(x, x′), ∆(P(x′), π) }.                         (4.1)
    Before we proceed, we give some intuition about Equation (4.1). For simplicity, we will pretend here that the proof length is the same as the input length (i. e., |P(x)| = n for every x ∈ S ∩ {0, 1}^n). Recall that in a PCPP, the verifier makes few queries to the proof oracle and to the input oracle. Unfortunately, this means that our verifier will usually be unable to distinguish x ∉ S from some very close x′ ∈ S. On the other hand, if on input x ∈ S the verifier is presented with a purported proof π that is very close to the canonical proof P(x), it cannot be expected to reject with high probability. If we think of the input x and proof π as being noisy versions of some “true” input x′ and canonical proof P(x′), the probability that our PCPP verifier distinguishes (x, π) from (x′, P(x′)) is governed by

        ∆(x ◦ π, x′ ◦ P(x′)) = (∆(x, x′) + ∆(P(x′), π))/2 = Θ( max{ ∆(x, x′), ∆(P(x′), π) } ).        (4.2)

Minimizing over all x′ ∈ S gives us Equation (4.1).¹⁹
   Finally, we define (strongly canonical) PCPP oracles that are self-correctable as follows.
Definition 4.5 (Self-correctable PCPP). A strongly canonical PCPP with prover strategy P and query complexity q is said to be self-correctable if the function P is a locally correctable code with at most q queries and constant correcting radius.²⁰
   ¹⁹ We use the RHS of Equation (4.2) rather than the LHS, since the RHS also handles the case that the length of the proof string differs from the length of the input.
   ²⁰ One could alternatively introduce an additional parameter that measures the number of queries made by the PCPP’s self-corrector. However, since we already have a lot of parameters, we choose to bound the corrector’s query complexity by the PCPP verifier’s query complexity.



4.1.2    Robustness
Loosely speaking, a PCPP is said to have robust soundness [7] if instead of just asking that the PCPP
verifier reject false statements (with high probability), we ask that the verifier’s local view (i. e., the
answers to all queries) be “far” from any local view that would have made it accept. We shall refer to
such PCPPs as robust PCPPs.
    For the following definition, it would be convenient to think of a PCPP verifier V as a procedure that chooses a random string ω and generates: (1) an (ordered) sequence of q queries I_ω = (i_1, . . . , i_q), and (2) a decision predicate D_ω : {0, 1}^q → {0, 1}. Given query access to input x and PCPP oracle π, the verifier queries (x ◦ π)|_{I_ω} and outputs D_ω((x ◦ π)|_{I_ω}). We write (I, D) ← V to denote the queries and decision predicate (randomly) generated by V (and note that these depend on neither the input x nor the proof string π). We also write (I_ω, D_ω) = V(ω) to denote the (fixed) query locations and decision predicate that V outputs given the random string ω. Slightly abusing notation, we also denote by D_ω the set of answers α ∈ {0, 1}^q that satisfy D_ω.
Definition 4.6 (Robust Strongly Canonical PCPP). For robustness parameter ρ ∈ (0, 1), a PCPP with canonical proof P and verifier V for a set S has ρ-robust strong soundness s (with respect to canonical proofs) if the following conditions are satisfied:

   1. Accepting Views are Far Apart: For every ω, the set D_ω of accepting views is an error-correcting code with distance ρ.

   2. Robust Soundness: For every x ∈ {0, 1}^n and every proof oracle π, when V is given access to the input oracle x and proof oracle π it holds that

                              Pr_{(I,D)←V}[ ∆( (x ◦ π)|_I , D ) > ρ ] ≥ s · µ,

      where

                              µ := min_{x′∈S} max{ ∆(x, x′), ∆(P(x′), π) }.

    Intuitively, the strong soundness with respect to canonical proofs assures that the verifier V will
reject statement-proof pairs with probability that is proportional to their distance from a valid pair (of
statement x0 ∈ S and its canonical proof P(x0 )), and the robustness additionally assures that the distance
of V’s local view from an accepting one is proportional to the aforementioned distance between the given
statement-proof pair and a valid pair. We note that when the distance from V’s local view to an accepting
view is nonzero, the verifier always rejects, and so robust soundness is stronger than traditional soundness.
    Next, we extend the foregoing notion of robustness to the setting of relaxed locally correctable
codes. Note that relaxed LCCs admit correcting procedures rather than testing procedures, and so it is
not immediately clear what soundness means in this context (let alone robust soundness). However, it
is natural to define the robustness condition for RLCC as the requirement that with high probability the
local view of the corrector is far from all local views that would have caused the (relaxed) corrector to
output a wrong value.
    We will think of a relaxed corrector as a procedure that, given an index i ∈ [n], randomly generates an (ordered) sequence of q queries I = (i_1, . . . , i_q) and a function D : Σ^q → Σ ∪ {⊥} (where ⊥ ∉ Σ is a special

abort character) according to which the corrector decides; that is, the corrector’s output with respect to a given word w ∈ Σ^n to be corrected is D(w|_I). We also denote by (I_{i,ω}, D_{i,ω}) = M(i; ω) the particular query set I_{i,ω} and decision function D_{i,ω} generated by M on input i and random string ω. We say that an RLCC has a linear corrector M if D is a procedure that checks that w|_I satisfies a conjunction of linear constraints; if not, M outputs ⊥, and otherwise it outputs the evaluation of a linear function of w|_I. We proceed to define robust RLCCs.

Definition 4.7 (Robust RLCC). For robustness parameter ρ ∈ (0, 1) and soundness parameter s ∈ (0, 1), we say that an error-correcting code C ⊆ Σ^n, with correcting radius δ_radius ∈ (0, 1), is a ρ-robust s-sound RLCC if there exists a correcting procedure M that, on input i ∈ [n], outputs a sequence of queries I ∈ [n]^q and a decision predicate D : Σ^q → Σ ∪ {⊥} such that the following conditions are satisfied:

   1. Accepting Views are Far Apart: For every i ∈ [n] and ω ∈ R, the set D_{i,ω}^{−1}(Σ) is an error-correcting code with distance ρ.

   2. Robust Soundness: For every i ∈ [n] and w ∈ Σ^n that is δ_radius-close to a codeword c ∈ C, it holds that

                         Pr_{(I,D)←M(i)}[ ∆( w|_I , D^{−1}(Σ \ {c_i}) ) > ρ ] ≥ s,

      where I ∈ [n]^q is a set of query locations, D : Σ^q → Σ ∪ {⊥} is the decision predicate, and q is the query complexity.

Recall that ⊥ ∉ Σ, and so the set D^{−1}(Σ \ {c_i}) above refers to all answers that would have led the decoder to answer incorrectly. Intuitively, similarly to the robustness of PCPs, the robust soundness condition in Definition 4.7 assures that with high probability the corrector will not only avoid outputting a wrong symbol, but also that its local view will be far from any local view that leads to outputting a wrong symbol.
    We define robust self-correctable PCPPs similarly to Definition 4.5, except that we require that the local correction procedure be robust.

4.1.3    Low-degree polynomials and the low-degree extension code

Let F be a finite field, H ⊆ F, and m = m(k) ≥ 1 be a parameter, which we call the dimension. A basic
algebraic fact is that for every function f : Hm → F there exists a unique function f̃ : Fm → F such that
f̃ is a polynomial with individual degree |H| − 1 that agrees with f on Hm . Moreover, there exists an
individual degree |H| − 1 polynomial β : Fm × Fm → F such that for every function f : Hm → F it holds
that
                                     f̃(z) = ∑_{x∈Hm} β(x, z) · f(x).

The function f̃ is called the low-degree extension of f (with respect to the field F, subset H and dimension
m).
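The formula above can be evaluated directly: β(x, z) is a product of univariate Lagrange basis polynomials over H. The following small sketch (with illustrative parameters F = GF(11), H = {0, 1, 2}, m = 2 chosen by us) computes the low-degree extension and checks that it interpolates f on Hm.

```python
from itertools import product

# Sketch of the low-degree extension (illustration only): F = GF(11),
# H = {0,1,2}, m = 2. beta(x, z) is a product of univariate Lagrange basis
# polynomials over H, of individual degree |H| - 1 = 2 in each z_j.
P = 11                      # field modulus (prime, so inverses exist)
H = (0, 1, 2)
M = 2

def lagrange(h, z):
    """Univariate Lagrange basis L_h over H, evaluated at z (mod P)."""
    val = 1
    for a in H:
        if a != h:
            val = val * (z - a) * pow(h - a, -1, P) % P
    return val

def beta(x, z):
    """beta(x, z) = prod_j L_{x_j}(z_j); equals 1 iff z = x on H^m."""
    val = 1
    for xj, zj in zip(x, z):
        val = val * lagrange(xj, zj) % P
    return val

def lde(f, z):
    """Evaluate the low-degree extension of f : H^m -> F at any z in F^m."""
    return sum(beta(x, z) * f(x) for x in product(H, repeat=M)) % P

# Agreement on H^m: the extension interpolates f. Since this particular f
# is itself the polynomial 3*z0 + z1^2 (individual degrees <= |H| - 1),
# uniqueness implies the extension is exactly that polynomial.
f = lambda x: (3 * x[0] + x[1] * x[1]) % P
assert all(lde(f, x) == f(x) for x in product(H, repeat=M))
```

Uniqueness of the extension is what the assertion at the end exploits: any polynomial of individual degree at most |H| − 1 agreeing with f on Hm must equal f̃.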
    The low-degree extension can also be viewed as an error-correcting code in the following way.
Suppose that H and m are such that |H|^m = k. Then, we can associate a string x ∈ Fk with a function
x : Hm → F by identifying Hm with [k] in some canonical way.


     We define the low-degree extension of a string x as LDEF,H,m (x) = x̃. That is, the function LDEF,H,m
is given as input the string x ∈ Fk , views it as a function x : Hm → F and outputs its low-degree extension x̃.
The Polynomial Identity Lemma²¹ (Lemma 4.8) implies that this code has relative distance 1 − m·(|H|−1)/|F|.

Lemma 4.8 (Polynomial Identity Lemma). Let P : Fm → F be a non-zero polynomial of total degree d.
Then,
                                  Pr_{r∈Fm} [ P(r) = 0 ] ≤ d/|F| .
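Lemma 4.8 immediately yields a randomized polynomial-identity test: to check whether two polynomials agree, evaluate both at a uniformly random point. The sketch below (with field size and polynomials chosen by us for illustration) errs only when every random point happens to hit a root of the difference polynomial.

```python
import random

# Randomized identity test based on the Polynomial Identity Lemma
# (illustration): if f != g, a uniformly random point of F^m exposes the
# nonzero difference polynomial except with probability <= d/|F|.
P = 101                       # field modulus (assumed prime)
M = 3

def probably_equal(f, g, trials=20):
    """Each trial errs with probability <= d/P when f != g, so a 'True'
    answer is wrong with probability at most (d/P)**trials."""
    for _ in range(trials):
        r = [random.randrange(P) for _ in range(M)]
        if f(*r) % P != g(*r) % P:
            return False          # witness point found: definitely unequal
    return True                   # equal with high probability

# (x + y)^2 equals x^2 + 2xy + y^2 identically, but differs from
# x^2 + 2xy + z^2: the difference y^2 - z^2 has total degree 2, so each
# trial misses it with probability at most 2/101.
f = lambda x, y, z: (x + y) ** 2
g = lambda x, y, z: x * x + 2 * x * y + z * z
h = lambda x, y, z: x * x + 2 * x * y + y * y
print(probably_equal(f, g), probably_equal(f, h))
```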

Self-correction of polynomials. We will use the fact that low-degree polynomials are self-correctable,
with a corrector that is both linear and robust.

Lemma 4.9 (Self-Correction Procedure (see [19, 48])). Let δ < 1/5 and d, m ∈ N such that d ≤ |F|/10.
There exists an algorithm that, given x ∈ Fm and oracle access to an m-variate function P : Fm → F that is
δ-close to a polynomial P′ of total degree d, makes O(d) oracle queries and outputs P′(x) with probability
2/3. Moreover, if P has total degree d, then given x ∈ Fm , the algorithm outputs P(x) with probability 1.
     Furthermore, with probability 2/3, the answers that the algorithm receives to its queries are 5δ-far
from values that would lead it to output any value other than P(x), and the algorithm computes a linear
function of the symbols it queries.

    The self-correction algorithm samples a random line ℓx in Fm passing through the point x, and runs
the Berlekamp–Welch corrector for univariate polynomials of degree at most d on ℓx . To get robustness,
we additionally read O(d) extra points along the line (more than the Berlekamp–Welch corrector requires),
and check that they match the decoded polynomial. Since each point read is individually random, the
probability that we read a set of points that is 5δ-close to one consistent with a different degree-d
polynomial can be made arbitrarily small via Markov’s inequality. To establish that the test is linear,
observe that the Berlekamp–Welch algorithm is a linear function of the symbols read.
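The skeleton of this line-restriction argument can be sketched as follows (an illustration with parameters of our choosing, not the paper's algorithm): restrict the oracle to a random line through x, interpolate the univariate restriction, and evaluate at the parameter corresponding to x. For simplicity the sketch interpolates exactly instead of running Berlekamp–Welch, so it tolerates no errors on the sampled line.

```python
import random

# Skeleton of line-based self-correction (noise-free sketch): to recover
# P'(x), sample a random line l(t) = x + t*v, read the oracle at d+1
# nonzero parameters t, interpolate the univariate restriction, and
# evaluate it at t = 0 (the parameter of x itself, which is never queried).
PRIME = 97
D = 2                              # total degree bound

def interp_at_zero(pts):
    """Lagrange-interpolate (t, y) pairs and evaluate at t = 0 (mod PRIME)."""
    total = 0
    for ti, yi in pts:
        term = yi
        for tj, _ in pts:
            if tj != ti:
                term = term * (0 - tj) * pow(ti - tj, -1, PRIME) % PRIME
        total = (total + term) % PRIME
    return total

def self_correct(oracle, x, m):
    v = [random.randrange(PRIME) for _ in range(m)]    # random direction
    ts = random.sample(range(1, PRIME), D + 1)         # d+1 points on the line
    pts = [(t, oracle([(xi + t * vi) % PRIME for xi, vi in zip(x, v)]))
           for t in ts]
    return interp_at_zero(pts)                         # value at t = 0, i.e. at x

# Oracle for a total-degree-2 polynomial; the restriction to any line has
# degree <= 2 in t, so d+1 = 3 points determine it exactly.
poly = lambda z: (z[0] * z[1] + 3 * z[1] ** 2 + 5) % PRIME
x = [10, 20]
print(self_correct(poly, x, m=2) == poly(x))  # True on an uncorrupted oracle
```

The robust version described above would additionally read O(d) extra points on the same line and compare them against the interpolated polynomial.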

Low-degree tests. The celebrated low-degree test is a key operation in most PCP constructions. We
will need several variants of this test which are listed below. The most standard one is the following:

Lemma 4.10 (Total Degree Test (a.k.a. Low Degree Test); see, e. g., [46, 18]). Let ε ∈ (0, 1/2) and
d, m ∈ N such that d ≤ |F|/2. There exists an algorithm that, given oracle access to an m-variate function
P : Fm → F, makes O(d · poly(1/ε)) queries and:

   1. Accepts every function that is a polynomial of total degree d with probability 1; and

   2. Rejects functions that are ε-far from every polynomial of total degree d with probability at least
      1/2.
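A line test in this spirit can be sketched as follows (a hedged illustration of the idea, not the exact test analyzed in [46, 18]): pick a random line, fit a degree-d univariate polynomial to d + 1 of its points, and check that it predicts one more point on the same line. Degree-d polynomials always pass; the field size and test functions below are our own choices.

```python
import random

# One-sided line test (sketch): fit degree D through D+1 points of a random
# line and check a fresh point on the same line. A total-degree-D polynomial
# restricts to a degree-<=D univariate polynomial, so it always passes.
PRIME = 97
D = 2

def fit_and_eval(pts, t0):
    """Lagrange interpolation through pts, evaluated at t0 (mod PRIME)."""
    total = 0
    for ti, yi in pts:
        term = yi
        for tj, _ in pts:
            if tj != ti:
                term = term * (t0 - tj) * pow(ti - tj, -1, PRIME) % PRIME
        total = (total + term) % PRIME
    return total

def line_test(oracle, m, iters=30):
    for _ in range(iters):
        x = [random.randrange(PRIME) for _ in range(m)]
        v = [random.randrange(PRIME) for _ in range(m)]
        ts = random.sample(range(PRIME), D + 2)        # D+2 points on the line
        vals = [oracle([(xi + t * vi) % PRIME for xi, vi in zip(x, v)])
                for t in ts]
        pts = list(zip(ts[:-1], vals[:-1]))            # D+1 points to fit
        t0, y0 = ts[-1], vals[-1]                      # one point to check
        if fit_and_eval(pts, t0) != y0:
            return False                               # caught a violation
    return True

low = lambda z: (z[0] ** 2 + z[0] * z[1]) % PRIME      # total degree 2: passes
cube = lambda z: z[0] ** 3 % PRIME                     # degree 3: caught w.h.p.
print(line_test(low, m=2), line_test(cube, m=2))
```

For `cube`, whenever the line's first direction coordinate is nonzero the restriction has degree exactly 3, and a degree-2 fit through three points cannot also agree at a fourth, so each iteration catches it.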

    We will also need a more refined version of the test that tests the individual degree of the polynomial.
  21 This lemma, best known under the name “Schwartz–Zippel Lemma,” goes back to a 1922 paper by Ore and has been
rediscovered several times since. This history was uncovered by Arvind et al. [4]. The name “Polynomial Identity Lemma” was
coined by Babai and has been adopted by [4] and two other papers.



Lemma 4.11 (Individual Degree Test; see, e. g., [31, Theorem A.8]). Let d, m ∈ N such that dm < |F|/10
and ε ∈ (0, 1/10). There exists an algorithm that, given oracle access to an m-variate polynomial
P : Fm → F, makes O(dm · poly(1/ε)) queries, and:

   1. Accepts every function that is a polynomial of individual degree d with probability 1; and

   2. Rejects functions that are ε-far from every polynomial of individual degree d with probability at
      least 1/2.

      Lastly, we require a robust analog of the low-degree test.

Lemma 4.12 (Robust Low Degree Test [18]). Let d, m ∈ N such that d ≤ |F|/100. Let ℓ : F → Fm be a
random line. Then, for any m-variate function P : Fm → F that is δ-far from having degree d it holds that

                  E_ℓ [ ∆( P ◦ ℓ , {degree-d univariate polynomials} ) ] ≥ Ω(δ),

where P ◦ ℓ denotes the univariate polynomial obtained by composing P with ℓ.

4.2     Composition of relaxed LCC and PCPPs
In this section we prove a composition theorem for robust RLCCs with self-correctable PCPPs, which,
loosely speaking, yields a robust RLCC with improved query complexity, at a relatively small increase
in the block length. Recall that composition consists of appending PCPP proofs to the RLCC, one
for each set of queries that the corrector for the code would make. We now state the parameters of our
composition formally.

Theorem 4.13 (Composition of Robust RLCC with (suitable) PCPP). Let F be a finite field, and let
C : Fk → Fn be a linear ρcode-robust relaxed LCC, with respect to soundness scode , with a linear corrector,
relative distance δ, correcting radius δradius < δ/2, randomness complexity rcode , and query complexity
qcode .
    Let (P, V) be a linear ρpcp-robust strongly canonical PCPP for inputs of length 2qcode , with respect to
soundness spcp , for affine relations, with a linear verifier, randomness complexity rpcp , query complexity
qpcp , self-correction radius δpcp-correction-radius ≥ ρpcp , self-correction robustness ρself-correct ≥ ρpcp , self-
correction soundness sself-correct ≥ spcp , and self-correction query complexity qself-correct = qpcp .
    The composition of the outer code C with the inner PCPP (P, V) yields a linear (ρpcp/5)-robust
relaxed LCC C′ : Fk → Fn′ , with respect to soundness s′ = (spcp · scode · ρcode^2)/8, with a linear corrector,
block length n′ = 2n · 2^rcode · 2^rpcp · qpcp , relative distance δ/2, correcting radius δradius/4, randomness
complexity 2rcode + O( rpcp + log(qpcp ) + log(qcode ) ), and query complexity 5qpcp .

    In the following subsections we prove Theorem 4.13. Specifically, in Section 4.2.1 we describe the
construction of the composed RLCC. In Section 4.2.2 we describe the relaxed corrector. In Section 4.2.3
we establish the basic properties of the code (distance, length, linearity, query complexity, and robustness).
Then, in Sections 4.2.4 and 4.2.5 we establish the self-correctability of the “code part” and “PCPP part”
of the composed code.


4.2.1    The construction of the composed code
Let C : Fk → Fn be the RLCC stated in Theorem 4.13. Let M be the relaxed correcting procedure
associated with C. Let R = {0, 1}^rcode , and for every index i ∈ [n] and ω ∈ R, denote by Ii,ω (resp., Di,ω )
the query set (resp., decision predicate) generated by M on input the index i and the random
string ω. (Recall that this means that given oracle access to a purported codeword w, location i ∈ [n], and
randomness ω ∈ R, the corrector M outputs Di,ω (w|Ii,ω ).)
     For every affine subspace S ⊆ {0, 1}n , let (PS , VS ) be the PCPP from the theorem’s statement. Let
MPCP denote the self-correction procedure associated with the PCPP (where we omit the dependence of
the corrector on S from the notation).
     The composed code C′ : Fk → Fn′ (for a block length n′ as in the theorem’s statement) is constructed
as follows. Given an input x ∈ {0, 1}k , its encoding C′(x) consists of two parts:

   1. The code’s core: The core part consists of t copies of C(x), where t will be set below such that the
      core and the PCPP part have the same length.22

   2. The code’s PCPP: The second part of the code will consist of a sequence of PCPP proof strings.
      For every i ∈ [n] and every ω ∈ R, let (Ii,ω , Di,ω ) = M(i; ω). We define a set Si,ω as follows:

              Si,ω = { (z1 , z2 ) ∈ (F^qcode)^2 : z1 = σ^qcode for some σ ∈ F, and Di,ω (z2 ) = σ }.

      In words, the set Si,ω consists of all pairs (z1 , z2 ) ∈ (F^qcode)^2 such that (1) z1 consists of qcode
      copies of some element σ ∈ F, and (2) z2 ∈ F^qcode makes M(i; ω) output the symbol σ given
      answers z2 .
      Observe that since M is F-linear, the set Si,ω is an affine subspace. Hence, by the theorem’s
      hypothesis it has a strongly canonical and locally correctable PCPP (Pi,ω , Vi,ω ). We let
      πi,ω = Pi,ω ((C(x)[i])^qcode , C(x)|Ii,ω ), for each i ∈ [n] and ω ∈ R.

Putting both parts together, we define C′ : Fk → Fn′ as follows:

                              C′(x) = ( (C(x))^t , (πi,ω )i∈[n],ω∈R ),

where t is chosen such that t · |C(x)| = ∑i∈[n],ω∈R |πi,ω | (i. e., the core and the PCPP parts have the same
length).
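The two-part layout above can be sketched schematically. In the following illustration every component is a stand-in of our own (a toy repetition code in place of C, a toy query pattern in place of M, and a prover that merely records its claim in place of a real PCPP); only the layout of the composed codeword is faithful to the construction.

```python
# Schematic of the composed codeword C'(x): a repeated core followed by
# one "proof" per (i, w) pair. All components are hypothetical stubs.
def compose(x, C, M_queries, pcpp_prove, R):
    c = C(x)
    n = len(c)
    proofs = []                          # pi_{i,w} for every i in [n], w in R
    for i in range(n):
        for w in R:
            I, _D = M_queries(i, w)      # (I_{i,w}, D_{i,w}) = M(i; w)
            z1 = (c[i],) * len(I)        # copies of the claimed symbol C(x)[i]
            z2 = tuple(c[j] for j in I)  # the corrector's view C(x)|_{I_{i,w}}
            proofs.append(pcpp_prove(z1, z2))
    plen = sum(len(p) for p in proofs)
    t = -(-plen // n)                    # ceil(plen / n): core matches proofs
    return list(c) * t + [s for p in proofs for s in p]

toy_C = lambda x: x * 3                              # toy "code": 3 repetitions
toy_M = lambda i, w: ([i, (i + 1) % 6], None)        # 2 queries per invocation
toy_P = lambda z1, z2: z1 + z2                       # "proof" = the claim itself
word = compose([1, 0], toy_C, toy_M, toy_P, R=[0, 1])
print(len(word))  # → 96: a 48-symbol core next to 48 symbols of "proofs"
```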

4.2.2    The relaxed corrector
Consider a string w ∈ Fn′ (which we think of as a noisy codeword for the correcting procedure). We view
w as a string composed of two parts (analogous to the two parts of the construction above):

   1. (w1 , . . . , wt ) ∈ (Fn )^t : the t alleged repetitions of some codeword in C (i. e., the code’s core).
  22 As usual, integrality issues can be resolved via padding.




   2. π̄ = (πi,ω )i∈[n],ω∈R : the alleged canonical PCPP oracles asserting each one of the affine relations
      (Ai,ω , bi,ω )i∈[n],ω∈R .

    We construct a relaxed local corrector M′ that given a location ℓ∗ ∈ [n′ ] and oracle access to the
purported codeword string w = (c̄, π̄), either corrects at the point ℓ∗ or rejects. (Here and below, we use
the convention that starred indices refer to locations that are non-random, whereas non-starred indices are
random.) Our corrector works differently, depending on whether ℓ∗ belongs to the core or to the PCPP
part:

    • Correcting in the Core (i. e., ℓ∗ ∈ [n · t]):

          1. Let i∗ ∈ [n] be ((ℓ∗ − 1) mod n) + 1, where we recall that n = |C(x)| is the original block
             length. In other words, i∗ is the internal index within the (alleged) codeword of C that ℓ∗
             refers to.
          2. Select at random j ∈ [t]. Read the value w j [i∗ ] qpcp times and check that the value read is
             always the same.23
          3. Uniformly draw ω ∈ R, corresponding to the randomness of C’s corrector M. Let Ii∗ ,ω and
             Di∗ ,ω be the query set and decision predicate, respectively, of M(i∗ ; ω).
          4. Run the PCPP verifier Vi∗ ,ω that was defined above, with respect to the input oracle (z1 , z2 ),
             where z1 = (w j [i∗ ])^qcode and z2 = w j |Ii∗ ,ω , and the PCPP oracle πi∗ ,ω . If the PCPP verifier
             rejects, then output ⊥.
          5. Output w j [i∗ ].

    • Correcting the PCPP Part (i. e., ℓ∗ ∈ [n · t + 1, n′ ]):

          1. Let πi∗ ,ω ∗ be the PCPP oracle that ℓ∗ refers to. Let Ii∗ ,ω ∗ and Di∗ ,ω ∗ be the query set and
             decision predicate, respectively, of M on index i∗ and random coins ω ∗ .
          2. Invoke the PCPP verifier Vi∗ ,ω ∗ with respect to input oracle (z1 , z2 ), where z1 = (w1 [i∗ ])^qcode
             and z2 = w1 |Ii∗ ,ω ∗ , and the PCPP oracle πi∗ ,ω ∗ , and output ⊥ if it rejects.24
          3. Choose a random location i′ ∈ Ii∗ ,ω ∗ . Read the value w1 [i′ ] qpcp times and check that the value
             read is always the same. Denote this value by η.
          4. Invoke the relaxed corrector (recursively) with respect to location i′ (note that this location is
             in the core, which was handled above), and output ⊥ if it does not output η.
          5. Invoke the PCPP’s self-corrector MPCP on the location that corresponds to ℓ∗ in πi∗ ,ω ∗ , and
             output whatever it returns.
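The control flow of the core-correcting procedure above can be sketched as follows. Every oracle in this sketch is a hypothetical stub of our own (for reading w, for the corrector M's queries, and for the PCPP verifier); the sketch mirrors the five steps but, of course, proves nothing about soundness.

```python
import random

# Control-flow sketch of "Correcting in the Core" with stub oracles.
def correct_core(read, verify_pcpp, M_queries, n, t, R, q_pcp, l_star):
    i_star = (l_star - 1) % n              # step 1: internal index (0-based)
    j = random.randrange(t)                # step 2: pick a random repetition j
    reads = [read(j, i_star) for _ in range(q_pcp)]
    if len(set(reads)) != 1:               # repeated reads must all agree
        return None                        # output ⊥
    sigma = reads[0]
    omega = random.choice(R)               # step 3: randomness for M
    I, _D = M_queries(i_star, omega)       # (I_{i*,w}, D_{i*,w}) = M(i*; w)
    view = [read(j, i) for i in I]
    if not verify_pcpp(i_star, omega, sigma, view):
        return None                        # step 4: ⊥ if the verifier rejects
    return sigma                           # step 5: output w_j[i*]

# Toy instantiation: an uncorrupted word, a one-query "corrector," and a
# "verifier" that simply checks the claimed symbol against the view.
cw = [5, 7, 9, 4]
read = lambda j, i: cw[i]
M_queries = lambda i, w: ([i], None)
verify = lambda i, w, sigma, view: view[0] == sigma
print(correct_core(read, verify, M_queries, n=4, t=3, R=[0, 1], q_pcp=2, l_star=3))  # → 9
```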

  23 This step, which at first may seem silly, is meant to ensure robust soundness. Indeed, when analyzing robustness, we must
consider an adversary that is allowed to modify answers to certain queries in retrospect.
  24 The choice of w1 here (rather than any other w j ) was arbitrary.




4.2.3   Basic properties of the composed code
In this subsection we analyze the basic properties of the composed code C′ and its corrector, as defined
in Section 4.2.1, and show they satisfy all the conditions of Theorem 4.13, except for the (robust)
soundness of the relaxed corrector with respect to correcting radius δ′radius , which will be established in
the subsequent subsections.

Block length. The code C′(x) consists of two parts: (1) the codeword C(x) repeated t times, which
makes it equal in length to the second part, and (2) a PCPP for every i ∈ [n] and ω ∈ R, where each
PCPP refers to a statement of length 2qcode , and where t is selected such that both parts (1) and (2) are
of equal length. Recall that |R| = 2^rcode and that the length of each PCPP oracle with respect to statements
of length 2qcode is 2^rpcp · qpcp . Hence the block length of C′ is n′ = 2n · 2^rcode · 2^rpcp · qpcp .

Randomness complexity. For correcting in the core, log(t) random bits are used to select j, where

                  t = n′/(2n) = (2n · 2^rcode · 2^rpcp · qpcp)/(2n) = 2^rcode · 2^rpcp · qpcp .

Then, rcode random bits are needed to uniformly choose ω ∈ R, and rpcp random bits are used by the
PCPP verifier. This totals 2rcode + 2rpcp + log(qpcp ) randomness complexity.
    For correcting the PCPP part, by the above analysis 2rcode + 2rpcp + log(qpcp ) random bits are needed
for the recursive invocation of the corrector with respect to the core. In addition, at most log(qcode )
random bits are needed to uniformly choose the location i′ ∈ Ii∗ ,ω ∗ , and rpcp for the invocation of the
PCPP verifier. Lastly, the PCPP self-corrector also uses rpcp random bits.
    Hence the total randomness complexity is upper bounded by

                               2rcode + O( rpcp + log(qpcp ) + log(qcode ) ).
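The length and randomness accounting above can be sanity-checked numerically. The parameter values below are arbitrary choices of ours; only the identities being asserted matter.

```python
import math

# Sanity check of the accounting with illustrative parameter values.
r_code, r_pcp, q_pcp, q_code, n = 3, 4, 8, 5, 100

t = 2 ** r_code * 2 ** r_pcp * q_pcp               # repetitions of C(x)
core = n * t                                        # length of the code's core
pcpp = (n * 2 ** r_code) * (2 ** r_pcp * q_pcp)     # one proof per (i, w)
n_prime = 2 * n * 2 ** r_code * 2 ** r_pcp * q_pcp  # claimed block length n'

assert core == pcpp                                 # the two parts match in length
assert core + pcpp == n_prime                       # total block length is n'

# Core correction: log(t) bits for j, r_code for w, r_pcp for the verifier.
rand_core = math.log2(t) + r_code + r_pcp
assert rand_core == 2 * r_code + 2 * r_pcp + math.log2(q_pcp)
```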

Query complexity. For correcting a point in the core, we perform qpcp repeated queries to read w j [i∗ ].
Then, qpcp queries are used for invoking the PCPP on a statement of length 2qcode (the corrector’s queries).
    For correcting a point in the PCPP part of C′, we (recursively) invoke the corrector with respect to
the core, which, by the above analysis, takes 2qpcp queries. In addition, we invoke the PCPP verifier and
self-correction procedure for a total of 2qpcp additional queries. We also read w1 [i′ ] qpcp times. Overall,
the total query complexity is 5qpcp .

Distance. Half of each codeword of C′ consists of repetitions of the code C, which has relative distance
δ. Hence, the relative distance of C′ is at least δ′ ≥ δ/2.

Linearity. The linearity of the code C′ is immediate from the construction: The first part of each
codeword of C′ consists of repetitions of a codeword of C, which is a linear code, and the second part
consists of linear PCPP oracles that refer to the first part (i. e., each PCPP oracle is a linear encoding
of the first part). The linearity of the corrector of C′ also follows by construction, since the predicate it
checks is the one induced by the invocation of the linear PCPP verifier V to verify the linear claims of
the linear corrector of C.


4.2.4   Correcting the code’s core
In this subsection we analyze the relaxed corrector presented in Section 4.2.2 in the case that the location
ℓ∗ (to be corrected) is in the code’s core (i. e., ℓ∗ ∈ [n · t]).
     The first condition of relaxed correctors (successfully correct uncorrupted codewords with probability
1) follows easily from the construction of the corrector and the completeness of the underlying PCPP. To
elaborate, since the codeword is completely uncorrupted, the string z2 = w j |Ii∗ ,ω leads the corrector to
output w j [i∗ ], due to the completeness of C, the “core” RLCC. Therefore, the PCPP verifier will always
accept in this case and we will always output w j [i∗ ]. We proceed to demonstrate that the correcting
procedure M′ additionally satisfies the properties of a robust RLCC (Definition 4.7). First, we will show
that accepting views for M′ are far apart, which will follow from the robustness of the underlying PCPP.

Lemma 4.14 (Accepting Views are Far Apart (in the core part)). For every i ∈ [n · t] and random string
ω′ for M′, the accepting views of M′ have distance ρpcp/2.

Proof. Fix a random string ω′ for M′ and an accepting view a ∈ F^2qpcp . Recall that the view of M′ (when
querying the core) consists of two parts (a1 , a2 ) ∈ F^2qpcp . The corrector M′ checks that a1 = σ^qpcp . Hence,
to change the first part a1 to some different a′1 , one must change all of it. As for a2 , by the fact that the
PCPP is robust, one must change at least a ρpcp fraction of it to get an accepting a′2 .

   To complete the proof (of robustness for the core part) we still need to show robust soundness. This is
proved in the following lemma.

Lemma 4.15 (Robust Soundness (in the core part)). Let w′ ∈ Fn′ be (δradius/4)-close to some codeword
c′ ∈ C′ and let ℓ∗ ∈ [n · t]. Then,

          Pr_{(I,D)←M′(ℓ∗)} [ ∆( w′|I , D^{-1}(F \ {c′[ℓ∗]}) ) > ρpcp/2 ] ≥ scode · spcp · ρcode/4.


Proof. Since c′ ∈ C′, there exists some x ∈ Fk such that the core of c′ consists of t copies of C(x).
    Recall that we used (w1 , . . . , wt ) to denote the core of w′. We first argue that with constant probability
(over the choice of j ∈ [t]), it holds that w j is δradius-close to C(x).

Claim 4.16.
                            Pr_{j∈[t]} [ ∆( w j , C(x) ) ≤ δradius ] ≥ 1/2.

Proof. Since the core (w1 , . . . , wt ) is half of the length of w′, it holds that ∆((w1 , . . . , wt ), (C(x))^t ) ≤
2∆(w′, c′) ≤ δradius/2. By Markov, this implies that for at most half of the indices j ∈ [t] it holds that
∆(w j , C(x)) > δradius .

    Therefore, throughout the rest of the proof we can fix j such that w j is δradius-close to C(x) (and this
only costs us at most a constant factor in the success probability of the corrector). Note that since it has
correcting radius δradius , the relaxed local corrector is guaranteed to work for such w j .
    Let i∗ be the internal index associated with ℓ∗ ; that is, such that w′[ℓ∗ ] refers to the same symbol as w j [i∗ ].
We first consider the case that the location i∗ that we are trying to correct is not “corrupted” in w j .


Claim 4.17. If w j [i∗ ] = C(x)[i∗ ], then the view of M′ is 1/2-far from any view that would make it output
an incorrect answer (i. e., an answer in F \ {w j [i∗ ]}).

Proof. First observe that by construction, the corrector M′ always outputs either w j [i∗ ] or ⊥. Since
exactly half of M′’s queries were to w j [i∗ ], all of these queries would need to be modified to make it
answer incorrectly.

    Thus, we may assume without loss of generality that w j [i∗ ] ≠ C(x)[i∗ ]. In this case, all the queries to
w j [i∗ ] are consistent with making it output an incorrect answer (i. e., not C(x)[i∗ ]). Our goal now will be to
show that, in this case, the answers to the queries to our robust PCPP will be far from the set of answers
that would make it accept.
     We now want to argue that the input on which we run the PCPP is far from an accepting input.

Claim 4.18. With probability at least scode over the choice of ω, the view w j |Ii∗ ,ω is ρcode-far from
D^{-1}_{i∗,ω}(F \ {C(x)[i∗]}).

Proof. Recall that C is a ρcode-robust relaxed LCC with soundness scode . Since ∆(w j , C(x)) ≤ δradius , by
definition we have the following:

            Pr_{ω∈R} [ ∆( w j |Ii∗ ,ω , D^{-1}_{i∗,ω}(F \ {C(x)[i∗]}) ) > ρcode ] ≥ scode ,

where (Ii∗ ,ω , Di∗ ,ω ) = M(i∗ ; ω).

    Using the fact that the input to the PCPP is far from accepting (as shown in Claim 4.18) and that
the PCPP has robust soundness, we show that the view of M′ is far from any view that would not make it
reject.

Claim 4.19. With probability spcp · scode · ρcode/2 the view of M′ is (ρpcp/2)-far from any view that would
not make it reject.

Proof. Recall that we run the PCPP verifier Vi∗ ,ω with respect to the input oracle (z1 , z2 ), where z1 =
(w j [i∗ ])^qcode and z2 = w j |Ii∗ ,ω , and PCPP proof string πi∗ ,ω .
      The robust soundness of our PCPP implies that with probability at least spcp · µ, the local view of the
PCPP verifier will be ρpcp-far from any view that would make it accept, where

       µ := min_{(z′1,z′2)∈Si∗,ω} max{ ∆( (z1 , z2 ), (z′1 , z′2 ) ) , ∆( P(z′1 , z′2 ), πi∗ ,ω ) }.

    Let (z′1 , z′2 ) ∈ Si∗ ,ω be a pair that minimizes µ. Notice that z′1 = σ^qcode , for some σ ∈ F. If σ ≠ w j [i∗ ], then
µ ≥ ∆((z1 , z2 ), (z′1 , z′2 )) ≥ 1/2 (since the entire first half needs to be changed). Thus, we may assume that
σ = w j [i∗ ].
    The fact that (z′1 , z′2 ) ∈ Si∗ ,ω means that z′2 ∈ D^{-1}_{i∗,ω}(F \ {C(x)[i∗]}). Thus, by Claim 4.18, with
probability scode , it holds that w j |Ii∗ ,ω is ρcode-far from z′2 . Assuming that the latter happens, we have that
µ ≥ ∆((z1 , z2 ), (z′1 , z′2 )) ≥ ρcode/2. We conclude that in any case µ ≥ ρcode/2.
    The claim follows by observing that the PCPP queries that M′ makes account for half of its total
queries.

    This concludes the proof of Lemma 4.15.


4.2.5   Correcting the PCPP Part
In this subsection we complete the analysis of the relaxed corrector presented in Section 4.2.1, by
analyzing it in the case that the location (to be corrected) is in the code’s PCPP part.
     As in the previous case, the first condition of relaxed correctors is immediate from the construction.
Specifically, the fact that C is an RLCC tells us that Di∗ ,ω ∗ (z2 ) = w1 [i∗ ], and so it follows from the perfect
completeness of the PCPP that we will never reject in Step 2. The completeness of the relaxed corrector
for the “core” that was established in the previous section, which guarantees we will never reject in Step
4. Finally, the PCP’s self-corrector will always output the correct value on an uncorrupted codeword, so
we will not output the wrong thing in Step 5.
     As before, we first show that accepting views are far apart.

Lemma 4.20 (Accepting Views are Far Apart (in the PCPP part)). For every i ∈ [n · t + 1, n′ ] and random
string ω′ for M′, the accepting views of M′ have distance ρpcp/5.

Proof. Fix a random string ω′ for M′ and an accepting view a ∈ F^5qpcp . Recall that the view of M′
(when querying the PCPP) consists of four parts (a1 , a2 , a3 , a4 ) ∈ F^qpcp × F^qpcp × F^2qpcp × F^qpcp , where:
(1) a1 corresponds to the PCPP answers (with respect to the invocation of Vi∗ ,ω ∗ ), (2) a2 corresponds to reading
w1 [i∗ ] qpcp times (and checking that all queries return the same value), (3) a3 corresponds to the answers for the
recursive invocation of the corrector on the core, and (4) a4 corresponds to the answers for the PCPP
self-correction procedure.
     We observe that each of these parts is robust by itself: the first part is ρpcp-robust since
the PCPP is ρpcp-robust, the second part is 1-robust (since all answers must be changed to preserve
consistency), the third part is (ρpcp/2)-robust by Lemma 4.15, and the fourth part is ρpcp-robust by the
robustness of the PCPP self-correction.
     Overall we get that accepting views are at least (ρpcp/5)-far from each other.

    In the rest of this subsection we prove the following lemma, which shows that the second condition
(robust soundness) is satisfied.

Lemma 4.21. For every string w′ = ((w1 , . . . , wt ), (πi,ω )i∈[n],ω∈R ) that is (δradius/4)-close to a valid
codeword c′ ∈ C′, and index ℓ∗ ∈ [n · t + 1, . . . , n′ ], the following inequality holds:

       Pr_{(I,D)←M′(ℓ∗)} [ ∆( w′|I , D^{-1}(F \ {c′[ℓ∗]}) ) > ρpcp/5 ] ≥ (scode · spcp · ρcode^2)/8.

Proof. Let i∗ ∈ [n] and ω ∗ ∈ R be the indices that are associated with ℓ∗ (that is, such that πi∗ ,ω ∗ is the
PCPP oracle in which the index ℓ∗ resides). Let x ∈ Fk be such that C′(x) = c′. Recall that the core of c′
consists of t copies of C(x).
    Let Good denote the event that the view of M′ is (ρpcp/5)-far from any view that would make it
return an incorrect value (i. e., different from C′(x)[ℓ∗ ] or ⊥).
    We will show that Pr[Good] ≥ (scode · spcp · ρcode^2)/8. This is done by a careful (and somewhat
tedious) case analysis.

Claim 4.22. If w1 |Ii∗ ,ω ∗ is (ρcode/2)-far from C(x)|Ii∗ ,ω ∗ , then Pr[Good] ≥ (scode · spcp · ρcode^2)/8.


                         T HEORY OF C OMPUTING, Volume 16 (18), 2020, pp. 1–68                                    26
                                          R ELAXED L OCALLY C ORRECTABLE C ODES

Proof. If w1 |Ii∗ ,ω ∗ is (ρcode/2)-far from C(x)|Ii∗ ,ω ∗ then, with probability ρcode/2 over the choice of
i′ ∈ Ii∗ ,ω ∗ , it holds that w1 [i′ ] ≠ C(x)[i′ ]. By Lemma 4.15, with probability at least scode · spcp · ρcode/4, the
local view of the recursive call to the corrector will be (ρpcp/2)-far from any view that would make that
recursive corrector return anything but C(x)[i′ ] or ⊥. Recall that the corrector reads the value w1 [i′ ] qpcp
times, checks that the reads are all equal, and denotes the common value by η. Thus, η = w1 [i′ ] and moreover,
the view of the recursive corrector is (ρpcp/5)-far from any view in which η ≠ w1 [i′ ].
     Since the original corrector (that is, the one that initiates the recursive call) compares η with the value
given by the recursive invocation, with probability (ρcode/2) · (scode · spcp · ρcode/4), its view is (ρpcp/5)-far
from any view that would make it output anything but ⊥.

    Thus, in the rest of the analysis we may assume that w1 |Ii∗ ,ω ∗ is (ρcode/2)-close to C(x)|Ii∗ ,ω ∗ .

Claim 4.23. If w1 [i∗ ] ≠ C(x)[i∗ ], then Pr[Good] ≥ spcp · ρcode/4.

Proof. Consider the input (z1 , z2 ) on which Vi∗ ,ω ∗ is invoked. Recall that z1 = (w1 [i∗ ])^qcode and z2 =
w1 |Ii∗ ,ω ∗ . By the strongly canonical soundness of the PCPP, with probability spcp · µ, the local view of
the PCPP verifier will be ρpcp-far from any view that would make it accept, where:

       µ = min_{(z′1,z′2)∈Si∗,ω∗} max{ ∆( (z1 , z2 ), (z′1 , z′2 ) ) , ∆( P(z′1 , z′2 ), πi∗ ,ω ∗ ) }.

     We show that µ ≥ ρcode/4. Let (z′1 , z′2 ) ∈ Si∗ ,ω ∗ be a pair that minimizes µ. Then, there exists σ ∈ F such that
z′1 = σ^qcode and Di∗ ,ω ∗ (z′2 ) = σ . If σ ≠ w1 [i∗ ], then µ ≥ ∆((z1 , z2 ), (z′1 , z′2 )) ≥ 1/2, since the first part
needs to be entirely changed. Thus, we may assume that σ = w1 [i∗ ].
     Assume for contradiction that z′2 is (ρcode/2)-close to w1 |Ii∗ ,ω ∗ . Then, by the triangle inequality, z′2 is
(ρcode/2 + ρcode/2) = ρcode-close to C(x)|Ii∗ ,ω ∗ . However,

                    Di∗ ,ω ∗ (C(x)|Ii∗ ,ω ∗ ) = C(x)[i∗ ] ≠ w1 [i∗ ] = Di∗ ,ω ∗ (z′2 ).

   Thus, by the first part of the robust RLCC definition, it must be the case that z′2 is ρcode-far from
C(x)|Ii∗ ,ω ∗ , which is a contradiction. Thus,

                     µ ≥ ∆( z′2 , z2 )/2 = ∆( z′2 , w1 |Ii∗ ,ω ∗ )/2 ≥ ρcode/4.

   So, with probability at least spcp · ρcode/4, the local view of the PCPP verifier will be ρpcp-far from
any view that would make it accept. Since the queries to the PCPP account for 1/5 of the corrector’s total
queries, the claim follows.

    Thus, in the rest of the analysis we may assume also that w1 [i∗ ] = C(x)[i∗ ].

Claim 4.24. If πi∗ ,ω ∗ is ρcode -far from Pi∗ ,ω ∗ ((C(x)[i∗ ])qcode ,C(x)|Ii∗ ,ω ∗ ), then Pr[Good] ≥ spcp · ρcode /4.

Proof. Consider yet again the input (z1 , z2 ) on which Vi∗ ,ω ∗ is invoked. Recall that z1 = (w1 [i∗ ])qcode =
(C(x)[i∗ ])qcode and z2 = w1 |Ii∗ ,ω ∗ . By the strongly canonical soundness of the PCPP, with probability

                          T HEORY OF C OMPUTING, Volume 16 (18), 2020, pp. 1–68                                         27
                           T OM G UR , G OVIND R AMNARAYAN , AND RON D. ROTHBLUM

spcp · µ, the local view of the PCPP verifier will be ρpcp -far from any view that would make it accept,
where:

                  µ = min_{(z01 ,z02 )∈Si∗ ,ω ∗ } max{ ∆((z1 , z2 ), (z01 , z02 )), ∆(P(z01 , z02 ), πi∗ ,ω ∗ ) }.

    We now show that µ ≥ ρcode /4. Suppose toward a contradiction that µ < ρcode /4. Let (z01 , z02 ) ∈ Si∗ ,ω ∗
be a pair that minimizes µ. That is,

                  µ = max{ ∆((z1 , z2 ), (z01 , z02 )), ∆(P(z01 , z02 ), πi∗ ,ω ∗ ) }.                                 (4.3)

First observe that if z01 ≠ (C(x)[i∗ ])qcode , then µ ≥ ∆ ((z01 , z02 ), (z1 , z2 )) ≥ 1/2. Thus, we may assume that
z01 = (C(x)[i∗ ])qcode = z1 .
     Equation (4.3) implies that µ ≥ ∆ ((z01 , z02 ), (z1 , z2 )) and so ∆(z02 , z2 ) ≤ 2µ < ρcode /2. But, by our as-
sumption, z2 = w1 |Ii∗ ,ω ∗ is (ρcode /2)-close to C(x)|Ii∗ ,ω ∗ and so, by the triangle inequality, ∆(z02 ,C(x)|Ii∗ ,ω ∗ ) <
ρcode . By the first property of a robust RLCC, this implies that z02 = C(x)|Ii∗ ,ω ∗ .
    Using Equation (4.3) yet again, we obtain:

                  µ ≥ ∆(P(z01 , z02 ), πi∗ ,ω ∗ ) = ∆(P((C(x)[i∗ ])qcode ,C(x)|Ii∗ ,ω ∗ ), πi∗ ,ω ∗ ),

which, by the claim's hypothesis, is at least ρcode . This contradicts the assumption that µ < ρcode /4, and so
in any case µ ≥ ρcode /4.
    Thus, with probability spcp · ρcode /4, the local view of the PCPP verifier will be ρpcp -far from any
view that would make it accept. Since this view accounts for 1/5 of the corrector’s queries, the claim
follows.

      Hence, we may assume that πi∗ ,ω ∗ is ρcode -close to Pi∗ ,ω ∗ ((C(x)[i∗ ])qcode ,C(x)|Ii∗ ,ω ∗ ).
Claim 4.25. Pr[Good] ≥ spcp .

Proof. The self-correction procedure of the PCPP is run on the string πi∗ ,ω ∗ , which is ρcode -close to
Pi∗ ,ω ∗ ((C(x)[i∗ ])qcode ,C(x)|Ii∗ ,ω ∗ ). Since ρcode ≤ δpcp-correction-radius , this means that the string is within
the correcting radius of the PCPP. The claim follows by the robust local correction of the PCPP and the
fact that these queries are a 1/5 fraction of the queries that M0 makes.

      This concludes the proof of Lemma 4.21.

4.3     Exponential-length, constant-query, strongly canonical and self-correctable PCPP
In this section we construct self-correctable PCPPs for affine subspaces, with exponential length, constant
query complexity, and strong soundness with respect to a canonical proof. Our construction is closely
related to the Hadamard-based inner PCPP in [2].

Theorem 4.26. Let n ≥ k be integers. Let A ∈ {0, 1}k×n be a matrix and let b ∈ {0, 1}k be a vector. Then,
the set SA,b = {x ∈ {0, 1}n : Ax = b} has a self-correctable, strongly canonical, Ω(1)-robust PCPP with
O(1) queries and O(n) randomness complexity, where the canonical proof string of the PCPP is a linear
function of the input. Furthermore, the PCPP has a linear verifier.


     We note that, while the PCPP verifier in Theorem 4.26 is robust and checks a conjunction of linear
functions of the symbols queried, we do not actually need these properties here. These properties are used
to ensure that the resulting composed code behaves well under further composition, whereas this PCPP is
used only in the final composition step of our code construction. We nevertheless prove these properties in
order to formally apply Theorem 4.13.

Proof. Fix a matrix A ∈ {0, 1}k×n and vector b ∈ {0, 1}k . We construct an (exponential length) self-
correctable, strongly canonical PCPP for the set SA,b , where the canonical proof string is a linear function
of the input.
    For a given input x ∈ SA,b , the canonical PCPP proof string is the Hadamard encoding of the string x;
namely, the truth table of the function π : {0, 1}n → {0, 1} defined as π(z) = hz, xi, for every z ∈ {0, 1}n .
    Given access to the input oracle x ∈ {0, 1}n and an (alleged) proof oracle π : {0, 1}n → {0, 1}, the
PCPP verifier V chooses at random w, z ∈ {0, 1}n and performs the following tests:

   • Linearity Test: check that π(w) + π(z) = π(z + w) (note that the first addition refers to addition
     of scalars over {0, 1} whereas the second refers to vector addition over the vector space {0, 1}n ).

   • Circuit Test: choose at random v ∈ {0, 1}k and check that π(z) + π(z + vA) = hv, bi.

   • Input Proximity Test: choose at random i ∈ [n] and check that π(z) + π(z + ei ) = xi , where
     ei ∈ {0, 1}n is the unit vector with 1 in its i-th coordinate and 0 everywhere else.

    The verifier V accepts if and only if all of the tests above were successful. We proceed to show that
this PCPP satisfies the required conditions.
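These three tests, together with the canonical proof, can be sketched in code. The following is a toy Python illustration (the helper names and the explicit truth-table representation of π are our own; materializing the Hadamard encoding is feasible only for tiny n):

```python
import itertools, random

def dot(u, v):
    """Inner product over GF(2)."""
    return sum(a & b for a, b in zip(u, v)) % 2

def xor(u, v):
    return tuple(a ^ b for a, b in zip(u, v))

def hadamard_proof(x):
    """Canonical proof: truth table of pi(z) = <z, x> over all z in {0,1}^n."""
    return {z: dot(z, x) for z in itertools.product((0, 1), repeat=len(x))}

def verify(x, A, b, pi):
    """One invocation of the PCPP verifier: linearity, circuit, and
    input-proximity tests, each reading a constant number of proof positions."""
    n, k = len(x), len(b)
    rand_vec = lambda m: tuple(random.randrange(2) for _ in range(m))
    w, z, v = rand_vec(n), rand_vec(n), rand_vec(k)
    # Linearity test: pi(w) + pi(z) = pi(w + z) over GF(2).
    if pi[w] ^ pi[z] != pi[xor(w, z)]:
        return False
    # Circuit test: pi(z) + pi(z + vA) = <v, b>; note that <vA, x> = <v, Ax>.
    vA = tuple(dot(v, col) for col in zip(*A))
    if pi[z] ^ pi[xor(z, vA)] != dot(v, b):
        return False
    # Input proximity test: pi(z) + pi(z + e_i) = x_i.
    i = random.randrange(n)
    e_i = tuple(int(j == i) for j in range(n))
    return pi[z] ^ pi[xor(z, e_i)] == x[i]

# Completeness on a tiny instance: A x = b, and pi is the honest proof.
A = [(1, 0, 1), (0, 1, 1)]
b = (1, 1)
x = (1, 1, 0)                      # A x = (1, 1) = b
pi = hadamard_proof(x)
assert all(verify(x, A, b, pi) for _ in range(100))
```

With the honest proof every test passes with probability 1, matching the completeness analysis below; each test queries the proof in only a constant number of positions.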

Linearity. The canonical proof is the Hadamard encoding of the input, which is a linear code. The fact
that the PCPP has a linear verifier follows because it checks a conjunction of linear constraints.

Self-correction. Follows immediately from the fact that the Hadamard code is a 2-query locally
correctable code.

Robustness. Follows trivially from the fact that the number of queries is O(1).

Completeness. Let x ∈ SA,b (i. e., Ax = b) and let π be the Hadamard encoding of x. Since π is a linear
function, the linearity test passes with probability 1. For the circuit test observe that

                    π(z) + π(z + vA) = hz, xi + hz + vA, xi = hvA, xi = hv, Axi = hv, bi

as desired. For the input proximity test observe that

                            π(z) + π(z + ei ) = hz, xi + hz + ei , xi = hei , xi = xi

which completes our analysis.


Strongly canonical soundness. Let x ∈ {0, 1}n and fix a proof string πe. We need to show that V rejects
with probability Ω(δ ), where

                              δ = min_{x0 ∈SA,b } max{ ∆(x, x0 ), ∆(P(x0 ), πe) }.

We assume without loss of generality25 that δ ≤ 1/2.

    The proof goes as follows. We first use the celebrated linearity test of Blum, Luby and Rubinfeld [9]
(see also [23]) to conclude that πe must be very close to the Hadamard encoding of some string x0 . Note
that x0 may be either close to (or even equal to) x, or far from x. Moreover, using the local correctability of
the Hadamard code, we can treat πe as though it were exactly the Hadamard encoding of x0 .

    Suppose x0 6∈ SA,b . Then, due to the distance of the Hadamard code, π is very far from the Hadamard
encoding of any string x∗ ∈ SA,b , so the verifier needs to reject with constant probability. Ideally, the
verifier would like to directly compute Ax0 and compare it to b, but it only has access to the Hadamard
code of x0 , not x0 itself (and computing Ax0 would require many queries). Similarly to the case of the
Hadamard PCP, we observe that the requirement of Ax0 = b is a conjunction of k affine functions on x0 .
The high level idea is to take a random linear combination of these functions so that we end up with
just a single linear function. More specifically, the verifier selects a random vector v ∈ {0, 1}k . We
observe that if Ax0 = b, then hvA, x0 i = hv, bi, whereas if Ax0 ≠ b then, with probability 1/2, it holds that
hvA, x0 i ≠ hv, bi. Thus, our verifier rejects if hvA, x0 i ≠ hv, bi (where we observe that computing the inner
product of x0 with any fixed vector requires just a single query to the Hadamard encoding of x0 ).
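The probability-1/2 claim for the random linear combination can be checked exhaustively on a toy instance (the matrix A, vector b, and input below are our own illustrative choices):

```python
import itertools

def dot(u, v):
    """Inner product over GF(2)."""
    return sum(a & b for a, b in zip(u, v)) % 2

A = [(1, 0, 1), (0, 1, 1)]           # a 2 x 3 matrix over GF(2)
b = (1, 0)
x_bad = (1, 1, 0)                    # A x_bad = (1, 1) != b
k = len(A)

# If A x' != b then <vA, x'> != <v, b> for exactly half of all v in {0,1}^k,
# since <vA, x'> - <v, b> = <v, A x' + b> and A x' + b is a nonzero vector.
mismatches = 0
for v in itertools.product((0, 1), repeat=k):
    vA = tuple(dot(v, col) for col in zip(*A))
    mismatches += dot(vA, x_bad) != dot(v, b)
print(mismatches / 2 ** k)           # -> 0.5
```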

    We are left with the case that x0 ∈ SA,b . In this case, we notice that x0 is the minimizing string for δ
and so δ = ∆(x, x0 ). So the verifier only needs to reject with probability proportional to ∆(x, x0 ). To do
this, it suffices to pick i ∈ [n] at random, read the ith bit of the input x, compare it to the ith bit of x0 , and
reject if they are different. We proceed to the formal proof.

    The analysis of the [9] test shows that if πe is δ 0 -far from every linear function (for all sufficiently small
δ 0 > 0), then, with probability Ω(δ 0 ) over the choice of z and w, it holds that πe(z) + πe(w) ≠ πe(z + w). Thus,
we may assume without loss of generality that πe is δ /4-close to some linear function π : {0, 1}n → {0, 1}
(since otherwise the verifier rejects with probability Ω(δ ) when performing the linearity test). Let x0 ∈ {0, 1}n
be such that π(z) = hz, x0 i, for every z ∈ {0, 1}n .
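As a toy sanity check of the [9] linearity test, one can compute its exact rejection probability by enumerating all query pairs; the functions and parameters below are our own illustrative choices:

```python
import itertools

def blr_reject_prob(f, n):
    """Exact rejection probability of the BLR test f(w) + f(z) != f(w + z),
    computed by exhaustive enumeration over all pairs (w, z) in {0,1}^n."""
    pts = list(itertools.product((0, 1), repeat=n))
    rejects = sum(1 for w in pts for z in pts
                  if f(w) ^ f(z) != f(tuple(a ^ b for a, b in zip(w, z))))
    return rejects / len(pts) ** 2

linear = lambda z: z[0] ^ z[1]       # a linear function: never rejected
and_fn = lambda z: z[0] & z[1]       # AND is 1/4-far from every linear function

print(blr_reject_prob(linear, 2))    # -> 0.0
print(blr_reject_prob(and_fn, 2))    # -> 0.375
```

A genuinely linear function is never rejected, while the 1/4-far AND function is rejected with constant probability, as the Ω(δ 0 ) bound predicts.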

    Consider first the case that x0 6∈ SA,b (i. e., Ax0 6= b). In this case the verifier rejects in the circuit test




  25 Indeed, for fixed constant values of δ , the requirement that the verifier rejects with probability Ω(δ ) is trivial.




with probability:

     Pr[πe(z) + πe(z + vA) ≠ hv, bi] ≥ Pr[π(z) + π(z + vA) ≠ hv, bi] − 2 · δ /4
                                     = Pr[hz, x0 i + hz + vA, x0 i ≠ hv, bi] − δ /2
                                     = Pr[hvA, x0 i ≠ hv, bi] − δ /2
                                     = Pr[hv, Ax0 + bi ≠ 0] − δ /2
                                     = 1/2 − δ /2
                                     ≥ 1/4,

where the first inequality is by the fact that z and z + vA are each individually uniform in {0, 1}n and our
assumption that πe is δ /4-close to π, and the penultimate equality uses the fact that Ax0 ≠ b (so that Ax0 + b
is a nonzero vector, and hv, Ax0 + bi ≠ 0 with probability exactly 1/2 over the choice of v).
    Now consider the case that x0 ∈ SA,b (i. e., Ax0 = b). Note that by the definition of δ , it holds that
δ ≤ max{ ∆(x, x0 ), ∆(P(x0 ), πe) } ≤ max{ ∆(x, x0 ), δ /4 } and so ∆(x, x0 ) ≥ δ . Thus, the verifier rejects in
the Input Proximity Test with probability at least

                      Pr[πe(z) + πe(z + ei ) ≠ xi ] ≥ Pr[π(z) + π(z + ei ) ≠ xi ] − 2 · δ /4
                                                   = Pr[π(ei ) ≠ xi ] − δ /2
                                                   = Pr[xi0 ≠ xi ] − δ /2
                                                   = ∆(x, x0 ) − δ /2
                                                   ≥ δ /2.

4.4    Polynomial-length, polylog-query, strongly canonical, self-correctable and robust
       PCPP
In Section 4.3 we constructed a self-correctable, strongly canonical, robust PCPP with a constant number
of queries and exponential length. That PCPP does not suffice for our end goal of an RLCC with polynomial
block length and constant query complexity.26
    As described in the introduction, our first composition step will compose the low-degree extension
code with a PCPP, which we call the “intermediate” PCPP. Thus, we want to construct a suitable PCPP.
In this section we construct the intermediate PCPPs we use in our code construction. More formally, we
prove the following theorem.
Theorem 4.27. Let A ∈ {0, 1}k×n be a matrix and let b ∈ {0, 1}k be a vector. There exist universal
constants c ∈ N and s, ρ ∈ (0, 1) such that for every m ≤ log(n)/loglog(n), the set SA,b = {x ∈ {0, 1}n : Ax = b}
has a PCPP (over the binary alphabet) with the following properties:
  26 By composing that PCPP with a low-degree extension code (of suitable parameters), we obtain relaxed locally correctable

codes with quasi-polynomial block length.



    • The PCPP is ρ-robust with strong soundness s (with respect to canonical proofs).

    • The PCPP is ρ-robust self-correctable with soundness s.

    • The canonical PCPP proof string is a GF(2)-linear function of the input x. The length of the
      PCPP proof string is poly(n).

    • The predicates that the PCPP verifier and self-corrector compute are also GF(2)-linear func-
      tions. Both the verifier and the self-corrector have query complexity O(m3 nc/m ) and randomness
      complexity polylog(n) · n1/m .

    Indeed, we will use this theorem with m = log(n)/loglog(n), which yields a PCPP with polylogarith-
mic query complexity and polynomial length.27

    Recall that a strongly canonical PCPP means that each valid input x has a unique accepting proof,
and the verifier must reject any other proof with probability proportional to how far away the proof is
from the correct one. Additionally, self-correctability refers to the fact that the canonical proofs form a
locally correctable code, while robustness stipulates that the view of the verifier on any invalid input x
must be far from an accepting view. See Section 4.1 for details.
    We note that the proof of Lemma 4.28 is closely related to (and influenced by) the construction of a
linear inner proof system (LIPS) in [27, Theorem 5.19]. The following lemma is the main step in proving
Theorem 4.27.

Lemma 4.28. Let A ∈ {0, 1}k×n be a matrix and let b ∈ {0, 1}k be a vector. For every m ≤ log(n)/loglog(n)
and for every finite field F of characteristic 2 and order |F| ≥ Ω(m · n1/m ), the set SA,b = {x ∈ {0, 1}n : Ax = b}
has a strongly canonical PCPP over the alphabet F, with query complexity O(m2 · n2/m ) and randomness
complexity O(m2 · n2/m · log(|F|)).
     Furthermore, (1) the canonical PCPP proof string is an O(m)-variate total degree O(m · n1/m ) polyno-
mial, (2) the canonical PCPP proof string viewed as a bit string (in the natural way), is a GF(2)-linear
function of the input, (3) the PCPP verifier, viewed as a function on bits, computes a GF(2) affine
relation on its view, and (4) the PCPP verifier only makes a single query to the input.

   Indeed, this lemma gives the desired PCPP for Theorem 4.27, modulo the self-correction and
robustness properties.
   In Section 4.4.1 we prove Lemma 4.28 and then in Section 4.4.2 we use Lemma 4.28 to prove
Theorem 4.27.


4.4.1    Proof of Lemma 4.28

Fix A ∈ {0, 1}k×n and b ∈ {0, 1}k . We construct a strongly canonical self-correctable PCPP for verifying
membership in SA,b . Let C : {0, 1}n → {0, 1}k be a circuit composed only of fan-in 2 parity gates such
  27 We believe that with more care our approach would yield a PCP with only logarithmic randomness (rather than polyloga-

rithmic). However, we do not optimize the randomness complexity since it is not required for our main results.



that on input x ∈ {0, 1}n it outputs y ∈ {0, 1}k where y is the all zeros string if and only if x ∈ SA,b .28 We
denote the size (i. e., number of wires) of C by N and note that N = O(n · k).
    Let F be a field of characteristic 2, let H ⊆ F be an arbitrary subset such that {0, 1} ⊆ H and
|H| = ⌈N 1/m ⌉, and let |F| = Ω(|H| · m). Note that |H m | ≥ N. We associate the integers in [N] with the first N
elements of H m , given in lexicographic order.
    For a given input x ∈ {0, 1}n , we extend x to be a string in {0, 1}N by associating the i-th entry with
the value of the i-th wire of C on input x. We list the wires in order of increasing depth—this puts the
input x in the first n positions of the extended string and the output C(x) in the last k positions.
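The extension of x to wire values can be sketched as follows (a toy Python sketch; the exact gate ordering and the duplication of the output wires at the end are our own simplifications of the parity circuit described in footnote 28):

```python
# For each output coordinate i we accumulate sum_j A_ij x_j + b_i (mod 2)
# with fan-in-2 parity gates, recording every intermediate wire value.
def wire_values(A, b, x):
    wires = list(x)                    # the input occupies the first n wires
    outputs = []
    for row, bi in zip(A, b):
        acc = bi                       # fold the constant b_i into the chain
        for aij, xj in zip(row, x):
            acc ^= aij & xj            # one parity gate per input bit
            wires.append(acc)          # intermediate wire value
        outputs.append(acc)
    return wires + outputs             # C(x) occupies the last k wires

A = [(1, 0, 1), (0, 1, 1)]
b = (1, 1)
x = (1, 1, 0)                          # A x = (1, 1) = b, so x is in S_{A,b}
assert wire_values(A, b, x)[-len(b):] == [0, 0]
```

As in the text, the extended string starts with x and ends with C(x), which is all zeros exactly when x ∈ SA,b .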
    Let ` = 4m + 4. We define a polynomial X : F` → F to be the (unique) individual degree |H| − 1
polynomial such that for every i ∈ [N], it holds that X(i) = xi and for every i ∈ H` \[N] it holds that
X(i) = 0.29
    We define a function φA,b : (Hm )3 ×({0, 1})4 → {0, 1} such that for every i1 , i2 , i3 ∈ Hm and α1 , α2 , α3 ,
σ ∈ {0, 1} we define φA,b (i1 , i2 , i3 , α1 , α2 , α3 , σ ) = 1 if there is a gate of the form

                                             α1 · xi1 + α2 · xi2 + α3 · xi3 + σ = 0

in the circuit and φA,b (i1 , i2 , i3 , α1 , α2 , α3 , σ ) = 0 otherwise. We first extend φA,b to be a function φA,b :
H3m+4 → {0, 1} by defining it to be zero outside of H3m ×{0, 1}4 . Let φ̂A,b : F3m+4 → F be the low-degree
extension of φA,b .
    The idea behind the definition of φA,b is that it encodes local constraints, that, put together, encode the
global correctness of a computation performed by the circuit C. Namely, φA,b is nonzero only at tuples
that encode gates present in C, and is zero otherwise. Suppose a prover gives the verifier a polynomial that
the prover claims is 0 at all tuples that encode gates in C. The verifier can check this claim by multiplying
that polynomial with φ̂A,b and checking that the product is identically 0. This property will be particularly
useful if we create a polynomial that is 0 at all tuples encoding gates in C if and only if the assignment
to the wires satisfies the relations imposed by the gates. In this case, we will be able to check that the
assignment that the prover gives is valid via polynomial identity testing on this product polynomial. This
motivates our definition of the polynomial P0 below.
    Recall that ` = 4m + 4. We define a polynomial P0 : F` → F such that for every z ∈ F` , where
z = (i1 , i2 , i3 , i4 , α1 , α2 , α3 , σ ) ∈ (Fm )4 × F4 , the following equation holds.

    P0 (z) = φ̂A,b (i1 , i2 , i3 , α1 , α2 , α3 , σ ) · (α1 · X(0`−m , i1 ) + α2 · X(0`−m , i2 ) + α3 · X(0`−m , i3 ) + σ )
             + X(0`−m , i4 ) · (1 − X(0`−m , i4 )).

    Notice that P0 is the sum of two terms. The first term is exactly what we described beforehand: a
product of φ̂A,b with a simple polynomial that checks that the assignment to the wires satisfies each
gate of the circuit C. The purpose of the second term is to enforce booleanity of the input. Namely,
X(0`−m , i4 ) · (1 − X(0`−m , i4 )) is nonzero whenever X(0`−m , i4 ) is non-boolean, and so this means that

  28 For each coordinate i ∈ [k], we can compute ∑_j Ai j x j + bi mod 2 with at most n + 1 parity gates with 0/1 weights. It will be
convenient to include wires with 0-weights for notation purposes. We can do this for all coordinates i to get the desired circuit.
  29 The reader may find it mysterious why X was defined as an `-variate polynomial rather than an m-variate one (since we are
essentially taking the low-degree extension of the computation, which is of size N ≤ |H|m ). Jumping ahead, we note that the
reason we do so is to facilitate the “bundling” of this polynomial with other `-variate polynomials that are defined below.



P0 is identically 0 on H` only if X encodes a boolean assignment x such that C(x) = 0. This will be shown
formally in Claim 4.31.
    Thus, we would like to be able to check that P0 is identically 0 over H` . Doing so is not immediate
since P0 has degree O(m · |H|) and so it does not necessarily have to be identically 0 on all of F` (indeed,
typically it will not be). As usual in the PCP and interactive proof literature, we enforce this condition
using the sumcheck approach of Lund et al. [39].
    The key idea is to define additional auxiliary polynomials P1 , . . . , P` : F` → F (often called the
sumcheck polynomials), defined as follows. For every (z1 , . . . , z` ) ∈ F` :

                      Pi (z1 , . . . , z` ) = ∑_{h∈H} Pi−1 (z1 , . . . , zi−1 , h, zi+1 , . . . , z` ) · zi^h ,              (4.4)

where by exponentiating in h, we mean that we associate H with the integers {0, . . . , |H| − 1} in some
canonical way, and simply exponentiate by the corresponding integer. Notice that Equation (4.4) can be
equivalently rewritten as:

                  Pi (z1 , . . . , z` ) = ∑_{h1 ,...,hi ∈H} P0 (h1 , . . . , hi , zi+1 , . . . , z` ) · z1^h1 · · · zi^hi .   (4.5)
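The recursion of Equation (4.4) can be sketched concretely. The following toy Python snippet represents each Pi as a plain function and checks the completeness fact used below, namely that if P0 vanishes on H ` then P` vanishes on all of F` (the field GF(5), the choice H = {0, 1}, and the sample P0 are our own toy parameters, chosen for simplicity rather than the paper's characteristic-2 setting):

```python
import itertools

F = 5                                  # toy field modulus (our choice)
H = (0, 1)                             # H associated with the exponents {0, 1}
ELL = 3                                # number of variables l

def next_P(P_prev, i):
    """P_i(z) = sum_{h in H} P_{i-1}(z_1..z_{i-1}, h, z_{i+1}..z_l) * z_i^h."""
    j = i - 1                          # 0-indexed coordinate being summed out
    def P_i(z):
        return sum(P_prev(z[:j] + (h,) + z[j + 1:]) * pow(z[j], h, F)
                   for h in H) % F
    return P_i

def P0(z):
    # Vanishes on H^l (the factor z_2 (z_2 - 1) is 0 for z_2 in {0, 1}) but
    # not on all of F^l; e.g., P0((1, 2, 0)) = 2.
    return z[0] * z[1] * (z[1] - 1) % F

Ps = [P0]
for i in range(1, ELL + 1):
    Ps.append(next_P(Ps[i - 1], i))

cube = list(itertools.product(range(F), repeat=ELL))
# Completeness fact: P_0 = 0 on H^l forces P_l = 0 on all of F^l, even though
# the intermediate P_i need not vanish everywhere.
assert all(Ps[ELL](z) == 0 for z in cube)
assert any(Ps[1](z) != 0 for z in cube)
```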


The PCPP proof string. The PCPP proof string bundles the polynomials X, P0 , . . . , P` in the following way.
Let λ , λ0 , . . . , λ` be distinct elements of F. We define a polynomial Q : F`+1 → F as follows. For every
z ∈ F` we define Q(λ , z) = X(z) and Q(λi , z) = Pi (z), for every i ∈ {0, . . . , `}. We extend the definition
of Q to all of F`+1 by interpolation (in the first variable).
    Observe that Q has degree ` + 1 in its first variable and total degree O(|H| · m).

The PCPP verifier. Recall that the PCPP verifier is given access to two oracles, the input oracle
x ∈ {0, 1}n and an alleged proof oracle Q : F`+1 → F. The verifier first runs a total degree test (see
Lemma 4.10) on Q, with respect to degree O(m · |H|). If the test fails the verifier immediately rejects.
    We “un-bundle” the polynomial Q : F`+1 → F as follows. Let X : F` → F and P0 , . . . , P` be defined
as X(z) = Q(λ , z) and Pi (z) = Q(λi , z), where λ , λ0 , . . . , λ` ∈ F are the fixed field elements as defined
above. In the sequel, whenever we say that the verifier queries one of the polynomials Q, X, P0 , . . . , P` ,
we actually mean that it reads these values via the self-correction procedure (see Lemma 4.9) applied to
Q (with respect to degree O(m · |H|)).
    The verifier runs the following additional tests. If any test fails then the verifier rejects. Otherwise, it
accepts.

   1. Individual Degree Test for First Variable of Q: The verifier chooses at random (z1 , z2 , . . . , z`+1 ) ∈
      F`+1 . It checks that the value Q(z1 , . . . , z`+1 ) is consistent with the value obtained by interpolating
      from Q(λ , z2 , . . . , z`+1 ) and {Q(λi , z2 , . . . , z`+1 )}i∈{0,...,`} (where we recall that all values are read
      via self-correction from Q).

   2. Low Individual Degree |H| − 1 Test for X. The verifier runs the individual degree |H| − 1 test on
      X (see Lemma 4.11).


   3. Zero Output (and Padding) for X: check that the |H|` − N + k bit suffix of X|H` is the all zero
      string. An efficient procedure for doing so is described in, e. g., [44, Proposition 3.9].
   4. Input Proximity Test: choose at random i ∈ [n] and check that X(0`−m , i) = xi . (Recall that we
      use the local correction procedure for this.)
   5. X vs. P0 Test: choose at random z ∈ F` . Let i1 , i2 , i3 , i4 ∈ Fm and α1 , α2 , α3 , σ ∈ F be such that
      z = (i1 , i2 , i3 , i4 , α1 , α2 , α3 , σ ). Check that

          P0 (z) = φ̂A,b (i1 , i2 , i3 , α1 , α2 , α3 , σ ) · (α1 · X(0`−m , i1 ) + α2 · X(0`−m , i2 ) + α3 · X(0`−m , i3 ) + σ )
                   + X(0`−m , i4 ) · (1 − X(0`−m , i4 )).

   6. Sumcheck Test: choose at random (z1 , . . . , z` ) ∈ F` and check that for every i ∈ [`]:

                      Pi (z1 , . . . , z` ) = ∑_{hi ∈H} Pi−1 (z1 , . . . , zi−1 , hi , zi+1 , . . . , z` ) · zi^hi ,

      where the values {Pi−1 (z1 , . . . , zi−1 , h, zi+1 , . . . , z` )}h∈H are read via the self-correction procedure.
   7. P` Zero Test: choose random z ∈ F` and check that P` (z) = 0.

Query complexity. It can be readily verified that each one of our tests makes at most O(m · |H|) calls
to the self-correction procedure. The latter procedure requires O(m · |H|) queries per call, and so we obtain
query complexity O(m2 · |H|2 ).

Randomness complexity. It can also be readily verified that each of our tests by itself uses O(log(|F|m ))
randomness. In addition, for each query we invoke the self-correction procedure, which uses an additional
O(log(|F|m )) randomness. Thus, the total amount of randomness used is O(m · |H| · log(|F|m )).

Completeness. Suppose that we have an input x ∈ {0, 1}n such that C(x) = 0, and that we construct
the polynomials X, P0 , . . . , P` from this input and bundle them into Q as specified above.
    Clearly the Zero Suffix Test passes, since this is how we constructed the polynomial. The input
proximity test passes because X(0`−m , i) = xi for all i by construction. The X vs. P0 test passes because
P0 is constructed correctly from X (and in particular, X is boolean-valued inside H` ). The sumcheck test
passes because Pi+1 is constructed correctly from Pi .
    Only the fact that the Zero Test for P` passes remains to be analyzed. Since x is a boolean string such
that C(x) = 0, we conclude that P0 |H` ≡ 0. We now establish, by induction, that Pi |Fi ×H`−i ≡ 0 for all
i ∈ [`]. Assume that the claim holds for i − 1, and fix a point ~z = (z1 , . . . , z` ) ∈ Fi × H`−i . By definition,
we have that

                      Pi (z1 , . . . , z` ) = ∑_{hi ∈H} Pi−1 (z1 , . . . , zi−1 , hi , zi+1 , . . . , z` ) · zi^hi .

For every hi ∈ H, the point (z1 , . . . , zi−1 , hi , zi+1 , . . . , z` ) lies in Fi−1 × H`−i+1 , and so, by the inductive
hypothesis, Pi−1 (z1 , . . . , zi−1 , hi , zi+1 , . . . , z` ) = 0. Thus Pi (~z) = 0.
    In particular, P` |F` ≡ 0, and so the Zero Test passes.


Strongly canonical soundness. Let x ∈ {0, 1}n and fix a proof string Qe : F`+1 → F. We need to show
that V rejects with probability Ω(δ ), where

                              δ = min_{x0 ∈SA,b } max{ ∆(x, x0 ), ∆(Q(x0 ), Qe) },                          (4.6)


where Q(x0 ) denotes the canonical PCPP proof string for input x0 .
    We assume without loss of generality that Qe is δ -close to a total degree O(|H| · m) polynomial Q∗ ,
since otherwise the total degree test fails with high probability.

Claim 4.29 (Individual Degree of Q in its First Variable). If Q∗ does not have individual degree ` + 1 in
its first variable, then the verifier rejects with probability 1 − O(|H| · `)/|F|.

Proof. Suppose that Q∗ has individual degree t > ` + 1 in its first variable (but not t − 1). We can write
Q∗ as:

                       Q∗ (z1 , . . . , z`+1 ) = ∑_{i=0,...,t} z1^i · Qi (z2 , . . . , z`+1 )

for some total degree O(|H| · m) polynomials Q0 , . . . , Qt : F` → F. Furthermore, the polynomial Qt is not
identically zero (since otherwise Q∗ would have individual degree t − 1 in z1 ). Thus, with probability
1 − O(|H| · `)/|F| over the choice of (z2 , . . . , z`+1 ), it holds that Qt (z2 , . . . , z`+1 ) ≠ 0. Assuming that the
latter holds, the univariate polynomial T (z1 ) = ∑_{i=0,...,t} z1^i · Qi (z2 , . . . , z`+1 ) has degree t but not degree
t − 1. Therefore, with probability 1 − O(|H| · `)/|F| over z1 , the value T (z1 ) is inconsistent with the value
obtained by interpolating from T (λ ) and {T (λi )}i∈{0,...,`} . Therefore, our verifier rejects with probability
1 − O(|H| · `)/|F|.

     Hence, we can assume that Q∗ has individual degree ` + 1 in its first variable. Let X ∗ : F` → F
and P0∗ , . . . , P`∗ : F` → F be defined as X ∗ (z) = Q∗ (λ , z) and Pi∗ (z) = Q∗ (λi , z), for every z ∈ F` and
i ∈ {0, . . . , `}. Since the verifier accesses Qe using the self-correction procedure, we can assume that the
verifier has direct access to the polynomials X ∗ , P0∗ , . . . , P`∗ . This assumption is without loss of generality,
as we can make the error probability of the self-correction procedure low enough that an error in reading
any of the polynomials X ∗ , P0∗ , . . . , P`∗ occurs only with negligible probability, at a negligible increase in
query complexity.
     We now give some high-level intuition for the proof of strongly canonical soundness. We want to
show that at least one of the tests will reject with probability Ω(δ ). We will think about this in the
contrapositive: we want to show that, if the test accepts with probability > 1 − O(δ ), then the proof
strings X ∗ , P0∗ , . . . , P`∗ are very close to the canonical proofs for a bit string x0 ∈ SA,b , and furthermore that
∆(x, x0 ) < δ .
     We will start by assuming that the “P` Zero Test” accepts with high probability, and use the acceptance
of sumcheck tests to conclude that P0∗ |H` ≡ 0 (Theorem 4.30). This argument is similar to the proof we
gave that the P` Zero Test accepts in the Completeness case. Additionally, the acceptance of the X vs.
P0 Test and the Zero Suffix Test tells us that P0∗ was appropriately constructed from X ∗ . These two facts
together let us conclude that the first n elements of X ∗ encode a boolean string x0 such that x0 ∈ SA,b ,
and that X ∗ is close to a polynomial X 0 that is a part of the canonical proof generated with respect to x0
(Theorem 4.32). The sumcheck tests let us extend this observation to show that all the proof polynomials
X ∗ , P0∗ , . . . , P`∗ are very close to the canonical polynomials generated with respect to x0 (Theorem 4.34).
Finally, we verify that x0 is close to the input x using the Input Proximity Test (Theorem 4.36).

Claim 4.30. If for some i ∈ {0, . . . , `} it holds that Pi∗|Fi×H`−i ≢ 0, then the verifier rejects with probability
1 − O(|H| · `)/|F|.

Proof. We proceed by reverse induction on i. For i = ` the claim follows from the zero test that the
verifier runs on P`∗ , and the Polynomial Identity Lemma (Lemma 4.8). For the inductive step, we will rely
on the sumcheck test.
     Assume that the claim holds for i ∈ [`]; we show that it holds for i − 1. We may assume that
Pi∗|Fi×H`−i ≡ 0 (since otherwise the claim follows from the inductive hypothesis). Suppose that the
restriction Pi−1∗|Fi−1×H`−i+1 is not identically zero. We need to show that the verifier rejects with high
probability.
     Define Ti(z1, . . . , z`) = ∑_{hi∈H} Pi−1∗(z1, . . . , zi−1, hi, zi+1, . . . , z`) · zi^{hi}. Observe that by our assumption
that Pi−1∗|Fi−1×H`−i+1 ≢ 0 it follows that Ti|Fi×H`−i ≢ 0, and in particular Ti must be distinct from Pi∗. Since
both have total degree O(m · |H|), by the Polynomial Identity Lemma (Lemma 4.8), with probability
1 − O(m · |H|/|F|) over z1, . . . , z` ∈ F it holds that

                    Pi∗(z1, . . . , z`) ≠ Ti(z1, . . . , z`) = ∑_{hi∈H} Pi−1∗(z1, . . . , zi−1, hi, zi+1, . . . , z`) · zi^{hi},

in which case the verifier rejects in the sumcheck test.
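The sumcheck consistency check used here can be sketched concretely. The toy example below is our own illustration (the field, the set H, and the polynomial standing in for Pi−1 are hypothetical): it checks the identity Pi(z) = ∑_{h∈H} Pi−1(z1, . . . , h, . . . , z`) · zi^h at a random point, and shows that a proof polynomial differing from the honest one is caught with high probability:

```python
import random

P = 10007       # prime field stand-in
H = [0, 1, 2]   # stand-in for the subset H of F

def P_prev(z1, z2):
    """A stand-in low-degree polynomial playing the role of P_{i-1}."""
    return (3 * z1 * z1 * z2 + 5 * z1 + z2 * z2 + 7) % P

def P_honest(z1, z2):
    """The honest next sumcheck polynomial: sum over h in H of P_{i-1}(h, z2) * z1^h."""
    return sum(P_prev(h, z2) * pow(z1, h, P) for h in H) % P

def P_cheat(z1, z2):
    return (P_honest(z1, z2) + z1) % P  # differs from the honest polynomial by z1

def round_check(Pi):
    """One sumcheck consistency check at a uniformly random point."""
    z1, z2 = random.randrange(P), random.randrange(P)
    expected = sum(P_prev(h, z2) * pow(z1, h, P) for h in H) % P
    return Pi(z1, z2) == expected

honest_passes = all(round_check(P_honest) for _ in range(100))
cheat_caught = sum(not round_check(P_cheat) for _ in range(100))
print(honest_passes, cheat_caught)
```

The honest polynomial always passes, while the cheating one escapes detection only when the random point is a root of the (nonzero, low-degree) difference polynomial.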

      In particular, by Theorem 4.30 (with i = 0), we may assume that P0∗ |H` ≡ 0. Let x0 ∈ Fn be the
restriction of X ∗ to the first n field elements. (Recall that these refer to the input bits.) Let Q(x0 ) =
(X 0 , P00 , . . . , P`0 ) be the canonical PCPP proof with respect to the input x0 .30

Claim 4.31. If x0 ∉ {0, 1}n (i. e., x0 is not Boolean valued) then, with probability 1 − O(m · |H|/|F|), the
verifier rejects.

Proof. Suppose that x0i∗ ∉ {0, 1} for some i∗ ∈ [n]. In particular this means that x0i∗ · (1 − x0i∗) ≠ 0 (since
the polynomial λ · (1 − λ) has precisely two roots: 0 and 1).
     Define a polynomial T0 : F` → F as T0(i1, i2, i3, i4, α1, α2, α3, σ) = φ̂A,b(i1, i2, i3, α1, α2, α3, σ) · (α1 ·
X∗(i1) + α2 · X∗(i2) + α3 · X∗(i3) + σ) + X∗(i4) · (1 − X∗(i4)). Observe that T0(~0, ~0, ~0, i∗, 0, 0, 0, 0) =
X∗(i∗) · (1 − X∗(i∗)) ≠ 0 whereas P0∗(~0, ~0, ~0, i∗, 0, 0, 0, 0) = 0 (since we have assumed that P0∗ is identically
0 on H`). Thus, by the Polynomial Identity Lemma (Lemma 4.8), with probability 1 − O(m · |H|/|F|)
over the choice of z ∈ F`, it holds that P0∗(z) ≠ T0(z), in which case the verifier rejects.

    Thus, we may assume that x0 ∈ {0, 1}n .

Claim 4.32. If X∗ ≢ X0, then the verifier rejects with probability 1 − O(m · |H|/|F|).
  30 Note that we have not yet established that x0 is Boolean valued (this fact will be established in Theorem 4.31), nor that
x0 ∈ SA,b. Nevertheless, at least syntactically, the construction of the canonical PCPP proof string given above extends naturally
to every input in Fn.



Proof. We will first show that if X∗|H` ≢ X0|H` then the verifier rejects with high probability.
     Suppose that there exists an index i ∈ {N + 1, . . . , |H|`} such that X∗(i) ≠ X0(i). This means that X∗
and X0 differ on the “zero region.” Since X0 was defined as a canonical polynomial for input x0, we know
that X0 is 0 on the “zero region” by construction, which means that X∗(i) ≠ 0. In this case, the verifier
rejects with probability 1 − O(m · |H|/|F|) due to the Zero Output and Padding Test (see the analysis of
this test in [44, Proposition 3.9]).
     Now, we can assume that the suffixes match, so if X∗|H` ≢ X0|H`, then X∗ and X0 must differ on some index in
[N]. Let i∗1 ∈ [N] be the smallest index such that X∗(i∗1) ≠ X0(i∗1) (notice that i∗1 > n since we have defined x0
as the n-element prefix of X∗). Since the circuit is deterministic, the value at each wire is determined
by the input x0 to the circuit. By definition, the correct value of each wire is included in X0. Since
our assumption is that X∗(i∗1) ≠ X0(i∗1), there exist α1∗, α2∗, α3∗, σ∗ ∈ {0, 1} and i∗2, i∗3 ∈ Hm such that
φA,b(i∗1, i∗2, i∗3, α1∗, α2∗, α3∗, σ∗) = 1 but α1∗ · X∗(i∗1) + α2∗ · X∗(i∗2) + α3∗ · X∗(i∗3) + σ∗ ≠ 0.
     Define a polynomial T0 : F` → F as follows. For every z ∈ F`, where z = (i1, i2, i3, i4, α1, α2, α3, σ) ∈
(Fm)4 × F4, it holds that

    T0(z) = φ̂A,b(i1, i2, i3, α1, α2, α3, σ) · (α1 · X∗(0`−m, i1) + α2 · X∗(0`−m, i2) + α3 · X∗(0`−m, i3) + σ)
            + X∗(0`−m, i4) · (1 − X∗(0`−m, i4)).

Observe that T0(i∗1, i∗2, i∗3, ~0, α1∗, α2∗, α3∗, σ∗) ≠ 0 whereas P0∗(i∗1, i∗2, i∗3, ~0, α1∗, α2∗, α3∗, σ∗) = 0 (since we have
assumed that P0∗ is identically 0 on H`). Thus, with probability 1 − O(m · |H|/|F|) over the choice of
z ∈ F` it holds that P0∗(z) ≠ T0(z), in which case the verifier rejects in the X vs. P0 Test.
    Hence, we may assume that X0|H` ≡ X∗|H`. By construction, X0 has individual degree |H| − 1. If
X∗ does not have individual degree |H| − 1, then the individual-degree test rejects with probability
1 − O(m · |H|)/|F|. Hence, we may assume that X∗ also has individual degree |H| − 1. Two individual
degree |H| − 1 polynomials that agree on H` must agree on F`, and so we obtain that X∗ ≡ X0.

    Thus we may assume that X∗ ≡ X0. Now we can use the X vs. P0 Test again to argue that the verifier
will reject if P00 ≢ P0∗.

Claim 4.33. If P00 ≢ P0∗ then the verifier rejects with probability 1 − O(|H| · m/|F|).

Proof. Since we have assumed that X∗ ≡ X0 (due to Theorem 4.32), we know that

    P00(z) = φ̂A,b(i1, i2, i3, α1, α2, α3, σ) · (α1 · X∗(0`−m, i1) + α2 · X∗(0`−m, i2) + α3 · X∗(0`−m, i3) + σ)
             + X∗(0`−m, i4) · (1 − X∗(0`−m, i4)).

So we can rephrase the X vs. P0 Test as testing whether P0∗ is equal to P00 at a random point. Since P0∗
and P00 are both polynomials of degree at most O(m · |H|), we conclude that, if P00 and P0∗ are not
equivalent, the test rejects with probability 1 − O(m · |H|/|F|).

    Now we can assume that P00 ≡ P0∗. We will use the Sumcheck Tests to conclude that, if Pi0 and Pi∗ are
not equivalent for some i ∈ [`], the test rejects with probability 1 − O(m · |H|/|F|).


Claim 4.34. For every i ∈ {0, . . . , `}, if Pi∗ ≢ Pi0, then the verifier rejects with probability 1 − O(m ·
|H|/|F|).

Proof. The proof is by induction on i. The base case i = 0 follows from Theorem 4.33. Assume that the
claim holds for i − 1; we show that it holds for i. We may assume that Pi−1∗ ≡ Pi−10 (since otherwise
the claim follows from the inductive hypothesis).
    Suppose that Pi∗ ≢ Pi0. Thus, by the Polynomial Identity Lemma (Lemma 4.8), the following holds
with probability 1 − O(m · |H|/|F|) over the choice of z1, . . . , z` ∈ F:

        Pi∗(z1, . . . , z`) ≠ Pi0(z1, . . . , z`)
                           = ∑_{hi∈H} Pi−10(z1, . . . , zi−1, hi, zi+1, . . . , z`) · zi^{hi}
                           = ∑_{hi∈H} Pi−1∗(z1, . . . , zi−1, hi, zi+1, . . . , z`) · zi^{hi},

in which case the verifier rejects.

     Thus, we may assume that Pi0 ≡ Pi∗ for all i ∈ {0, . . . , `}.
     So we have established that Q∗(γ, ·, . . . , ·) ≡ Q0(γ, ·, . . . , ·) and Q∗(γi, ·, . . . , ·) ≡ Q0(γi, ·, . . . , ·), for all
i ∈ {0, . . . , `}. Since Q0 has individual degree ` + 1 in its first variable by construction, and Q∗ does by our assumption, they must agree
on F`+1. Thus, Q0 ≡ Q∗.

Claim 4.35. If x0 ∉ SA,b then the verifier rejects with probability 1 − O(m · |H|)/|F|.

Proof. Since x0 ∉ SA,b, we know that C(x0) ≠ 0. This means that there exists an index N − k ≤ i ≤ N such
that X0(i) ≠ 0. Since we have already established that X∗ ≡ X0, this means that also X∗(i) ≠ 0. Thus,
when applying the zero output test to X∗, by the analysis in [44, Proposition 3.9], the verifier rejects with
probability 1 − O(m · |H|)/|F|.

   Thus, we have established that the proof Q∗ is the canonical proof for an input x0 ∈ SA,b . We will now
show that this input must be close to x.

Claim 4.36. If x0 is δ-far from x, then the verifier rejects with probability Ω(δ).

Proof. If x0 is δ-far from x, then with probability at least δ over the choice of i ∈ [n] it holds that x0i ≠ xi.
Accounting for the error in the self-correction procedure, we have that the verifier rejects with probability
Ω(δ) in the Input Proximity Test.
    Thus, we may assume that x0 is δ-close to x. In particular, using the fact that ∆(Q(x0), Q̃) < δ, we
obtain that

                                    max{ ∆(x, x0), ∆(Q(x0), Q̃) } < δ.

Since x0 ∈ SA,b, this contradicts the definition of δ (recall Equation (4.6)).
    This completes the proof of Lemma 4.28.


Linearity. We first establish that the canonical proof string Q is generated as a GF(2)-linear function
of the input x. We will later argue that the verifier's checks can be expressed as a linear system.
    Consider first the polynomial X : F` → F. Each point in [N] is clearly generated as a GF(2)-linear
combination of the input (since the circuit C has only parity gates). All other points in H` \ [N] are
identically 0. The rest of X is obtained as the unique low-degree extension, which is an F-linear code.
Since F has characteristic 2, the latter is also a GF(2)-linear transformation.
    The polynomial P0 is the trickiest to handle. Indeed, from its definition it is evident that P0 is a
quadratic form over the field F (due to the second term, which handles Booleanity). Nevertheless, we argue
that it is a GF(2)-linear function of the input x. The reason is that the mapping λ ↦ λ^2 is a GF(2)-linear
transformation.31 The rest of the sumcheck polynomials are obtained as F-linear combinations of P0,
which in particular are GF(2)-linear, and so they are a GF(2)-linear function of the input x. Lastly,
the polynomial Q is obtained by interpolation, which is a linear operation over F (and therefore also over
GF(2)).
    As for the verifier's checks, it is easy to see that all of them, except the X vs. P0 test, are F-linear and therefore
also GF(2)-linear. The X vs. P0 test involves an F-quadratic form of the same kind as that analyzed
above, and similarly, it can be shown to be GF(2)-linear.
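This characteristic-2 Frobenius identity is easy to check computationally. The sketch below is our own illustration (GF(2^8) with the AES reduction polynomial is a hypothetical choice of field, not one used in the paper): it verifies that squaring is additive, i. e., (a + b)^2 = a^2 + b^2 when addition is XOR:

```python
import random

# GF(2^8) via the (hypothetical choice of) reduction polynomial x^8 + x^4 + x^3 + x + 1.
MOD = 0x11B

def gf_mul(a, b):
    """Carry-less (polynomial) multiplication in GF(2^8), reduced modulo MOD."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= MOD
        b >>= 1
    return r

def frob(a):
    """The Frobenius map a -> a^2."""
    return gf_mul(a, a)

# In characteristic 2, addition is XOR, and squaring distributes over it.
ok = all(frob(a ^ b) == frob(a) ^ frob(b)
         for a in random.sample(range(256), 40)
         for b in random.sample(range(256), 40))
print(ok)
```

The same identity, (λ1 + λ2)^2 = λ1^2 + λ2^2, is exactly what footnote 31 invokes for the field F.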

4.4.2    Proof of Theorem 4.27
Fix A ∈ {0, 1}n×k and b ∈ {0, 1}k. Applying Lemma 4.28 with respect to a finite field of order |F| =
O(m^2 · n^{2c/m}), we have that SA,b has a strongly canonical PCPP (P, V) with all the properties listed in
the statement of Lemma 4.28. In particular, the proof string for an input x ∈ {0, 1}n is a polynomial
Q : F`0 → F of total degree d = O(m · n^{c/m}), where `0 = O(m). The query complexity is q = O(m^2 · n^{c/m}).
    We construct a PCPP proof-system (P0, V0) by making the following modification to (P, V). Given
oracle access to an (alleged) PCPP proof string Q : F`0 → F, the verifier runs the following tests:

    1. A robust low-degree test on Q, with respect to degree d (see Lemma 4.12).
    2. In addition, the PCPP verifier generates the PCPP queries s1, . . . , sq ∈ F`0 as in the PCPP of
       Lemma 4.28. However, rather than reading these points directly, our verifier chooses an additional
       random point s0 and takes a degree-q curve through {s0, . . . , sq}. Namely, we take the degree-q curve
       P that interpolates the points s0, . . . , sq at 0, . . . , q. The verifier reads all the points of Q that lie
       on the curve P. Thus, the verifier obtains the truth table of the (univariate) polynomial Q ◦ P, which
       should have degree q · d = O(m^3 · n^{2c/m}) ≤ |F|/100. Our verifier first checks that Q ◦ P is indeed a
       degree-O(m^3 · n^{2c/m}) polynomial. It then checks that the verifier of Lemma 4.28 accepts when the
       answer to each query si is Q ◦ P(i), for all i ∈ [q]. If all of these checks pass then the verifier accepts.
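The curve-interpolation step in Test 2 can be sketched as follows (our own toy illustration; the prime field, the dimension, and the number of queries are hypothetical stand-ins). It builds the degree-q curve through s0, . . . , sq by coordinatewise Lagrange interpolation and confirms that it passes through each si at parameter i:

```python
import random

PRIME = 10007   # prime field stand-in for F
m = 3           # dimension of the ambient space F^m (hypothetical)
q = 4           # number of PCPP queries (toy value)

# hypothetical query points s_1, ..., s_q plus the random masking point s_0
s = [[random.randrange(PRIME) for _ in range(m)] for _ in range(q + 1)]

def curve(t):
    """Degree-q curve F -> F^m passing through s[i] at parameter i, via Lagrange interpolation."""
    out = [0] * m
    for i in range(q + 1):
        num, den = 1, 1
        for j in range(q + 1):
            if j != i:
                num = num * ((t - j) % PRIME) % PRIME
                den = den * ((i - j) % PRIME) % PRIME
        w = num * pow(den, PRIME - 2, PRIME) % PRIME
        for k in range(m):
            out[k] = (out[k] + w * s[i][k]) % PRIME
    return out

# the curve passes through every prescribed point at the prescribed parameter
checks = all(curve(i) == s[i] for i in range(q + 1))
print(checks)
```

Because s0 is uniformly random, every point curve(t) with t ∉ {1, . . . , q} is individually uniform, which is what the robustness argument below exploits.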

In addition, recall that V only reads a single point from the input x. In order to ensure robustness, we
weight this test as well as the two foregoing tests so that each test consists of one third of the queries. That
is, to read a single point from the input x, we actually query the point multiple times until the number of
queries to the point is one-third the total number of queries, and check that all answers are the same.
  31 To show GF(2)-linearity it suffices to show that (λ1 + λ2)^2 = λ1^2 + λ2^2, which holds in any field of characteristic 2
(also notice that multiplication by a scalar is a trivial operation over GF(2)).



Remark 4.37. Actually, since we are aiming for a PCPP over the binary alphabet, we further “concatenate”
the PCPP proof string with a good binary linear error-correcting code C. Namely, each of the
field elements in the proof string is encoded using C. The verifier is further modified so that when it is
supposed to read a field element it reads the entire corresponding codeword of C, checks that it is a valid codeword (and
otherwise rejects), and decodes it to the corresponding symbol of F. For simplicity, and since it only affects
our parameters by a constant factor, we ignore this concatenation in the analysis below.

Robust soundness. To establish robustness we first need to show that accepting views of the PCPP
verifier V0 are at least 0.1-far apart. This follows immediately from the fact that in the two tests that
the verifier performs, it checks that the answers it receives are low-degree polynomials (of degree d for
the low-degree test and d · q for the additional test). Since |F| > 100 · d · q, accepting views are in fact at
relative distance 0.99 from one another.
    We still need to establish the robust soundness condition, namely, that for every x ∈ {0, 1}n and every
proof oracle Q, when V0 is given access to the input oracle x and proof oracle Q it holds that

                                     Pr_{(I,D)←V0} [ ∆((x ◦ Q)|I, D) > ρ ] ≥ Ω(µ),

where

                                     µ def= min_{x0∈S} max{ ∆(x, x0), ∆(P0(x0), Q) }.
    Fix an input x ∈ {0, 1}n and a proof oracle Q : F`0 → F. Consider first the case that Q is 0.1-far from
having degree d. In this case, by the robustness of the low-degree test, with probability at least 0.5, the
view of the verifier is at least 0.1-far from accepting (see Lemma 4.9 and the discussion immediately
following it). Thus, we may assume that Q is 0.1-close to some degree-d polynomial Q0.
    By the strongly canonical soundness of (P, V), with probability Ω(µ0), where

                                     µ0 def= min_{x0∈S} max{ ∆(x, x0), ∆(P(x0), π) },

the queries s1, . . . , sq are such that the values (Q(si))i∈[q] would lead the verifier V to reject. Assume that
this is indeed the case.
    On the other hand, since s0 was chosen at random, each point on the curve P, other than s1, . . . , sq, is
individually a uniformly random point. Since Q is 0.1-close to Q0 (which has degree d), with probability at least 1/2, the
function Q ◦ P is 0.2-close to the degree-(d · q) polynomial Q0 ◦ P. To change the view of the verifier
V0 to an accepting one, an adversary must therefore change at least 1 − d · q/|F| − 0.2 ≥ 0.7 of the answers in the
second test.
    Lastly, we observe that the test that reads the input bit (required for V) has robustness 1.
    Overall, we obtain that with probability Ω(µ), the answers that the verifier V0 sees are 0.1-far from any
answers that would make it accept. The theorem follows.

Self Correction. Suppose that the PCPP string is 0.1-close to a degree O(m · nc/m ) polynomial. Self
correction follows from the fact that low-degree polynomials are robust locally correctable (see the
furthermore part of Lemma 4.9).


Length of proof string. The proof string is the same as that in Lemma 4.28.

Randomness complexity. The verifier only selects one additional (random) point in F`0 to generate the
degree-q curve P, and so it has the same asymptotic randomness complexity as the verifier of Lemma 4.28.

4.5    Putting it together
To prove Theorem 4.1, we will use our composition theorem (Theorem 4.13) to perform two iterative
compositions of a robust RLCC with a robust, self-correctable, strongly canonical PCPP. Specifically,
we proceed in the following steps.
   1. Base Code: Our starting point is the low-degree extension (LDE) code (aka, the Reed–Muller code),
      as defined in Section 4.1.3, of roughly logarithmic order. With a suitable setting of parameters, the
      LDE is known to be a robust (full-fledged) LCC with nearly linear block length and polylogarithmic
      query complexity.
   2. First Composition: We compose this low-degree extension code with the polynomial length,
      polylogarithmic query, strongly canonical, self-correctable and robust PCPP that was constructed
      in Theorem 4.27. This composition yields a robust RLCC with polynomial block length and
      sublogarithmic query complexity; we denote the composed code by C0 .
   3. Second Composition: Finally, we compose the code C0 with the exponential length, constant
      query, strongly canonical, self-correctable PCPP from Theorem 4.26. This yields our final code:
      an RLCC with polynomial block length and constant query complexity.
    In what follows, we provide the full details of the steps of the proof outlined above. Recall that
Theorem 4.13 allows us to compose a robust RLCC with a robust strongly canonical PCP of proximity
to obtain a robust RLCC with improved query complexity, with a relatively mild overhead to the block
length.
    We proceed to describe our base code, then the first composition (using Theorem 4.13) with the
PCPP in Theorem 4.27, and finally the second composition (also using Theorem 4.13) with the PCPP in
Theorem 4.26.

Our base code. Fix F to be a finite field of characteristic 2, let H ⊆ F, and let m be a dimension such that
|H| = (log(k))^c, m = log(k)/(c · log log(k)), and |F| = Θ(|H| · m), where c ≥ 1 is chosen as a large constant. Denote
n def= |F|^m, and note that |H|^m = k and that n = |F|^m = k · O(m)^m ≤ k^{1+1/c}. Consider the low-degree
extension code LDEF,H,m : F^k → F^n as defined in Section 4.1.3.
    LDEF,H,m is an F-linear code with relative distance 1 − |H| · m/|F| ≥ 0.9. The furthermore clause of
Lemma 4.9 and the discussion beneath it imply that it is a 0.5-robust relaxed LCC with respect to
soundness 2/3 and decoding radius δradius = 0.1. Furthermore, the corrector is an F-linear function
and has randomness complexity rcode = log(|F|^m) = log(n) and query complexity qcode = O(|H| · m) =
polylog(n).
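The low-degree extension encoding can be sketched concretely. The toy parameters below (a small prime field, H = {0, 1, 2}, m = 2) are our own hypothetical choices, not the paper's setting: the sketch evaluates the unique extension of a message, viewed as a function H^m → F, at arbitrary points of F^m, and checks that the encoding is systematic on the grid:

```python
import itertools
import random

P = 97          # small prime field stand-in for F
H = [0, 1, 2]   # stand-in subset H of F
m = 2           # dimension; messages have |H|^m = 9 symbols

GRID = list(itertools.product(H, repeat=m))

def lde_eval(msg, z):
    """Evaluate the low-degree extension of msg : H^m -> F at a point z in F^m."""
    total = 0
    for idx, h in enumerate(GRID):
        # product of univariate Lagrange basis polynomials over H, one per coordinate
        w = 1
        for k in range(m):
            num, den = 1, 1
            for a in H:
                if a != h[k]:
                    num = num * ((z[k] - a) % P) % P
                    den = den * ((h[k] - a) % P) % P
            w = w * num % P * pow(den, P - 2, P) % P
        total = (total + msg[idx] * w) % P
    return total

msg = [random.randrange(P) for _ in range(len(GRID))]

# Systematic: the extension agrees with the message everywhere on the grid H^m.
on_grid = all(lde_eval(msg, list(h)) == msg[i] for i, h in enumerate(GRID))
print(on_grid)
```

Evaluating lde_eval at the |F|^m points of F^m yields the codeword; each coordinate is an F-linear function of the message, matching the linearity claims above.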
    To obtain a binary code, we concatenate LDEF,H,m with a good binary linear code. Since F has
characteristic 2, the resulting code is GF(2)-linear. Furthermore, the resulting code has constant distance
δ > 0 and is a ρcode-robust relaxed LCC, for constant ρcode > 0, with respect to constant soundness
scode > 0 and constant decoding radius δradius > 0. Lastly, the corrector is a GF(2)-linear function and
has randomness complexity rcode = log(|F|^m) = log(n) and query complexity qcode = polylog(n).

The first composition. Let (P, V) be the PCPP proof-system for affine relations, guaranteed by
Theorem 4.27, with respect to inputs of length ` = 2qcode = polylog(n) and with dimension m =
log(`)/ log log(`). This PCPP is a self-correctable strongly canonical and ρpcp -robust PCPP (P, V)
(over the binary alphabet), where ρpcp = Ω(1), with respect to soundness spcp = Ω(1), which is a
GF(2)-linear PCPP oracle with a GF(2)-linear verifier and self-corrector, and has query complexity
qpcp = polylog(`) = poly(log log(n)) and randomness complexity rpcp = polylog(`) = poly(log log(n)).
     We compose the code from step 1 (i. e., LDEF,H,m concatenated with a good binary linear error-correcting
code) with the PCPP (P, V) using Theorem 4.13. This yields a GF(2)-linear ρ0-robust relaxed
LCC C0 : {0, 1}k → {0, 1}n0, with respect to soundness s0 = (spcp · scode · ρcode^2)/4 = Ω(1), where
ρ0 = ρpcp/5 = Ω(1), with a GF(2)-linear corrector, block length n0 = 2n · 2^rcode · 2^rpcp · qpcp = Õ(n^2) =
Õ(k^{2+2/c}), relative distance δ0 ≥ δ/2, decoding radius δradius0 = δradius/4, randomness complexity

                 r0 = 2rcode + O(rpcp + log(qpcp) + log(qcode)) = 2 log(n) + poly(log log(n)),

and query complexity q0 = O(qpcp) = O(log log(n)).


The second composition. Let (P0, V0) be the PCPP proof-system guaranteed by Theorem 4.26 with
respect to inputs of length poly(log log(n)). This PCPP is a self-correctable, ρpcp0-robust, strongly
canonical PCPP with respect to soundness spcp0 = Ω(1), where ρpcp0 = Ω(1), with query complexity
qpcp0 = O(1) and randomness complexity rpcp0 = poly(log log(n)). The PCPP proof string is a GF(2)-linear
function of the input, and the verifier and self-corrector are GF(2)-linear functions of their answers.
    We compose the code C0 with the PCPP (P0, V0) using Theorem 4.13. This yields a GF(2)-linear
relaxed LCC C00 : {0, 1}k → {0, 1}n00, with respect to soundness s00 = Ω(spcp0 · s0 · (ρ0)^2) = Ω(1). The composed
code C00 has block length n00 = 2n0 · 2^r0 · 2^rpcp0 · qpcp0 = Õ(k^{2+2/c}) · Õ(n^2) · n^{o(1)} = k^{4+4/c+o(1)}, relative
distance δ00 ≥ δ0/2 = Ω(1), decoding radius δradius00 = δradius0/4 = Ω(1), and query complexity
q00 = O(qpcp0) = O(1).
    Theorem 4.1 follows by setting c ≫ 4/ε so that k^{4+4/c+o(1)} ≤ k^{4+ε}, for the desired parameter ε from
the theorem statement.


5    Constant-rate RLCC
In this section, we construct a relaxed locally correctable code (RLCC) with quasi-polylogarithmic query
complexity (specifically, exp(O((log log n)^2))) and constant rate (which approaches 1). Our construction is
based on the recent breakthrough result of Kopparty et al. [37], who give constructions of constant-rate
locally testable codes (LTCs) with quasi-polylogarithmic query complexity and of constant-rate full-fledged
(i. e., non-relaxed) locally correctable codes with exp(O(√(log(n) · log log(n)))) query complexity.
     We show that the [37] LTC is simultaneously relaxed locally correctable with quasi-polylogarithmic
query complexity.


Theorem 5.1 (Constant-Rate Binary RLCC with Quasipolylogarithmic Query Complexity). For every
constant r ∈ (0, 1), there exist constants ε > 0, δ > 0, and δradius > 0, as well as an (explicit) infinite
family of binary, linear codes {Cn }n with the following properties:
    • Cn has block length n, rate r, and relative distance δ .

    • Cn is relaxed locally correctable with query complexity exp(O((log log n)^2)) and correcting radius
      δradius.
and δ furthermore satisfies

                                    δ = max_{r<R<1} (1 − R − ε) · H^{−1}(1 − r/R),

where H is the binary entropy function, and H^{−1} is its inverse in the range (0, 1/2).
    The rate–distance tradeoff for the codes in Theorem 5.1 approaches the Zyablov bound (see Foot-
note 5), like the codes in [37].
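As a numerical sanity check of this rate–distance tradeoff, the expression in Theorem 5.1 can be evaluated directly. The sketch below is our own illustration: it inverts the binary entropy function by bisection and replaces the exact maximization over R with a grid search, with a hypothetical ε = 0.01:

```python
import math

def H(p):
    """Binary entropy function."""
    if p <= 0 or p >= 1:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def H_inv(y):
    """Inverse of H on (0, 1/2), by bisection (H is increasing there)."""
    lo, hi = 0.0, 0.5
    for _ in range(60):
        mid = (lo + hi) / 2
        if H(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def zyablov_delta(r, eps=0.01, steps=2000):
    """Grid-search approximation of max over r < R < 1 of (1 - R - eps) * H_inv(1 - r/R)."""
    best = 0.0
    for i in range(1, steps):
        R = r + (1 - r) * i / steps
        best = max(best, (1 - R - eps) * H_inv(1 - r / R))
    return best

print(round(zyablov_delta(0.5), 4))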
    On the way to proving Theorem 5.1, we will construct relaxed locally correctable codes with a larger
alphabet that approach the Singleton Bound, again analogous to a result of Kopparty et al.
Theorem 5.2 (Constant Rate RLCC with Quasipolylogarithmic Query Complexity Approaching Singleton
Bound). For every constant r ∈ (0, 1) and constant γ > 0, there exists a constant δradius > 0 and an
infinite family of codes {Cn }n over an alphabet {Σn }n with the following properties:
    • Cn has block length n, rate r, and relative distance at least 1 − r − γ.

    • Cn is relaxed locally correctable with query complexity exp(O(log log n)2 ) and correcting radius
      δradius .

    • The alphabet Σn of Cn has size at most |Σn | ≤ exp(poly(1/γ))).

    • Viewed as a function over bits, Cn is GF(2)-linear.
    In the construction, following [37], we start off with a high-rate code C of only polylogarithmic
block length and relative distance δ = 1/polylog(n). The code C is trivially locally correctable with a
polylogarithmic number of queries: given a string w and coordinate i, read all of w, decode to the nearest
codeword c and output ci .
    Our goal is now to extend the block length of C with only a mild overhead to the query complexity.
We do so gradually, by iteratively applying two transformations. The first transformation is code tensoring,
which squares the block length (as well as the distance and the rate). We note that tensoring does not
seem to preserve local correctability (nor decodability).32 Nevertheless, our main observation is that
tensoring does preserve relaxed local correctability.
    Thus, one could try to simply take the code C and tensor it with itself roughly log log(n) times to
obtain a code with block length n. The problem with this approach, however, is that tensoring also squares
  32 Even assuming that the base corrector makes “smooth” queries, the natural local corrector for a tensor code squares the

query complexity, which we cannot afford.


                          T HEORY OF C OMPUTING, Volume 16 (18), 2020, pp. 1–68                                          44
                                R ELAXED L OCALLY C ORRECTABLE C ODES

the distance, which would mean that the resulting code would have very poor distance. To rectify this,
[37] uses an additional transformation that amplifies the distance without raising the query complexity of
the LTC too much.
    Specifically, Kopparty et al. use the Alon–Edmonds–Luby (AEL) [1] transform after each tensoring
step to amplify the distance without hurting the rate too much. Furthermore, they show that the Alon–
Edmonds–Luby transform preserves local testability (and correctability). In Section 5.2, we show that the
Alon–Edmonds–Luby transform additionally preserves relaxed local correctability. Thus, similarly to
[37], by repeated applications of tensoring and distance amplification we obtain our desired RLCC.

Remark 5.3. The iterative approach in the [37] construction (and therefore also in ours) resembles the
“zig-zag approach” that has been used in the literature for a variety of purposes, including the construction
of expander graphs [45], undirected connectivity in log space [43], PCPs [14], locally testable codes [40]
and doubly efficient interactive proofs [44]. See also Goldreich’s survey [21].

Section organization. In Section 5.1, we formally define the operation of tensoring on codes and show
that tensoring preserves relaxed local correctability without increasing the query complexity too much. In
Section 5.2, we give the AEL distance amplification lemma and prove that the AEL transform preserves
relaxed local correctability. In Section 5.3, we provide the binary codes with small block length that we
will use for the final constructions. In Section 5.4, we use the tools and codes from the previous sections
to construct the codes of Theorem 5.2 and Theorem 5.1.

5.1   Analysis of tensoring
Following [37], the basic operation used to increase the block length of a code in our construction is
tensoring.

Definition 5.4 (Tensor Code). Given a field F and an F-linear code C : Fk → Fn , define the tensor of C
                    2     2                                         2
with itself, C2 : Fk → Fn , to be the code such that messages m ∈ Fk are encoded as follows. View m as
a k × k matrix over F and encode each of its rows using C. Then, encode each of the columns (including
those generated in the first step) also using C.
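For concreteness, the row-then-column encoding of Definition 5.4 can be sketched as follows; this is only an illustration, with a toy [3, 2] parity code standing in for C.

```python
import numpy as np

def tensor_encode(m, G):
    """Encode a length-k*k message with the tensor code C^2, where C is
    the linear code over F_2 with k x n generator matrix G."""
    k, n = G.shape
    M = np.asarray(m, dtype=int).reshape(k, k)   # view the message as k x k
    rows = (M @ G) % 2            # encode each row with C: now k x n
    full = (G.T @ rows) % 2       # encode each column (old and new): n x n
    return full

# toy example: C is the [3, 2] even-weight (parity) code
G = np.array([[1, 0, 1],
              [0, 1, 1]])
c = tensor_encode([1, 0, 1, 1], G)
# by linearity, every row and every column of c is a codeword of C
```

Encoding the columns of the row-encoded matrix is a single product with the transpose of G, which is why the rows of the result remain codewords of C.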

    It is a well-known fact that the rows and columns of each codeword c′ ∈ C² are themselves codewords of C.
    Now we are ready to state the main lemma of this section: that tensoring preserves relaxed local correctability.

Lemma 5.5 (Tensoring Preserves Relaxed Local Correctability). Let F be a field, and let C ⊆ F^n be a linear, relaxed locally correctable code with query complexity q, rate r, relative distance δ, and correcting radius δradius. Then C² ⊆ F^(n²) is a linear, relaxed locally correctable code with query complexity O(q/δradius), rate r², relative distance δ², and correcting radius δ²radius/2.
Proof. Let w ∈ F^(n²) (which we view as an n × n matrix) be a (possibly) corrupt codeword, with at most a δ²radius/2 fraction of corruptions, and let (i, j) ∈ [n]² be an index to correct. Consider the ith row of w. Intuitively, if the fraction of corruptions on this row is small (less than δradius), then we could simply run

                        T OM G UR , G OVIND R AMNARAYAN , AND RON D. ROTHBLUM

the base corrector on this row. However, the total number of corruptions can be as large as (δ²radius/2) · n², which can be larger than n, and so it may be the case that the entire ith row is corrupt. Our plan is to detect this by choosing many coordinates of this row at random and checking whether they are corrupt by invoking the base corrector on the corresponding columns.
    We proceed with the proof. For a matrix w ∈ F^(n×n), and indices i, j ∈ [n], we denote the ith row of w by rowi(w) ∈ F^n and the jth column of w by colj(w) ∈ F^n. We denote the (i, j)th entry of w by wi,j.

Description of M′. We now describe the action that M′, the relaxed local corrector for C², takes when given a string w ∈ F^(n×n) and a coordinate pair (i, j) ∈ [n] × [n]. Recall that we use the notation M^{w′}(i) to denote the result of running the relaxed local corrector M to correct coordinate i with oracle access to a corrupted codeword w′ ∈ F^n.

   1. Repeat O(1/δradius) times:

        (a) Select at random j′ ∈ [n].
        (b) Run M^{colj′(w)}(i), with error bound 0.1 (see Remark 3.3). If the result is not equal to wi,j′ (and in particular if it is ⊥), then output ⊥ and abort.

   2. Output M^{rowi(w)}(j).
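The two steps of M′ can be sketched directly; here `base_corrector` is a stand-in for M (it returns the corrected symbol, or None in place of ⊥), and the constant 10 in the repetition count is illustrative.

```python
import random

REJECT = None  # stands in for the abort symbol ⊥

def tensor_corrector(w, i, j, base_corrector, delta_radius):
    """Relaxed local corrector M' for C^2, given an n x n matrix w.
    base_corrector(word, idx) models M: it returns the corrected symbol
    of `word` at position `idx`, or REJECT if it detects corruption."""
    n = len(w)
    # Step 1: spot-check row i by correcting random columns at row i.
    for _ in range(int(10 / delta_radius)):
        jp = random.randrange(n)                  # random column index j'
        column = [w[r][jp] for r in range(n)]
        v = base_corrector(column, i)             # run M on column j'
        if v != w[i][jp]:                         # REJECT, or a mismatch
            return REJECT
    # Step 2: correct coordinate j within row i.
    return base_corrector(list(w[i]), j)
```

When w is an uncorrupted codeword, every column check agrees with the queried entry, so Step 1 never rejects and Step 2 returns the correct symbol.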

Correctness of M′. Fix w ∈ F^(n×n) and i, j ∈ [n]. We first consider the case that w ∈ C², in which case the corrector should correctly output wi,j. Since each row and column of w is a codeword of C (see Definition 5.4), and M is a relaxed local corrector for C, we know that M^{colj′(w)}(i) = wi,j′ for every choice of j′ ∈ [n]. Therefore, M′ will never reject in Step 1b. Rather, it will reach Step 2 and output the correct value wi,j. Hence, the completeness condition is satisfied.
    It remains to show Condition 2, which says that if w is (δ²radius/2)-close to a codeword c ∈ C², then, with probability at least 1/2, M′ either decodes the correct symbol ci,j or outputs ⊥.

    We consider the following cases:

    • If ∆(rowi(w), rowi(c)) < δradius, then by the relaxed correcting property, M^{rowi(w)}(j) will output either the correct value ci,j or ⊥ with probability at least 1/2.

    • Thus, we may assume that ∆(rowi(w), rowi(c)) ≥ δradius. That is, at least a δradius fraction of the coordinates in rowi(w) are corrupt. We will consider the set of coordinates j′ ∈ [n] such that wi,j′ is corrupted (i. e., wi,j′ ≠ ci,j′) and colj′(w) has less than a δradius fraction of corruptions. The main observations are (1) this set is large (since the overall number of corruptions is small) and (2) if our choice of j′ falls in this set (which happens with constant probability), then the relaxed local correction guarantees that when we run M on this column, we will detect a problem in this column with high probability. Namely, M will either output ⊥ or the value ci,j′ ≠ wi,j′, in which case we reject. Details follow.
      Define the sets J^err and J^obv as follows:

                                              J^err = { j′ ∈ [n] : wi,j′ ≠ ci,j′ }


        is the set of coordinates j′ ∈ [n] where wi,j′ is corrupt, and

                                 J^obv = { j′ ∈ J^err : ∆(colj′(w), colj′(c)) < δradius }

        is the subset of coordinates j′ of J^err where the error is obvious to the corrector: since the column indexed by j′ is not too corrupted, we can correct the error in wi,j′ by running M on the column indexed by j′.
        We know that |J^err| ≥ δradius · n, by assumption. Furthermore, we argue that at most half of the columns associated with these coordinates can have more than a δradius fraction of errors. Since we assumed that dist(w, c) < δ²radius/2, there can be at most (δradius/2) · n columns with more than δradius · n errors. Because there are at least δradius · n coordinates in J^err, we conclude that |J^obv| ≥ δradius · n − (δradius/2) · n = (δradius/2) · n. Finally, since we sampled O(1/δradius) coordinates j′ uniformly at random, by setting the constant in the big-O notation suitably, with probability at least 0.99, in one of the iterations we sampled a coordinate j′ ∈ J^obv. Assuming that j′ ∈ J^obv, it follows that ∆(colj′(w), colj′(c)) < δradius. So, by the relaxed local correction property, with probability at least 0.9, M will output either ci,j′ or ⊥. In both of these cases M′ will output ⊥ (since wi,j′ ≠ ci,j′). Therefore, M′ will output ⊥ with probability at least 0.99 · 0.9 > 1/2.

Parameters of tensor code. Let δ, r, δradius, and q denote the distance, rate, correcting radius, and query complexity of C. It is a well-known fact that the rate and distance of C² are r² and δ², respectively, and furthermore that C² is linear. As analyzed above, the correcting radius δ′radius of C² is at least δ²radius/2. Finally, our corrector M′ makes at most O(1/δradius) calls to the corrector for C and reads an additional O(1/δradius) points by itself. Since the underlying corrector M makes at most q queries in each call, we get that M′ makes at most O(q/δradius) queries in total.


5.2     Analysis of distance amplification
We now analyze the Alon–Edmonds–Luby (AEL) distance amplification method [1], and show that it
preserves relaxed local correctability. This analysis is similar to the analysis given in [37], which shows
that AEL distance amplification preserves the stronger property of local correctability (although it does
not directly follow since we are starting off with the weaker hypothesis of relaxed correctability).

Lemma 5.6 (Alon–Edmonds–Luby distance amplification (see [37, Lemma 6])). Suppose that C is a binary linear code with relative distance δ and rate r that is relaxed locally correctable with q queries and correcting radius δradius ≤ δ. Then, for every 0 < δ′radius < 1/2 and 0 < ε < 1, there exists a code CAEL that is relaxed locally correctable with query complexity q · poly(1/(ε · δradius)) and correcting radius δ′radius, with the following properties:

      • CAEL has relative distance at least 2 · δ′radius, and rate at least r · (1 − 2 · δ′radius − ε).

      • The alphabet of CAEL is {0, 1} p for some p = poly(1/(ε · δradius )).

      • CAEL is GF(2)-linear.


    The proof of Lemma 5.6 uses “graph samplers” which are defined next. For a graph G = (V, E), a
vertex v ∈ V and a subset of vertices S ⊆ V , we denote the set of edges from v to S as E(v, S). Similarly,
given two subsets of vertices S, T ⊆ V , we denote the set of edges between the sets S and T by E(S, T ).
Definition 5.7 (Sampler Graphs (cf. [22])). Let G = (L, R, E) be a bipartite d-regular graph with |L| = |R|. We say that G is an (ε, δ)-sampler if for every subset S ⊆ R, for at least a 1 − δ fraction of vertices v ∈ L it holds that

                                          −ε ≤ |E(v, S)|/d − |S|/|R| ≤ ε.
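On small graphs the sampler condition can be verified by brute force over all subsets S; the checker below is purely illustrative (the samplers actually used are the explicit ones of Lemma 5.8), and `adj` is a hypothetical adjacency-list representation.

```python
from itertools import combinations

def is_sampler(adj, eps, delta):
    """adj[v] lists the right-neighbors of left-vertex v in a d-regular
    bipartite graph with |L| = |R| = n. Checks the (eps, delta)-sampler
    condition by enumerating all subsets S of R."""
    n, d = len(adj), len(adj[0])
    for size in range(n + 1):
        for S in combinations(range(n), size):
            S = set(S)
            bad = sum(1 for v in range(n)
                      if abs(sum(u in S for u in adj[v]) / d - len(S) / n) > eps)
            if bad > delta * n:
                return False
    return True

# the complete bipartite graph estimates |S|/|R| exactly at every vertex,
# so it is a (0, 0)-sampler (with the trivial degree d = n)
K4 = [[0, 1, 2, 3]] * 4
```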
Lemma 5.8 (Explicit Samplers (see, e. g., [22, Lemma 5.3] or [37, Lemma 1])). For every ε, δ > 0 and every sufficiently large n ∈ N there exists a bipartite d-regular graph Gn,ε,δ = (L, R, E) with |L| = |R| = n and d = poly(1/(ε · δ)) such that Gn,ε,δ is an (ε, δ)-sampler. Furthermore, there exists an algorithm that takes in n, ε, δ, and a vertex v of Gn,ε,δ, and computes the list of neighbors of v in Gn,ε,δ in time poly(log n/(ε · δ)).

Equipped with Lemma 5.8 we are now ready to describe AEL distance amplification and prove Lemma 5.6.

5.2.1   Overview of construction and blockwise errors
We start by giving an overview of the steps; the subsequent subsections prove each step in detail. Recall that our goal is to construct an RLCC that corrects from a δ′radius fraction of errors, with distance ≥ 2δ′radius.
    At a high level, we will create CAEL from C by going through the following three steps:
                                                                                0
   1. Transform C into a code Cspread that is relaxed locally correctable from δradius fraction of errors
      that are well-spread throughout different “blocks” of the codeword (the concept of blocks in a
      codeword will be defined below).
                                                                                            0
   2. Transform Cspread into a code Cconcentrated that is relaxed locally correctable from δradius fraction
      of errors that are concentrated in certain blocks of the codeword.
                                                                                                 0
   3. Transform Cconcentrated into the final code CAEL that is relaxed locally correctable from δradius
      fraction of adversarial errors.
       We start by defining blockwise error patterns, which allow us to formalize what we mean by “well-
spread” and “concentrated” errors above. This definition is implicit in the proof of distance amplification
in [37].
    Fix n and b to be integers such that b divides n. Suppose we have a code C with alphabet Σ and block length n. We partition the indices {1, . . . , n} into n/b contiguous blocks of size b each, which we denote s1, . . . , sn/b.
    We say that e ∈ {0, 1}^n is a (δ, ε, b)-blockwise error pattern if there are at most a δ fraction of the blocks in which e contains more than an ε fraction of ones. In other words, there are at most δ · (n/b) choices of j such that the block sj contains more than ε · b ones. Note that a (δ, ε, b)-blockwise error pattern is permitted to have any number of blocks with at most an ε fraction of ones, but can only have at most a δ fraction of blocks with more than an ε fraction of ones. The δ fraction of blocks that exceed the ε fraction are not restricted in any way; they could potentially be all ones.
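As an illustration, the definition above translates directly into a small checker (a sketch only, with contiguous blocks as in the text):

```python
def is_blockwise_pattern(e, delta, eps, b):
    """Return True iff the 0/1 list e (with b dividing len(e)) has at most
    a delta fraction of contiguous length-b blocks containing more than an
    eps fraction of ones; heavier blocks are otherwise unrestricted."""
    n = len(e)
    assert n % b == 0
    blocks = [e[s:s + b] for s in range(0, n, b)]
    heavy = sum(1 for blk in blocks if sum(blk) > eps * b)
    return heavy <= delta * (n // b)

# one fully corrupt block out of four: a (1/4, 0, 4)-blockwise error pattern
e = [1, 1, 1, 1] + [0] * 12
```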


    Given a string w ∈ Σ^n and a codeword c ∈ C, we say that w is (δ, ε, b)-blockwise close to c if the string e ∈ {0, 1}^n, defined by ei = 1 if wi ≠ ci and ei = 0 otherwise, is a (δ, ε, b)-blockwise error pattern. We furthermore say that w is (δ, ε, b)-blockwise close to C if there is some codeword c ∈ C such that w is (δ, ε, b)-blockwise close to c.
     Finally, we say that C is relaxed locally correctable from (δ , ε, b)-blockwise errors if there exists a
relaxed local corrector M such that M works correctly on strings w that are (δ , ε, b)-blockwise close to
C. We now formally define relaxed local correctability with respect to blockwise errors.
Definition 5.9 (Relaxed Local Correctability from Blockwise Errors). Fix integers n, b > 0 such that b divides n. Let s1, . . . , sn/b be contiguous blocks of indices of length b as described above. Let C ⊆ Σ^n be an error-correcting code, and let δ, ε ∈ [0, 1].
    We say that C is relaxed locally correctable from (δ, ε, b)-blockwise errors if there exists a polynomial-time algorithm M, with oracle access to a string w ∈ Σ^n and explicit access to an index i ∈ [n], such that the following two conditions hold.

   1. If w ∈ C, then M^w(i) = wi with probability 1.

   2. Otherwise, for all strings w ∈ Σ^n such that w is (δ, ε, b)-blockwise close to some codeword c ∈ C,

                                      Pr[M^w(i) ∈ {ci, ⊥}] ≥ 1/2,

       where ⊥ ∉ Σ is a special abort symbol.
    In all the definitions above, we optionally drop b from the notation when the block size is understood; thus, (δ, ε, b)-blockwise errors become (δ, ε)-blockwise errors.
    To give some intuition about blockwise error patterns and how they apply in our setting, there are two parameter regimes of interest. When δ is small compared to ε, the errors are well-spread, since there are relatively few blocks that contain more than an ε fraction of errors. By contrast, when δ is large and ε is small, many blocks could be entirely corrupted, but there are relatively few corruptions outside of them. In this case, the errors are concentrated within a few blocks.
    Now we give a more detailed overview of each of the three major steps of the proof, using the
language of blockwise density:
   1. In the first step, we partition each codeword c ∈ C into blocks, then encode each block with a Reed–Solomon code with block length d and distance 2δ′radius + ε. We will show that this gives us a new code Cspread that is relaxed locally correctable from (δradius, δ′radius + ε/2)-blockwise errors. These kinds of blockwise errors are well-spread, since we think of δradius as small compared to the desired correcting radius δ′radius.
   2. In the second step, we will use the sampler graphs of Lemma 5.8 to apply a “pseudorandom” permutation33 σ to the indices of codewords of Cspread to transform Cspread into a new code Cconcentrated that is resilient to (δ′radius, 0)-blockwise errors. We choose the permutation σ such that the inverse permutation σ^{−1} scrambles (δ′radius, 0)-blockwise errors and turns them into (δradius/2, δ′radius + ε/2)-blockwise errors. Since Cspread is correctable from (δradius/2, δ′radius + ε/2)-blockwise errors,34 this ensures that Cconcentrated is relaxed locally correctable from (δ′radius, 0)-blockwise errors.

  33 That is, a specific deterministic permutation that satisfies some random-like properties (which will be specified below). In particular, we emphasize that we do not refer to the cryptographic primitive of the same name.

   3. Finally, we will increase the alphabet size by collapsing blocks together into characters over a larger alphabet to create CAEL. If the adversary is only permitted to corrupt a δ′radius fraction of symbols, this limits the error patterns that the adversary can impose on the underlying code Cconcentrated to (δ′radius, 0)-blockwise error patterns. Thus, by using the corrector for Cconcentrated on the resulting error patterns, we obtain a relaxed local corrector for CAEL that works up to radius δ′radius.

    For a diagram depicting these steps, we refer the reader to Figure 1.
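To get a feel for Step 2, the sketch below scrambles a concentrated error pattern with a uniformly random permutation and checks blockwise heaviness; the actual construction uses a deterministic sampler-based permutation, and all parameters here are illustrative.

```python
import random

def heavy_blocks(e, b, eps):
    """Number of contiguous length-b blocks of 0/1 list e with more than
    an eps fraction of ones."""
    return sum(1 for s in range(0, len(e), b) if sum(e[s:s + b]) > eps * b)

random.seed(0)
n, b = 1024, 32
# concentrated errors: two whole blocks fully corrupt, nothing else
e = [0] * n
for blk in (3, 17):
    e[blk * b:(blk + 1) * b] = [1] * b
assert heavy_blocks(e, b, eps=0.0) == 2      # a (2/32, 0)-blockwise pattern

# a random permutation spreads the 64 errors across the blocks
perm = list(range(n))
random.shuffle(perm)
scrambled = [e[perm[k]] for k in range(n)]
# no 9 blocks can each hold more than 8 of the 64 errors, so the result is
# certainly a (1/4, 1/4)-blockwise pattern (and typically far better)
assert heavy_blocks(scrambled, b, eps=0.25) <= 0.25 * (n // b)
```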

Remark 5.10 (Comparison to Proof of [37]). As previously stated, our proof that AEL distance amplification preserves relaxed local correctability is very similar to the proof that it preserves local correctability from [37]. We address the reasons for this similarity at a high level before proceeding with the proof.
    Kopparty et al. show that AEL distance amplification constructs a bijection between the base code
C and the final code CAEL such that the local corrector MAEL for CAEL calls the local corrector M for C
many times as a subroutine, with the property that if all the calls to M were successful in correction, then
MAEL will correct successfully. By sufficiently amplifying the success probability of M each time it is
invoked, they conclude that MAEL succeeds with probability at least 2/3.
    We now observe that a similar story can be told for relaxed correctors. We suppose that the code C
has a relaxed local corrector M, and we provide a relaxed local corrector MAEL for CAEL that invokes
M. Just like before, whenever all the invocations to M correct successfully, the relaxed corrector MAEL
corrects successfully. Furthermore, whenever a single invocation to M returns ⊥, the corrector MAEL can
deduce that there is some error in the purported codeword and can safely return ⊥. By reducing the error
probability of M sufficiently (see Remark 3.3), we can conclude that the probability that MAEL makes an
error is less than 1/3.


5.2.2    Setting parameters for the construction
Let C, r, δradius, δ′radius, q, and ε be as in Lemma 5.6. Let n be the block length of C. Let {Gn}n denote an explicit, infinite family of (δradius/2, ε/2)-sampler graphs as given by Lemma 5.8, and let d be their degree. Define b to be (1 − 2δ′radius − ε) · d.
    We will use the sampler graphs from Lemma 5.8 to construct the “pseudorandom” permutations used to scramble errors in Step 2. Additionally, d, the degree of the sampler graphs, must match the size of the blocks in Step 2 (see Lemma 5.12). Hence, we are forced to select b to be small enough with respect to d so that the blocks in Step 1 can tolerate a 2δ′radius + ε fraction of errors.

  34 We use δradius/2 as opposed to δradius here in order to prove that the distance of the resulting codes is 2δ′radius. This appears in Lemma 5.13.






Figure 1: A high-level overview of the Alon–Edmonds–Luby transform. In the first step, a codeword is partitioned into equal-sized blocks, depicted here as length 2. Then, each block is encoded with a Reed–Solomon code. The encoded blocks are associated with vertices on one side of a bipartite graph with good sampling properties, and the permutation is applied according to the edges of the graph. The sampler is chosen such that the “concentrated” (δ′radius, 0)-blockwise errors in the resulting code are sufficiently scrambled after applying the inverse permutation. Finally, the alphabet is enlarged to restrict the adversary to (δ′radius, 0)-blockwise errors.



5.2.3    Encoding for well-spread errors
We start by describing the first transformation from the code C to the code Cspread . Consider a codeword
c ∈ C. Recall that C is a binary code, so c is a string in {0, 1}n . Let t = dlog de.

   1. Let F_{2^t} be the finite field of order 2^t. (Note that in particular F_{2^t} has characteristic 2.)
      Let B : F_{2^t}^b → F_{2^t}^d be the Reed–Solomon code over F_{2^t} with block length d, message length b, distance 2δ′radius + ε, and rate 1 − (2δ′radius + ε). Note that B basically has our target correcting radius, but with small block length. The rest of this first transformation will use B to modify C and endow it with some of B’s error-correcting properties.

   2. Partition the codeword c into n/(b·t) blocks of length b · t. If b · t does not divide the length of c, simply pad c with 0’s so that it does. This adds at most b · t to the block length, which is negligible if the block length is at least (b·t)/ε. If it is less than (b·t)/ε, then a Reed–Solomon code with block length n ≤ (b·t)/ε suffices, since the block length is so small that we can afford to read the whole thing to correct. We will interpret each block as a string of length b over the alphabet {0, 1}^t.

   3. Encode each block using the code B. Each encoded block has length d (with alphabet F_{2^t}), so the total length of all the encoded blocks is (n/(b·t)) · d.
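The three steps amount to a blockwise re-encoding, which can be sketched as follows; `inner_encode` is a stand-in for the Reed–Solomon code B (the toy map below merely repeats each block, to show the data flow only).

```python
def spread_encode(c, b, t, inner_encode):
    """Transform a binary codeword c of C into a codeword of C_spread.
    inner_encode stands in for B: it maps a length-b list of t-bit
    symbols to a length-d list of t-bit symbols."""
    c = list(c) + [0] * (-len(c) % (b * t))       # pad so that b*t | len(c)
    out = []
    for start in range(0, len(c), b * t):
        block = c[start:start + b * t]
        # interpret the block as b symbols over {0,1}^t
        symbols = [tuple(block[s:s + t]) for s in range(0, b * t, t)]
        out.extend(inner_encode(symbols))          # encode the block with B
    return out

# toy inner "code" with d = 2b: repeat the symbols of each block twice
double = lambda symbols: symbols + symbols
w = spread_encode([1, 0, 1, 1, 0, 0, 1, 0], b=2, t=2, inner_encode=double)
```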
    Note that (δradius, δ′radius + ε/2)-blockwise error patterns only have a δradius fraction of blocks with more than a δ′radius + ε/2 fraction of errors. Any block that gets at most a δ′radius + ε/2 fraction of errors can be corrected by using the Reed–Solomon corrector. The δradius fraction of blocks that are so corrupted that the Reed–Solomon corrector will err are few enough that they can be handled by the local corrector for C. We now formalize this intuition.

Lemma 5.11 (Cspread Relaxed Locally Correctable from “Well-Spread” Errors). Let C, δ, δradius, δ′radius, d, q, and ε be as defined in Section 5.2.2. Let B be the Reed–Solomon code with distance 2δ′radius + ε described in the construction of Cspread above.
    The code Cspread : {0, 1}^k → F_{2^t}^{d·n/(b·t)} (over the alphabet F_{2^t}) is GF(2)-linear and relaxed locally correctable from (δradius, δ′radius + ε/2)-blockwise errors. Furthermore, for any two distinct codewords c^(1), c^(2) ∈ Cspread, there exists a set of blocks of measure at least δ such that c^(1) and c^(2) differ in at least a (2δ′radius + ε)-fraction of the coordinates in each of these blocks.
    Finally, the corrector Mspread for Cspread makes at most q · poly(d) queries.

Proof. Cspread is clearly GF(2)-linear: C is binary and linear, and since F_{2^t} has characteristic 2, the transformation from C to Cspread preserves GF(2)-linearity.
    The property that any two distinct codewords in Cspread differ in at least a (2δ′radius + ε)-fraction of coordinates in at least a δ fraction of blocks follows from the distances of the codes C and B. Since any two codewords in C differ in at least a δ fraction of coordinates, even after we partition these codewords into blocks of size b over the alphabet {0, 1}^t in Step 2 of the process above, we retain the property that any two codewords transformed by this process differ in at least a δ fraction of the blocks. Then, each of these blocks is treated as an input to the Reed–Solomon code B, which has distance 2δ′radius + ε, which gives us the desired claim.


    Now we analyze relaxed local correctability. Suppose that our corrector Mspread is given oracle access to a string w′ ∈ F_{2^t}^{d·n/(b·t)} that is (δradius, δ′radius + ε/2)-blockwise close to a codeword cspread ∈ Cspread, and an index i ∈ [d · n/(b · t)] to correct. Let c denote the codeword in C that was transformed into cspread.
    Let us first pretend that Mspread has oracle access to c. Fix j ∈ [n/(b·t)] such that the index i belongs to the jth block. Then correcting index i is easy: Mspread can query all the bits of c that lie in the jth block, encode these bits using B, and read off cspread[i] correctly from the resulting string.
    It would be great if we could simulate direct oracle access to c. Instead, we do the next best thing: we show how Mspread can simulate M, the relaxed local corrector of C, on a string w that is δradius-close to c and has the property that w′ = cspread ⇒ w = c.
    Consider replacing every query to ck with a call to M with string w and index k. Since M has perfect completeness, it is clear that we get perfect completeness for Mspread, since in this case w = c.
    Now we analyze the soundness. We note that we can amplify the success probability of M (Remark 3.3) whenever we call it, so that the probability of error in the soundness case is at most 1/(3b·t). This costs us a multiplicative factor of O(log(b · t)) = poly(d) in the query complexity of Mspread. In return, we get that the probability that Mspread sees a response from M that is incorrect (i. e., neither ⊥ nor consistent with c) in this whole process is at most 1/3.
    If Mspread ever sees M output ⊥, Mspread immediately aborts and outputs ⊥, which is valid since the transformation from C to Cspread guarantees that w ≠ c ⇒ w′ ≠ cspread. Hence, Mspread will either output cspread[i] or ⊥ with probability at least 2/3.
    Now we describe how Mspread simulates M on w. Whenever M wishes to query a coordinate, Mspread reads the entire block of w′ that corresponds to the encoding of the block in which this coordinate lies. If this block is not a valid codeword of B, Mspread rejects and outputs ⊥, so without loss of generality we will assume that all the blocks read are valid codewords of B. So, the corrector decodes this block of w′, and answers the query accordingly. This process of answering a single query made by M costs d queries. We call the implicit string from which M receives responses to its queries w, which is well-defined since we assume that all the blocks that Mspread queries are valid codewords.35 Note that w′ = cspread ⇒ w = c.
    Note that, since the distance of B is 2δ′radius + ε, Mspread will correctly decode any block that has at most a δ′radius + ε/2 fraction of errors.36
    Since w′ is (δradius, δ′radius + ε/2)-blockwise close to cspread, at least a (1 − δradius) fraction of the blocks of w′ have at most a (δ′radius + ε/2) fraction of errors, so the implicit string w, from which Mspread provides answers to the queries of M, is δradius-close to c. Hence, Mspread can simulate the relaxed local corrector M correctly.


5.2.4    From “well-spread” to “concentrated” errors
In this step we start with the code Cspread that is relaxed locally correctable from (δradius, δ′radius + ε/2)-blockwise errors with query complexity q · poly(d) (see Lemma 5.11), and describe how to transform it into a code Cconcentrated that is relaxed locally correctable from (δ′radius, 0)-blockwise errors. Recall from the
  35 Of course, w is only defined on the coordinates in blocks where Mspread makes queries.
  36 As noted by a reviewer, it will actually decode correctly any block with strictly fewer than a 2δ′radius + ε fraction of errors, as fewer errors will lead to a non-codeword. Since we only desire a constant correcting radius for our final results, we sacrifice a factor of two to avoid off-by-one issues.


                          THEORY OF COMPUTING, Volume 16 (18), 2020, pp. 1–68
                          TOM GUR, GOVIND RAMNARAYAN, AND RON D. ROTHBLUM

definition of blockwise error patterns that this is equivalent to being resilient against adversaries who can choose a δ′radius fraction of blocks to fully corrupt, but cannot corrupt any other coordinates.
    The high-level idea is to apply a “pseudorandom” permutation to the coordinates of Cspread to get a code Cconcentrated. As a warmup, we will outline the argument for a random permutation σ : [(n/(b·t))·d] → [(n/(b·t))·d]. Let

        σ(Cspread) = { c_{σ⁻¹(1)} ◦ · · · ◦ c_{σ⁻¹((n/(b·t))·d)} : c ∈ Cspread }

be the set of strings that results from applying σ to the coordinates of each codeword cspread ∈ Cspread. Just like for strings in Cspread, we will view a string c ∈ σ(Cspread) as a sequence of n/(b·t) blocks of length d.
    Now we sketch why, with high probability, σ(Cspread) is a code that is relaxed locally correctable from (δ′radius, 0)-blockwise error patterns. Consider a string w such that w is (δ′radius, 0)-blockwise close to σ(Cspread). Let c be a codeword of σ(Cspread) that is (δ′radius, 0)-blockwise close to w, and let c′ be the codeword of Cspread that corresponds to c under the permutation σ, i.e., c′ = c_{σ(1)} ◦ · · · ◦ c_{σ((n/(b·t))·d)}. Since σ is a random permutation, we can apply a Chernoff bound to conclude that c_{σ(1)} ◦ · · · ◦ c_{σ((n/(b·t))·d)} differs from w on less than a δ′radius + ε/2 fraction of coordinates on all but a δradius/2 fraction of blocks, and so we can use the corrector for Cspread.
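To build intuition for the Chernoff-bound step, here is a small Monte Carlo sketch of ours (in Python, with toy parameters that are our own assumptions, not the paper's asymptotic setting): it fully corrupts a small fraction of blocks and checks that, after a uniformly random permutation of the coordinates, no block remains heavily corrupted.

```python
import random

def spread_by_random_permutation(num_blocks=200, d=50, corrupt_frac=0.05, seed=0):
    """Fully corrupt a corrupt_frac fraction of length-d blocks, apply a
    uniformly random permutation to the coordinates, and report the heaviest
    resulting block (its fraction of corrupted positions)."""
    rng = random.Random(seed)
    n = num_blocks * d
    # A (corrupt_frac, 0)-blockwise error pattern: a few fully corrupt blocks.
    err = [0] * n
    for b in rng.sample(range(num_blocks), int(corrupt_frac * num_blocks)):
        for j in range(b * d, (b + 1) * d):
            err[j] = 1
    # Permute the coordinates with a uniformly random permutation sigma.
    sigma = list(range(n))
    rng.shuffle(sigma)
    permuted = [err[sigma[j]] for j in range(n)]
    # Heaviest block after permuting.
    return max(sum(permuted[j:j + d]) / d for j in range(0, n, d))

w = spread_by_random_permutation()
# Chernoff concentration: each block now has expected weight corrupt_frac,
# so no block should remain anywhere near fully corrupted.
print(w)
```

This only illustrates the concentration phenomenon; the proof below replaces the random permutation with an explicit one derived from a sampler graph.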
    While it is nice that a random permutation works, in order to make our code Cconcentrated explicit we need to use an explicit permutation. Now we observe that we do not actually need σ to be a truly random permutation: a cleverly selected “pseudorandom” permutation will suffice. The property of σ that we used is that it mixes up coordinates well enough that, if we start with a (δ′radius, 0)-blockwise error pattern e, the permuted string e_{σ(1)} ◦ · · · ◦ e_{σ((n/(b·t))·d)} has at most a δradius/2 fraction of blocks that have more than a δ′radius + ε/2 fraction of ones; that is, it is a (δradius/2, δ′radius + ε/2)-blockwise error pattern.

Lemma 5.12 (Explicit Permutations That Sufficiently Mix). Fix n, b, d, t, δradius, ε, and δ′radius to be as in Section 5.2.2. There exists an (explicit) permutation σ : [(n/(b·t))·d] → [(n/(b·t))·d] such that, for every (δ′radius, 0, d)-blockwise error pattern ē ∈ F₂^{(n/(b·t))·d}, the permuted string ē_{σ(1)} ◦ · · · ◦ ē_{σ((n/(b·t))·d)} is a (δradius/2, δ′radius + ε/2, d)-blockwise error pattern.

    We recall that d depends on δradius and ε, as it is set using Lemma 5.8.

Proof. Towards the goal of constructing σ, we describe a correspondence between d-regular, simple, bipartite graphs G = (L, R, E) with |L| = |R| = n/(b·t) and permutations π(G) on [(n/(b·t))·d]. Each of the n/(b·t) blocks of d indices will be associated with one vertex in L and one vertex in R. For each vertex, we assign an arbitrary order to its incident edges. If an edge e is the ith incident edge on vertex u ∈ L and the jth incident edge on vertex v ∈ R, then we let π(G) take the ith position in the block corresponding to u to the jth position in the block corresponding to v.
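The graph-to-permutation correspondence above can be sketched as follows; the function and variable names are ours, and the toy graph (the complete bipartite graph K_{3,3}, which is 3-regular) is for illustration only.

```python
def permutation_from_bipartite_graph(edges, num_vertices, d):
    """Build the permutation pi(G) described above from a d-regular simple
    bipartite graph G = (L, R, E) with |L| = |R| = num_vertices.
    `edges` is a list of (u, v) pairs with u in L, v in R.
    Coordinate i*d + j (0-indexed) is position j of block i."""
    # Assign each vertex an arbitrary order on its incident edges.
    left_rank, right_rank = {}, {}
    left_deg = [0] * num_vertices
    right_deg = [0] * num_vertices
    for (u, v) in edges:
        left_rank[(u, v)] = left_deg[u]; left_deg[u] += 1
        right_rank[(u, v)] = right_deg[v]; right_deg[v] += 1
    # An edge that is the i-th edge of u and the j-th edge of v maps the
    # i-th position of u's block to the j-th position of v's block.
    pi = [None] * (num_vertices * d)
    for (u, v) in edges:
        i, j = left_rank[(u, v)], right_rank[(u, v)]
        pi[u * d + i] = v * d + j
    return pi

# Toy example: K_{3,3}, with 3 vertices per side and degree d = 3.
edges = [(u, v) for u in range(3) for v in range(3)]
pi = permutation_from_bipartite_graph(edges, num_vertices=3, d=3)
assert sorted(pi) == list(range(9))  # pi is a genuine permutation
```

Since every vertex is d-regular, each position of each block is used exactly once on each side, so π(G) is indeed a permutation.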
    Recall the property we are looking for in the permutation σ: we want that, for every (δ′radius, 0)-blockwise error pattern ē, the permuted string ē_{σ(1)} ◦ · · · ◦ ē_{σ((n/(b·t))·d)} has at most a δradius/2 fraction of blocks that have more than a δ′radius + ε/2 fraction of ones. By replacing blocks with vertices, identifying the δ′radius fraction of corrupted blocks of ē with the vertex set S in the definition of sampler graphs (Definition 5.7), and interpreting the edges of the graph as describing a mapping between indices, we can see that this property

exactly corresponds to G being a (δradius/2, ε/2)-sampler. Hence, by taking a d-regular (δradius/2, ε/2)-sampler with n/(b·t) vertices on each side, as guaranteed by Lemma 5.8, and taking the corresponding permutation, we can construct sufficiently mixing permutations in time poly(log(n)/(δradius·ε)) = poly(log(n)·d).

    Let σ be the explicit permutation given by Lemma 5.12. We define Cconcentrated = σ (Cspread ).

Lemma 5.13. Let δ′radius, q, and d be defined as in Section 5.2.2. The code Cconcentrated is F₂-linear. Furthermore, every pair of distinct codewords c⁽¹⁾concentrated, c⁽²⁾concentrated in Cconcentrated differs in at least a 2δ′radius fraction of blocks.
    Finally, Cconcentrated is relaxed locally correctable from (δ′radius, 0)-blockwise errors, where the relaxed local corrector makes at most q · poly(d) queries.

Proof. Let σ be the permutation used in the definition of Cconcentrated. Let ε and δradius be as defined in Section 5.2.2. Note that applying a permutation preserves GF(2)-linearity, so Cconcentrated is GF(2)-linear.
    The property that every pair of distinct codewords in Cconcentrated differs in at least a 2δ′radius fraction of blocks follows from the fact that σ maps (δ′radius, 0)-blockwise error patterns to (δradius/2, δ′radius + ε/2)-blockwise error patterns. Therefore, σ must map (2δ′radius, 0)-blockwise error patterns to (δradius, 2δ′radius + ε)-blockwise error patterns, which follows from splitting the (2δ′radius, 0)-blockwise error pattern into two (δ′radius, 0)-blockwise error patterns and taking a union bound.
    Now fix any codeword cconcentrated ∈ Cconcentrated. We argue that any string w that is (2δ′radius − γ, 0)-blockwise close to cconcentrated, for any γ > 0, cannot be a codeword of Cconcentrated, which establishes the claim. Suppose for contradiction that w is a codeword of Cconcentrated, and let cspread be the codeword of Cspread that corresponds to cconcentrated. By applying the permutation σ⁻¹ to the coordinates of w, we get some codeword wspread ∈ Cspread, by the definition of Cconcentrated. However, due to the mixing property of σ described above, wspread must be (δradius, 2δ′radius + ε − γ)-blockwise close to cspread. This is a contradiction, since two distinct codewords of Cspread must differ by at least a 2δ′radius + ε fraction of symbols in at least a δradius fraction of blocks.
    We now describe the relaxed local corrector. This corrector will simply apply the permutation σ to the coordinates of purported codewords of Cconcentrated and run the relaxed local corrector of Cspread on the resulting string. This suffices due to the mixing property of the permutation σ⁻¹.
    Fix a string z′ that is (δ′radius, 0)-blockwise close to a codeword c′ ∈ Cconcentrated and a coordinate i ∈ [(n/(b·t))·d] to correct. Run the relaxed local corrector for Cspread with input string z″ = z′_{σ(1)} ◦ · · · ◦ z′_{σ((n/(b·t))·d)} and coordinate σ⁻¹(i). Whenever the corrector queries a position j, provide it with z″_j. Return whatever the relaxed local corrector for Cspread returns.
    To establish soundness, note that, due to the properties of σ given by Lemma 5.12, the string z″ will have at most a δradius/2 fraction of blocks with more than a δ′radius + ε/2 fraction of errors. Since we have a relaxed local corrector for Cspread, we will recover z″_{σ⁻¹(i)} = z′_i or ⊥ with high probability, by the soundness property of the relaxed local corrector for Cspread (see Lemma 5.11).
    To establish completeness, note that when z′ = c′, we will always return c′_i. This is because the string z″ := c′_{σ(1)} ◦ · · · ◦ c′_{σ((n/(b·t))·d)} is a valid codeword of Cspread, and so the corrector for Cspread will always correctly return z″_{σ⁻¹(i)}, by the completeness property of relaxed local correction.


   The number of queries made by this corrector for Cconcentrated is exactly the same as the number of
queries used by the corrector for Cspread .

    Finally, we describe how to transform Cconcentrated into a code that is resilient to adversarial errors.

5.2.5   From blockwise to adversarial errors
We start from a code Cconcentrated that is relaxed locally correctable from (δ′radius, 0)-blockwise error patterns, as described above, and construct a code CAEL that is relaxed locally correctable from adversarial errors.
    We form CAEL from Cconcentrated by taking each codeword in Cconcentrated and rewriting each block of symbols (c₁, . . . , cd) ∈ F_{2^t}^d as a single character in F_{2^{t·d}}, in the natural way (i.e., by viewing d-dimensional vectors over F_{2^t} as elements of the field F_{2^{t·d}}).
    The idea is that, now, adversarial errors for CAEL correspond to blockwise errors in the code
Cconcentrated , which we can correct with the relaxed local corrector for Cconcentrated .
    We now prove Lemma 5.6.

Proof of Lemma 5.6. The corrector for CAEL, when asked to correct a coordinate i, will simply use the corrector for Cconcentrated to correct each of the coordinates (i − 1)·d + 1, . . . , i·d. Since the corrector for Cconcentrated has query complexity q · poly(d), we conclude that CAEL also has query complexity q · poly(d) = q · poly(1/(ε·δradius)). Furthermore, since Cconcentrated was locally correctable from (δ′radius, 0)-blockwise errors, we see that CAEL has correcting radius δ′radius.
    We begin by proving the rate and distance of the overall AEL transform we have described.
    The rate of CAEL can be written as

        rAEL = log|C| / ((n/(b·t)) · (d·t)).

Noting that the rate of C is r = log|C|/n and simplifying, we get that

        rAEL ≥ r · (b/d).

Plugging in b = d · (1 − 2δ′radius − ε), we get that

        rAEL ≥ (1 − 2δ′radius − ε) · r,

as desired.
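As a quick numeric sanity check of this rate calculation (with toy values of our own choosing for r, δ′radius, ε, and d):

```python
# Toy values (ours) for the quantities in the rate derivation above.
r, delta_prime, eps, d = 0.9, 0.05, 0.01, 100   # delta_prime plays delta'_radius
b = d * (1 - 2 * delta_prime - eps)             # b = d * (1 - 2*delta'_radius - eps)
r_ael = r * b / d                               # r_AEL >= r * (b / d)
assert abs(r_ael - (1 - 2 * delta_prime - eps) * r) < 1e-12
print(r_ael)  # about 0.801
```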
    The distance of the code CAEL is at least 2δ′radius, which follows immediately from the fact that any two distinct codewords of Cconcentrated differ in at least a 2δ′radius fraction of blocks (Lemma 5.13).
    A relaxed local corrector for CAEL works as follows. Given a string a ∈ F_{2^{t·d}}^{n/(b·t)} that is δ′radius-close to CAEL and a desired coordinate i ∈ [n/(b·t)] to correct, expand the alphabet of a in the natural way to get a string a′ ∈ F_{2^t}^{d·n/(b·t)} that is (δ′radius, 0, d)-blockwise close to Cconcentrated. Run the relaxed local corrector for Cconcentrated on a′ for each of the coordinates in the ith block.


    Note that we can amplify each of these runs so that the relaxed corrector for Cconcentrated works correctly on each coordinate with probability 1 − 1/(3d), at an O(log d) cost in query complexity, and so by a union bound we will get the whole block correct with probability at least 2/3.
    Note that this operation of collapsing blocks trivially preserves GF(2)-linearity, so CAEL is GF(2)-linear.
    The final alphabet is {0,1}^{t·d}, where we recall that t·d = d · ⌈log d⌉ = poly(1/(ε·δradius)).
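The amplification-plus-union-bound argument above can be sanity-checked numerically. The sketch below is our own: it uses the standard Hoeffding bound for majority voting over k runs of a corrector with error probability 1/3, which errs with probability at most exp(−2k(1/2 − 1/3)²) = exp(−k/18); the constant 18 is ours, not the paper's.

```python
import math

def repetitions_needed(d):
    """Number of repetitions k so that majority voting over k runs of a
    corrector with error probability 1/3 fails with probability <= 1/(3d).
    By Hoeffding, the majority errs with probability <= exp(-k/18), so
    k = ceil(18 * ln(3d)) suffices."""
    return math.ceil(18 * math.log(3 * d))

for d in [8, 64, 1024]:
    k = repetitions_needed(d)
    per_coord = math.exp(-k / 18)
    assert per_coord <= 1 / (3 * d)            # amplified per-coordinate error
    assert d * per_coord <= 1 / 3              # union bound over d coordinates
    assert k <= 40 * math.ceil(math.log(d))    # k = O(log d) repetitions
```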

5.3    Concatenation
Recall that our goal is to start with a code with constant block length and repeatedly tensor and apply the
Alon–Edmonds–Luby transform to increase the block length, while preserving relaxed local correctability
and distance. However, for the tensor of a code C to be well-defined, we need C to be a linear code,
while the AEL transform (Lemma 5.6) only outputs a GF(2)-linear code. Fortunately, we can transform a
GF(2)-linear code C into a binary linear code by concatenating C with a binary code. See Section 2.1.1
for the definition of code concatenation.
    We will concatenate the code that is output by the AEL Transform (Lemma 5.6) with a good binary
code, so that we do not undo the distance amplification achieved by the AEL Transform and simultaneously
do not lose too much in the rate. Fortunately, sufficiently good binary codes do exist.
    For most applications of concatenation in this construction we will be dealing with codes with extremely small distance. For sufficiently small δ, we can use codes that satisfy the following rate–distance trade-off.

Fact 5.14 (Zyablov Bound for Small Distance (Fact 2.4 from [37])). For every sufficiently small δ > 0, there exists an explicit, infinite family {Zn}n of binary linear codes with relative distance δ and rate

        r ≥ 1 − δ^{1/3}.

    For our final concatenation, we will use binary codes with the best possible rate–distance tradeoff we know of: codes that meet the Gilbert–Varshamov bound.³⁷ We now state a fact about the existence of linear codes of constant block length matching the Gilbert–Varshamov bound.

Fact 5.15 (Codes Achieving the Gilbert–Varshamov Bound). There exist (trivially explicit) binary linear codes with constant block length that achieve the Gilbert–Varshamov bound: namely, for any constants δ, ε > 0, there exists a binary linear code with block length depending only on δ and ε, distance δ, and rate 1 − H(δ) − ε, where H(p) = −p · log(p) − (1 − p) · log(1 − p) is the binary entropy function.
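For concreteness, the rate promised by Fact 5.15 is easy to evaluate; the sketch below (our own helper names, with illustrative values of δ and ε) simply computes 1 − H(δ) − ε.

```python
import math

def binary_entropy(p):
    """H(p) = -p*log2(p) - (1-p)*log2(1-p)."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def gv_rate(delta, eps):
    """Rate 1 - H(delta) - eps promised by Fact 5.15."""
    return 1 - binary_entropy(delta) - eps

r = gv_rate(delta=0.11, eps=0.01)  # illustrative values, assumed by us
assert 0 < r < 1
```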

    We note that both the codes described have efficient decoders that decode up to a constant radius. We
will use these decoders in our construction.
    Now we are ready to describe the overall construction of our RLCCs.

5.4    Putting it all together: final construction and parameters
As alluded to earlier, the high-level description of this construction is that we start with a linear code with
poly log(n) block length and high rate, then tensor it with itself, apply the Alon–Edmonds–Luby transform,
  37 For the Gilbert–Varshamov bound [20, 51], see, e. g., [32, Chap. 4.2].




concatenate with a sufficiently good binary code, and repeat. We now give the exact construction and its analysis. Let δ := 1/(log n)^36. We first construct an intermediate RLCC with subconstant distance by iterating the AEL transform, concatenation, and tensoring. This part of the construction is identical to the construction of LTCs in [37], except that they combine the AEL transform and concatenation into a single distance-amplification step. We expand on this in Remark 5.17.

Lemma 5.16 (RLCC with Sub-constant Distance (Analogous to Lemma 5.2 in [37])). There exists an
explicit infinite family of binary linear codes {Wn }n satisfying:

   1. Wn has block length at least n, rate at least 1 − O(1/log(n)), and relative distance at least 1/polylog(n).

   2. Wn is relaxed locally correctable with query complexity (log n)^{O(log log(n))} and constant correcting radius.

Proof. We give an algorithm to construct Wn given n.

   1. Let D0 be a code from Fact 5.14 with block length polylog(n), rate 1 − δ^{1/3}, and relative distance δ.

   2. For i = 0, . . . , log log(n) − 1:

        (a) Let D̃i be the result of applying Lemma 5.6 with input code Di, target distance δ′radius = δ^{1/4}/2, and ε = δ^{1/4}. Let Γ denote the output alphabet of D̃i.

        (b) Let Z be an explicit binary code with distance δ^{1/4} and rate at least 1 − δ^{1/12}, as guaranteed by Fact 5.14, where Z has sufficiently large message length to be concatenated with D̃i.³⁸ Let D′i denote the concatenation of D̃i with Z.

        (c) Set Di+1 = (D′i)².

   3. Return Dlog log(n).
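To see how the parameters evolve through this loop, here is a toy Python simulation of our own. It tracks only the rate and distance bounds established in the analysis (AEL brings the distance to δ^{1/4} and multiplies the rate by 1 − 2δ^{1/4}; concatenation multiplies the distance by δ^{1/4} and the rate by 1 − δ^{1/12}; tensoring squares both), with logarithms taken base 2 and an illustrative value of n.

```python
import math

def iterate(n):
    """Track the rate/distance lower bounds through the loop above."""
    delta = 1 / math.log2(n) ** 36
    rate = 1 - delta ** (1 / 3)          # D_0 from Fact 5.14
    dist = delta
    for _ in range(int(math.log2(math.log2(n)))):
        dist = delta ** 0.25             # AEL distance amplification
        rate *= 1 - 2 * delta ** 0.25
        dist *= delta ** 0.25            # concatenation with Z
        rate *= 1 - delta ** (1 / 12)
        dist **= 2                       # tensoring squares rate and distance
        rate **= 2
    return rate, dist

rate, dist = iterate(n=2 ** 64)          # so log(n) = 64, loglog(n) = 6
delta = 1 / 64 ** 36
assert abs(dist / delta - 1) < 1e-6      # distance returns to delta each round
assert rate >= 1 - 6 / 64                # consistent with the 1 - 6/log(n) bound
```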

    Now we prove that the code Dlog log(n) has the required properties.

Correcting radius. We begin by observing that the correcting radius of Di is a constant for each i. Clearly this holds for i = 0, as our corrector for D0 just reads the entire codeword, so the correcting radius is constant. Assume this is true for i − 1; we will show that it is true for i.
     In the first step, we apply AEL distance amplification to bring the distance to δ^{1/4}. We note that this results from setting δ′radius to δ^{1/4}/2 in Lemma 5.6, and hence the correcting radius of this code is δ^{1/4}/2. Then, we concatenate with the code Z. Our corrector for the concatenated code D′i will use the corrector for D̃i, answering each query made by the corrector for D̃i by reading the entire codeword of Z used to encode the queried symbol of Γ and running the underlying decoder, which corrects up to a constant radius. Overall, we get that the code is correctable to a constant radius.
  38 When Z has a larger message size than the alphabet size of D̃i, we pad the alphabet to make the sizes equal. As it turns out, the code family from Fact 5.14 ensures that this incurs a negligible cost for our purposes.



Distance. We will prove by induction that the distance of Di is δ for all i. Assume this is true for Di; we now prove it for Di+1. Note that, from Section 2.1.1, the distance of D′i, the concatenated code, is equal to the distance of D̃i multiplied by the distance of Z. Therefore, D′i has distance √δ. Since Di+1 = (D′i)², and tensoring squares the distance, the claim follows.

Rate. Let us analyze how the rate of Di+1 differs from the rate of Di. Denote the rate of Di by ri. First, we apply Lemma 5.6 with target distance δ^{1/4} and a parameter ε, which we set to δ^{1/4}. This gives us rate ri · (1 − 2δ^{1/4}). Then, we concatenate with the code Z, which has rate at least 1 − δ^{1/12}, which gives us

        ri+1 ≥ ri · (1 − 2δ^{1/4}) · (1 − δ^{1/12}) ≥ ri · (1 − 3δ^{1/12}).

Finally, we tensor the code. Since tensoring squares the rate, this gives us

        ri+1 ≥ ri² · (1 − 3δ^{1/12})² ≥ ri² · (1 − 6δ^{1/12}).
    Note that we start with D0, which has rate 1 − δ^{1/3} > 1 − 6δ^{1/12}. We furthermore prove by induction that the rate of Di is at least

        ri ≥ (1 − 6δ^{1/12})^{3^i}.

Indeed, assume the claim is true for i − 1. Then we have that

        ri ≥ ri−1² · (1 − 6δ^{1/12})
           ≥ (1 − 6δ^{1/12})^{3^{i−1}·2} · (1 − 6δ^{1/12})
           ≥ (1 − 6δ^{1/12})^{3^i}.

Applying the fact that δ := 1/(log n)^36, we can bound the final rate:

        r_{loglog(n)} ≥ (1 − 6/log³(n))^{3^{loglog(n)}}
                     ≥ (1 − 6/log³(n))^{log²(n)}
                     ≥ 1 − (6/log³(n)) · log²(n)
                     = 1 − 6/log(n).

Queries. Our goal in this section is to prove that each iteration of the loop contributes a multiplicative factor of polylog(n) queries; hence our overall number of queries is (log n)^{O(loglog(n))}. Note that D0 is relaxed locally correctable with polylog(n) queries, and furthermore that D0 has constant correcting radius (since its relaxed local corrector just reads everything). We will prove by induction that, for every i from 0 to loglog(n), the number of queries needed to relaxed locally correct Di is at most (log n)^{k(i+1)}, for

some constant k that we will define later. We have already established that this holds for i = 0. Assume below that it holds for i − 1; we now prove it for i.
    First, we amplify the distance of our code to δ^{1/4} using AEL distance amplification. Since we maintain the invariant that the distance of our code at the beginning of each loop iteration is δ and the correcting radius is constant, Lemma 5.6 tells us that the query complexity of this new code is (log n)^{k·i} · (log n)^{O(1)}. Then, we concatenate the code with the code Z. Since we can take Z to have block length (log(n))^{O(1)}, we can simulate the old corrector by reading an entire codeword of Z whenever the old corrector queried a symbol, incurring at most a polylogarithmic factor in query complexity. Finally, we tensor the code, which costs us an O(1/δ) = (log(n))^{O(1)} factor by Lemma 5.5. So overall, we get that the query complexity of the relaxed local corrector for Di is at most

        (log n)^{k·i} · (log(n))^{O(1)} · (log(n))^{O(1)} · (log(n))^{O(1)} = (log n)^{k·i + O(1)}.

Defining k to be the O(1) additive term in the exponent gives us the desired result.


    Now we will use the intermediate RLCCs from Lemma 5.16 in order to construct RLCCs with constant
rate, constant distance, and quasipolylogarithmic query complexity.

Proof of Theorem 5.2. To construct these codes, we will apply the AEL Transform to the codes of Lemma 5.16 to amplify the distance to a constant. However, doing so all at once would cause our alphabet size to become superconstant, as we start with subconstant distance (see Lemma 5.6 for details). We will insert a step of concatenation to rectify this issue.

   1. Start with a code D of block length at least n from Lemma 5.16. Apply AEL distance amplification (Lemma 5.6) to D with target distance δ_C̃ := γ/10, where γ is from the statement of Theorem 5.2, and the parameter ε in Lemma 5.6 set to γ/10. Call the resulting code C̃.

   2. Let Z′ be an explicit binary code satisfying the conditions in Fact 5.14 with distance γ³/1000, rate 1 − γ/10, and some polylogarithmic block length appropriate for concatenation with C̃. Let C′ be the code that results from concatenating C̃ with Z′.

   3. Apply AEL distance amplification (Lemma 5.6) to C′ with target distance 1 − r − γ and ε in Lemma 5.6 set to γ/10. Call the resulting code C.

   4. Return C.

Distance. This follows from the fact that we used distance amplification in the final step.
                                                                         
Rate. Recall that Lemma 5.16 gave us codes with rate 1 − O(1/log n). After applying AEL, we will get rate at least (1 − O(1/log n)) · (1 − γ/5) > 1 − 2γ/5. After concatenation, Section 2.1.1 tells us that we will get rate at least (1 − 2γ/5) · (1 − γ/10) > 1 − 3γ/5. After the final distance amplification, we will get rate at least (1 − 3γ/5) · (r + γ − γ/10) ≥ r.


Queries. Recall that Lemma 5.16 gave us codes with query complexity at most q = (log n)^{O(log log n)} that are relaxed locally correctable from a 1/polylog(n) fraction of errors. After applying the first AEL transform with target distance γ/10, the resulting code will be an RLCC with q · polylog(n) queries by Lemma 5.6. Concatenating with the code Z′ will increase our queries by another multiplicative polylog(n) factor, and then our final AEL transform will increase our queries by a multiplicative constant. Overall, we retain query complexity (log n)^{O(log log n)} in our final code.

Alphabet. The output alphabet of the final step is {0,1}^p, where p = poly(1/γ) by Lemma 5.6.

   Finally, we can prove Theorem 5.1 by starting with the codes from Theorem 5.2, and concatenating
them with binary codes given to us by Fact 5.14.

Proof of Theorem 5.1. Start with codes from Theorem 5.2 that have rate R and distance at least 1 − R − γ,
for a small constant γ > 0. Then, simply concatenate them with binary codes of appropriate constant
block length from Fact 5.15 that have distance δ′ and rate 1 − H(δ′) − γ. The linearity of these codes
follows from the fact that the codes in Theorem 5.2 are GF(2)-linear, and we are concatenating with a
linear binary code. The query complexity is easily seen to remain quasi-polylogarithmic.
    We now calculate the rate and distance of the resulting codes. The rate of these concatenated codes is
r = R · (1 − H(δ′) − γ). Rewriting the equation to isolate δ′, we get

                                          δ′ = H^{−1}(1 − r/R − γ).

The distance of the resulting code is δ = (1 − R − γ) · δ′. Plugging in for δ′, we get

                                   δ = (1 − R − γ) · H^{−1}(1 − r/R − γ),

where r is the rate of our desired code and R is the rate of the code from Theorem 5.2. Noting that
Theorem 5.2 allowed us to construct codes for any constant R ∈ (0, 1) and that H^{−1} is continuous, we
can maximize the above expression over R to achieve

                               δ = max_{r<R<1} (1 − R − ε) · H^{−1}(1 − r/R)

for arbitrarily small ε > 0.
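One can get a feel for this bound by evaluating the maximization numerically. The sketch below is a hypothetical illustration rather than part of the proof; in particular, inverting the binary entropy function H by bisection on [0, 1/2] is an assumed implementation detail, and the grid search over R is a simplification.

```python
import math

def H(x: float) -> float:
    """Binary entropy function."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def H_inv(y: float) -> float:
    """Inverse of H on [0, 1/2] by bisection (H is increasing there)."""
    lo, hi = 0.0, 0.5
    for _ in range(100):
        mid = (lo + hi) / 2
        if H(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def best_distance(r: float, eps: float = 1e-3, steps: int = 2000) -> float:
    """Numerically maximize (1 - R - eps) * H_inv(1 - r/R) over r < R < 1."""
    best = 0.0
    for i in range(1, steps):
        R = r + (1 - r) * i / steps
        best = max(best, (1 - R - eps) * H_inv(1 - r / R))
    return best

# E.g., the achievable distance for target rate r = 1/2:
print(best_distance(0.5))
```

As expected, the achievable distance shrinks as the target rate r grows, since both factors in the maximized expression decrease.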

Remark 5.17. We used the presentation of first applying Alon–Edmonds–Luby and then tensoring at each
iteration in order to construct the same code that is constructed in Kopparty et al. [37]. Readers who are
already familiar with this construction may recognize that our analysis of relaxed local correctability for
the code with subconstant distance (Lemma 5.16) is simpler than the analysis of the analogous theorem
for local testability in Kopparty et al. This is because our proof that tensoring preserves relaxed local
correctability does not require any additional assumptions on the base code.
     In contrast, it is not necessarily true that the tensor of a locally testable code is locally testable [50].
However, it is true if the base code C is the square of some code D [52]. Therefore, Kopparty et al. prove a
distance amplification lemma that takes in a code C = D^2 and outputs a code C′ = (D′)^2, by applying
the AEL transform on the base code. Furthermore, they need to prove that this modified distance
amplification method still preserves local testability.
    Since we do not need to use this syntactically modified distance amplification method to prove
relaxed local correctability, we use the vanilla AEL transform to construct the same code that they
construct in [37]. Since the construction is exactly the same as the one in [37], it follows that the LTCs
constructed in [37] are simultaneously locally testable and relaxed locally correctable (as proved here), both
with quasi-polylogarithmic query complexity.


Acknowledgments
We thank Oded Goldreich for initiating a discussion that led to this paper, for insightful conversations
and for encouragement. The second author would like to thank Dana Moshkovitz for useful discussions
surrounding this work. Finally, we would like to thank our reviewers, who gave excellent and thorough
feedback that greatly improved the readability of this manuscript.


References
 [1] Noga Alon and Michael Luby: A linear time erasure-resilient code with nearly optimal
     recovery. IEEE Trans. Inform. Theory, 42(6):1732–1736, 1996. Preliminary version by Noga Alon,
     Jeff Edmonds, and Michael Luby in FOCS'95. [doi:10.1109/18.556669] 5, 45, 47

 [2] Sanjeev Arora, Carsten Lund, Rajeev Motwani, Madhu Sudan, and Mario Szegedy:
     Proof verification and the hardness of approximation problems. J. ACM, 45(3):501–555,
     1998. Preliminary version in FOCS'92. [doi:10.1145/278298.278306] 9, 28

 [3] Sanjeev Arora and Shmuel Safra: Probabilistic checking of proofs: A new characterization
     of NP. J. ACM, 45(1):70–122, 1998. Preliminary version in FOCS'92. [doi:10.1145/273865.273901]
     8

 [4] Vikraman Arvind, Pushkar S. Joglekar, Partha Mukhopadhyay, and S. Raja:
     Randomized polynomial-time identity testing for noncommutative circuits. Theory of Computing,
     15(7):1–36, 2019. Preliminary version in STOC'17. [doi:10.4086/toc.2019.v015a007] 19

 [5] László Babai, Lance Fortnow, Leonid A. Levin, and Mario Szegedy: Checking
     computations in polylogarithmic time. In Proc. 23rd STOC, pp. 21–31. ACM Press, 1991.
     [doi:10.1145/103418.103428] 7, 15

 [6] Alexander Barg, Itzhak Tamo, and Serge Vlăduţ: Locally recoverable codes on
     algebraic curves. IEEE Trans. Inform. Theory, 63(8):4928–4939, 2017. Preliminary version in
     ISIT'15. [doi:10.1109/TIT.2017.2700859, arXiv:1603.08876] 11

 [7] Eli Ben-Sasson, Oded Goldreich, Prahladh Harsha, Madhu Sudan, and Salil P.
     Vadhan: Robust PCPs of proximity, shorter PCPs, and applications to coding. SIAM J. Com-
     put., 36(4):889–974, 2006. Preliminary version in STOC'04. [doi:10.1137/S0097539705446810,
     ECCC:TR04-021] 3, 5, 6, 7, 8, 13, 14, 15, 17

 [8] Eli Ben-Sasson, Prahladh Harsha, Oded Lachish, and Arie Matsliah: Sound 3-
     query PCPPs are long. ACM Trans. Comput. Theory, 1(2):7:1–7:49, 2009. Preliminary version in
     ICALP'08. [doi:10.1145/1595391.1595394, ECCC:TR07-127] 6

 [9] Manuel Blum, Michael Luby, and Ronitt Rubinfeld: Self-testing/correcting with appli-
     cations to numerical problems. J. Comput. System Sci., 47(3):549–595, 1993. Preliminary version
     in STOC'90. [doi:10.1016/0022-0000(93)90044-W] 30

[10] Viveck R. Cadambe and Arya Mazumdar: Bounds on the size of locally recoverable
     codes. IEEE Trans. Inform. Theory, 61(11):5787–5794, 2015. Preliminary version in NetCod'13.
     [doi:10.1109/TIT.2015.2477406] 11

[11] Clément L. Canonne and Tom Gur: An adaptivity hierarchy theorem for property testing.
     Comput. Complexity, 27(4):671–716, 2018. Preliminary version in CCC'17. [doi:10.1007/s00037-
     018-0168-4, arXiv:1702.05678] 12

[12] Alessandro Chiesa, Tom Gur, and Igor Shinkar: Relaxed locally correctable
     codes with nearly-linear block length and constant query complexity. In Proc. 31st Ann.
     ACM-SIAM Symp. on Discrete Algorithms (SODA'20), pp. 1395–1411. ACM Press, 2020.
     [doi:10.1137/1.9781611975994.84, ECCC:TR20-113] 4, 11

[13] Marcel Dall'Agnol, Tom Gur, and Oded Lachish: A structural theorem for local algo-
     rithms with applications to coding, testing, and privacy. Electron. Colloq. on Comput. Complexity
     (ECCC), TR20-154:1–55, 2020. [ECCC] 11

[14] Irit Dinur: The PCP theorem by gap amplification. J. ACM, 54(3):12–55, 2007. Preliminary
     version in STOC'06. [doi:10.1145/1236457.1236459, ECCC:TR05-046] 45

[15] Irit Dinur and Prahladh Harsha: Composition of low-error 2-query PCPs using decodable
     PCPs. In Oded Goldreich, editor, Property Testing: Current Research and Surveys, volume
     6390 of LNCS, pp. 280–288. Springer, 2010. Preliminary version in FOCS'09. [doi:10.1007/978-3-
     642-16367-8_21] 11

[16] Irit Dinur and Omer Reingold: Assignment testers: Towards a combinatorial proof of
     the PCP theorem. SIAM J. Comput., 36(4):975–1024, 2006. Preliminary version in FOCS'04.
     [doi:10.1137/S0097539705446962] 4, 5, 15

[17] Klim Efremenko: 3-query locally decodable codes of subexponential length. SIAM J. Comput.,
     41(6):1694–1703, 2012. Preliminary version in STOC'09. [doi:10.1137/090772721, ECCC:TR08-
     069] 3, 4

[18] Katalin Friedl and Madhu Sudan: Some improvements to total degree tests. In 4th Israel
     Symp. on Theory of Computing and Systems (ISTCS'95), pp. 190–198. IEEE Comp. Soc., 1995.
     [doi:10.1109/ISTCS.1995.377032, arXiv:1307.3975] 19, 20

[19] Peter Gemmell and Madhu Sudan: Highly resilient correctors for polynomials. Inform.
     Process. Lett., 43(4):169–174, 1992. [doi:10.1016/0020-0190(92)90195-2] 19

[20] Edgar N. Gilbert: A comparison of signalling alphabets. Bell Sys. Tech. J., 31(3):504–522,
     1952. [doi:10.1002/j.1538-7305.1952.tb01393.x] 57

[21] Oded Goldreich: Bravely, moderately: A common theme in four recent works. In Studies in
     Complexity and Cryptography. Miscellanea on the Interplay between Randomness and Computation,
     pp. 373–389. Springer, 2011. [doi:10.1007/978-3-642-22670-0_26] 45

[22] Oded Goldreich: A sample of samplers: A computational perspective on sampling. In Oded
     Goldreich, editor, Studies in Complexity and Cryptography. Miscellanea on the Interplay between
     Randomness and Computation, pp. 302–332. Springer, 2011. [doi:10.1007/978-3-642-22670-0_24]
     48

[23] Oded Goldreich: Introduction to Property Testing. Cambridge Univ. Press, 2017.
     [doi:10.1017/9781108135252] 30

[24] Oded Goldreich and Tom Gur: Universal locally verifiable codes and 3-round interactive
     proofs of proximity for CSP. Electron. Colloq. on Comput. Complexity (ECCC), 23:1–20, 2016.
     [ECCC:TR16-192] 6

[25] Oded Goldreich and Tom Gur: Universal locally testable codes. Chicago J. Theoret. Comp.
     Sci., 2018(3):1–21, 2018. [doi:10.4086/cjtcs.2018.003, ECCC:TR16-042] 6, 10, 11, 12

[26] Oded Goldreich, Tom Gur, and Ilan Komargodski: Strong locally testable codes with
     relaxed local decoders. ACM Trans. Comput. Theory, 11(3):17:1–17:38, 2019. Preliminary version
     in CCC'15. [doi:10.1145/3319907, ECCC:TR14-025] 6

[27] Oded Goldreich and Madhu Sudan: Locally testable codes and PCPs of almost-linear length.
     J. ACM, 53(4):558–655, 2006. Preliminary version in FOCS'02. [doi:10.1145/1162349.1162351,
     ECCC:TR02-050] 6, 9, 14, 16, 32

[28] Parikshit Gopalan, Cheng Huang, Huseyin Simitci, and Sergey Yekhanin: On
     the locality of codeword symbols. IEEE Trans. Inform. Theory, 58(11):6925–6934, 2012.
     [doi:10.1109/TIT.2012.2208937, arXiv:1106.3625] 11

[29] Tom Gur and Oded Lachish: On the power of relaxed local decoding algorithms. In Proc.
     31st Ann. ACM-SIAM Symp. on Discrete Algorithms (SODA'20), pp. 1377–1394. ACM Press, 2020.
     [doi:10.1137/1.9781611975994.83] 11

[30] Tom Gur, Govind Ramnarayan, and Ron D. Rothblum: Relaxed locally correctable
     codes. In Proc. 9th Innovations in Theoret. Comp. Sci. conf. (ITCS'18), pp. 27:1–27:11. Schloss
     Dagstuhl–Leibniz-Zentrum fuer Informatik, 2018. [doi:10.4230/LIPIcs.ITCS.2018.27] 11

[31] Tom Gur and Ron D. Rothblum: Non-interactive proofs of proximity. Comput. Complexity,
     27(1):99–207, 2018. Preliminary version in ITCS'15. [doi:10.1007/s00037-016-0136-9] 12, 20

[32] Venkatesan Guruswami, Atri Rudra, and Madhu Sudan: Essential Coding Theory.
     2019. Book draft available at the authors' website. 5, 57

[33] Venkatesan Guruswami, Chaoping Xing, and Chen Yuan: How long can optimal locally
     repairable codes be? IEEE Trans. Inform. Theory, 65(6):3662–3670, 2019. Preliminary version in
     RANDOM'18. [doi:10.1109/TIT.2019.2891765, arXiv:1807.01064] 11

[34] Richard W. Hamming: Error detecting and error correcting codes. Bell Sys. Tech. J., 29(2):147–
     160, 1950. [doi:10.1002/j.1538-7305.1950.tb00463.x] 3

[35] Brett Hemenway, Noga Ron-Zewi, and Mary Wootters: Personal communication, 2017.
     5

[36] Jørn Justesen: Class of constructive asymptotically good algebraic codes. IEEE Trans. Inform.
     Theory, 18(5):652–656, 1972. [doi:10.1109/TIT.1972.1054893] 13

[37] Swastik Kopparty, Or Meir, Noga Ron-Zewi, and Shubhangi Saraf: High-rate locally-
     correctable and locally-testable codes with sub-polynomial query complexity. J. ACM, 64(2):11:1–
     11:42, 2017. Preliminary version in STOC'16. [doi:10.1145/3051093, arXiv:1504.05653] 5, 9, 10,
     43, 44, 45, 47, 48, 50, 57, 58, 61, 62

[38] Swastik Kopparty and Shubhangi Saraf: Local testing and decoding of high-rate error-
     correcting codes. Electron. Colloq. on Comput. Complexity (ECCC), TR17-126:1–21, 2017. A
     version of this paper appeared in SIGACT News. [ECCC] 3

[39] Carsten Lund, Lance Fortnow, Howard J. Karloff, and Noam Nisan: Algebraic
     methods for interactive proof systems. J. ACM, 39(4):859–868, 1992. Preliminary version in
     FOCS'90. [doi:10.1145/146585.146605] 34

[40] Or Meir: Combinatorial construction of locally testable codes. SIAM J. Comput., 39(2):491–544,
     2009. Preliminary version in STOC'08. [doi:10.1137/080729967] 10, 45

[41] Or Meir: Combinatorial PCPs with short proofs. Comput. Complexity, 25(1):1–102, 2016.
     Preliminary version in CCC'12. [doi:10.1007/s00037-015-0111-x, ECCC:TR13-134] 6

[42] Dana Moshkovitz and Ran Raz: Two-query PCP with subconstant error. J. ACM, 57(5):29:1–
     29:29, 2010. Preliminary version in FOCS'08. [doi:10.1145/1754399.1754402, ECCC:TR08-071]
     11

[43] Omer Reingold: Undirected connectivity in log-space. J. ACM, 55(4):17:1–17:24, 2008. Prelim-
     inary version in STOC'05. [doi:10.1145/1391289.1391291, ECCC:TR04-094] 45

[44] Omer Reingold, Guy N. Rothblum, and Ron D. Rothblum: Constant-round inter-
     active proofs for delegating computation. In Proc. 48th STOC, pp. 49–62. ACM Press, 2016.
     [doi:10.1145/2897518.2897652, ECCC:TR16-061] 35, 38, 39, 45

[45] Omer Reingold, Salil P. Vadhan, and Avi Wigderson: Entropy waves, the zig-zag graph
     product, and new constant-degree expanders. Ann. Math., 155(1):157–187, 2002. Preliminary
     version in FOCS'00. [doi:10.2307/3062153, arXiv:math/0406038, ECCC:TR01-018] 45

[46] Ronitt Rubinfeld and Madhu Sudan: Robust characterizations of polynomi-
     als with applications to program testing. SIAM J. Comput., 25(2):252–271, 1996.
     [doi:10.1137/S0097539793255151] 19

[47] Claude Elwood Shannon: Communication in the presence of noise. Proc. IRE, 37(1):10–21,
     1949. [doi:10.1109/JRPROC.1949.232969] 3

[48] Madhu Sudan: Efficient Checking of Polynomials and Proofs and the Hardness of Approximation
     Problems. Springer, 1995. [doi:10.1007/3-540-60615-7] 19

[49] Itzhak Tamo and Alexander Barg: A family of optimal locally recoverable codes.
     IEEE Trans. Inform. Theory, 60(8):4661–4676, 2014. Preliminary version in ISIT'14.
     [doi:10.1109/TIT.2014.2321280, arXiv:1311.3284] 11

[50] Paul Valiant: The tensor product of two codes is not necessarily robustly testable. In Proc. 9th
     Internat. Workshop on Randomization and Computation (RANDOM'05), pp. 472–481. Springer,
     2005. [doi:10.1007/11538462_40] 61

[51] Rom R. Varshamov: Estimate of the number of signals in error correcting codes. In Dokl. Akad.
     Nauk SSSR (Russian), volume 117, pp. 739–741, 1957. 57

[52] Michael Viderman: A combination of testability and decodability by tensor products. Random
     Struct. Algor., 46(3):572–598, 2013. Preliminary version in RANDOM'12. [doi:10.1002/rsa.20498,
     arXiv:1105.5806] 61

[53] Sergey Yekhanin: Towards 3-query locally decodable codes of subexponential length. J. ACM,
     55(1):1:1–1:16, 2008. Preliminary version in STOC'07. [doi:10.1145/1326554.1326555] 3

[54] Sergey Yekhanin: Locally decodable codes. Found. Trends Theor. Comp. Sci., 6(3):139–255,
     2012. [doi:10.1561/0400000030] 3

[55] Victor Vasilievich Zyablov: An estimate of the complexity of constructing binary linear
     cascade codes (Russian). Problemy Peredachi Informatsii, 7(1):5–13, 1971. Link at Mathnet.ru.
     English translation: Probl. Inform. Transmission 7(1) (1971) 3–10. 5





AUTHORS

    Tom Gur
    Associate professor
    Department of Computer Science
    University of Warwick
    Coventry, CV4 7AL, United Kingdom
    tom gur warwick ac uk
    https://www.dcs.warwick.ac.uk/~tomgur/


    Govind Ramnarayan
    Software Engineer
    Neural Magic
    Somerville, Massachusetts, USA
    govind neuralmagic com
    http://www.mit.edu/~govind/


    Ron D. Rothblum
    Assistant professor
    Department of Computer Science
    Technion
    Haifa, Israel
    rothblum cs technion ac il
    http://www.cs.technion.ac.il/~rothblum/


ABOUT THE AUTHORS

    TOM GUR is an associate professor in the Department of Computer Science at the University
       of Warwick, and a UKRI Future Leaders Fellow. He received his Ph. D. from the
       Weizmann Institute of Science, under the supervision of Oded Goldreich, and spent two
       years at UC Berkeley before joining the University of Warwick. He was awarded the
       Shimon Even Prize in Theoretical Computer Science. His research interests are primarily
       in the foundations of computer science and combinatorics. Specific interests include
       sublinear-time algorithms, complexity theory, coding theory, cryptography, quantum
       computing, and more.





GOVIND RAMNARAYAN received his Ph. D. from the Massachusetts Institute of Technology
   in 2020, supervised by Elchanan Mossel. His research interests lie primarily in the
   intersection of combinatorics and statistics, concretely in applications of these fields to
   coding theory and algorithmic inference. He currently works as an algorithm designer
   and software engineer at Neural Magic, a startup in Somerville, Massachusetts, USA,
   which produces software to accelerate neural network inference on CPUs. In his spare
   time, he enjoys walking and biking around the Boston area, and trying new restaurants.


RON D. ROTHBLUM is an assistant professor at the department of computer science at
  the Technion–Israel Institute of Technology. He completed his Ph. D. at the Weizmann
  Institute of Science in 2015, advised by Oded Goldreich. His dissertation received the
  John F. Kennedy Ph. D. Distinction Prize and the Shimon Even Prize in Theoretical
  Computer Science. His research is mainly focused on cryptography and complexity
  theory.



