Theory of Computing, Volume 11 (5), 2015, pp. 149–182
www.theoryofcomputing.org
List-Decoding Multiplicity Codes
Swastik Kopparty∗
Received February 4, 2013; Revised April 1, 2015; Published May 29, 2015
Abstract: We study the list-decodability of multiplicity codes. These codes, which are
based on evaluations of high-degree polynomials and their derivatives, have rate approaching
1 while simultaneously allowing for sublinear-time error correction. In this paper, we show
that multiplicity codes also admit powerful list-decoding and local list-decoding algorithms
that work even in the presence of a large error fraction. In other words, we give algorithms
for recovering a polynomial given several evaluations of it and its derivatives, where possibly
many of the given evaluations are incorrect.
Our first main result shows that univariate multiplicity codes over fields of prime order
can be list-decoded up to the so-called “list-decoding capacity.” Specifically, we show
that univariate multiplicity codes of rate R over fields of prime order can be list-decoded
from a (1 − R − ε) fraction of errors in polynomial time (for constant R, ε). This resembles
the behavior of the “Folded Reed-Solomon Codes” of Guruswami and Rudra (Trans. Info.
Theory 2008). The list-decoding algorithm is based on constructing a differential equation
of which the desired codeword is a solution; this differential equation is then solved using a
power-series approach (a variation of Hensel lifting) along with other algebraic ideas.
Our second main result is a list-decoding algorithm for decoding multivariate multiplicity
codes up to their Johnson radius. The key ingredient of this algorithm is the construction
of a special family of “algebraically-repelling” curves passing through the points of F_q^m; no
moderate-degree multivariate polynomial over F_q^m can simultaneously vanish on all these
curves. These curves enable us to reduce the decoding of multivariate multiplicity codes
over F_q^m to several instances of decoding univariate multiplicity codes over the big field F_{q^m},
for which such list-decoding algorithms are known.
A version of this paper was posted online as an Electronic Colloquium on Computational Complexity technical report [20].
∗ Supported in part by a Sloan Fellowship and NSF grant CCF-1253886.
ACM Classification: E.4, G.1.7
AMS Classification: 11T71, 94B35, 68W30
Key words and phrases: error-correcting codes, list-decoding, polynomials, differential equations
© 2015 Swastik Kopparty
Licensed under a Creative Commons Attribution License (CC-BY) DOI: 10.4086/toc.2015.v011a005
As a corollary, we show how multivariate multiplicity codes of length n and rate nearly 1
can be locally list-decoded up to their Johnson radius in O(n^ε) time.
1 Introduction
Reed-Solomon codes (which are codes based on univariate polynomials) and Reed-Muller codes (which
are codes based on multivariate polynomials) are classical families of error-correcting codes which have
found wide applicability within theoretical computer science. One of the crucial reasons for these codes
being very useful, apart from the fact that polynomials appear naturally in many basic settings, is that
they admit very good decoding algorithms.
In recent work [22], a new family of polynomial codes called multiplicity codes was introduced.
These codes extend the classical polynomial codes; they are based on evaluating polynomials and their
derivatives. The codewords of the m-variate order-s multiplicity code of degree-d polynomials over F_q
are obtained by taking an m-variate polynomial over F_q of degree at most d, finding all its derivatives up
to order s, and evaluating all of them at all the points of F_q^m. A typical setting of interest is where F_q is a
large finite field, m and s are constants, and d = (1 − δ)sq for some δ ∈ (0, 1) (so that δ is the minimum
distance of this code). Augmenting the evaluations with derivatives allows one to consider polynomials
over F_q of degree larger than q, and this leads to much better tradeoffs between the rate and minimum
distance of these codes, while retaining the good local decodability of Reed-Muller codes.
In this work, we study the list-decodability of multiplicity codes. Specifically, we study the problem
of decoding multiplicity codes from an α fraction of errors: given a “received” vector, the problem is to
find the list of all codewords of the multiplicity code which are within distance α from this vector. We
prove two main results, which show that multiplicity codes of a given rate can be list-decoded from a much
larger fraction of errors than is known for the corresponding classical codes.
• Our first result shows that univariate multiplicity codes achieve “list-decoding capacity.” Specif-
ically, for every R, ε, there is a family of univariate multiplicity codes of rate R which can be
list-decoded from a (1 − R − ε) fraction of errors in polynomial time. This thus provides a new
(and perhaps more natural) example of an explicit capacity-achieving list-decodable code, the only
other example being the original “Folded Reed-Solomon codes” of Guruswami and Rudra [10].
• Our second result shows how to list-decode multivariate multiplicity codes up to the Johnson bound.
Specifically, we give a polynomial-time algorithm for list-decoding multivariate multiplicity codes
of minimum distance δ from a (1 − √(1 − δ)) fraction of errors. In particular, this gives the first
algorithm for unique-decoding multivariate multiplicity codes up to half the minimum distance of
the code.
As a corollary of our second result, we show that m-variate multiplicity codes of distance δ (which
have rate ≈ (1 − δ)^m) can be locally list-decoded from a (1 − √(1 − δ) − ε) fraction of errors in time
O(n^{O(1/m)}). This also gives the first algorithm for local decoding of multivariate multiplicity codes
up to half the minimum distance. (The local decoding algorithm of [22] can at best decode from 1/4
the minimum distance.) This gives the best known tradeoff between rate, error tolerance, and query
complexity of local decoding.
One goal of this line of research is to develop good algorithms to support error correction for
multiplicity codes. Multiplicity codes are currently the only known codes which achieve rate arbitrarily
close to 1 while supporting o(n)-time local decodability. Thus they could potentially be useful for real-life
error correction. In addition, these codes are very closely related to the omnipresent (and omnipotent?)
polynomial codes, and they seem ripe for use in complexity theory and pseudorandomness applications.
The other motivation is that the actual algorithmic questions underlying these decoding questions
are very natural and basic; they amount to interpolating a polynomial given evaluations of it and its
derivatives at various points, even though some of the given evaluations may be wrong.
We now describe our results and methods in detail.
1.1 List-decoding of univariate multiplicity codes
Our first result deals with univariate multiplicity codes. It is well known that for every R ∈ (0, 1),
the largest fraction of errors from which an error-correcting code of rate R can be list-decoded, with
polynomial size lists, is less than (1 − R). We show that univariate multiplicity codes of rate R achieve
“list-decoding capacity”; they can be list-decoded from a (1 − R − ε) fraction of errors in polynomial time
with polynomial size lists. The first (and only known) example of such codes was given by Guruswami
and Rudra [10].
Theorem A (Informal). For every R ∈ (0, 1) and every ε > 0, there is an s > 0 such that for every
sufficiently large prime q, the order-s univariate multiplicity code over Fq of length n and rate R can be
list-decoded from a (1 − R − ε) fraction of errors in time poly(n).
See Theorem 3.1 for a formal statement of this result.
At the core of this theorem is an algorithm for the following natural problem: given n tuples
(α_i, β_i^{(0)}, β_i^{(1)}, . . . , β_i^{(s−1)}) in F_q^{s+1}, find all polynomials P(T) of degree at most d such that for at least A
values of i ∈ [n], we have P^{(j)}(α_i) = β_i^{(j)} for each j < s (here P^{(j)} denotes the j-th derivative of P).
Following earlier algorithms for list-decoding algebraic codes [25, 12, 23, 10], our algorithm is based
on first processing the tuples to find an equation which we know every such P(T ) satisfies, and then
finding all solutions of that equation.
For multiplicity codes the kind of equation that turns out to be appropriate is a differential equation.
Specifically we first use the tuples to find a differential equation of the form
Q(T, P(T), P^{(1)}(T), . . . , P^{(r)}(T)) = 0
which every such P(T ) satisfies, where Q is a multivariate low-degree polynomial. This equation is found
by imposing constraints on Q that force it to vanish with a suitable “weighted-multiplicity” along certain
curves that come from the tuples.
Once we have done this, we need to solve the differential equation. Fortunately, it turns out that
there are classical techniques that can give us a grip on such equations. Next we give an overview of
these classical techniques and how these can be used to find all low-degree polynomials P(T ) which are
solutions of the differential equation.
1.1.1 Interlude: The Cauchy-Kowalevski differential equation
In characteristic 0, the differential equation
Q(T, P(T), P′(T)) = 0
and its higher-order analogues, are well studied in analysis and mathematical physics. If P(T ) is assumed
to be analytic, then this is called the Cauchy-Kowalevski differential equation. One example of a situation
where such equations arise is the following: let T be “time” and let P(T) be the “position of some particle
at time T.” Then P′(T) is the velocity at time T. The equation above expresses some invariant maintained
in the evolution of the particle's position and velocity over time. The familiar “(1/2)mv² + mgh = C” is
an example of such a differential equation.
The classical analysis approach to such equations (see, e. g., [5]) is based on power series expansions.
To begin, suppose one is given some initial conditions T = t_0, P(t_0) = α_0, P′(t_0) = α_1. If these initial
conditions are “nonsingular” (which roughly means that the initial conditions uniquely specify a particular
solution of the differential equation), one can simply recover P(T) as a power series in T − t_0 by iteratively
considering the equation mod (T − t_0), mod (T − t_0)², mod (T − t_0)³, etc.; at each iteration we get one
new coefficient of the power series representation of P(T). This method works essentially verbatim even
when we work over finite fields (over C, the nontrivial part is to show that this power series is convergent,
but this does not bother us when we work with formal power series).
We thus know how to find the solution P(T ) if we are given nonsingular initial conditions consistent
with P(T ). This motivates the following algorithm: try various choices of the initial conditions, and
whenever the initial conditions are nonsingular, recover the power series expansion of the solution with
the given initial conditions. For any given P(T ), if we chance upon initial conditions that are nonsingular
and consistent with P(T ), this algorithm will recover P(T ).
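The coefficient-by-coefficient lifting described above can be sketched in code. The toy sketch below (all names are ours, and the equation is a simplified, derivative-free instance Q(T, Y_0) = Y_0² − (1 + T); the paper's actual equations also involve P^{(1)}, . . . , P^{(r)}) recovers a formal power-series solution over F_p from the nonsingular initial condition P(0) = 1, working mod T^k for increasing k and obtaining one new coefficient per iteration:

```python
p = 101  # an odd prime, so the initial condition below is nonsingular

def polymul_trunc(f, g, n, p):
    """Product of two coefficient lists, truncated mod T^n, over F_p."""
    h = [0] * n
    for i, fi in enumerate(f[:n]):
        for j, gj in enumerate(g[:n - i]):
            h[i + j] = (h[i + j] + fi * gj) % p
    return h

def lift_sqrt(n, p):
    """Solve Q(T, P) = P^2 - (1 + T) = 0 as a power series mod T^n over F_p.

    Iteration k considers the equation mod T^(k+1) and extracts the single
    new coefficient a_k; the initial condition P(0) = 1 is nonsingular
    because dQ/dY_0 = 2 P(0) is nonzero when p is odd."""
    P = [1]
    for k in range(1, n):
        sq = polymul_trunc(P, P, k + 1, p)
        target = 1 if k == 1 else 0        # coefficient of T^k in 1 + T
        # linearization: the T^k coefficient of (P + a_k T^k)^2 is
        # sq[k] + 2 P(0) a_k, so a_k solves 2 P(0) a_k = target - sq[k]
        a_k = (target - sq[k]) * pow(2 * P[0], p - 2, p) % p
        P.append(a_k)
    return P

P = lift_sqrt(8, 101)
# sanity check: P really is a square root of 1 + T mod T^8
assert polymul_trunc(P, P, 8, 101) == [1, 1, 0, 0, 0, 0, 0, 0]
```

Note how each iteration is a linear equation in the one unknown coefficient; this is exactly what nonsingularity buys, and over F_p there is no convergence question to worry about.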
Quite unfortunately, there can be “very singular” situations; P(T ) could be such that no matter
which initial conditions we consider, those initial conditions are singular. At this point an algebraic
phenomenon specific to our setting (and nonexistent in the analytic setting) comes to the rescue. It
turns out that in the case where P(T ) is a low degree polynomial, which is the case we are interested in,
the very singular situation happens only when P(T ) also satisfies a certain related differential equation
Q∗(T, P(T), P′(T)) = 0. Furthermore, Q∗ is of lower degree, and thus we have reduced our problem to
that of solving a simpler differential equation. Iterating this, we arrive at the complete list of all the low
degree P(T ) that satisfy the original differential equation.
1.2 List-decoding of multivariate multiplicity codes
Our second result deals with general multiplicity codes. Recall that the Johnson bound states that for
every code of distance δ, the list-size for list-decoding from a (1 − √(1 − δ)) fraction of errors is at most a
polynomial in the length of the code. We show that for multiplicity codes, this can be made algorithmic;
we can find that list in polynomial time.
Theorem B (Informal). For m, s = O(1), every m-variate order-s multiplicity code of length n and
distance δ can be list-decoded from a (1 − √(1 − δ)) fraction of errors in time poly(n).
See Theorem 3.2 for a formal statement of this result.
When s = 1 (the classical polynomial code case), this result was previously shown by Guruswami-
Sudan [12] and Pellikaan-Wu [24]. When m = 1 (the univariate case), this result was previously shown
by Guruswami-Sudan [12] and Guruswami-Sahai-Sudan [11] (this is based on the fact that univariate
multiplicity codes are a special case of ideal-based error-correcting codes [26]). Also note that when
m = 1 and q is prime, Theorem A is stronger than Theorem B.
Theorem B is also the first algorithm for decoding multivariate multiplicity codes up to half the
minimum distance.
All the known algorithms for decoding classical multivariate polynomial codes up to half their
minimum distance are quite indirect and specialized.1 We briefly recall one of these algorithms, which
proceeds via a reduction to decoding univariate polynomial codes. Here one is given a received word
r : F_q^m → F_q, and one wants to find a multivariate polynomial Q(X_1, . . . , X_m) of degree at most d such that
Q(a) = r(a) for many a ∈ F_q^m. One first finds a special kind of low-degree curve γ which maps the finite
field F_{q^m} bijectively to the vector space F_q^m. Using γ, we get a function r ◦ γ : F_{q^m} → F_q. By the bijectivity
of γ, for every Q(X_1, . . . , X_m) which has high agreement with r, the univariate polynomial Q ◦ γ(T) has
high agreement with r ◦ γ. Since Q ◦ γ(T) is a low-degree univariate polynomial (by the low-degreeness
of Q and γ), we have reduced to the problem of finding all univariate polynomials that have
high agreement with r ◦ γ. This is precisely the problem of list-decoding univariate polynomial codes;
tracing through the parameters, an algorithm for list-decoding univariate polynomial codes up to their
Johnson radius gives an algorithm for decoding multivariate polynomial codes up to their Johnson radius
(and the same statement holds for decoding up to half the minimum distance).
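The two facts driving this reduction, that degree and agreement both transfer along a curve, can be seen in a toy sketch (our own illustration for the s = 1 case; the curve γ(t) = (t, t²) below is a placeholder, not the special bijective curve the actual reduction constructs):

```python
p = 31

def Q(x, y):
    # a bivariate polynomial of total degree 3 over F_p
    return (x * x * y + 2 * x + 5) % p

def gamma(t):
    # placeholder low-degree curve t -> (t, t^2); the real reduction uses
    # a curve mapping the field F_{q^m} bijectively onto F_q^m
    return (t, t * t % p)

# received word: the evaluation table of Q with three planted errors,
# one of which ((4, 16)) happens to lie on the curve
r = {(x, y): Q(x, y) for x in range(p) for y in range(p)}
for pt in [(4, 16), (2, 7), (3, 3)]:
    r[pt] = (r[pt] + 1) % p

# Q o gamma is univariate of degree <= deg(Q) * deg(gamma) = 6, and it
# inherits Q's agreement with r along the curve: only on-curve errors survive
agreements = sum(r[gamma(t)] == Q(*gamma(t)) for t in range(p))
assert agreements == p - 1
```

The composed problem, finding a degree-6 univariate polynomial agreeing with the restricted received word, is then handed to a univariate list-decoder.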
While dealing with multiplicity codes, the possibility that the multivariate polynomials involved
have degree > q starts causing difficulties. It turns out that there can be exponentially
many polynomials Q(X1 , . . . , Xm ) of degree 2q (say) such that the compositions Q ◦ γ(T ) are all the same
univariate polynomial. Thus composing with a curve γ no longer preserves the original decoding problem
unambiguously.
To bypass this, we use a special family of “algebraically repelling” curves from F_{q^m} to F_q^m. More
concretely, we will construct a collection of curves γ_i : F_{q^m} → F_q^m which have the property that for any
multivariate polynomial Q(X_1, . . . , X_m) of degree less than sq, the collection of univariate polynomials
Q ◦ γ_i uniquely determines Q. Equivalently, if Q formally vanishes on all the γ_i, then Q itself must be
identically 0.
Given these curves, the decoding algorithm for multivariate multiplicity codes does the following.
Using the received word

r : F_q^m → F_q^{C(m+s−1, m)}

for the multivariate multiplicity code, one comes up with a collection of received words

r_i : F_{q^m} → (F_{q^m})^s
1 To illustrate this, we note that it is unknown how to efficiently decode multivariate polynomial codes from half the minimum
distance when the set of evaluation points is S × S, where S is an arbitrary subset of Fq ; it is only known for certain special sets
S such as S = Fq. This is quite odd, since the bound on the minimum distance of these codes has a proof
which does not depend on the specific choice of S.
(in spirit, r_i is the composition of r with γ_i) for a univariate multiplicity code. Then, via a univariate
multiplicity code decoder, we find low-degree univariate polynomials P_i which have high agreement
with the r_i. By the above-mentioned property of the γ_i, there is a unique Q(X_1, . . . , X_m) such that
Q ◦ γ_i = P_i for each i. This Q is then our decoded multivariate polynomial.
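The key uniqueness property, that the restrictions Q ◦ γ_i determine Q, has a much simpler analogue that is easy to demonstrate (a toy of our own, not the paper's construction: here the "curves" are just horizontal lines, which only works when deg_Y(Q) is small, whereas the paper's algebraically-repelling curves handle degree up to sq):

```python
p = 11
e = 2   # the Y-degree of Q below

def Q(x, y):
    # a bivariate polynomial with deg_Y(Q) = 2
    return (3 * x * x * y * y + x * y + 5 * x + 2) % p

# restrictions of Q to the e+1 lines gamma_c(t) = (t, c); these play the
# role of the curve family
restr = {c: [Q(t, c) for t in range(p)] for c in range(e + 1)}

def interp_at(pts, y, p):
    """Evaluate at y the unique polynomial of degree <= len(pts)-1 through
    the points (c, pts[c]), c = 0..len(pts)-1, via the Lagrange formula."""
    total = 0
    for c, v in enumerate(pts):
        num, den = 1, 1
        for c2 in range(len(pts)):
            if c2 != c:
                num = num * (y - c2) % p
                den = den * (c - c2) % p
        total = (total + v * num * pow(den, p - 2, p)) % p
    return total

# the e+1 restrictions determine Q at every point of F_p^2
recovered = [[interp_at([restr[c][x] for c in range(e + 1)], y, p)
              for y in range(p)] for x in range(p)]
assert all(recovered[x][y] == Q(x, y) for x in range(p) for y in range(p))
```

In the actual algorithm the same role is played by the γ_i: once each restriction has been list-decoded, the pieces patch together into a unique multivariate candidate.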
1.3 Local decoding and local list-decoding of multiplicity codes
The above list-decoding algorithm for multivariate multiplicity codes (in particular the bivariate case)
enables improved local decoding and local list-decoding algorithms for multiplicity codes.
Theorem C (Informal). For every ε > 0 and integers m, s = O(1), every m-variate order-s multiplicity
code of length n and distance δ can be:
• locally decoded from a (δ/2 − ε) fraction of errors, and
• locally list-decoded from a (1 − √(1 − δ) − ε) fraction of errors,
in n^{O(1/m)} time, using n^{O(1/m)} queries.
See Theorem 3.3 for a formal statement of this result.
The local decoding algorithm follows the original [22] approach, using planes instead of lines. The
reason planes allow for better local decoding than lines is the following simple fact: for a fixed point
a ∈ F_q^m and a fixed set S ⊆ F_q^m (which will correspond to the erroneous coordinates), if we pick a
uniformly random plane P passing through a, then with high probability P samples S well:

|P ∩ S| / |P| ≈ |S| / |F_q^m| .
This fact allows us to decode the corrected value of a corrupted multiplicity codeword at a point a ∈ F_q^m by
decoding several bivariate multiplicity codewords; this is where the algorithm of Theorem B gets used.
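The sampling fact is easy to check empirically. The Monte Carlo sketch below (our own sanity check, with a fixed seed and an arbitrarily chosen set S of density 1/p in F_p³) averages the fraction of a random plane through a that lands in S:

```python
import random

p = 11
random.seed(1)

# the "error set" S (a plane of size p^2, density 1/p) and a point a not in S
S = {(0, y, z) for y in range(p) for z in range(p)}
a = (1, 0, 0)

def plane_fraction(u, v):
    """Fraction of the points a + t*u + w*v lying in S, counted over
    the p^2 parameter pairs (t, w)."""
    hits = 0
    for t in range(p):
        for w in range(p):
            pt = tuple((a[i] + t * u[i] + w * v[i]) % p for i in range(3))
            hits += pt in S
    return hits / p ** 2

def rand_dir():
    return tuple(random.randrange(p) for _ in range(3))

trials = 500
avg = sum(plane_fraction(rand_dir(), rand_dir()) for _ in range(trials)) / trials
# the density of S is |S| / p^3 = 1/11; a random plane through a sees
# almost exactly this density on average
assert abs(avg - 1 / 11) < 0.01
```

A random line through a, by contrast, has only p sample points, so its empirical density of S fluctuates far more; this is exactly why planes support decoding from a larger error fraction.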
The local list-decoding algorithm for multiplicity codes follows the basic outline of the Arora-
Sudan [1] and Sudan-Trevisan-Vadhan [27] local list-decoders for multivariate polynomial codes (specifi-
cally the analysis from [2], which uses decoding on planes).
1.4 Interpolating sets for multiplicity codes
Finally, we also give a construction of explicit interpolating sets for multiplicity codes.
Multiplicity codes are most naturally “locally-correctable.” To make them locally-decodable in the
strictest sense we need to specify an encoding map E, as well as a local decoding algorithm which when
given access to a corrupted version of E(m), finds any given symbol of the original message m. The
existence of such an encoding map follows easily from the linearity of the code. Such a map can be
efficiently computed with the help of advice which is of length polynomial in the length of the code. Thus
the sublinear time local-correction algorithm of [22] implies that multiplicity codes have non-uniform
local-decoding algorithms which run in sublinear time.
To make the local decoding fully uniform, we need to be able to explicitly describe the encoding map.
This can be done by giving an interpolating set: a set of coordinates of the multiplicity code such that if
we specify symbols arbitrarily on those coordinates, there is a unique codeword of the multiplicity code
that extends this. We give an explicit description of an interpolating set for multiplicity codes.
Theorem D (Informal). There are explicit interpolating sets for multiplicity codes. Thus, multiplicity
codes can be locally decoded in uniform sublinear time.
See Section 6 for a description of the construction and an outline of the analysis.
The interpolating sets we find are simple combinations of standard interpolating sets for multivariate
polynomial codes. They are based on representing high-degree multivariate polynomials as appropriate
linear combinations of multivariate polynomials with individual degrees bounded by q − 1.
1.5 Related and subsequent work
Theorem A was independently discovered by Guruswami and Wang [13]. The method of proof and the
algorithm there are simpler than ours, but our approach yields a slightly better list-decoding radius for
each fixed multiplicity code. An exact comparison appears in Section 3. (This is similar to the relationship
that Vadhan's simpler version (see Chapter 5 of [28]) of the list-decoding of Folded Reed-Solomon codes
has to the original algorithm of Guruswami and Rudra [10].)
Since this paper was posted online [20], there have been a number of advances in the theory of list
decoding. Many new families of codes meeting list-decoding capacity have been discovered [13, 4, 16,
14, 15, 9], and today we know constructions of such codes which also achieve near-optimal alphabet size,
list size and list-decoding time. See [9] for a detailed description of the state of the art in this area.
There have also been new constructions of locally correctable codes of high rate [8, 6, 17], qualitatively
matching the local correctability of multiplicity codes. Local list-decoding algorithms for the high-rate
codes of [8] were given by [7]. Some further developments in the theory of multiplicity codes are
discussed in the survey [21].
Organization of this paper. In the next section we introduce some notation related to codes, polyno-
mials, derivatives, multiplicities and multiplicity codes. In Section 3 we state our main results formally.
In Section 4 we prove Theorem A. In Section 5 we prove Theorem B. Theorem C and Theorem D are
proved in Sections 6 and 7 respectively. We conclude with some open problems.
2 Preliminaries
2.1 Codes
Let Σ be a finite set and let n be an integer. We will endow Σ^n with a metric called the (normalized)
Hamming distance ∆, defined by:

∆(x, y) = Pr_{i∈[n]} [x_i ≠ y_i] .

A code of length n over the alphabet Σ is simply a subset C of Σ^n. The rate of the code is defined to
be:

R = log_{|Σ|} |C| / n .
The minimum distance of the code C is defined to be the smallest value of ∆(c, c′) for distinct elements
c, c′ of C.
List decoding. In the problem of list-decoding the code C from an η fraction of errors, we are given as
input r ∈ Σ^n, and we wish to compute the set

L = {c ∈ C | ∆(r, c) < η} .

The maximum possible value of |L|, as r varies over all elements of Σ^n, is called the list-size for
list-decoding C from an η fraction of errors.
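These definitions are concrete enough to execute directly. A brute-force list decoder for a tiny code (our own toy: evaluations of degree ≤ 1 polynomials over F_5, i.e., a small Reed-Solomon code) looks like this:

```python
p = 5
# a tiny code: evaluations of degree <= 1 polynomials over F_5
codewords = sorted({tuple((a * x + b) % p for x in range(p))
                    for a in range(p) for b in range(p)})

def list_decode(r, eta):
    """The list L = {c in C : Delta(r, c) < eta}, by exhaustive search."""
    return [c for c in codewords
            if sum(ci != ri for ci, ri in zip(c, r)) / p < eta]

# the line y = x with its last coordinate corrupted
r = (0, 1, 2, 3, 0)
L = list_decode(r, 2 / 5)
assert L == [(0, 1, 2, 3, 4)]   # only the uncorrupted line is within radius
```

The algorithms in this paper compute exactly such lists, but for codes whose codeword count is exponential, so exhaustive search is replaced by algebraic machinery.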
2.2 Polynomials and derivatives
We use [m] to denote the set {1, . . . , m}. For a vector i = ⟨i_1, . . . , i_m⟩ of non-negative integers, its weight,
denoted wt(i), equals ∑_{j=1}^m i_j.
For a field F, let F[X_1, . . . , X_m] = F[X] be the ring of polynomials in the variables X_1, . . . , X_m with
coefficients in F.
For a vector of non-negative integers i = ⟨i_1, . . . , i_m⟩, let X^i denote the monomial ∏_{j=1}^m X_j^{i_j} ∈ F[X].
The (c_1, . . . , c_m)-weighted degree of this monomial is defined to be ∑_{j=1}^m c_j i_j. The (c_1, . . . , c_m)-weighted
degree of a polynomial is defined to be the maximum (c_1, . . . , c_m)-weighted degree of any of its nonzero
monomials. A special case of interest is the total degree of a monomial/polynomial, which is defined to be
the (1, 1, . . . , 1)-weighted degree of that monomial/polynomial. We use the term degree to denote the total
degree when there is no confusion.
We now define derivatives and the multiplicity of vanishing at a point, and then state a basic bound
on the total number of zeroes (counted with multiplicity) that a polynomial can have on a product set S^n.
An elementary proof of this lemma can be found in [3].
Definition 2.1 ((Hasse) Derivative). For P(X) ∈ F[X] and a non-negative vector i, the i-th (Hasse)
derivative of P, denoted P^{(i)}(X), is the coefficient of Z^i in the polynomial P̃(X, Z) := P(X + Z) ∈ F[X, Z].
Thus,

P(X + Z) = ∑_i P^{(i)}(X) Z^i . (2.1)
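In the univariate case this definition is easy to compute with, since the coefficient of Z^i in (X + Z)^k is C(k, i) X^{k−i}. A minimal sketch (helper names are ours) also shows why Hasse derivatives, unlike classical ones, stay informative in small characteristic:

```python
from math import comb

def hasse_derivative(coeffs, i, p):
    """Coefficients of the i-th Hasse derivative of a univariate polynomial
    over F_p, where coeffs[k] is the coefficient of X^k; by (2.1), the
    Hasse derivative of X^k is C(k, i) X^(k-i)."""
    return [comb(k, i) * coeffs[k] % p for k in range(i, len(coeffs))]

def evaluate(coeffs, a, p):
    return sum(c * pow(a, k, p) for k, c in enumerate(coeffs)) % p

p = 5
P = [0] * 7 + [1]   # P(X) = X^7 over F_5

# the classical 5th derivative of X^7 is 7*6*5*4*3 X^2 = 0 in F_5, but the
# 5th Hasse derivative is C(7, 5) X^2 = 21 X^2 = X^2, which is nonzero
assert hasse_derivative(P, 5, p) == [0, 0, 1]

# the Taylor-type identity (2.1): P(a + z) = sum_i P^{(i)}(a) z^i
a, z = 2, 3
lhs = pow(a + z, 7, p)
rhs = sum(evaluate(hasse_derivative(P, i, p), a, p) * pow(z, i, p)
          for i in range(8)) % p
assert lhs == rhs
```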
Definition 2.2 (Multiplicity). For P(X) ∈ F[X] and a ∈ F^n, the multiplicity of P at a, denoted
mult(P, a), is the largest integer M such that for every non-negative vector i with wt(i) < M, we have
P^{(i)}(a) = 0 (if M may be taken arbitrarily large, we set mult(P, a) = ∞).
Lemma 2.3. Let P ∈ F[X] be a nonzero polynomial of total degree at most d. Then for any finite S ⊆ F,

∑_{a∈S^n} mult(P, a) ≤ d · |S|^{n−1} .

In particular, for any integer s > 0,

Pr_{a∈S^n} [mult(P, a) ≥ s] ≤ d / (s|S|) .
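A brute-force sanity check of the univariate (n = 1) case of Lemma 2.3, with our own helper names: the multiplicities of a degree-4 polynomial over F_7 sum to at most 4.

```python
from math import comb

p = 7

def hasse(coeffs, i):
    return [comb(k, i) * coeffs[k] % p for k in range(i, len(coeffs))]

def evaluate(coeffs, a):
    return sum(c * pow(a, k, p) for k, c in enumerate(coeffs)) % p

def mult(coeffs, a):
    """mult(P, a) per Definition 2.2: the largest M such that all Hasse
    derivatives of order < M vanish at a (bounded here since P != 0)."""
    M = 0
    while M < len(coeffs) and evaluate(hasse(coeffs, M), a) == 0:
        M += 1
    return M

# P(X) = X^2 (X - 1)(X - 2) = X^4 - 3 X^3 + 2 X^2 over F_7, so d = 4
P = [0, 0, 2, 4, 1]
assert mult(P, 0) == 2
total = sum(mult(P, a) for a in range(p))
assert total == 4   # multiplicities 2 + 1 + 1 <= d, as Lemma 2.3 demands
```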
2.3 Multiplicity codes
Now we recall the definition of multiplicity codes and the formulas for their rate and minimum distance.
Definition 2.4 (Multiplicity code [22]). Let s, d, m be nonnegative integers and let q be a prime power.
Let

Σ = F_q^{C(m+s−1, m)} = F_q^{{i : wt(i) < s}} .

For P(X_1, . . . , X_m) ∈ F_q[X_1, . . . , X_m], we define the order-s evaluation of P at a, denoted P^{(<s)}(a), to be
the vector ⟨P^{(i)}(a)⟩_{wt(i)<s} ∈ Σ.
We define the multiplicity code of order-s evaluations of degree-d polynomials in m variables over F_q
as follows. The code is over the alphabet Σ, and has length q^m (where the coordinates are indexed by
elements of F_q^m). For each polynomial P(X) ∈ F_q[X_1, . . . , X_m] with deg(P) ≤ d, there is a codeword in C
given by:

Enc_{s,d,m,q}(P) = ⟨P^{(<s)}(a)⟩_{a∈F_q^m} ∈ Σ^{q^m} .
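For m = 1 the encoding map is short enough to write down directly (a sketch with our own helper names), and a single example already exhibits the distance behavior of Lemma 2.5 below:

```python
from math import comb

def order_s_eval(coeffs, a, s, p):
    """The order-s evaluation P^{(<s)}(a) = (P(a), P^{(1)}(a), ..., P^{(s-1)}(a))."""
    out = []
    for i in range(s):
        d_i = [comb(k, i) * coeffs[k] % p for k in range(i, len(coeffs))]
        out.append(sum(c * pow(a, k, p) for k, c in enumerate(d_i)) % p)
    return tuple(out)

def encode(coeffs, s, p):
    """Codeword of the order-s univariate (m = 1) multiplicity code over F_p."""
    return [order_s_eval(coeffs, a, s, p) for a in range(p)]

p, s = 11, 2
# P(X) = (X - 1)^2 (X - 2)^2, which vanishes to order 2 at X = 1 and X = 2
P = [4, 10, 2, 5, 1]
cw, zero = encode(P, s, p), encode([0], s, p)

# P agrees with the zero codeword on exactly floor(deg(P)/s) = 2 of the 11
# coordinates, matching the minimum distance delta = 1 - d/(sq) = 1 - 4/22
agree = sum(x == y for x, y in zip(cw, zero))
assert agree == 2
```

Note that each agreeing coordinate "costs" s roots of the difference polynomial, which is how derivatives buy distance for degrees beyond q.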
Lemma 2.5 (Rate and distance of multiplicity codes [22]). Let C be the multiplicity code of order-s
evaluations of degree-d polynomials in m variables over F_q. Then C has minimum distance

δ = 1 − d/(sq)  and rate  C(d+m, m) / ( C(s+m−1, m) · q^m ) ,

which is at least

( s/(m+s) )^m · ( d/(sq) )^m ≥ ( 1 − m²/s ) · (1 − δ)^m .
Note that in the m = 1 case, the rate of the code is (d + 1)/(sq) ≥ 1−δ , and thus univariate multiplicity
codes lie on the Singleton bound.
3 Main theorems
We first consider the problem of list-decoding univariate multiplicity codes.
Theorem 3.1 (List-decodability of Univariate Multiplicity Codes). Let r, s be integers with 0 ≤ r < s =
O(1). Let δ ∈ (0, 1) be a constant, and let R = 1 − δ.
For every prime q, let d = (1 − δ)sq and let C_q be the multiplicity code of order-s evaluations of
degree-d polynomials in 1 variable over F_q. Thus C_q has rate R and minimum distance δ.
There is an algorithm that can list-decode C_q from a

1 − ( (s/(s−r)) · R )^{(r+1)/(r+2)} − max( 16s²/d , 8s²/q )

fraction of errors in time q^{O(rs)}, with lists of size q^{O(rs)}.
We prove this theorem in Section 4. Taking

r = ⌈ log(1/R) / log(1 + ε/3) ⌉  and  s = ⌈ r · (1 + 3/ε) ⌉ ,

we have, for sufficiently large q:

( (s/(s−r)) · R )^{(r+1)/(r+2)} + max( 16s²/d , 8s²/q ) ≤ (1 + ε/3) · R^{1−1/(r+2)} + ε/3
≤ (1 + ε/3) · R · (1 + ε/3) + ε/3
≤ (1 + 2ε/3) · R + ε/3
≤ R + ε .

This gives the desired 1 − R − ε list-decoding radius, and so we get Theorem A from the introduction.
The algorithm of Guruswami and Wang [13] can decode multiplicity codes from a

( (r+1)/(r+2) ) · ( 1 − (s/(s−r)) · R )

fraction of errors. This decoding radius is smaller than the decoding radius of our algorithm (up to the
vanishing term max(16s²/d, 8s²/q)) by the AM-GM inequality.
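The AM-GM comparison of the two main terms is easy to check numerically (a sketch of ours, ignoring the vanishing correction terms in both radii):

```python
def our_radius(R, s, r):
    # main term of the radius in Theorem 3.1
    return 1 - (s / (s - r) * R) ** ((r + 1) / (r + 2))

def gw_radius(R, s, r):
    # main term of the Guruswami-Wang [13] radius
    return (r + 1) / (r + 2) * (1 - s / (s - r) * R)

# AM-GM: with x = s/(s-r) * R, we have x^((r+1)/(r+2)) <= ((r+1) x + 1)/(r+2)
# (the geometric mean of x, ..., x, 1 is at most their arithmetic mean),
# which rearranges to our_radius >= gw_radius
for R in (0.3, 0.5, 0.7):
    for s, r in ((8, 2), (16, 4)):
        assert our_radius(R, s, r) >= gw_radius(R, s, r)
```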
Next we consider the problem of list-decoding multivariate multiplicity codes.
Theorem 3.2. Let s, m = O(1). Let δ ∈ (0, 1) be a constant.
For every prime power q, let d = (1 − δ)sq and let C_q be the multiplicity code of order-s evaluations
of degree-d polynomials in m variables over F_q. Thus C_q has minimum distance δ.
There is an algorithm that can list-decode C_q from a

1 − √(1 − δ) − ε

fraction of errors in time q^{O(1)}, with lists of size poly(1/ε). If ε = 0, the list size is q^{O(1)}.
The proof of this theorem appears in Section 5.
As a corollary of this, we show the local list-decodability of multivariate multiplicity codes.
Theorem 3.3. Let s, m = O(1). Let δ ∈ (0, 1) be a constant.
For every prime power q, let d = (1 − δ)sq and let C_q be the multiplicity code of order-s evaluations
of degree-d polynomials in m variables over F_q. Thus C_q has minimum distance δ.
There is an algorithm that can locally list-decode C_q from a

1 − √(1 − δ) − ε

fraction of errors in time O(q^6), using O(q^2) queries, with lists of size poly(1/ε).
The proof of this theorem is sketched in Section 7.
4 List decoding univariate multiplicity codes
We first describe a slightly more general problem than the problem of list-decoding univariate multiplicity
codes, and we give our algorithmic results for this general problem. Later we will specialize these results
to univariate multiplicity codes.
Let

S = { (α_i, β_i^{(0)}, β_i^{(1)}, . . . , β_i^{(s−1)}) : i ∈ [n] }

be a collection of n distinct points in F_q^{s+1}. We will use β_i^{(<s)} to denote (β_i^{(0)}, β_i^{(1)}, . . . , β_i^{(s−1)}) ∈ F_q^s.
We are interested in polynomials P(T) ∈ F_q[T] of degree at most d which have “high agreement” with S.
Define:

agree(P, S) = | { i ∈ [n] : P^{(<s)}(α_i) = β_i^{(<s)} } | .
Theorem 4.1. Let r be an integer with 0 ≤ r < s and r < d/2.
Let

A_0 = ( n · ∏_{j=0}^{r} (d − j) / ∏_{j=0}^{r} (s − j) )^{1/(r+2)} .

Let ε < 1 be a real number such that

ε > 8s²A_0/(dq)  and  ε > 4s²A_0/q² .

Let A = A_0 · (1 + ε).
Let L be the set of polynomials that have more than A points of agreement with S:

L = { P(T) ∈ F_q[T] | deg(P) ≤ d, agree(P, S) > A } .

Then

|L| ≤ (r + 1) · q^{O(r·(d/q) + r + 1)} ,

and there is an algorithm that can compute L in time q^{O(r·(d/q) + r + 1)}.
This theorem will be derived from the following two theorems. The first one finds a differential
equation for which P(T ) is a solution.
Theorem 4.2. There exists a polynomial Q(X, Y_0, . . . , Y_r) ∈ F_q[X, Y_0, . . . , Y_r] such that:
1. deg_{Y_j}(Q) ≤ 4As²/(εd) for each j.
2. The (1, d, d − 1, . . . , d − r)-weighted degree of Q is ≤ A · 2s²/ε.
3. For every P(T) ∈ L, we have

Q(T, P(T), . . . , P^{(r)}(T)) = 0 .

Furthermore, Q can be found in time poly(n · (s/ε)^r · log q).
The next theorem shows how to find all solutions to such differential equations.
Theorem 4.3. Let t > 0 be an integer.
Suppose Q(X, Y_0, . . . , Y_r) ∈ F_q[X, Y_0, . . . , Y_r] is such that:
1. deg_{Y_j}(Q) ≤ t < q for each j.
2. The (1, d, d − 1, . . . , d − r)-weighted degree of Q is < q².
Then the set of all P(T) ∈ F_q[T] with degree at most d such that

Q(T, P(T), P^{(1)}(T), . . . , P^{(r)}(T)) = 0

has cardinality at most t · (r + 1) · q^{2r·⌊d/q⌋ + 4r + 4}.
Furthermore, this set can be found in time q^{O(r + r·(d/q) + 1)}.
Combining these two theorems, we can complete the proof of Theorem 4.1.
Proof of Theorem 4.1. We first apply Theorem 4.2 to obtain the polynomial Q(X, Y_0, . . . , Y_r).
This Q will satisfy the hypotheses of Theorem 4.3 (with t = q − 1) if we have

4As²/(εd) < q  and  A · 2s²/ε < q² .

Our conditions on ε, together with the fact that A < 2A_0, imply precisely these conditions.
Thus we may apply Theorem 4.3 and get the desired list L. The running time of the algorithm and
the list size bound follow immediately.
We may now complete the proof of the main theorem on list-decoding univariate multiplicity codes.
Proof of Theorem 3.1. We have C ⊆ Σ^{F_q}, where Σ = F_q^s. Suppose we are given a received word w : F_q → Σ.
We think of w as a tuple of functions w = (w^{(0)}, w^{(1)}, . . . , w^{(s−1)}), where each w^{(j)} : F_q → F_q.
Now let

S = { (α, w^{(0)}(α), . . . , w^{(s−1)}(α)) : α ∈ F_q } .

We want to apply Theorem 4.1 to this set S.
We have n = |S| = q. We may choose

ε = 2 · max( 8s²A_0/(qd) , 4s²A_0/q² ) ;

then

ε > 8s²A_0/(dq) ,  ε > 4s²A_0/q² ,  and ε < 1 .
Then by Theorem 4.1, in time q^{O(rs)}, we can find all codewords c ∈ C such that ∆(c, w) < 1 − A/q;
note that

1 − A/q ≥ 1 − (1 + ε) · ( d / (q · (s − r)) )^{(r+1)/(r+2)} ,
with a list size of at most q^{O(r·(d/q) + r + 1)}.
Finally we observe that the list-decoding radius above is at least as big as our target list-decoding radius

1 − ( (s/(s−r)) · R )^{(r+1)/(r+2)} − max( 16s^2/d, 8s^2/q )

(this uses the fact that ε ≤ max(16s^2/d, 8s^2/q), which follows from the inequalities A ≤ n ≤ q).
This completes the proof of the theorem.
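As a quick sanity check on this bound (an illustrative numeric computation of ours, not part of the proof), one can evaluate the target radius for rate R = 1/2 and growing r, s, and watch it approach the list-decoding capacity 1 − R from below:

```python
# Target list-decoding radius 1 - ((s/(s-r)) * R)^((r+1)/(r+2)) for rate R,
# ignoring the error terms max(16 s^2/d, 8 s^2/q), which vanish for large d, q.

def radius(r, s, R):
    return 1 - ((s / (s - r)) * R) ** ((r + 1) / (r + 2))

R = 0.5
for r, s in [(2, 20), (10, 200), (30, 1000)]:
    print(r, s, round(radius(r, s, R), 4))

# As r grows (with r/s -> 0), the radius climbs toward capacity 1 - R = 0.5.
```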
4.1 Finding a differential equation
Proof of Theorem 4.2. Choose parameters

B = (r+1) · (s − r/2) + 1,  M = ⌈2B/ε⌉,  and  D = ⌊AM⌋ − 1.

Observe that we have:

D/M < A, (4.1)

D/(M+B) > A_0. (4.2)
The polynomial Q will be chosen to satisfy some “weighted-multiplicity” vanishing conditions. We
now give these conditions.
Fix i ∈ [n].
Define

R_i(T) = Σ_{j=0}^{s−1} β_i^{(j)} · (T − α_i)^j.
Notice that if P(T) ∈ F_q[T] is such that P^{(<s)}(α_i) = β_i^{(<s)}, then P(T) = R_i(T) mod (T − α_i)^s. Furthermore, in this case we have, for each 0 ≤ j < s:

P^{(j)}(T) = R_i^{(j)}(T) mod (T − α_i)^{s−j}. (4.3)
Let M_{i,e,e_0,...,e_r}(X, Y_0, ..., Y_r) ∈ F_q[X, Y_0, ..., Y_r] be the polynomial:

M_{i,e,e_0,...,e_r}(X, Y_0, ..., Y_r) = (X − α_i)^e · ∏_{j=0}^{r} (Y_j − R_i^{(j)}(X))^{e_j}.
Condition C_i: The polynomial Q(X, Y_0, ..., Y_r) lies in the ideal I_i of F_q[X, Y_0, ..., Y_r] generated by

{ M_{i,e,e_0,...,e_r} | e + Σ_{j=0}^{r} (s−j)·e_j = M }.
Observe that every polynomial in F_q[X, Y_0, ..., Y_r] can be uniquely represented as an F_q-linear combination of the polynomials

{ M_{i,e,e_0,...,e_r} | e, e_0, ..., e_r ≥ 0 }.

Asking that Q(X, Y_0, ..., Y_r) lie in I_i is equivalent to asking that for each (e, e_0, ..., e_r) such that e + Σ_{j=0}^{r} (s−j)·e_j < M, the coefficient of M_{i,e,e_0,...,e_r} in this unique representation equals 0. Thus condition C_i imposes at most

|{ (e, e_0, ..., e_r) | e + Σ_{j=0}^{r} (s−j)·e_j < M }|

independent F_q-linear constraints on the coefficients of Q.
Letting i vary over [n], we get that the total number of independent F_q-linear constraints imposed on the coefficients of Q is at most

n · |{ (e, e_0, ..., e_r) | e + Σ_{j=0}^{r} (s−j)·e_j < M }| ≤ n · (M+B)^{r+2} / ( (r+2)! · ∏_{j=0}^{r} (s−j) )

(this bound is obtained by bounding the number of unit cubes that can be contained in the appropriate simplex, via a volume argument).
If the dimension of the F_q-linear space of polynomials Q(X, Y_0, ..., Y_r) of (1, d, d−1, ..., d−r)-weighted degree at most D is larger than the number of independent F_q-linear constraints imposed by the C_i, then a polynomial satisfying all the conditions C_i will exist. Counting monomials, the dimension in question is at least

D^{r+2} / ( (r+2)! · ∏_{j=0}^{r} (d−j) )

(again, this bound is obtained by considering the volume of a suitable simplex). Thus, a polynomial satisfying all the C_i exists if the following condition holds:
D^{r+2} / ( (r+2)! · ∏_{j=0}^{r} (d−j) ) > n · (M+B)^{r+2} / ( (r+2)! · ∏_{j=0}^{r} (s−j) ),

which follows from

D/(M+B) > ( n · ∏_{j=0}^{r} (d−j) / ∏_{j=0}^{r} (s−j) )^{1/(r+2)} = A_0,

and we know this by (4.2).
Furthermore, this polynomial Q can be found in time

poly( D^{r+2} / ( (r+2)! · ∏_{j=0}^{r} (d−j) ) · log q ) = poly( n · (s/ε)^r · log q )

by solving the system of linear equations.
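The linear-algebra step is easy to carry out in the simplest instance. The sketch below (our own toy illustration, with r = 0 and all vanishing conditions of multiplicity 1, i.e., classical Reed–Solomon-style interpolation; all function names and parameters are ours) finds a nonzero Q(X, Y) of bounded (1, d)-weighted degree vanishing at the given points, by computing a kernel vector of the constraint matrix modulo a prime q:

```python
q = 101  # a prime field size (toy parameters; all names here are our own)

def nullspace_vector(rows, ncols):
    # Gauss-Jordan elimination mod q; returns a nonzero kernel vector,
    # assuming the matrix has fewer (independent) rows than columns.
    pivots = {}  # pivot column -> fully reduced pivot row
    for row in rows:
        r = row[:]
        for col, pr in pivots.items():
            if r[col]:
                f = r[col]
                r = [(a - f * b) % q for a, b in zip(r, pr)]
        lead = next((c for c in range(ncols) if r[c]), None)
        if lead is not None:
            inv = pow(r[lead], q - 2, q)
            r = [a * inv % q for a in r]
            for col, pr in pivots.items():  # keep earlier pivot rows reduced
                if pr[lead]:
                    f = pr[lead]
                    pivots[col] = [(a - f * b) % q for a, b in zip(pr, r)]
            pivots[lead] = r
    free = next(c for c in range(ncols) if c not in pivots)
    v = [0] * ncols
    v[free] = 1
    for col, pr in pivots.items():
        v[col] = (-pr[free]) % q
    return v

def interpolate_Q(points, d, D):
    # Nonzero Q(X, Y) of (1, d)-weighted degree <= D with Q(x, y) = 0 at all points.
    mons = [(a, b) for b in range(D // d + 1) for a in range(D - d * b + 1)]
    assert len(mons) > len(points)  # more unknowns than constraints
    rows = [[pow(x, a, q) * pow(y, b, q) % q for (a, b) in mons]
            for (x, y) in points]
    return dict(zip(mons, nullspace_vector(rows, len(mons))))

# Toy check: 10 points on the parabola y = x^2 + 3 (degree d = 2), D = 6.
pts = [(x, (x * x + 3) % q) for x in range(10)]
Q = interpolate_Q(pts, d=2, D=6)
assert any(Q.values())  # Q is nonzero ...
for x, y in pts:        # ... and vanishes at every given point
    assert sum(c * pow(x, a, q) * pow(y, b, q) for (a, b), c in Q.items()) % q == 0
```

The general case differs only in bookkeeping: the monomial basis is indexed by (e, e_0, ..., e_r) and each condition C_i contributes a block of rows rather than a single one.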
We now show that Q has the desired properties.
The (1, d, d−1, ..., d−r)-weighted degree of Q is at most

D < AM ≤ A · (2B/ε + 1) ≤ A · 2s^2/ε.

Because of this, we have that

deg_{Y_j}(Q) ≤ D/(d−j) ≤ D/(d−r) ≤ 2D/d ≤ A · 4s^2/(εd).
Now fix any P(T) ∈ L. Let Agr ⊆ [n] be the set of agreement points:

Agr = { i ∈ [n] | P^{(<s)}(α_i) = β_i^{(<s)} }.

Consider the polynomial H(T) = Q(T, P(T), P^{(1)}(T), ..., P^{(r)}(T)). Notice that deg(H) ≤ D. We now show that for each point of agreement i ∈ Agr, H(T) vanishes at α_i with multiplicity at least M. Using the fact that Q(X, Y_0, ..., Y_r) lies in I_i, we have that for every i ∈ Agr, H(T) can be written in the form

H(T) = Σ_{e, e_0, ..., e_r : e + Σ_j (s−j)·e_j = M} B_{i,e,e_0,...,e_r}(T) · M_{i,e,e_0,...,e_r}(T, P(T), ..., P^{(r)}(T)).

Recall that M_{i,e,e_0,...,e_r}(T, P(T), ..., P^{(r)}(T)) equals

(T − α_i)^e · ∏_{j=0}^{r} ( P^{(j)}(T) − R_i^{(j)}(T) )^{e_j},

which, for i ∈ Agr, is divisible by (T − α_i)^{e + Σ_j (s−j)·e_j} (because of (4.3)). Thus for each i ∈ Agr, H(T) vanishes at α_i with multiplicity at least M.

Thus H(T) vanishes with multiplicity at least M at |Agr| points. Since |Agr| ≥ A (as P ∈ L), we have M · |Agr| ≥ M · A > D (by (4.1)), and we conclude that H(T) = 0.
This completes the proof of the theorem.
4.2 Solving the differential equation
We now prove Theorem 4.3. The main idea is to consider the power series expansion of the potential
solution. If a suitable nonsingularity condition holds, this approach will succeed. This is the content of
the following theorem. Later we will see that if the nonsingularity condition does not hold, we will be
able to make progress by solving a related differential equation.
In our algorithm, we may have to work with power series over the extension field F_{q^2} of F_q (this allows us to try out enough “initial conditions”; F_q itself turns out to be too small for this). We therefore state the next theorem over a general field F.
Theorem 4.4. Let F be a field of characteristic p. Let Q(X, Y_0, ..., Y_r) ∈ F[X, Y_0, ..., Y_r]. Let k ≥ 1 be an integer such that p does not divide the binomial coefficient (k+r choose r). Suppose f(T) ∈ F[T] and α ∈ F are such that:
1. (∂Q/∂Y_r)(α, f(α), f^{(1)}(α), ..., f^{(r)}(α)) ≠ 0,

2. Q(T, f(T), f^{(1)}(T), ..., f^{(r)}(T)) = 0 mod (T − α)^k.

Then there exists a unique γ ∈ F such that the function g(T) = f(T) + γ · (T − α)^{k+r} satisfies:

Q(T, g(T), g^{(1)}(T), ..., g^{(r)}(T)) = 0 mod (T − α)^{k+1}.

Furthermore, this γ can be found efficiently.
Proof. Since

Q(T, f(T), f^{(1)}(T), ..., f^{(r)}(T)) = 0 mod (T − α)^k,

we know that

Q(T, f(T), f^{(1)}(T), ..., f^{(r)}(T)) = β · (T − α)^k mod (T − α)^{k+1}

for some β ∈ F. Let g(T) = f(T) + γ · (T − α)^{k+r}. Then for each i ≤ r, we have that

g^{(i)}(T) = f^{(i)}(T) + γ · (k+r choose i) · (T − α)^{k+r−i}.

In particular, we also have g^{(i)}(T) = f^{(i)}(T) mod (T − α)^k.
Thus by Taylor expansion, we can write Q(T, g(T), g^{(1)}(T), ..., g^{(r)}(T)) as:

Q(T, f(T), ..., f^{(r)}(T)) + Σ_{i=0}^{r} (∂Q/∂Y_i)(T, f(T), ..., f^{(r)}(T)) · γ · (k+r choose i) · (T − α)^{k+r−i} + (T − α)^{2k} · h(T),

for some h(T) ∈ F[T] (the last term in the above expression arises from the degree-2 and higher terms in the Taylor expansion; since each g^{(i)}(T) − f^{(i)}(T) is divisible by (T − α)^k, we get that all the degree-2 and higher terms in the Taylor expansion are divisible by (T − α)^{2k}).
Considering this equation mod (T − α)^{k+1} (using the fact that 2k ≥ k+1, and that for each i < r the exponent k+r−i is at least k+1, so only the i = r term of the sum survives), we get:

Q(T, g(T), g^{(1)}(T), ..., g^{(r)}(T))
= (T − α)^k · ( β + (∂Q/∂Y_r)(T, f(T), ..., f^{(r)}(T)) · γ · (k+r choose r) ) mod (T − α)^{k+1}
= (T − α)^k · ( β + (∂Q/∂Y_r)(α, f(α), ..., f^{(r)}(α)) · γ · (k+r choose r) ) mod (T − α)^{k+1}.
Note that

(∂Q/∂Y_r)(α, f(α), ..., f^{(r)}(α)) · (k+r choose r)

is nonzero in F (by the hypotheses of the theorem), and thus there is a unique γ that makes Q(T, g(T), g^{(1)}(T), ..., g^{(r)}(T)) equal to 0 mod (T − α)^{k+1}, namely:

γ = −β / ( (∂Q/∂Y_r)(α, f(α), ..., f^{(r)}(α)) · (k+r choose r) ).
This theorem lets us start with an “initial solution” of the differential equation mod (T − α), and successively improve it to find solutions mod (T − α)^2, (T − α)^3, etc., until we find all solutions of degree at most d. We record this below.
Corollary 4.5. Let F be a finite field of characteristic p. Let Q(X, Y_0, ..., Y_r) ∈ F[X, Y_0, ..., Y_r]. Let α, β^{(0)}, β^{(1)}, ..., β^{(r)} ∈ F. Suppose that

(∂Q/∂Y_r)(α, β^{(0)}, ..., β^{(r)}) ≠ 0.

Let L be the set of all polynomials P(T) ∈ F[T] of degree at most d satisfying

Q(T, P(T), P^{(1)}(T), ..., P^{(r)}(T)) = 0

and the initial conditions:

P(α) = β^{(0)},  P^{(1)}(α) = β^{(1)},  ...,  P^{(r)}(α) = β^{(r)}.

Then |L| ≤ |F|^{r·⌊d/p⌋ + r}, and L can be found in time |F|^{O(r·⌊d/p⌋ + r)}.
Proof. We describe an algorithm to find L. From the description of the algorithm, the bound on |L| will
be clear.
Start with f_0(T) = β^{(0)} + β^{(1)}·(T − α) + · · · + β^{(r)}·(T − α)^r.

1. Let L_r = { f_0(T) }.

2. For i = r+1, r+2, ..., d, do:

(a) If (i choose r) is relatively prime to p:
• For each f(T) ∈ L_{i−1}, let g(T) = f(T) + γ · (T − α)^i be what is given by Theorem 4.4 (applied with k = i − r), and include g(T) in L_i.

(b) Otherwise:
• For each f(T) ∈ L_{i−1} and for each γ ∈ F, let g(T) = f(T) + γ · (T − α)^i, and include g(T) in L_i.

3. Let L* = { P(T) ∈ L_d | Q(T, P(T), P^{(1)}(T), ..., P^{(r)}(T)) = 0 }.

4. Output L*.
The correctness of this algorithm follows easily from Theorem 4.4. The invariant maintained is that

L_i ⊇ { P(T) mod (T − α)^{i+1} | P(T) ∈ L }.

To bound |L|, we observe that |L| = |L*| ≤ |L_d|, that |L_r| = 1, and that for each i ≤ d we have |L_i| = |L_{i−1}| if (i choose r) is relatively prime to p, and |L_i| = |F| · |L_{i−1}| otherwise. This implies that |L| ≤ |F|^c, where c is the number of integers i in {r+1, ..., d} such that (i choose r) is divisible by p. This implies the claimed bounds on the running time of the algorithm and the size of the output list.
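The lifting procedure is concrete enough to implement directly. Here is a minimal Python sketch of the algorithm of Corollary 4.5 (our own simplification: it works over a prime field F_p rather than F_{q^2}, and the differential equation P^{(1)}(T) = 3T^2 used at the end is just a toy example, not taken from the paper). Polynomials are coefficient lists, Q is a dictionary from monomials X^e·Y_0^{e_0}···Y_r^{e_r} to coefficients, and all derivatives are Hasse derivatives:

```python
from math import comb

p = 7  # a prime; all polynomial arithmetic is over F_p

def trim(f):
    f = [c % p for c in f]
    while f and f[-1] == 0:
        f.pop()
    return f

def padd(f, g):
    n = max(len(f), len(g))
    return trim([(f[i] if i < len(f) else 0) + (g[i] if i < len(g) else 0)
                 for i in range(n)])

def pmul(f, g):
    if not f or not g:
        return []
    h = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] += a * b
    return trim(h)

def hasse(f, j):
    # j-th Hasse derivative: f^(j)(T) = sum_n C(n, j) f_n T^(n-j)
    return trim([comb(n, j) * f[n] for n in range(j, len(f))])

def peval(f, x):
    return sum(c * pow(x, n, p) for n, c in enumerate(f)) % p

def compose(Q, f, r):
    # H(T) = Q(T, f(T), f^(1)(T), ..., f^(r)(T)); Q maps (e, e_0, ..., e_r)
    # to the coefficient of X^e Y_0^{e_0} ... Y_r^{e_r}.
    ders = [hasse(f, j) for j in range(r + 1)]
    H = []
    for (e, *es), c in Q.items():
        term = [0] * e + [c % p]  # c * T^e
        for j, ej in enumerate(es):
            for _ in range(ej):
                term = pmul(term, ders[j])
        H = padd(H, term)
    return H

def dQ_dYr(Q, r, f, alpha):
    # (dQ/dY_r) evaluated at (alpha, f(alpha), ..., f^(r)(alpha))
    pt = [peval(hasse(f, j), alpha) for j in range(r + 1)]
    tot = 0
    for (e, *es), c in Q.items():
        if es[r]:
            v = c * es[r] * pow(alpha, e, p)
            for j, ej in enumerate(es):
                v *= pow(pt[j], ej - (j == r), p)
            tot += v
    return tot % p

def solve_ivp(Q, r, d, alpha, betas):
    # All P of degree <= d with Q(T, P, ..., P^(r)) = 0 and P^(j)(alpha) = betas[j].
    f, pw = [], [1]
    for b in betas:                 # f_0(T) = sum_j betas[j] * (T - alpha)^j
        f = padd(f, pmul([b], pw))
        pw = pmul(pw, [(-alpha) % p, 1])
    level, shift = [f], pw          # shift = (T - alpha)^(r+1)
    for i in range(r + 1, d + 1):
        nxt = []
        for f in level:
            if comb(i, r) % p:      # case (a): nonsingular lift, with k = i - r
                beta = peval(hasse(compose(Q, f, r), i - r), alpha)
                den = dQ_dYr(Q, r, f, alpha) * comb(i, r) % p
                gam = (-beta) * pow(den, p - 2, p) % p
                nxt.append(padd(f, pmul([gam], shift)))
            else:                   # case (b): try every gamma in F_p
                nxt.extend(padd(f, pmul([g], shift)) for g in range(p))
        level, shift = nxt, pmul(shift, [(-alpha) % p, 1])
    return [f for f in level if not compose(Q, f, r)]

# Toy equation: P^(1)(T) = 3 T^2 with P(0) = 2, P^(1)(0) = 0, i.e. Q = Y_1 - 3 X^2.
Q = {(0, 0, 1): 1, (2, 0, 0): (-3) % p}
sols = solve_ivp(Q, r=1, d=3, alpha=0, betas=[2, 0])
print(sols)  # [[2, 0, 0, 1]]  -- the unique solution P(T) = 2 + T^3
```

Each nonsingular step computes γ exactly as in Theorem 4.4, and the final filter discards spurious lifts, as in Step 3 of the algorithm above.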
We can now prove the main theorem on the solvability of polynomial differential equations, Theorem 4.3.
Proof of Theorem 4.3. We give the algorithm below; the bound on the number of solutions will follow from the analysis of the algorithm.
Algorithm SOLVE(Q):

Here Q(X, Y_0, ..., Y_r) ∈ F_q[X, Y_0, ..., Y_r] is a nonzero polynomial with deg_{Y_i}(Q) < q for each i.

1. Initialize L = ∅.

2. If Q(X, Y_0, ..., Y_r) does not depend on any of Y_0, ..., Y_r, return L and exit.

3. Let r* be the largest j such that Q(X, Y_0, ..., Y_r) depends on Y_j.

4. For each (α, β_0, ..., β_{r*}) ∈ F_{q^2}^{r*+2}, do the following:

• If

Q(α, β_0, ..., β_{r*}) = 0  and  (∂Q/∂Y_{r*})(α, β_0, ..., β_{r*}) ≠ 0,

then using the algorithm of Corollary 4.5 (with F = F_{q^2}), find all² P(T) ∈ F_q[T] of degree at most d with P^{(j)}(α) = β_j for 0 ≤ j ≤ r*, satisfying Q(T, P(T), P^{(1)}(T), ..., P^{(r*)}(T)) = 0.

• Include all such P(T) in the list L.

5. Let Q#(X, Y_0, ..., Y_{r*}) = (∂Q/∂Y_{r*})(X, Y_0, ..., Y_{r*}).

6. Let L# = SOLVE(Q#).

7. Output L ∪ L#.
Correctness. We will now prove correctness of the algorithm. Namely, every solution of degree at most d of the polynomial-differential equation

Q(T, P(T), P^{(1)}(T), ..., P^{(r)}(T)) = 0

is included in the output of the algorithm.
Let r* be the largest j such that Q(X, Y_0, ..., Y_r) depends on Y_j. We prove the correctness of the algorithm by induction on r* and deg_{Y_{r*}}(Q).

The base case is when Q does not depend on any of Y_0, ..., Y_r. In this case, the algorithm outputs ∅, which is correct.

Now suppose we know correctness of the algorithm for all polynomials Q which depend only on (some subset of) Y_0, ..., Y_{r*−1}, and for all polynomials Q which depend only on (some subset of) Y_0, ..., Y_{r*} with deg_{Y_{r*}}(Q) < D.
² More precisely, we use the algorithm of Corollary 4.5 to find all such P(T) ∈ F_{q^2}[T], and then we only keep the ones that are in F_q[T].
We now show that the algorithm is correct for all polynomials Q which depend on (some subset of) Y_0, ..., Y_{r*} and with deg_{Y_{r*}}(Q) ≤ D. Suppose P̂(T) ∈ F_q[T] is a polynomial of degree at most d such that

Q(T, P̂(T), P̂^{(1)}(T), ..., P̂^{(r*)}(T)) = 0.
Case 1: Suppose P̂(T) also satisfies the differential equation

(∂Q/∂Y_{r*})(T, P̂(T), P̂^{(1)}(T), ..., P̂^{(r*)}(T)) = 0.
Note that since 0 < deg_{Y_{r*}}(Q) < q, the polynomial (∂Q/∂Y_{r*})(X, Y_0, ..., Y_r) is nonzero. Then by the induction hypothesis applied to the polynomial (∂Q/∂Y_{r*})(X, Y_0, ..., Y_r), P̂(T) will be included in the list L#, and hence in the output of the algorithm.
Case 2: Otherwise, we know that P̂(T) is such that

H(T) = (∂Q/∂Y_{r*})(T, P̂(T), P̂^{(1)}(T), ..., P̂^{(r*)}(T)) ≠ 0.

Note that the degree of H(T) is less than q^2. Thus there is some α ∈ F_{q^2} such that H(α) ≠ 0. For that α, we have

(∂Q/∂Y_{r*})(α, P̂(α), P̂^{(1)}(α), ..., P̂^{(r*)}(α)) ≠ 0, while Q(α, P̂(α), P̂^{(1)}(α), ..., P̂^{(r*)}(α)) = 0.
Thus in the iteration of Step 4 of the algorithm in which we take

β_0 = P̂(α),  β_1 = P̂^{(1)}(α),  ...,  β_{r*} = P̂^{(r*)}(α),

P̂(T) will be included in the list L, and hence in the output of the algorithm.

Thus in either case P̂(T) appears in the output of the algorithm, as desired.
Running time. The running time of Step 4 of the algorithm can be bounded by

q^{2(r+2)} · q^{O((d/q)·r + r)} = q^{O(r + dr/q)}.

The total number of recursive calls of the algorithm is bounded by

Σ_{j=0}^{r} (deg_{Y_j}(Q) + 1) ≤ (r+1) · q.

Thus the total running time of the algorithm can be bounded by q^{O(r + dr/q)}.
List size. The total number of solutions to the differential equation found by Step 4 is bounded by

q^{2(r+2) + 2r·⌊d/q⌋ + 2r}.

Unravelling the recursion, we get that the total number of solutions found by the algorithm is at most

(r+1) · t · q^{2(r+2) + 2r·⌊d/q⌋ + 2r}.
5 List-decoding multivariate multiplicity codes

In this section, we will prove Theorem 3.2 on the list-decoding of multivariate multiplicity codes over F_q^m. The main technical result is the following, which reduces the problem of list-decoding multivariate multiplicity codes over F_q^m to several instances of list-decoding univariate multiplicity codes over F_{q^m}.
Theorem 5.1. Suppose there exists an algorithm A which runs in time T and list-decodes the multiplicity code of order-s evaluations of degree-dq^{m−1} polynomials in 1 variable over F_{q^m} from an η fraction of errors with list-size at most L.

Then there exists an algorithm that list-decodes the multiplicity code of order-s evaluations of degree-d polynomials in m variables over F_q from an η fraction of errors, which runs in time

T · (m+s−1 choose m) + poly( q^m, (d+m choose m) ) · L^{(m+s−1 choose m)},

and has list-size at most L^{(m+s−1 choose m)}.
Given this, we now complete the proof of Theorem 3.2.
Proof of Theorem 3.2. The main input that we need is a good list-decoding algorithm for univariate multiplicity codes. Such an algorithm follows from [11]. In particular, this says that every order-s univariate multiplicity code over F_q with distance δ can be list-decoded from a (1 − √(1−δ)) fraction of errors with list-size O(q) in time poly(q^s), and from a (1 − √(1−δ) − ε) fraction of errors with list-size poly(1/ε) in time poly(q^s). (The same algorithm, in a more explicit form, is given by the r = 0 case of Theorem 3.1, but the analysis in Theorem 4.3 has to be done more carefully to get the claimed list-size bound.)
Combining this list-decoding algorithm for univariate multiplicity codes with Theorem 5.1, we get the desired list-decoding algorithm. Explicitly, it gives an algorithm that list-decodes m-variate order-s multiplicity codes of degree-d polynomials over F_q with the following two sets of parameters:

1. • from a 1 − √(1 − d·q^{m−1}/(s·q^m)) fraction of errors,

• with list-size q^{(m+O(1))·(m+s−1 choose m)},

• in time poly(q^m) · (m+s−1 choose m) + poly(q^m) · q^{m·(m+s−1 choose m)}.
2. • from a 1 − √(1 − d·q^{m−1}/(s·q^m)) − ε fraction of errors,

• with list-size (1/ε)^{O((m+s−1 choose m))},

• in time poly(q^m) · (m+s−1 choose m) + poly(q^m) · (1/ε)^{O((m+s−1 choose m))}.

Noting that

d·q^{m−1} / (s·q^m) = δ,

this completes the proof of the theorem.
The rest of this section is devoted to a proof of Theorem 5.1.
5.1 F_{q^m} and F_q^m
Let Tr : F_{q^m} → F_q denote the finite field trace:

Tr(t) = t + t^q + · · · + t^{q^{m−1}}.

We also use Tr to denote the associated polynomial:

Tr(T) = T + T^q + · · · + T^{q^{m−1}} ∈ F_{q^m}[T].
We will be working with bases for F_{q^m} over F_q. Abusing notation, we will say that

a = (α_1, α_2, ..., α_m) ∈ F_{q^m}^m

is a basis for F_{q^m} over F_q if {α_1, α_2, ..., α_m} is a basis for F_{q^m} over F_q.
Definition 5.2 (s-general position). Let a_1, a_2, ..., a_M ∈ F_{q^m}^m be bases for F_{q^m} over F_q. We say that a_1, ..., a_M are in s-general position if for every nonzero R(X_1, ..., X_m) ∈ F_{q^m}[X_1, ..., X_m] with deg(R) < s, there is some j ∈ [M] such that R(a_j) ≠ 0.
Lemma 5.3. If s ≤ q^m − q^{m−1} and M ≥ (m+s−1 choose m), then there exists a collection of bases a_1, ..., a_M ∈ F_{q^m}^m in s-general position. Furthermore, such a collection can be found in time

poly( (q^m)^{(m+s−1 choose m)} ).
Proof. Note that the property of being in s-general position is monotone, and so it suffices to prove the lemma with M = (m+s−1 choose m).

Let B ⊆ F_{q^m}^m be the set of all bases of F_{q^m} over F_q. It is a well-known fact that B can be described as the set of non-zeroes of a certain degree-q^{m−1} polynomial, as follows:

B = { (x_1, ..., x_m) ∈ F_{q^m}^m | Z(x_1, ..., x_m) ≠ 0 },
where Z is the determinant of the m × m Moore matrix whose (i, j) entry is X_j^{q^{i−1}}:

Z(X_1, ..., X_m) = det( X_j^{q^{i−1}} )_{1 ≤ i, j ≤ m}.
The space of all R(X_1, ..., X_m) ∈ F_{q^m}[X_1, ..., X_m] with deg(R) < s is (m+s−1 choose m)-dimensional. To each such polynomial R, we associate a vector v_R ∈ F_{q^m}^B, where the coordinate of v_R corresponding to (x_1, ..., x_m) ∈ B equals R(x_1, ..., x_m).

Now we will show that the dimension of

{ v_R | R(X_1, ..., X_m) ∈ F_{q^m}[X_1, ..., X_m], deg(R) < s }

also equals (m+s−1 choose m). Otherwise, there would exist some nonzero polynomial R(X_1, ..., X_m) with deg(R) < s such that R(x_1, ..., x_m) = 0 for every (x_1, ..., x_m) ∈ B. Then the polynomial R · Z(X_1, ..., X_m) would vanish on every (x_1, ..., x_m) ∈ F_{q^m}^m, which is impossible, since R · Z is nonzero and its individual degrees are less than s + q^{m−1} ≤ q^m.
Therefore, there is a subset

A = { a_1, ..., a_{(m+s−1 choose m)} }

of B of size (m+s−1 choose m) such that the set of restrictions of the vectors v_R to the coordinates in A also has dimension (m+s−1 choose m). This implies that every nonzero R (of degree < s) must be nonzero on at least one of a_1, ..., a_{(m+s−1 choose m)}.

This argument can be turned into an algorithm for finding the a_i by considering the matrix whose rows consist of the v_R, and then finding a maximal collection of linearly independent columns of it.
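Selecting a maximal collection of linearly independent columns is ordinary linear algebra; a generic sketch over a prime field F_p (our own helper code, not the paper's notation; over F_{q^m} one would substitute extension-field arithmetic) looks like this:

```python
p = 5  # a prime; matrix entries are in F_p

def independent_columns(rows, p):
    # Greedy sweep: keep a column iff it is not in the span of those kept so far.
    # 'basis' holds echelonized copies of the kept columns.
    n = len(rows)
    basis, kept = [], []
    for c in range(len(rows[0])):
        col = [rows[i][c] % p for i in range(n)]
        for b in basis:
            lead = next(i for i in range(n) if b[i])
            f = col[lead] * pow(b[lead], p - 2, p) % p
            col = [(x - f * y) % p for x, y in zip(col, b)]
        if any(col):
            basis.append(col)
            kept.append(c)
    return kept

# Rows of a small matrix over F_5; column 2 = column 0 + column 1.
M = [[1, 0, 1, 3],
     [0, 1, 1, 4],
     [2, 0, 2, 2]]
print(independent_columns(M, p))  # [0, 1, 3]
```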
Explicit bases in s-general position. Though not required for the list-decoding application, we remark that it is possible to construct bases in s-general position in time poly(q, m), provided s < q. Let S ⊆ F_q^m be an interpolating set for polynomials of degree < s (there are many known simple constructions of such sets). Let a ∈ F_{q^m}^m be a basis for F_{q^m} over F_q. Then {v + a | v ∈ S} ⊆ F_{q^m}^m is a set of bases in s-general position.
5.2 Curves parametrizing F_q^m

We will be interested in certain curves parametrizing F_q^m. Associated to every basis of F_{q^m} over F_q, there is one such curve. Let a = (α_1, ..., α_m) ∈ F_{q^m}^m be a basis for F_{q^m} over F_q. Define the curve γ_a : F_{q^m} → F_q^m given by

γ_a(t) = (Tr(α_1 t), Tr(α_2 t), ..., Tr(α_m t)).

We also view γ_a as a tuple of polynomials:

γ_a(T) = (Tr(α_1 T), Tr(α_2 T), ..., Tr(α_m T)) ∈ (F_{q^m}[T])^m.

We now list the salient features of the curve γ_a.
Lemma 5.4 (Properties of γ_a). Let a, γ_a be as above. Then:

• γ_a parametrizes F_q^m: γ_a(F_{q^m}) = F_q^m. In particular, γ_a gives a bijection between F_{q^m} and F_q^m.

• γ_a is F_q-linear: for α_1, α_2 ∈ F_q, we have

γ_a(α_1 T_1 + α_2 T_2) = α_1 γ_a(T_1) + α_2 γ_a(T_2).

• γ_a looks linear mod T^q: in the ring F_{q^m}[T], we have γ_a(T) ≡ aT mod T^q.

This lemma follows from elementary properties of the Tr function; see, e. g., [19].
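The smallest case is easy to verify by machine. The sketch below (our own toy check, with F_4 = F_2[x]/(x^2 + x + 1) and elements encoded as 2-bit integers) computes Tr(t) = t + t^2 and confirms that γ_a is a bijection from F_4 onto F_2^2 for the basis a = (1, x):

```python
# F_4 = F_2[x]/(x^2 + x + 1); an element b1*x + b0 is the integer b1*2 + b0.
MOD = 0b111  # x^2 + x + 1

def f4_mul(u, v):
    # Carry-less multiplication followed by reduction mod x^2 + x + 1.
    r = 0
    for shift in range(2):
        if (v >> shift) & 1:
            r ^= u << shift
    for deg in (3, 2):
        if (r >> deg) & 1:
            r ^= MOD << (deg - 2)
    return r

def tr(t):
    # Trace from F_4 down to F_2: Tr(t) = t + t^q with q = 2.
    return t ^ f4_mul(t, t)

def gamma(a, t):
    # The curve gamma_a(t) = (Tr(alpha_1 t), ..., Tr(alpha_m t)).
    return tuple(tr(f4_mul(alpha, t)) for alpha in a)

a = (0b01, 0b10)  # the basis (1, x) of F_4 over F_2
images = {gamma(a, t) for t in range(4)}
print(sorted(images))  # [(0, 0), (0, 1), (1, 0), (1, 1)]: gamma_a is a bijection
```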
These properties of γ_a allow us to relate the evaluations of a multivariate polynomial and its derivatives over F_q^m to the evaluations of a univariate polynomial and its derivatives over F_{q^m}. We do this in the following lemma.
Lemma 5.5. Let Q(X_1, ..., X_m) ∈ F_q[X_1, ..., X_m] and let P(T) ∈ F_{q^m}[T] be given by P(T) = Q ∘ γ_a(T). Then for every t ∈ F_{q^m} and every j < q:

P^{(j)}(t) = Σ_{i : wt(i) = j} Q^{(i)}(γ_a(t)) · a^i.
Proof. By the definition of derivatives, we have:

P(t + W) = Σ_j P^{(j)}(t) · W^j,

Q(γ_a(t) + X) = Σ_i Q^{(i)}(γ_a(t)) · X^i.

By linearity, γ_a(t + W) = γ_a(t) + γ_a(W). So

P(t + W) = Q ∘ γ_a(t + W) = Q(γ_a(t) + γ_a(W)) = Σ_i Q^{(i)}(γ_a(t)) · (γ_a(W))^i.

Taking this equation mod W^q, we get:

Σ_{j < q} P^{(j)}(t) · W^j = Σ_{i : wt(i) < q} Q^{(i)}(γ_a(t)) · (aW)^i mod W^q.

For j < q, note that the coefficient of W^j in the right-hand side of this equation equals

Σ_{i : wt(i) = j} Q^{(i)}(γ_a(t)) · a^i.

On the other hand, the coefficient of W^j in the left-hand side of the equation equals P^{(j)}(t). We therefore conclude that for each j with 0 ≤ j < q, we have

P^{(j)}(t) = Σ_{i : wt(i) = j} Q^{(i)}(γ_a(t)) · a^i.
5.3 Reducing multivariate decoding to univariate decoding

Using the γ_a parametrization of F_q^m, we will now see how to reduce decoding of multivariate multiplicity codes over F_q^m to several instances of decoding univariate multiplicity codes over F_{q^m}.
Proof of Theorem 5.1. We first describe the list-decoding algorithm, and then we analyze it.
Algorithm for Reducing Multivariate Decoding to Univariate Decoding:

1. Let M = (m+s−1 choose m). Pick bases a_1, a_2, ..., a_M ∈ F_{q^m}^m in s-general position.

2. For each i ∈ [M], define ℓ_i : F_{q^m} → F_{q^m}^s as follows. For each j with 0 ≤ j < s, let

(ℓ_i(t))_j = Σ_{i : wt(i) = j} r^{(i)}(γ_{a_i}(t)) · a_i^i.

(Here the summation index i ranges over multi-indices with wt(i) = j, while the subscript i on a_i and γ_{a_i} is the basis index i ∈ [M].)

3. Using algorithm A, compute the set L_i of all P(T) ∈ F_{q^m}[T] of degree at most dq^{m−1} such that

∆( Enc_{s, dq^{m−1}, 1, q^m}(P), ℓ_i ) < η.

4. For every

(P_1(T), P_2(T), ..., P_M(T)) ∈ ∏_{i=1}^{M} L_i,

find all Q(X_1, ..., X_m) ∈ F_q[X_1, ..., X_m] with deg(Q) ≤ d such that for each i ∈ [M],

Q ∘ γ_{a_i}(T) = P_i(T).

(This is a system of linear equations over F_{q^m} with (d+m choose m) variables and (d+1)·M constraints.)

5. Output the list of all such Q(X_1, ..., X_m).
We first show that in Step 4 of the algorithm, there can be at most one Q satisfying the given system of equations. This immediately implies that the algorithm runs in the claimed time and has the claimed list-size bound.

Suppose there were two distinct polynomials Q_1(X_1, ..., X_m) and Q_2(X_1, ..., X_m) which satisfied the system of linear equations in Step 4 of the algorithm. Then their difference Q(X_1, ..., X_m) would have the property that for each i ∈ [M], the univariate polynomial Q ∘ γ_{a_i}(T) is identically 0. By the following theorem, we get that Q must be identically 0, contradicting the distinctness of Q_1 and Q_2.
Theorem 5.6. Let s < q. Let Q(X_1, ..., X_m) ∈ F_q[X_1, ..., X_m] be a polynomial of degree d < sq. Let a_1, ..., a_M ∈ F_{q^m}^m be bases of F_{q^m} over F_q in s-general position.

Suppose that for each i ∈ [M], the polynomial Q ∘ γ_{a_i}(T) ∈ F_{q^m}[T] equals 0. Then Q(X_1, ..., X_m) = 0.
We postpone the proof of this theorem to the end of this section.
Now we show correctness of the algorithm. Namely, we show that for any Q̂(X_1, ..., X_m) ∈ F_q[X_1, ..., X_m] such that ∆(Enc_{s,d,m,q}(Q̂), r) < η, the algorithm will include Q̂(X_1, ..., X_m) in its output.

We say a ∈ F_q^m is a point of agreement if Q̂^{(<s)}(a) = r(a).
Fix i ∈ [M]. Let P̂_i(T) = Q̂ ∘ γ_{a_i}(T). We will show that

∆( Enc_{s, dq^{m−1}, 1, q^m}(P̂_i), ℓ_i ) < η.

Recall that γ_{a_i} : F_{q^m} → F_q^m is a bijection. Thus for a ∈ F_q^m there is a unique t_a ∈ F_{q^m} such that γ_{a_i}(t_a) = a.
Now for every point of agreement a, we have that for every j with 0 ≤ j < s:

(P̂_i)^{(j)}(t_a) = Σ_{i : wt(i) = j} Q̂^{(i)}(γ_{a_i}(t_a)) · a_i^i   (by Lemma 5.5)
= Σ_{i : wt(i) = j} Q̂^{(i)}(a) · a_i^i
= Σ_{i : wt(i) = j} r^{(i)}(a) · a_i^i   (since a is a point of agreement)
= Σ_{i : wt(i) = j} r^{(i)}(γ_{a_i}(t_a)) · a_i^i
= (ℓ_i(t_a))_j.

Thus every point of agreement a gives rise to an agreement (P̂_i)^{(<s)}(t_a) = ℓ_i(t_a), and therefore

∆( Enc_{s, dq^{m−1}, 1, q^m}(P̂_i), ℓ_i ) < η.
Thus when we finish Step 3 of the algorithm, we know that P̂_i ∈ L_i for each i ∈ [M]. Thus when we execute Step 4 of the algorithm with P_i = P̂_i for each i ∈ [M], Q̂ will be a solution to the system of linear equations (by the earlier discussion, it will in fact be the unique solution). Thus Q̂ will be in the output of the algorithm, as desired.
We now prove Theorem 5.6.
Proof of Theorem 5.6. We will show that for each a ∈ F_q^m, mult(Q, a) ≥ s. Then by Lemma 2.3 (recalling that deg(Q) < sq), we can conclude that Q(X_1, ..., X_m) = 0.

Fix i ∈ [M]. We have Q ∘ γ_{a_i}(T) = 0. By Lemma 5.5, we conclude that for every t ∈ F_{q^m} and every j < q,

Σ_{i : wt(i) = j} Q^{(i)}(γ_{a_i}(t)) · a_i^i = 0.

Thus for every a ∈ F_q^m, every j < q, and every i ∈ [M],

Σ_{i : wt(i) = j} Q^{(i)}(a) · a_i^i = 0.
For 0 ≤ j < q, let R_{a,j}(Y) be the polynomial Σ_{i : wt(i) = j} Q^{(i)}(a) · Y^i. We just showed that for each i ∈ [M] and j < q, we have R_{a,j}(a_i) = 0. By the s-general position hypothesis on the a_i, this implies that for each j < s, the polynomial R_{a,j}(Y) is itself identically 0.

But the coefficients of R_{a,j}(Y) are the Q^{(i)}(a) for i satisfying wt(i) = j. Thus Q^{(i)}(a) = 0 for each a and each i with wt(i) < s. Therefore mult(Q, a) ≥ s for each a ∈ F_q^m, as desired.
6 Interpolating sets for multiplicity codes

In this section, we describe an explicit interpolating set for multiplicity codes. Concretely, for given d, s, m, q, we find a set S ⊆ F_q^m × {i | wt(i) < s} such that for every f : S → F_q, there is exactly one Q(X_1, ..., X_m) ∈ F_q[X_1, ..., X_m] of degree at most d such that for each (a, i) ∈ S, Q^{(i)}(a) = f(a, i). By simple linear-algebra arguments it can be seen that such sets S exist, and that any such set S must have |S| = (d+m choose m). Our goal here is to describe an explicit such set; this will enable the local decoding of multiplicity codes in uniform sublinear time.
The main fact that we will need is that there are explicit interpolating sets for traditional multivariate polynomial codes. Specifically, for each d < mq, there is an explicit set S_d ⊆ F_q^m such that for every f : S_d → F_q, there is exactly one A(X_1, ..., X_m) ∈ F_q[X_1, ..., X_m] of degree at most d and individual degree < q, such that A(a) = f(a) for each a ∈ S_d.
Before solving the problem in full generality, let us consider a simple special case where we can see what is actually going on. Suppose we have m = 2 and s = 2; thus we are dealing with bivariate polynomials of degree ≤ d < 2q, and we want to find 3 sets of points S_0, S_X, S_Y ⊆ F_q^2 such that if we learn the values of a degree-d polynomial Q on S_0, the values of ∂Q/∂X on S_X and the values of ∂Q/∂Y on S_Y, we will be able to uniquely recover Q. Now Q(X, Y) can be written uniquely as

Q(X, Y) = A(X, Y) + (X^q − X)·B(X, Y) + (Y^q − Y)·C(X, Y),

where A, B, C all have X and Y degrees ≤ q−1, the total degree of A is at most d, and the total degrees of B and C are at most d − q. Given this representation of Q, the coordinates of the codeword of the multiplicity code corresponding to Q are easy to compute (using the product rule):
Q(a) = A(a),

(∂Q/∂X)(a) = (∂A/∂X)(a) − B(a),

(∂Q/∂Y)(a) = (∂A/∂Y)(a) − C(a).
This motivates the following choice for S_0, S_X and S_Y. We pick S_0 to be an interpolating set for polynomials of total degree ≤ d and individual degree ≤ q − 1. We pick S_X and S_Y to be interpolating sets for polynomials of total degree ≤ d − q and individual degree³ at most q − 1. Given evaluations of Q on S_0, we
³ In our special case this latter condition is vacuous.
get evaluations of A on S_0, and thus we can recover A by the choice of S_0. Once we have A, the evaluations of ∂Q/∂X at the points of S_X give us the evaluations of B on S_X, and thus we can recover B. Similarly we can recover C. Putting this all together, we have found A, B and C, and hence we have found Q. Thus we have found an interpolating set in this special case. The ideas here generalize quite easily to the general case.
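For concreteness, here is an even smaller analogue with one variable (m = 1, s = 2) over a prime field, coded up end to end (the code and parameter choices are our own illustration). We write Q = A + (X^q − X)·B with deg A ≤ q − 1 and deg B ≤ d − q, read off the codeword entries (Q(a), Q^{(1)}(a)), recover A by interpolating the values Q(a), and then recover B from B(a) = A^{(1)}(a) − Q^{(1)}(a):

```python
q, d = 7, 10  # prime field size and degree, with q <= d < 2q (toy parameters)

def peval(f, x):
    return sum(c * pow(x, n, q) for n, c in enumerate(f)) % q

def deriv(f):
    # First (Hasse = ordinary) derivative.
    return [n * c % q for n, c in enumerate(f)][1:]

def interpolate(xs, ys):
    # Lagrange interpolation over F_q: the unique polynomial of degree < len(xs).
    n = len(xs)
    poly = [0] * n
    for i in range(n):
        num, den = [1], 1
        for j in range(n):
            if j != i:  # multiply num by (X - xs[j])
                num = [((num[k - 1] if k else 0)
                        - xs[j] * (num[k] if k < len(num) else 0)) % q
                       for k in range(len(num) + 1)]
                den = den * (xs[i] - xs[j]) % q
        coef = ys[i] * pow(den, q - 2, q) % q
        poly = [(poly[k] + coef * num[k]) % q for k in range(n)]
    return poly

# A test polynomial Q of degree d (arbitrary coefficients).
Qc = [c % q for c in [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]]

# The codeword entries: (Q(a), Q'(a)) for every a in F_q.
word = {a: (peval(Qc, a), peval(deriv(Qc), a)) for a in range(q)}

# Step 1: A(a) = Q(a) for all a, and deg A <= q - 1: interpolate all q values.
A = interpolate(list(range(q)), [word[a][0] for a in range(q)])

# Step 2: B(a) = A'(a) - Q'(a), and deg B <= d - q: d - q + 1 points suffice.
xs = list(range(d - q + 1))
B = interpolate(xs, [(peval(deriv(A), a) - word[a][1]) % q for a in xs])

# Step 3: reassemble Q = A + (X^q - X) * B.
Qrec = [0] * (d + 1)
for n, c in enumerate(A):
    Qrec[n] = (Qrec[n] + c) % q
for n, c in enumerate(B):
    Qrec[n + q] = (Qrec[n + q] + c) % q
    Qrec[n + 1] = (Qrec[n + 1] - c) % q
print(Qrec == Qc)  # True: Q recovered from its order-2 evaluations
```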
For each vector of nonnegative integers k = (k_1, ..., k_m), introduce the polynomial

V_k(X) = ∏_{j=1}^{m} (X_j^q − X_j)^{k_j}.

For every i = (i_1, ..., i_m), a simple calculation shows that unless i ≥ k coordinate-wise, we have, for every a ∈ F_q^m:

(V_k)^{(i)}(a) = 0. (6.1)

Furthermore,

(V_k)^{(k)}(a) = (−1)^{wt(k)}. (6.2)
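Equations (6.1) and (6.2) can be spot-checked mechanically. Since V_k is a product of univariate factors, its Hasse derivatives factor coordinate-wise, so it suffices to check the univariate statement: for f(X) = (X^q − X)^k over F_q, f^{(i)}(a) = 0 for all i < k and f^{(k)}(a) = (−1)^k. A small check of ours (for a prime q):

```python
from math import comb

q = 5  # a prime

def pmul(f, g):
    h = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] = (h[i + j] + a * b) % q
    return h

def hasse_at(f, i, a):
    # i-th Hasse derivative of f at a: sum_n C(n, i) f_n a^(n-i).
    return sum(comb(n, i) * f[n] * pow(a, n - i, q) for n in range(i, len(f))) % q

base = [0] * (q + 1)          # the polynomial X^q - X
base[1], base[q] = -1 % q, 1

for k in range(1, 4):
    f = [1]
    for _ in range(k):
        f = pmul(f, base)      # (X^q - X)^k
    for a in range(q):
        assert all(hasse_at(f, i, a) == 0 for i in range(k))   # (6.1), univariate
        assert hasse_at(f, k, a) == (-1) ** k % q              # (6.2), univariate
print("checked (6.1)/(6.2) for q = 5, k = 1..3")
```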
Let Q(X_1, ..., X_m) ∈ F_q[X_1, ..., X_m] be a polynomial of degree ≤ d. Then Q can be written as

Q(X) = Σ_{k : wt(k) ≤ d/q} A_k(X) · V_k(X),

for some polynomials A_k(X) with individual degrees at most q − 1 and total degree at most

min( m·(q−1), d − wt(k)·q ).

Conversely, any collection of such A_k(X) gives a Q(X) of degree ≤ d.
We now show how the polynomials A_k(X), and hence the polynomial Q(X), can be recovered from certain explicit evaluations of Q and its derivatives.
We have:

Q^{(i)}(X) = Σ_k (A_k · V_k)^{(i)}(X) = Σ_k Σ_{j + j′ = i} (A_k)^{(j)}(X) · (V_k)^{(j′)}(X),

where the last step follows from the product rule for higher-order Hasse derivatives [18, pages 144–155].
Substituting a ∈ F_q^m into this equation, and using Equation (6.1) and Equation (6.2), we get:

Q^{(i)}(a) = Σ_k Σ_{j + j′ = i, j′ ≥ k} (A_k)^{(j)}(a) · (V_k)^{(j′)}(a) (6.3)

= (−1)^{wt(i)} · A_i(a) + Σ_{k ≠ i, k ≤ i} Σ_{j′ : i ≥ j′ ≥ k} (A_k)^{(i − j′)}(a) · (V_k)^{(j′)}(a). (6.4)
Thus we can find the evaluation of A_i(X) at the point a ∈ F_q^m if we know
• Q^{(i)}(a),

• the polynomial A_k(X) for each k < i.
This motivates the following construction of an interpolating set for the multiplicity code of order-s evaluations of degree-d polynomials in m variables over F_q.

• For each i with wt(i) ≤ d/q, we choose a set S_i ⊆ F_q^m which is an interpolating set for the space of all polynomials with individual degree at most q − 1 and total degree at most d − wt(i)·q.

• Then the interpolating set S is defined by:

S = ∪_{i : wt(i) < s} S_i × {i}.
We now show that for every f : S → F_q, there is exactly one Q(X_1, ..., X_m) ∈ F_q[X_1, ..., X_m] of degree at most d such that f(a, i) = Q^{(i)}(a) for each (a, i) ∈ S.

We do this by giving a procedure to find Q given Q^{(i)}(a) for (a, i) ∈ S.

1. Let i vary over the set {i | wt(i) ≤ d/q} in non-decreasing order of wt(i).

2. By the time we come to a given i, we will already have found A_{i′}(X) for every i′ < i.

3. Via (6.4), using the known values of Q^{(i)}(a) for a ∈ S_i and the known polynomials A_{i′}(X) (i′ < i), we can find the evaluations A_i(a) for each a ∈ S_i.

4. By the interpolation property of S_i, this suffices to find A_i(X).

This procedure gives us A_i(X) for each i with wt(i) ≤ d/q, and thus Q(X) is uniquely determined.
7 Local decoding and local list-decoding
In this section, we point out some corollaries concerning local decoding and local list-decoding of multiplicity codes.

Using the algorithm for unique-decoding of bivariate multiplicity codes of distance δ from a δ/2 fraction of errors, we get a local-decoding algorithm for decoding m-variate multiplicity codes of distance δ from a (δ/2 − ε) fraction of errors, for every constant ε > 0. Given a received word r : F_q^m → Σ, let us show how to locally correct it at the coordinate a ∈ F_q^m. Suppose Q is the polynomial of degree ≤ d whose corresponding codeword of the multiplicity code is within distance δ/2 − ε from r.
Local Decoder:

• Let M = (m+s−1 choose m).

• Pick lines ℓ_1, ..., ℓ_M passing through a uniformly and independently.

• For each ℓ_i, let p_i be a uniformly random 2-dimensional plane containing ℓ_i.
• By querying all the points of p_i, and decoding from a δ/2 fraction of errors, recover the restriction of Q to the plane p_i.

• Using this, we get the restriction of Q to each ℓ_i.

• Use this to recover Q^{(<s)}(a).
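The sampling in the first steps can be sketched as follows (our own helper code over a prime field, with lines and planes represented as explicit point sets):

```python
import random

q, m = 11, 3  # a prime field size and the dimension (toy choices; names are ours)

def rand_nonzero():
    while True:
        v = tuple(random.randrange(q) for _ in range(m))
        if any(v):
            return v

def is_multiple(c, b):
    i = next(k for k, x in enumerate(b) if x)
    lam = c[i] * pow(b[i], q - 2, q) % q
    return all(x == lam * y % q for x, y in zip(c, b))

def line_through(a):
    # A uniformly random line through the point a: {a + t*b : t in F_q}.
    b = rand_nonzero()
    return [tuple((x + t * y) % q for x, y in zip(a, b)) for t in range(q)]

def plane_containing(line):
    # A uniformly random 2-dimensional plane containing the given line.
    a = line[0]
    b = tuple((x - y) % q for x, y in zip(line[1], line[0]))  # line direction
    c = rand_nonzero()
    while is_multiple(c, b):      # second direction must be independent of b
        c = rand_nonzero()
    return {tuple((x + t * y + u * z) % q for x, y, z in zip(a, b, c))
            for t in range(q) for u in range(q)}

pt = (1, 2, 3)
ln = line_through(pt)
pl = plane_containing(ln)
assert pt in ln and len(set(ln)) == q       # the line passes through pt
assert set(ln) <= pl and len(pl) == q * q   # the plane contains the whole line
```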
We briefly sketch the analysis. For any i ∈ [M], notice that p_i is a uniformly random plane passing through a. Thus with high probability, p_i will have a < δ/2 fraction of errors on it. By the union bound, this will in fact happen for all the i ∈ [M] simultaneously with high probability, and thus all the computations of Q restricted to p_i and ℓ_i will be correct, and thus so will the computation of Q^{(<s)}(a) (provided the ℓ_i are in general enough position). Making this analysis formal can be done easily, following the analogous analysis in [22].
The local list-decoding algorithm involves a little more work. Recall that a local list-decoding
algorithm uses oracle access to a received word r and outputs a collection of oracles, such that for each
codeword in the list of codewords near r, some oracle in the output locally computes that codeword. We
describe the algorithm below:
Local List-decoder:
1. Set M = (m+s−1 choose m).
2. Pick lines ℓ_1, …, ℓ_M uniformly at random in F_q^m.
3. For each i ∈ [M], list-decode the univariate multiplicity code on r|ℓ_i from a (1 − √(1−δ) − ε/2) fraction of errors to get a list R_i of univariate polynomials.
4. For each (R_1, …, R_M) ∈ ∏_{i∈[M]} R_i, output the oracle Correct(A_{(ℓ_1,…,ℓ_M),(R_1,…,R_M)}).
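Behind step 2 and step 3 is the fact that a uniformly random line sees roughly the same error fraction as the whole space, so with high probability the univariate list-decoder on r|ℓ_i succeeds. The following self-contained experiment illustrates this concentration; the field size, planted error rate, and trial count are our own illustrative choices:

```python
# Empirical check that a uniformly random line in F_p^2 "samples well":
# the error fraction it sees concentrates near the global error fraction.
# Field size, planted error rate, and trial count are illustrative choices.
import random

P = 31
rng = random.Random(0)

points = [(x, y) for x in range(P) for y in range(P)]
errors = set(rng.sample(points, int(0.15 * P * P)))   # ~15% corrupted

def random_line(rng, p=P):
    """The p points of a uniformly random affine line in F_p^2."""
    a = (rng.randrange(p), rng.randrange(p))
    b = (0, 0)
    while b == (0, 0):
        b = (rng.randrange(p), rng.randrange(p))
    return [((a[0] + t * b[0]) % p, (a[1] + t * b[1]) % p) for t in range(p)]

fracs = [sum(pt in errors for pt in random_line(rng)) / P for _ in range(500)]
avg = sum(fracs) / len(fracs)
assert 0.10 < avg < 0.20                   # average line error rate ~ 0.15
assert sum(f > 0.30 for f in fracs) < 50   # few lines see double the rate
```

The formal statement needed in the analysis is a pairwise-independence-style sampling bound for random lines, not just this average-case behavior.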
Here Correct is the local correcting algorithm of [22]; when given as input an oracle which computes a received word within distance 0.1·δ of a codeword of a multiplicity code, it simulates an oracle which computes that codeword with high probability.
Next we describe the oracle A. It takes as advice a collection of lines and a collection of univariate polynomials restricted to those lines, and simulates oracle access to a function that is supposed to be 0.1·δ-close to a codeword of the multiplicity code; specifically, it is supposed to compute the unique codeword (if any) in the list of codewords close to r whose restriction to each line ℓ_i equals the univariate polynomial R_i.
Oracle A_{(ℓ_1,…,ℓ_M),(R_1,…,R_M)}(a):
1. For each i ∈ [M]:
(a) Consider the plane p_i containing ℓ_i and a.
(b) List-decode r|p_i from a (1 − √(1−δ) − ε/2) fraction of errors to obtain a list L_i of bivariate polynomials defined on p_i.
(c) Choose the unique element (if any) of the list whose restriction to ℓ_i equals R_i; call that bivariate polynomial P_i.
2. Output the unique value of Q^(<s)(a) that is consistent with the equations Q|p_i = P_i.
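Step 1(c) is a purely mechanical filtering step. In a toy bivariate setting over F_p it looks as follows; the polynomial encoding and all parameters here are our own choices, and a real multiplicity-code decoder would also have to match derivative data along the line:

```python
# Toy version of step 1(c): from a list of candidate bivariate polynomials
# on a plane, keep the unique one whose restriction to the advice line
# matches the advice polynomial. Encodings and parameters are our choices.

P = 31

def eval2(Q, x, y, p=P):
    """Evaluate Q = {(i, j): coeff} at (x, y) over F_p."""
    return sum(c * pow(x, i, p) * pow(y, j, p) for (i, j), c in Q.items()) % p

def line_trace(Q, a, b, p=P):
    """Values of Q along the line t -> a + t*b; for degrees < p, two
    restrictions to the line agree iff their traces agree."""
    return tuple(eval2(Q, (a[0] + t * b[0]) % p, (a[1] + t * b[1]) % p, p)
                 for t in range(p))

def filter_by_advice(candidates, advice_trace, a, b):
    """The unique candidate consistent with the advice on the line, if any."""
    matches = [Q for Q in candidates if line_trace(Q, a, b) == advice_trace]
    return matches[0] if len(matches) == 1 else None

Q_true = {(1, 0): 2, (0, 1): 3}                # 2X + 3Y
Q_bad = {(2, 0): 1}                            # X^2, disagrees on the line
advice = line_trace(Q_true, (0, 0), (1, 1))
assert filter_by_advice([Q_bad, Q_true], advice, (0, 0), (1, 1)) == Q_true
```

The analysis below must justify why this filtering almost always isolates a unique candidate; that is exactly event (2) in the sketch that follows.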
The analysis of this algorithm is a combination of the local decoding algorithm for multiplicity codes [22] and the local list-decoding algorithms for Reed-Muller codes [1, 27, 2]. The use of planes in the decoding is to allow for decoding from a (1 − √(1−δ) − ε) fraction of errors, and not the (1 − √(2(1−δ)) − ε) fraction of errors which decoding along lines seems to give (this is the extra ingredient in [2] which removes the 2 that appears in [27]; for codes of rate approaching 1, the removal of the 2 is crucial for this algorithm to be nontrivial).
We sketch the analysis now. Let Q̂(X) ∈ F_q[X] be a polynomial corresponding to a codeword of the multiplicity code. The first step is to show that with high probability, for each i ∈ [M], Q̂|ℓ_i ∈ R_i. This follows easily from the fact that random lines in F_q^m "sample well." The second step is to show that with high probability over the choice of ℓ_1, …, ℓ_M, the oracle

B := A_{(ℓ_1,…,ℓ_M),(Q̂|ℓ_1,…,Q̂|ℓ_M)}

computes a function which is ε^Ω(1)-close (and hence (0.1δ)-close) to Q̂. To see this, we take a random a ∈ F_q^m, and show that the probability (over the choice of the ℓ_i) that B(a) = Q̂^(<s)(a) is at least 1 − ε^Ω(1). Now the event "B(a) = Q̂^(<s)(a)" will occur if: (1) Q̂|p_i appears in L_i, (2) no other element of L_i equals Q̂|ℓ_i when restricted to ℓ_i, and (3) the p_i are in general enough position. The probability of (1) is 1 − ε^Ω(1), because p_i is a random plane containing a, and hence it "samples well." The probability of (2) is also 1 − ε^Ω(1), because we can think of this event as choosing p_i first and then ℓ_i within it; its probability is then bounded by the probability that some two elements of L_i, which we know has size ≤ poly(1/ε), agree on the uniformly random line ℓ_i. This is small by a general bound on the number of lines on which two distinct bivariate polynomials can agree. The probability of (3) is 1 − o(1). Summing up, we see that the probability of B(a) = Q̂^(<s)(a) is at least 1 − ε^Ω(1), and hence B has the desired property. Thus the self-corrected oracle Correct(B) will agree with Q̂ on all points of F_q^m, and so the codeword of the multiplicity code corresponding to Q̂ will appear in the list of output oracles with high probability, as desired.
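The "general bound" invoked for event (2) rests on a simple fact: two distinct low-degree bivariate polynomials can agree identically on only a few lines, since a line on which they agree must lie in the zero set of their (nonzero) difference. This can be verified exhaustively in a tiny example; the field size and the two polynomials below are our own choices:

```python
# Exhaustive check of the fact behind event (2): two distinct bivariate
# polynomials agree identically on only a few of the p^2 + p affine lines
# in F_p^2. The field size and the two polynomials are our own choices.

P = 11

def eval2(Q, x, y, p=P):
    return sum(c * pow(x, i, p) * pow(y, j, p) for (i, j), c in Q.items()) % p

def all_lines(p=P):
    """Every affine line in F_p^2 exactly once (p^2 + p lines)."""
    for s in range(p):                 # lines y = s*x + c
        for c in range(p):
            yield [(x, (s * x + c) % p) for x in range(p)]
    for c in range(p):                 # vertical lines x = c
        yield [(c, y) for y in range(p)]

Q1 = {(2, 0): 1, (1, 1): 1}            # X^2 + XY
Q2 = {(2, 0): 1, (0, 2): 1}            # X^2 + Y^2
# Q1 - Q2 = Y*(X - Y) vanishes on a whole line only for Y = 0 and X = Y
agree = sum(all(eval2(Q1, x, y) == eval2(Q2, x, y) for (x, y) in line)
            for line in all_lines())
assert agree == 2 and P * P + P == 132
```

Here only 2 of the 132 lines are "bad," consistent with the degree-over-field-size bound that the analysis uses to control event (2).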
8 Open questions
1. Can univariate multiplicity codes be list-decoded up to list-decoding capacity with constant list-size? The results of Guruswami, Guruswami-Wang and Dvir-Lovett imply that a carefully chosen (nonlinear) subset of univariate multiplicity codes does have this property.
2. Can bivariate multiplicity codes of minimum distance δ be list-decoded from a (δ − ε) fraction of errors with constant list-size? This would automatically translate into strong local list-decodability for multivariate multiplicity codes. In particular, this would give, for every fixed integer m, codes of arbitrarily large length n, rate (1 − δ)^m and distance δ, which are locally list-decodable from a (δ − ε) fraction of errors in time O(n^(2/m)).
3. Does list-decoding of multiplicity codes have applications in complexity theory? List-decoding
traditional polynomial codes (in many variables) has many applications. At first glance, the
advantages of multiplicity codes over polynomial codes seem to be significant only when the
number of variables is much smaller.
Acknowledgements
I am very grateful to Professor Enrico Bombieri for generously sharing his insights on a previous version
of Theorem 4.3 which worked only for trivariate polynomials, and for suggestions that were instrumental
in the proof of the full Theorem 4.3. Thanks to Boaz Barak, Michael Forbes, Amir Shpilka, Madhu Sudan
and other members of Madhu’s reading group for valuable suggestions during a talk in February 2011
that further simplified the proof of Theorem 4.3. Thanks also to Shubhangi Saraf, Madhu Sudan and
Sergey Yekhanin for helpful discussions and encouragement. Finally, many thanks to the anonymous
referees for very helpful comments.
References
[1] Sanjeev Arora and Madhu Sudan: Improved low-degree testing and its applications. Combinatorica, 23(3):365–426, 2003. Preliminary version in STOC’97. See also at ECCC. [doi:10.1007/s00493-003-0025-0]
[2] Kristian Brander and Swastik Kopparty: List-decoding Reed-Muller over large fields up to the Johnson radius. Manuscript, 2009.
[3] Zeev Dvir, Swastik Kopparty, Shubhangi Saraf, and Madhu Sudan: Extensions to the method of multiplicities, with applications to Kakeya sets and mergers. SIAM J. Comput., 42(6):2305–2328, 2013. Preliminary version in FOCS’09. See also at ECCC. [doi:10.1137/100783704]
[4] Zeev Dvir and Shachar Lovett: Subspace evasive sets. In Proc. 44th STOC, pp. 351–358. ACM Press, 2012. See also at arXiv. [doi:10.1145/2213977.2214010]
[5] Gerald B. Folland: Introduction to Partial Differential Equations. Princeton University Press, 1995.
[6] Alan Guo: High rate locally correctable codes via lifting. Electron. Colloq. on Comput. Complexity (ECCC), 20:53, 2013. See also at arXiv.
[7] Alan Guo and Swastik Kopparty: List-decoding algorithms for lifted codes. CoRR, abs/1412.0305, 2014. [arXiv:1412.0305]
[8] Alan Guo, Swastik Kopparty, and Madhu Sudan: New affine-invariant codes from lifting. In Robert D. Kleinberg, editor, Proc. 4th Conf. on Innovations in Theoretical Computer Science (ITCS’13), pp. 529–540. ACM Press, 2013. See also at arXiv and ECCC. [doi:10.1145/2422436.2422494]
[9] Venkatesan Guruswami and Swastik Kopparty: Explicit subspace designs. Combinatorica, pp. 1–25, 2014. Preliminary version in FOCS’13. See also at ECCC. [doi:10.1007/s00493-014-3169-1]
[10] Venkatesan Guruswami and Atri Rudra: Explicit codes achieving list decoding capacity: Error-correction with optimal redundancy. IEEE Trans. Inform. Theory, 54(1):135–150, 2008. Preliminary version in STOC’06. [doi:10.1109/TIT.2007.911222]
[11] Venkatesan Guruswami, Amit Sahai, and Madhu Sudan: “Soft-decision” decoding of Chinese Remainder Codes. In Proc. 41st FOCS, pp. 159–168. IEEE Comp. Soc. Press, 2000. [doi:10.1109/SFCS.2000.892076]
[12] Venkatesan Guruswami and Madhu Sudan: Improved decoding of Reed-Solomon and algebraic-geometric codes. IEEE Trans. Inform. Theory, 45(6):1757–1767, 1999. Preliminary version in FOCS’98. See also at ECCC. [doi:10.1109/18.782097]
[13] Venkatesan Guruswami and Carol Wang: Linear-algebraic list decoding for variants of Reed-Solomon codes. IEEE Trans. Inform. Theory, 59(6):3257–3268, 2013. See also at ECCC. [doi:10.1109/TIT.2013.2246813]
[14] Venkatesan Guruswami and Chaoping Xing: Folded codes from function field towers and improved optimal rate list decoding. In Proc. 44th STOC, pp. 339–350. ACM Press, 2012. See also at ECCC and arXiv. [doi:10.1145/2213977.2214009]
[15] Venkatesan Guruswami and Chaoping Xing: List decoding Reed-Solomon, Algebraic-Geometric, and Gabidulin subcodes up to the Singleton bound. In Proc. 45th STOC, pp. 843–852. ACM Press, 2013. See also at ECCC. [doi:10.1145/2488608.2488715]
[16] Venkatesan Guruswami and Chaoping Xing: Optimal rate algebraic list decoding using narrow ray class fields. J. Combin. Theory Ser. A, 129:160–183, 2015. Partial preliminary version in SODA’14. See also at arXiv and ECCC. [doi:10.1016/j.jcta.2014.09.003]
[17] Brett Hemenway, Rafail Ostrovsky, and Mary Wootters: Local correctability of expander codes. In Fedor V. Fomin, Rusins Freivalds, Marta Z. Kwiatkowska, and David Peleg, editors, Proc. 40th Internat. Colloq. on Automata, Languages and Programming (ICALP’13), volume 7965 of Lecture Notes in Computer Science, pp. 540–551. Springer, 2013. See also at arXiv. [doi:10.1007/978-3-642-39206-1_46]
[18] James W. P. Hirschfeld, Gábor Korchmáros, and Fernando Torres: Algebraic Curves over a Finite Field (Princeton Series in Applied Mathematics). Princeton University Press, 2008.
[19] Swastik Kopparty: Algebraic methods in randomness and pseudorandomness. Ph. D. thesis, Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010. DSpace@MIT.
[20] Swastik Kopparty: List-decoding multiplicity codes. Electron. Colloq. on Comput. Complexity (ECCC), 12:44, 2012.
[21] Swastik Kopparty: Some remarks on multiplicity codes. In Alexander Barg and Oleg R. Musin, editors, Discrete Geometry and Algebraic Combinatorics: AMS Spec. Session, volume 625 of Contemporary Mathematics, pp. 155–176. Amer. Math. Soc., 2014. [doi:10.1090/conm/625, arXiv:1505.07547]
[22] Swastik Kopparty, Shubhangi Saraf, and Sergey Yekhanin: High-rate codes with sublinear-time decoding. J. ACM, 61(5):28:1–28:20, 2014. Preliminary version in STOC’11. [doi:10.1145/2629416]
[23] Farzad Parvaresh and Alexander Vardy: Correcting errors beyond the Guruswami-Sudan radius in polynomial time. In Proc. 46th FOCS, pp. 285–294. IEEE Comp. Soc. Press, 2005. [doi:10.1109/SFCS.2005.29]
[24] Ruud Pellikaan and Xin-Wen Wu: List decoding of q-ary Reed-Muller codes. IEEE Trans. Inform. Theory, 50(4):679–682, 2004. [doi:10.1109/TIT.2004.825043]
[25] Madhu Sudan: Decoding of Reed Solomon codes beyond the error-correction bound. J. Complexity, 13(1):180–193, 1997. Preliminary version in FOCS’96. [doi:10.1006/jcom.1997.0439]
[26] Madhu Sudan: Ideal error-correcting codes: Unifying algebraic and number-theoretic algorithms. In Serdar Boztas and Igor Shparlinski, editors, Proc. 14th Internat. Symp. Applied Algebra, Algebraic Algorithms and Error-Correcting Codes (AAECC’01), volume 2227 of Lecture Notes in Computer Science, pp. 36–45. Springer, 2001. [doi:10.1007/3-540-45624-4_4]
[27] Madhu Sudan, Luca Trevisan, and Salil P. Vadhan: Pseudorandom generators without the XOR lemma. J. Comput. System Sci., 62(2):236–266, 2001. Preliminary version in CCC’99 and STOC’99. See also at ECCC. [doi:10.1006/jcss.2000.1730]
[28] Salil P. Vadhan: Pseudorandomness. Foundations and Trends in Theoretical Computer Science, 7(1–3):1–336, 2012. [doi:10.1561/0400000010]
AUTHOR
Swastik Kopparty
Department of Mathematics & Department of Computer Science
Rutgers University, Piscataway, NJ
swastik kopparty rutgers edu
http://www.math.rutgers.edu/~sk1233/
ABOUT THE AUTHOR
Swastik Kopparty received his Ph. D. from M.I.T. in 2010, and was advised by Madhu Sudan. He was a postdoc at the Institute for Advanced Study. Earlier, he had been an undergrad at IIT Madras and the University of California, Riverside. At UCR he got interested in theoretical computer science and coding theory, largely due to the guidance of Ilya Dumer and Chinya Ravishankar. Swastik’s fields of research include F_2, F_p, F_q, R and C. He is also interested in complexity theory, error-correcting codes, table tennis, randomness and pseudorandomness.