Authors Swastik Kopparty, Mrinal Kumar, Michael Saks,
License CC-BY-3.0
T HEORY OF C OMPUTING, Volume 12 (7), 2016, pp. 1–27
www.theoryofcomputing.org
Efficient Indexing of Necklaces and
Irreducible Polynomials over Finite Fields
Swastik Kopparty∗ Mrinal Kumar† Michael Saks‡
Received April 2, 2015; Revised March 6, 2016; Published August 13, 2016
Abstract: We study the problem of indexing irreducible polynomials over finite fields,
and give the first efficient algorithm for this problem. Specifically, we show the existence
of poly(n, log q)-size circuits that compute a bijection between {1, . . . , |S|} and the set S of
all irreducible, monic, univariate polynomials of degree n over a finite field Fq . This has
applications to pseudorandomness, and answers an open question of Alon, Goldreich, Håstad,
and Peralta (1992).
Our approach uses a connection between irreducible polynomials and necklaces (equiv-
alence classes of strings under cyclic rotation). Along the way, we give the first efficient
algorithm for indexing necklaces of a given length over a given alphabet, which may be of
independent interest.
ACM Classification: F.2.1, G.2.1, I.1.2
AMS Classification: 11T06, 68R15, 68W40
Key words and phrases: indexing algorithms, necklaces, irreducible polynomials, explicit constructions
1 Introduction
For a finite field Fq and an integer n, let S be the set of all monic irreducible polynomials in 1 variable
over Fq of degree exactly n. There is a well known formula for |S| (which is approximately qn /n). We
A conference version of this paper appeared in the Proceedings of the 41st ICALP, 2014 [16].
∗ Supported in part by a Sloan Fellowship and NSF grant CCF-1253886.
† Supported in part by NSF grant CCF-1253886.
‡ Supported in part by NSF grants CCF-0832787 and CCF-1218711.
© 2016 Swastik Kopparty, Mrinal Kumar, and Michael Saks
c b Licensed under a Creative Commons Attribution License (CC-BY) DOI: 10.4086/toc.2016.v012a007
S WASTIK KOPPARTY, M RINAL K UMAR , AND M ICHAEL S AKS
consider the problem of giving an efficiently computable indexing of irreducible polynomials, i. e., finding
a bijection f : {1, . . . , |S|} → S such that f (i) is computable in time poly(log |S|) = poly(n log q). Our
main result is that indexing of irreducible polynomials can be done efficiently given O(n log q) bits of
advice. This answers a problem posed by Alon, Goldreich, Håstad, and Peralta [3], and is the polynomial
analogue of the the well-known problem of “giving a formula for the n-bit primes.” Note that today
it is not even known (in general) how to produce a single irreducible polynomial of degree n in time
poly(n log q) without the aid of either advice or randomness.
The main technical result we show en route is an efficient indexing algorithm for necklaces. Necklaces
are equivalence classes of strings modulo cyclic rotation. We give an poly(n log |Σ|)-time computable
bijection g : {1, 2, . . . , |N|} → N, where N is the set of necklaces of length n over the alphabet Σ.
1.1 The indexing problem
We define an indexing of a finite set S to be a bijection from the set {1, . . . , |S|} to S. Let us formalize
indexing as a computational problem. Suppose that L is an arbitrary language over alphabet Σ and let Ln
be the set of strings of L of length n. We want to “construct” an indexing function An for each of the sets
Ln . Formally, this means giving an algorithm A which takes as input a size parameter n and an index j
and outputs An ( j), so that the following properties hold for each n.
• An maps the set {1, . . . , |Ln |} bijectively to Ln .
• If j > |Ln |, then An ( j) returns too large.
An indexing algorithm is considered to be efficient if its running time is poly(n).
A closely related problem is reverse-indexing. A reverse-indexing of L is a bijection from Ln to
{1, . . . , |Ln |}, and we say it is efficient if it can be computed in time poly(n).
We can use the above formalism for languages to formulate the indexing and reverse-indexing
problems for any combinatorial structure, such as permutations, graphs, partitions etc., by using standard
efficient encodings of such structures by strings.
1.2 Indexing, enumeration, counting and ranking
Indexing is closely related to the well-studied counting, enumeration and ranking problems for L. The
counting problem is to give an algorithm that, on input n, outputs the size of Ln . The enumeration
problem is to give an algorithm that, on input n, outputs a list containing all elements of Ln . A counting
or enumeration algorithm is said to be efficient if it runs in time poly(n) or |Ln | · poly(n) respectively.
Other important algorithmic problems associated with combinatorial objects include the ranking and
unranking problems. For the ranking problem, one is given an ordering of Ln (such as the lexicographic
order) and the goal is to compute the rank (under this order) of a given element of Ln . For the unranking
problem, one has to compute the inverse of this ranking map. It is easy to see that unranking algorithms
for any ordering are automatically indexing algorithms, and ranking algorithms for any ordering are
automatically reverse-indexing algorithms.1
1 We use the terms indexing and reverse-indexing instead of the terms unranking and ranking to make an important distinction:
T HEORY OF C OMPUTING, Volume 12 (7), 2016, pp. 1–27 2
I NDEXING N ECKLACES AND I RREDUCIBLE P OLYNOMIALS OVER F INITE F IELDS
There is well developed complexity theory for counting problems, starting with the fundamental
work of Valiant [33]. For combinatorial structures, counting problems are (of course) at the heart of
combinatorics, and many basic identities in combinatorics (such as recurrence relations that express the
number of structures of a particular size in terms of the number of such structures of smaller sizes) can
also be viewed as giving efficient counting algorithms for these structures.
The enumeration and ranking problems for combinatorial structures has also received a large amount
of attention. See the books [22, 17, 25, 5] for an overview of some of the work on this topic.
Counting and enumeration can be easily reduced to indexing: Given an indexing algorithm A we can
compute |Ln | by calling An ( j) on increasing powers of 2 until we get the answer “too large” and then do
binary search to determine the largest j for which An ( j) is not too large. Enumeration can be done by just
running the indexing algorithm on the integers 1, 2, . . . until we get the answer “too large.”
Conversely, in many cases, such as for subsets, permutations, set partitions, integer partitions, trees,
spanning trees (and many more), the known counting algorithms can be modified to give efficient indexing
(and hence enumeration) algorithms. This happens, for example, when the counting problem is solved by
a recurrence relation that is proved via a bijective proof.
However, it seems that not all combinatorial counting arguments lead to efficient indexing algorithms.
A prime example of this situation is when we have a finite group acting on a finite set, and the set we
want to index is the set of orbits of the action. The associated counting problem can be solved using the
Burnside counting lemma, and there seems to be no general way to use this to get an efficient indexing
algorithm.
This leads us to one of the indexing problems studied here: Fix an alphabet Σ and consider two
strings x and y in Σn to be equivalent if one is a rotation of the other, i. e., we can find strings x1 , x2 such
that x = x1 x2 and y = x2 x1 (here uv denotes the concatenation of the strings u and v). The equivalence
classes of strings are precisely the orbits under the natural action of the cyclic group Zn on Σn . These
equivalence classes are often called necklaces because if we view the symbols of a string as arranged in
a circle, then equivalent strings give rise to the same arrangement. We are interested in the problem of
efficiently indexing necklaces. We apply the indexing algorithm for necklaces to the problem of indexing
irreducible polynomials over a finite field.
1.3 Main results
Our main result is an efficient algorithm for indexing irreducible polynomials.
Theorem 1.1. Let q be a prime power, and let n ≥ 1 be an integer. Let Iq,n be the set of monic irreducible
polynomials of degree n over Fq .
There is an indexing algorithm for Iq,n , which takes O(n log q) bits of advice and runs in poly(n, log q)
time.
in indexing and reverse-indexing the actual bijection between {1, . . . , |S|} and S is of no importance whatsoever, but in ranking
and unranking the actual bijection is part of the problem. We feel this difference is worth highlighting, and hence we introduced
the new terms indexing and reverse-indexing for this purpose. Note that some important prior work on ranking/unranking
distinguishes between these notions [21].
T HEORY OF C OMPUTING, Volume 12 (7), 2016, pp. 1–27 3
S WASTIK KOPPARTY, M RINAL K UMAR , AND M ICHAEL S AKS
We remark that it is not known today2 how to deterministically produce (without the aid of advice
or randomness) even a single irreducible polynomial of degree n in time poly(n log q) for all choices of
n and q. Our result shows that once we take a little bit of advice, we can produce not just one, but all
irreducible polynomials. For constant q, where it is known how to deterministically construct a single
irreducible polynomial in poly(n) time without advice [29], our indexing algorithm can be made to run
with just O(log n) bits of advice.
Using a known correspondence [12] between necklaces and irreducible polynomials over finite fields,
indexing irreducible polynomials reduces to the problem of indexing necklaces. Our main technical result
(of independent interest) is an efficient algorithm for this latter problem.
Theorem 1.2. There is an algorithm for indexing necklaces of length n over the alphabet {1, . . . , q},
which runs in time poly(n log q).
Our methods also give an efficient reverse-indexing algorithm for necklaces (but unfortunately this
does not lead to an efficient reverse-indexing algorithm for irreducible polynomials; this has to do with
the open problem of efficiently computing the discrete logarithm).
Theorem 1.3. There is an algorithm for reverse-indexing necklaces of length n over the alphabet
{1, . . . , q}, which runs in time poly(n log q).
The indexing algorithm for irreducible polynomials can be used to make a classical ε-biased set
construction from [3] based on linear-feedback shift register sequences constructible with logarithmic
advice (to put it at par with the other constructions in that paper). It can also be used to make the explicit
subspace designs of [13] very explicit (with small advice).
Agrawal and Biswas [2] gave a construction of a family of nearly-coprime polynomials, and used
this to give randomness-efficient black-box polynomial identity tests. The ability to efficiently index
irreducible polynomials enables one to do this even more randomness-efficiently (using a small amount
of advice).
Similarly, the string fingerprinting algorithm by Rabin [24], which is based on choosing a random
irreducible polynomial can be made more randomness-efficient by choosing the random irreducible
polynomial via first choosing a random index and then indexing the corresponding irreducible polynomial
using our indexing algorithm. This application also requires a small amount of advice.
As another application of the indexing algorithm for necklaces, we give a poly(n) time algorithm
for computing any given entry of the k × 2n generator matrix or the (2n − k) × 2n parity check matrix of
BCH codes for all values of the designed distance (this is the standard notion of strong explicitness for
2 The problem of efficiently constructing an irreducible polynomial over F of degree n has been extensively studied. It is
q
known how to solve this problem under the following relaxations:
• with randomness (folklore),
• under the Generalized Riemann Hypothesis (Adleman-Lenstra [1]),
• when Fq has small characteristic (Shoup [28]),
• given a construction of Fqn (Lenstra [18]),
• if we only require an irreducible polynomial of degree approximately n (Adleman-Lenstra [1]).
T HEORY OF C OMPUTING, Volume 12 (7), 2016, pp. 1–27 4
I NDEXING N ECKLACES AND I RREDUCIBLE P OLYNOMIALS OVER F INITE F IELDS
error-correcting codes). Earlier, it was only known how to compute this entry explicitly for very small
values of the designed distance (which is usually the setting where BCH codes are used).
1.4 Related Work
There is an extensive literature on enumeration algorithms for combinatorial objects (see the books [25,
14, 17, 22, 5]). Some of these references discuss necklaces in depth, and some also discuss the rank-
ing/unranking problems for various combinatorial objects.
The lexicographically smallest element of a rotation class is called a Lyndon word, and much is known
about them. Algorithmically, the problem of enumerating/indexing necklaces is essentially equivalent to
the problem of enumerating/indexing Lyndon words. Following a long line of work [10, 11, 26, 8, 6, 27, 7],
we now know O(1)-amortized time enumeration algorithms for Lyndon words/necklaces.
In [20] and [25], it was noted that the problem of efficient ranking/unranking of the lexicographic
order on Lyndon words is an open problem. Our indexing algorithms in fact give a solution to this
problem too: we get an efficient ranking/unranking algorithm for the lexicographic order on Lyndon
words.
Recent work of Andoni, Goldberger, McGregor and Porat [4] studied a problem that may be viewed
as an approximate version of reverse indexing of necklaces. They gave a randomized algorithm for
producing short fingerprints of strings, such that the fingerprints of rotations of a string are determined by
the fingerprint of the string itself. This fingerprinting itself was useful for detecting proximity of strings
under misalignment.
Recent independent work. A preliminary version of this paper appeared as [16]. At about the same
time, similar results were published by Kociumaka, Radoszewski and Rytter [15]. The work in these two
papers was done independently. The papers both have polynomial-time algorithms for indexing necklaces;
the authors in [15] exercised more care in designing the algorithm to obtain a better polynomial running
time. Their approach to alphabets of size greater than 2 is cleaner than ours. On the other hand, we put
the results in a broader context and have some additional applications (indexing irreducible polynomials
and explicit constructions).
1.5 Organization of the paper
The rest of the paper is organized as follows. We give the algorithm to index necklaces in Section 2. In
Section 3, we use our indexing algorithm for necklaces to give an indexing algorithm for irreducible
polynomials over finite fields. In Section 4, we give an application to the explicit construction of
generator and parity check matrices of BCH codes. We conclude with some open problems in Section 5.
In Appendix A, we give an alternate algorithm for indexing binary necklaces of prime length. In
Appendix B, we make some preliminary observations about the complexity theory of indexing in general.
T HEORY OF C OMPUTING, Volume 12 (7), 2016, pp. 1–27 5
S WASTIK KOPPARTY, M RINAL K UMAR , AND M ICHAEL S AKS
2 Indexing necklaces
2.1 Strategy for the algorithm
We first consider a very basic indexing algorithm which will inspire our algorithms. Given a directed
acyclic graph D on vertex set V and distinguished subsets S and T of nodes, there is a straightforward
indexing algorithm for the set of paths that start in S and end in T : Fix an arbitrary ordering on the nodes,
and consider the induced lexicographic ordering on paths (i. e., path P1 P2 . . . is less than path Q1 Q2 . . . if
Pi < Qi where i is the least integer such that Pi 6= Qi ). Our indexing function will map the index j to the
jth path from S to T in lexicographic order. There is a simple dynamic program which computes for each
node v, the number N(v) of paths from v to a vertex in T . Let v1 , . . . , vr be the nodes of S listed in order.
Given the input index j, we find the first source vi such that the number of paths to T starting at nodes
v1 , . . . , vi is at least j; if there is no such source then the index j is larger than the number of paths being
indexed. Otherwise, vi is the first node of the desired path, and we can proceed inductively by replacing
the set S by the set of children of vi and updating j appropriately.
This approach can be adapted to the following situation. Suppose the set S we want to index is a set
of strings of fixed length n over alphabet Σ. A read-once branching program of length n over alphabet Σ
is an acyclic directed graph with vertex layers numbered from 0 to n, where (1) layer 0 has a single start
node, (2) there is a designated subset of accepting nodes at level n, and (3) every non-sink node has one
outgoing arc corresponding to each alphabet symbol, and these arcs connect the node to nodes at the next
level. For nodes v and w and alphabet symbol σ we write v →σ w to mean that there is an arc from v to w
labelled by σ . The size of a branching program B, denoted by |B| is the number of vertices in it.
Such a branching program takes words from Σn and, starting from the start node, follows the path
corresponding to the word to either the accept or reject node. Given a read-once branching program for S,
there is a 1-1 correspondence between strings in S and paths from the start node to an accepting node. We
can use the indexing algorithm for paths given above to index S.
This suggests the following approach to indexing necklaces. For each equivalence class of strings
(necklace) identify a canonical representative string of the class (such as the lexicographically smallest
representative). Then build a branching program B which, given string y, determines whether y is a
canonical representative of its class. By the preceding paragraph, this would be enough to index all of the
canonical representatives, which is equivalent to indexing equivalence classes.
In fact, we are able to implement this approach provided that q = 2 and n is prime (see Appendix A).
However, we have not been able to make it work in general. For this we need another approach, which
still uses branching programs, but in a more involved way.
First some notation. For a given string y, we write the string obtained from y after cyclically rotating
it rightwards by i positions as Roti (y). We define Orbit(y) to be the set containing y and all its distinct
rotations. Orbit(y) will also be referred to as the equivalence class of y. A string y is said to be periodic
with period p if it can be written as y1 m for some y1 ∈ Σ p and m = n/p. A string is said to have
fundamental period p if it is periodic with period p and not periodic with any period smaller than p. We
will denote the fundamental period of a string y by FP(y). Note that for any string y, |Orbit(y)| = FP(y).
If E is an orbit and x is a string, we say that E < x if E contains at least one string y that is
lexicographically less than x. (Notice that under our definition, if x and y are strings then we might have
both that the orbit of x is less than y and the orbit of y is less than x.)
T HEORY OF C OMPUTING, Volume 12 (7), 2016, pp. 1–27 6
I NDEXING N ECKLACES AND I RREDUCIBLE P OLYNOMIALS OVER F INITE F IELDS
Let t be the total number of orbits. Let Cx be the set of orbits that are less than x. Our main goal will
be to design an efficient algorithm which, given string x, returns |Cx |. We now show that if we can do this,
then we can solve both the indexing and reverse indexing problems.
For the indexing problem, we want a 1-1 function ψ that maps j ∈ {1, . . . ,t} to a string so that all of
the image strings are in different orbits. The map ψ will be easily computable given a subroutine for |Cx |.
Define the minimal representative of an orbit to be the lexicographically least string in the orbit. Let
y1 < · · · < yt denote the minimal representatives in lex order. Our map ψ will map j to y j . This clearly
maps each index to a representative of a different orbit.
It suffices to show how to compute ψ( j). Note that |Cx | is equal to the number of yi that precede x,
and is thus a nondecreasing function of x. Therefore, ψ( j) = y j is equal to the lexicographically largest
string with |Cx | < j. Furthermore, since |Cx | is a nondecreasing function of x, we can find ψ( j) by doing
binary search on the set of strings according to the value of |Cx |.
Similarly, we can solve the reverse indexing problem: given a string x we can find the index of the
orbit to which it belongs by first finding the lexicographically minimal representative yi of its orbit and
then computing |Cyi | + 1.
Lemma 2.1. To efficiently index and reverse index necklaces of length n over an alphabet Σ, it suffices
to have an efficient algorithm that takes as input a string x ∈ Σn and outputs |Cx |.
The next section gives our algorithm to determine |Cx | for any input string x.
2.2 Computing |Cx |
Let us define: [ [
Gx,p = E and Gx,≤p = E.
E∈Cx :|E|=p E∈Cx :|E| divides p
In Section 2.2.2 we reduce the problem of computing of |Cx | to the problem of computing |Gx,≤p | for
various p. The main component of the indexing algorithm is a subroutine that computes |Gx,≤p | given a
string x and an integer p. This subroutine works by building a branching program with nO(1) nodes, which
when given a string y accepts if and only if (1) the orbit of y has size dividing p and (2) Orbit(y) < x.
The quantity we want to compute, |Gx,≤p |, is therefore simply the number of inputs y accepted by this
branching program (which, as noted above can be computed in polynomial time via a simple dynamic
program).
2.2.1 Preliminaries
We state some basic facts about periodic strings without proof.
Fact 2.2. Let y be a string of length n and let p be positive integer dividing n. Then, |Orbit(y)| = p if
and only if y has fundamental period p. In particular, y can be written as y1 n/p for an aperiodic string
y1 ∈ Σ p .
Fact 2.3. The fundamental period of a string is a divisor of any period of the string.
In particular, the fundamental period of a string is unique.
Throughout this paper, for a natural number m, we denote by [m], the set {0, 1, 2, . . . m − 1}.
T HEORY OF C OMPUTING, Volume 12 (7), 2016, pp. 1–27 7
S WASTIK KOPPARTY, M RINAL K UMAR , AND M ICHAEL S AKS
2.2.2 Reduction to computing |Gx,≤p |
We begin with some simple transformations that reduce the computation of |Cx | to the computation of
|Gx,≤p | (for various p).
Lemma 2.4. For all x ∈ Σn ,
1 1
|Cx | = ∑ |Orbit(y)| = ∑ FP(y) .
y∈Gx,≤n y∈Gx,≤n
Proof. For y ∈ Gx,≤n , Roti (y) ∈ Gx,≤n for every positive integer i. Note that there are exactly |Orbit(y)|
distinct strings of the form Roti (y). Thus for any orbit E ∈ Cx , we have
1
∑ |Orbit(y)| = 1 .
y∈E
Therefore:
1 1
∑ |Orbit(y)| = ∑ ∑ |Orbit(y)| = ∑ 1 = |Cx | .
y∈Gx,≤n E∈Cx y∈E E∈Cx
The sum on the right hand side can be split on the basis of the period of y. From Lemma 2.4, Fact 2.2
and Fact 2.3, we have the following lemma.
Lemma 2.5. For all x ∈ Σn ,
|Gx,i |
|Cx | = ∑ .
i|n
i
So, to count |Cx | efficiently, it suffices to compute |Gx,i | efficiently for each i | n. Now, from the
definitions, we have the following lemma.
Lemma 2.6. For all x ∈ Σn ,
|Gx,≤p | = ∑ |Gx,i | .
i|p
From the Möbius Inversion Formula (see Chapter 3 in [31] for more details), we have the following
equality.
Lemma 2.7. p
|Gx,p | = ∑ µ |Gx,≤i | .
i|p
i
Lemma 2.7 implies that it suffices to compute |Gx,≤p | efficiently for every divisor p of n. In the next
few sections, we will focus on this sub-problem and design an efficient algorithm for this problem. We
will first describe the algorithm when the alphabet is binary, and then generalize to larger alphabets.
T HEORY OF C OMPUTING, Volume 12 (7), 2016, pp. 1–27 8
I NDEXING N ECKLACES AND I RREDUCIBLE P OLYNOMIALS OVER F INITE F IELDS
2.2.3 Computing |Gx,≤n | efficiently for the binary alphabet
In this section, we will design an efficient algorithm that given a string x ∈ {0, 1}n computes |Gx,≤n |. On
input x the algorithm will construct a branching program with the property that |Gx,≤n | is the number of
accepting paths in the branching program. This number of accepting paths can be computed by a simple
dynamic program as described at the beginning of Section 2.1.
Lemma 2.8. Given as input a branching program B of length n over alphabet Σ, we can compute the size
of the set of accepted strings in time poly(|B|, log n).
Proof. The number accepted strings is the number of paths from the start node to the accept node, and all
such paths have length exactly n. If the start node is labelled i and the accept node is labelled j, then the
number of accepted strings is the i, j entry in the nth power of the adjacency matrix of the graph, and can
thus be computed in time polynomial in the size of the graph and log n (by repeated squaring).
We now describe how to construct, for each fixed string x = x1 x2 . . . xn ∈ {0, 1}n , a branching program
Bx of size polynomial in n such that the strings accepted by Bx are exactly those in Gx,≤n . Lemma 2.8
then implies that we can compute |Gx,≤n | in time polynomial in n.
For strings x, y, when is y <lex x? This happens and only if there exists an i ∈ {0, 1, 2, . . . , n − 1} such
that y j = x j for every j ≤ i and xi+1 > yi+1 . In the case of binary strings of length n, we must have
xi+1 = 1 and yi+1 = 0.
Definition 2.9. The set of witnesses for x, denoted Lx , is defined by:
Lx = {s0 | s ∈ {0, 1}∗ such that s1 is a prefix of x} .
We can summarize the discussion from the paragraph above as follows:
Observation 2.10. For x, y ∈ {0, 1}n , we have y <lex x if and only if some prefix of y lies in Lx .
We will now generalize this observation to strings under rotation. For strings x, y, when is Orbit(y) <
x? Recall that Orbit(y) < x if for some y0 ∈ Orbit(y), we have y0 <lex x. From Observation 2.10, we know
that this happens if and only if some y0 ∈ Orbit(y) has some prefix w in Lx . Rotating back to y, two
situations can arise. Either y contains w as a contiguous substring, or w appears as a “split substring”
wrapped around the end of y. In the latter case, y has a prefix w1 and a suffix w2 such that w2 w1 = w ∈ Lx .
Recall that Gx,≤n is the set of y with Orbit(y) < x. Thus, y ∈ Gx,≤n if and only if it has a contiguous
substring as a witness, or it has a witness that is wrapped around its end. Let us separate these two cases
out.
Definition 2.11. For a string x ∈ {0, 1}n ,
Gcx,≤n = {y ∈ {0, 1}n | y contains a string in Lx as a contiguous substring} ,
Gwx,≤n = {y ∈ {0, 1}n | y has a prefix w1 and suffix w2 such that w2 w1 ∈ Lx } .
From the discussion in the paragraph above, we have the following observation:
T HEORY OF C OMPUTING, Volume 12 (7), 2016, pp. 1–27 9
S WASTIK KOPPARTY, M RINAL K UMAR , AND M ICHAEL S AKS
Observation 2.12.
Gx,≤n = Gcx,≤n ∪ Gwx,≤n .
The branching program Bx will be obtained by combining two branching programs Bcx and Bwx , where
the first accepts the strings in Gcx,≤n and the second accepts the strings in Gwx,≤n . Each layer j of the
branching program Bx is the product of layer j of Bcx and layer j of Bwx and we have arcs (v, v0 ) →σ (w, w0 )
when v →σ v0 and w →σ w0 . The accepting nodes at level n + 1 are nodes (v, v0 ) where v is an accepting
node of Bcx or v0 is an accepting node of Bwx . The resulting branching program clearly accepts the set of
strings accepted by Bcx or Bwx .
Note that the branching programs Bx produced by the algorithm are never actually “run,” but are
given as input to the algorithm of Lemma 2.8 in order to determine |Gx,≤n |.
For a set of strings W , we will use Prefix(W ) to denote the set of all prefixes of all strings in W
(including the empty string Λ). Similarly, Suffix(W ) denotes the set of all suffixes of of all strings in W
(including the empty string Λ). Similarly, we will use Substring(W ) for set of all contiguous substrings
of strings in W .
For a string r, Q(r) is the set of suffixes of r that belong to Prefix(Lx ).
Constructing branching program Bcx . We now present an algorithm which on input x ∈ {0, 1}n , runs
in time polynomial in n and outputs a branching program Bcx that recognizes Lxc .
Definition 2.13. Branching program Bcx .
1. Nodes at level j are triples ( j, s, b) where s ∈ Prefix(Lx ) and b ∈ {0, 1}. (We want string s to be the
longest suffix of z1 z2 . . . z j that belongs to Prefix(Lx ), and b = 1 iff z1 z2 . . . z j contains a substring
that belongs to Lx .)
2. The start node is (0, Λ, 0) where Λ is the empty string.
3. The accepting nodes (n, s, b) are those with b = 1.
4. For j ≤ n, the arc out of nodes ( j − 1, s, b) labeled by alphabet symbol α is ( j, s0 , b0 ) where s0 is
the longest string in Q(sα) and b0 = 1 if s0 contains a suffix in Lx and otherwise b0 = b.
It is clear that the branching program can be constructed (as a directed graph) in time polynomial in n.
It remains to show that it accepts those z that have a substring that belongs to Lx .
Fix a string z ∈ {0, 1}n . Let ( j, s j , b j ) be the jth vertex visited by the branching program on input
z. Note that s j is a suffix of z1 . . . z j . Let h j be the index such that s = zh j . . . z j ; if s is empty, we set
h j = j + 1. For j between 1 and n let i j be the least index such that zi j . . . z j belongs to Prefix(Lx ) (so
i j = j + 1 if there is no such string). Note that i j ≥ i j−1 since if zi . . . z j belongs to Prefix(Lx ) so does
zi . . . z j−1 .
The branching program is designed to make the following true:
Claim 2.14. For j between 1 and n, h j = i j and b j = 1 if and only if a substring of z1 . . . z j belongs to Lx .
The claim for b j = 1 implies that the branching program accepts the desired set of strings.
T HEORY OF C OMPUTING, Volume 12 (7), 2016, pp. 1–27 10
I NDEXING N ECKLACES AND I RREDUCIBLE P OLYNOMIALS OVER F INITE F IELDS
Proof. The claim follows easily by induction, where the basis j = 0 is trivial. Assume j > 0. First we
show that h j = i j . By induction h j−1 = i j−1 and by definition of h j and i j we have i j ≤ h j . To show
h j ≤ i j , note that since i j ≥ i j−1 = h j−1 , the string zi j . . . z j is in Q(s j−1 α) and so is considered in the
choice of s j and thus h j = i j .
For the claim on b j , if z has no substring in Lx , then b j remains 0 by induction. If z has a substring in
Lx , let zi . . . zk be such a substring with k minimum. Then by the claim on tk , hk ≤ i, and so zi . . . zk is a
suffix of sk and so bk = 1, and for all j ≥ k, b j continues to be 1.
Constructing branching program Bwx . In this section, the goal is to design a branching program which
accepts the strings in Gwx,≤n , since such a string may not be in Gcx,≤n and hence, may have been rejected by
Bcx . We now present an algorithm which on input x ∈ {0, 1}n , runs in time polynomial in n and outputs a
branching program Bwx that accepts the set of strings z that have a nonempty suffix u and nonemtpy prefix
v such that uv belongs to Lx .
Definition 2.15. Branching program Bwx .
1. Nodes at level j are triples ( j, s, p) where s ∈ Prefix(Lx ) and p ∈ Substring(Lx ). (String s will be
the longest suffix of z1 z2 . . . z j that belongs to Prefix(Lx ) (as in Bcx ) and p is the longest prefix of
z1 z2 . . . z j that belongs to Substring(Lx ).)
2. The start node is (0, Λ, Λ) where Λ is the empty string.
3. The accepting states are those states (n, s, p) such that p has a nonempty prefix p0 and s has a
nonempty suffix s0 such that s0 p0 ∈ Lx .
4. For j ≤ n, the arc out of state ( j − 1, s, p) labeled by alphabet symbol α is ( j, s0 , p0 ) where s0 is the
longest string in Q(sα) and p0 = pα if |p| = j − 1 and pα ∈ Substring(Lx ), and p0 = p otherwise.
It is clear that the branching program can be constructed (as a directed graph) in time polynomial in n.
It remains to show that it accepts Lxw .
Fix a string z ∈ {0, 1}n . Let ( j, s j , p j ) be the jth node visited by the branching program on input z.
Notice that s j is calculated the same way in Bwx as in Bcx and so s j is the longest suffix of z1 . . . z j that
belongs to Prefix(Lx ).
An easy induction shows that p j is the longest prefix of z1 . . . z j belonging to Substring(Lx ): Let k be
the length of the longest prefix of z belonging to Substring(Lx ). For j ≤ k we have p j = z1 . . . z j and for
j > k, p j = z1 . . . zk .
Finally, we need to show that the branching program accepts z if and only if z has a nonempty suffix s0
and z has a nonempty prefix p0 such that s0 p0 ∈ Lx . If the program accepts, then the acceptance condition
and the fact that sn is a suffix of z and pn is a prefix of z implies that z has the required suffix and prefix.
Conversely, if z has such a prefix p0 and suffix s0 , then they each belong to Prefix(Lx ). Since pn is the
longest prefix of z belonging to Substring(Lx ), p0 is a prefix of pn and since sn is the longest suffix of z
belonging to Prefix(Lx ), s0 is a suffix of sn . So the branching program will accept.
T HEORY OF C OMPUTING, Volume 12 (7), 2016, pp. 1–27 11
S WASTIK KOPPARTY, M RINAL K UMAR , AND M ICHAEL S AKS
Putting things together. From the constructions, it is clear that the size of the branching programs
Bwx and Bcx are polynomial in the size of Lx and hence polynomial in n = |x|. Moreover, by a product
construction, we can efficiently construct the deterministic finite branching program Bx which accepts
the strings accepted by Bwx or Bcx , which is Gx,≤n . This observation, along with Lemma 2.8 implies the
following lemma.
Lemma 2.16. There is an algorithm which takes as input a string x in {0, 1}n and outputs the size of
Gx,≤n in time polynomial in n.
2.2.4 Computing |Gx,≤p | efficiently for the binary alphabet
In this section, we will show that for every p | n, we can compute the quantity |Gx,≤p | efficiently. The
algorithm will be a small variation of our algorithm for computing |Gx,≤n | from the previous section. Let
p be a divisor of n with p < n. Every string y ∈ Gx,≤p is of the form an/p for some a ∈ {0, 1} p , and every
string in Orbit(y) is of the form (Roti (a))n/p , for some i ≤ p. Let us write the string x as x1 x2 . . . xn/p
where for each i, xi is of length exactly p. We will now try to characterize the strings in Gx,≤p . From the
definitions, y = an/p ∈ Gx,≤p if and only if there is a rotation 0 ≤ i < p such that (Roti (a))n/p has a prefix
in Lx . This, in turn, can happen if and only if there is an i < p such that one of the following is true.
• Roti (a) < x1 in lexicographic order, or
• there is j, 0 < j < n/p, such that Roti (a) = x1 = x2 = x3 = · · · = xi and Roti (a) < xi+1 in lexico-
graphic order.
The strings y = an/p for which a has a rotation which is less than x1 in lexicographic order are exactly
the strings of the form cn/p with c ∈ Gx1 ,≤p . Via the algorithm of the previous subsection, there is
a polynomial in n time algorithm which outputs a branching program recognizing Gx1 ,≤p . The only
strings which satisfy the second condition are of the form cn/p , where c is a rotation of x1 and x1 < xi+1
in lexicographic order. There are at most |Orbit(x1 )| such strings, and we can count them directly given x.
This gives us our algorithm for computing |Gx,≤p |:
Computing |Gx,≤p |:
Input:
• Integers n, p such that p | n.
• A string x ∈ {0, 1}n .
Algorithm:
1. Write x as x = x1 x2 . . . xn/p where |xi | = p ∀i ∈ [n/p].
2. Construct a branching program Ax1 such that L(Axi ) ∩ {0, 1} p = Gx1 ,≤p .
3. Let M be the number of strings of length p accepted by Ax1 .
4. If there is an 0 < i < n/p such that x1 = x2 = x3 = · · · = xi , x1 < xi+1 in lexicographic order, and
x1 ∈/ L(Ax1 ), then output M + |Orbit(x1 )|, else output M.
T HEORY OF C OMPUTING, Volume 12 (7), 2016, pp. 1–27 12
I NDEXING N ECKLACES AND I RREDUCIBLE P OLYNOMIALS OVER F INITE F IELDS
From the construction in Section 2.2.3 and Lemma 2.16, it follows that we can construct Ax1 and
count M in time polynomial in n. We thus have the following lemma.
Lemma 2.17. For any divisor p of n and string x ∈ {0, 1}n , we can compute the size of the set Gx,≤p in
time poly(n).
We now have all the ingredients for the proof of the following theorem, which is a special case of
Theorem 1.2 when the alphabet under consideration is {0, 1}.
Theorem 2.18. There is an algorithm for indexing necklaces of length n over the alphabet {0, 1}, which
runs in time poly(n).
Proof. The proof simply follows by plugging together the conclusions of Lemma 2.5, Lemma 2.6,
Lemma 2.7, Lemma 2.8 and Lemma 2.17.
It is not difficult to see that the indexing algorithm can be used to obtain a reverse indexing algorithm
as well and hence, we also obtain a special case of Theorem 1.3 for the binary alphabet.
2.2.5 Indexing necklaces over large alphabets
In this subsection we show how to handle the case of general alphabets Σ (with |Σ| = q). A direct
generalization of the algorithm for the case of the binary alphabet, where the set Lx is appropriately
defined, will run in time polynomial in n and q. Our goal here is to improve the running time to polynomial
in n and log q.
def
The basic idea is to represent the elements in Σ by binary strings of length t = dlog qe. Let Bin : Σ →
{0, 1}t be an injective map whose image is the set Γ of q lexicographically smallest strings in {0, 1}t .
Extend this to a map Bin : Σn → {0, 1}tn in the natural way.
We now use the map Bin to convert our indexing/counting problems over the large alphabet Σ to a
related problem over the small alphabet {0, 1}. For x ∈ Σn , we have Bin(Roti (x)) = Rotti (Bin(x)). For
an orbit E ⊆ Σn and x ∈ {0, 1}tn , we say E < x if some element z ∈ E satisfies Bin(z) <lex x.
Let Cx be the set of orbits E ⊆ Σn which are less than x. For each x ∈ {0, 1}tn and p | n, define:
[ [
Gx,p = E and Gx,≤p = E.
E<x, |E|=p E<x, |E| divides p
The following identity allows us to count Gx,≤n :
|Gx,≤n | = |{y ∈ {0, 1}tn | y ∈ Γn , ∃i < n s. t. Rotit (y) <lex x}| .
It is easy to efficiently produce a branching program A0 such that L(A0 ) ∩ {0, 1}tn = Γn . As we will
describe below, the methods of the previous section can be easily adapted to efficiently produce a
branching program Ax such that
L(Ax ) ∩ {0, 1}tn = {y ∈ {0, 1}tn | ∃i < n s. t. Rotit (y) <lex x} .
The following lemma will be crucial in the design of this branching program.
T HEORY OF C OMPUTING, Volume 12 (7), 2016, pp. 1–27 13
S WASTIK KOPPARTY, M RINAL K UMAR , AND M ICHAEL S AKS
Lemma 2.19. Let y ∈ {0, 1}tn . There exists i < n such that Rotit (y) <lex x if and only if at least one of
the following events occurs:
1. there exists w ∈ Lx such that w appears as a contiguous substring of y starting at a coordinate j with
j ≡ 0 mod t (where the coordinates of x are 0, 1, . . . ,tn − 1),
2. there exist strings w1 , w2 such that w1 w2 ∈ Lx , w2 is a prefix of y, w1 is a suffix of y, and |w1 | ≡ 0
mod t.
Given this lemma, the construction of Ax follows easily via the techniques of the previous subsections.
The main addition is that one needs to remember the value of the current coordinate mod t, which can be
done by blowing up the number of states of the branching program by a factor t.
Intersecting the accepted sets of Ax and A0 gives us our desired branching program which allows us
to count |Gx,≤n |. This easily adapts to also count |Gx,≤p | for each p | n.
We conclude using the ideas of Section 2.2.2. We can now compute |Gx,p | for each x and each p | n.
From Lemma 2.5, Lemma 2.6 and Lemma 2.7, it follows that for every x, we can compute |Cx | efficiently.
We thus get our main indexing theorem for necklaces from Lemma 2.1.
Theorem 2.20. There are poly(n, log |Σ|)-time indexing and reverse-indexing algorithms for necklaces
of length n over Σ.
Furthermore, there are poly(n, log |Σ|)-time indexing and reverse-indexing algorithms for necklaces
of length n over Σ with fundamental period exactly n.
3 Indexing irreducible polynomials
In the previous section, we saw an algorithm for indexing necklaces of length n over an alphabet Σ of size
q, which runs in time polynomial in n and log q. In this section, we will see how to use this algorithm
to efficiently index irreducible polynomials over a finite field. More precisely, we will use an indexing
algorithm for necklaces with fundamental period exactly equal to n (which is also given by the methods
of the previous sections).
Let q be a prime power, and let Fq denote the finite field of q elements. For an integer n > 0, let Iq,n
denote the set of monic, irreducible polynomials of degree n in Fq [T ].
Theorem 3.1. For every q, n as above, there is an algorithm that runs in poly(n, log q) time, takes
O(n log q) bits of advice, and indexes Iq,n .
Proof. To prove this theorem, we start by first describing the connection between the tasks of indexing
necklaces and indexing irreducible polynomials. Let P(T ) ∈ Iq,n . Note that P(T ) has all its roots in the
n−1
field Fqn . Let α ∈ Fqn be one of the roots of P(T ). Then we have that α, α q , . . . , α q are all distinct,
and:
n−1
i
P(T ) = ∏ T − α q .
i=0
T HEORY OF C OMPUTING, Volume 12 (7), 2016, pp. 1–27 14
I NDEXING N ECKLACES AND I RREDUCIBLE P OLYNOMIALS OVER F INITE F IELDS
n−1
Conversely, if we take α ∈ Fqn such that α, α q , . . . , α q are all distinct, then the polynomial
n−1
i
P(T ) = ∏ T − α q
i=0
is in Iq,n .
Define an action of Zn on F∗qn as follows: for k ∈ Zn and α ∈ (Fqn )∗ , define:
k
k[α] = α q .
This action partitions F∗qn into orbits. By the above discussion, Iq,n is in one-to-one correspondence with
the orbits of this action with size exactly n. Thus it suffices to index these orbits.
Let g be a generator of the the multiplicative group (Fqn )∗ . Define a map E : Zqn −1 → F∗qn by:
E(a) = ga .
We have that E is a bijection. Via this bijection, we have an action of Zn on Zqn −1 , where for k ∈ Zn and
a ∈ Zqn −1 ,
k[a] = qk · a .
Now represent elements of Zqn −1 by integers in {0, 1, . . . , qn − 2}. Define Σ = {0, 1, . . . , q − 1}.
For a ∈ Zqn −1 , consider its base-q expansion aσ ∈ Σn . This gives us a bijection between Zqn −1 and
Σn \ {(q − 1, . . . , q − 1)}. Via this bijection, we get an action of Zn on Σn \ {(q − 1, . . . , q − 1)}. This
action is precisely the standard rotation action!
This motivates the following algorithm.
The Indexing Algorithm:
Input: q (a prime power), n ≥ 0, i ∈ [|Iq,n |].
Advice: 1. A description of Fq .
2. An irreducible polynomial F(T ) ∈ Fq [T ] of degree n, whose root is a generator g of (Fqn )∗ (a. k. a.
primitive polynomial).
1. Let Σ = {0, 1, . . . , q − 1}.
2. Use i to index a necklace σ ∈ Σn \ {(q − 1, q − 1, . . . , q − 1)} with fundamental period exactly n
(via Theorem 2.20).
3. View σ as the base q expansion of an integer a ∈ {0, 1, . . . , qn − 2}.
4. Use F(T ) to construct the finite field Fqn and the element g ∈ F∗qn . (This can be done by setting
Fqn = Fq [T ]/F(T ), and taking the class of the element T in that quotient to be the element g.)
5. Set α = ga .
i
6. Set P(T ) = ∏n−1 q
i=0 (T − α ).
7. Output P(T ).
T HEORY OF C OMPUTING, Volume 12 (7), 2016, pp. 1–27 15
S WASTIK KOPPARTY, M RINAL K UMAR , AND M ICHAEL S AKS
For constant q, this algorithm can be made to work with O(log n) bits of advice. Indeed, one can
construct the finite field Fqn in deterministic poly(q, n) time, and a wonderful result of Shparlinski [30]
and Shoup [29] constructs a set of nO(1) elements in Fqn , one of which is guaranteed to be a generator
(see also [23] for improved quantitative dependence on q). The advice is then the index of an element of
this set which is a generator.
4 Explicit generator matrices and parity check matrices for BCH codes
In this section, we will apply the indexing algorithm for necklaces to give a strongly explicit construction
for generator and the parity check matrices for BCH codes. More precisely, we use the fact that our
indexing algorithm is in fact an unranking algorithm for the lexicographic ordering on (lexicographically
least representatives of) necklaces.
BCH codes [19] are classical algebraic error-correcting codes based on polynomials over finite
extension fields. They have played a central role since the early days of coding theory due to their
remarkable properties (they are one of the few known families of codes that has better rate/distance
tradeoff than random codes in some regimes). Furthermore, their study motivated many advances in
algebraic algorithms.
Using our indexing algorithm for necklaces, we can answer a basic question about BCH codes:
we construct strongly explicit generator matrices and parity check matrices for BCH codes. For the
traditionally used setting of parameters (constant designed distance), it is trivial to construct generator
matrices and parity check matrices for BCH codes. But for large values of the designed distance, as far as
we are aware, this problem was unsolved.
4.1 Generator matrices
Let q be a prime power, and let n ≥ 1 and 0 ≤ D < qn − 1. The BCH code associated with these parameters
will be of length qn over the field Fq , where the qn coordinates are identified with the big field Fqn . Let:
V = {hP(α)iα∈Fqn | P(X) ∈ Fqn [X], deg(P) ≤ D, s. t. ∀α ∈ Fqn , P(α) ∈ Fq } .
In words: this is the Fq -linear space of all Fqn -evaluations of Fqn -polynomials of low degree, which have
the property that all their evaluations lie in Fq . In coding theory terminology, this is a subfield subcode of
Reed-Solomon codes.
The condition that P(α) ∈ Fq for each α ∈ Fqn can be expressed as follows:
n
P(X)q ≡ P(X) mod X q − X .
Thus, if P(X) = ∑D i
i=0 ai X , then the above condition is equivalent to:
D D
n
∑ aqi X iq ≡ ∑ ai X i mod X q − X,
i=0 i=0
which simplifies to:
∀i, aiq mod (qn −1) = aqi .
Thus:
T HEORY OF C OMPUTING, Volume 12 (7), 2016, pp. 1–27 16
I NDEXING N ECKLACES AND I RREDUCIBLE P OLYNOMIALS OVER F INITE F IELDS
1. for every i, if ` is the smallest integer such that iq` mod (qn − 1) = i, then
`
ai ∈ V` = {α ∈ Fqn | α q = α} = Fq` ,
2. specifying ai ∈ V` automatically determines aiq mod (qn −1) , aiq2 mod (qn −1) , . . .,
3. ai can take any value in V` .
This motivates the following choice of basis for BCH codes. Define an equivalence relation ∼ on
{0, . . . , qn − 1} as follows: i1 ∼ i2 iff i2 = i1 · qk mod (qn − 1) for some k. Let G denote the set of all
equivalence classes. Let F = {S ∩ {0, 1, . . . , D} | S ∈ G}. Thus F is a partition of {0, 1, . . . , D}. Let
αS,1 , . . . , αS,|S| be a basis for V|S| over Fq (note that when ` | n, we have that
`
V` = {α ∈ Fqn | α q = α} = Fq`
is an Fq -linear subspace of Fqn of dimension `). For S ∈ F, define mS = mini∈S i. For S ∈ F and j ∈ [|S|],
define:
|S|−1
k k n
PS, j (X) = ∑ α qj X mS q mod (q −1) .
k=0
It is easy to see from the above description that (PS, j )S∈F, j∈[n] forms an Fq basis for the BCH code V .
Thus it remains to show that one can index the sets of F.
If we write all the elements of S ∈ F in base q, we soon realize that the S are precisely in one-to-one
correspondence with those rotation orbits of Σn (with Σ = {0, 1, . . . , q − 1}) where all elements of the
orbit are lexicographically ≤ some fixed string in Σn (in this case the fixed string turns out to be the
base q representation of the integer D). By the results of Section 2 (with a small change replacing
“lexicographically ≥” by “lexicographically ≤”), F can be indexed efficiently. Thus we can compute any
given entry of a generator matrix for BCH codes.
4.2 Parity-check matrices
For constructing parity-check matrices of BCH codes, we use an alternate characterization of BCH codes
(which is classical [19]). We begin by giving this characterization.
Let d be a given integer, the “designed distance.” For α ∈ Fqn , let vα denote the vector
(1, α, α 2 , . . . , α d−1 ) ∈ Fdqn .
Note that for any d distinct α, the corresponding vectors vα are linearly independent over Fqn (and hence
over Fq ); this follows from the observation that these vectors form a Vandermonde matrix.
The BCH code with designed distance d is the set of Fq -valued functions:
n o
f : Fqn → Fq s. t. ∑ f (α)vα = 0 .
α∈Fqn
(This is viewed as a code in the usual way: the coordinates of the code are indexed by Fqn ). We remark
that the set of Fqn -valued functions with this property is simply the Reed-Solomon code.
T HEORY OF C OMPUTING, Volume 12 (7), 2016, pp. 1–27 17
S WASTIK KOPPARTY, M RINAL K UMAR , AND M ICHAEL S AKS
Our goal is to give a strongly-explicit construction for a parity-check matrix for the BCH code: i. e., a
full-row-rank matrix with Fq entries whose null-space equals the BCH code. We first give a classical
description of such a matrix; we will then proceed to give an algorithm that computes its entries efficiently.
Define an equivalence relation ∼ on {0, . . . , qn − 1} as follows: i1 ∼ i2 iff i2 = i1 · qk mod (qn − 1)
for some k. Let G denote the set of all equivalence classes. We have the following simple observation: for
α ∈ Fq , E ∈ G, and e ∈ E:
|E|
(α e )q = α e ,
and so:
α e ∈ Fq|E| .
For each equivalence class E ⊆ {1, . . . , qn − 1}, let mE denote the smallest element of E. Let
H = {E | mE < d}
be the set of all equivalence classes whose smallest element is at most d − 1. Now for α ∈ Fqn , define the
vector
uα ∈ ∏ Fq|E|
E∈H
by
uα = (α mE )E∈H .
The remarkable dimension-distance tradeoff of BCH codes is based on the following simple, but extremely
important, fact: if f : Fqn → Fq , then
∑ f (α)vα = 0
α∈Fqn
if and only if
∑ f (α)uα = 0 .
α∈Fqn
The forward direction of this implication is trivial, since the vector uα is obtained from vα by deleting
certain coordinates. The reverse direction is based on the fact that
q
e
∑ f (α)α = ∑ f (α)α eq .
α α
In particular, we have that every d distinct uα are linearly independent over Fq .
For each j ≥ 1, let ψ j : Fq j → Fqj be an Fq -linear isomorphism. We will use the maps ψ j to convert
the uα into vectors with Fq entries. Define r = ∑E∈H |E|. Define ũα ∈ Frq to be the vector:
ũα = ψ|E| (α mE ) E∈H .
By construction, the vectors ũα satisfy the same Fq -dependencies as the vectors uα .
Let M be the r × qn matrix whose columns are ũα for α ∈ Fqn . Then M is a parity check matrix of the
BCH code. The fact that the nullspace of M equals the BCH code follows from the discussion above. The
T HEORY OF C OMPUTING, Volume 12 (7), 2016, pp. 1–27 18
I NDEXING N ECKLACES AND I RREDUCIBLE P OLYNOMIALS OVER F INITE F IELDS
fact that the row rank of M is full follows from the fact that for any (cE )E∈G with cE ∈ Fq|E| , the function
g : Fqn → Fq given by:
g(x) = ∑ TrF |E| /Fq (cE xmE ) ,
q
E∈G
is not identically 0, provided the cE are not all 0. This is proved by expanding the right hand side as a
n
polynomial mod (xq − x) and observing that there is a monomial with a nonzero coefficient. We omit
the (well-known) details.
We now use the components of our indexing algorithm for necklaces to give a strongly explicit
construction for M (after choosing a convenient ordering of the rows). The key observation is that when
the elements of {0, 1, . . . , qn − 1} are written in base q, the equivalence classes under the relation ∼ are
precisely the necklaces of length n over the alphabet {0, 1, . . . , q − 1}. We will now also refer to the E as
necklaces.
For an integer j dividing n, let H j = {E ∈ H | |E| = j}. The necklaces in H j are precisely those
necklaces with fundamental period j which contain some string which is lexicographically at most the
base-q representation of d − 1. By the results of Section 2, |H j | can be efficiently computed, and the
necklaces in H j can be efficiently indexed by a map h j : [|H j |] → H j .
To efficiently compute the entries of M, we first re-index the rows. The rows are indexed by tuples
( j, i, s), where j | n, i ∈ {1, 2, . . . , |H j |} and s ∈ {1, 2, . . . , j}: we may assume this form of indexing of
the rows for our strongly explicit construction since one can efficiently compute a bijection between
s ∈ {1, 2, . . . , r}, and the collection of all such tuples (this uses the fact that |H j | can be computed). The
columns are indexed by α ∈ Fqn . The entry of M corresponding to row ( j, i, s) and column α is then:
m
(ψ j (α h j (i) ))s .
Note that this is efficiently computable. In words, one first uses i to index a necklace E in H j ; next one
computes mE and uses this to compute α mE ; then one applies ψ j to this to get a vector in Fqj ; and finally
one takes the sth coordinate of this vector.
This completes the description of the strongly-explicit construction of a parity check matrix for BCH
codes.
5 Open problems
We conclude with some open problems.
1. Can the orbits of group actions be indexed in general?
One formulation of this problem is as follows: Let G be a finite group acting on a set X, both of
size poly(n). Suppose G and its action on X are given as input explicitly. For a finite alphabet Σ,
consider the action of G on ΣX (by permuting coordinates according to the action on X). Can the
orbits of this action be indexed? Can they be reverse-indexed?
[n]
2. Let G be the symmetric group Sn . Consider its action on {0, 1}( 2 ) , where G acts by permuting
coordinates. The orbits of this action correspond to the isomorphism classes of n-vertex graphs.
Can these orbits be indexed?
T HEORY OF C OMPUTING, Volume 12 (7), 2016, pp. 1–27 19
S WASTIK KOPPARTY, M RINAL K UMAR , AND M ICHAEL S AKS
More ambitiously, can these orbits be reverse-indexed? This would imply that graph isomorphism
is in P.
3. It would be interesting to explore the complexity theory of indexing and reverse-indexing. Which
languages can be indexed efficiently? Can this be characterized in terms of known complexity
classes? We include some simple observations about these problems in Appendix B.
A Alternative indexing algorithm for binary necklaces of prime length
In this section we give another algorithm for indexing necklaces in {0, 1}n in the special case where n is
prime.
For convenience, we will denote the n coordinates of {0, 1}n by 0, 1, . . . , n − 1, and identify them with
elements of Zn . For a binary string x, wt(x) denotes the Hamming weight of the string x.
Definition A.1. Let x ∈ {0, 1}n . We say x is top-heavy if for every j, 0 ≤ j < n:
j
wt(x)
∑ xk − n ≥ 0 .
k=0
In words: every prefix of x has normalized Hamming weight at least as large as the normalized
Hamming weight of x.
The next lemma by Dvoretzky and Motzkin [9] shows that every aperiodic string of prime length has
a unique top-heavy rotation.
Lemma A.2 ([9]). Let n be prime. For each x ∈ {0, 1}n \ {0n , 1n }, there exists a unique i, 0 ≤ i < n such
that Roti (x) is top-heavy.
Proof. Define f : {0, 1}n × N → R by:
j
wt(x)
f (x, j) = ∑ xk mod n − .
k=0 n
Then the top-heaviness of x is equivalent to f (x, j) ≥ 0 for all j ∈ N.
We make two observations.
1. If j = j0 mod n, then f (x, j) = f (x, j0 ). This follows from the fact that:
n−1
wt(x)
∑ xk − = 0.
k=0 n
2. For nonnegative integers j, ` with j < n, we have:
f (Rot j (x), `) = f (x, j + `) − f (x, j) .
T HEORY OF C OMPUTING, Volume 12 (7), 2016, pp. 1–27 20
I NDEXING N ECKLACES AND I RREDUCIBLE P OLYNOMIALS OVER F INITE F IELDS
Putting these two facts together, we get that:
f (Rot j (x), `) = f (x, ( j + `) mod n) − f (x, j) . (A.1)
Now fix x ∈ {0, 1}n \ {0n , 1n }. Define i ∈ {0, 1, . . . , n − 1} to be such that f (x, i) is minimized. By
Equation (A.1), we get that f (Roti (x), `) ≥ 0 for all nonnegative integers `. This proves the existence of
i.
For uniqueness of i, we make two more observations:
1. If f (x, j) > f (x, i), then
f (Rot j (x), n + i − j) = f (x, n + i) − f (x, j) = f (x, i) − f (x, j) < 0 ,
and thus Rot j (x) is not top-heavy.
2. If f (x, j) = f (x, j0 ), then j = j0 mod n. To see this, first note that we may assume j < j0 . Then:
0 = f (x, j0 ) − f (x, j)
j0
wt(x)
= ∑ xk mod n −
k= j+1 n
0
!
j
wt(x)
= ∑ xk mod n − ( j0 − j) · n
.
k= j+1
Thus, since the first term is an integer, we must have that ( j0 − j) · wt(x) must be divisible by n,
and by our hypothesis on x, we have that j0 = j mod n.
Thus i ∈ {0, 1, . . . , n − 1}, for which Roti (x) is top-heavy, is unique.
The above lemma implies that each orbit E contains a unique top-heavy string. We define the
canonical element of E to be that element.
We now show that there is a branching program A such that L(A) ∩ {0, 1}n precisely equals the set of
top-heavy strings. By the discussion in Section 2.1, this immediately gives an indexing algorithm for
orbits of E.
How does a branching program verify top-heaviness? In parallel, for each ` ∈ {1, . . . , n − 1}, the
branching program checks if condition C` holds, where C` is
!
j
k·`
“ (∀0 ≤ j < n) ∑ xk ≥ .”
k=0 n
At the same time, it also computes the weight of x. At the final state, it checks if Cwt(x) is true. The string
x is top-heavy if and only if Cwt(x) is true.
This completes the description of the indexing algorithm.
We also know an extension of this approach that can handle n which have O(1) prime factors. The key
additional ingredient of this extension is a new encoding of strings that enables verification of properties
like top-heaviness by automata.
T HEORY OF C OMPUTING, Volume 12 (7), 2016, pp. 1–27 21
S WASTIK KOPPARTY, M RINAL K UMAR , AND M ICHAEL S AKS
B Complexity of indexing
In this section, we explore some basic questions about the complexity theory of indexing and reverse
indexing. We would like to understand what sets can be indexed/reverse-indexed efficiently.
The outline of this section is as follows. We first deal with indexing and reverse-indexing in a
nonuniform setting. Based on some simple observations about what cannot be indexed/reverse-indexed,
we make some naive, optimistic conjectures characterizing what is efficiently indexable/reverse-indexable,
and then proceed to disprove these conjectures. We then make some natural definitions for indexing and
reverse-indexing in a uniform setting, and conclude with some analogous naive, optimistic conjectures.
B.1 Indexing and reverse-indexing in the nonuniform setting
By simple counting, most sets S ⊆ {0, 1}n cannot be indexed or reverse-indexed by circuits of size
poly(n). We now make two naive and optimistic conjectures.
• For every c > 0, there exists d > 0, such that for all sufficiently large n, if S ⊆ {0, 1}n has a nc -size
circuit recognizing it, then there is a nd -size circuit for indexing S.
• For every c > 0, there exists d > 0, such that for all sufficiently large n, if S ⊆ {0, 1}n has a nc -size
circuit recognizing it, then there is a nd -size circuit for reverse-indexing S.
Note that the simple observations about indexing made in the Introduction are consistent with these
conjectures.
We now show that these conjectures are false (unless the polynomial hierarchy collapses). Assuming
the conjectures, we will give Σ4 algorithms to count the number of satisfying assignments of a given
Boolean formula φ of size nc on n variables. By Toda’s theorem [32], this would imply that the polynomial
hierarchy collapses.
Suppose we are given a Boolean formula φ of size at most nc on n variables (with n sufficiently
large). Let S ⊆ {0, 1}n be the set of satisfying assignments of φ . We know that S can be recognized by a
circuit of size nc (namely φ ). By the conjectures, there are circuits Ci and Cr of size nd for indexing S and
reverse-indexing S. We will now see that a Σ4 algorithm can get its hands on these circuits, and then use
these circuits to count the number of elements in S.
Indexing. Consider the Σ4 algorithm that does the following on input φ . Guess a circuit C : {0, 1}n →
{0, 1}n ∪ {“too large”} of size nd , and an integer K < 2n and then verify the following properties:
• for all i ∈ [K], C(i) 6= too large and φ (C(i)) = 1,
• for all i ∈
/ [K], C(i) = too large,
• for all x ∈ {0, 1}n , if φ (x) = 1, then there exists a unique i ∈ [K] for which C(i) = x.
If C = Ci , and K = |S|, then these properties hold. It is also easy to see that if all these properties hold,
then C is an indexing circuit for S, and K = |S|. Thus the above gives a Σ4 algorithm to compute |S|.
T HEORY OF C OMPUTING, Volume 12 (7), 2016, pp. 1–27 22
I NDEXING N ECKLACES AND I RREDUCIBLE P OLYNOMIALS OVER F INITE F IELDS
Reverse-indexing. Consider the Σ4 algorithm that does the following on input φ . Guess a circuit
C : {0, 1}n → {0, 1}n ∪{“false”} of size nd , and an integer K < 2n and then verify the following properties:
• for all x ∈ {0, 1}n , either (φ (x) = 1 and C(x) ∈ [K]) or (φ (x) = 0 and C(x) = false),
• for all i ∈ [K], there exists a unique x ∈ {0, 1}n such that C(x) = i.
If C = Cr , and K = |S|, then these properties hold. It is also easy to see that if all these properties hold,
then C is a reverse-indexing circuit for S, and K = |S|. Thus the above gives a Σ4 algorithm to compute
|S|.
B.2 Indexing and reverse-indexing in the uniform setting
We now introduce a natural framework for talking about indexing in the uniform setting.
Let L ⊆ Σ∗ × Σ∗ be a pair-language. For x ∈ Σ∗ , define Lx = {y | (x, y) ∈ L}. An algorithm M(x, i) is
said to be an indexing algorithm for L if for every x ∈ Σ∗ , the function M(x, ·) is an indexing of the set
Lx . An algorithm M(x, y) is said to be a reverse indexing algorithm for L if for every x ∈ Σ∗ , the function
M(x, ·) is a reverse indexing of the set Lx . Indexing/reverse-indexing algorithms are said to be efficient if
they run in time poly(|x|).
We now make some preliminary observations about the limitations of efficient indexing/reverse-
indexing.
1. If L can be efficiently indexed, then the counting problem for L can be solved efficiently. (Recall
that the counting problem for L is the problem of determining |Lx | when given x as input. The
counting problem can be solved via binary search using an indexing algorithm.)
2. If L can be efficiently reverse indexed, then L must be in P. Indeed, the reverse indexing algorithm
M(x, y) immediately tells us whether (x, y) ∈ L.
In the absence of any other easy observations, we gleefully made the following optimistic conjectures.
1. Every pair-language L ∈ P for which the counting problem can be solved efficiently can be
efficiently indexed.
2. Every pair-language L ∈ P can be efficiently reverse indexed.
Using ideas similar to those used in the nonuniform case, one can show that the latter of these conjectures
is not true (unless the polynomial hierarchy collapses). However we have been unable to say anything
interesting about the first conjecture, and we leave it as an open problem.
Acknowledgements
We would like to thank Joe Sawada for making us aware of the work of Kociumaka et al. [15], and Igor
Shparlinski for helpful suggestions and references. Thanks to the anonymous referees for many valuable
comments.
T HEORY OF C OMPUTING, Volume 12 (7), 2016, pp. 1–27 23
S WASTIK KOPPARTY, M RINAL K UMAR , AND M ICHAEL S AKS
References
[1] L EONARD M. A DLEMAN AND H ENDRIK W. L ENSTRA , J R .: Finding irreducible polynomials over
finite fields. In Proc. 18th STOC, pp. 350–355. ACM Press, 1986. [doi:10.1145/12130.12166] 4
[2] M ANINDRA AGRAWAL AND S OMENATH B ISWAS: Primality and identity testing via Chi-
nese remaindering. J. ACM, 50(4):429–443, 2003. Preliminary version in FOCS’99.
[doi:10.1145/792538.792540] 4
[3] N OGA A LON , O DED G OLDREICH , J OHAN H ÅSTAD , AND R ENÉ P ERALTA: Simple construction
of almost k-wise independent random variables. Random Structures and Algorithms, 3(3):289–304,
1992. Preliminary version in FOCS’90. [doi:10.1002/rsa.3240030308] 2, 4
[4] A LEXANDR A NDONI , A SSAF G OLDBERGER , A NDREW M C G REGOR , AND E LY P ORAT: Homo-
morphic fingerprints under misalignments: sketching edit and shift distances. In Proc. 45th STOC,
pp. 931–940. ACM Press, 2013. [doi:10.1145/2488608.2488726] 5
[5] J ÖRG A RNDT: Matters Computational. Springer, 2011. [doi:10.1007/978-3-642-14764-7] 3, 5
[6] J EAN B ERSTEL AND M ICHEL P OCCHIOLA: Average cost of Duval’s algorithm for generating
Lyndon words. Theoret. Comput. Sci., 132(1-2):415–425, 1994. [doi:10.1016/0304-3975(94)00013-
1] 5
[7] K EVIN C ATTELL , F RANK RUSKEY, J OE S AWADA , M ICAELA S ERRA , AND C. ROBERT M IERS:
Fast algorithms to generate necklaces, unlabeled necklaces, and irreducible polynomials over GF(2).
J. Algorithms, 37(2):267–282, 2000. [doi:10.1006/jagm.2000.1108] 5
[8] J EAN -P IERRE D UVAL: Génération d’une section des classes de conjugaison et arbre des mots
de Lyndon de longueur bornée. Theoret. Comput. Sci., 60(3):255–283, 1988. [doi:10.1016/0304-
3975(88)90113-2] 5
[9] A RYEH DVORETZKY AND T HEODORE S. M OTZKIN: A problem of arrangements. Duke Math. J.,
14(2):305–313, 1947. [doi:10.1215/S0012-7094-47-01423-3] 20
[10] H AROLD F REDRICKSEN AND I RVING J. K ESSLER: An algorithm for generating necklaces of
beads in two colors. Discrete Math., 61(2-3):181–188, 1986. [doi:10.1016/0012-365X(86)90089-0]
5
[11] H AROLD F REDRICKSEN AND JAMES A. M AIORANA: Necklaces of beads in k colors and k-ary de
Bruijn sequences. Discrete Math., 23(3):207–210, 1978. [doi:10.1016/0012-365X(78)90002-X] 5
[12] S OLOMON W. G OLOMB: Irreducible polynomials, synchronization codes, primitive necklaces and
the cyclotomic algebra. In Combinatorial Math. and Its Appl., pp. 358–370. Univ. N. Carolina
Press, Chapel Hill, NC., 1969. 4
[13] V ENKATESAN G URUSWAMI AND S WASTIK KOPPARTY: Explicit subspace designs. In Proc. 54th
FOCS, pp. 608–617. IEEE Comp. Soc. Press, 2013. Preliminary version in ECCC. 4
T HEORY OF C OMPUTING, Volume 12 (7), 2016, pp. 1–27 24
I NDEXING N ECKLACES AND I RREDUCIBLE P OLYNOMIALS OVER F INITE F IELDS
[14] D ONALD E. K NUTH: Generating all trees, history of combinatorial generation. In The Art of
Computer Programming, volume 4. Addison-Wesley Professional, 2006. ACM DL. 5
[15] T OMASZ KOCIUMAKA , JAKUB R ADOSZEWSKI , AND W OJCIECH RYTTER: Computing k-th
Lyndon word and decoding lexicographically minimal de Bruijn sequence. In Combinat. Pattern
Matching, volume 8486 of LNCS, pp. 202–211. Springer, 2014. [doi:10.1007/978-3-319-07566-
2_21, arXiv:1510.02637] 5, 23
[16] S WASTIK KOPPARTY, M RINAL K UMAR , AND M ICHAEL E. S AKS: Efficient indexing of neck-
laces and irreducible polynomials over finite fields. In Proc. 41st Internat. Colloq. on Au-
tomata, Languages and Programming (ICALP’14), volume 8572, pp. 726–737. Springer, 2014.
[doi:10.1007/978-3-662-43948-7_60, arXiv:1504.00572] 1, 5
[17] D ONALD L. K REHER AND D OUGLAS ROBERT S TINSON: Combinatorial Algorithms: Generation,
Enumeration, and Search. Volume 7 of Discrete Math. and its Appl. CRC Press, 1999. 3, 5
[18] H ENDRIK W. L ENSTRA , J R .: Finding isomorphisms between finite fields. Math. Comput.,
56(193):329–347, 1991. [doi:10.2307/2008545] 4
[19] F. J ESSICA M AC W ILLIAMS AND N EIL J. A. S LOANE: The Theory of Error-Correcting Codes.
North-Holland Publ. Co., 2nd edition, 1978. 16, 17
[20] C ONRADO M ARTÍNEZ AND X AVIER M OLINERO: An efficient generic algorithm for the generation
of unlabelled cycles. In Math. and Comput. Sci. III, Trends in Math., pp. 187–197. Springer, 2004.
[doi:10.1007/978-3-0348-7915-6_19] 5
[21] W ENDY J. M YRVOLD AND F RANK RUSKEY: Ranking and unranking permutations in linear time.
Inf. Process. Lett., 79(6):281–284, 2001. [doi:10.1016/S0020-0190(01)00141-7] 3
[22] A LBERT N IJENHUIS AND H ERBERT S. W ILF: Combinatorial Algorithms for Computers and
Calculators. Volume 1. Academic Press, 1978. 3, 5
[23] G RIGORI Ĭ I OSIFOVICH P EREL’ MUTER AND I GOR E VGEN ’ EVICH S HPARLINSKI: On the dis-
tribution of primitive roots in finite fields. Russ. Math. Surv. (Uspekhi Mat. Nauk), 45(1):223–
224 (original pp. 185–186), 1990. Russian original freely downloadable from Math-Net.Ru.
[doi:10.1070/RM1990v045n01ABEH002330] 16
[24] M ICHAEL O. R ABIN: Fingerprinting by Random Polynomials. Center for Research in Comput.
Techn., Aiken Comput. Lab., Harvard Univ., 1981. 24 pp. See also at Harvard CS Tech Report
version TR 15-81 (12 pp). 4
[25] F RANK RUSKEY: Combinatorial Generation, 2003. Draft available at http://www.1stworks.
com/ref/RuskeyCombGen.pdf. 3, 5
[26] F RANK RUSKEY, C ARLA D. S AVAGE , AND T ERRY M IN Y IH WANG: Generating necklaces. J.
Algorithms, 13(3):414–430, 1992. [doi:10.1016/0196-6774(92)90047-G] 5
T HEORY OF C OMPUTING, Volume 12 (7), 2016, pp. 1–27 25
S WASTIK KOPPARTY, M RINAL K UMAR , AND M ICHAEL S AKS
[27] F RANK RUSKEY AND J OE S AWADA: An efficient algorithm for generating necklaces with
fixed density. SIAM J. Comput., 29(2):671–684, 1999. Preliminary version in SODA’99.
[doi:10.1137/S0097539798344112] 5
[28] V ICTOR S HOUP: New algorithms for finding irreducible polynomials over finite fields. Math.
Comput., 54(189):435–447, 1990. Preliminary version in FOCS’88. [doi:10.1090/S0025-5718-
1990-0993933-0] 4
[29] V ICTOR S HOUP: Searching for primitive roots in finite fields. Math. Comput., 58(197):369–380,
1992. Preliminary version in STOC’90. [doi:10.1090/S0025-5718-1992-1106981-9] 4, 16
[30] I GOR E VGEN ’ EVICH S HPARLINSKI: On primitive elements in finite fields and on elliptic curves.
Math. USSR Sb., 71(1):41–50, 1992. [doi:10.1070/SM1992v071n01ABEH001389] 16
[31] R ICHARD P. S TANLEY: Enumerative Combinatorics: Volume 1. Cambridge Univ. Press, 2011. 8
[32] S EINOSUKE T ODA: PP is as hard as the polynomial-time hierarchy. SIAM J. Comput., 20(5):865–
877, 1991. Preliminary version in FOCS’89. [doi:10.1137/0220053] 22
[33] L ESLIE G. VALIANT: The complexity of computing the permanent. Theoret. Comput. Sci., 8(2):189–
201, 1979. [doi:10.1016/0304-3975(79)90044-6] 3
AUTHORS
Swastik Kopparty
Rutgers University, New Brunswick, NJ
swastik kopparty rutgers edu
http://www.math.rutgers.edu/~sk1233/
Mrinal Kumar
Rutgers University, New Brunswick, NJ
mrinal kumar rutgers edu
https://dragon.rutgers.edu/~mk1029/
Michael Saks
Rutgers University, New Brunswick, NJ
saks math rutgers edu
http://www.math.rutgers.edu/~saks/
T HEORY OF C OMPUTING, Volume 12 (7), 2016, pp. 1–27 26
I NDEXING N ECKLACES AND I RREDUCIBLE P OLYNOMIALS OVER F INITE F IELDS
ABOUT THE AUTHORS
S WASTIK KOPPARTY received his Ph. D. in Computer Science from M.I.T. in 2010, and
was advised by Madhu Sudan. During 2010-2011, he was a postdoc at the Institute for
Advanced Study, and he has been at Rutgers University since then. He is interested in
complexity theory, error-correcting codes, finite fields, randomness and pseudorandom-
ness.
M RINAL K UMAR is a Ph. D. student in the Department of Computer Science at Rutgers
University, where he is advised by Swastik Kopparty and Shubhangi Saraf. His research
interests are in computational complexity, algebraic complexity and error-correcting
codes.
M ICHAEL S AKS received his Ph. D. in Mathematics from M.I.T. in 1980, where he was
advised by Daniel Kleitman. He was a postdoctoral fellow at UCLA, and held positions at
Bell Communications Research and the Computer Science and Engineering Department
at UCSD.
He has worked in a variety of areas in the theory of computing and discrete mathemat-
ics: lower bounds for data structures, circuits, communication complexity, branching
programs and decision trees, streaming algorithms, sublinear algorithms, satisfiability
algorithms, online algorithms, distributed computing, and derandomization, and extremal
problems for graphs, hypergraphs and partially ordered sets.
T HEORY OF C OMPUTING, Volume 12 (7), 2016, pp. 1–27 27