Linear Equations, Arithmetic Progressions and Hypergraph Property Testing

Authors Noga Alon, Asaf Shapira,
Plaintext
                                 T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216
                                             http://theoryofcomputing.org




  Linear Equations, Arithmetic Progressions,
      and Hypergraph Property Testing
                                              Noga Alon∗                                Asaf Shapira†
                                     Received: January 12, 2005; published: October 12, 2005.




        Abstract: For a fixed k-uniform hypergraph D (k-graph for short, k ≥ 3), we say that a
        k-graph H satisfies property PD (or property P∗D ) if it contains no copy (or no induced copy)
        of D. Our goal in this paper is to classify the k-graphs D for which there are property-testers
        for testing PD and P∗D whose query complexity is polynomial in 1/ε. For such k-graphs we
        say that property PD (or property P∗D ) is easily testable.
             For P∗D , we prove that aside from a single 3-graph, P∗D is easily testable if and only
        if D is a single k-edge. We further show that for large k, one can use more sophisticated
        techniques in order to obtain better lower bounds for any large enough k-graph. These
        results extend and improve the authors’ previous results about graphs (SODA 2004) and
        results by Kohayakawa, Nagle and Rödl on k-graphs (ICALP 2002).
             For PD , we show that for any k-partite k-graph D, property PD is easily testable. This
        is established by giving an efficient one-sided-error property-tester for PD , which improves
        the one obtained by Kohayakawa et al. We further prove a nearly matching lower bound
  ∗ Supported in part by a USA-Israeli BSF grant, by the Israel Science Foundation and by the Hermann Minkowski Minerva

Center for Geometry at Tel Aviv University.
  † Supported in part by a Charles Clore Foundation Fellowship and by an IBM Ph. D. Fellowship.



ACM Classification: G.2.2, F.2.2
AMS Classification: 05C65, 68R10
Key words and phrases: Property Testing, Hypergraphs, Lower Bounds, Additive Number Theory,
Linear Algebra, Extremal Problems


Authors retain copyright to their work and grant Theory of Computing unlimited rights
to publish the work electronically and in hard copy. Use of the work is permitted as
long as the author(s) and the journal are properly acknowledged. For the detailed
copyright statement, see http://theoryofcomputing.org/copyright.html.

c 2005 Noga Alon and Asaf Shapira                                                                  DOI: 10.4086/toc.2005.v001a009
                                        N. A LON AND A. S HAPIRA

       on the query complexity of such a property-tester. Finally, we give a sufficient condition
       for inferring that PD is not easily testable. Though our results do not supply a complete
       characterization of the k-graphs for which PD is easily testable, they are a natural extension
       of the previous results about graphs (Alon, 2002).
           Our proofs combine results and arguments from additive number theory, linear algebra,
       and extremal hypergraph theory. We also develop new techniques, which we believe are of
       independent interest. The first is a construction of a dense set of integers which does not
       contain a subset that satisfies a certain set of linear equations. The second is an algebraic
       construction of certain extremal hypergraphs. These techniques have already been applied
       in two papers under publication by the authors.




1     Introduction

1.1   Definitions

All the hypergraphs considered here are finite and have no parallel edges. A k-uniform hypergraph (or
k-graph, for short) H = (V, E), is a hypergraph in which each edge contains precisely k distinct vertices
of V . As usual, a 2-graph may be referred to simply as a graph. Let P be a property of k-graphs, that
is, a family of k-graphs closed under isomorphism. A k-graph H with n vertices is ε-far from satisfying
P if one must add or delete at least εnk edges in order to turn H into a k-graph satisfying P. An ε-
tester, or property-tester, for P is a randomized algorithm which, given the quantity n and the ability to
make queries whether a desired set of k vertices spans an edge in H, distinguishes with high probability
(say, 2/3) between the case of H satisfying P and the case of H being ε-far from satisfying P. Such
an ε-tester is said to have one-sided error if when H satisfies P it determines that this is the case (with
probability 1). The ε-tester is said to have two-sided error if it may err in both direction, namely if
it has nonzero probability of accepting k-graphs that are ε-far from satisfying P, as well as nonzero
probability of rejecting k-graphs that satisfy P. The property P is called strongly-testable if, for every
fixed ε > 0, there exists a one-sided ε-tester for P whose total number of queries is bounded only by a
function of ε that is independent of the size of the input k-graph. This means that the running time of the
algorithm is also bounded by a function of ε only, and is independent of the input size. In this paper we
measure query-complexity by the number of vertices sampled, assuming we always examine all edges
spanned by them. For a fixed k-graph D, let P∗D denote the property of being induced D-free. Therefore,
H satisfies P∗D if and only if it contains no induced sub-hypergraph isomorphic to D. We define PD to
be the property of being (not necessarily induced) D-free. Therefore, H satisfies PD if and only if it
contains no copy of D.
     The general notion of property testing was first formulated by Rubinfeld and Sudan [24], who were
motivated mainly by its connection to the study of program checking. The study of the notion of testabil-
ity for combinatorial objects, and mainly for labeled graphs, was introduced by Goldreich, Goldwasser
and Ron [13]. See [11] and [23] for surveys and additional references on the topic.

                        T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                             178
       L INEAR E QUATIONS , A RITHMETIC P ROGRESSIONS , AND H YPERGRAPH P ROPERTY T ESTING

1.2   Previous results

In [3] it is shown that every first order graph property without a quantifier alternation of type “∀∃” has
ε-testers whose query complexity is independent of the size of the input graph. It follows from the main
result of [3] that, for every fixed graph D, the property P∗D is strongly testable. Although the query
complexity is independent of n, it has a huge dependency on 1/ε (the fourth function in the Ackermann
Hierarchy, which is a tower of towers of exponents of height polynomial in 1/ε). In [2] it was shown,
using Szemerédi’s Regularity Lemma, that, for every fixed graph D, the property PD is also strongly
testable. This result was generalized to the case of directed graphs (digraphs) in [5], by first proving a
directed version of the regularity lemma. In the above two cases the query complexity is also huge, a
tower of 2’s of height polynomial in 1/ε.
    As in many cases, moving from graphs to hypergraphs has many unexpected difficulties. While for
graphs the strong testability of PD and P∗D follows quite easily from an appropriate regularity lemma
[3, 27], until very recently there was no strong enough regularity lemma suitable for proving that PD and
P∗D are strongly testable for any k-graph D. The only results for k-graphs were obtained by Frankl and
Rödl [12], who (implicitly) showed that, for any 3-graph D, property PD is strongly testable (see also
[19]) and by Kohayakawa, Nagle and Rödl in [17], where it was shown that, for any 3-graph D, property
P∗D is strongly testable. Recent works of Gowers [15] and independently of Nagle, Rödl, Schacht and
Skokan [22, 20] suggest that a powerful new hypergraph regularity lemma implies that P∗D and PD are
both strongly testable for any k-graph D, for arbitrary value of k. It should be noted, however, that the
upper bounds that these new techniques may guarantee, for testing k-uniform hypergraphs, will probably
belong to the kth level of the Ackermann Hierarchy.
    For some k-graphs, however, there are obviously much more efficient property-testers than the ones
guaranteed by the general results described above. For example, for any k, if D is a single k-edge, then
there is obviously a one-sided-error property-tester for PD = P∗D , whose query complexity is Θ(1/ε).
We simply sample Θ(1/ε) vertices, and check if they span an edge. A natural question is, therefore,
to decide for which k-graphs D there is a one-sided-error property-tester for PD or P∗D whose query
complexity is bounded by a polynomial of 1/ε. We introduce the following definition:


Definition 1.1 (Easily Testable). A property P is easily testable if there is a one-sided-error property-
tester for P whose query complexity is polynomial in 1/ε.


    In [1] it is shown that for an undirected graph D, property PD is easily testable if and only if D is
bipartite. One of the main results of [5] is a precise characterization of all the directed graphs D for
which PD is easily testable. In [4] it is shown that for any graph D other than the paths of length 1, 2, 3
(which have 2,3,4 vertices respectively), the cycle of length 4, and their complements, P∗D is not easily
testable. A similar result was also proved for directed graphs. For k > 2, the only result in the direction
of classifying the k-graphs for which PD and P∗D are easily testable was obtained in [17], where it was
shown that for any k, the complete k-graph on k + 1 vertices is not easily testable. A natural step is
therefore to classify all the k-graphs D for which P∗D and PD are easily testable.

                        T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                            179
                                           N. A LON AND A. S HAPIRA

1.3   The new results
Our first two results concern testing P∗D . In what follows we denote by D3,2 the unique 3-graph on
4-vertices that has 2 edges.

Theorem 1.2. For any k ≥ 3 and any k-graph D other than a single k-edge and D3,2 , there exists a
constant c = c(D) > 0 such that the query-complexity of any one-sided-error ε-tester for P∗D is at least
                                                  c log(1/ε)
                                                  1
                                                               .
                                                  ε
     As noted above, for any k, there is an obvious one-sided-error property-tester for the case of D being
a single k-edge, whose query complexity is Θ(1/ε). We therefore get that Theorem 1.2 gives a complete
characterization of the k-graphs D for which P∗D is easily testable, besides the case of D3,2 .
     Our second result states that for large k we can significantly improve the lower bounds for testing
P∗D , for almost all k-graphs.

Theorem 1.3. For any k there is a constant r(k) such that, for any k-graph D on at least r(k) vertices,
there is a constant c = c(D) > 0 such that any one-sided-error property-tester for testing P∗D has query
complexity at least
                                           c(log 1/ε)blog kc
                                            1
                                                               .
                                            ε
    In fact, the lower bounds in the above theorem apply also to some k-graphs on less than r(k) vertices,
amongst them all the k-graphs that contain F k , which is the complete k-graph on k + 1 vertices. As a
special case, we thus improve the lower bound for the case of F k obtained in [17], which was similar
to the lower bound in Theorem 1.2. Moreover, our technique supplies a slightly inferior lower bound
(namely, with exponent blogdk/2 + 1ec instead of blog kc) for any k-graph D on more than k vertices
(see discussion following the proof of Theorem 1.3 in Section 5.2). Note that the bounds of Theorem 1.3
are super-polynomial in the bounds of Theorem 1.2; thus for large k we obtain substantially better lower
bounds.
    Our next two results concern testing PD . We first give an efficient one-sided-error property-tester
for any k-partite k-graph. Recall that a k-graph is k-partite if its vertex set can be partitioned into k sets
such that each edge has precisely one vertex in each of the partition classes.

Theorem 1.4. (i) Let t1 ≤ . . . ≤ tk , put t ∗ = t1 · . . . · tk , and let D be any k-partite k-graph with partition
classes of sizes t1 , . . . ,tk . Then there is a one-sided-error ε-tester for PD with query complexity
                                                     t ∗ /tk
                                                     1
                                                  O            .
                                                     ε

(ii) For any k-partite k-graph D on d vertices which contains |E| edges, the query complexity of any
one-sided-error ε-tester for PD is
                                               |E|/d
                                                1
                                            Ω           .
                                                ε


                          T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                                   180
        L INEAR E QUATIONS , A RITHMETIC P ROGRESSIONS , AND H YPERGRAPH P ROPERTY T ESTING

    The upper bound in the above theorem improves the one obtained by [17] in which the exponent
was t ∗ . See Section 7 for more details. Observe that when D is the complete k-partite k-graph Kt,...,t ,
the exponent in the upper bound is t k−1 while the one in the lower bound is t k−1 /k, which is quite close.
The proof of this theorem appears in Section 7.
    For the next result we need some definitions. A homomorphism from a k-graph D to a k-graph
K is a mapping ϕ : V (D) → V (K) which maps edges to edges, namely, if (v1 , . . . , vk ) ∈ E(D) then
(ϕ(v1 ), . . . , ϕ(vk )) ∈ E(K).

Definition 1.5. (Core) The core of a k-graph D is the smallest (in terms of edges) sub-hypergraph of
D, denoted K, for which there exists a homomorphism from D to K. A k-graph D is called a core if it is
the core of itself.

    We also need to define a generalization of cycles in graphs;

Definition 1.6. (Hyper-Cycle1 ) A k-graph on d vertices 1, . . . , d is called a hyper-cycle if it contains
d − k + 2 edges e1 , . . . , ed−k+2 and one can arrange its vertices on a cycle such that every edge ei contains
the vertices {i (mod d), . . . , i + k − 1 (mod d)}.

    Observe that for k = 2 the above definition boils down to the definition of a cycle. Also, a single
k-edge is not a hyper-cycle, as it contains 1 < k − k + 2 = 2 edges. The next theorem gives a sufficient
condition for inferring that for a k-graph D, property PD is not easily testable.

Theorem 1.7. If the core of a k-graph D contains a hyper-cycle, then there exists a constant c = c(D) > 0
such that the query-complexity of any one-sided-error ε-tester for PD is at least
                                                        c log(1/ε)
                                                        1
                                                                     .
                                                        ε

    Observe that the core of any k-partite k-graph is a single edge, which does not satisfy the definition
of a hyper-cycle. It is important to note that though Theorem 1.7 establishes that for a large family
of non-k-partite k-graphs D, property PD is not easily testable, it does not cover all the non k-partite
k-graphs, as the core of some of them does not contain a hyper-cycle. However, for k = 2, Theorem 1.7
does cover all the non-bipartite graphs, as it is easy to see that the core of any non-bipartite graph must
contain a cycle, namely, one of the shortest odd cycles of the graph. As we have mentioned above, for
k = 2, this is precisely the definition of a hyper-cycle. Hence, Theorem 1.4 and Theorem 1.7 imply that
for k = 2, property PD is easily testable if and only if D is bipartite, thus extending the result of [1],
where the characterization for graphs was first obtained. We finally mention that using the main ideas of
the proof of Theorem 1.7 one can slightly extend it by showing that it holds even if in the definition of
a hyper-cycle one only requires that the first two vertices of ei would be i (mod d), i + 1 (mod d) (its
other vertices lying in {1, 2, . . . , d}).
    As the proof with this definition is more involved (mainly due to cumbersome notations), and still
does not cover all the cases of non-k-partite k-graphs, we preferred to give the proof of the slightly less
general case, which contains all the important ideas.
   1 In some papers on hypergraphs this object is called a tight cycle.



                            T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                             181
                                        N. A LON AND A. S HAPIRA

    We have thus far considered only one-sided-error property-testers. A natural question is if there
are k-graphs D, for which P∗D (or PD ) is not easily testable, but can still be tested with two-sided error
and query complexity polynomial in 1/ε. We can (partially) rule out this possibility by showing that
the lower bounds of Theorem 1.2, Theorem 1.3 and Theorem 1.7 can all be extended to the case of
two-sided-error ε-testers.

Theorem 1.8. The lower bounds of Theorem 1.2, Theorem 1.3 and Theorem 1.7 hold for two-sided-error
ε-testers as well.

1.4   Techniques
Our main results in this paper, Theorem 1.2, Theorem 1.3 and Theorem 1.7, are based on two new
constructions. All the previous results on testing PD and P∗D ([1, 5, 4, 17]) were based on constructions
of sets of integers which do not contain small subsets that satisfy a certain single equation. All these
constructions were based on Behrend’s construction [7] of a large set of integers containing no 3-term
arithmetic progression. In our case, however, we consider sets of integers that do not contain small
subsets that satisfy a certain set of equations. The key benefit of this consideration is that requiring
the set of integers to satisfy a set of equations, rather than a single one, allows us to construct much
denser sets than the ones used in previous papers. This benefit translates to significantly improved lower
bounds. The proof of this new construction appears in Section 2. Some of the techniques we apply in
the proof of this result are motivated by the work of Laba and Lacey [18], where they reproved a result
of Rankin [21] by constructing large sets of integers without k-term arithmetic progressions. The ideas
used in our number-theoretic construction have been further applied in another recent paper [26].
     Our second technical contribution is an algebraic construction of certain extremal k-graphs. The
goal of this construction is to resolve the main technical difficulty in the proof of the main results. The
main benefit of this construction is that it allows us to infer certain linear equations between the integers
that are used in the definition of these k-graphs. In previous papers about testing subgraphs in graphs,
([1, 5, 4]) inferring these linear equations was trivial. This construction can be viewed as an extension of
a construction of Frankl and Rödl [12] (which is an extension of the well known construction of Ruzsa
and Szemerédi [25]), but ours is far more complicated to analyze. It is also much more applicable than
the construction of [12], which, for example, can only be used to show that the complete k-graph on
k + 1 vertices is not easily testable and with a lower bound as in Theorem 1.2, rather than the one in
Theorem 1.3. Our new algebraic technique is applied in Section 3 and Section 6. The ideas used in the
algebraic construction of extremal k-graphs have been further applied in another recent paper [6].

1.5   Organization
In Section 2 and Section 3 we develop the main machinery needed to prove Theorem 1.2 and Theo-
rem 1.3. In Section 2 we describe a new number-theoretic construction. In Section 3 we describe a new
algebraic construction of extremal k-graphs. In Section 4 we prove two useful lemmas, which use the
constructions of Section 2 and Section 3 in order to obtain the lower bounds of Theorem 1.2 and The-
orem 1.3. The results of Section 2, Section 3 and Section 4 are essentially independent, and thus these
sections can be read independently. To further simplify the reading of these sections, each of them starts

                        T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                              182
        L INEAR E QUATIONS , A RITHMETIC P ROGRESSIONS , AND H YPERGRAPH P ROPERTY T ESTING

with a short subsection in which we state the important definitions and state the main results proved in
that section.
    The proofs of Theorem 1.2 and Theorem 1.3, which follow quite easily by combining the main re-
sults of Section 2, Section 3 and Section 4, are given in Section 5. In Section 6 we apply our algebraic
technique again, this time to construct extremal k-graphs, which are a central tool in the proof of Theo-
rem 1.7. The proof of Theorem 1.7 also appears in Section 6. In Section 7 we prove Theorem 1.4. As
the proof of Theorem 1.8 uses ideas similar to the ones used in [4] (in addition to the ideas of this paper)
we omit it. Section 8 contains some concluding remarks and open problems.
    Throughout this paper we assume, whenever this is needed, that the error parameter ε is sufficiently
small, and that the number of vertices n of the k-graph considered is sufficiently large compared to 1/ε.
In order to simplify the presentation, we omit all floor and ceiling signs whenever these are not crucial,
and make no attempt to optimize the absolute constants. All the logarithms appearing in the paper are in
base 2.


2     Arithmetic Progressions and Linear Equations
2.1    The main results of this section
In this section we give our number-theoretic construction, which will be later used in Section 5. We start
with some definitions.
Definition 2.1. ((k, h)-Gadget) Call a set of k − 2 linear equations E = {e1 , . . . , ek−2 } with integer coef-
ficients in k unknowns x1 , . . . , xk a (k, h)-gadget if it satisfies the following properties:
    1. Each of the unknowns x1 , . . . , xk appears in at least one of the equations.
    2. For 1 ≤ t ≤ k − 2 equation et is of the form
                                                pt xi + qt x j = (pt + qt )x` ,
       where 0 < pt , qt ≤ h and xi , x j , x` are distinct.
    3. Equations e1 . . . , ek−2 are linearly independent.
   We say that z1 , . . . , zk satisfy a (k, h)-gadget E if they satisfy the k − 2 equations of E. Note that any
gadget E has a trivial solution x1 = . . . = xk .
Definition 2.2. ((k, h)-Gadget-Free) A set of integers Z, is called (k, h)-gadget-free if there are no k
distinct integers z1 , . . . , zk ∈ Z that satisfy an arbitrary (k, h)-gadget.
    Our main goal in this section is to prove the following theorem, which will be a key ingredient in
the lower-bounds for P∗D .
Theorem 2.3. For every h and k there is an integer c = c(k, h), such that for every n there is a (k + 1, h)-
gadget-free subset Z ⊂ [n] = {1, 2, . . . , n} of size at least
                                                               n
                                               |Z| ≥            1/blog 2kc
                                                                             .                            (2.1)
                                                       ec(log n)

                           T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                              183
                                         N. A LON AND A. S HAPIRA

    As we have explained before, note that for larger k the above theorem guarantees the existence of
a substantially larger set Z. The special case of the above theorem, where k = 2, was proved and used
in [9] and [4]. As the details of the proof of Theorem 2.3 will reveal, the main idea is to somehow
reduce the construction required to prove Theorem 2.3 to a construction related to a notion very similar
to arithmetic progressions. The main idea of this reduction will be to show that integers satisfying the
linear equations of a gadget nearly form an arithmetic progression. Our notion of “near” arithmetic
progression is the following:

Definition 2.4. ((k, h)-Progression) A set of k integers z1 ≤ z2 ≤ . . . ≤ zk is said to form a (k, h)-
progression if there are integers d, n2 , . . . , nk with ni ≤ h such that, for 2 ≤ i ≤ k, we have

                                              zi = zi−1 + ni d .                                         (2.2)

    In what follows we call the integers ni the coefficients of the progression and d the difference. Note
that a (k, h)-progression is “nearly” an arithmetic progressions in the sense that in an arithmetic pro-
gression one requires n2 = . . . = nk = 1. Also, note that the difference d is analogous to the difference
between consecutive elements in an arithmetic progression. In other words, a k-term arithmetic progres-
sion is a (k, 1)-progression of distinct elements. The following notion will be important for the proof of
Theorem 2.3:

Definition 2.5. (Nontrivial (k, h)-Progression) A (k, h)-progression is said to be nontrivial if its ele-
ments are distinct. Thus, a (k, h)-progression is nontrivial iff the difference d as well as the coefficients
ni are all nonzero.

     The proof of Theorem 2.3 appears in the following two subsections. In the first subsection we
show how to transform the problem from one that deals with linear equations and gadgets to an
analogous problem about (k, h)-progressions. We also show how the solution of the problem about
(k, h)-progressions implies Theorem 2.3. In the second subsection we solve the problem about (k, h)-
progressions.

2.2   Gadgets and (k, h)-Progressions
We start this subsection by “reducing” gadgets to (k, h)-progressions. Formally, we want to show the
following

Lemma 2.6. For every k and h there is an integer c = c(k, h) such that if z1 < . . . < zk satisfy a (k, h)-
gadget then they form a nontrivial (k, c)-progression.

    For the proof of the above proposition we need the following three claims. For the proof of the first
we need the following well-known result that follows from Cramer’s rule and the Hadamard Inequality
(see, e.g., [16]).

Lemma 2.7. Let Ψ be a set of p homogenous linear equations in q variables. If p < q and each of the
coefficients in these equations has absolute value at most r, then Ψ has a nonzero solution {α1 , . . . , αq },
where each αi is an integer with absolute value at most (r2 p) p/2 .

                         T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                               184
        L INEAR E QUATIONS , A RITHMETIC P ROGRESSIONS , AND H YPERGRAPH P ROPERTY T ESTING

Claim 2.8. If z1 < . . . < zk are positive integers, which satisfy a (k, h)-gadget E, then for 2 ≤ i ≤ k − 1
there are positive integers ai , bi ≤ (h2 k)k/2 such that ai zi−1 + bi zi+1 = (ai + bi )zi .

Proof. As there is nothing to prove for k = 3, we assume k ≥ 4. In order to simplify the notation, we
show that there are positive integers a, b ≤ (h2 k)k/2 such that

                                              az1 + bz3 = (a + b)z2 .                                            (2.3)
The other k − 3 cases are identical. We first substitute z1 , . . . , zk into the set E, and obtain k − 2 linear
equations of the form pt zi + qt z j = (pt + qt )zk . Henceforth, when we refer to equation et ∈ E we will
refer to the equation after we have substituted the integers zi into it. Our goal is simply to show that there
are α1 , . . . , αk−2 not all equal to zero, such that in the linear combination C = α1 e1 + . . . + αk−2 ek−2 the
coefficients of the integers z4 , . . . , zk vanish. We first claim that this will give us (2.3). Indeed, note that
as e1 , . . . , ek−2 are by assumption linearly independent, it cannot be the case that all the coefficients of the
integers zi vanish. Also, as for each of the equations in E the sum of the coefficients on the left hand side
is equal to the coefficient on the right hand side, this must also hold for C, hence, it cannot be the case
that precisely one of coefficients of z1 , z2 , z3 does not vanish. Similarly, if precisely two of coefficients
of z1 , z2 , z3 do not vanish, this would imply that they are equal, which contradicts our assumption that
z1 < . . . < zk . Finally as we assume that each of the integers zi appears at least once, we are guaranteed
to get (2.3).
    In order to make sure that in a linear combination with coefficients α1 , . . . , αk−2 the integers z4 , . . . , zk
vanish, we may write k − 3 homogenous linear equations, which require that. This is a set of k − 3
homogenous equations in k − 2 unknowns with coefficients bounded by h. Therefore, by Lemma 2.7 it
has a nonzero solution with integer coefficients of size at most (h2 (k − 2))k/2−1 . This means that the
coefficients of C are bounded by (k − 3)(h2 (k − 2))k/2−1 ≤ (h2 k)k/2 , as needed.

Claim 2.9. Suppose z1 , z2 , z3 , a, b are positive integers, such that z1 < z2 < z3 and a, b ≤ h. If the
following equation holds
                                           az1 + bz3 = (a + b)z2 ,
then z1 , z2 , z3 form a nontrivial (3, h)-progression.

Proof. We show that z1 , z2 , z3 form a (3, h)-progression. It will be a nontrivial (3, h)-progression because
we assume that z1 < z2 < z3 . We first assume that a and b are co-prime, as otherwise we can divide them
by their gcd, and obtain a new equation a0 z1 + b0 z3 = (a0 + b0 )z2 , with a0 < a, b0 < b. Rearranging the
equation we get that a(z2 − z1 ) = b(z3 − z2 ). As a and b are co-prime d = (z3 − z2 )/a = (z2 − z1 )/b is
an integer. Thus, we can write z2 = z1 + bd and z3 = z2 + ad, and take n2 = b ≤ h and n3 = a ≤ h in the
definition of the (3, h)-progression.

Claim 2.10. Suppose z1 < z2 < . . . < zk are positive integers, such that for every 2 ≤ i ≤ k − 1 there are
integers ai , bi ≤ h, such that
                                      ai zi−1 + bi zi+1 = (ai + bi )zi
holds. Then z1 , z2 , . . . , zk form a (nontrivial) (k, hk−2 )-progression.

                          T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                                      185
                                           N. A LON AND A. S HAPIRA

Proof. As before, we show that z1 , . . . , zk form a (k, hk−2 )-progression. It will be a nontrivial (k, hk−2 )-
progression because we assume that z1 < z2 < . . . < zk . We proceed by induction on k. The base case
k = 3 follows from Claim 2.9. Assuming the claim holds for k we prove it for k + 1. By the induction
hypothesis, for 2 ≤ i ≤ k we can write zi = zi−1 + mit for some integer t and mi ≤ hk−2 . By assumption
ak zk−1 + bk zk+1 = (ak + bk )zk . Rearranging this gives
                                                          ak
                                           zk+1 − zk =       (zk − zk−1 ) .                                    (2.4)
                                                          bk
Put g = gcd(bk ,t) (≤ h) and d = t/g, and observe that for 1 ≤ i ≤ k we can write zi = zi−1 + g · mi · d,
and thus take ni = mi · g ≤ hhk−2 = hk−1 . As in Claim 2.9, we may assume that ak and bk are co-prime,
and conclude from (2.4) that bk divides zk − zk−1 = mk t. We may thus write
                                           ak mk t        ak mk g
                             zk+1 = zk +           = zk +         · d = zk + nk+1 d .
                                             bk             bk

As ak g/bk ≤ ak ≤ h and mk ≤ hk−2 , we have nk+1 ≤ hk−1 , and the proof is complete.

Proof of Lemma 2.6. Immediate from Claim 2.8 and Claim 2.10.

    Though we do not need this here, it is worth mentioning that the converse of Lemma 2.6 is also true.
Indeed, if z1 , . . . , zk form a (k, h)-progression, then for every 2 ≤ i ≤ k − 1 we have zi = zi−1 + ni d, and
zi+1 = zi + ni+1 d. This implies that (ni + ni+1 )zi = ni+1 zi−1 + ni zi+1 . Hence, z1 , . . . , zk satisfy the k − 2
linear equations (ni + ni+1 )xi = ni+1 xi−1 + ni xi+1 that are easily checked to satisfy the three requirements
of a (k, h)-gadget.
    The proof of Theorem 2.3 will follow by combining Lemma 2.6 and the following lemma.

Lemma 2.11. For every h and p ≥ 2, there is an integer c = c(p, h) such that for every n there is a
subset Z ⊂ [n] = {1, 2, . . . , n} of size at least
                                                              n
                                                  |Z| ≥        1/p
                                                                                                               (2.5)
                                                          ec log     n


that does not contain any nontrivial (1 + 2 p−1 , h)-progression.

Proof of Theorem 2.3. Let p be the largest integer satisfying 1 + 2 p−1 ≤ 1 + k, namely, p = blog 2kc.
Let c0 = c(k + 1, h) be the constant appearing in Lemma 2.6. Now, by Lemma 2.11, there is a constant
c = c(p, c0 ), such that for every n there is a subset Z ⊆ [n] of size as in (2.5), which contains no nontrivial
(1 + 2 p−1 , c0 )-progression. By our choice of p, this set contains no nontrivial (k + 1, c0 )-progression.
By Lemma 2.6, the set Z does not contain k + 1 distinct integers, which satisfy a (k + 1, h)-gadget. As
p = blog 2kc, the set Z satisfies the requirements of Theorem 2.3.

    It is easy to see that the elements of a (1 + 2 p−1 , h)-progression must be taken from an arithmetic
progression of length at most h2 p−1 , whose difference is the integer d from the definition of the (1 +
2 p−1 , h)-progression in Definition 2.4. Thus, another way to look at Lemma 2.11 is as a construction of a
set Z with the following property: not only doesn’t Z contain arithmetic progressions of length 1 + 2 p−1 ,

                          T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                                    186
        L INEAR E QUATIONS , A RITHMETIC P ROGRESSIONS , AND H YPERGRAPH P ROPERTY T ESTING

but it does not even contain 1 + 2 p−1 numbers out of some other not too large arithmetic progression,
whose other elements need not even belong to Z.
      In order to prove Lemma 2.11, we will first show that it holds for every fixed set of coefficients
n2 , . . . , n1+2 p−1 . Namely, we show that there is a subset of [n] of the same size as in (2.5) that does not
contain any (1 + 2 p−1 , h)-progression z1 , . . . , z1+2 p−1 such that zi = zi−1 + ni d for every 2 ≤ i ≤ 1 + 2 p−1 .
Note that the difference d may be arbitrary. To be precise, we want to show the following:
Lemma 2.12. For every fixed positive n2 , . . . , n1+2 p−1 ≤ h there is an integer c = c(p, h) such that for
every n there is a subset Z ⊂ [n] = {1, 2, . . . , n} of size at least
                                                              n
                                                  |Z| ≥        1/p
                                                                         ,                                      (2.6)
                                                          ec log     n

that does not contain any nontrivial (1 + 2 p−1 , h)-progression with coefficients n2 , . . . , n2 p −1 .
   The proof of this lemma appears in the next subsection. We first show how to derive Lemma 2.11
from the above lemma.

Proof of Lemma 2.11. For every set s, of positive 2 p−1 integers n2 , . . . , n1+2 p−1 ≤ h, let Zs be the
largest subset of [n], which does not contain any nontrivial (1 + 2 p−1 , h)-progression with coefficients
                                                                                                    1/p
n2 , . . . , n1+2 p−1 . By Lemma 2.12 we have that for any s the set Zs has size at least n/ec log n , where c
depends only on p and h. Denote the number of sets s by m, and observe that as the coefficients in each
set s are bounded by h there are less than 2 ph choices for the set s.
      Uniformly at random pick m integers t1 , . . . ,tm from {−n, . . . , n}, and consider the set
                                                        m
                                                        \
                                                  Z=        (Zi + ti )
                                                        i=1

(where, Z + t denotes the translate of Z by t, i.e. Z + t = {z + t : z ∈ Z}). Clearly Z contains no (1 +
2 p−1 , h)-progressions with arbitrary coefficients bounded by h. For every integer z ∈ [n] the probability
                                       1/p
that it belongs to Zi + ti is 1/ec log n , hence the probability that it belongs to all the sets Zi + ti , and
                                   1/p            0  1/p
therefore also to Z, is (1/ec log n )m = 1/ec log n for a possibly larger c0 that still depends only on p
                                                                                    0 1/p
and h. By linearity of expectation we get that the expected size of Z is n/ec log n , and therefore there is
some choice of t1 , . . . ,tm for which the resulting set Z is at least this large.

2.3    Large sets of integers without a given (k, h)-Progression
In this subsection we apply the method of [18] in order to prove Lemma 2.12. The proof will require
some more definitions. We first need to further extend the notion of arithmetic progressions as follows:
Definition 2.13 ((p,t, h)-Progression). A set of p integers z0 , . . . , z p−1 is said to form a (p,t, h)-
progression if there are t + 1 integers d0 , . . . , dt and integers n0 = 0, n1 , . . . , n p−1 ≤ h such that for
0 ≤ i ≤ p−1

                                    zi = d0 + ni · d1 + n2i · d2 + . . . + nti · dt .                           (2.7)

                          T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                                     187
                                                 N. A LON AND A. S HAPIRA

     To avoid confusion, note that by definition d0 = z0 , thus, we did not really need d0 and n0 , which is
fixed to be zero, in the above definition. However, this way of defining the integers of the set will make
subsequent notation more compact. Note that unlike the definition of (k, h)-progressions in Defini-
tion 2.4, here we define each element of the sequence with respect to the smallest number z0 = d0 , rather
than the preceding one as in Definition 2.4. Therefore, a (p, h)-progression as defined in Definition 2.4
is also a (p, 1, h(p − 1))-progression as defined in (2.7).

Definition 2.14 (Nontrivial (p,t, h)-progression). We call a (p,t, h)-progression nontrivial if at least
one of d1 , . . . , dt is nonzero and n0 , . . . , n p−1 are distinct.

Definition 2.15 (Rs (p,t, n)). For a set s of p distinct integers n0 = 0, n1 . . . , n p−1 ≤ h, define Rs (p,t, n) to
be the largest possible size of a subset of [n] which does not contain any nontrivial (p,t, h)-progression
whose coefficients are the integers of s.

    The proof of Lemma 2.12 will follow by combining the following two claims.

Claim 2.16. For every set s, of 2t + 1 distinct integers bounded by h, there is an integer c = c(t, h), such
that
                                                             n
                                       Rs (2t + 1,t, n) ≥ c√log n .                                     (2.8)
                                                          e
Claim 2.17. For every set s, of p distinct integers bounded by h, there is an integer c = c(p, h), such that
if n = gb and p ≥ t + 1, then
                                                   n · Rs (p, 2t, g2 b)
                                     Rs (p,t, n) ≥                      .                              (2.9)
                                                          cb g2 b
Proof of Lemma 2.12. As we have noted above, a (p, h)-progression as defined in Definition 2.4 is also
a (p, 1, h(p − 1))-progression as defined in (2.7). Hence, we can prove Lemma 2.12 by showing that for
every set s, of distinct2 coefficients n0 = 0, n1 , . . . , n2 p−1 ≤ h2 p−1 we have
                                                                             n
                                              Rs (1 + 2 p−1 , 1, n) ≥          1/p
                                                                                     .                                      (2.10)
                                                                         ec log n

    Consider any set s, of distinct integers bounded by h2 p−1 . Given integers n and p, we prove by
induction on ` that for every 2 ≤ ` ≤ p there is a constant c = c(p, h), such that
                                                                                 n
                                           Rs (1 + 2 p−1 , 2 p−` , n) ≥            1/`
                                                                                         .                                  (2.11)
                                                                          ec(log n)
    The case ` = 2 follows from Claim 2.16 with t = 2 p−2 . Assuming the claim holds for ` we prove it
                                                                                  1−1/(`+1)
for ` + 1. Set b = (log n)1/(`+1) , and let g satisfy n = gb , namely g = e(log n)          . A short calculation
shows that in this case

                                              (log g2 b)1/` ≤ c(log n)1/(`+1) ,                                             (2.12)
   2 The reader should recall that for a (p,t, h)-progressions to be nontrivial its coefficients should be distinct. When t = 1 this

guarantees that this nontrivial (p, 1, h)-progression is also a nontrivial (p, h)-progression.


                              T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                                              188
        L INEAR E QUATIONS , A RITHMETIC P ROGRESSIONS , AND H YPERGRAPH P ROPERTY T ESTING

where c depends only on p. We get that

                                           n · R(1 + 2 p−1 , 2 p−` , g2 b)
              R(1 + 2 p−1 , 2 p−`−1 , n) ≥
                                                     cb g2 b
                                              n         g2 b                  n                    n
                                          ≥ b 2 ·          2   1/`
                                                                   ≥              1/(`+1)
                                                                                          ≥          1/(`+1)
                                                                                                             ,
                                           c g b ec(log g b)          cb ec(log n)          ec(log n)
where the first inequality follows from Claim 2.17, the second from the induction hypothesis in (2.11)
with n = g2 b, the third from (2.12), and the last from our choice of b and the fact that c depends only
on p and h. Also, note that by the reasoning we used to derive each of these inequalities, all the above
constants depend only on p and h (we called all of them c in order to simplify the notation). This
completes the proof of (2.11). We now obtain (2.10) by setting ` = p in (2.11).

    We now turn to prove Claim 2.16 and Claim 2.17, which will require (yet again) several additional
definitions. Given a set of integers S we denote by S + r the translate of S by r, that is, S + r = {x + r :
x ∈ S}. Note, that if S does not contain any nontrivial (p,t, h)-progression than so does any translate of
S. For reasons that will soon become clear, we prefer to prove Claim 2.16 and Claim 2.17 with respect
to the set of integers {−n/2, . . . , n/2} rather than [n] = {1, . . . , n}. We also consider representations of
integers from {−n/2, . . . , n/2} in base g, where g will depend on n and will be much smaller than n. If
n = gb we define, for an integer c ≥ 2,
                                                        b−1
                                 Qc = {x ∈ Z : x = ∑ xi · gi , −g/c ≤ xi ≤ g/c} ,
                                                        i=0

namely, all the integers whose “digits” in base g belong to −g/c, . . . , g/c. As Qc ⊆ {−n/2, . . . , n/2}
we may and will construct our sought after sets from integers belonging to Qc for an appropriate large
enough constant c. Note, that somewhat unconventionally, we allow for negative digits. This repre-
sentation, however, is well-defined in the sense that given x ∈ Qc , there are unique integers −g/c ≤
x0 , . . . , xb−1 ≤ g/c such that x = ∑b−1       i
                                       i=0 xi · g . Given an integer x ∈ Qc we will denote by x = (x0 , . . . , xb−1 )
the unique b dimensional vector in Z such that x = ∑b−1
                                            b                             i                         2
                                                                i=0 xi · g . We will also denote kxk = kxk =
                                                                                                                  2

∑i=0 xi . Our argument will critically rely on the observation that if c is sufficiently large then addi-
  b−1 2

tion, and more generally linear combinations with small coefficients, of numbers from Qc is equivalent
to linear combinations of their corresponding vectors. For example, observe that if x, y, z ∈ Q2 , then
x + y = z if and only if x + y = z. The reason for that is simply that there is no carry in the base g
addition of the number. More generally, if c is sufficiently large with respect to integers α1 , . . . , αt , then
for x, x1 , . . . , xt ∈ Qc ,
                                                t                     t
                                          x = ∑ αi · xi ⇐⇒ x = ∑ αi · xi .                                              (2.13)
                                               i=1                   i=1

Also, note that if c is sufficiently large with respect to integers α1 , . . . , αt , then for x1 , . . . , xt ∈ Qc ,
                                                         t
                                                    x = ∑ αi · xi ∈ Qc0 ,                                               (2.14)
                                                        i=1


                            T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                                           189
                                            N. A LON AND A. S HAPIRA

for another (possibly smaller) constant c0 . It should be noted that had we chosen to work with the set [n]
rather than −n/2, . . . , n/2 and represented integers using positive digits, then (2.13) and (2.14) would
not necessarily hold for negative coefficients. The reason is that the difference of two numbers with
small digits may contain very large digits. As we also allow for negative digits, the difference also
contains small digits. Finally, given integers p1 , . . . , pt we denote by V (p1 , . . . , pt ) the Vandermonde
matrix satisfying for 1 ≤ i, j ≤ t, Vi, j = pij .

Proof of Claim 2.16. Consider any set s of 2t + 1 distinct integers n0 = 0, n1 , . . . , n2t ≤ h. For an integer
r define Sr = {x ∈ Qc : kxk2 = r}. We claim that if c is large enough in terms of t and h, then Sr contains
no nontrivial (2t + 1,t, h)-progression with coefficients taken from s. Suppose to the contrary that there
are such 2t + 1 integers z0 , z1 , . . . , z2t . By (2.7) we have that for 0 ≤ i ≤ 2t

                                    zi = d0 + ni · d1 + n2i · d2 + . . . + nti · dt ,                          (2.15)
where d0 = z0 , d1 , . . . , dt are arbitrary integers. Recall, that for this set to be nontrivial at least one
of d1 , . . . , dt must be nonzero (the integers ni ∈ s are already assumed to be distinct). Denote by D the
determinant of the Vandermonde matrix V = V (n0 , . . . , nt ), and for 0 ≤ i ≤ t denote by Di the determinant
of the matrix obtained from V by replacing the ith column with the column vector (z0 , . . . , zt ). Observe,
that we can view the first t + 1 equations in (2.15) as t + 1 equations in unknowns d0 , d1 , . . . , dt . It
follows from Cramer’s rule that for 0 ≤ i ≤ t we have Ddi = Di . From the definition of the determinant
we can view Di as a linear combination of z0 , . . . , zt with integer coefficients bounded by a constant that
depends only on t and n0 , n1 , . . . , nt . As n0 , n1 , . . . , nt ≤ h, these coefficients are bounded by a constant
that depends only on t and h. Hence, by (2.13), if c was chosen large enough in terms of t and h then
for 0 ≤ i ≤ t, we get that Ddi (the b dimensional vector representing Ddi ) is a linear combination of
z0 , . . . , zt . Moreover, by (2.14) we may conclude that Ddi ∈ Qc0 for some c0 < c. As by (2.15), z0 , . . . , z2t
are defined as linear combinations of d0 , . . . , dt , we conclude that if c is large enough (so that c0 is large
enough), we can use (2.13) again to write (2.15) as

                               Dzi = Dd0 + ni · Dd1 + n2i · Dd2 + . . . + nti · Ddt .                          (2.16)
Define the following polynomial of degree 2t
                             b−1
                    P(x) := ∑ (Dd0 )q + (Dd1 )q · x + (Dd2 )q · x2 + . . . + (Ddt )q · xt
                                                                                            2
                                                                                                 ,
                             q=0


where (Ddi )q denotes the qth entry of the vector Ddi . The key observation now is that by (2.16) we have
for 0 ≤ j ≤ 2t that P(n j ) = kDz j k2 = D2 kz j k2 . Hence, as by assumption all the integers zi belong to
Sr , we have that P is a polynomial of degree 2t with 2t + 1 distinct values (namely n0 , n1 , . . . , n2t ) for
which it is equal to D2 r. Therefore, P must be identical to D2 r, which can be easily seen to imply that
(Ddi )q = 0 for all 0 ≤ q ≤ d − 1 and 1 ≤ i ≤ t. Hence, d1 = . . . = dt = 0, contradicting our assumption
that this is a nontrivial (2t + 1,t, h)-progression. We conclude that if c is large enough in terms of h and
t then Sr contains no nontrivial (2t + 1,t, h)-progression.
     The claim now follows by averaging. As the absolute value of each digit in Qc is bounded by g/c,
we have r ≤ b(g/c)2 ≤ bg2 . Similarly, we conclude that Qc is of size (2g/c)b > (g/c)b . As the union of

                          T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                                     190
        L INEAR E QUATIONS , A RITHMETIC P ROGRESSIONS , AND H YPERGRAPH P ROPERTY T ESTING

                                                                                    b     2  2 b
                              √ Qc there must be one r for which |Sr | ≥ (g/c) /bg = n/bg c . Setting
the sets Sr covers the entire set
     √
b = log n, and hence g = e      log n , gives (2.8) for an appropriate constant c = c(t, h).

Proof of Claim 2.17. We again consider an arbitrary set s, of distinct integers n0 = 0, n1 , . . . , n p−1 bounded
by h. As in the previous proof, we continue to write n = gb and represent numbers in base g with possibly
negative digits. We will also construct our sought after set from Qc for a large enough constant c that will
only depend on p,t and h. Let D denote the determinant of the Vandermonde matrix V = V (n0 , . . . , nt ).
Let R ⊆ {1, . . . , D2 b(g/c)2 } be a set of size Rs (p, 2t, D2 b(g/c)2 ) that contains no nontrivial (p, 2t, h)-
progression with coefficients from s, and recall that any translate of R also satisfies this property. Let
L = {−D2 b(g/c)2 , . . . , D2 b(g/c)2 }. For any ` ∈ L define
                                            A` = {x ∈ Qc : kDxk2 ∈ R + `} .
We claim that A` does not contain any nontrivial (p,t, h)-progression, with coefficients from s, provided
c in the definition of Qc is large enough. Suppose this is not the case, and let z0 , . . . , z p−1 be such a
nontrivial (p,t, h)-progression. Namely, suppose there are d0 , d1 , . . . , dt not all equal to zero such that
z j = d0 + ∑ti=1 ntj dt holds for 0 ≤ i ≤ p − 1. As by assumption p ≥ t + 1 we can still write the t + 1 linear
equations as in (2.15). We can then argue as in the proof of Claim 2.16 that provided c is large enough,
we may conclude that for 0 ≤ j ≤ p − 1 one can write

                                Dzi = Dd0 + ni · Dd1 + n2i · Dd2 + . . . + nti · Ddt .                                (2.17)
This implies, as in Claim 2.16, that for every 0 ≤ j ≤ p − 1 we can write

                                                         !2
                                b−1     t
         kDz j k = kDz j k2 = ∑        ∑ (Ddi )q · nij        = d00 + n j · d10 + n2j · d20 + . . . + n2tj · d2t0 ,   (2.18)
                                q=0   i=0

where d00 , . . . , d2t0 are identical to all 0 ≤ j ≤ p − 1. It is easy to see that as d0 , . . . , dt are by assumption
not all zero, then so are d00 , . . . , d2t0 . As d00 , . . . , d2t0 are common to all kDz j k2 , the right hand side of
(2.18) has the structure of a nontrivial (p, 2t, h)-progression with coefficients from s. This means that
kDz0 k2 , . . . , kDzt−1 k2 form a nontrivial (p, 2t, h)-progression with coefficients from s. This contradicts
our choice of R and A` .
      We conclude that for any ` ∈ L, the set A` contains no nontrivial (p,t, h)-progression with coefficients
from s. It is thus enough to show that for some ` ∈ L the set A` is large enough. We do this again
by averaging. As the absolute value of the digits of numbers from Qc is bounded by g/c we have
0 ≤ kDxk2 ≤ D2 b(g/c)2 for any x ∈ Qc . Therefore, for any r ∈ R and x ∈ Qc there is an ` ∈ L such
that kDxk2 = r + `. Hence, for any x ∈ Qc there are |R| integers ` ∈ L such that x ∈ A` . In other words,
  |L|
∑`=1 A` ≥ |R||Q|, and therefore for some choice of ` ∈ L we have |A` | ≥ |R||Qc |/|L|. As |Qc | > (2g/c)b ,
the proof follows as for some ` ∈ L we must have
                        R(p, 2t, h, D2 bg2 ) · (2g/c)b R(p, 2t, h, bg2 ) · gb     R(p, 2t, h, bg2 )
              |Ab | ≥                                 ≥                       ≥ n                   ,                 (2.19)
                                 D2 b(g/c)2                 D2 cb bg2                 cb bg2
where we used the fact that by definition n = gb and D is bounded by a function of t and h only, therefore,
we can use a slightly larger constant c to “absorb” D2 .

                           T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                                          191
                                                   N. A LON AND A. S HAPIRA

3     Linear Equations and Extremal Hypergraphs
3.1    The main results of this section
In this section we describe the first algebraic construction of extremal k-graphs, which will play an
important role in the proofs of Theorem 1.2 and Theorem 1.3 about testing P∗D in Section 5. The second
construction, related to PD and Theorem 1.7, appears in Section 6. The following definition will be key
for what follows:
Definition 3.1 (T k ). Let T k denote the family of k-graphs on k + 1 vertices, which contain at least three
edges.
      Let m be an integer, T a member of T k , and Z an arbitrary subset of [m]. Let also Pd = {p1 , . . . , pk+1 }
be a set of k + 1 distinct integers bounded by d (thus d > k). Consider the following definition of
a k-graph S = S(m, Z, T, Pd ): The vertex set of S consists of k + 1 pairwise disjoint sets of vertices
V1 , . . . ,Vk+1 , where, with a slight abuse of notation, we think of each of these sets as being the set of
integers 1, . . . , d k m. Define

                    E(z0 , z1 , . . . , zk−1 , p) = z0 + p · z1 + p2 · z2 + p3 · z3 + . . . + pk−1 · zk−1 .                       (3.1)

We define the edge set of S by specifying the edge sets of |Z|k copies of T that we put in S. In what
follows we refer to the k + 1 vertices of T as integers in {1, . . . , k + 1}. For every set of (not nec-
essarily distinct) integers z0 , . . . , zk−1 ∈ Z, we add to S a copy of T that is spanned by the vertices
v1 ∈ V1 , . . . , vk+1 ∈ Vk+1 , where for 1 ≤ i ≤ k + 1 we choose vi = E(z0 , . . . , zk−1 , pi ). In order to specify
the edges of this copy, we simply regard the vertices v1 , . . . , vk+1 as if they were the vertices 1, . . . , k + 1
of a regular copy of T and put in the corresponding edges. Namely, for every edge (t1 , . . . ,tk ) ∈ E(T ),
we add to S an edge that contains the vertices

            E(z0 , . . . , zk−1 , pt1 ) ∈ Vt1 , E(z0 , . . . , zk−1 , pt2 ) ∈ Vt2 , . . . , E(z0 , . . . , zk−1 , ptk ) ∈ Vtk .

In what follows we denote by C(z0 , . . . , zk−1 ), the copy of T defined using the integers z0 , . . . , zk−1 . Note,
that each of these |Z|k copies of D has precisely one vertex in each of the sets V1 , . . . ,Vk+1 . Note also,
that for every z0 , . . . , zk−1 and pi , the function E satisfies

                                        1 ≤ E(z0 , . . . , zk−1 , pi ) ≤ kd k−1 m ≤ d k m ,

thus the vertices “fit” into the sets V1 , . . . ,Vk+1 . The reader should also observe that we treat the set of
integers Pd , as an ordered set, as when choosing the vertex from Vi we use the integer pi ∈ Pd . Our first
goal in this section is to prove the following lemma.
Lemma 3.2. (The Key Lemma) Let T be a member of T k , m an arbitrary integer, Z a subset of [m]
and Pd a set of k + 1 distinct integers bounded by d. Define S = S(m, Z, T, Pd ), and suppose E1 , E2 , E3
are three edges that belong to a copy of T in S. If E1 ∈ C(a0 , . . . , ak−1 ), E2 ∈ C(b0 , . . . , bk−1 ) and
E3 ∈ C(c0 , . . . , ck−1 ), and if ai ≤ ci ≤ bi for some i, 0 ≤ i ≤ k − 1, then either ai = bi = ci or there are
                                    2
positive integers β1 , β2 ≤ d 3d such that

                                                  β1 ai + β2 bi = (β1 + β2 )ci .

                             T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                                                   192
         L INEAR E QUATIONS , A RITHMETIC P ROGRESSIONS , AND H YPERGRAPH P ROPERTY T ESTING

    Using the above lemma, we will construct the following extremal k-graph, which will be a key
ingredient in the lower bounds of Theorem 1.2 and Theorem 1.3.

Lemma 3.3. For every fixed k-graph D on d vertices that contains a copy of T ∈ T k with r ≥ 3 edges,
                           2
an integer m and a (r, d 3d )-gadget-free set Z ⊂ [m/d k+2 ], there is a k-graph F on m vertices with the
following properties:

    1. F contains |Z|k induced copies of D, which are singled out from the rest of the copies of D and are
       called the essential copies of D in F.

    2. Each pair of these essential copies share at most k − 1 common vertices.

    3. Every copy of T belongs to one of the essential copies of D.

     It is important to note that we do not claim that F does not contain any copies of D other than the
|Z|k essential copies, nor will we claim so later on in this section. As the statement of the above lemma
is rather technical, the reader can find in Section 3.3 a short intuitive explanation of it.

3.2    Proof of Lemma 3.2
The main idea of the proof is very simple; as T has only k + 1 vertices, the 3 edges spanned by these
vertices must have many common vertices. As the vertices of each set were chosen using the function
E defined in (3.1), we get from each vertex that is common to, say, E1 and E2 , a linear equation that
relates the integers a0 , . . . , ak−1 , which were used to define E1 and the integers b0 , . . . , bk−1 , which were
used to define E2 . We then show that for every i either ai = bi = ci or the linear equations induced by
the intersections of the edges are “reach” enough to enable us to extract a linear equation of the form
β1 ai + β2 bi = (β1 + β2 )ci .
    Let E1 , E2 and E3 be three edges that belong to a copy of a member of T ∈ T k . As T has k +1 vertices
and any k-graph on k + 1 vertices that contains at least 3 edges is a core (recall Definition 1.5), the k + 1
vertices must belong to distinct sets Vi . Call these vertices v1 ∈ V1 , . . . , vk+1 ∈ Vk+1 . Assume, without
loss of generality, that E1 = {v1 , . . . , vk+1 } \ vk+1 , E2 = {v1 , . . . , vk+1 } \ vk and E3 = {v1 , . . . , vk+1 } \ vk−1 .
Recall, that every edge in S belongs to one of the copies of T , defined using some k integers from Z.
Suppose E1 ∈ C(a0 , . . . , ak−1 ), E2 ∈ C(b0 , . . . , bk−1 ), and E3 ∈ C(c0 , . . . , ck−1 ). As v1 ∈ V1 , . . . , vk−1 ∈
Vk−1 , are common to both E1 and E2 we conclude that for every i ∈ [k + 1] \ {k, k + 1}, the following
equation holds:
                                    E(a0 , . . . , ak−1 , pi ) = vi = E(b0 , . . . , bk−1 , pi ) .
As v1 ∈ V1 , . . . , vk−2 ∈ Vk−2 , vk ∈ Vk , are common to both E1 and E3 we conclude that for every i ∈
[k + 1] \ {k − 1, k + 1}, the following equation holds:

                                   E(a0 , . . . , ak−1 , pi ) = vi = E(c0 , . . . , ck−1 , pi ) .

We could have written k − 1 equations for the common vertices of E2 and E3 , however, all but one of
them follow from the previous equations. The only independent equation is due to vk+1 :

                              E(b0 , . . . , bk−1 , pk+1 ) = vk+1 = E(c0 , . . . , ck−1 , pk+1 ) .

                             T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                                            193
                                            N. A LON AND A. S HAPIRA

We get a set of 2k − 1 equations in 3k unknowns, a0 , . . . , ak−1 , b0 , . . . , bk−1 and c0 , . . . , ck−1 . In order
to simplify the rest of this subsection, we substitute the definition of E from (3.1) and write our set of
equations as follows:


                  a0 + p1 a1 + p21 a2 + . . . + p1k−1 ak−1 = b0 + p1 b1 + p21 b2 + . . . + p1k−1 bk−1
                  a0 + p2 a1 + p22 a2 + . . . + p2k−1 ak−1 = b0 + p2 b1 + p22 b2 + . . . + p2k−1 bk−1
                                                          ..
                                                           .
             a0 + pk−1 a1 + p2k−1 a2 + . . . + pk−1
                                                k−1
                                                    ak−1 = b0 + pk−1 b1 + p2k−1 b2 + . . . + pk−1
                                                                                              k−1
                                                                                                  bk−1
                  a0 + p1 a1 + p21 a2 + . . . + p1k−1 ak−1 = c0 + p1 c1 + p21 c2 + . . . + p1k−1 ck−1
                  a0 + p2 a1 + p22 a2 + . . . + p2k−1 ak−1 = c0 + p2 c1 + p22 c2 + . . . + p2k−1 ck−1
                                                          ..
                                                           .
             a0 + pk−2 a1 + p2k−2 a2 + . . . + pk−2
                                                k−1
                                                    ak−1 = c0 + pk−2 c1 + p2k−2 c2 + . . . + pk−2
                                                                                              k−1
                                                                                                  ck−1
                  a0 + pk a1 + p2k a2 + . . . + pkk−1 ak−1 = c0 + pk c1 + p2k c2 + . . . + pkk−1 ck−1
             b0 + pk+1 b1 + p2k+1 b2 + . . . + pk+1
                                                k−1
                                                    bk−1 = c0 + pk+1 c1 + p2k+1 c2 + . . . + pk+1
                                                                                              k−1
                                                                                                  ck−1


     In what follows we denote by Φ the above set of equations. The main idea of the proof will be to
show that either a0 = b0 = c0 or there is a linear combination of Φ with integer coefficients α1 , . . . , α2k−1 ,
which results in the required linear equation relating a0 , b0 and c0 . The other cases relating ai , bi , ci with
i > 0 are completely identical. The main idea is to find a linear combination in which for 1 ≤ i ≤ k − 1
the coefficients of ai , bi and ci vanish. To this end, we introduce a set of equations whose solution will be
our desired integers αi . Observing Φ, we see that each ai appears on the left hand side of the first 2k − 2
equations. Thus, in order for the coefficient of ai to vanish in a linear combination of Φ with coefficients
α1 , . . . , α2k−1 , the following equation must hold

                       Ai :      α1 · pi1 + α2 · pi2 + . . . + α2k−3 · pik−2 + α2k−2 · pik = 0 .

Each bi appears only on the right hand side of the first k − 1 equations and on the left hand side of the
last equation. Therefore, in order for the coefficient of bi to vanish the following equation must hold
                                      i         i                   i              i
                       Bi :      1 · p1 + α2 · p2 + . . . + αk−1 · pk−1 − α2k−1 · pk+1 = 0 .

Finally, each ci appears only on the right hand side of the last k − 1 equations. Hence, in order for the
coefficient of ci to vanish the following equation must hold
                             i           i                    i              i            i
              Ci :      k · p1 + αk+1 · p2 + . . . + α2k−3 · pk−2 + α2k−2 · pk + α2k−1 · pk+1 = 0 .

     Observe, that we can write the analogous linear equations A0 , B0 and C0 that will require the co-
efficients of a0 , b0 and c0 to vanish. Though we apparently don’t need these equations, they will be
useful for the proof. In what follows we denote by ϒ the set of equations A1 , . . . , Ak−1 , B1 , . . . , Bk−1 ,

                          T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                                      194
        L INEAR E QUATIONS , A RITHMETIC P ROGRESSIONS , AND H YPERGRAPH P ROPERTY T ESTING

and C1 , . . . ,Ck−1 . The set ϒ consists of 3k − 3 homogenous linear equations in 2k − 1 unknowns
α1 , . . . , α2k−1 . Observe, however, that for 1 ≤ i ≤ k − 1,

                                                  Ai = Bi +Ci .

Therefore, ϒ is equivalent to a set of 3k − 3 − (k − 1) = 2k − 2 linear homogenous equations in 2k − 1 un-
knowns, which consists of equations Bi ,Ci . Observe also, that each of the coefficients in ϒ has absolute
value at most d k (recall that 1 ≤ p1 , . . . , pk+1 ≤ d). By Lemma 2.7, we are thus guaranteed that there are
                                                                                                                   2
integers α1 , . . . , α2k−1 not all equal to zero, whose absolute values are at most (d 2k (2k − 2))k−1 ≤ d 2d ,
such that in a linear combination of the above equations the coefficients of all the variables but a0 , b0 , c0
vanish. We now claim that in such a combination either the coefficient of b0 or the coefficient of c0 does
not vanish. An important observation is that as the integers p1 , . . . , pk+1 are distinct, the k linear equa-
tions B0 , . . . , Bk−1 that require the coefficients of b0 , . . . , bk−1 to vanish are linearly independent. Hence,
their only solution is α1 = 0, . . . , αk−1 = 0, α2k−1 = 0. Similarly, the k linear equations C0 , . . . ,Ck−1 that
require the coefficients of c0 , . . . , ck1 to vanish are linearly independent. Hence, their only solution is
αk = 0, . . . , α2k−1 = 0. Thus, if the coefficients of b0 and c0 vanish we may conclude that we must have
used a linear combination with α1 = . . . = α2k−1 = 0, which contradicts our choice.
    Note, that as in each of the equations of Φ the sum of the coefficients on the right hand side is equal
to the sum of the coefficients on the left hand side, this property must also hold in a linear combination
of Φ. Hence, there is no linear combination in which the coefficient of precisely one of a0 , b0 , c0 does
not vanish. It also follows that if the coefficients of precisely two of a0 , b0 , c0 do not vanish, then they
must be equal. However, if for example a0 = b0 , then we can “replace” b0 with a0 in the last equation
of Φ, and use the last k equations of Φ to infer that for 1 ≤ i ≤ k − 1 we have ai = ci . We would thus get
that a0 = b0 = c0 , which satisfies the requirement of the lemma. The other two cases are similar. As in
the previous paragraph we have ruled out the possibility that the coefficients of a0 , b0 and c0 vanish, the
remaining possibility is that the coefficients a0 , b0 and c0 do not vanish. In this case, we can use again
the fact that in the resultant equation the sums of the coefficients in each side are equal to infer that we
                                                                                               2
must get the required equation. Finally, as the coefficients αi are bounded by d 2d , the coefficients in the
                                                       2          2
linear combination are bounded by (2k − 1)d 2d < d 3d .

3.3   Intuition for Lemma 3.3
We give some explanation as to why, or more precisely when, Lemma 3.3 is not trivial. Consider for
simplicity the case of k = 2, that is, when T k is simply a triangle, and D is a K4 (a clique of size 4). In
this case, the lemma says that we can construct a graph on m vertices that contains |Z|2 essential copies
of K4 that are pairwise edge disjoint, and such that each triangle in the graph belongs to one of these
copies of K4 . Note, that if |Z| = 1 this statement is trivial as we can simply take a single copy of K4 in
order to construct such a graph. However, if |Z| = m1−o(1) , the lemma claims that we can construct the
following nontrivial graph: It has m vertices and |Z|2 = m2−o(1) essential copies of K4 that are pairwise
edge disjoint, such that each triangle in the graph belongs to one of these copies of K4 . As each K4
contains at most 4 triangles, this graph contains less than m2 triangles. As any triangle appears in at
most m copies of K4 such a graph has at most m3 copies of K4 . Note that any trivial such construction
(e.g. random) will contain roughly m4−o(1) copies of K4 . The fact that we can construct graphs that

                          T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                                    195
                                                 N. A LON AND A. S HAPIRA

contain many induced copies of a graph, where each two copies have at most 1 common vertex (or k − 1
vertices in the case of k-graphs) while containing relatively few copies of it, will be crucial in the proofs
of Theorem 1.2 and Theorem 1.3.

3.4    Proof of Lemma 3.3
We define a k-graph F, similar to the one used in Lemma 3.2. The vertex set of F consists of d pairwise
disjoint sets of vertices V1 , . . . ,Vd , where, with a slight abuse of notation, we think of each of these sets
as being the set of integers 1, . . . , m/d. We define the edge set of F by specifying the edge sets of |Z|k
copies of D that we put in F. In what follows we refer to the d vertices of D as integers in {1, . . . , d}.
    For every set of (not necessarily distinct) integers z0 , . . . , zk−1 ∈ Z, we add to F a copy of D that is
spanned by the vertices v1 ∈ V1 , . . . , vd ∈ Vd , where for 1 ≤ i ≤ d we choose vi = E(z0 , . . . , zk−1 , i). In
order to specify the edges of this copy, we simply regard the vertices v1 , . . . , vd as if they were the vertices
of a regular copy of D and put in the corresponding edges. Namely, for every edge (p1 , . . . , pk ) ∈ E(T ),
we put in F an edge that contains the vertices

           E(z0 , . . . , zk−1 , p1 ) ∈ Vp1 , E(z0 , . . . , zk−1 , p2 ) ∈ Vp2 , . . . , E(z0 , . . . , zk−1 , pk ) ∈ Vpk .

In what follows we denote by C(z0 , . . . , zk−1 ), the copy of D defined using the integers z0 , . . . , zk−1 . This
defines |Z|k copies of D. These |Z|k copies of D will be the essential copies of D in F in the statement
of the lemma (but we will still have to show that they are induced copies of D in F). Observe, that any
essential copy D has precisely one vertex in each of the sets V1 , . . . ,Vd . Note also, that as Z ⊆ [m/d k+2 ],
for every z0 , . . . , zk−1 and 1 ≤ i ≤ d, the function E satisfies 1 ≤ E(z0 , . . . , zk−1 , i) ≤ kd k−1 m/d k+2 ≤ m/d,
thus the vertices “fit” into the sets V1 , . . . ,Vd .
    We now turn to prove the assertions of the lemma. Let v1 , . . . , vk be k vertices that belong to one of
the essential copies of D in F. As the vertices of an essential copy belong to different sets Vi , there are
distinct integers 1 ≤ p1 , . . . , pk ≤ d, such that v1 ∈ Vp1 , . . . , vk ∈ Vpk . From the definitions of F and the
function E in (3.1), there are z0 , . . . , zk−1 , such that the following equations hold:

                         z0 + p1 z1 + p21 z2 + . . . + p1k−1 zk−1 = E(z0 , . . . , zk−1 , p1 ) = v1 ,
                                                                 ..
                                                                  .
                         z0 + pk z1 + p2k z2 + . . . + pkk−1 zk−1 = E(z0 , . . . , zk−1 , pk ) = vk .

If we view the following equations as a set of k linear equations with unknowns z0 , . . . , zk−1 , they corre-
spond to the matrix equation Ax = b, where b = {v1 , . . . , vk }, x = {z0 , . . . , zk−1 }, and Ai, j = pij−1 . As A
is an invertible Vandermonde matrix (here we use the fact that the integers pi are distinct), we conclude
that z0 , . . . , zk−1 are uniquely defined by this set of equations. Hence, they belong to precisely one of
the essential copies of D, namely, C(z0 , . . . , zk−1 ). We have thus shown that each pair of essential copies
share at most k − 1 common vertices. As F is a k-graph, the essential copies of D are in particular edge
disjoint. As by definition, every edge in D belongs to one of the essential copies of D, we conclude that
the essential copies of D in F are in fact induced. We have thus proved items (1) and (2).
    We now turn to prove item (3). Suppose v1 , . . . , vk+1 are k + 1 vertices that span a copy of T , namely,
they span r ≥ 3 edges. As any member of T k contains at least 3 edges, T is a core (recall Definition 1.5).

                            T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                                               196
        L INEAR E QUATIONS , A RITHMETIC P ROGRESSIONS , AND H YPERGRAPH P ROPERTY T ESTING

Hence, there are distinct p1 , . . . , pk+1 such that v1 ∈ Vp1 , . . . , vk+1 ∈ Vpk+1 . Suppose the r edges of T are
e1 ∈ C(z0,1 , . . . , zk−1,1 ), . . . , er ∈ C(z0,r , . . . , zk−1,r ). In order to show that each copy of T belongs to one
of the essential copies of D we may show that for 0 ≤ i ≤ k − 1 we have zi,1 = . . . = zi,r . This will mean
that the r edges belong to C(z0 , . . . , zk−1 ). For ease of notation we show that z1,1 = . . . = z1,r . The other
cases are completely identical.
    An important observation at this point is that the sub-hypergraph of F induced on Vp1 , . . . ,Vpk+1 is
precisely the k-graph S defined in Lemma 3.2 with Pd = {p1 , . . . , pk+1 }. Consider any three distinct
integers j1 , j2 , j3 ∈ {1, . . . , r}, such that z1, j1 ≤ z1, j2 ≤ z1, j3 . By Lemma 3.2, either z1, j1 = z1, j2 = z1, j3 or
                                                     2
there are positive integers β1 , β2 ≤ d 3d such that the following equation holds
                                         β1 z1, j1 + β2 z1, j3 = (β1 + β2 )z1, j2 .
     Assume first that for some choice of j1 , j2 , j3 ∈ {1, . . . , r} we have z1, j1 = z1, j2 = z1, j3 and assume for
simplicity that j1 = 1, j2 = 2, j3 = 3. Consider any other 4 ≤ j ≤ r and assume without loss of generality
                                                                                                                      2
that z1,1 ≤ z1, j ≤ z1,2 . By the above, either z1,1 = z1,2 = z1, j or there are positive integers β1 , β2 ≤ d 3d
such that β1 z1,1 + β2 z1,2 = (β1 + β2 )z1, j . However, as by assumption z1,1 = z1,2 and β1 , β2 > 0 we can
conclude that in this case we also have z1,1 = z1,2 = z1, j . We thus conclude that in this case we have
z1,1 = z1,2 = . . . = z1,r .
     Assume now that none of j1 , j2 , j3 ∈ {1, . . . , r} are such that z1, j1 = z1, j2 = z1, j3 . Suppose we rename
the integers z1,1 , . . . , z1,r such that z1,1 ≤ . . . ≤ z1,r . By Lemma 3.2, we have for every 2 ≤ i ≤ r − 1 that
                                                 2
there are positive integers βi1 , βi2 ≤ d 3d such that
                                         βi1 z1,i−1 + βi2 z1,i+1 = (βi1 + βi2 )z1,i                                    (3.2)
holds (note that by our ordering of z1,1 , . . . , z1,r we satisfy the requirement of Lemma 3.2 that ai ≤ ci ≤
                                                                  2
bi ). But this means that z1,1 , . . . , z1,r satisfy the (r, d 3d )-gadget E = {e2 , . . . , er−1 } where
                                        ei := βi1 xi−1 + βi2 xi+1 = (βi1 + βi2 )xi
                                                     2                                                          2
(it is easy to verify that this is indeed a (r, d 3d )-gadget). However, as by assumption Z is (r, d 3d )-gadget-
free, the integers z1,1 , . . . , z1,r cannot be distinct. Assume, without loss of generality, that z1,1 = z1,2 . As
z1,1 , z1,2 , z1,3 satisfy the linear equation given in (3.2) and as by assumption z1,1 = z1,2 it must be the
case that z1,3 = z1,1 = z1,2 . This contradicts our assumption that there is no triple of equal integers
z1, j1 , z1, j2 , z1, j3 .


4     Extremal Hypergraphs and Lower Bounds for P∗D
4.1    The main results of this section
Our main goal in this section is to prove the following lemma
Lemma 4.1. Let D be a fixed k-graph on d vertices, which contains a copy of T ∈ T k with r edges.
                                                    2
Suppose we can find, for every integer m, a (r, d 3d )-gadget-free subset Z ⊆ [m/d k+2 ] of size m/ f (m).
Then, for every small enough ε > 0, and every large enough integer n, there is a k-graph H on n
vertices that is ε-far from being induced D-free, and yet contains only O(nd /q(ε)) copies of D, where
q(ε) = max{m : (1/ f (m))k ≥ ε}.

                            T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                                         197
                                                     N. A LON AND A. S HAPIRA

    As we explain shortly, our intention is to apply the above lemma with a set Z for which the function
f grows as slowly as possible.
    We also need the following lemma, which follows from the canonical graph property-tester of Gol-
dreich and Trevisan in [14] (see also [3]).
Lemma 4.2. Suppose there is a k-graph on n vertices that is ε-far from satisfying P∗D (or PD ) and yet
contains O(nd /q(ε)) copies of D. Then the query complexity of any one-sided-error property-tester for
P∗D (or PD ) is Ω(q(ε)1/d ), where d is the size of D. In particular, if q(ε) is super-polynomial in 1/ε,
then so is the query complexity of any one-sided-error property-tester for P∗D (or PD ).
     As is evident from the statement of Lemma 4.2, in order to obtain a high lower bound for testing
P∗D , one would want to apply it to a k-graph H that is ε-far from satisfying P∗D and contains O(nd /q(ε))
copies of D with q growing as fast as possible. Inspecting the statement of Lemma 4.1 we see that
it supplies such a k-graph, but in this case the function f should grow as slow as possible (in some
sense q is f −1 ). Note, that one can use the output of Theorem 2.3 as an input to Lemma 4.1. Finally,
requiring f in Lemma 4.1 to grow slowly, means requiring the set Z in Theorem 2.3 to be as large as
possible. Finally, observe that we can use the number-theoretic construction of Theorem 2.3, which
supplies such a set of size n/ f (n) with f being sub-polynomial. This will give a super-polynomial q,
and thus super-polynomial lower bounds, which are our ultimate goals. In Section 5 we indeed apply
the above two lemmas, along with Lemma 3.3 and the number-theoretic construction of Theorem 2.3.
In order to prove Theorem 1.2 and Theorem 1.3. The reader can find further intuition for Lemma 4.1 in
the following subsection. The proofs of Lemma 4.1 and Lemma 4.2 appear in the following subsections.

4.2     Intuition for Lemma 4.1
Going back to the discussion following the statement of Lemma 3.3 we see that using Lemma 3.3 with a
set Z of size n1−o(1) gets us very close to the requirements of Lemma 4.2, with two important differences.
Returning to the example of K4 from Section 3.3, we see that on the one hand the k-graph of Lemma 3.3
contains at most m3 copies of K4 on m vertices, which is far better than the n4 /q(ε) copies on n vertices,
which Lemma 4.2 expects to get3 . On the other hand, however, the input k-graph to Lemma 4.2 must be
ε-far from being induced K4 -free while the k-graph in Lemma 3.3 is only m−o(1) -far from being induced
K4 -free as it contains only m2−o(1) copies of K4 . Thus, Lemma 4.1 can be viewed as a bridge between
the extremal hypergraph construction of Lemma 3.3 and the lower bounds that we can obtain using
Lemma 4.2.

4.3     Proof of Lemma 4.1
We start with a key definition used in the proof of Lemma 4.1:
Definition 4.3 (Blow-up). An s-blow-up of a k-graph T = (V (T ), E(T )) on t vertices is the k-graph
obtained from T by replacing each vertex vi ∈ V (T ) by an independent set Ii of size s, and each edge
(vi1 , . . . , vik ) ∈ E(T ), by a complete k-partite k-graph4 whose vertex classes are Ii1 , . . . , Iik .
   3 The reader should note that as K is complete there is no difference between having it as a subgraph or as an induced
                                      4
subgraph. However, this lets us keep the “intuitive” example easy to explain.
   4 A complete k-partite k-graph has as its vertex set k sets V , . . . ,V , and its edge set is {{v , . . . , v } : v ∈ V , . . . , v ∈ V }.
                                                                1          k                         1           k     1   1           k   k



                                T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                                                      198
        L INEAR E QUATIONS , A RITHMETIC P ROGRESSIONS , AND H YPERGRAPH P ROPERTY T ESTING

    Note, that if we take an s-blow-up of a k-graph T , we get a k-graph on st vertices that contains st
induced copies of T , where each vertex of the copy belongs to a different blow-up of a vertex from
T (simply pick one vertex from each independent set). We call these copies the special copies of the
blow-up. As each set of k vertices in the blow-up is contained in at most st−k special copies of T , it
follows that adding or removing an edge from the k-graph can destroy at most st−k special copies of
T . We conclude that one must add or remove at least st /st−k = sk edges from the blow-up in order to
destroy all its special copies of T .

Proof of Lemma 4.1. Given a small ε > 0, define
                                                     m = q(ε) .                                                  (4.1)
                                 2
Let Z ⊆ [m/d k+2 ] be a (r, d 3d )-gadget-free, and let F be the output of Lemma 3.3, given D, T , m and Z.
Recall that F has m vertices. Let H be an s-blow-up of F, where
                                                         j k
                                                    n          n
                                           s=             =         .                                 (4.2)
                                                 |V (F)|       m
If necessary, add some more isolated vertices to make sure that the number of vertices of H is precisely
n. Claim 4.4 and Claim 4.6 below complete the proof of this lemma.

Claim 4.4. The k-graph H defined in the proof of Lemma 4.1 is ε-far from being induced D-free.
Proof. Consider two essential copies of D in F, D1 and D2 . By item (2) in Lemma 3.3, D1 and D2 share
at most k − 1 vertices vi1 , . . . , vik−1 in F. It follows that their corresponding blow-ups in H will share at
most k − 1 independent sets Ii1 , . . . , Iik−1 , which replace the vertices vi1 , . . . , vik−1 from F. Now, consider
the blow-ups of D1 and D2 in H, denoted D1 and D2 . As D1 and D2 share at most k − 1 common
independent sets, and each of the special copies of D in D1 /D2 has precisely one vertex in each of the
independent sets that replaced the vertices of F, we get that a special copy of D in D1 and a special
copy of D in D2 share at most k − 1 vertices. Thus, adding or removing an edge from H, can either
destroy special copies of D that belong to D1 , or special copies of D that belong to D2 (or not destroy
any induced copies at all). By item (1) in Lemma 4.1 each of the essential copies of D in F is induced,
thus, as we explained above, in order to destroy all the special copies of an s-blow-up of D, one needs
to add or remove at least sk edges from the blow-up. As |Z| = m/ f (m) we have by Lemma 3.3 item (1)
that F contains mk / f k (m) essential copies of D. Therefore, H contains mk / f k (m) blow-ups of copies of
D. We may thus infer that one has to add or delete at least
                                                sk mk       nk
                                                       =         ≥ εnk                                           (4.3)
                                               f k (m)   f k (m)
edges in order to turn H into an induced D-free k-graph, where the inequality follows from our choice
of m in (4.1) and the definition of q(ε). Thus, H is ε-far from being induced D-free.

    In what follows we denote by Iv the independent set of vertices in H that replaced vertex v from F.
As H is a blow-up of F it is clear that {v1 ∈ It1 , . . . , vk ∈ Itk } is an edge in H if and only if {t1 , . . . ,tk }
is an edge in F. We remind the reader that by assumption D contains a copy of T ∈ T k , which contains
r ≥ 3 edges. We need the following simple claim:

                          T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                                      199
                                            N. A LON AND A. S HAPIRA

Claim 4.5. The number of copies of T in H is sk+1 times the number of copies of T in F.

Proof. Assume v1 ∈ It1 , . . . , vk+1 ∈ Itk+1 span a copy of T in H. As T is a core the sets It1 , . . . , Itk+1 are all
distinct. As H is a blow-up of F we get that t1 , . . . ,tk+1 span a copy of T in F. We conclude that a copy
of T in H is obtained only by picking a single vertex from each one of the k + 1 sets It1 , . . . , Itk+1 such that
t1 , . . . ,tk+1 span a copy of T in F. As H is an s-blow-up of F, we conclude that the number of copies of
T in H is precisely sk+1 times the number of copies of T in F.

Claim 4.6. The k-graph H defined in the proof of Lemma 4.1 has O(nd /q(ε)) copies of D.
                                                                                                      n
                                                                                                         
Proof. Note, that as D contains at least one copy of T , and each copy of T belongs to at most d−k−1       ≤
n d−k−1                                                               k+1
        copies of D, it is enough to show that H contains at most n /q(ε) copies (induced or not) of
T . By Claim 4.5, the number of copies of T in H is precisely sk+1 times the number of copies of T in F.
By item (3) in Lemma 3.3 each copy of T belongs to one of the essential copies of D. As each copy of
                          d
D can contain at most k+1     ≤ d k+1 copies of T , and F contains precisely mk / f k (m) essential copies of
D, we get that H contains at most

                      d k+1 · mk · sk+1 d k+1 · mk · nk+1 d k+1 · nk+1
                                       = k               ≤             = O(nk+1 /q(ε))                            (4.4)
                           f k (m)        f (m) · mk+1         m

copies of T , where the first equality is due to our choice of s in (4.2), and in the last equality we used the
definition of m in (4.1).

4.4    Proof of Lemma 4.2
We need the following result of [14], mentioned already in [3].

Lemma 4.7. ([3],[14]) If there exists an ε-tester for a graph property that makes q queries, then there
exists such an ε-tester that makes its queries by uniformly and randomly choosing a set of 2q vertices,
querying all their pairs and then accepting/rejecting
                                                     according to the graph induced by the sample. In
particular, it is a non-adaptive ε-tester making 2q
                                                  2 queries.

    Restating the above, by (at most) squaring the query complexity, we can assume without loss of gen-
erality that a property-tester works by sampling a set of vertices of size q(ε, n) and accepting/rejecting
according to the graph spanned by the set. In [14] the authors measure the query complexity of a property
tester by counting the number of edge queries. As we measure query complexity by the number of ver-
tices sampled, assuming we always query all possible edges within the sample, we infer from Lemma 4.7
that we can simply assume that the property tester first samples a set of vertices, queries about all the
edges, and then proceeds to perform some other computation. Also, the proof of Lemma 4.7 was de-
scribed in [14] for graphs, however, precisely the same argument carries over to k-graphs. We need the
following simple observations:

Claim 4.8. Suppose Q is a k-graph on q vertices containing no induced copy of some k-graph D. Then,
for any n > q there is a k-graph H on n vertices, which contains Q as an induced subgraph, and does
not contain D as an induced subgraph.

                           T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                                      200
       L INEAR E QUATIONS , A RITHMETIC P ROGRESSIONS , AND H YPERGRAPH P ROPERTY T ESTING

Proof. It is clearly enough to show that there is such a k-graph on q + 1 vertices. Let d denote the
number of vertices of D. Suppose first that D has no vertex of degree d−1   k−1 (i.e. a vertex that forms an
edge with all the other subsets  of k − 1 vertices). In this case, if we add to Q a vertex and connect it to
         q
            
all the k−1 sets of k − 1 vertices of Q, we are guaranteed that the new k-graph spans no induced copy
of D. Suppose now that D has no isolated vertex. In this case we add to Q an isolated vertex and thus
guarantee that the new k-graph spans no induced copy of D. The only case left is that D has an isolated
vertex and a vertex of degree d−1
                                k−1 , which is impossible.

    By Theorem 4.7 we can assume that a property-tester for P∗D works by inspecting a random subset
of vertices. The following claim shows that such a one-sided-error property-tester can reject an input
only if it finds an induced copy of D in the sample of vertices.
Claim 4.9. Let D be a some k-graphs, and let A be a one-sided-error tester for P∗D with query complexity
q(ε, n). If for some ε0 > 0 and n, after A samples a set of vertices of size q(ε0 , n), the k-graph induced
by the sample is induced D-free, then A must accept the input.
Proof. Fix any n and ε0 > 0 and suppose that when we execute A on a k-graph H 0 of size n with ε = ε0 ,
and the sample of vertices spans a k-graph Q (of size q(ε0 , n)) that is induced D-free, the algorithm still
rejects the input. By Claim 4.8 there is a k-graph H on n vertices that is induced D-free and contains
an induced copy of Q. Suppose we execute A on H with ε = ε0 . As H and H 0 are of the same size n,
when given H as input the algorithm samples a set of vertices of size q(ε0 , n) = |V (Q)|. As we assume
that when given Q the algorithm rejects, we get that there is a nonzero probability that A will reject H,
contradicting the assumption that it has one-sided error.
Proof of Lemma 4.2. We start with the proof of P∗D . As the algorithm is a one-sided-error algorithm, we
get from Claim 4.9 that it can report that H is not induced D-free only if it finds an induced copy of D
in it. Observe, that if the tester picks a random subset of x vertices, and an input k-graph contains only
O(nd /q(ε)) copies (induced or not) of D, then the expected number of induced copies of D spanned by
x is O(xd /q(ε)), which is o(1) unless x = Ω(q(ε)1/d ). By Markov’s inequality, unless x = Ω(q(ε)1/d ),
the tester finds an induced copy of D with negligible probability.
     The proof for PD is similar. What we need is a version of Claim 4.8 but with respect to non-induced
sub-hypergraphs. But here the proof is even simpler: If we have a k-graph Q on q vertices that has no
copy of a k-graph D, we can construct a k-graph on q + 1 vertices that contains an induced copy of Q
but no copy of D, simply by adding an isolated vertex to Q. Note, that here we assume that D has no
isolated vertices. Clearly when testing PD we may assume that this is the case, because if D0 is obtained
from D by removing an isolated vertex, then any k-graph on at least |V (D)| vertices, satisfies PD0 iff it
satisfies PD . Thus for k-graphs of size at least |V (D)| testing PD0 is equivalent to testing PD , hence it is
enough to prove a lower bound for one of them.


5     Proofs of Theorem 1.2 and Theorem 1.3
5.1   A lower bound for (almost) all k-graphs
In this section we apply the number-theoretic construction of Theorem 2.3, the construction of the ex-
tremal k-graphs of Lemma 3.3 as well as Lemma 4.1 and Lemma 4.2 in order to prove Theorem 1.2.

                         T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                               201
                                         N. A LON AND A. S HAPIRA

We first need the following claim in which we denote by D the complement of D, that is, a k-graph that
contains an edge if and only if D does not. We also call a k-graph D, strongly T k -free, if neither D nor D
contains a copy of T k .
Claim 5.1. There are no strongly T 3 -free 3-graphs on at least 7 vertices. For any k > 3, there are no
strongly T k -free k-graphs on at least k + 1 vertices.
Proof. The case of k > 3 is simple. As in this case k+1
                                                             
                                                        k      ≥ 5, on any set of k + 1 vertices either D or
D spans a copy of T k . For the case of k = 3, observe that D is strongly T 3 -free, if and only if each set
                                                                                                      7
of 4 vertices spans precisely 2 edges. Fixing any set of 7 vertices, this set must span precisely 4 2/4
edges, where we count the number of 4-sets, multiply by 2 as each 4-set by assumption spans 2 edges,
and divide by 4, because we count each edge 4 times. Since this is not an integer it is impossible. Thus,
on any set of 7 vertices either D or D spans a copy of T 3 .

Proof of Theorem 1.2. Let D be a fixed k-graph on d vertices. A simple yet crucial observation is that
for every k-graph D, testing P∗D is equivalent to testing PD∗ . Note, that this relation does not hold for
testing PD . It follows that in order to prove a lower bound for testing P∗D , we may prove a lower bound
for testing PD∗ . By Claim 5.1 all the k-graphs in the statement of Theorem 1.2 (besides some 3-graphs
on 4,5 and 6 vertices. See comment below on how to deal with them) are not strongly T k -free, hence we
may assume that D contains a copy of T ∈ T k with at least 3 edges. By Theorem     √ 2.3k+2 (with k = 2 and
         2                     2
                                                                                                      √
                                                                                             )
h = d 3d ), there is a (3, d 3d )-gadget-free set Z ⊆ m/d k+2 of size (m/d k+2 )/ec log(m/d
                                                                                         √     = m/ec log m
for an appropriate c = c(d). This means that we can use Lemma 4.1 with f (m) = ec log m . It is easy to
check that in this case q(ε) in the statement of Lemma 4.1 satisfies
                                                     c0 log 1/ε
                                                      1
                                             q(ε) ≥                ,                                   (5.1)
                                                      ε
for an appropriate constant c0 = c0 (d). By Lemma 4.1 we get a k-graph that is ε-far from being induced
D-free, and contains only O(nd /q(ε)) copies of D. By Lemma 4.2 the query complexity of any one-
sided-error property-tester for P∗D can be bounded from below by q(ε)1/d , which is (5.1), with c0 replaced
by c0 /d.

     It is worth mentioning that there are strongly T 3 -free 3-graphs on 4,5, and 6 vertices. For 4 vertices
there is a unique such 3-graph, which is D3,2 (which contains 2 edges) mentioned in the statement of
the Theorem 1.3. This is the only k-graph for which we do not know whether P∗D is easily testable. For
5 vertices, it is easy to verify that the only strongly T 3 -free 3-graph has the edges {(1, 2, 3), (2, 3, 4),
(3, 4, 5), (4, 5, 1), (5, 1, 2)}. This 3-graph is better understood if one considers the 5 vertices on a cycle,
and the edges as all triples consisting of three consecutive vertices on the cycle. In what follows we call
it D3,5 . It is easy to check that D3,5 is a hyper-cycle (see Definition 1.6), thus we can prove a version of
Lemma 4.1 (namely, constructing a k-graph, which is ε-far from being D3,5 -free and yet contains only
O(n5 /(1/ε)c log 1/ε ) copies of D3,5 ) that instead of using Lemma 3.3 and Theorem 2.3, uses Lemma 6.1
and Lemma 6.7, which are proved below. The details are very similar. For 6 vertices there are also
some 3-graphs that are strongly T 3 -free, however, every 5 vertices of such a 3-graph must span a copy
of D3,5 thus we can again use the same arguments as for D3,5 to prove that any such 3-graph is not easily
testable.

                         T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                               202
       L INEAR E QUATIONS , A RITHMETIC P ROGRESSIONS , AND H YPERGRAPH P ROPERTY T ESTING

5.2   The improved lower bound
Proof of Theorem 1.3. Observing, as in the proof of Theorem 1.2, that we may either prove a lower
bound for D or D, we recall that by Ramsey’s Theorem, for any integer k there is an integer r(k) such
that for any k-graph D on at least r(k) vertices, either D or D contains a copy of F k . Hence, we may
assume that D contains a copy of F k , which is a member of T k with k + 1 edges. By Theorem 2.3, there
                 3
is a (k + 1, d 3d )-gadget-free set Z ⊆ m/d k+2 of size
                                               0         k+2 ))1/blog 2kc               1/blog 2kc
                          |Z| ≥ (m/d k+2 )/ec (log(m/d                      = m/ec(log m)
                                                                                                     1/blog 2kc
for an appropriate c = c(d, k). This means that we can use Lemma 4.1 with f (m) = ec(log m)                       . It is
easy to check that in this case q(ε) in the statement of Lemma 4.1 satisfies
                                                 c0 (log 1/ε)blog kc
                                                 1
                                         q(ε) ≥                        ,                                          (5.2)
                                                 ε

for an appropriate constant c0 = c0 (d). By Lemma 4.1 we get a k-graph, which is ε-far from being
induced D-free, and contains only O(nd /q(ε)) copies of D. By Lemma 4.2 the query complexity of any
one-sided-error property-tester for P∗D can be bounded from below by q(ε)1/d , which is (5.2), with c0
replaced by c0 /d.

    Note, that though the statement of Theorem 1.3 states the improved lower bounds only for k-graphs
on at least r(k) vertices, it should be clear that the same lower bound also applies to any k-graph on
less than r(k) vertices such that either the k-graph or its complement spans a copy of F k . This, in
particular, applies to F k itself, thus, as mentioned after the statement of Theorem 1.3, we indeed get
an improvement on the lower bound for testing P∗F k from [17]. It is worth mentioning that if one is
willing to replace blog kc with blogdk/2 + 1ec in the statement of Theorem 1.3, then one can obtain this
slightly weaker lower bound for any k-graph on at least k + 1 vertices, instead of k-graphs on at least
r(k) vertices. One just has to note that for any set of k + 1 vertices, either the k-graph or its complement
spans at least dk/2 + 1e edges. One then proceeds as in the proof of Theorem 1.3 by taking a set Z,
                          2                                    2
which is (dk/2 + 1e, d 3d )-gadget-free instead of (k + 1, d 3d )-gadget-free.


6     More on Linear Equations and Extremal Hypergraphs
In this section we prove Theorem 1.7. Analogously to our proof technique for P∗D , the first step in the
proof of Theorem 1.7 is to show that given a hyper-cycle D = (V, E) on d vertices we can construct
a k-graph that contains many copies of D such that from each copy of D we can infer a certain linear
equation. The main idea, as in Lemma 3.2, is to give an algebraic construction of such a graph, but as
we explain below, in this case we have some additional difficulties.
    Let m be an integer, Z ⊆ [m] and D a hyper-cycle of size d, whose vertices are numbered {1, . . . , d}
as in the definition of a hyper-cycle. We define a k-graph F = F(D, Z) as follows: the vertex set of F
consists of d pairwise disjoint sets of vertices V1 , . . . ,Vd , where, with a slight abuse of notation, we think
of each of these sets as being the set of integers 1, . . . , d k+1 m. We define the edge set of F by specifying

                         T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                                        203
                                                  N. A LON AND A. S HAPIRA

the edge sets of |Z|k copies of D that we put in F. For every set of (not necessarily distinct) integers
z0 , . . . , zk−1 ∈ Z, we define a copy of D denoted C = C(z0 , . . . , zk−1 ): As the vertex set of C, we choose d
vertices v1 ∈ V1 , . . . , vd ∈ Vd , where for 1 ≤ t ≤ d we choose vt = E(z0 , . . . , zk−1 ,t), and E is the function
defined in (3.1). Note, that for any choice of z0 , . . . , zk−1 ∈ Z we have E(z0 , . . . , zk−1 ,t) ∈ [d k+1 m],
thus the vertices “fit” into the sets V1 , . . . ,Vd . As for the edges of C, we simply regard the vertices
v1 ∈ V1 , . . . , vd ∈ Vd as if they were the vertices 1, . . . , d in D, namely, if (t1 , . . . ,tk ) ∈ E(D), we put in F
an edge that contains the vertices

              E(z0 , . . . , zk−1 ,t1 ) ∈ Vt1 , E(z0 , . . . , zk−1 ,t2 ) ∈ Vt2 , . . . , E(z0 , . . . , zk−1 ,tk ) ∈ Vtk .

The main technical step in this section is the proof of the following lemma, whose role in the proof of
Theorem 1.7 is analogous to the role of Lemma 3.2 in the proof of Theorem 1.2 and Theorem 1.3.

Lemma 6.1. Let m be an arbitrary integer, Z ⊆ [m] and D a hyper-cycle on d vertices. Construct
F = F(D, Z) as above. Suppose v1 ∈ V1 , . . . , vd ∈ Vd span a copy of D, with vt playing the role of vertex t
in D. Suppose that for 1 ≤ i ≤ d −k +2 edge ei belongs to C(z0,i , . . . , zk−1,i ). Then, for every 1 ≤ j ≤ k −1
there are positive integers a1 , . . . , ad−k+1 ≤ c = c(d) such that

              a1 · z j,1 + a2 · z j,2 + . . . + ad−k+1 · z j,d−k+1 = (a1 + a2 + . . . + ad−k+1 ) · z j,d−k+2 .

   In order to apply Lemma 6.1, we need another notion of linear equations suitable for it, which we
formulate as follows:

Definition 6.2 ((k, h)-linear-free). A set of integers Z ⊆ [m] = {1, 2, . . . , m} is called (k, h)-linear-free
if for every k positive integers a1 , . . . , ak ≤ h, the only solution of the equation

                                        a1 z1 + . . . + ak zk = (a1 + . . . + ak )zk+1 ,                                      (6.1)

which uses k + 1 integers from Z satisfies z1 = z2 = . . . = zk+1 .

     In simple words, if Z is (k, h)-linear-free, then whenever a1 , . . . , ak ≤ h, the only solution to (6.1)
using integers from Z, is one of the |Z| trivial solutions. Similar to our proof technique of Theorem 1.2
and Theorem 1.3, in this case we will also need a dense (k, h)-linear-free sets of integers, with which we
will apply Lemma 6.1.
     The main difficulty in proving Lemma 6.1 is two fold; While we still have to show that we can extract
a linear combination of the integers, as we did in Lemma 3.2, we are faced with the following problem;
suppose we manage to extract a linear equation but it is of the form z1 + z2 − z3 = z4 . In Lemma 3.2
this was not an issue, as in that lemma the required equation only relates 3 integers, thus if we get an
equation of the form, say, 3a − 2b = c, we can simply “shift” 2b to the other side and get the required
equation. This is not possible in our case. The problem is even more serious; as we mentioned above
(and analogously to our proof technique for P∗D ), our ultimate goal will be to apply Lemma 6.1 with a
(k, h)-linear-free set of size m1−o(1) . However, it follows from the pigeon-hole principle that the largest
                                                                    √
size of a subset of [m] without solutions to z1 + z2 − z3 = z4 is O( m). Thus, we must make sure that all
the coefficients in the linear equation we extract are positive. One may also ask, why we cannot prove
our lower bounds for PD by using only linear equations with 3 unknowns, like we use for P∗D . The main

                             T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                                               204
         L INEAR E QUATIONS , A RITHMETIC P ROGRESSIONS , AND H YPERGRAPH P ROPERTY T ESTING

reason for that is that for P∗D we can prove a lower bound either for D or its complement, and one of
them must contain a copy of T k . For PD , however, we cannot use this reasoning and have to deal with D
itself, which does not necessarily contain a copy of T k .
     The proof of Theorem 1.7 will follow by using Lemma 6.1 together with arguments similar to those
used in the proofs of Lemma 3.3, Lemma 4.1 and Lemma 4.2. The proof Lemma 6.1 appears in the
following subsection, and the proof of Theorem 1.7 appears in Section 6.2.

6.1     Proof of Lemma 6.1
For the proof of Lemma 6.1 we need the following simple observation:

Claim 6.3. For a given set of p ≤ r distinct integers t1 , . . . ,t p bounded by r, let A be the matrix (A)i, j =
tip+1− j − (ti − 1) p+1− j (1 ≤ i, j ≤ p). Then, there is a nonzero integer vector v, all of whose entries are
bounded (in absolute value) by r2r , such that for 1 ≤ i ≤ p − 1 (Av)i = 0, while (Av) p > 0.

Proof. As the integers t1 , . . . ,t p are distinct, the Vandermonde matrix (V )i, j = tij−1 is invertible. As A can
be obtained from V by rank preserving operations, A is also invertible. By Claim 2.7 there is a nonzero
integer vector v, all of whose entries are bounded by (r2 p) p/2 ≤ (rp) p ≤ r2r , such that for 1 ≤ i ≤ p − 1
we have (Av)i = 0. As A is invertible and v is non zero, it cannot be the case that (Av) p = 0, and if
(Av) p < 0 we can replace v by −v.

      As a first step towards the proof of Lemma 6.1 we prove the following claim.

Claim 6.4. Let m be an arbitrary integer, Z ⊆ [m] and D a hyper-cycle on d vertices. Construct F =
F(D, Z) as in Lemma 6.1, and denote by i the vector (i, i2 , . . . , ik−1 ) and by zi the vector (z1,i , z2,i , . . . , zk−1,i ).
Then the following equation holds

                z1 · (2 − 1) + . . . + zd−k+1 · (d − k + 2 − d − k + 1) = zd−k+2 · (d − k + 2 − 1) .                     (6.2)

Also, for every 1 ≤ i ≤ d − k + 1 and i + 1 ≤ t ≤ i + k − 2 the following equation holds

                                        zi+1 · (t + 1 − t) − zi · (t + 1 − t) = 0 .                                      (6.3)

Proof. Let v1 ∈ V1 , . . . , vd ∈ Vd be d vertices spanning a copy of D, with vi ∈ Vi playing the role of vertex
i in D. For every 1 ≤ i ≤ d − k + 1 consider the vertices vi ∈ Vi and vi+1 ∈ Vi+1 and recall that by the
definition of a hyper-cycle they belong to ei ∈ C(z0,i , . . . , zk−1,i ). If we regard vi and vi+1 as integers we
get from the definition of F that vi = E(z0,i , . . . , zk−1,i , i) and that vi+1 = E(z0,i , . . . , zk−1,i , i + 1). From
the definition of E in (3.1) this means that (note that z0,i disappears)

            vi+1 − vi = z1,i · ((i + 1) − i) + z2,i · ((i + 1)2 − i2 ) + . . . + zk−1,i · ((i + 1)k−1 − ik−1 ) .         (6.4)

As in the statement of the claim, let the vector i denote (i, i2 , . . . , ik−1 ) and let the vector zi denote
(z1,i , z2,i , . . . , zk−1,i ). Therefore, we can write for every 1 ≤ i ≤ d − k + 1 the vector equation

                                               vi+1 − vi = zi · (i + 1 − i) .                                            (6.5)

                            T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                                           205
                                         N. A LON AND A. S HAPIRA


        z1,1 + 3z2,1 + 7z3,1 + z1,2 + 5z2,2 + 19z3,2 + z1,3 + 7z2,3 + 37z3,3 = 3z1,4 + 15z2,4 + 63z3,4
          z1,1 + 5z2,1 + 19z3,1 − z1,2 − 5z2,2 − 19z3,2                     =0
          z1,1 + 7z2,1 + 37z3,1 − z1,2 − 7z2,2 − 37z3,2                     =0
                               z1,2 + 7z2,2 + 37z3,2 − z1,3 − 7z2,3 − 37z3,3 = 0
                               z1,2 + 9z2,2 + 61z3,2 − z1,3 − 9z2,3 − 61z3,3 = 0
                                                          z1,3 + 9z2,3 + 61z3,3 = z1,4 + 9z2,4 + 61z3,4
                                                       z1,3 + 11z2,3 + 191z3,3 = z1,4 + 11z2,4 + 191z3,4

       Figure 1: The linear equations (6.2), E1,2 , E1,3 , E2,3 , E2,4 , E3,4 , E3,5 when d = 6 and k = 4.


As ed−k+2 contains the vertices vd−k+2 ∈ Vd−k+2 , . . . , vd ∈ Vd we have for every d − k + 2 ≤ i ≤ d − 1

                                       vi+1 − vi = zd−k+2 · (i + 1 − i) .                                     (6.6)

Summing (6.5) for 1 ≤ i ≤ d − k + 1 and (6.6) for d − k + 2 ≤ i ≤ d − 1 we obtain

      vd − v1 = z1 · (2 − 1) + . . . + zd−k+1 · (d − k + 2 − d − k + 1) + zd−k+2 · (d − d − k + 2) .          (6.7)

As ed−k+2 contains the vertices v1 ∈ V1 and vd ∈ Vd , we also have by the same reasoning

                                         vd − v1 = zd−k+2 · (d − 1) .                                         (6.8)

Combining (6.7) and (6.8) we obtain (6.2).
    In order to obtain the other equations, for any 1 ≤ i ≤ d − k + 1 consider edge ei and recall that
it contains the vertices i, . . . , i + k − 1. Note that for every i + 1 ≤ t ≤ i + k − 2 vertices t and t + 1
belong to both ei and ei+1 . By the same reasoning used to obtain (6.4) and (6.5) we can write for every
i+1 ≤ t ≤ i+k−2
                                              vt+1 − vt = zi · (t + 1 − t)                              (6.9)


                                         vt+1 − vt = zi+1 · (t + 1 − t)                                      (6.10)
Combining these equations we get (6.3) for every i + 1 ≤ t ≤ i + k − 2, thus completing the proof.

    For the rest of the proof let us use the following notation: for every i + 1 ≤ t ≤ i + k − 2 denote by Ei,t
the equation of (6.3). Note, that for every 1 ≤ i ≤ d − k + 1 we have k − 2 equations Ei,t . To illustrate the
main ideas of the proof the reader may want to consider the special case where d = 6 and k = 4 depicted
in Figure 1.
    We also need the following claim. For its proof, the reader may find it useful to refer to the example
given in Figure 1.

                         T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                                   206
        L INEAR E QUATIONS , A RITHMETIC P ROGRESSIONS , AND H YPERGRAPH P ROPERTY T ESTING

Claim 6.5. For every 1 ≤ i ≤ d − k + 1 there is a linear combination of (6.2) and equations
Ei,i+1 , . . . , Ei,i+k−2 with integer coefficients, in which the coefficient of zi,1 is positive, while the coef-
ficients of zi,2 , . . . , zi,k−1 vanish.

Proof. Let α1 , . . . , αk−1 denote the unknown coefficients of (6.2) and Ei,i+1 , . . . , Ei,i+k−2 , respectively, in
the linear combination, which we seek. Suppose we write k −1 linear equations e1 , . . . , ek−1 in unknowns
α1 , . . . , αk−1 , where equation ei requires the coefficient of zi,1 to vanish in the linear combination of (6.2),
Ei,i+1 , . . . , Ei,i+k−2 with coefficient α1 , . . . , αk−1 . Observing the coefficients of z1,i , . . . , zk−1,i in (6.2) and
in Ei,i+1 , . . . , Ei,i+k−2 it is easy to see that the entries of the (k − 1) × (k − 1) matrix A, whose ith row
contains equation ei satisfies the properties of Claim 6.3. We can now take the entries of the vector v,
whose existence is guaranteed by Claim 6.3, to be the required integer coefficients α1 , . . . , αk−1 .

Proof of Lemma 6.1. We first observe that (6.2) is an equation in z j,i , where for 1 ≤ j ≤ k − 1 and
1 ≤ i ≤ d − k + 1 we have z j,i on the left hand side of the equation, and for every 1 ≤ j ≤ k − 1 we have
z j,d−k+2 on the right hand side. Furthermore, all the coefficients in this equation are positive. Finally,
for every 1 ≤ j ≤ k − 1 the sum of the coefficients of z j,1 , . . . , z j,d−k+1 is equal to (d − k + 2) j − 1, which
is precisely the coefficient of z j,d−k+2 . It thus follows that (6.2) is the sum of the k − 1 equations that we
need to obtain in order to prove the lemma. In order to simplify the notation we now turn to show how
to obtain the linear equation relating z1,1 , . . . , z1,d−k+2 . The other cases are completely identical.
      To simplify the rest of the proof, when we later refer to fixing i we mean obtaining a linear equation
in which z2,i , . . . , zk−1,i do not appear, while the coefficient of z1,i is positive. Our main idea of extracting
from (6.2) the required linear equation relating z1,1 , . . . , z1,d−k+2 is the following: For 1 ≤ i ≤ d − k + 2,
equation (6.2) contains the variables z1,i , . . . , zk−1,i , while we want an equation in which only z1,i appears.
We thus need to fix i for every 1 ≤ i ≤ d − k + 2. By Claim 6.5 we know that for every 1 ≤ i ≤ d − k + 1
we can find a linear combination of (6.2) and Ei,i+1 , . . . , Ei,i+k−2 , which fixes i. The main problem is
that we need a linear combination which simultaneously fixes any i. Suppose we first use Claim 6.5 in
order to obtain a new equation, denoted E, which fixes i = 1. We would now want to reapply Claim 6.5
in order to fix i = 2. The only difficulty is that we would now want to take a linear combination of
E2,3 , . . . , E2,k with E, and not with (6.2) as taking a linear combination with (6.2) might “bring back”
z2,1 , . . . , zk−1,1 .
      However, it is easy to see that we can also find a linear combination of E2,3 , . . . , E2,k and E, which
fixes i = 2. By Claim 6.5, we know that there is a linear combination of E2,3 , . . . , E2,k and (6.2), which
fixes i = 2. Consider now the coefficients of z1,2 , . . . , zk−1,2 in equations (6.2), E1,2 , . . . , E1,k−1 and
E2,3 , . . . , E2,k . Note, that the coefficients, which appear in equations (6.2), E1,2 , . . . , E2,k−2 also appear
in equations (6.2), E2,3 , . . . , E2,k . To be more precise, the coefficients of z1,2 , . . . , zk−1,2 in equation E1,2
are precisely the coefficients of z1,2 , . . . , zk−1,2 in (6.2) and for every 3 ≤ i ≤ k − 1 the coefficients of
z1,2 , . . . , zk−1,2 in equation E1,i are precisely the coefficients of z1,2 , . . . , zk−1,2 in equation E2,i−1 . Thus, as
E is a linear combination of (6.2) and E1,2 , . . . , E2,k−1 we infer that if there is a linear combination of (6.2)
and E2,3 , . . . , E2,k , which fixes i = 2, then there must be such a linear combination of E and E2,3 , . . . , E2,k .
It is finally important to note that as z1,1 , . . . , z1,k−1 do not appear in equations E2,3 , . . . , E2,k then in the
new linear equation i = 1 remains fixed.
      Note that the above argument can be generalized to any 2 ≤ i ≤ d − k + 1 as equations
Ei,i+1 , . . . , Ei,i+k−2 do not contain the unknowns z1,p , . . . , zk−1,p for any p < i, and the coefficients

                            T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                                           207
                                             N. A LON AND A. S HAPIRA

of zi,1 , . . . , zi,k−1 appearing in (6.2) and Ei−1,i , . . . , Ei−1,i+k−3 also appear in equations (6.2) and
Ei,i+1 , . . . , Ei,i+k−2 . Hence, we can apply an iterative procedure, where in the ith step we add to (6.2)
an appropriate linear combination of equations Ei,i+1 , . . . , Ei,i+k−2 , which fixes i. Moreover, later itera-
tions of this procedure will not change the coefficients of z1,p , . . . , zk−1,p for any p < i. In particular, if
in iteration p we fixed i = p, then we will also have this property at the end of the process. We have thus
established that for 1 ≤ i ≤ d − k + 1 our process obtains in iteration i a linear combination in which for
every p ≤ i the coefficient of z1,p is positive, while the coefficients of z2,p , . . . , zk−1,p have vanished. We
now observe that as both in (6.2) and (6.3) the coefficient of zi,d−k+2 is equal to the sum of the coeffi-
cients of zi,1 , . . . , zi,d−k−1 , it must be the case that after iteration d − k + 1 the coefficient of z1,d−k+2 is
positive while the coefficients of z2,d−k+2 , . . . , zk−1,d−k+2 have vanished, thus i = d − k + 2 is also fixed.
This means that we have obtained the required equation relating z1,1 , z1,2 , . . . , z1,d−k+2 . As for the size of
the integers in this linear equation, note that the coefficients of (6.2) and (6.3) are bounded by d k ≤ d d .
As we apply the above iterative process d − k + 1 < d times, Claim 6.3 guarantees that when we are
done the coefficients are bounded by a function of d only.


Corollary 6.6. For every d, there is c = c(d) such that if we construct the k graph F in Lemma 6.1
with a (d − k + 2, c)-linear-free set Z, then F contains precisely |Z|d copies of D spanned by vertices
v1 ∈ V1 , . . . , vd ∈ Vd , with vt playing the role of vertex t in D.

Proof. The main idea is simply to show that the only such copies of D belong to the same copy of
D defined for some choice of integers z0 , . . . , zk−1 ∈ Z. Consider any copy of D spanned by vertices
v1 ∈ V1 , . . . , vd ∈ Vd , with vt playing the role of vertex t in D. Suppose for every 1 ≤ i ≤ d − k + 2 edge ei
of D belongs to C(z0,i , . . . , zk−1,i ). Lemma 6.1 guarantees that for every 1 ≤ j ≤ k − 1 there are positive
integers a1 , . . . , ad−k+1 ≤ c = c(d) such that the following equation is satisfied

             a1 · z j,1 + a2 · z j,2 + . . . + ad−k+1 · z j,d−k+1 = (a1 + a2 + . . . + ad−k+1 ) · z j,d−k+2 .

Therefore, if we use a set Z, which is (d − k + 2, c)-linear-free it must be the case that for every 1 ≤ j ≤
k − 1, we have z j,1 = z j,2 = . . . = z j,d−k+2 . To complete the proof we just have to show that we also have
z0,1 = z0,2 = . . . = z0,d−k+2 as this will imply that all the edges of D belong to the same copy defined
using z0,1 , z1,1 , . . . , zk−1,1 . To show this we observe that for every 2 ≤ t ≤ d − k + 2, vertex vt ∈ Vt is
common to both et−1 and et . This means that

                             E(z0,t−1 , . . . , zk−1,t−1 ,t) = vt = E(z0,t , . . . , zk−1,t ,t) .

As we already know that for every 1 ≤ j ≤ k − 1 we have z j,i = z j,i+1 , the above equation implies that
z0,t−1 = z0,t holds for every 2 ≤ t ≤ d − k + 2, thus completing the proof.

6.2   Proof of Theorem 1.7
Given Lemma 6.1, the proof of Theorem 1.7 follows by going along the lines of the proofs of Lemma 3.3
and Lemma 4.1, with one key difference, which we shall explain. In order to avoid repeating the
same arguments we will just sketch them, while assuming that the reader is familiar with the proofs

                          T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                                    208
        L INEAR E QUATIONS , A RITHMETIC P ROGRESSIONS , AND H YPERGRAPH P ROPERTY T ESTING

of Lemma 3.3 and Lemma 4.1. As in Lemma 4.1, we will also need a large set of integers that does
not satisfy linear equations similar to the one we extract by using Lemma 6.1. We will thus need the
following:

Lemma 6.7. For every k and h there is c = c(k, h), such that for every n, there is a (k, h)-linear-free
subset Z ⊂ [n] = {1, 2, . . . , n} of size at least
                                                             m
                                                  |Z| ≥     √        .                                          (6.11)
                                                          ec log m
   By using Behrend’s technique [7], this lemma has been proved in [4] and [9] for the case of k = 3
and arbitrary h, and in [1] for the case h = 1 and arbitrary k. As the proof of the above lemma simply
combines the ideas of [1] and [4], we do not include it here.

Proof of Theorem 1.7, sketch. To further simplify the proof, we will use c to indicate (possibly distinct)
constants that depend only on d. Let D be a fixed k-graph on d vertices, whose core L, contains a hyper-
cycle R, of size r (≤ d). Denote by ` (≤ d) the size of L and assume we rename its vertices such that a
copy of R is spanned by the first r vertices of L, with every vertex 1 ≤ i ≤ r playing the role of vertex
i of R. As in the proof of Theorem 1.2, the main idea is to apply Lemma 4.2 by constructing a k-graph
H that is ε-far from satisfying PD , and contains only nd /q(ε) copies of D, with q(ε) ≥ (1/ε)c log 1/ε . To
this end, we will first construct a k-graph F (as in Lemma 3.3), and then take an appropriate blow-up of
it (as in Lemma 4.1).
     Given ε, let m be the largest integer satisfying
                                                    √
                                                  ec log m < 1/ε .                                              (6.12)

It is easy to see that
                                                m > (1/ε)c log 1/ε .                                            (6.13)
Let Z be a (r − k + 2, c)-linear-free subset of [m]. Note, that by Lemma 6.7 we have
                                                             m
                                                  |Z| ≥     √        .
                                                          ec log m
Define a k-graph F as follows: It has ` clusters of vertices V1 , . . . ,V` of size d `+2 m each (thus, F has
`d `+2 m vertices). For each set of k integers z0 , . . . , zk−1 ∈ Z we put in a copy of L in F spanned by the
vertices v1 ∈ V1 , . . . , v` ∈ V` with vi playing the role of i, and vi = E(z0 , . . . , zk−1 , i), with the function E
define in (3.1) (note, that the vertices fit into the sets V1 , . . . ,V` ). As in Lemma 3.3 item (2), one can
easily show that we have thus defined
                                                              mk
                                                    |Z|k ≥ c√log m
                                                            e
copies of L, with each pair sharing at most k − 1 vertices. In particular these copies are edge disjoint. It
will also be important for the rest of the proof to note that the sub-hypergraph of F, which is spanned by
the first r vertices, is precisely the k-graph defined in Lemma 6.1 (with R being the hyper-cycle D in the
statement of the lemma). We thus get by Corollary 6.6 that if we took an (r − k + 2, c)-linear-free set Z,
with a sufficiently small c (in terms of d), then there are |Z|r choices of vertices v1 ∈ V1 , . . . , vr ∈ Vr such

                           T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                                      209
                                            N. A LON AND A. S HAPIRA

that v1 , . . . , vr span a copy of R with vt playing the role of vertex t or R. In what follows we call such
copies of R nice.
      Suppose we construct an n vertex k-graph H, by taking an n/(`d `+2 m) blow-up of F (recall that F
has `d `+2 m vertices).
                      √      By repeating the argument of Claim 4.4, it is not difficult to see that as√F contains
at least mk /ec log m edge disjoint copies of L, we may infer that H contains at least nk /ec log m edge-
disjoint copies of L. By our choice of m in (6.12) we get that H is ε-far from being L-free. It can be
easily shown that as L is the core of D, in this case H is also ε-far from being D-free.
      We are thus left with showing that H contains relatively few copies of D. By repeating the argument
of Claim 4.6, it can be shown that as F spans at most |Z|k nice copies of R, then H spans at most
                                               n r                 n r
                                      |Z|k                    ≤ m k
                                                                               = O(nr /m)
                                                 `d `+2 m            `d `+2 m
nice copies of R (observe that we always have r > k). Assume we prove that every copy of D spanned
by H contains a nice copy of R. It would thus follow that as each copy of R is contained in at most
   n              d−r copies of D, that H spans at most O(nd /m) copies of D. By (6.13) we would get the
 d−r ≤ n
required upper bound on the number of copies of D spanned by H.
      We thus only have to show that every copy of D spanned by H contains a nice copy of R. Given a
copy of D in H, consider the following homomorphism ϕ : V (D) → V (L): suppose v ∈ V (D) is one of
the vertices (in H) of the independent set that replaced vertex i0 ∈ Vi , then we map v to i. Note that this is
indeed a mapping from V (D) to V (L). Also, note that if (i1 , . . . , ik ) 6∈ E(L) then in F there are no edges
connecting vertices of Vi1 , . . . ,Vik . As H is a blow-up we infer that ϕ is indeed a homomorphism. As
L is by definition a sub-hypergraph of D, the mapping ϕ induces a homomorphism ϕ 0 , from L to itself.
By the minimality of L (recall Definition 1.5), we may infer that ϕ 0 is in fact an automorphism, that is
(i1 , . . . , ik ) ∈ E(L) ⇔ (ϕ 0 (i1 ), . . . , ϕ 0 (ik )) ∈ E(L). This means that ϕ 0 maps some copy of R ⊂ D to the
sub-hypergraph of L spanned by vertices 1, . . . , r. Finally, by our definition of ϕ this means that this is a
nice copy of R.


7    Proof of Theorem 1.4
We start this section with the proof of Theorem 1.4 part (i). To this end, we need the following well
known lemma of Erdős and Simonovits.
Lemma 7.1 ([10]). For every t and k, there are constants n0 = n0 (t, k) , c = c(t, k) and γ = γ(t, k) > 0
with the following properties: For every t1 , . . . ,tk ≤ t, every k-graph on n ≥ n0 vertices, which contains
                                                      ∗
δ (n) · nk > nk−γ edges, contains at least cδ (n)t nt copies of Kt1 ,...,tk , where t ∗ = t1 · . . . · tk and t = t1 +
. . . + tk .
     We comment that the proof of this lemma is described in [10] for the case t1 = . . . = tk . However,
simple modifications of the argument give the above lemma. Observe, that a k-graph, which is ε-far
from being D-free, where D = Kt1 ,...,tk , must contain at least εnk  nk−γ edges. From the above lemma
                                                  ∗
we infer that such a k-graph must contain cε t nt copies of K. Hence, as observed in [17], there is a
                                                                        ∗
one-sided-error property-tester for PD that simply samples O((1/ε)t ) sets of t vertices, and accepts iff
it finds no copy of D. By the above claim it finds a copy of D with high probability. As we now show,
we can improve this simple upper bound and show a lower bound, which is nearly tight in many cases.

                          T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                                     210
        L INEAR E QUATIONS , A RITHMETIC P ROGRESSIONS , AND H YPERGRAPH P ROPERTY T ESTING

Proof of Theorem 1.4, part (i). Let c and n0 be the constants of Lemma 7.1. Given an input k-graph H
                                                     ∗
on n > n0 vertices, the algorithm samples 10t¯/(cε t /tk ) vertices and declares H to be D-free iff it finds
no copy of D in the sub-hypergraph spanned by the set of vertices. Clearly, if H is D-free, the algorithm
accepts H with probability 1. So assume H is ε-far from being D-free. We wish to show that with high
probability the set of vertices spans a copy of D. Recall that such a k-graph must contain at least εnk
edges.
    For a vertex v denote by d(v) the degree of v, namely, the number of edges of H to which v belongs.
For a vertex v in H denote by H(v) the following (k − 1)-graph: we take all the edges to which v
belongs and remove v from them. Note that the number of edges of H(v) is precisely d(v), and that H(v)
obviously has at most n vertices. It follows from Lemma 7.1, that for some fixed γ > 0, if d(v) > nk−1−γ ,
then H(v) contains at least
                                                      ∗
                                                 d(v) t /tk t 0
                                               
                                              c k−1         n
                                                 n
copies of the (k − 1)-partite (k − 1)-graph Kt1 ,...,tk−1 , where t 0 = t¯ −tk = t1 + . . . +tk−1 . On the other hand,
if d(v) < nk−1−γ , then it might be the case that H(v) contains no copies of Kt1 ,...,tk−1 at all. In any case,
however, H(v) contains at least
                                                 ∗           t ∗ /tk !
                                           d(v) t /tk
                                         
                                                                1              0
                                     c                      − γ             nt                                    (7.1)
                                           nk−1                n

copies of Kt1 ,...,tk−1 . Hence, all vertices v belong to at least this many copies of the k-partite k-graph
K = Kt1 ,...,tk−1 ,1 , where v plays the role of the single vertex in the last vertex class of K. Suppose we
sample t¯ vertices uniformly at random from H. Let Xv be an indicator random variable for the event that
these vertices form a copy of K along with vertex v, such that v plays the role of the single vertex in the
last vertex class of K. By (7.1),
                                                             ∗         t ∗ /tk !
                                                        d(v) t /tk
                                                      
                                                                          1
                              Pr[Xv = 1] ≥ max 0, c k−1            −c γ              .
                                                        n                n

Define X = ∑v Xv . The expectation of |X| thus satisfies
                                                                         t ∗ /tk
                                                                   d(v)
                 E(|X|) = ∑ Pr[Xv = 1] ≥ c ∑                                         − c ∑ n−γt /tk
                                                                                                   ∗


                              v                            v       nk−1                    v
                                                t ∗ /tk
                                      ∑v d(v)
                                  
                                                                         ∗                     ∗       ∗
                          ≥ cn                             − cn1−γt /tk ≥ cn(kε)t /tk − o(n) ≥ 2cnε t /tk ,
                                        nk
where in the second inequality we have applied Jensen’s inequality to the first summation, and in the
third we have used the fact that H must contain at least εnk edges. Observing that |X| ≤ n, we conclude
that
                               ∗                    ∗                        ∗
                         2cnε t /tk ≤ E(|X|) ≤ cnε t /tk + n · Pr[|X| ≥ cnε t /tk ] .
Therefore,
                                                                     ∗                 ∗
                                                Pr[|X| ≥ cnε t /tk ] ≥ cε t /tk .

                          T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                                      211
                                             N. A LON AND A. S HAPIRA

                                                              ∗
Hence, by Markov’s inequality, after sampling 10/cε t /tk sets of t 0 vertices, with probability at least 9/10
                                                                                        ∗
we find at least one set of t 0 vertices, which forms a copy of K with at least cnε t /tk of the vertices of
H. After finding this set of t 0 vertices, all we need is tk vertices that form a copy of K with this set of
                                                                                                  ∗
vertices, as together they would form a copy of D. By assumption, there are at least cnε t /tk vertices
                                                                                                           ∗
that form a copy of K with the set of t 0 vertices. By Markov’s inequality, after sampling 10tk /(cε t /tk )
vertices, with probability at least 9/10 we find the required set of tk vertices. In total, we sampled
                          ∗
10(t1 + . . . + tk )/(cε t /tk ) vertices, as needed.

Proof of Theorem 1.4, part (ii). Consider the random k-graph H(n, 2kk ε), that is, a k-graph on n vertices,
where each set of k vertices forms an edge randomly                                                              k
                                                      k    n
                                                             and kindependently with probability 2k ε. The
expected number of edges in H is obviously 2k ε k ≥ 2εn , hence, by a standard Chernoff bound, the
                                                                                                                         k
number of edges in H is at least 23 εnk with probability at least 3/4 (in fact, the probability is 1 − 2−Θ(n )
but we do not need this stronger estimate here). As by Lemma 7.1, every k-graph with ε2 nk edges contains
a copy of D, we get that with probability at least 3/4 H is ε-far from being D-free.
    Fix a set of d = t1 + . . . +tk vertices, where ti is the number of vertices of D in its vertex-class number
i. The number of ways to partition this set into k subsets        ofk sizes  ti is at most d!. The probability that
                                                              t∗
any of these partitions spans a copy of D is at most |E|          (2k ε)|E| , where t ∗ = t1 · . . . · tk . Therefore, the
expected number of copies of D in H(n, 2kk ε) is at most
                                        ∗
                                       n        t
                                           d!      2kk ε |E| ≤ nd (t ∗ 2kk ε)|E| .
                                       d       |E|
By Markov’s inequality, the probability that the number of copies of D is 4 times its expectation is at
most 1/4. We conclude that there is a k-graph, which is both ε-far from being D-free, and yet contains
less than
                                          4nd /(1/t ∗ 2kk ε)|E|
copies of D. By Lemma 4.2, the query complexity of a one-sided-error property-tester for PD is
Ω (1/ε)|E|/d .


8    Concluding Remarks and Open Problems
    • The most interesting problem related to this paper is to give a complete characterization of the
      k-graphs D for which PD is easily testable. We believe that the techniques presented in this paper
      should be useful in resolving this problem. It is known that for k = 2, property PD is easily testable
      iff D is bipartite. It seems likely that the “right” characterization is that for larger k, property PD is
      easily testable iff D is k-partite. Using Theorem 1.2, we can rule out the possibility of extending
      the characterization of k = 2 to, “PD is easily testable iff D is 2-colorable.” Indeed, note that for
      k > 2, F k , the complete k-graph on k + 1-vertices, is 2-colorable. On the other hand, as PF k is
      equivalent to P∗F k , we get from Theorem 1.2 that PF k is not easily testable.

    • In light of Theorem 1.2 one may hope to show that the only k-graphs D, for which P∗D is easily
      testable are the single k-edges. This, however, is false. As shown in [4], when k = 2 and D is a path
      of length 2, property P∗D has a one-sided-error tester, whose query complexity is O(log(1/ε)/ε).

                           T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                                       212
     L INEAR E QUATIONS , A RITHMETIC P ROGRESSIONS , AND H YPERGRAPH P ROPERTY T ESTING

    It would thus be interesting to decide if for D = D3,2 (see Theorem 1.2), the property P∗D is easily
    testable.

  • It would also be very interesting to improve the lower bounds obtained in Theorem 1.3. It should
    be noted that using our techniques, one cannot obtain lower bounds that match the current upper
    bounds. For example, the best known upper bound for testing P∗D , for D being a triangle, has
    query complexity that is a tower of exponents of height polynomial in 1/ε. As is evident from the
    statement of Lemma 4.1, in order to prove a matching lower bound using our methods, one would
    have to use an (3, h)-gadget-free subset of the first m integers of size Ω(m/ log∗ m) (and observe
    that such a set contains no 3-term arithmetic progressions).  However, by a result of Bourgain [8],
    every subset of the first m integers of size Ω m/ log m/ log log m contains a 3-term arithmetic
                                                       p

    progression. Thus, the best lower bound one might hope to prove using these techniques is roughly
               2
    2log(1/ε)/ε , which is very far from the current upper bound. Also, any one-sided-error property-
                                                          2
    tester for PK3 = P∗K3 with query complexity 2O((1/ε) ) would imply an improvement of Bourgain’s
    result.



References
[1] * N. A LON: Testing subgraphs in large graphs. Random Structures and Algorithms, 21:359–370,
    2002. Also, Proc. 42nd IEEE FOCS, IEEE (2001), 434–441. [FOCS:10.1109/SFCS.2001.959918,
    RSA:10.1002/rsa.10056]. 1.2, 1.3, 1.4, 6.2

[2] * N. A LON , R. A. D UKE , H. L EFMANN , V. R ÖDL , AND R. Y USTER: The algorithmic aspects of
    the regularity lemma. J. of Algorithms, 16:80–109, 1994. Also, Proc. 33rd IEEE FOCS, Pittsburgh,
    IEEE (1992), 473-481. [FOCS:10.1109/SFCS.1992.267804, JAlg:10.1006/jagm.1994.1005]. 1.2

[3] * N. A LON , E. F ISCHER , M. K RIVELEVICH , AND M. S ZEGEDY: Efficient testing of large graphs.
    Combinatorica, 20:451–476, 2000. Also, Proc. of 40th FOCS, New York, NY, IEEE (1999), 656–
    666. [Combinatorica:mwapje2fdyk7ma2e]. 1.2, 4.1, 4.4, 4.7

[4] * N. A LON AND A. S HAPIRA: A characterization of easily testable induced subgraphs. In Proc.
    of the 15th Annual ACM-SIAM SODA, pp. 935–944. ACM Press, 2004. Combinatorics, Probability
    and Computing, to appear. 1.2, 1.4, 1.5, 2.1, 6.2, 8

[5] * N. A LON AND A. S HAPIRA: Testing subgraphs in directed graphs. JCSS, 69:354–
    382, 2004.     Also, Proc. of the 35th STOC, 2003, 700–709. [STOC:780542.780644,
    JCSS:10.1016/j.jcss.2004.04.008]. 1.2, 1.2, 1.4

[6] * N. A LON AND A. S HAPIRA: On an extremal hypergraph problem of Brown, Erdős and Sós.
    Combinatorica, to appear, 2005. 1.4

[7] * F. A. B EHREND: On sets of integers which contain no three terms in arithmetic progression.
    Proc. of National Academy of Sciences USA, 32:331–332, 1946. 1.4, 6.2

                      T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                           213
                                      N. A LON AND A. S HAPIRA

 [8] * J. B OURGAIN: On triples in arithmetic progression. Geom. and Funct. Anal., 9:968–984, 1999.
     [GAFA:7wqhpp8fnnuk388g]. 8

 [9] * P. E RD ŐS , P. F RANKL , AND V. R ÖDL: The asymptotic number of graphs not containing a fixed
     subgraph and a problem for hypergraphs having no exponent. Graphs and Combin., 2:113–121,
     1986. 2.1, 6.2

[10] * P. E RD ŐS AND M. S IMONOVITS: Supersaturated graphs and hypergraphs. Combinatorica,
     3:181–192, 1983. 7.1, 7

[11] * E. F ISCHER: The art of uninformed decisions: A primer to property testing. The Computational
     Complexity Column of The Bulletin of the European Association for Theoretical Computer Science,
     75:97–126, 2001. 1.1

[12] * P. F RANKL AND V. R ÖDL: Extremal problems on set systems. Random Struct. Algorithms,
     20:131–164, 2002. [RSA:10.1002/rsa.10017]. 1.2, 1.4

[13] * O. G OLDREICH , S. G OLDWASSER , AND D. RON: Property testing and its connection to learning
     and approximation. JACM, 45:653–750, 1998. Also, Proc. of 37th Annual IEEE FOCS, (1996),
     339–348. [FOCS:10.1109/SFCS.1996.548493, JACM:10.1145/285055.285060]. 1.1

[14] * O. G OLDREICH AND L. T REVISAN: Three theorems regarding testing graph properties. Random
     Structures and Algorithms, 23:23–57, 2003. Also, Proc. 42nd IEEE FOCS, IEEE (2001), 460-469.
     [FOCS:10.1109/SFCS.2001.959922, RSA:10.1002/rsa.10078]. 4.1, 4.4, 4.7, 4.4

[15] * W. T. G OWERS: Hypergraph regularity and the multidimensional Szemerédi theorem.
     Manuscript, 2004. 1.2

[16] * I. S. G RADSHTEYN AND I. M. RYZHIK: Tables of Integrals, Series, and Products. Academic
     Press, 2000. 2.2

[17] * Y. KOHAYAKAWA , B. NAGLE , AND V. R ÖDL: Efficient testing of hypergraphs. In Proc. of 29th
     ICALP, pp. 1017–1028, 2002. [ICALP:4wa2tdcqx50avb08]. 1.2, 1.2, 1.3, 1.3, 1.4, 5.2, 7

[18] * I. L ABA AND M. L ACEY: On sets of integers not containing long arithmetic progressions.
     Manuscript, 2004. [arXiv:math/0108155]. 1.4, 2.3

[19] * B. NAGLE AND V. R ÖDL: Regularity properties for triple systems. Random Structures and
     Algorithms, 23:264–332, 2003. [RSA:10.1002/rsa.10094]. 1.2

[20] * B. NAGLE , V. R ÖDL , AND M. S CHACHT: The counting lemma for regular k-uniform hype-
     graphs. Random Structures and Algorithms, to appear. 1.2

[21] * R. A. R ANKIN: Sets of integers containing not more than a given number of terms in arithmetical
     progression. Proc. Roy. Soc. Edinburgh Sect. A, 65:332–344, 1962. 1.4

[22] * V. R ÖDL AND J. S KOKAN: Regularity lemma for k-uniform hypergraphs. Random Structures
     and Algorithms, 25:1–42, 2004. [RSA:10.1002/rsa.20017]. 1.2

                       T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                         214
       L INEAR E QUATIONS , A RITHMETIC P ROGRESSIONS , AND H YPERGRAPH P ROPERTY T ESTING

[23] * D. RON: Property testing. Handbook of Randomized Computing, 2:597–649, 2001. 1.1

[24] * R. RUBINFELD AND M. S UDAN:           Robust characterization of polynomials with
     applications to program testing.    SIAM J. on Computing, 25:252–271, 1996.
     [SICOMP:10.1137/S0097539793255151]. 1.1

[25] * I. Z. RUZSA AND E. S ZEMER ÉDI: Triple systems with no six points carrying three triangles. In
     Combinatorics (Keszthely, 1976), pp. 939–945. Coll. Math. Soc. J. Bolyai, 1976. 1.4

[26] * A. S HAPIRA: Behrend-type constructions for sets of linear equations. Acta Arithmetica, to
     appear. 1.4

[27] * E. S ZEMER ÉDI: Regular partitions of graphs. In Proc. Colloque Inter. CNRS (J. C. Bermond,
     J. C. Fournier, M. Las Vergnas and D. Sotteau, eds.), pp. 399–401, 1978. 1.2


AUTHORS

      Noga Alon [About the author]
      professor
      Schools of Mathematics and Computer Science
      Tel Aviv University, Tel Aviv, Israel
      nogaa tau ac il
      http://www.cs.tau.ac.il/~nogaa


      Asaf Shapira [About the author]
      PhD Candidate
      School of Computer Science
      Tel Aviv University, Tel Aviv, Israel
      asafico tau ac il
      http://www.cs.tau.ac.il/~asafico




                       T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                        215
                                   N. A LON AND A. S HAPIRA

ABOUT THE AUTHORS

   N OGA A LON received his Ph. D. in Mathematics at the Hebrew University of Jerusalem
      under the supervision of Micha Perles. He is a Baumritter Professor of Mathematics
      and Computer Science at Tel Aviv University, and visits frequently the Institute for Ad-
      vanced Study in Princeton. He works in Combinatorics, Graph Theory and their appli-
      cations to Theoretical Computer Science, focusing on combinatorial algorithms, com-
      binatorial geometry, combinatorial number theory, algebraic and probabilistic methods
      in Combinatorics, and has also been working on Circuit Complexity, Streaming algo-
      rithms and topological methods in Combinatorics. He published more than three hun-
      dred and fifty research papers, and one book: The Probabilistic Method, coauthored
      by Joel Spencer. He is a member of the Israel National Academy of Sciences, and re-
      ceived the Erdős prize, the Feher prize, the Polya Prize, the Bruno Memorial Award,
      the Landau Prize and the Gödel Prize. Although he is not Hungarian, he likes Extremal
      problems, has collaborated with 28 Hungarian coauthors, his favorite function is log∗ ,
      which appears in 11 of his papers, and he is, at the moment, one of the Erdős number
      record holders, see Erdős number records. He is married to Nurit and has three daugh-
      ters; Nilli (who has written her first paper at the age of 5), Natalie and Narkis. More
      details can be found at Noga Alon’s Home Page.


   A SAF S HAPIRA is a Ph. D. candidate at the School of Computer Science, Tel Aviv Univer-
      sity. His main areas of research at present are property-testing and extremal problems in
      graph theory. He is a recipient of the Clore Foundation Ph. D. Fellowship and the IBM
      Ph. D. Fellowship. Thanks to his joint papers with his advisor, he holds a (perhaps by
      now optimal) Erdős number 2. He is also an avid traveler, skier and gourmand. More
      details can be found at his Home Page.




                    T HEORY OF C OMPUTING, Volume 1 (2005), pp. 177–216                           216