Sunflowers and Robust Sunflowers from Randomness Extractors

Authors Xin Li, Shachar Lovett, Jiapeng Zhang,
Plaintext
                           T HEORY OF C OMPUTING, Volume 18 (2), 2022, pp. 1–18
                                        www.theoryofcomputing.org

                                   S PECIAL ISSUE : RANDOM 2018




 Sunflowers and Robust Sunflowers from
        Randomness Extractors
                     Xin Li∗              Shachar Lovett†           Jiapeng Zhang
                Received September 19, 2018; Revised May 15, 2021; Published January 31, 2022




       Abstract. The Erdős–Rado Sunflower Theorem (J. London Math. Soc. 1960) is a
       fundamental result in combinatorics, and the corresponding Sunflower Conjecture
       is a central open problem. Motivated by applications in complexity theory, Rossman
       (FOCS 2010) extended the result to robust sunflowers, where similar conjectures
       emerge about the optimal parameters for which it is expected to hold.
           We exhibit a surprising connection between the existence of sunflowers and
       robust sunflowers in sufficiently large families of sets and the problem of constructing
       certain randomness extractors. This allows us to rederive the known results in a
       systematic manner. Our techniques have also motivated significant subsequent
       progress on the Sunflower Conjecture by Ryan Alweiss, Shachar Lovett, Kewen Wu,
       and Jiapeng Zhang (STOC’20 and Annals of Math. 2021).

    A preliminary version of this paper appeared in the Proceedings of the 22nd International Conference on
Randomization and Computation (RANDOM’18) [19].
  ∗ Research supported by NSF award CCF-1617713
  † Research supported by NSF CCF-1614023




ACM Classification: F.1.3, G.2.1
AMS Classification: 05D10, 68Q17, 68Q25
Key words and phrases: CNF-DNF formulas, extractors, combinatorics, sunflowers


© 2022 Xin Li, Shachar Lovett, Jiapeng Zhang
c b Licensed under a Creative Commons Attribution License (CC-BY)                DOI: 10.4086/toc.2022.v018a002
                                     X IN L I , S HACHAR L OVETT, J IAPENG Z HANG

1     Introduction
Let F be a family of sets from some universe 𝑋. A common theme and extensively studied
phenomenon in combinatorics is the following: if the cardinality of F (when F is finite) or
the density of F (when F is infinite) is large enough, then some nice patterns will occur in
F. Well-known examples of this kind include (1) Szemerédi’s theorem [28], which asserts
that all subsets of the natural numbers of positive density contain arbitrarily long arithmetic
progressions; (2) Ramsey’s Theorem [12], which asserts that if one colors the edges of a large
enough complete graph with a finite number of colors, then there must exist a monochromatic
clique of a certain size; and (3) the Erdős–Rado Sunflower Theorem [11], which asserts that a
large enough family of sets of bounded size must contain a large sunflower.1
    The study of these problems has resulted in many important tools (e.g., Szemerédi’s
Regularity Lemma [29] and the probabilistic method), which have found applications not
only in combinatorics, or mathematics more broadly, but also in theoretical computer science
(TCS). Conversely, ideas from TCS have influenced related research in combinatorics quite
often. For example, the first two problems we mentioned above, Szemerédi’s Theorem and
Ramsey’s Theorem, are intimately connected to the area of pseudorandomness in TCS. Indeed,
by constructing a certain sparse pseudorandom subset of natural numbers and proving an
appropriate Szemerédi-type theorem with respect to that subset, a celebrated result of Green
and Tao [15] shows that prime numbers contain arbitrarily long arithmetic progressions. As for
Ramsey’s Theorem, a recent line of work on randomness extractors [3–5, 7, 17, 18] gives strongly
explicit2 constructions of Ramsey graphs that get close to the probabilistic bound [10].
    In this paper we study the Sunflower Theorem and its related variants. We show that
again there is an intimate connection to randomness extractors. In fact, using techniques
from randomness extractors, we build a general proof framework that can unify the Sunflower
Theorem and its variant known as the Robust Sunflower Lemma [27].

Sunflowers. A family is a set of sets. (Repeated sets are not permitted in counting the size of a
family.) A family F is 𝑤-uniform if every element of F has size 𝑤. An 𝑟-sunflower is defined to be
a family of 𝑟 sets such that the intersection of any two sets in the family is the same set (which
can be the empty set). Choose any 𝑤-uniform family F, the main question of interest is, how
large F needs to be in order to ensure that there is an 𝑟-sunflower in F. Erdős and Rado proved
the following theorem.
Theorem 1.1 (Sunflower Theorm, Erdős–Rado [11]). Let F be a 𝑤-uniform family. If |F| > 𝑤!(𝑟 − 1)𝑤
then F contains an 𝑟-sunflower.
    They also conjectured that the bound on |F| can be replaced by 𝑐 𝑟𝑤 where 𝑐 𝑟 is a constant that
only depends on 𝑟 for every 𝑟 > 0. This conjecture is one of the most well-known open problems
in combinatorics, which remains open today despite extensive research.
    1A sunflower is a family of sets whose pairwise intersection is constant, which we will formally define shortly.
    2A class of graphs is strongly explicit if the vertices of each graph 𝐺 = (𝑉 , 𝐸) in the class are encoded by (0, 1)-strings
of length 𝑂(log |𝑉 |) and a polynomial-time algorithm 𝑃(|𝑉 |, 𝑖, 𝑗) decides adjacency of any pair (𝑖, 𝑗) of vertices of 𝐺
in polynomial (i. e., polylog(|𝑉 |)) time.


                            T HEORY OF C OMPUTING, Volume 18 (2), 2022, pp. 1–18                                              2
              S UNFLOWERS AND ROBUST S UNFLOWERS FROM R ANDOMNESS E XTRACTORS

    The Sunflower Theorem has applications in TCS, such as in the proof of strong lower
bounds for monotone circuits [26]. In addition, Alon, Shpilka and Umans [1] related the
Sunflower Conjecture and its variants to possible approaches to fast matrix multiplication.
Recently, following the breakthrough proof of the Cap-set Conjecture [8, 9], a weaker version
of the Sunflower Conjecture was also proved using the same method [24] (concretely, the
Erdős–Szemerédi Sunflower Conjecture for 𝑤 = 3). However, the general conjecture remains
open.

Robust sunflowers. Motivated by the applications of sunflowers to proving monotone circuit
lower bounds, robust sunflowers were introduced by Rossman [27] under the name “quasi-
sunflowers” to prove monotone circuit lower bounds for the 𝑘-clique problem on random graphs.
We denote by P(𝑋) the family of all subsets of a finite set 𝑋.

Definition 1.2 (Robust sunflower [27]). Let 𝑋 be a finite set and S ⊆ P(𝑋) a family of size
|S| ≥ 2. Denote 𝑌 =   𝑇. Following Rossman [27], for 𝑝, 𝛾 ∈ [0, 1], we say that S is a (𝑝, 𝛾)-robust
                    Ñ
                      𝑇∈S
sunflower if for a random set 𝑊 ⊆ 𝑋, with each element of 𝑋 present in 𝑊 independently with
probability 𝑝, it holds that
                                Pr [∃𝑇 ∈ S, (𝑇 \ 𝑌) ⊆ 𝑊] ≥ 1 − 𝛾.

    As Rossman explains, the robust sunflower concept is a relaxation of the sunflower concept
where petals may overlap in a limited way. Indeed, if F is a 𝑤-uniform sunflower then for every
𝑝 ∈ [0, 1], F is a (𝑝, 𝛾)-robust sunflower with 𝛾 = exp(−|F|𝑝 𝑤 ) [27, Remark 13].
    In the same paper, Rossman also proved the following lemma, which says there is always a
robust sunflower in a large family.

Lemma 1.3 (Robust Sunflower Lemma, Rossman [27]). Let F be a 𝑤-uniform family. If

                                       |F| ≥ 𝑤! · (1.71 log(1/𝛾)/𝑝)𝑤 ,

then3 F contains a (𝑝, 𝛾)-robust sunflower.

    Besides the original application, Rossman’s Robust Sunflower Lemma was also used by
Gopalan, Meka and Reingold [14] to study the problem of DNF sparsification. Given a DNF
formula 𝑓 on 𝑛 variables, there are two natural ways to measure the complexity of 𝑓 : the
number of clauses (also called size) 𝑠( 𝑓 ), and the maximum width of a clause 𝑤( 𝑓 ). It is easy to
show that any DNF of small size can be approximated well by another DNF of small width, by
truncating clauses of large width. Gopalan et al. [14] used Rossman’s Robust Sunflower Lemma
to show the reverse direction, that any DNF with small width can also be approximated well by
another DNF with small size. In particular, they showed that any width-𝑤 DNF formula can
be 𝜀-approximated by another DNF formula with size at most (𝑤 log(1/𝜀))𝑂(𝑤) . This kind of
sparsification has applications in constructing pseudorandom generators and approximately
counting the number of satisfying assignments for DNF formulas.
  3Throughout this paper, “log” refers to base-2 logarithms.


                        T HEORY OF C OMPUTING, Volume 18 (2), 2022, pp. 1–18                      3
                              X IN L I , S HACHAR L OVETT, J IAPENG Z HANG

   Similarly to the Sunflower Conjecture, one can also ask whether the bound on |F| in the
Robust Sunflower Lemma can be improved. This question was further studied by Alweiss,
Lovett, Wu and Zhang [2] and Rao [25]. We discuss further details in Section 1.4.

1.1     Our contribution
We provide a general framework to prove both the Sunflower Theorem and the Robust Sunflower
Lemma. In fact, we reduce both of these problems to the construction of a certain type of
randomness extractors. To state our results, we first formally define the notions that are going
to be used in our extractors.
    A random variable 𝑋, taking values in a finite set Σ, defines a probability measure on Σ,
namely, for every subset Δ ⊆ Σ we have the measure 𝜇(Δ) = Pr(𝑋 ∈ Δ). Risking some confusion,
we shall also refer to 𝑋 as a “distribution over Σ,” reducing the meaning of 𝑋 to the probability
measure it defines on Σ. We refer to Σ as the sample space. The support of 𝑋, denoted Supp(𝑋), is
the smallest subset Δ ⊆ Σ such that Pr(𝑋 ∈ Δ) = 1.

Definition 1.4. Let 𝑋 be a distribution over a finite sample space Σ. The min-entropy of 𝑋 is
defined as                                                  
                                                        1
                              H∞ (𝑋) = min log                   .
                                        𝑥∈Σ         Pr[𝑋 = 𝑥]
A distribution over {0, 1} 𝑛 is said to be an (𝑛, 𝑘) source if the distribution has min-entropy at
least 𝑘.

Definition 1.5 (Block min-entropy source). Let 𝑋1 , . . . , 𝑋𝑚 be random variables with sample
space {0, 1} 𝑛 . Consider the random variable 𝑋 = (𝑋1 , . . . , 𝑋𝑚 ) (with sample space {0, 1} 𝑛𝑚 ). The
distribution 𝑋 is an (𝑚, 𝑛, 𝑘) block min-entropy source if for every non-empty subset 𝑆 ⊆ [𝑚], the
joint distribution of (𝑋𝑖 : 𝑖 ∈ 𝑆) has min-entropy at least 𝑘|𝑆|.

   We note that the definition of block min-entropy sources was initiated in [13] as a tool to
prove lifting theorems in communication complexity.

Definition 1.6 (Block min-entropy extractor). A function 𝐸 : {0, 1} 𝑛 × {0, 1} 𝑠 → {0, 1} 𝑑 is a
(𝑘, 𝜀, 𝑑, 𝑠) block min-entropy extractor if for any 𝑚, 𝑛 ∈ N and any (𝑚, 𝑛, 𝑘) block min-entropy
source 𝑋 = (𝑋1 , . . . , 𝑋𝑚 ), it holds that

                                  (𝐸(𝑋1 , 𝑅 1 ), . . . , 𝐸(𝑋𝑚 , 𝑅 𝑚 )) ≈𝜀 𝑈 𝑑𝑚 .

Here, each 𝑅 𝑖 ∈ {0, 1} 𝑠 is an independent uniform random string, 𝑈 𝑑𝑚 is the uniform distribution
over (𝑑𝑚)-bit strings, and ≈𝜀 means 𝜀-close in statistical distance. If in addition it holds that

                           (𝐸(𝑋1 , 𝑅 1 ), 𝑅 1 , . . . , 𝐸(𝑋𝑚 , 𝑅 𝑚 ), 𝑅 𝑚 ) ≈𝜀 𝑈(𝑑+𝑠)𝑚 ,

then we say that the function 𝐸 is a strong (𝑘, 𝜀, 𝑑, 𝑠) block min-entropy extractor.

      We also define a weaker object called a disperser.

                       T HEORY OF C OMPUTING, Volume 18 (2), 2022, pp. 1–18                           4
                S UNFLOWERS AND ROBUST S UNFLOWERS FROM R ANDOMNESS E XTRACTORS

Definition 1.7 (Block min-entropy disperser). A function

                                              𝐸 : {0, 1} 𝑛 × {0, 1} 𝑠 → {0, 1} 𝑑

is a (𝑘, 𝜀, 𝑑, 𝑠) block min-entropy disperser if for any 𝑚, 𝑛 ∈ N and any (𝑚, 𝑛, 𝑘) block min-entropy
source 𝑋 = (𝑋1 , . . . , 𝑋𝑚 ), it holds that

                               | Supp(𝐸(𝑋1 , 𝑅 1 ), . . . , 𝐸(𝑋𝑚 , 𝑅 𝑚 ))| ≥ (1 − 𝜀)2𝑑𝑚 .

Here, each 𝑅 𝑖 ∈ {0, 1} 𝑠 is an independent uniform random string. If in addition there exists at
least one way to fix 𝑅 1 = 𝑟1 , . . . , 𝑅 𝑚 = 𝑟𝑚 such that

                                | Supp(𝐸(𝑋1 , 𝑟1 ), . . . , 𝐸(𝑋𝑚 , 𝑟𝑚 ))| ≥ (1 − 𝜀)2𝑑𝑚 ,

then we say that the function 𝐸 is a strong (𝑘, 𝜀, 𝑑, 𝑠) block min-entropy disperser.

   In this paper, we make connections between the block min-entropy disperser and sunflower
and robust sunflower structures. Formally, we prove the following result.

Theorem 1.8. Suppose that there exists a strong (𝑘, 0, 𝑑, 𝑠) block min-entropy disperser, 𝐸 : {0, 1} 𝑛 ×
{0, 1} 𝑠 → {0, 1} 𝑑 for any (𝑤, 𝑛, 𝑘) block min-entropy source. Then the following holds.
    Let F be a 𝑤-uniform family. Assume that |F| ≥ 2(𝑘+2)𝑤 . Then:

   (i) F contains a 2𝑑 -sunflower.
                                     𝑑
                                          
  (ii) F contains a 𝑝, 𝑤(1 − 𝑝)2 -robust sunflower.

   Observe that the seed length 𝑠 of the extractor does not play a part in the conclusion of
Theorem 1.8. We then show that we can construct strong zero-error block min-entropy extractors.
As a corollary, we are able to construct strong block min-entropy dispersers. Specifically, we
have the following result.

Theorem 1.9. There is a constant 𝑐 > 1 such that for any 𝑚, 𝑛, 𝑘 ∈ N with 𝑘 ≥ 𝑐 log 𝑚, we have:

    • There is an explicit4 strong (𝑘, 𝜀, 𝑑, 𝑠) block min-entropy extractor 𝐸 : {0, 1} 𝑛 × {0, 1} 𝑠 → {0, 1} 𝑑
      for (𝑚, 𝑛, 𝑘) block min-entropy sources, where 𝑠 = 𝑛, 𝑑 = 𝑘/𝑐 and 𝜀 = 2−Ω(𝑘) .

    • There is an explicit strong (𝑘, 0, 𝑑, 𝑠) block min-entropy disperser 𝐸 : {0, 1} 𝑛 × {0, 1} 𝑠 → {0, 1} 𝑑
      for (𝑚, 𝑛, 𝑘) block min-entropy sources, where 𝑠 = 𝑛, 𝑑 = 𝑘/𝑐.

Combined with Theorem 1.8, this gives the Sunflower Theorem and the Robust Sunflower
Lemma. In Theorem 1.9, we give explicit constructions. We also note that the existence of block
min-entropy dispersers is also enough to prove sunflower lemmas.
    4 A class of finite functions is explicit if a polynomial-time algorithm evaluates each function in the class on each
of its inputs. The class of bipartite graphs corresponding to an explicit class of functions is strongly explicit in the
sense of Footnote 2.


                           T HEORY OF C OMPUTING, Volume 18 (2), 2022, pp. 1–18                                        5
                                X IN L I , S HACHAR L OVETT, J IAPENG Z HANG

Corollary 1.10 (Sunflower Theorem, this paper). There is a constant 𝑐 such that for any 𝑤-uniform
family and any 𝑟 > 1, if |F| ≥ (𝑤𝑟)𝑐𝑤 , then F contains an 𝑟-sunflower.

   The reader will note that this result is weaker than Theorem 1.1, the original result by Erdős
and Rado. The point we are making here is that our result is comparable, and it is obtained
using a different approach that may lead to improved bounds. Indeed, a variant of our method
did lead to improved bounds [2, 25].

Corollary 1.11 (Robust Sunflower Lemma, this paper). There is a constant 𝑐 such that for any
                                               𝑐𝑤
                                  𝑤+log(1/𝛾)
                                 
𝑤-uniform family F, if |F| ≥          𝑝             , then F contains a (𝑝, 𝛾)-robust sunflower.


1.2     Overview of the techniques
Our reduction from sunflower/robust sunflower problems to block min-entropy dispersers is as
follows. Suppose the family F ⊆ P(𝑋) for some set 𝑋, where each set in F has size 𝑤. We first
show that without loss of generality we can assume F is a 𝑤-partite family.

Definition 1.12 (𝑤-partite family). Let 𝑋 be a finite set and let F = {𝑈 𝑖 } 𝑖∈𝐼 be a family of subsets
of 𝑋. We say that F is a 𝑤-partite family if

      • F is 𝑤-uniform;

      • there is a partition 𝑋1 , . . . , 𝑋𝑤 of 𝑋 such that for every 𝑈 ∈ F, we have |𝑋 𝑗 ∩ 𝑈 | = 1 for each
        𝑗 ∈ [𝑤].

Consider the uniform distribution over a 𝑤-partite family F. There are two possible cases:

      • Case 1: there is a subset 𝑆 which is a subset of many elements of F, specifically,

                                                                        F
                                               {𝑈 ∈ F : 𝑆 ⊆ 𝑈 } ≥              ,
                                                                       𝑟 |𝑆|
        where 𝑟 is a parameter to be determined.

      • Case 2: every set 𝑆 does not appear in too many sets of F.

     In Case 1, 𝑆 is already like a core in a sunflower or robust sunflower, thus we can apply
induction on the subfamily F𝑆 := {𝑈 \ 𝑆 : 𝑆 ⊆ 𝑈 ∈ F}. In Case 2, the condition basically implies
that the distribution is relatively flat, which equivalently translates into a block min-entropy
source as we defined above. One can naturally imagine that the worst-case situation here is
that the distribution is actually the uniform distribution over 𝑋1 × · · · × 𝑋𝑤 , and we show that
indeed this is the case by using our zero-error block min-entropy disperser (Theorem 1.9). It is
then easy to see that in the worst case, the empty set is a robust sunflower, or one can choose a
sunflower with size 2𝑑 (the support size in the output of the disperser) whose core is the empty
set.

                         T HEORY OF C OMPUTING, Volume 18 (2), 2022, pp. 1–18                             6
               S UNFLOWERS AND ROBUST S UNFLOWERS FROM R ANDOMNESS E XTRACTORS

1.3     The role of extractors in our reduction
One can view the block min-entropy extractor/disperser used in our reduction as a gadget,
which reduces the sunflower/robust sunflower problem in the general case to the much easier
case of a uniform distribution (or full support) on 𝑋1 × · · · × 𝑋𝑤 . This is similar to the role of
extractors in recent work that showed lifting theorems from query complexity to communication
complexity [13], and linear programming lower bounds for constraint satisfaction problems [16].
    In fact, the extractors used in these articles are essentially the same as the extractors used in
the present paper (although here we need to show that the extractor/disperser is strong, while
in [13] and [16] this is not necessary), and the barriers to further improvement are also similar.
Specifically, in all such constructions one needs the min-entropy 𝑘 ≥ 𝑐 log 𝑚 for some constant
𝑐 > 1, where 𝑚 is equal to the size of the sets (i.e., 𝑤) in our applications. It is interesting to ask
whether 𝑘 = Ω(log 𝑚) is necessary. Any improvement of such extractors leads to improvements
of both sunflower theorems and lifting theorems. Subsequent to the conference version of this
paper [19], Meka [23] gave a counterexample showing that such extractors do not exist.

1.4     Subsequent work
In this section we report work done after the publication of the conference version of this
paper [19].

Connection to DNF sparsification: The connection of sunflowers and DNF sparsification was
first discovered by Gopalan, Meka and Reingold [14]. Building on the Robust Sunflower Lemma,
Gopalan et al. [14] proved any width-𝑤 DNF (with arbitrary size) can be 𝜀-approximated by a
DNF of size at most (𝑤 · log(1/𝜖))𝑂(𝑤) . Similarly to the Sunflower Conjecture, Gopalan et al. also
believed the DNF compression bound can be improved to (log(1/𝜖))𝑂(𝑤) . This conjecture was
confirmed by subsequent work by Lovett, Wu and Zhang [21, 22].
     In another piece of subsequent work Lovett, Solomon and Zhang [20] also proved that the
reverse direction is also true. That is, an improved upper bound on DNF sparsification implies
an improved sunflower bound.

Progress on the Sunflower Conjecture. In this paper, we show that any 𝑤-uniform family F
of size |F| ≥ 𝑤 𝑐𝑤 for some constant 𝑐 > 1 contains a 3-sunflower. Furthermore, we show that any
family F with the following spread condition must contain 3 pairwise disjoint sets.

Definition 1.13. Given a family F and 𝑟 > 0, we say that F is 𝑟-spread, if for any set 𝑆,

                                                               F
                                         {𝑈 ∈ F : 𝑆 ⊆ 𝑈 } ≤           .
                                                              𝑟 |𝑆|
      The following corollary follows directly from the proof of Theorem 1.8.

Corollary 1.14. There is a constant 𝑐 > 0, such that for any 𝑤-partite family F, if F is 𝑤 𝑐 -spread then it
contains 𝑤 pairwise disjoint sets.

                        T HEORY OF C OMPUTING, Volume 18 (2), 2022, pp. 1–18                              7
                                 X IN L I , S HACHAR L OVETT, J IAPENG Z HANG

   This 𝑤 𝑐 -spread condition actually comes from the requirement of our disperser that
𝑘 ≥ 𝑐 log 𝑤 . As discussed above, it is interesting to ask whether this is necessary. In an article
subsequent to the conference version of this paper [19], Alweiss, Lovett, Wu and Zhang [2]
improved the 𝑤 𝑐 bound on the spread to (log 𝑤)𝑂(1) and Rao [25] further improved it to 𝑂(log 𝑤).

Lemma 1.15 ( [2, 25]). There is a constant 𝑐 > 0, such that for any 𝑤-partite family F, if F is
(𝑐𝑟 · log(𝑤𝑟))-spread then it contains 𝑟 pairwise disjoint sets.

    This lemma finally leads to an improved sunflower lemma.

Corollary 1.16 ( [2, 25]). There is a constant 𝑐 > 0 such that, for each 𝑤-uniform family F, if
|F| > (𝑐𝑟 · log(𝑤𝑟))𝑤 then F contains an 𝑟-sunflower.

Discussion of robust sunflowers. In this paper, we also study robust sunflower structures.
In particular, we have the following corollary. Below, we use the notation 𝑂 𝑝,𝛾 (·) to hide the
specific dependency on the parameters 𝑝, 𝛾, which is of less interest to us here.

Corollary 1.17. There is a constant 𝑐 > 0 such that, for any 𝑤-partite family F, if F is 𝑟-spread where
𝑟 = (𝑂 𝑝,𝛾 (𝑤))𝑐 , then F is a (𝑝, 𝛾)-robust sunflower with empty kernel.

   The subsequent work of Alweiss, Lovett, Wu and Zhang [2], and its improvement by Rao [25],
gives an improved robust sunflower lemma.

Lemma 1.18 ( [2], [25]). There is a constant 𝑐 > 0 such that, for any 𝑤-partite family F, if F is
(𝑐 log(𝑤/𝛾)/𝑝)-spread then F is a (𝑝, 𝛾)-robust sunflower with empty kernel.

    Furthermore, the log 𝑤 term is tight by the following example. Fix 𝑝 = 𝛾 = 1/2 for convenience.
Let 𝑋1 , . . . , 𝑋𝑤 be 𝑤 disjoint sets each of size 𝑐 log 𝑤 for some large enough 𝑐 > 0. Define the
family F := 𝑋1 × · · · × 𝑋𝑤 . By this we mean, with some abuse of notation, the complete 𝑤-partite
hypergraph on {𝑋1 , . . . , 𝑋𝑤 }. Then F does not contain a (𝑝, 𝛾)-robust sunflower. This example
also sets a barrier to proving the Sunflower Conjecture via robust sunflowers.

Discussion of the block min-entropy extractor and the disperser construction. In this paper,
we construct explicit strong block min-entropy extractors and dispersers. However, in our
constructions, we require the min-entropy 𝑘 to be as large as Ω(log 𝑚). In the conference version
of this paper we asked whether the min-entropy condition could be improved to be 𝑜(log 𝑚).
The following counterexample due to Raghu Meka [23] shows that this is impossible. Therefore
it sets a barrier to improving (robust) sunflowers via extractors.

Theorem 1.19. Let 𝑚, 𝑛 be parameters and set 𝑘 := min(log(2𝑛 − 1), log 𝑚). Then for any function
𝐸 : {0, 1} 𝑛 × {0, 1} 𝑠 → {0, 1}, there is an (𝑚, 𝑛, 𝑘) block min-entropy source 𝑋 = (𝑋1 , . . . , 𝑋𝑚 ) such that,
for every 𝑟1 , . . . 𝑟𝑚 ∈ {0, 1} 𝑠 , the support of

                                          (𝐸(𝑋1 , 𝑟1 ), . . . , 𝐸(𝑋𝑚 , 𝑟𝑚 ))

is not full.

                         T HEORY OF C OMPUTING, Volume 18 (2), 2022, pp. 1–18                                   8
                S UNFLOWERS AND ROBUST S UNFLOWERS FROM R ANDOMNESS E XTRACTORS

Proof. Let 𝐸 : {0, 1} 𝑛 × {0, 1} 𝑠 → {0, 1} be the given function. Fix an arbitrary string 𝑥 ∗ ∈ {0, 1} 𝑛 .
Define the distribution 𝑋 = (𝑋1 , . . . , 𝑋𝑚 ) as follows. First, sample an index 𝑖 ∗ ∈ [𝑚] uniformly and
set 𝑋𝑖 ∗ = 𝑥 ∗ . For every 𝑖 ≠ 𝑖 ∗ , sample 𝑋𝑖 ∈ {0, 1} 𝑛 \ {𝑥 ∗ } uniformly and independently. Observe
that 𝑋 is a (𝑚, 𝑛, 𝑘) block min-entropy source for 𝑘 = min(log(2𝑛 − 1), log 𝑚).
    We will show that for every 𝑟1 , . . . , 𝑟𝑚 ∈ {0, 1} 𝑠 , the random variable (𝐸(𝑋1 , 𝑟1 ), . . . , 𝐸(𝑋𝑚 , 𝑟𝑚 ))
does not have full support on {0, 1} 𝑚 . This is because if we let 𝑧 = (𝐸(𝑥 ∗ , 𝑟1 ), 𝐸(𝑥 ∗ , 𝑟2 ), . . . , 𝐸(𝑥 ∗ , 𝑟𝑚 )),
then the output of (𝐸(𝑋1 , 𝑟1 ), . . . , 𝐸(𝑋𝑚 , 𝑟𝑚 )) always agrees with 𝑧 in at least one position. Thus
the string 𝑧 ⊕ 1 is not in the support of (𝐸(𝑋1 , 𝑟1 ), . . . , 𝐸(𝑋𝑚 , 𝑟𝑚 )).
                                                                                                                       

Acknowledgements
We thank Raghu Meka and Ben Rossman for helpful discussions. We also thank anonymous
reviewers for helpful suggestions on an earlier version of this paper.


2    Preliminaries
We first review some basic definitions in probability.

Definition 2.1. Let 𝐷 be a distribution over a finite sample space Σ. Its entropy is

                                             Õ                                       
                                                                        1
                                   H(𝐷) =           Pr[𝐷 = 𝑥] · log           .
                                                                    Pr[𝐷 = 𝑥]
                                              𝑥∈Σ

Its min-entropy is
                                                                           
                                                            1
                                       H∞ (𝐷) = min log                           .
                                                𝑥∈Σ     Pr[𝐷 = 𝑥]
Its max-entropy is
                                             H0 (𝐷) = log | Supp(𝐷)|.

Definition 2.2 (Statistical distance). Let 𝐷0 and 𝐷1 be distributions over a finite sample space Σ.
The statistical distance between 𝐷0 and 𝐷1 is defined as

                                                    1Õ
                               dist(𝐷0 , 𝐷1 ) =        Pr[𝐷0 = 𝑥] − Pr[𝐷1 = 𝑥] .
                                                    2
                                                     𝑥∈Σ



3    A construction of a block min-entropy extractor
We use the following well-known extractor based on the inner product function [6]. We denote
by F𝑞 the finite field on 𝑞 elements. When 𝑞 = 2ℓ we identify F𝑞 with {0, 1}ℓ and F𝑡𝑞 with {0, 1} 𝑡ℓ .

                           T HEORY OF C OMPUTING, Volume 18 (2), 2022, pp. 1–18                                         9
                                       X IN L I , S HACHAR L OVETT, J IAPENG Z HANG

Theorem 3.1. Let 𝑡, ℓ ≥ 1 and take 𝑞 = 2ℓ , 𝑛 = 𝑡ℓ . Let 𝑋 , 𝑌 be independent sources on F𝑡𝑞  {0, 1} 𝑛
with min-entropies 𝑘 1 , 𝑘 2 , respectively. Let IP be the inner product function over the field F𝑞 . Then:
                   dist ((IP(𝑋 , 𝑌), 𝑋), (𝑈ℓ , 𝑋)) ≤ 𝜀            and     dist ((IP(𝑋 , 𝑌), 𝑌), (𝑈ℓ , 𝑌)) ≤ 𝜀
where 𝜀 = 2−(𝑘1 +𝑘2 −𝑛−ℓ )/2 .
   Now we can construct a block min-entropy extractor as follows. Given parameters 𝑛, 𝑘,
choose a field F𝑞 such that 𝑞 = 2ℓ with ℓ = 𝛼𝑘 for some constant 0 < 𝛼 < 1 to be determined later.
Without loss of generality we assume that 𝑛 = 𝑡ℓ for some integer 𝑡. We view 𝑋 ∈ {0, 1} 𝑛 as a
vector in F𝑡𝑞 and choose a uniform independent seed 𝑅 ∈ {0, 1} 𝑛  F𝑡𝑞 .


                                             A block min-entropy extractor

      1. Given parameters 𝑚, 𝑛, 𝑘 let 𝑞, 𝑡 be as described above.

      2. Sample (𝑋1 , . . . , 𝑋𝑚 ) ∈ (F𝑡𝑞 )𝑚 from the block min-entropy distribution.

      3. Sample (𝑅1 , . . . , 𝑅 𝑚 ) ∈ (F𝑡𝑞 )𝑚 uniformly and independently.

      4. Output 𝑍 := (IP(𝑋1 , 𝑅 1 ), . . . , IP(𝑋𝑚 , 𝑅 𝑚 )).



    We are now ready to prove the following theorem.
Theorem 3.2 (Theorem 1.9 restated). Let 𝑋 = (𝑋1 , . . . , 𝑋𝑚 ) be an (𝑚, 𝑛, 𝑘) block min-entropy source.
Let 𝑍 ∈ {0, 1}ℓ 𝑚 be the output of the above block min-entropy extractor applied to 𝑋. There exists a
constant 𝑐 > 1 such that if 𝑘 ≥ 𝑐 log 𝑚, then the following holds for error 𝜀 = 2−Ω(𝑘) :

    • With probability 1 − 𝜀 over the fixing of the seed (𝑅 1 , . . . , 𝑅 𝑚 ),
                                        Pr[𝑍 = 𝑧] − 2−ℓ 𝑚 ≤ 𝜀 · 2−ℓ 𝑚              ∀𝑧 ∈ {0, 1}ℓ 𝑚 .
       In particular, in such cases H0 (𝑍) = ℓ 𝑚
    • dist ((𝑍, 𝑅 1 , . . . , 𝑅 𝑚 ), (𝑈 , 𝑅 1 , . . . , 𝑅 𝑚 )) ≤ 2𝜀.

Proof. We have a joint distribution (𝑋1 , . . . , 𝑋𝑚 ) that has block min-entropy 𝑘. The output of the
local extractor applied to (𝑋1 , . . . , 𝑋𝑚 ), using 𝑚 independent uniform seeds (𝑅 1 , . . . , 𝑅 𝑚 ), is a
distribution (𝑍1 , . . . , 𝑍 𝑚 ) over {0, 1}ℓ 𝑚 = F𝑚𝑞 where 𝑍 𝑖 = IP(𝑋𝑖 , 𝑅 𝑖 ) for each 𝑖.
    For any fixing of the seed (𝑅 1 = 𝑟1 , . . . , 𝑅 𝑚 = 𝑟𝑚 ), the distribution (𝑍1 , . . . , 𝑍 𝑚 ) is a deterministic
function of (𝑋1 , . . . , 𝑋𝑚 ), and we will view this distribution as a function D : {0, 1}ℓ 𝑚 → [0, 1]
where the image of each input is its associated probability in the distribution. We now write
this function in its Fourier basis:
                                                               Õ
                                                  D(𝑧) =               D̂(𝑆)𝜒𝑆 (𝑧),
                                                             𝑆⊆[ℓ 𝑚]


                             T HEORY OF C OMPUTING, Volume 18 (2), 2022, pp. 1–18                                  10
                  S UNFLOWERS AND ROBUST S UNFLOWERS FROM R ANDOMNESS E XTRACTORS

where 𝑧 = (𝑧 1 , . . . , 𝑧 𝑚 ) ∈ {0, 1}ℓ 𝑚 , 𝜒𝑆 (𝑧) = (−1) 𝑖∈𝑆 𝑧(𝑖) ∈ {+1, −1}, and
                                                                         Í

                                                          Õ
                                    D̂(𝑆) = 2−ℓ 𝑚 ·              D(𝑧)𝜒𝑆 (𝑧) = 2−ℓ 𝑚 · E𝑧∼D [𝜒𝑆 (𝑧)].
                                                           𝑧

Here we use 𝑧(𝑖) to stand for the 𝑖-th bit of the string 𝑧. This is to distinguish between the
notation 𝑧 𝑖 , which refers to the 𝑖-th block of the string 𝑧, that contains ℓ bits.
   Note that D̂(∅) = 2−ℓ 𝑚 since D is a probability distribution. Thus we have that ∀𝑧 ∈ {0, 1}ℓ 𝑚 ,

                                                               Õ                                    Õ
                                D(𝑧) − 2−ℓ 𝑚 =                           D̂(𝑆)𝜒𝑆 (𝑧) ≤                          D̂(𝑆) .                       (3.1)
                                                        𝑆⊆[ℓ 𝑚],𝑆≠∅                            𝑆⊆[ℓ 𝑚],𝑆≠∅

   Note that for any 𝑆 ⊆ [ℓ 𝑚], 𝜒𝑆 (𝑍) corresponds to the parity of a subset of the bits in 𝑍. For
each 𝑍 𝑗 , 𝑗 ∈ [𝑚], this parity may or may not involve any bits in 𝑍 𝑗 . We will be interested in the
number of indices 𝑗 such that 𝜒𝑆 (𝑍) involves at least one bit from 𝑍 𝑗 , and we call this number
Δ(𝑆). Note that Δ(∅) = 0 and 1 ≤ Δ(𝑆) ≤ 𝑚 for any 𝑆 ≠ ∅.
   We now have the following lemma.
Lemma 3.3. If Δ(𝑆) = ℎ, then with probability 1 − 2−ℎ(1−𝛼)𝑘/4 over the fixing of the seed (𝑅1 =
𝑟1 , . . . , 𝑅 𝑚 = 𝑟𝑚 ), we have that | D̂(𝑆)| ≤ 2 · 2−ℓ 𝑚 2−ℎ(1−𝛼)𝑘/4 .

Proof. Without loss of generality assume that the 𝑍 𝑗 from which 𝜒𝑆 (𝑍) involves at least one
bit are (𝑍1 , . . . , 𝑍 ℎ ). Note that for any 𝑍 𝑖 ∈ {0, 1}ℓ = F𝑞 , any parity of the bits of 𝑍 𝑖 corresponds
exactly to the first bit of 𝑎 · 𝑍 𝑖 viewed as a vector in {0, 1}ℓ , for some 𝑎 ∈ F𝑞 and where the
operation · is multiplication in the field F𝑞 . Moreover this correspondence is a bĳection in the
sense that different parities correspond to different elements 𝑎 ∈ F𝑞 . The special case of parity
over the empty set corresponds to the case of 𝑎 = 0. Thus, 𝑖∈𝑆 𝑍(𝑖) corresponds to the first bit
                                                                       Í
of 𝑗∈[ℎ] 𝑎 𝑗 𝑍 𝑗 viewed as a vector in {0, 1}ℓ , for some non-zero {𝑎 𝑗 ∈ F𝑞 : 𝑗 ∈ [ℎ]}. Note that
   Í

      Õ                 Õ                              Õ
              𝑎𝑗 𝑍𝑗 =           𝑎 𝑗 IP(𝑋 𝑗 , 𝑅 𝑗 ) =           IP(𝑎 𝑗 𝑋 𝑗 , 𝑅 𝑗 ) = IP((𝑎 1 𝑋1 , . . . , 𝑎 ℎ 𝑋 ℎ ), (𝑅 1 , . . . , 𝑅 ℎ )) .   (3.2)
      𝑗∈[ℎ]             𝑗∈[ℎ]                          𝑗∈[ℎ]

    Since each 𝑎 𝑗 ≠ 0 the transformation from (𝑥 1 , . . . , 𝑥 ℎ ) to (𝑎 1 𝑥 1 , . . . , 𝑎 ℎ 𝑥 ℎ ) is a bĳection. Thus we
know the distribution (𝑎 1 𝑋1 , . . . , 𝑎 ℎ 𝑋 ℎ ) has min-entropy 𝑘 ℎ, while (𝑅 1 , . . . , 𝑅 ℎ ) has min-entropy
𝑛 ℎ. Thus by Theorem 3.1 applied over the field F2ℓ ℎ we have that

                           © Õ                                                                          −(ℎ(𝑘−ℓ ))         −ℎ(1−𝛼)𝑘
                   dist (            𝑎 𝑗 𝑍 𝑗 , 𝑅 1 , . . . , 𝑅 𝑚 ), (𝑈ℓ ℎ , 𝑅 1 , . . . , 𝑅 𝑚 )® ≤ 2                =2               .
                                                                                               ª            2                  2              (3.3)
                           « 𝑗∈[ℎ]                                                             ¬
    In particular, as 𝜒𝑆 (𝑍) is the first bit of                         𝑗∈[ℎ] 𝑎 𝑗 𝑍 𝑗 , we have
                                                                     Í

                                                                                                            −ℎ(1−𝛼)𝑘
                                  dist (𝜒𝑆 (𝑍), 𝑅 1 , . . . , 𝑅 𝑚 ), (𝑈1 , 𝑅 1 , . . . , 𝑅 𝑚 )) ≤ 2             2      .                      (3.4)
    By Markov’s inequality this means that with probability 1 − 2−ℎ(1−𝛼)𝑘/4 over the fixing of the
seed 𝑅 = (𝑅 1 , . . . , 𝑅 𝑚 ), we have | D̂(𝑆)| = |2−ℓ 𝑚 · E𝑧∼D [𝜒𝑆 (𝑧)]| ≤ 2 · 2−ℓ 𝑚 2−ℎ(1−𝛼)𝑘/4 .                                              

                                T HEORY OF C OMPUTING, Volume 18 (2), 2022, pp. 1–18                                                            11
                                 X IN L I , S HACHAR L OVETT, J IAPENG Z HANG

   Next, note that the number of 𝑆 with Δ(𝑆) = ℎ is 𝑚ℎ (2ℓ − 1) ℎ ≤ 2(ℓ +log 𝑚)ℎ . Recall that ℓ = 𝛼𝑘
                                                                              

and 𝑘 ≥ 𝑐 log 𝑚. We can choose the constants 𝛼, 𝑐 such that 2ℓ +log 𝑚 2−(1−𝛼)𝑘/4 ≤ 2−𝑘/8 . Now we
have as long as 𝑘 ≥ 8,
                                𝑚                                         𝑚
                                Õ  𝑚                         −ℎ(1−𝛼)𝑘
                                                                            Õ        ℎ𝑘        𝑘
                                              (2ℓ − 1) ℎ 2       4      ≤         2− 8 ≤ 2− 8 +1 .                            (3.5)
                                         ℎ
                                ℎ=1                                         ℎ=1

    Set 𝜀 = 2−𝑘/8+2 = 2−Ω(𝑘) .
    By the union bound we have that with probability at least 1 − 𝜀 over the fixing of the seed
                                                                                                     −ℎ(1−𝛼)𝑘
(𝑅 1 = 𝑟1 , . . . , 𝑅 𝑚 = 𝑟𝑚 ), for every 𝑆 ≠ ∅ with Δ(𝑆) = ℎ, | D̂(𝑆)| ≤ 2 · 2−ℓ 𝑚 2                    4      . Thus for any such
seed we have that                                      Õ
                                     D(𝑧) − 2−ℓ 𝑚 ≤           D̂(𝑆) ≤ 𝜀 · 2−ℓ 𝑚 .                                             (3.6)
                                                            𝑆⊆[ℓ 𝑚],𝑆≠∅

    This concludes the proof of the first part of Theorem 1.9. For the second part, notice that
conditioned on the fixing of any seed 𝑅1 , . . . , 𝑅 𝑚 , with probability 1 − 𝜀 the statistical distance is
at most 𝜀, and otherwise it is trivially bounded by 1. So overall the statistical distance between
(𝑍, 𝑅 1 , . . . , 𝑅 𝑚 ) and (𝑈ℓ 𝑚 , 𝑅 1 , . . . , 𝑅 𝑚 ) is at most 2𝜀.                                   


4    Compressing set systems by the block min-entropy extractor
In this section, we focus on the set systems that satisfy the spread condition, and show a
compression operator for such set systems. Our compression is based on the block min-entropy
extractor. We first show that it suffices to consider 𝑤-partite families (see Definition 1.12).
Lemma 4.1. Let F be a 𝑤-uniform family. Then F has a 𝑤-partite subfamily F0 of size |F0 | ≥ |F|/22𝑤 .
Proof. Let 𝑈 ∈ F be a set, and let 𝑋1 , . . . , 𝑋𝑤 be a random partition of 𝑋. Then
                                                                                          𝑤!
                                                       ∀𝑗 ∈ [𝑤], |𝑈 ∩ 𝑋 𝑗 | = 1 =            .
                                                                                   
                                         Pr
                                      𝑋1 ,...,𝑋𝑤                                          𝑤𝑤
Then, by averaging, there is a partition (𝑋1 , . . . , 𝑋𝑤 ) such that
                                                                                               𝑤!
                                     𝑈 ∈ F : ∀𝑗 ∈ [𝑤], |𝑈 ∩ 𝑋 𝑗 | = 1
                                 
                                                                                     ≥ |F| ·
                                                                                               𝑤𝑤
The claim then follows since 𝑤𝑤!𝑤 ≥ 2−2𝑤 .                                                                                       
   Now we can focus on 𝑤-partite families. Given a finite set 𝑋, we denote by 𝑋𝑝 the distribution
over subsets 𝑊 ⊂ 𝑋, where each 𝑥 ∈ 𝑋 appears in 𝑊 independently with probability 𝑝.
Lemma 4.2. Let 𝑢 ≥ 𝑤. Let 𝑐 be the constant from Theorem 1.9. Then for every 𝑤-partite family which
is 𝑢 𝑐 -spread (recall Definition 1.13), it holds that

                                       Pr [∃𝑈 ∈ F, 𝑈 ⊆ 𝑊] ≥ 1 − 𝑤(1 − 𝑝)𝑢 .
                                     𝑊∼𝑋𝑝


                         T HEORY OF C OMPUTING, Volume 18 (2), 2022, pp. 1–18                                                   12
              S UNFLOWERS AND ROBUST S UNFLOWERS FROM R ANDOMNESS E XTRACTORS

    To prove this lemma, we first define a “worst case” instance, and then show that all other
instances behave better than this case. Let 𝑋1∗ , . . . , 𝑋𝑤∗ be 𝑤 disjoint sets each of size 𝑢. Define
the family U∗ as                  n                                    o
                                     U∗ = {𝑥 1 , . . . , 𝑥 𝑤 } : ∀𝑗 ∈ [𝑤], 𝑥 𝑗 ∈ 𝑋 𝑗∗ .
Claim 4.3. Let U∗ as defined above. Then
                                      Pr [∃𝑈 ∈ U∗ , 𝑈 ⊆ 𝑊] ≥ 1 − 𝑤(1 − 𝑝)𝑢 .
                                    𝑊∼𝑋𝑝

Proof. By the definition of U∗ , we have that
                             Pr[∀𝑈 ∈ U∗ , 𝑈 * 𝑊] = Pr[∃𝑗 ∈ [𝑤], 𝑋 𝑗 ∩ 𝑊 = ∅]
                              𝑊                               𝑊
                                                              Õ
                                                          ≤           Pr[𝑋 𝑗 ∩ 𝑊 = ∅]
                                                                      𝑊
                                                              𝑗∈[𝑤]

                                                          =𝑤(1 − 𝑝)𝑢 .                                         
   Let 𝑋 , 𝑌 be finite sets, ℎ : 𝑋 → 𝑌 a map. Given a set 𝑈 ⊂ 𝑋 define ℎ(𝑈) = {ℎ(𝑥) : 𝑥 ∈ 𝑈 } ⊂ 𝑌.
Given a family F ⊆ P(𝑋) define ℎ(F) ⊆ P(𝑌) as
                             ℎ(F) = ℎ(𝑈) : 𝑈 ∈ F and ℎ is injective on 𝑈 .
                                        

Lemma 4.4. Let 𝑋 and 𝑌 be sets, ℎ : 𝑋 → 𝑌 a map, F ⊂ P(𝑋). Then
                          Pr [∃𝑈 ∈ ℎ(F), 𝑈 ⊆ 𝑊𝑌 ] ≤                   Pr [∃𝑈 ∈ F, 𝑈 ⊆ 𝑊𝑋 ].
                        𝑊𝑌 ∼𝑌𝑝                                    𝑊𝑋 ∼𝑋𝑝

Proof. Without loss of generality, we can assume the map ℎ is surjective, because elements
𝑦 ∈ 𝑌 \ ℎ(𝑋) do not affect the events. If |𝑌| = |𝑋 | then ℎ is a bĳection and hence F and ℎ(F) are
the same, up to renaming the elements. So, assume |𝑌| < |𝑋 |. It suffices to prove the lemma for
the case that |𝑌| = |𝑋 | − 1, as the general case follows from applying this case iteratively (namely,
decompose ℎ as a sequence of maps, each reduces the domain size by one).
    So, assume |𝑌| = |𝑋 | − 1. In this case, there is a unique pair 𝑥 1 , 𝑥 2 ∈ 𝑋 such that ℎ(𝑥 1 ) =
ℎ(𝑥2 ) = 𝑦. We may assume without loss of generality (by renaming the elements of 𝑌) that ℎ
is the identity map on 𝑋 0 = 𝑋 \ {𝑥 1 , 𝑥 2 }. This allows us to jointly sample (𝑊𝑋 , 𝑊𝑌 ) as follows.
Sample 𝑊 0 ∼ 𝑋𝑝0 , 𝑊𝑋0 ∼ {𝑥 1 , 𝑥 2 } 𝑝 , 𝑊𝑌0 ∼ {𝑦} 𝑝 and set 𝑊𝑋 = 𝑊 0 ∪𝑊𝑋0 , 𝑊𝑌 = 𝑊 0 ∪𝑊𝑌0 . We will show
that for every fixed 𝑊 0 = 𝑤 0,
              Pr [∃𝑈 ∈ ℎ(F), 𝑈 ⊆ 𝑊𝑌 | 𝑊 0 = 𝑤 0] ≤                    Pr [∃𝑈 ∈ F, 𝑈 ⊆ 𝑊𝑋 | 𝑊 0 = 𝑤 0] .     (4.1)
            𝑊𝑌 ∼𝑌𝑝                                                𝑊𝑋 ∼𝑋𝑝

The lemma then follows by averaging over 𝑊 0.
    To that end, fix 𝑊 0. Let F0 = {𝑈 \ 𝑋 0 : 𝑈 ∈ F, (𝑈 ∩ 𝑋 0) ⊂ 𝑊 0 }. Note that F0 ⊆ P({𝑥 1 , 𝑥 2 }).
Similarly, define F00 = {𝑈 \ 𝑋 0 : 𝑈 ∈ ℎ(F), (𝑈 ∩ 𝑋 0) ⊂ 𝑊 0 }. Note that F00 ⊆ P({𝑦}). Equation (4.1)
is equivalent to
                        Pr        [∃𝑈 ∈ F00 , 𝑈 ⊆ 𝑊𝑌0 ] ≤             Pr           [∃𝑈 ∈ F0 , 𝑈 ⊆ 𝑊𝑋0 ] .   (4.2)
                     𝑊𝑌0 ∼{𝑦} 𝑝                               𝑊𝑋0 ∼{𝑥 1 ,𝑥 2 } 𝑝

We verify Equation (4.2) by a case analysis.

                        T HEORY OF C OMPUTING, Volume 18 (2), 2022, pp. 1–18                                  13
                                X IN L I , S HACHAR L OVETT, J IAPENG Z HANG

  (i) If F00 is empty then the LHS of Equation (4.2) is 0, while the RHS is non-negative.

  (ii) If ∅ ∈ F00 then ∅ ∈ F0. In this case, both the LHS and RHS of Equation (4.2) equal 1.

 (iii) If F00 = {{𝑦}} then either {𝑥 1 } ∈ F0 or {𝑥 2 } ∈ F0 (or possibly both). In either case, the LHS
       of Equation (4.2) equals 𝑝, while the RHS is at least 𝑝.

                                                                                                             
   We now prove Lemma 4.2. Let F be a family that satisfies the assumptions. We will show
there is a function ℎ such that ℎ(F) = U∗ . The extractor from Theorem 1.9, with an appropriate
choice of seed, provides such a function ℎ.

Proof of Lemma 4.2. Let F be a 𝑤-partite family that satisfies the spread condition. We first define
the function ℎ. Since F is a 𝑤-partite family, there exists a partition of 𝑋 to 𝑋1 , . . . , 𝑋𝑤 such that
for each 𝑈 ∈ F and 𝑗 ∈ [𝑤], |𝑋 𝑗 ∩ 𝑈 | = 1.
    Define the sample space as 𝑋1 × · · · × 𝑋𝑤 . With a slight abuse of notation, we identify
F ⊆ P(𝑋1 × · · · × 𝑋𝑤 ), and let 𝐷 be a uniform distribution over F. Since F is 𝑢 𝑐 -spread, the
distribution 𝐷 is a (𝑤, log 𝑋 , 𝑘) block min-entropy source with 𝑘 = 𝑐 log 𝑢 ≥ 𝑐 log 𝑤. Then by
Theorem 1.9, there exists seeds 𝑟1 , . . . , 𝑟𝑤 such that (IP(𝐷1 , 𝑟1 ), . . . , IP(𝐷𝑤 , 𝑟𝑤 )) has full support,
where 𝐷 = (𝐷1 , . . . , 𝐷𝑤 ). Note that the output of IP(·, ·) is in {0, 1} 𝑘/𝑐  [𝑢]. We can now define ℎ
as follows:
                                     ℎ(𝑥) = (IP(𝑥, 𝑟 𝑗 ), 𝑗)   ∀𝑥 ∈ 𝑋 𝑗 .
Note that by definition, ℎ is injective on any 𝑈 ∈ 𝑋1 × · · · × 𝑋𝑤 . We identify elements of U∗ with
{(𝑎 1 , 1), . . . , (𝑎 𝑤 , 𝑤)} with 𝑎 𝑖 ∈ [𝑢]. Thus ℎ(F) = U∗ . The lemma now follows from Lemma 4.4 and
Claim 4.3.                                                                                             
      We will also need the following lemma.

Lemma 4.5. Let 𝑢 ≥ 𝑤. Let 𝑐 be the constant from Theorem 1.9. Then for every 𝑤-partite family F
which is 𝑢 𝑐 -spread (recall Definition 1.13), it holds that F contains 𝑢 pairwise disjoint sets.

Proof. The proof is very similar to the proof of Lemma 4.2. There is a map ℎ for which ℎ(F) = U∗ .
Note that U∗ contains 𝑢 pairwise disjoint sets, 𝑈10 , . . . , 𝑈𝑢0 . By definition, 𝑈 𝑖0 = ℎ(𝑈 𝑖 ). But then
also 𝑈1 , . . . , 𝑈𝑢 must be pairwise disjoint.                                                           

4.1    Sunflowers and robust sunflowers from compression
Now we can prove Theorem 1.8.

Theorem 4.6 (Theorem 1.8 restated). Suppose that there exists a strong (𝑘, 0, 𝑑, 𝑠)-block min-entropy
disperser, 𝐸 : {0, 1} 𝑛 × {0, 1} 𝑠 → {0, 1} 𝑑 for any (𝑤, 𝑛, 𝑘)-block min-entropy source. Then the following
holds.
    Let F be a 𝑤-uniform family. Assume that |F| ≥ 2(𝑘+2)𝑤 . Then:

  (i) F contains a 2𝑑 -sunflower.

                        T HEORY OF C OMPUTING, Volume 18 (2), 2022, pp. 1–18                                 14
               S UNFLOWERS AND ROBUST S UNFLOWERS FROM R ANDOMNESS E XTRACTORS
                                     𝑑
                                          
 (ii) F contains a 𝑝, 𝑤(1 − 𝑝)2 -robust sunflower.

Proof. By Lemma 4.1, there is a 𝑤-partite subfamily F0 ⊆ F of size |F0 | ≥ 2 𝑘𝑤 . There are two
possible cases.
Case 1: There is a subset 𝑆 ⊆ 𝑋 such that

                                              |{𝑈 ∈ F0 : 𝑆 ⊆ 𝑈 }| ≥ |F0 | · 2−𝑘|𝑆| .

Define the family F𝑆0 := {𝑈 \ 𝑆 : 𝑆 ⊆ 𝑈 ∈ F0 }. Notice that

   • F𝑆0 is (𝑤 − |𝑆|)-partite;

   • |F𝑆0 | ≥ |F0 | · 2−𝑘|𝑆| ≥ 2 𝑘(𝑤−|𝑆|) .

By induction both (i) and (ii) hold.
Case 2: For all 𝑆 ⊆ 𝑋,
                                 |{𝑈 ∈ F0 : 𝑆 ⊆ 𝑈 }| ≤ |F0 | · 2−𝑘|𝑆|
Notice that this is the spread condition for Lemma 4.2 and Lemma 4.5. Their conclusions are
precisely (i) and (ii).                                                                   



References
 [1] Noga Alon, Amir Shpilka, and Christopher Umans: On sunflowers and matrix mul-
     tiplication. Comput. Complexity, 22(2):219–243, 2013. Preliminary version in CCC’12.
     [doi:10.1007/s00037-013-0060-1, ECCC:TR11-067] 3

 [2] Ryan Alweiss, Shachar Lovett, Kewen Wu, and Jiapeng Zhang: Improved bounds for
     the sunflower lemma. Ann. Math., 194(3):795–815, 2021. Preliminary version in STOC’20.
     [doi:10.4007/annals.2021.194.3.5, ECCC:TR19-110, arXiv:1908.08483] 4, 6, 8

 [3] Boaz Barak, Guy Kindler, Ronen Shaltiel, Benny Sudakov, and Avi Wigderson: Simulating
     independence: New constructions of condensers, Ramsey graphs, dispersers, and extractors.
     J. ACM, 57(4):20:1–52, 2010. Preliminary version in STOC’05. [doi:10.1145/1734213.1734214,
     ECCC:TR10-037] 2

 [4] Avraham Ben-Aroya, Dean Doron, and Amnon Ta-Shma: An efficient reduction from two-
     source to non-malleable extractors: Achieving near-logarithmic min-entropy. In Proc. 49th
     STOC, pp. 1185–1194. ACM Press, 2017. [doi:10.1145/3055399.3055423, ECCC:TR16-088] 2

 [5] Eshan Chattopadhyay and David Zuckerman: Explicit two-source extractors and resilient
     functions. In Proc. 48th STOC, pp. 670–683. ACM Press, 2016. [doi:10.1145/2897518.2897528,
     ECCC:TR15-119] 2

                          T HEORY OF C OMPUTING, Volume 18 (2), 2022, pp. 1–18              15
                            X IN L I , S HACHAR L OVETT, J IAPENG Z HANG

 [6] Benny Chor and Oded Goldreich: Unbiased bits from sources of weak randomness and
     probabilistic communication complexity. SIAM J. Comput., 17(2):230–261, 1988. Preliminary
     version in FOCS’85. [doi:10.1137/0217015] 9

 [7] Gil Cohen: Towards optimal two-source extractors and Ramsey graphs. In Proc. 49th STOC,
     pp. 1157–1170. ACM Press, 2017. [doi:10.1145/3055399.3055429, ECCC:TR16-114] 2

 [8] Ernie Croot, Vsevolod F. Lev, and Péter Pál Pach: Progression-free sets in Z4𝑛 are
     exponentially small. Ann. Math., 185(1):331–337, 2017. [doi:10.4007/annals.2017.185.1.7] 3

 [9] Jordan S. Ellenberg and Dion Gijswijt: On large subsets of F𝑛𝑞 with no three-term arithmetic
     progression. Ann. Math., 185(1):339–343, 2017. [doi:10.4007/annals.2017.185.1.8] 3

[10] Paul Erdős: Some remarks on the theory of graphs. Bull. AMS, 53(4):292–294, 1947. project
     Euclid. 2

[11] Paul Erdős and Richard Rado: Intersection theorems for systems of sets. J. London Math.
     Soc., 35(1):85–90, 1960. [doi:10.1112/jlms/s1-35.1.85] 2

[12] Paul Erdős and George Szekeres: A combinatorial problem in geometry. Compositio Math.,
     2:463–470, 1935. NUMDAM. 2

[13] Mika Göös, Shachar Lovett, Raghu Meka, Thomas Watson, and David Zuckerman:
     Rectangles are nonnegative juntas. SIAM J. Comput., 45(5):1835–1869, 2016. Preliminary
     version in STOC’15. [doi:10.1137/15M103145X, ECCC:TR14-147] 4, 7

[14] Parikshit Gopalan, Raghu Meka, and Omer Reingold: DNF sparsification and a faster
     deterministic counting algorithm. Comput. Complexity, 22(2):275–310, 2013. Preliminary
     version in CCC’12. [doi:10.1007/s00037-013-0068-6, arXiv:1205.3534] 3, 7

[15] Ben Green and Terence Tao: The primes contain arbitrarily long arithmetic progressions.
     Ann. Math., 167(2):481–547, 2008. [doi:10.4007/annals.2008.167.481] 2

[16] Pravesh K. Kothari, Raghu Meka, and Prasad Raghavendra: Approximating rectangles
     by juntas and weakly-exponential lower bounds for LP relaxations of CSPs. In Proc. 49th
     STOC, pp. 590–603. ACM Press, 2017. [doi:10.1145/3055399.3055438, arXiv:1610.02704] 7

[17] Xin Li: Three source extractors for polylogarithmic min-entropy. In Proc. 56th FOCS, pp. 863–
     882. IEEE Comp. Soc., 2015. [doi:10.1109/FOCS.2015.58, ECCC:TR15-034, arXiv:1503.02286]
     2

[18] Xin Li: Improved non-malleable extractors, non-malleable codes and independent source ex-
     tractors. In Proc. 49th STOC, pp. 1144–1156. ACM Press, 2017. [doi:10.1145/3055399.3055486,
     ECCC:TR16-115, arXiv:1608.00127] 2

                     T HEORY OF C OMPUTING, Volume 18 (2), 2022, pp. 1–18                      16
             S UNFLOWERS AND ROBUST S UNFLOWERS FROM R ANDOMNESS E XTRACTORS

[19] Xin Li, Shachar Lovett, and Jiapeng Zhang: Sunflowers and quasi-sunflowers from
     randomness extractors. In Proc. 22nd Internat. Conf. on Randomization and Computation
     (RANDOM’18), pp. 51:1–13. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 2018.
     [doi:10.4230/LIPIcs.APPROX-RANDOM.2018.51] 1, 7, 8

[20] Shachar Lovett, Noam Solomon, and Jiapeng Zhang: From DNF compression to
     sunflower theorems via regularity. In Proc. 34th Comput. Complexity Conf. (CCC’19),
     volume 137, pp. 5:1–14. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 2019.
     [doi:10.4230/LIPIcs.CCC.2019.5, ECCC:TR19-028, arXiv:1903.00580] 7

[21] Shachar Lovett, Kewen Wu, and Jiapeng Zhang: Decision list compression by
     mild random restrictions. In Proc. 52nd STOC, pp. 247–254. ACM Press, 2020.
     [doi:10.1145/3357713.3384241, ECCC:TR19-137, arXiv:1909.10658] 7

[22] Shachar Lovett and Jiapeng Zhang: DNF sparsification beyond sunflowers. In Proc. 51st
     STOC, pp. 454–460. ACM Press, 2019. [doi:10.1145/3313276.3316323, ECCC:TR18-190] 7

[23] Raghu Meka: Personal communication, 2018. 7, 8

[24] Eric Naslund and Will Sawin: Upper bounds for sunflower-free sets. In Forum of
     Mathematics, Sigma, volume 5. Cambridge Univ. Press, 2017. [doi:10.1017/fms.2017.12] 3

[25] Anup Rao: Coding for sunflowers. Discrete Analysis, pp. 2:1–8, 2020. [doi:10.19086/da.11887,
     arXiv:1909.04774] 4, 6, 8

[26] Alexander A. Razborov: Lower bounds for the monotone complexity of some Boolean
     functions. Dokl. Math., 31(4):354–357, 1985. Link at Math-Net.ru. 3

[27] Benjamin Rossman: The monotone complexity of 𝑘-clique on random graphs. SIAM J.
     Comput., 43(1):256–279, 2014. Preliminary version in FOCS’10. [doi:10.1137/110839059] 2, 3

[28] Endre Szemerédi: On sets of integers containing no 𝑘 elements in arithmetic progression.
     Acta Arithm., 27(1):199–245, 1975. DML PL. 2

[29] Endre Szemerédi: Regular partitions of graphs. In Problèmes combinatoires et théorie des
     graphes (Proc. conf. Univ. Orsay 1976), volume 260, pp. 399–401. C.N.R.S., 1978. 2


AUTHORS

     Xin Li
     Associate professor
     Johns Hopkins University
     Baltimore, MD, USA
     lixints cs jhu edu
     https://www.cs.jhu.edu/~lixints/


                     T HEORY OF C OMPUTING, Volume 18 (2), 2022, pp. 1–18                     17
                          X IN L I , S HACHAR L OVETT, J IAPENG Z HANG

    Shachar Lovett
    Associate professor
    University of California San Diego
    La Jolla, CA, USA
    slovett cs ucsd edu
    http://cseweb.ucsd.edu/~slovett


    Jiapeng Zhang
    Assistant professor
    University of Southern California
    Los Angeles, CA, USA
    jiapengz usc edu
    https://sites.google.com/site/jiapeng0708/home


ABOUT THE AUTHORS

    Xin Li is an associate professor in the Computer Science Department at Johns
       Hopkins University. He received his Ph. D. in 2011 from the University of Texas
       at Austin under the supervision of David Zuckerman. After his Ph. D., Xin was
       a Simons Postdoctoral Fellow at the University of Washington. His research
       interests include the use of randomness in computation, complexity theory,
       coding theory, and cryptography. A significant part of his work has been on
       explicit constructions of randomness extractors. Previously he did some work on
       quantum computing and human–computer interaction.


    Shachar Lovett graduated from the Weizmann Institute of Science in 2010; his
       advisors were Omer Reingold and Ran Raz. He was a member of the Institute
       for Advanced Study, School of Mathematics between 2010-2012. Since then,
       he has been a faculty member at the University of California, San Diego. He
       is interested in the role that structure and randomness play in computation
       and mathematics, and in particular in computational complexity, coding theory,
       pseudorandomness, and algebraic constructions.


    Jiapeng Zhang is an assistant professor in the Computer Science Department at the
        University of Southern California. He received his B. S. in 2011 from Shanghai
        Jiao Tong University, and his Ph. D. in 2019 from the University of California, San
        Diego, under the supervision of Shachar Lovett. He spent the year 2019–2020
        as a postdoc at Harvard. His research interests include the analysis of Boolean
        functions, machine learning theory, computational complexity, and cryptography.



                   T HEORY OF C OMPUTING, Volume 18 (2), 2022, pp. 1–18                       18