Authors Iftach Haitner, Thomas Holenstein, Omer Reingold, Salil Vadhan, Hoeteck Wee,
License CC-BY-3.0
Theory of Computing, Volume 16 (8), 2020, pp. 1–55
www.theoryofcomputing.org
Inaccessible Entropy II: IE Functions and Universal One-Way Hashing
Iftach Haitner∗ Thomas Holenstein Omer Reingold† Salil Vadhan‡
Hoeteck Wee§
Received January 22, 2015; Revised September 6, 2020; Published October 13, 2020
Abstract: This paper uses a variant of the notion of inaccessible entropy (Haitner, Reingold, Vadhan and Wee, STOC 2009) to give an alternative construction and proof for the fundamental result, first proved by Rompel (STOC 1990), that Universal One-Way Hash Functions (UOWHFs) can be based on any one-way function. We observe that a small tweak of any one-way function f is already a weak form of a UOWHF: consider the function F(x, i) that returns the i-bit-long prefix of f(x). If F were a UOWHF, then given a random x and i it would be hard to come up with x' ≠ x such that F(x, i) = F(x', i). While this may not be the case, we show (rather easily) that it is hard to sample x' with almost full entropy among all the possible such values of x'. The rest of our construction simply amplifies and exploits this basic property. Combined with other recent work, the construction of three fundamental cryptographic primitives (Pseudorandom Generators, Statistically Hiding Commitments and UOWHFs) out of one-way functions is now to a large extent unified. In particular, all three constructions rely on and manipulate computational notions of entropy in similar ways. Pseudorandom Generators rely on the well-established notion of pseudoentropy, whereas Statistically Hiding Commitments and UOWHFs rely on the newer notion of inaccessible entropy.
In an additional result we reprove the seminal result of Impagliazzo and Levin (FOCS 1989): a reduction from "uniform distribution" average-case complexity problems to ones with arbitrary (polynomial-time samplable) distributions. We do that using techniques similar to those we use to construct UOWHFs from one-way functions, where the source of this similarity is the use of a notion similar to inaccessible entropy. This draws an interesting connection between two seemingly separate lines of research: average-case complexity and universal one-way hash functions.

A preliminary version of this paper appeared as "Universal One-Way Hash Functions via Inaccessible Entropy" in EUROCRYPT 2010 [9].
∗ Supported by ERC starting grant 638121, ISF grant 1076/11 and US-Israel BSF grant 2010196.
† Initial research and conference version supported by US-Israel BSF grant 2006060. Preparation of this journal version supported by NSF grant CCF-1763311.
‡ Initial research and conference version supported by NSF grant CNS-0831289 and US-Israel BSF grants 2006060 and 2010196. Preparation of this journal version supported by NSF grant CCF-1763299 and a Simons Investigator Award.
§ Most of the work was done while at Queens College, CUNY. Supported in part by PSC-CUNY Award #6014939 40 and NSF CAREER Award CNS-0953626.

ACM Classification: F.0
AMS Classification: 03D15
Key words and phrases: cryptography, complexity theory, one-way functions, computational entropy

© 2020 Iftach Haitner, Thomas Holenstein, Omer Reingold, Salil Vadhan and Hoeteck Wee
Licensed under a Creative Commons Attribution License (CC-BY)    DOI: 10.4086/toc.2020.v016a008
1 Introduction
Unlike the more common notions of computational entropy, e. g., pseudoentropy [15], that are only
useful as a lower bound on the “computational entropy” of a distribution, accessible entropy is an upper
bound on computational entropy. In particular, it measures the entropy (Shannon, or other types) of the
distribution computed by a resource-bounded machine.
The inaccessible entropy of the distribution is the gap between its (Shannon) entropy and its accessible
entropy. Inaccessible entropy was introduced by Haitner, Reingold, Vadhan and Wee [12] as a means
to give a simpler construction and proof of statistically hiding commitment from one-way functions
(reproving the result of [10]), and to construct constant-round statistically hiding commitment from
constant-round zero-knowledge proofs for NP. In this article we introduce a simpler variant of their notion to
give an alternative construction and proof for the fundamental result, first proved by Rompel [26], that
Universal One-Way Hash Functions (UOWHFs) can be based on one-way functions. In an additional
result, we reprove the seminal result of Impagliazzo and Levin [18]: a reduction from “uniform distri-
bution" average-case complexity problems to ones with arbitrary (though polynomial-time samplable)
distributions. The latter is proved using similar techniques to the ones we use to construct UOWHFs from
one-way functions, where the source of this similarity is the use of a similar notion of inaccessible entropy.
This draws an interesting connection between two seemingly separate lines of research: average-case
complexity and universal one-way hash-functions.
We start by discussing our construction of universal one-way hash functions; the result about
average-case complexity is described in Section 1.3. Universal one-way hash functions (UOWHFs), as
introduced by Naor and Yung [23], are a weaker form of collision-resistant hash functions; a function
family F is collision resistant if given a randomly chosen function f ∈ F, it is infeasible to find any pair
of distinct inputs x, x' such that f(x) = f(x'). UOWHFs only require target collision resistance, where
the adversary must specify one of the inputs x before seeing the description of the function f . We give a
formal definition.
Definition 1.1. A family of functions

F_k = {F_z : {0,1}^{n(k)} → {0,1}^{m(k)}}_{z∈{0,1}^k},

for n(k), m(k) ∈ poly(k), is a family of universal one-way hash functions (UOWHFs) if it satisfies:

Efficiency: given z ∈ {0,1}^k and x ∈ {0,1}^{n(k)}, F_z(x) can be evaluated in time poly(k).

Shrinking: m(k) < n(k).

Target Collision Resistance: the probability that a PPT (probabilistic polynomial-time) adversary A succeeds in the following game (sketched in code after the definition) is negligible in k:

1. Let (x, state) ← A(1^k).
2. Let z ← {0,1}^k.
3. Let x' ← A(state, z).
4. A succeeds if x, x' ∈ {0,1}^{n(k)}, x ≠ x', and F_z(x) = F_z(x').¹
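For concreteness, the game can be phrased as a short procedure. The following Python sketch is illustrative only: the family F (taking a key z and an input x), the parameters n and k, and the two adversary stages are hypothetical stand-ins supplied by the caller; inputs are modelled as integers below 2^n.

import secrets

def tcr_game(adversary_commit, adversary_collide, F, n, k):
    # Stage 1: the adversary commits to a target input x (plus side information
    # `state`) before seeing the key.
    x, state = adversary_commit(k)
    # The key z is chosen uniformly at random only afterwards.
    z = secrets.randbits(k)
    # Stage 2: given the key, the adversary tries to output a colliding input x'.
    x_prime = adversary_collide(state, z)
    # Success: x' is a distinct n-bit input that F_z maps to the same value as x.
    return (x != x_prime
            and 0 <= x < 2 ** n and 0 <= x_prime < 2 ** n
            and F(z, x) == F(z, x_prime))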
It turns out that this weaker security property suffices for many applications. The most immediate
application given in [23] is secure fingerprinting, whereby the pair ( f , f (x)) can be taken as a compact
“fingerprint” of a large file x, such that it is infeasible for an adversary, seeing the fingerprint, to change
the file x to x' without being detected. More dramatically, [23] also showed that UOWHFs can be used to
construct secure digital signature schemes, whereas all previous constructions (with proofs of security
in the standard model) were based on trapdoor functions (as might have been expected to be necessary
due to the public-key nature of signature schemes). More recently, UOWHFs have been used in the
Cramer–Shoup encryption scheme [8] and in the construction, from one-way functions, of statistically
hiding commitment schemes [10, 12].
Naor and Yung [23] gave a simple and elegant construction of UOWHFs from any one-way per-
mutation. [28] generalized the construction of [23] to get UOWHFs from regular one-way functions.
Rompel [26] gave a more involved construction to prove that UOWHFs can be constructed from an
arbitrary one-way function, thereby resolving the complexity of UOWHFs (as one-way functions are the
minimal complexity assumption for complexity-based cryptography, and are easily implied by UOWHFs);
this remains the state of the art for arbitrary one-way functions.2 While complications may be expected
for constructions from arbitrary one-way functions (due to their lack of structure), Rompel’s analysis also
feels quite ad hoc. In contrast, the construction of pseudorandom generators from one-way functions in
[15], while also somewhat complex, involves natural abstractions (e. g., pseudoentropy) that allow for
modularity and a measure of what is being achieved at each stage of the construction.
1 The ← notation is explained in Sec. 2.1. Namely, on security parameter 1^k, algorithm A first samples an element x in the function family's input domain. Then, given (the description of) a function F_z uniformly drawn from the family, algorithm A has to find a collision with x: an element x' ≠ x that F_z maps to the same output value. To avoid discussing stateful algorithms, we do not allow A to keep a "state" between the game stages. Rather, we enable it to transfer information between the stages using the auxiliary string state.
2 More details of the proof given in [26] are worked out, with some corrections, in [27, 20].
In this paper, we give simpler constructions of UOWHFs from one-way functions, based on (a variant
of) the recently introduced notion of inaccessible entropy [12]. In addition, one of the constructions
obtains slightly better efficiency and security than Rompel’s original construction.
1.1 Inaccessible entropy
For describing our construction, it will be cleaner to work with a variant of UOWHFs where there is
a single shrinking function F : {0, 1}n → {0, 1}m (for each setting of the security parameter k) such
that it is infeasible to find collisions with random inputs. So in our model an adversary A is given a
uniformly random x ← {0,1}^n, outputs an x' such that F(x') = F(x), and succeeds³ if x' ≠ x. Note that we can assume without loss of generality that x' = A(x) is always a preimage of F(x) (A has the option of returning x in case it does not find a different preimage); we refer to an algorithm A with this property as an F-collision finder.
Our construction is based on an information-theoretic view of UOWHFs. The fact that F is shrinking
implies that there are many preimages x' of F(x) available to A. Indeed, if we consider an (inefficient) adversary A(x) that outputs a uniformly random preimage x' ← F^{-1}(F(x)) and let X be a random variable
uniformly distributed on {0, 1}n , then
H(A(X) | X) = H(X | F(X)) ≥ n − m,
where H(· | ·) denotes conditional Shannon entropy. (See Section 2 for more definitional details.) We
refer to the quantity H(X | F(X)) as the real entropy of F −1 .
On the other hand, target collision resistance means that effectively only one of the preimages is accessible to A. That is, for every probabilistic polynomial-time F-collision finder A, we have Pr[A(X) ≠ X] = neg(n), which is equivalent to requiring that:

H(A(X) | X) = neg(n)

for all probabilistic polynomial-time F-collision finders A. (If A can find a collision X' with non-negligible probability, then it can achieve non-negligible conditional entropy by returning X' with probability 1/2 and
returning X with probability 1/2.) We refer to the maximum of H(A(X) | X) over all efficient F-collision
finders as the accessible entropy of F −1 . We emphasize that accessible entropy refers to an upper bound
on a form of computational entropy, in contrast to the Håstad et al. notion of pseudoentropy [15].
Thus, a natural weakening of the UOWHF property is to simply require a noticeable gap between
the real and the accessible entropies of F −1 . That is, for every probabilistic polynomial-time F-collision
finder A, we have H(A(X) | X) < H(X | F(X)) − ∆, for some ∆ ≥ 1/ poly(n), which we refer to as the
inaccessible entropy of F.
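For toy parameters, the real entropy of F^{-1} and the entropy achieved by the unbounded uniform-preimage adversary of the previous paragraphs can be computed by brute force. The sketch below uses an arbitrary illustrative 4-bit function; the point is only that the two quantities coincide for the unbounded adversary, which is exactly the benchmark an efficient F-collision finder must fall short of when F^{-1} has inaccessible entropy.

from collections import defaultdict
from math import log2

n = 4
def F(x):                       # illustrative shrinking function on {0,1}^4 (2-bit output)
    return x % 4

preimages = defaultdict(list)   # group inputs of F by their image
for x in range(2 ** n):
    preimages[F(x)].append(x)

# Real entropy of F^{-1}: H(X | F(X)) = E_x[log |F^{-1}(F(x))|] for uniform X.
real = sum(log2(len(preimages[F(x)])) for x in range(2 ** n)) / 2 ** n

# Conditional entropy H(A(X) | X) of the unbounded adversary that returns a
# uniformly random preimage of F(x), computed from the explicit joint distribution.
acc = 0.0
for x in range(2 ** n):
    S = preimages[F(x)]
    acc += sum((1 / len(S)) * log2(len(S)) for _ in S) / 2 ** n

print(real, acc)                # both equal n - m = 2.0 for this toy F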
1.2 Our UOWHF constructions
Our constructions of UOWHFs have two parts. First, we show how to obtain a function with noticeable
inaccessible entropy from any one-way function. Second, we show how to build a UOWHF from any
function with inaccessible entropy.
3 It is easy to convert any such function F into a standard UOWHF family by defining F_z(x) = F(z + x).
OWFs =⇒ inaccessible entropy. Given a one-way function f : {0, 1}n → {0, 1}m , we show that a
random truncation of f has inaccessible entropy. Specifically, we define F(x, i) to be the first i bits of
f (x).
To see that this works, suppose for contradiction that F does not have noticeable inaccessible
entropy. That is, we have an efficient adversary A that on input (x, i) can sample from the set S(x, i) = {x' : f(x')_{1...i} = f(x)_{1...i}} with almost-maximal entropy, which is equivalent to sampling according to a distribution that is statistically close to the uniform distribution on S(x, i). We can now use A to construct an inverter Inv for f that works as follows on input y: choose x^0 ← {0,1}^n, and then for i = 1, ..., n generate a random x^i ← A(x^{i−1}, i − 1), subject to the constraint that f(x^i)_{1,...,i} = y_{1,...,i}. The latter step is feasible, since we are guaranteed that f(x^i)_{1,...,i−1} = y_{1,...,i−1} by the fact that A is an F-collision finder, and the expected number of trials needed to get agreement with y_i is at most O(1) (since y_i ∈ {0,1}, and y and f(x^i) are statistically close). It is not difficult to show that when run on a random output Y of f,
Inv produces an almost-uniform preimage of Y . This contradicts the one-wayness of f . Indeed, we only
need f to be a distributional one-way function [19], whereby it is infeasible to generate almost-uniform
preimages under f .
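This reduction can be exercised end to end on toy parameters. In the sketch below, the entropy-maximizing sampler A is implemented by brute force (which is exactly what an efficient adversary is assumed unable to do), and the inverter proceeds bit by bit as described above; the toy function f and all parameters are illustrative assumptions.

import random

n = 8
def f(x):                                    # toy candidate {0,1}^8 -> {0,1}^8, as a bit tuple
    v = (x * 113 + 7) % 251
    return tuple((v >> k) & 1 for k in range(n))

def A(x, i):
    # Ideal F-collision finder: a uniformly random x' with f(x')[:i] == f(x)[:i].
    prefix = f(x)[:i]
    return random.choice([z for z in range(2 ** n) if f(z)[:i] == prefix])

def Inv(y):
    x = random.randrange(2 ** n)             # x^0 <- {0,1}^n
    for i in range(1, n + 1):
        while True:                          # O(1) expected trials to match bit i of y
            cand = A(x, i - 1)
            if f(cand)[:i] == y[:i]:
                x = cand
                break
    return x

y = f(random.randrange(2 ** n))
assert f(Inv(y)) == y                        # Inv outputs a preimage of y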
Inaccessible entropy =⇒ UOWHFs. Once we have a non-negligible amount of inaccessible entropy,
we can construct a UOWHF via a series of standard transformations.
1. Repetition: By evaluating F on many inputs, we can increase the amount of inaccessible entropy from 1/poly(n) to poly(n). Specifically, we take F^t(x_1, ..., x_t) = (F(x_1), ..., F(x_t)), where t = poly(n). This transformation also has the useful effect of converting the real entropy of F^{-1} to real min-entropy: with very high probability over x ← X^t, F^t(x) has a large number of preimages. (This and the next two transformations are sketched in code after this list.)
2. Hashing Inputs: By hashing the input to F (namely, taking F'(x, g) = (F(x), g(x)) for a universal hash function g), we can reduce both the real (min-)entropy and the accessible entropy so that (F')^{-1} still has a significant amount of real entropy, but has (weak) target collision resistance (on random inputs).
3. Hashing Outputs: By hashing the output of F (namely, taking F'(x, g) = g(F(x))), we can reduce the output length of F to obtain a shrinking function that still has (weak) target collision resistance.
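As promised in step 1, here is a minimal sketch of the three transformations as function combinators. For readability the hash g is fixed by closure, whereas in the actual construction it is part of the function's input (i.e., of the key); all names are illustrative.

def repeat(F, t):
    # Repetition: F^t(x_1, ..., x_t) = (F(x_1), ..., F(x_t)).
    def Ft(xs):
        assert len(xs) == t
        return tuple(F(x) for x in xs)
    return Ft

def hash_inputs(F, g):
    # Input hashing: F'(x) = (F(x), g(x)); appending a hash of the input cuts down
    # both the real and the accessible entropy of the inverse.
    return lambda x: (F(x), g(x))

def hash_outputs(F, g):
    # Output hashing: F'(x) = g(F(x)); shrinks the output length while (weak)
    # target collision resistance is preserved.
    return lambda x: g(F(x))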
There are two technicalities that occur in the above steps. First, hashing the inputs only yields weak
target collision resistance; that is, we only guarantee that an adversary fails in the target collision resistance
game with probability at least 1/ poly(n) rather than 1 − neg(n). This is due to the fact that accessible
Shannon entropy is an average-case measure and thus allows for the possibility that the adversary can
achieve high accessible entropy most of the time. Fortunately, this weak form of target collision resistance
can be amplified to full target collision resistance using another application of repetition and hashing
(similar to [6]).
Second, the hashing steps require having a fairly accurate estimate of the real entropy. This can
be handled similarly to [15, 26], by trying all (polynomially many) possibilities and concatenating the
resulting UOWHFs, at least one of which will be target collision resistant.
A more efficient construction. We obtain a more efficient construction of UOWHFs by hashing the
output of the one-way function f before truncating. That is, we define F(x, g, i) = (g, g( f (x))1···i ). This
function is in the spirit of the function that Rompel [26] uses as a first step, but our function uses a three-
wise independent hash function instead of an n-wise independent one, and enjoys a simpler structure.4
Our analysis of this function is simpler than Rompel’s and can be viewed as providing a clean abstraction
of what it achieves (namely, inaccessible entropy) that makes the subsequent transformation to a UOWHF
easier.
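A hedged sketch of this first step, with the one-way function f and the three-wise independent hash (given by a description g_desc and an evaluation procedure apply_g) supplied by the caller; it only spells out how the two constructions' first steps differ.

def F_truncate(f, x, i):
    # First construction (Section 4.1): keep a prefix of f(x) of random length i.
    return tuple(f(x)[:i])

def F_hash_then_truncate(f, apply_g, x, g_desc, i):
    # This construction (analyzed in Section 4.2): hash f(x) with a three-wise
    # independent g first, then truncate; the description of g is part of the output.
    return (g_desc, tuple(apply_g(g_desc, f(x))[:i]))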
We obtain improved UOWHF parameters over our first construction for two reasons. First, we obtain a larger amount of inaccessible entropy: (log n)/n bits instead of roughly 1/n⁴ bits. Second, we obtain a bound on a stronger form of accessible entropy, which enables us to get full target collision resistance when we hash the inputs, avoiding the second amplification step.
The resulting overall construction yields better parameters than Rompel's original construction. A one-way function of input length n yields a UOWHF with output length Õ(n⁷), slightly improving Rompel's bound of Õ(n⁸). Additionally, we are able to reduce the key length needed: Rompel's original construction uses a key of length Õ(n¹²), whereas our construction only needs a key of length Õ(n⁷). If we allow the construction to utilize some nonuniform information (namely an estimate of the real entropy of F^{-1}), then we obtain output length Õ(n⁵), improving Rompel's bound of Õ(n⁶). For the key length, the improvement in this case is from Õ(n⁷) to Õ(n⁵). Of course, these bounds are still far from practical, but they illustrate the utility of inaccessible entropy in reasoning about UOWHFs, which may prove useful in future constructions (whether based on one-way functions or other building blocks).
1.3 Connection to average-case complexity
We use the notion of inaccessible entropy to reprove the following theorem by Impagliazzo and Levin [18],
given in the realm of average-case complexity.
Theorem 1.2 ([18], informal). Assume there exists an NP language that is hard on some (efficiently)
samplable distribution for heuristics: every efficient algorithm fails to decide the language correctly on a
noticeable part of the distribution. Then there exists a language in NP that is hard against heuristics on
the uniform distribution.
Our proof follows to a large extent in the footsteps of [18], where the main novelty is formulating
the proof in the language of inaccessible entropy, and rephrasing it to make it resemble our proof of
UOWHFs from one-way functions. This draws an interesting connection between two seemingly separate
lines of research: average-case complexity and universal one-way hash-functions.
As in [18], we prove Theorem 1.2 by proving its search variant: hardness of finding a witness on a
samplable distribution implies hardness of finding a witness on the uniform distribution. Let (R, D) be
an NP-search problem (i. e., R is an NP relation and D is a samplable distribution) that is hard to solve
heuristically, and let D be the algorithm sampling instances according to D. For a family of pairwise independent hash functions G, consider the following NP relation:

R' = {(x' = (g, i), w' = (x, w)) : g ∈ G, (D(x), w) ∈ R ∧ g(D(x))_{1,...,i} = 0^i}.⁵
4 [26] started with the function f'(z, g_1, g_2) := (g_2(f_0(g_1(z))), g_1, g_2), where g_1 and g_2 are n-wise independent hash functions, and f_0 is defined as f_0(x, y, i) = (f(x), y_{n−i}, 0^i).
5 To keep the discussion simple, we ignore input-length considerations. See Section 6 for the formal definitions.
While R'_L (i.e., the language of R') might not be hard on the uniform distribution (interpreted as the uniform distribution over random pairs (g, i)), it is not hard to prove that the distribution is "somewhat hard." In particular, it happens noticeably often that i is the "right one for (R, D)"; meaning that for a fixed element y which might be in the language, exactly one x with D(x) = y satisfies g(x) = 0^i. Conditioned on this event, letting A(·)_x denote the x part of the witness output by A, it is not hard to show that H(A(G, I)_x) is noticeably less than its information-theoretic maximum: for any efficient algorithm A, it holds that

H(A(G, I)_x) < H(D|_{G(D)_{1,...,I} = 0^I}),

where (G, I) is the parsing of a random string into a pair (g ∈ G, i), and assuming for simplicity that A never fails to provide a correct witness. Namely, the (accessible) entropy of A is smaller than the (real) entropy of D. Using means similar to the ones used to amplify the initial UOWHF constructions described in Section 1.2, the above gap can be amplified to induce hardness over the uniform distribution.
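Ignoring input-length issues as above, membership in R' is mechanically checkable from the original verifier and the sampler. A hedged sketch, where R_verifier, the sampler D (run on its coins x), and the hash evaluation apply_g are all assumptions supplied by the caller:

def R_prime_check(instance, witness, R_verifier, D, apply_g):
    # instance x' = (g, i); witness w' = (x, w), with x the sampler's coins.
    g, i = instance
    x, w = witness
    y = D(x)                        # the sampled instance D(x)
    hashed = apply_g(g, y)          # g(D(x)) as a bit tuple
    return R_verifier(y, w) and tuple(hashed[:i]) == (0,) * i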
1.4 Perspective
The idea of inaccessible entropy was introduced in [12] for the purpose of constructing statistically hiding
commitment schemes from one-way functions and from zero-knowledge proofs. There, the nature of
statistically hiding commitments necessitated more involved notions of inaccessible entropy than we
present here — inaccessible entropy was defined in [12] for interactive protocols, where one considers
adversaries that try to generate next-messages or next-blocks of high entropy. (See [13] for a simpler
notion of inaccessible entropy that suffices for commitments based on one-way functions.)
Here, we are able to work with a much simpler form of inaccessible entropy (significantly simpler also
than the notion considered in [13]). The simplicity comes from the non-interactive nature of UOWHFs
(and of solving NP problems) so we only need to measure the entropy of a single string output by the
adversary. Thus, the definitions here can serve as a gentler introduction to the concept of inaccessible
entropy.
On the other hand, the many-round notions from [12, 13] allow for a useful “entropy equalization”
transformation that avoids the need to try all possible guesses for the entropy. We do not know an
analogous transformation for constructing UOWHFs. We also note that our simple construction of
a function with inaccessible entropy by randomly truncating a one-way function (and its analysis) is
inspired by the construction of an “inaccessible entropy generator” from a one-way function in [12].
Finally, with our constructions, the proof that one-way functions imply UOWHFs now parallels those
of pseudorandom generators [15, 11] and statistically hiding commitments [10, 12], with UOWHFs and
statistically hiding commitments using dual notions of entropy (high real entropy, low accessible entropy)
to pseudorandom generators (low real entropy, high pseudoentropy). See the surveys [14, 30].
1.5 Related work
UOWHFs. Katz and Koo [20] gave a complete write-up of Rompel’s result [26, 27], with some
corrections. Prior to our paper, Rompel’s result represented the state of the art for UOWHFs from
arbitrary one-way functions. Since the initial publication of this work in 2010 [9], there have been several
improvements in the setting of regular one-way functions. Ames et al. [1] presented an even more efficient
construction of UOWHFs from (unknown) regular one-way functions. Barhum and Maurer [2] gave an even more efficient construction assuming the regularity of the one-way function is known, and Yu et al. [32] improved the result of [2], presenting an almost optimal construction (with respect to the known black-box impossibility results) of UOWHFs from known-regular one-way functions.
Average-case complexity. The notion of average-case complexity was first introduced by Levin [21].
We focus on the result by Impagliazzo and Levin [18], who show that if it is possible to use a polynomial-time sampler to pick average-case problems which are hard, then there is a different problem which
is hard on average for the uniform distribution. We give a different perspective on that proof, and in
particular highlight the connections to inaccessible entropy. A good overview of average-case complexity
was given by Bogdanov and Trevisan [5].
Recently, Hubáček et al. [17] made a different, and very elegant, connection between constructing
UOWHFs from OWFs and average-case hardness on the uniform distribution, showing that a solution to
the first challenge implies a solution to the second one. Their approach is surprisingly simple: if OWFs
exist, then UOWHFs also exist, which can be seen as a problem that is hard on the uniform distribution
(given a UOWHF key and an input, find a colliding input). On the other hand, assuming OWFs do not
exist, without loss of generality the sampler of a hard-on-the-average problem can be assumed to output
its random coins (indeed, its coins can be sampled from its original output under the assumption that
OWFs do not exist). So in both cases, a hard-on-the-average problem implies a hard-on-the-average
problem with respect to the uniform distribution.
Organization of the paper
Formal definitions are given in Section 2, and the notion of inaccessible entropy used throughout the paper is defined in Section 3. In Section 4 we show how to use any one-way function to get a function with inaccessible entropy, and in Section 5 we use any function with inaccessible entropy to construct a UOWHF. Finally, our result for average-case complexity is described in Section 6.
2 Preliminaries
Most of the material in this section is taken almost verbatim from [12], and missing proofs can be found
in that paper.
2.1 Notation
All logarithms considered here are in base two. For t ∈ N, we let [t] = {1, . . . ,t}. A function µ : N → [0, 1]
is negligible, denoted µ(n) = neg(n), if µ(n) = n−ω(1) . We let poly denote the set of all polynomials, and
let PPT stand for probabilistic polynomial time. Given a distribution D, we write d ← D to indicate that d
is selected according to D. Similarly, given a finite set S, we write s ← S to indicate that s is selected
according to the uniform distribution on S.
2.2 Random variables
Let X and Y be random variables taking values in a discrete universe U. We adopt the convention that
when the same random variable appears multiple times in an expression, all occurrences refer to the same
instantiation. For example, Pr[X = X] is 1. For an event E, we write X|E to denote the random variable X
conditioned on E. The support of a random variable X is Supp(X) := {x : Pr[X = x] > 0}. X is flat if it
is uniform on its support. For an event E, we write I(E) for the corresponding indicator random variable,
i. e., I(E) is 1 when E occurs and is 0 otherwise.
We write ‖X − Y‖ to denote the statistical difference (also known as variation distance) between X and Y, i.e.,

‖X − Y‖ = max_{T⊆U} |Pr[X ∈ T] − Pr[Y ∈ T]|.

We say that X and Y are ε-close if ‖X − Y‖ ≤ ε and ε-far otherwise.
2.3 Entropy measures
In this article we shall refer to several measures of entropy. The relation and motivation of these measures
is best understood by considering a notion that we will refer to as the sample-entropy: For a random
variable X and x ∈ Supp(X), we define the sample-entropy of x with respect to X to be the quantity
HX (x) := log(1/ Pr[X = x]).
The sample-entropy measures the amount of “randomness” or “surprise” in the specific sample x, assuming
that x has been generated according to X. Using this notion, we can define the Shannon entropy H(X)
and min-entropy H∞ (X) as follows:
H(X) := E_{x←X}[H_X(x)],
H∞(X) := min_{x∈Supp(X)} H_X(x).
We will also discuss the max-entropy H0 (X) := log(|Supp(X)|). The term “max-entropy” and its relation
to the sample-entropy will be made apparent below.
It can be shown that H∞(X) ≤ H(X) ≤ H_0(X), with equality if and only if X is flat. Thus, saying H∞(X) ≥ k is a strong way of saying that X has "high entropy" and H_0(X) ≤ k a strong way of saying that X has "low entropy".
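For a small explicitly given distribution these measures can be computed directly, which makes the inequality H∞(X) ≤ H(X) ≤ H_0(X) easy to check; the example distribution below is arbitrary.

from math import log2

def sample_entropy(dist, x):
    return -log2(dist[x])                        # H_X(x) = log(1 / Pr[X = x])

def shannon_entropy(dist):
    return sum(p * -log2(p) for p in dist.values() if p > 0)    # E_{x<-X}[H_X(x)]

def min_entropy(dist):
    return min(-log2(p) for p in dist.values() if p > 0)        # min_x H_X(x)

def max_entropy(dist):
    return log2(sum(1 for p in dist.values() if p > 0))         # log |Supp(X)|

X = {'a': 0.5, 'b': 0.25, 'c': 0.25}
assert min_entropy(X) <= shannon_entropy(X) <= max_entropy(X)   # 1.0 <= 1.5 <= log2(3)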
Smoothed entropies. Shannon entropy is robust in that it is insensitive to small statistical differences.
Specifically, if X and Y are ε-close then |H(X) − H(Y )| ≤ ε · log |U|. For example, if U = {0, 1}n and
ε = ε(n) is a negligible function of n (i. e., ε = n−ω(1) ), then the difference in Shannon entropies is
vanishingly small (indeed, negligible). In contrast, min-entropy and max-entropy are brittle and can
change dramatically with a small statistical difference. Thus, it is common to work with “smoothed”
versions of these measures, whereby we consider a random variable X to have high entropy if X is ε-close
to some X' with H∞(X') ≥ k, and to have low entropy if X is ε-close to some X' with H_0(X') ≤ k, for some parameter k and a negligible ε.⁶
6 The term “smoothed entropy” was coined by [25], but the notion of smoothed min-entropy has commonly been used
(without a name) in the literature on randomness extractors [24].
These smoothed versions of min-entropy and max-entropy can be captured quite closely (and more
concretely) by requiring that the sample-entropy be large or small, resp., with high probability:
Lemma 2.1.
1. Suppose that with probability at least 1 − ε over x ← X, we have H_X(x) ≥ k. Then X is ε-close to a random variable X' such that H∞(X') ≥ k.
2. Suppose that X is ε-close to a random variable X' such that H∞(X') ≥ k. Then with probability at least 1 − 2ε over x ← X, we have H_X(x) ≥ k − log(1/ε).
Lemma 2.2.
1. Suppose that with probability at least 1 − ε over x ← X, we have H_X(x) ≤ k. Then X is ε-close to a random variable X' such that H_0(X') ≤ k.
2. Suppose that X is ε-close to a random variable X' such that H_0(X') ≤ k. Then with probability at least 1 − 2ε over x ← X, we have H_X(x) ≤ k + log(1/ε).
Think of ε as inverse polynomial or a negligible function in n = log(|U|). The above lemmas
show that up to negligible statistical difference and a slightly super-logarithmic number of entropy bits,
the min-entropy and the max-entropy are captured by a lower and an upper bound on sample-entropy,
respectively.
Conditional entropies. We will also be interested in conditional versions of entropy. For jointly
distributed random variables (X,Y ) and (x, y) ∈ Supp(X,Y ), we define the conditional sample-entropy to
be H_{X|Y}(x | y) = log(1/Pr[X = x | Y = y]). Then the standard conditional Shannon entropy can be written as

H(X | Y) = E_{(x,y)←(X,Y)}[H_{X|Y}(x | y)] = E_{y←Y}[H(X|_{Y=y})] = H(X, Y) − H(Y).
There is no standard definition of conditional min-entropy and max-entropy, or even their smoothed
versions. For us, it will be most convenient to generalize the sample-entropy characterizations of
smoothed min-entropy and max-entropy given above. Specifically we will think of X as having “high
min-entropy” and “low max-entropy” given Y if with probability at least 1 − ε over (x, y) ← (X,Y ), we
have HX|Y (x | y) ≥ k and HX|Y (x | y) ≤ k, resp.
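The conditional quantities follow the same pattern; a short sketch computing the conditional sample-entropy and H(X | Y) from a joint distribution given as a dictionary (the joint distribution used is arbitrary):

from math import log2

def cond_sample_entropy(joint, x, y):
    # H_{X|Y}(x | y) = log(1 / Pr[X = x | Y = y])
    p_y = sum(p for (_, b), p in joint.items() if b == y)
    return -log2(joint[(x, y)] / p_y)

def cond_shannon_entropy(joint):
    # H(X | Y) = E_{(x,y)<-(X,Y)}[H_{X|Y}(x | y)] = H(X, Y) - H(Y)
    return sum(p * cond_sample_entropy(joint, x, y)
               for (x, y), p in joint.items() if p > 0)

XY = {(0, 0): 0.25, (1, 0): 0.25, (0, 1): 0.5}
print(cond_shannon_entropy(XY))   # 0.5: X is uniform given Y = 0 and fixed given Y = 1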
Flattening Shannon entropy. The asymptotic equipartition property in information theory states that
for a random variable X t = (X1 , . . . , Xt ), whose marginals Xi are independent, with high probability, the
sample-entropy HX t (X1 , . . . , Xt ) is close to its expectation. In [15] a quantitative bound on this was shown
by reducing it to the Hoeffding bound. (One cannot directly apply the Hoeffding bound, because HX (X)
does not have an upper bound, but one can define a related random variable which does.) We use a
different bound here, which was proven in [16]. The bound has the advantage that it is somewhat easier
to state, even though the proof is longer. We remark that the bound from [15] would be sufficient for our
purposes.
Lemma 2.3.
1. Let X be a random variable taking values in a universe U, let t ∈ N, and let ε > 2^{−t}. Then with probability at least 1 − ε over x ← X^t,

|H_{X^t}(x) − t · H(X)| ≤ O(√(t · log(1/ε)) · log(|U|)).

2. Let X, Y be jointly distributed random variables where X takes values in a universe U, let t ∈ N, and let ε > 2^{−t}. Then with probability at least 1 − ε over (x, y) ← (X^t, Y^t) := (X, Y)^t,

|H_{X^t|Y^t}(x | y) − t · H(X | Y)| ≤ O(√(t · log(1/ε)) · log(|U|)).

The statement follows directly from [16, Thm 2].
2.4 Hashing
A family of functions F = {f : {0,1}^n → {0,1}^m} is 2-universal if for every x ≠ x' ∈ {0,1}^n, when we choose f ← F, we have Pr[f(x) = f(x')] ≤ 1/|{0,1}^m|. F is t-wise independent if for all distinct x_1, ..., x_t ∈ {0,1}^n, when we choose f ← F, the random variables f(x_1), ..., f(x_t) are independent and each of them is uniformly distributed over {0,1}^m.
F is explicit if given the description of a function f ∈ F and x ∈ {0, 1}n , the value f (x) can be
computed in time poly(n, m). F is constructible if it is explicit and there is a probabilistic polynomial-
time algorithm that given x ∈ {0, 1}n , and y ∈ {0, 1}m , outputs a random f ← F such that f (x) = y.
It is well-known that there are constructible families of t-wise independent functions in which
choosing a function f ← F uses only t · max {n, m} random bits.
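As a toy instance of such families (with n = m = 8), degree-1 polynomials over GF(2^8) give a pairwise-independent family and degree-2 polynomials give a three-wise independent one, of the kind used later in Section 4.2; the field, the reduction polynomial and the representation are illustrative choices, not a construction assumed by the paper.

import random

IRRED = 0x11B                       # x^8 + x^4 + x^3 + x + 1, irreducible over GF(2)

def gf_mul(a, b):
    # Multiplication in GF(2^8): carry-less multiply with on-the-fly reduction.
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= IRRED
        b >>= 1
    return r

def sample_pairwise():
    # h_{a,b}(x) = a*x + b: a 2-universal (indeed pairwise-independent) family.
    a, b = random.randrange(256), random.randrange(256)
    return lambda x: gf_mul(a, x) ^ b

def sample_threewise():
    # h_{a,b,c}(x) = a*x^2 + b*x + c: a three-wise independent family.
    a, b, c = (random.randrange(256) for _ in range(3))
    return lambda x: gf_mul(a, gf_mul(x, x)) ^ gf_mul(b, x) ^ c

# Truncating the output to m < 8 bits preserves t-wise independence, and the family
# is constructible: to force h(x0) = y0, draw the other coefficients and solve for c.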
3 Inaccessible entropy for inversion problems
In this section we define, following the informal description given in the introduction, the real and accessible entropy of the inverse of a function. The inaccessible entropy of the inverse is defined as the gap between the two.
3.1 Real entropy
For a function F, we define the real entropy of F −1 to be the amount of entropy left in the input after
revealing the output. We measure the above entropy using Shannon entropy (average case), min-entropy
and max-entropy.
Definition 3.1 (real entropy). Let n be a security parameter, and F : {0, 1}n → {0, 1}m a function. We
say that F −1 has real Shannon entropy k if
H(X | F(X)) = k,
where X is uniformly distributed on {0,1}^n. We say that F^{-1} has real min-entropy at least k if there is a negligible function ε = ε(n) such that

Pr_{x←X}[H_{X|F(X)}(x | F(x)) ≥ k] ≥ 1 − ε(n).

We say that F^{-1} has real max-entropy at most k if there is a negligible function ε = ε(n) such that

Pr_{x←X}[H_{X|F(X)}(x | F(x)) ≤ k] ≥ 1 − ε(n).
It is easy to verify that, ignoring negligible terms, the min-entropy of F −1 is at most its Shannon
entropy which in turn is at most its max-entropy, where equality holds only if F is regular. We also note
that more concrete formulas for the entropies above are:

H_{X|F(X)}(x | F(x)) = log |F^{-1}(F(x))|,
H(X | F(X)) = E[log |F^{-1}(F(X))|].
As our goal is to construct UOWHFs that are shrinking, achieving high real entropy is a natural
intermediate step. Indeed, the amount by which F shrinks is a lower bound on the real entropy of F −1 :
Proposition 3.2. If F : {0, 1}n → {0, 1}m , then the real Shannon entropy of F −1 is at least n − m, and
the real min-entropy of F −1 is at least n − m − s for any s = ω(log n).
Proof. For Shannon entropy, we have
H(X | F(X)) ≥ H(X) − H(F(X)) ≥ n − m.
For min-entropy, let S = {y ∈ {0,1}^m : Pr[F(X) = y] < 2^{−m−s}}. Then Pr[F(X) ∈ S] ≤ 2^m · 2^{−m−s} = 2^{−s} = neg(n), and for every x such that F(x) ∉ S, we have

H_{X|F(X)}(x | F(x)) = log (1 / Pr[X = x | F(X) = F(x)]) = log (Pr[F(X) = F(x)] / Pr[X = x]) ≥ log (2^{−m−s} / 2^{−n}) = n − m − s.
3.2 Accessible entropy
We define the accessible entropy of F^{-1} using the notion of a "collision-finding" algorithm, an algorithm that aims to find a second preimage of F(X) with "maximal entropy". The accessible entropy of F^{-1} will be defined as the entropy achieved by the best efficient collision-finding algorithm.
Definition 3.3 (collision finding algorithm). For a function F : {0, 1}n → {0, 1}m , an F-collision-finder
is a randomized algorithm A such that for every x ∈ {0, 1}n and coin tosses r for A, we have A(x; r) ∈
F −1 (F(x)).
Note that A is required to always produce an input x' ∈ {0,1}^n such that F(x) = F(x'). This is a reasonable constraint because A has the option of outputting x' = x if it does not find a true collision. We consider A's goal to be maximizing the entropy of its output x' = A(x), given a random input x.
It is easy to see that if we let A be computationally unbounded, then the optimum turns out to equal exactly the real entropy:
Proposition 3.4. Let F : {0, 1}n → {0, 1}m . Then the real Shannon entropy of F −1 equals the maximum
of H(A(X; R) | X) over all (computationally unbounded) F-collision finders A, where the random variable
X is uniformly distributed in {0,1}^n and R is uniformly random coin tosses for A. That is,

H(X | F(X)) = max_A H(A(X; R) | X),

where the maximum is taken over all F-collision finders A.
Proof. The "optimal" F-collision finder that maximizes H(A(X) | X) is the algorithm Ã that, on input x, outputs a uniformly random element of F^{-1}(F(x)). Then

H(Ã(X; R) | X) = E[log |F^{-1}(F(X))|] = H(X | F(X)).
The notion of accessible entropy simply restricts the above to PPT algorithms. We consider both Shannon- and max-entropy variants (since we aim to upper bound the accessible entropy, we do not care about the min-entropy variant).
Definition 3.5 (accessible entropy). Let n be a security parameter and F : {0, 1}n → {0, 1}m a function.
We say that F −1 has accessible Shannon entropy at most k if for every PPT F-collision-finder A, we have
H(A(X; R) | X) ≤ k
for all sufficiently large n, where the random variable X is uniformly distributed on {0, 1}n and R is
uniformly random coin tosses for A.
We say that F^{-1} has p-accessible max-entropy at most k if for every PPT F-collision-finder A, there exists a family of sets {L(x)}_{x∈Supp(X)}, each of size at most 2^k, such that x ∈ L(x) for all x ∈ Supp(X), and
Pr [A(X; R) ∈ L(X)] ≥ 1 − p
for all sufficiently large n, where the random variable X is uniformly distributed on {0, 1}n and R is
uniformly random coin tosses for A. In addition, if p = ε(n) for some negligible function ε(·), then we
simply say that F −1 has accessible max-entropy at most k.
It is easy to verify that, ignoring negligible terms, the accessible Shannon entropy of F −1 is at most
its accessible max-entropy, i. e., if the accessible max-entropy of F −1 is at most k, then its accessible
Shannon entropy is at most k. (We will later, in Section 3.2.1, introduce an in-between variant of accessible entropy: larger than Shannon, smaller than max.)
The reason that having an upper bound on accessible entropy is useful as an intermediate step towards constructing UOWHFs is that accessible max-entropy 0 is equivalent to target collision resistance (on random inputs):
Definition 3.6 (q-collision-resistant). Let F : {0, 1}n → {0, 1}m be a function and q = q(n) ∈ [0, 1]. We
say that F is q-collision-resistant on random inputs if for every PPT F-collision-finder A,
Pr[A(X; R) = X] ≥ q,
for all sufficiently large n, where the random variable X is uniformly distributed on {0, 1}n and R is
uniformly random coin tosses for A. In addition, if q = 1 − ε(n) for some negligible function ε(·), we
say that F is collision-resistant on random inputs.
Lemma 3.7. Let n be a security parameter and F : {0, 1}n → {0, 1}m be a function. Then, for any
p = p(n) ∈ (0, 1), the following statements are equivalent:
(1) F −1 has p-accessible max-entropy 0.
(2) F is (1 − p)-collision-resistant on random inputs.
In particular, F −1 has accessible max-entropy 0 iff F is collision-resistant on random inputs.
Proof. That (1) implies (2) follows readily from the definition. To see that (2) implies (1), simply take L(x) = {x}.
While bounding p-accessible max-entropy with negligible p is our ultimate goal, one of our construc-
tions will work by first giving a bound on accessible Shannon entropy, and then deducing a bound on
p-accessible max-entropy for a value of p < 1 using the following lemma:
Lemma 3.8. Let n be a security parameter and F : {0,1}^n → {0,1}^m be a function. If F^{-1} has accessible Shannon entropy at most k, then F^{-1} has p-accessible max-entropy at most k/p + O(2^{−k/p}) for any p = p(n) ∈ (0, 1).

Proof. Fix any PPT F-collision-finder A. From the bound on accessible Shannon entropy, we have that H(A(X; R) | X) ≤ k. Applying Markov's inequality, we have

Pr_{x←X, r←R}[H_{A(X;R)|X}(A(x; r) | x) ≤ k/p] ≥ 1 − p.

Take L(x) to be the set

L(x) = {x} ∪ {x' : H_{A(X;R)|X}(x' | x) ≤ k/p}.

We may rewrite L(x) as {x} ∪ {x' : Pr_r[A(x; r) = x'] ≥ 2^{−k/p}}. It is easy to see that |L(x)| ≤ 2^{k/p} + 1, and thus F^{-1} has p-accessible max-entropy at most k/p + O(2^{−k/p}).
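The sets built in this proof can be checked numerically once the collision finder's per-input output distributions are written down explicitly; the distributions below are arbitrary illustrative numbers, and the two assertions are exactly the size and coverage guarantees of the lemma.

from math import log2

# dist[x][x'] = Pr[A(x) = x'] for a hypothetical collision finder A.
dist = {
    'x1': {'x1': 0.5, 'u': 0.25, 'v': 0.25},
    'x2': {'x2': 0.9, 'w': 0.1},
}

# Accessible Shannon entropy k = H(A(X) | X) for X uniform over the inputs.
k = sum(sum(q * -log2(q) for q in d.values()) for d in dist.values()) / len(dist)
p = 0.5                                   # target parameter of Lemma 3.8

L = {x: {x} | {xp for xp, q in d.items() if q >= 2 ** (-k / p)}
     for x, d in dist.items()}
coverage = sum(sum(q for xp, q in d.items() if xp in L[x])
               for x, d in dist.items()) / len(dist)

assert all(len(S) <= 2 ** (k / p) + 1 for S in L.values())   # size bound
assert coverage >= 1 - p                                     # Markov-style coverage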
Once we have a bound on p-accessible max-entropy for some p < 1, we need to apply several
transformations to obtain a function with a good bound on neg(n)-accessible max-entropy.
3.2.1 Accessible average max-entropy
Our second construction (which achieves better parameters) starts with a bound on a different, average-case form of accessible entropy, which is stronger than bounding the accessible Shannon entropy. The benefit of this notion is that it can be converted more efficiently to neg(n)-accessible max-entropy, by simply taking repetitions.
To motivate the definition, recall that a bound on accessible Shannon entropy means that the sample entropy H_{A(X;R)|X}(x' | x) is small on average over x ← X and x' ← A(x; R). This sample entropy may
depend on both the input x and the x' output by the adversary (which in turn may depend on its coin tosses). A stronger requirement is to say that we have upper bounds k(x) on the sample entropy that depend only on x. The following definition captures this idea, thinking of k(x) = log |L(x)|. (We work with sets rather than sample entropy to avoid paying the log(1/ε) loss in Lemma 2.2.)
Definition 3.9 (accessible average max-entropy). Let n be a security parameter and F : {0, 1}n → {0, 1}m
a function. We say that F −1 has accessible average max-entropy at most k if for every PPT F-collision-
finder A, there exists a family of sets {L(x)}x∈Supp(X) and a negligible function ε = ε(n) such that
x ∈ L(x) for all x ∈ Supp(X), E[log |L(X)|] ≤ k and
Pr [A(X; R) ∈ L(X)] ≥ 1 − ε(n),
for all sufficiently large n, where the random variable X is uniformly distributed on {0, 1}n and R is
uniformly random coin tosses for A.
It is easy to verify that, ignoring negligible terms, the accessible average max-entropy of F −1 is at
least its accessible Shannon entropy and at most its accessible max-entropy.
4 Inaccessible entropy from one-way functions
We present two constructions of inaccessible entropy functions from one-way functions. The one in
Section 4.1 is extremely simple and merely trims the one-way function output. The one in Section 4.2
is somewhat more complicated (in the spirit of the first step of Rompel [26], though still significantly simpler) and yields a more efficient overall construction.
4.1 A direct construction
The goal of this section is to prove the following theorem:
Theorem 4.1. Let f : {0,1}^n → {0,1}^n be a one-way function and define F over {0,1}^n × [n] as F(x, i) = f(x)_{1,...,i−1}. Then, F^{-1} has accessible Shannon entropy at most H(Z | F(Z)) − 1/(64n²), where Z = (X, I) is uniformly distributed over {0,1}^n × [n].
We do not know whether the function F −1 has even less accessible Shannon entropy (say, with a gap
of Ω(1/n)). However, it seems that a significantly stronger bound would require much more effort, and
even improving the bound to Ω(1/n) does not seem to yield an overall construction which is as efficient
as the one resulting from Section 4.2. Therefore we aim to present a proof which is as simple as possible.
We begin with a high-level overview of our approach. Recall from Proposition 3.4 the "optimal" F-collision-finder Ã that computes F^{-1}(F(·)). The proof basically proceeds in three steps:
1. First, we show that it is easy to invert f using Ã (Lemma 4.2).
2. Next, we show that if an F-collision-finder A has high accessible Shannon entropy, then it must behave very similarly to Ã (Lemma 4.3).
3. Finally, we show that if A behaves very similarly to Ã, then it is also easy to invert f using A (Lemma 4.4).
We may then deduce that if f is one-way, any F-collision-finder A must have accessible Shannon entropy bounded away from H(Z | F(Z)).
Step 1. Suppose we have an optimal collision finder Ã(x, i; r) that outputs a uniform random element from F^{-1}(F(x, i)). In order to invert an element y, we repeat the following process: start with an arbitrary element x^{(0)} and use Ã to find an element x^{(1)} such that f(x^{(1)}) has the same first bit as y. In the i'th step, find x^{(i)} such that the first i bits of f(x^{(i)}) equal y_{1,...,i} (until i = n).
This is done more formally in the following algorithm, for an arbitrary oracle CF which we set to Ã in the first lemma we prove. The algorithm ExtendOne does a single step. Besides the new symbol x' which we are interested in, ExtendOne also returns the number of calls it made to the oracle. This is completely uninteresting to the overall algorithm, but we use it later in the analysis when we bound the number of oracle queries made by ExtendOne.
Algorithm ExtendOne
Oracle: An F-collision finder CF.
Input: x ∈ {0,1}^n, b ∈ {0,1}, i ∈ [n].
    j := 0
    repeat
        x' := CF(x, i)
        j := j + 1
    until f(x')_i = b
    return (x', j)
Inverter Inv
Oracle: An F-collision finder CF.
Input: y ∈ {0,1}^n
    x^{(0)} ← U_n
    for i = 1 to n do:
        (x^{(i)}, j) := ExtendOne^CF(x^{(i−1)}, y_i, i)
    done
    return x^{(n)}
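A direct Python transcription of the two procedures, keeping the query counter; the one-way function f (on bit tuples of length n) and the collision-finder oracle cf are assumptions supplied by the caller, where cf(x, i) is expected to return some x' with F(x', i) = F(x, i), i.e., f(x')[:i-1] == f(x)[:i-1].

import random

def extend_one(cf, f, x, b, i):
    # Query cf until the i-th bit of f(x') equals b; also report the number of queries.
    j = 0
    while True:
        x_new = cf(x, i)
        j += 1
        if f(x_new)[i - 1] == b:          # the paper's f(x')_i, 0-indexed in Python
            return x_new, j

def inv(cf, f, n, y):
    # x^(0) <- U_n, then extend the agreement with y one bit per round.
    x = tuple(random.randint(0, 1) for _ in range(n))
    queries = 0
    for i in range(1, n + 1):
        x, j = extend_one(cf, f, x, y[i - 1], i)
        queries += j
    return x, queries

Instantiated with the optimal finder Ã, inv makes at most 2n oracle queries in expectation, which is the content of Lemma 4.2.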
We first show that with our optimal collision finder Ã, the inverter inverts with only 2n calls in expectation (even though it can happen that it runs forever). Towards proving that, we define p(b | y_{1,...,i−1}) as the probability that the i'th bit of f(x) equals b, conditioned on the event that f(x)_{1,...,i−1} = y_{1,...,i−1} (or 0 if f(x)_{1,...,i−1} = y_{1,...,i−1} is impossible).
Lemma 4.2. The expected number of calls to Ã in a random execution of Inv^Ã(y = f(x)), with x ← {0,1}^n, is at most 2n.
Proof. Fix some string y_{1,...,i−1} in the image of F. We want to study the expected number of calls to Ã(x^{(i−1)}, i) in case F(x^{(i−1)}, i) = y_{1,...,i−1}.
If we knew y_i, then this expected number of calls would be 1/p(y_i | y_{1,...,i−1}). Since y_i = 0 with probability p(0 | y_{1,...,i−1}), we get that the expected number of calls is 1 if either of the probabilities is 0, and

p(0 | y_{1,...,i−1}) · (1 / p(0 | y_{1,...,i−1})) + p(1 | y_{1,...,i−1}) · (1 / p(1 | y_{1,...,i−1})) = 2

otherwise. Using linearity of expectation we get the result.
Step 2. Given an F-collision finder A, we define ε(x, i) to be the statistical distance of the distribution of A(x, i; r) and the output distribution of Ã(x, i; r) (which equals the uniform distribution over F^{-1}(F(x, i))).
We want to show that if A has high accessible Shannon entropy, then A behaves very similarly to Ã. The next lemma formalizes this by stating that ε(x, i) is small on average (over the uniform random choice of x ∈ {0,1}^n and i ∈ [n]). The lemma follows by applying Jensen's inequality to the well-known relationship between entropy gap and statistical distance.

Lemma 4.3. Assume H(A(Z) | Z) ≥ H(Z | F(Z)) − 1/(64n²). Then E_{i←[n], x←{0,1}^n}[ε(x, i)] ≤ 1/(8n).
Proof.

‖(Z, Ã(Z)) − (Z, A(Z))‖ = E_{z←Z}[‖(z, Ã(z)) − (z, A(z))‖]
  ≤ E_{z←Z}[√(H(Ã(z)) − H(A(z)))]
  ≤ √(E_{z←Z}[H(Ã(z)) − H(A(z))])
  ≤ 1/(8n).

The first inequality uses the fact that if W is a random variable whose support is contained in a set S and U is the uniform distribution on S, then ‖U − W‖ ≤ √(H(U) − H(W)) (see [7, Lemma 11.6.1]). The second inequality follows by Jensen's inequality. The final inequality uses H(Ã(Z) | Z) = H(Z | F(Z)) (Proposition 3.4).
Step 3. We have now seen that Inv^Ã inverts f with 2n calls in expectation and that A behaves similarly to Ã. We now want to show that Inv^A also inverts f efficiently. The main technical difficulty is that even though Inv^Ã makes 2n calls to Ã in expectation and A and Ã are close in statistical distance, we cannot immediately deduce an upper bound on the number of calls Inv^A makes to A. Indeed, our analysis below exploits the fact that Inv and ExtendOne have a fairly specific structure.
We will assume without loss of generality that

Pr_R[A(x, i; R) = Ã(x, i; R)] = 1 − ε(x, i),

where Ã is an optimal collision finder as above. This follows from a standard coupling argument, since we do not require Ã to be polynomial time, and also because we can extend the number of random bits A uses (we assume it just ignores unused ones). To do this, Ã first computes the statistics of A on input (x, i), and also the result of A(x, i; r). It checks whether A(x, i; r) is one of the elements which occur too often, and outputs a different, carefully chosen one, with appropriate probability, if this is the case.
We now show that in most executions of ExtendOne it does not matter whether we use A or Ã (that is, ExtendOne makes the same number of oracle queries to A and Ã, and outputs the same value).
Lemma 4.4. For any (x, i) ∈ {0,1}^n × [n] and any y_i ∈ {0,1}, we have

Pr_R[ExtendOne^A(x, y_i, i; R) = ExtendOne^Ã(x, y_i, i; R)] ≥ 1 − 2ε(x, i)/p(y_i | y_{1,...,i−1}).   (4.1)

Note that the oracle algorithm ExtendOne is deterministic, and in the above expressions, R refers to the coin tosses used by the oracles that ExtendOne queries, namely A and Ã. We stress that the lemma says that both the value x' and the number j returned are equal with high probability.
Proof. Let J = J(R) be the second coordinate of the output of ExtendOne^A(x, y_i, i; R) (i.e., the counter), and J̃ = J̃(R) the analogous output of ExtendOne^Ã(x, y_i, i; R). We write

Pr_R[ExtendOne^A(x, y_i, i; R) = ExtendOne^Ã(x, y_i, i; R)]
  = Σ_{j≥1} Pr_R[min(J, J̃) = j] · Pr_R[ExtendOne^A(x, y_i, i; R) = ExtendOne^Ã(x, y_i, i; R) | min(J, J̃) = j]
  = E_{j←P_J}[Pr_R[ExtendOne^A(x, y_i, i; R) = ExtendOne^Ã(x, y_i, i; R) | min(J, J̃) = j]],   (4.2)

where P_J is some distribution over the integers which, as it turns out, we do not need to know.

Let now R' be the randomness used by A or Ã in round j. Then,

Pr_R[ExtendOne^A(x, y_i, i; R) = ExtendOne^Ã(x, y_i, i; R) | min(J, J̃) = j]
  = Pr_{R'}[A(x, i; R') = Ã(x, i; R') | f(A(x, i; R'))_i = y_i ∨ f(Ã(x, i; R'))_i = y_i],

because each iteration of ExtendOne uses fresh independent randomness.

Let P be the distribution over {0,1}^n produced by A(x, i; R'), and P* be the (uniform) distribution
produced by Ã(x, i; R'). For p = p(y_i | y_{1,...,i−1}) and ε = ε(x, i), it holds that

Pr_{R'}[A(x, i; R') = Ã(x, i; R') | f(A(x, i; R'))_i = y_i ∨ f(Ã(x, i; R'))_i = y_i]
  = Pr[A(x, i; R') = Ã(x, i; R') ∧ (f(A(x, i; R'))_i = y_i ∨ f(Ã(x, i; R'))_i = y_i)] / Pr[f(A(x, i; R'))_i = y_i ∨ f(Ã(x, i; R'))_i = y_i]
  = Σ_{x'∈F^{-1}(y_{1,...,i})} Pr[A(x, i; R') = Ã(x, i; R') = x' ∧ (A(x, i; R') = x' ∨ Ã(x, i; R') = x')] / Σ_{x'∈F^{-1}(y_{1,...,i})} Pr[A(x, i; R') = x' ∨ Ã(x, i; R') = x']
  = Σ_{x'∈F^{-1}(y_{1,...,i})} min(P(x'), P*(x')) / Σ_{x'∈F^{-1}(y_{1,...,i})} max(P(x'), P*(x'))
  = (p − ε)/(p + ε) = (1 − ε/p)/(1 + ε/p) ≥ 1 − 2ε/p.

For the penultimate equality, note that

ε = SD(A(x, i), Ã(x, i)) = Σ_{x'∈F^{-1}(y_{1,...,i})} (P*(x') − min(P(x'), P*(x'))).

Hence,

Σ_{x'∈F^{-1}(y_{1,...,i})} min(P(x'), P*(x')) = Σ_{x'∈F^{-1}(y_{1,...,i})} (min(P(x'), P*(x')) + P*(x') − P*(x'))
  = Σ_{x'∈F^{-1}(y_{1,...,i})} P*(x') − Σ_{x'∈F^{-1}(y_{1,...,i})} (P*(x') − min(P(x'), P*(x')))
  = p − ε.

And similarly, Σ_{x'∈F^{-1}(y_{1,...,i})} max(P(x'), P*(x')) = p + ε.

Collecting the equations and inserting into (4.2) proves the lemma.
Putting everything together. We can now finish the proof of Theorem 4.1. Consider the following random variables: let X be uniformly drawn from {0,1}^n and let Y = f(X). Run Inv^A(Y) and Inv^Ã(Y) in parallel, using the same randomness in both executions. Let X̃^{(0)}, ..., X̃^{(n)} be the random variables which have the values assigned to x^{(0)}, ..., x^{(n)} in the run of Inv^Ã. Finally, let the indicator variable Q_i be 1 iff the i'th call to ExtendOne in the above parallel run is the first call such that ExtendOne^A(X̃^{(i)}, Y_i, i; ·) ≠ ExtendOne^Ã(X̃^{(i)}, Y_i, i; ·).
We proceed to obtain an upper bound on Pr[Q_i = 1]. Observe that for all x ∈ {0,1}^n:

Pr[Q_i = 1 | X̃^{(i−1)} = x]
  = Pr[ExtendOne^A(x, Y_i, i; R) ≠ ExtendOne^Ã(x, Y_i, i; R) | X̃^{(i−1)} = x]
  = Σ_{y_i∈{0,1}} p(y_i | f(x)_{1,...,i−1}) · Pr[ExtendOne^A(x, y_i, i; R) ≠ ExtendOne^Ã(x, y_i, i; R)]
  ≤ Σ_{y_i∈{0,1}} p(y_i | f(x)_{1,...,i−1}) · 2ε(x, i)/p(y_i | f(x)_{1,...,i−1})
  = 4ε(x, i),

where the inequality above follows by Lemma 4.4. Averaging over x, we have that for all i = 1, ..., n:

Pr[Q_i = 1] ≤ 4 · E_{x←{0,1}^n}[ε(x, i)].   (4.3)

Here, we use the fact that by induction on i, the random variable X̃^{(i)}, for i ∈ {0, ..., n}, is uniformly distributed in {0,1}^n (it is a uniform preimage of a uniformly chosen output). Using Equation (4.3), we have

Pr[Σ_{i=1}^n Q_i ≥ 1] = Σ_{i=1}^n Pr[Q_i = 1]
  = n · E_{i←[n], x←{0,1}^n}[Q_i]
  ≤ 4n · E_{i←[n], x←{0,1}^n}[ε(x, i)]
  ≤ 1/2,

where the last inequality follows from Lemma 4.3. Hence, with probability at least 1/2, a run of Inv^A and Inv^Ã produce the same output and use the same number of queries to the oracles. Moreover, the probability that Inv^Ã uses more than 8n oracle queries is at most 1/4 (by applying Markov's inequality to Lemma 4.2). Hence, with probability at least 1/4, Inv^A inverts f using at most 8n oracle queries in total, which contradicts the one-wayness of f. In order to make sure that Inv^A runs in polynomial time, we just halt it after 8n calls.
4.2 A more efficient construction
The following theorem shows that a simplified variant of the first step of [26] (which is also the first step
of [20]) yields inaccessible entropy with much stronger guarantees than those obtained in Section 4.1.
The function we construct is F(x, g, i) = (g(f(x))_{1,...,i}, g), where g : {0,1}^n → {0,1}^n is a three-wise independent function. Since the composition of g and f is still a one-way function, Theorem 4.1 already implies that F^{-1} has inaccessible entropy. The benefits of the additional hashing step are that
1. we get more inaccessible entropy (Θ̃(1/n) bits rather than Θ̃(1/n²) bits), and
2. we get a bound on accessible average max-entropy rather than accessible Shannon entropy.
These allow for a more efficient and simpler transformation of F into a UOWHF.
Theorem 4.5 (Inaccessible average max-entropy from one-way functions). Let f : {0, 1}n → {0, 1}n be a
one-way function and let G = {g : {0, 1}n → {0, 1}n } be a family of constructible, three-wise independent
hash functions. Define F over Dom(F) := {0, 1}n × G × [n] by
F(x, g, i) = (g( f (x))1,...,i , g, i).
Then, for every constant d > 0, F −1 has accessible average max-entropy at most H(Z | F(Z))−(d log n)/n,
where Z is uniformly distributed over Dom(F).
Proof. Let c be a sufficiently large constant (whose value is to be determined later as a function of the constant d in the theorem statement). The sets {L(x, g, i)}_{x∈{0,1}^n, i∈[n], g∈G} realizing the inaccessible entropy of F^{-1} are defined by

L(x, g, i) = {(x', g, i) : f(x') ∈ L̃(f(x), i) ∧ g(f(x'))_{1,...,i} = g(f(x))_{1,...,i}},   (4.4)

where for y ∈ {0,1}^n and i ∈ [n], we let

L̃(y, i) = {y} ∪ {y' ∈ {0,1}^n : H_{f(X)}(y') ≥ i + c · log n}   (4.5)
        = {y} ∪ {y' ∈ {0,1}^n : |f^{-1}(y')| ≤ 2^{n−i}/n^c}.

Namely, L̃(y, i) consists, in addition to y itself, of "i-light" images with respect to f.⁷ As a warm-up, it is helpful to write down L̃(y, i) and L(x, g, i) for the case where f is a one-way permutation.⁸
The proof of the theorem immediately follows by the following two claims.
Claim 4.6. For every PPT F-collision-finder A and every constant c > 0, it holds that
/ L(Z)] ≤ neg(n),
Pr[A(Z; R) ∈
where Z is uniformly distributed over Dom(F) and R is uniformly distributed over the random coins of A.
Claim 4.7. For any constant c it holds that
−1
c log n
E [log |L(Z)|] ≤ E log F (F(Z)) − Ω ,
n
where Z is uniformly distributed in Dom(F).
7 Recall that the sample entropy is defined as H
f (X) (y) = log(1/ Pr[ f (X) = y]) = n − log f −1 (y) , so the “heavy” images,
where f −1 (y) is large, have low sample entropy.
8 If f is a permutation, then L(y,
e i) is given by:
(
{0, 1}n if i ≤ n − c log n
L(y,
e i) =
{y} otherwise.
Then, for all x ∈ {0, 1}n , we have E[|L(x, G, i)|] = 2n−i for all i ≤ n−c log n and |L(x, g, i)| = 1 for all g ∈ G and all i > n−c log n.
This means that the entropy gap between F −1 (F(Z)) and L(X, G, I) is roughly 1n ∑i>n−c log n n − i = Ω(c2 log2 n/n).
T HEORY OF C OMPUTING, Volume 16 (8), 2020, pp. 1–55 21
I FTACH H AITNER , T HOMAS H OLENSTEIN , O MER R EINGOLD , S ALIL VADHAN AND H OETECK W EE
4.2.1 Accessible inputs of F — Proving Claim 4.6
Proof of Claim 4.6. Recall that Z = (X, G, I) is uniformly distributed over Dom(F), and that R is uni-
formly distributed over the random coins of A. Let A1 denote the first component of A’s output. It suffices
to show that
/ f −1 (L(
Pr[A1 (X, G, I; R) ∈ e f (X), I))] ≤ neg(n) (4.6)
since the other two output components of A are required to equal (G, I), due to the fact that F(X, G, I)
determines (G, I).
We construct an inverter Inv such that for all F-collision-finders A and for c as in Equation (4.5) we
have
1
Pr[InvA (Y ) ∈ f −1 (Y )] ≥ / f −1 (L(
· Pr[A1 (X, G, I; R) ∈ e f (X), I))] (4.7)
nc
where Y = f (X), and the proof of Claim 4.6 follows readily from the one-wayness of f .
Inverter InvA
Oracle: An F-collision finder A.
Input: y ∈ {0, 1}n
x ← {0, 1}n
i ← [n]
g0 ← Gy,x,i := {g ∈ G : g(y)1...i = g( f (x))1...i }
return A1 (x, g0 , i; r)
Observe that Inv can be implemented efficiently by sampling g0 as follows: pick first z, z∗ ∈ {0, 1}n such
that z1...i = z∗1...i and use the constructibility of G to pick g with g( f (x)) = z and g(y) = z∗ .
We analyze the success probability of InvA . Using the short hand notation Prg0 [· · · ] for Prg0 ←Gy,x,i [· · · ]
we observe that
h i
Pr[InvA (Y ) ∈ f −1 (Y )] = E ∑y∈{0,1}n Pr[ f (X) = y] · Pr
0
[A1 (x, g0
, i; r) ∈ f −1
(y)] (4.8)
x←{0,1}n ,i←[n] g ,r
h 2−i 0 −1
i
≥ E ∑y∈/ L(
e f (x),i) c · Pr [A1 (x, g , i; r) ∈ f (y)]
x,i n g0 ,r
where the inequality holds since Pr[ f (X) = y] ≥ 2−i /nc for any y ∈ / L(
e f (x), i).
Next, observe that for any tuple (y, x, i) such that y 6= f (x), it holds that (where we distinguish Prg0 [· · · ]
T HEORY OF C OMPUTING, Volume 16 (8), 2020, pp. 1–55 22
I NACCESSIBLE E NTROPY II: IE F UNCTIONS AND U NIVERSAL O NE -WAY H ASHING
as above from Prg [· · · ] = Prg←G [· · · ])
A1 (x, g0 , i; r) ∈ f −1 (y) = Pr A1 (x, g, i; r) ∈ f −1 (y) | g( f (x))1···i = g(y)1···i
Pr
0
(4.9)
g ,r g←G,r
Prg,r A1 (x, g, i; r) ∈ f −1 (y) ∧ g( f (x))1···i = g(y)1···i
=
Prg,r g( f (x))1···i = g(y)1···i
Prg,r A1 (x, g, i; r) ∈ f −1 (y)
=
Prg,r g( f (x))1···i = g(y)1···i
= 2i · Pr A1 (x, g, i; r) ∈ f −1 (y) .
g,r
The second equality follows by Bayes’ rule and the third uses the fact that A is a F-collision finder. The
last equality follows since G is two-wise independent (recall we assumed that G is three-wise independent)
and f (x) 6= y.
Combining the two preceding observations, and the fact that f (x) ∈ L(
e f (x), i), we have that
h 2−i i i
Pr[InvA (Y ) ∈ f −1 (Y )] ≥ −1
E ∑y∈/L( e f (x),i) c · 2 · Pr A 1 (x, g, i; r) ∈ f (y)
x←{0,1}n ,i←[n] n g←G,r
1 h −1
i
≥ c · E ∑y∈/ L(e f (x),i) · Pr A1 (x, g, i; r) ∈ f (y)
n x,i g,r
1
/ f −1 (L(
= c · Pr A1 (x, g, i; r) ∈ e f (x), i)) ,
n x,g,i,r
and the proof of the claim follows.
4.2.2 Upper bounding the size of L — Proving Claim 4.7
Recall that Z = (X, G, I) is uniformly distributed over Dom(F). In the following we relate the size of
L(Z) to that of F −1 (F(Z)).
We make use of the following property of three-wise independent hash-functions.
Claim 4.8. Let i, x and x∗ , be such that f (x) 6= f (x∗ ) and i ≤ H f (X) ( f (x∗ )) ≤ H f (X) ( f (x)). Then,
0 2−n
Pr z = (x∗ , g, i) ≥ .
g←G 8
z0 ←F −1 (F(x,g,i))
Note that in the above experiment it is always the case that (g0 , i0 ) = (g, i), where z0 = (x0 , g0 , i0 ).
Proof. Note that with probability 2−i over g ← G, it holds that g( f (x∗ ))1···i = g( f (x))1···i . Henceforth,
we condition on this event that we denote by E, and let w = g( f (x∗ ))1···i = g( f (x))1···i . Observe that for a
fixed g satisfying E, it holds that
0 1
z = (x∗ , g, i) | E ≥
Pr (4.10)
z0 ←F −1 (F(x,g,i)) |F −1 (F(x, g, i))|
T HEORY OF C OMPUTING, Volume 16 (8), 2020, pp. 1–55 23
I FTACH H AITNER , T HOMAS H OLENSTEIN , O MER R EINGOLD , S ALIL VADHAN AND H OETECK W EE
In order to obtain a lower bound on F −1 (F(x, g, i)) , we first consider x0 such that f (x0 ) ∈
/ { f (x), f (x∗ )}.
By the three-wise independence of G,
Pr g( f (x0 )) = w | E = 2−i
(4.11)
g←G
This implies that the expected number of x0 such that g( f (x0 )) = w and f (x0 ) ∈
/ { f (x), f (x∗ )} is at most
2 . By Markov’s inequality, we have that with probability at least 1/2 over g ← G (conditioned on E),
n−i
|F −1 (F(x, g, i))| ≤ 2 · 2n−i + | f −1 ( f (x))| + | f −1 ( f (x∗ ))| ≤ 4 · 2n−i , (4.12)
where the second inequality uses the fact that i ≤ H f (X) ( f (x∗ )) ≤ H f (X) ( f (x)). Putting everything
together, we have that the probability we obtain x∗ is at least 2−i · 1/2 · (4 · 2n−i )−1 = 2−n /8.
We now use Claim 4.8 for proving Claim 4.7.
Proof of Claim 4.7. Let Z 0 = (X 0 , G, I) ← F −1 (F(Z = (X, G, I))) (note that indeed the second and third
coordinates of Z and Z 0 are guaranteed to match). We claim that for proving Claim 4.7 it suffices to show
that
0 −1 c log(n)
Pr[X ∈
/ f (L( f (X), I))] ∈ Ω
e . (4.13)
n
Indeed, let L(z) := F −1 (F(z)) \ L(z), and compute
" #
L(Z)
E log F −1 (F(Z)) − E [log |L(Z)|] = E log 1 +
(4.14)
|L(Z)|
" #
L(Z)
≥ E log 1 + −1
|F (F(Z))|
" #
1 L(Z)
≥ E
2 |F −1 (F(Z))|
1
= Pr[X 0 ∈
/ f −1 (L(
e f (X), I))]
2
c log(n)
∈Ω .
n
The first equality holds since by definition L(Z) ⊆ F −1 (F(z)), the second inequality holds since log(1 +
α) ≥ α2 for α ∈ [0, 1] and the containment by Equation (4.13).
T HEORY OF C OMPUTING, Volume 16 (8), 2020, pp. 1–55 24
I NACCESSIBLE E NTROPY II: IE F UNCTIONS AND U NIVERSAL O NE -WAY H ASHING
We prove Equation (4.13) in two steps. First, observe that for all x:
f (x0 ) ∈
/ L(
Pr e f (x), i) (4.15)
(g,i)←G×[n],(x0 ,g,i)←F −1 (F(x,g,i))
f (x0 ) 6= f (x) ∧ (i ≤ H f (X) ( f (x0 )) < i + c log n)
≥ Pr
g,i,(x0 ,g,i)←F −1 (F(x,g,i))
1
Pr (x∗ , g, i) ← F −1 (F(x, g, i))
= · ∑ ∗ ∑
n x∗ : f (x)6 = f (x ) i≤H ∗ g
f (X) ( f (x ))<i+c log n
1 2−n
≥ · ∑ ∑ (by Claim 4.8)
n x∗ : f (x)6= f (x∗ ) i≤H f (X) ( f (x∗ ))<i+c log n
8
∧(H f (X) ( f (x∗ ))≤H f (X) ( f (x)))
c log n 2−n
≥ · ∑
n x∗ : f (x)6= f (x∗ ) 8
∧(H f (X) ( f (x∗ ))≤H f (X) ( f (x)))
c log n
· Pr∗ f (x) 6= f (x∗ ) ∧ (H f (X) ( f (x∗ )) ≤ H f (X) ( f (x))) .
=
8n x
It follows that
Pr[X 0 ∈
/ f −1 (L(
0
e f (x), i) : (x0 , g, i) ← F −1 (F(x, g, i))
/ L(
e f (X), I))] = Pr x ∈
(x,g,i)←{0,1}n ×G×[n]
c log n
f (x) 6= f (x∗ ) ∧ (H f (X) ( f (x∗ )) ≤ H f (X) ( f (x)))
≥ · Pr
8n (x,x∗ )←{0,1}n ×{0,1}n
c log n 1
≥ · · 1 − Pr∗ [ f (x) 6= f (x∗ )]
8n 2 x,x
c log n 1 1
≥ · · .
8n 2 2
The last inequality holds since the one-wayness of f yields that Prx,x∗ [ f (x) = f (x∗ )] is negligible (other-
wise inverting f is trivial). This concludes the the proof of Equation (4.13), and hence of the claim.
5 UOWHFs from inaccessible entropy
In this section we show how to construct a UOWHF from any efficiently computable function with a
noticeable gap between real Shannon entropy and either accessible average max-entropy or accessible
Shannon entropy. Recall that the more efficient construction from Section 4.2 satisfies the former, and
the more direct construction from Section 4.1 satisfies the latter. Combined with these constructions, we
obtain two new constructions of UOWHFs from any one-way function.
In both cases, we first transform the entropy gap into a noticeable gap between real Shannon entropy
and accessible max-entropy. We begin with the construction that starts from a gap between real Shannon
entropy and accessible average max-entropy because the transformation involves fewer steps (and is also
more efficient).
T HEORY OF C OMPUTING, Volume 16 (8), 2020, pp. 1–55 25
I FTACH H AITNER , T HOMAS H OLENSTEIN , O MER R EINGOLD , S ALIL VADHAN AND H OETECK W EE
5.1 The more efficient UOWHF
Theorem 5.1. Suppose there exists a polynomial-time computable function F : {0, 1}ne → {0, 1}m such
that F −1 has a noticeable gap ∆ between real Shannon entropy and accessible average max-entropy.
n4 s/∆3 ) and key
Then, there exists a family of universal one-way hash functions with output length O(e
length O(e 4 3
n s/∆ · log n) for any s = ω(log n), where n is the security parameter.9
We first show how to combine this with Theorem 4.5 to get a universal one-way hash function.
Theorem 5.2. Suppose there exists a one-way function f : {0, 1}n → {0, 1}n . Then, there exists a family
of universal one-way hash functions with key and output length O(n7 ).
Proof. Fixing s := log2 (n), we use Theorem 4.5 to get a function F : {0, 1}ne → {0, 1}m with ne = O(n) and
gap ∆ = log n/n between real Shannon entropy and accessible average max-entropy. By Theorem 5.1 we
get a family of universal one-way hash functions with output length O(e n4 s/∆3 ) = O(n7 log2 (n)/ log3 (n))
and key length O(e 4 3 7
n s/∆ · log(n)) = O(n ).
Overview. The construction proceeds via a series of transformations as outlined in Section 1.2: gap
amplification (via repetition), entropy reduction (by hashing inputs) and reducing output length (by
hashing outputs). In each of these transformations, we use n0 to denote the input length of the function F
we start with, and n to denote the security parameter.
5.1.1 Gap amplification
Here, we show that a direct product construction increases the gap between real entropy and accessible
entropy. Another useful effect of direct product (for certain settings of parameters) is turning real Shannon
entropy into real min-entropy, and turning accessible average max-entropy into accessible max-entropy.
Lemma 5.3 (Gap amplification). Let n be a security parameter and F : {0, 1}ne → {0, 1}m be a function.
For t ∈ poly(n), let F t be the t-fold direct product of F. Then, F t satisfies the following properties:
√
i. If F −1 has real Shannon entropy at least k, then (F t )−1 has real min-entropy at least t · k − ne · st
for any s = ω(log n) and t > s.
ii. If F −1 has accessible
√ average max-entropy at most k, then (F t )−1 has accessible max-entropy at
most t · k + ne · st for any s = ω(log n).
Proof. In the following X and X (t) = (X1 , . . . , Xt ) are uniformly distributed over {0, 1}ne and ({0, 1}ne)t ,
respectively.
i. Follows readily from Lemma 2.3 (with ε = 2−s ).
ii. Given any PPT F t -collision-finder A0 , we construct a PPT F-collision-finder A that:
On input x, picks a random i in [t] along with random x1 , . . . , xi−1 , xi+1 , . . . , xt , computes
A0 (x1 , . . . , xt ) → (x10 , . . . , xt0 ), and outputs xi0 .
9 Note that ∆ is not required to be efficiently computable.
T HEORY OF C OMPUTING, Volume 16 (8), 2020, pp. 1–55 26
I NACCESSIBLE E NTROPY II: IE F UNCTIONS AND U NIVERSAL O NE -WAY H ASHING
By the bound on the accessible average −1
max-entropy of F , we know that there exists a family of
sets {L(x)} such that E log |L(X)| ≤ k, x ∈ L(x), and Pr[A(X) ∈
/ L(X)] ≤ neg(n). Consider the
family of sets L (x ) : x ∈ ({0, 1} ) given by:
0 (t) (t) n
e t
(t) (t) (t)
L0 (x(t) ) = L(x1 ) × L(x2 ) × · · · × L(xt ).
By linearity of expectations, we have E log |L0 (X1 , . . . , Xt )| ≤ t · k. Moreover, by the Chernoff-
Hoeffding bound and using the fact that log |L(X)| assumes values in [0, ne], we have
√
Pr log L0 (X (t) ) ≥ t · k + ne st
(5.1)
(t) (t) √
= Pr log L(X1 ) + · · · + log L(Xt ) ≥ t · k + ne st ≤ e−2s .
√
We claim that this implies that A0 has accessible max-entropy at most t · k + ne st. Suppose
otherwise, then there exists a non-negligible function ε such that
Pr[A0 (F t (X (t) )) ∈
/ L0 (X (t) )] ≥ ε − e−2s ≥ ε/2
Therefore,
/ L(X)] = Pr[A0 (F t (X (t) )) ∈
Pr[A(F(X)) ∈ / L0 (X (t) )]/t ≥ ε/2t
which contradicts our assumption on A.
5.1.2 Entropy reduction
Next we describe a construction that given F and any parameter `, reduces the accessible max-entropy of
F −1 by roughly ` bits, while approximately preserving the gap between real min-entropy and accessible
max-entropy.
Lemma 5.4 (Reducing entropy). Let n be a security parameter and F : {0, 1}ne → {0, 1}m be a function.
Fix a family of pairwise independent hash functions G = g : {0, 1}ne → {0, 1}` . Then, F 0 : {0, 1}ne ×
G → {0, 1}m × G × {0, 1}` as given by F 0 (x, g) = (F(x), g, g(x)) satisfies the following properties:
i. Assuming F −1 has real min-entropy at least k, then (F 0 )−1 has real min-entropy at least k − ` − s
for any s = ω(log n).
ii. Assuming F −1 has accessible max-entropy at most k, then (F 0 )−1 has accessible max-entropy at
most max {k − ` + s, 0} for any s = ω(log n).
Proof. In the following X and G are uniformly distributed over {0, 1}ne and G, respectively.
i. Fix g ∈ G and let Sg = z ∈ {0, 1}` : Pr[g(X) = z] ≤ 2−`−s . Observe the following.
(a) Pr[g(X) ∈ Sg ] ≤ 2−s (by a union bound over z ∈ Sg );
T HEORY OF C OMPUTING, Volume 16 (8), 2020, pp. 1–55 27
I FTACH H AITNER , T HOMAS H OLENSTEIN , O MER R EINGOLD , S ALIL VADHAN AND H OETECK W EE
/ Sg and any x ∈ {0, 1}n such that HX|F(X) (x | F(x)) ≥ k. Then,
(b) Fix any z ∈
Pr[X = x | F 0 (X, g) = (F(x), g, z)] = Pr[X = x | F(X) = F(x) ∧ g(X) = z]
Pr[X = x | F(X) = F(x)]
≤
Pr[g(X) = z]
2 −k
≤ −`−s = 2−(k−`−s) .
2
where the second inequality follows from our assumptions on z and x.
Combining the above two observations and the bound on the real min-entropy of F, it follows that
for all g ∈ G, with probability 1 − 2−s − neg(n) over x ← X, we have
Pr[X = x | F 0 (X, g) = F 0 (x, g)] ≤ 2−(k−`−s) .
The bound on the real min-entropy of F 0 follows readily.
ii. Given a PPT F 0 -collision-finder A0 , we construct a PPT F-collision-finder A as follows:
On input x, picks a pair (g, r) uniformly at random and output A0 (x, g; r).
By
the bound on the accessible max-entropy of F −1 , we know that there exists a family of sets
L(x) ⊆ {0, 1} : x ∈ {0, 1}ne such that |L(x)| ≤ 2k , x ∈ L(x), and
n
e
Pr A(X, G; R) ∈ L(X) ≥ 1 − neg(n),
(5.2)
where R is uniformly distributed over the random coins of A.
Let L0 (x, g) := {(x0 , g) : x0 ∈ L(x) ∧ g(x0 ) = g(x)}. Equation (5.2) yields that
Pr A0 (X, G; R) ∈ L0 (X, G) ≥ 1 − neg(n)
(5.3)
We next bound the size of the set L0 (x, g). Fix any x ∈ {0, 1}n . For any x0 6= x, pairwise in-
dependence of G tells us that Pr[G(x0 ) = G(x)] = 2−` . It follows from linearity of expectation
that
E L0 (x, G) \ {x} ≤ |L(x)| · 2−` ≤ 2k−`
Then, by Markov’s inequality, we have
Pr L0 (x, G) ≤ 2k−`+s−1 + 1} ≥ 1 − 2−(s−1) ,
(5.4)
Combining the last two inequalities, we obtain
n o
Pr A0 (X, G; R) ∈ L0 (X, G) ∧ L0 (X, G) ≤ max 2k−`+s , 1 ≥ 1 − neg(n) − 2−(s−1)
(5.5)
The above yields an upper bound of max {k − ` + s, 0} on the accessible max-entropy of (F 0 )−1 .
T HEORY OF C OMPUTING, Volume 16 (8), 2020, pp. 1–55 28
I NACCESSIBLE E NTROPY II: IE F UNCTIONS AND U NIVERSAL O NE -WAY H ASHING
5.1.3 Reducing output length
The next transformation gives us a way to derive a function that is both length-decreasing and collision-
resistant on random inputs.
Lemma 5.5 (Reducing output length). Let n be a security parameter and F : {0, 1}ne → {0, 1}m be a
function. Fix a family of pairwise independent hash functions G = g : {0, 1}m → {0, 1}ne−log n and
let F 0 : G × {0, 1}ne → {0, 1}ne−log n × G be defined by F 0 (x, g) = (g, g(F(x))). The following holds: if
F −1 has real min-entropy at least ω(log n) and F is collision-resistant on random inputs, then F 0 is
collision-resistant on random inputs.
Proof. The bound on real min-entropy implies that there exists a subset S ⊆ {0, 1}ne of density at most
/ S it holds that F −1 (F(x)) = nω(1) . Hence,
neg(n), such that for all x ∈
|Im(F)| ≤ |F(S)| + F(S̄) ≤ |S| + S̄ /nω(1) ≤ neg(n) · 2n . (5.6)
By the two-wise independent of G,
|Im(F)|
Pr ∃y0 ∈ Im(F) : y0 6= F(X) ∧ G(y0 ) = G(F(X)) ≤ ne−log n ≤ neg(n).
(5.7)
2
Namely, g(F(x)) uniquely determines F(x) with high probability. In particular, a collision for g ◦ F is
also a collision for F. Given any PPT F 0 -collision-finder A0 , we construct a PPT F-collision-finder A as
follows:
On input x, pick g and r at random and compute x0 = A0 (x, g; r). If F(x0 ) = F(x), output x0 ,
else output x.
Equation (5.7) implies that Pr[A0 (X, G; R) 6= (A(X; G, R), G)] ≤ neg(n). Therefore, Pr[A0 (X, G; R) =
(X, G)] ≥ 1 − neg(n). Namely, F 0 is also collision-resistant on random inputs.
5.1.4 Additional transformations
We present two more standard transformations that are needed to complete the construction.
Lemma 5.6 (From random inputs to targets, folklore). Let n be a security parameter and F : {0, 1}ne →
{0, 1}m be a length-decreasing function. Suppose F is collision-resistant on random inputs. Then,
n o
Fy0 : {0, 1}ne → {0, 1}m ne
y∈{0,1}
as defined by Fy0 (x) = F(y + x) is a family of target collision-resistant hash functions.
Proof. Given a PPT adversary A0 that breaks target collision-resistance of Fy0 , we can construct a PPT
adversary A that breaks F as follows:
On input x, run A0 (1n ) to compute (x0 , state), and then run A0 (state, x ⊕ x0 ) to compute x1 .
Output x ⊕ x0 ⊕ x1 .
T HEORY OF C OMPUTING, Volume 16 (8), 2020, pp. 1–55 29
I FTACH H AITNER , T HOMAS H OLENSTEIN , O MER R EINGOLD , S ALIL VADHAN AND H OETECK W EE
0
Note that (x0 , x1 ) is a collision for Fx⊕x iff (x, x ⊕ x0 ⊕ x1 ) is a collision for F. It then follows quite readily
0
that A breaks F with the same probability that A0 breaks Fy0 .
The following result of [29] (improving on [23, 3]) shows that we can construct target collision-
resistant hash functions for arbitrarily long inputs starting from one for a fixed input length.
Lemma5.7 (Increasing the input length [29]). Let n be a security parameter, t = poly(n) be a parameter
and let Fy : {0, 1}ne+log n → {0, 1}ne be a family of target collision-resistant hash functions. Then, there
exists a family of target collision-resistant hash functions
n o
Fy00 : {0, 1}ne+t log n → {0, 1}ne
where |y0 | = O(|y| logt).
5.1.5 Putting everything together
Using these transformations, we can now prove Theorem 5.1.
Proof of Theorem 5.1. Recall that we have given F : {0, 1}n0 → {0, 1}m0 , s ∈ ω(log n), the gap ∆ and
that n is the security parameter.
STEP 1 (gap amplification): For a parameter t, we define F1 as F1 (x1 , . . . , xt ) = (F(x1 ), . . . , F(xt )), the
t-fold direct product of F. We choose the parameter t ∈ O(n20 s/∆2 ) such that
√ √
t · kREAL − n0 · st ≥ t · (kREAL − ∆/2) + n0 · st + 3s.
Lemma 5.3 yields that this repetition increases both the real and accessible entropies of F1 by a
factor of t (comparing to F). In addition, this repetition converts real Shannon entropy to real
min-entropy and accessible average max-entropy to accessible max-entropy (up to additive terms
that are sub-linear in t). More precisely, we have the following properties:
• F1 : {0, 1}n1 → {0, 1}m1 , where n1 (n) = t · n0 and m1 (n) = t · m0 .
√ √
• F1−1 has real min-entropy at least t · kREAL − n0 · st ≥ t · (kREAL − ∆/2) + n0 · st + 3s.
√
• F1−1 has accessible max-entropy at most t · (kREAL − ∆) + n0 · st.
In steps 2 to 4, the construction uses non-uniform advice k, which corresponds to an approximation
to kREAL . In step 5, we will remove this non-uniform advice via “exhaustive search”. Concretely,
for steps 2 to 4, we are given k satisfying
k ∈ [kREAL , kREAL + ∆/2] (5.8)
This means that
√
• F1−1 has real min-entropy at least t · (k − ∆) + n0 · st + 3s.
√
• F1−1 has accessible max-entropy at most t · (k − ∆) + n0 · st.
T HEORY OF C OMPUTING, Volume 16 (8), 2020, pp. 1–55 30
I NACCESSIBLE E NTROPY II: IE F UNCTIONS AND U NIVERSAL O NE -WAY H ASHING
This yields a gap of 3s between real min-entropy and accessible max-entropy.
(k) (k)
STEP 2 (entropy reduction): We next apply entropy reduction to F1 to obtain F2 . That is, F2 (x, g) =
(F1 (x), g, g(x)), where g : {0, 1}n1 → {0, 1} ` is selected from a family of pairwise independent hash
√
functions with ` = t · (k − ∆) + n0 · st + s = O(tn0 ). Lemma 5.4 yields that this additional hashing
reduces the real min-entropy and accessible max-entropy by ` (up to an additive term of s). More
exactly, we have the following properties:
(k)
• F2 : {0, 1}n2 → {0, 1}m2 where n2 (n, k) = O(tn0 ) and m2 (n, k) = O(tn0 ). Note that in par-
ticular n2 and m2 also depend on k (unlike n1 and m1 ).
(k)
• If (5.8) holds, then (F2 )−1 has real min-entropy at least s.
(k) (k)
• If (5.8) holds, then (F2 )−1 has accessible max-entropy at most 0. Hence, F2 is collision-
resistant on random inputs (by Lemma 3.7).
(k)
STEP 3 (reducing the output length): We next reduce the output length of F2 by hashing the output to
(k) (k)
n2 − log n bits. That is, F3 (x, g) = (g, g(F2 (x))) where g : {0, 1}m2 → {0, 1}n2 −log n is selected
from a family of pairwise-independent hash functions.
(k)
• F3 : {0, 1}n3 → {0, 1}m3 where n3 (n, k) = O(tn0 ) and m3 (n, k) = n3 − log n.
(k)
• By Lemma 5.5, F3 is collision-resistant on random inputs, assuming that (5.8) holds.
n o
(k) (k)
STEP 4 (adding random shifts) We then transform F3 into a family Gy of target collision-resistant
(k) (k)
hash functions via a random shift, following Lemma 5.6. That is, Gy (x) = F3 (y + x). We then
have that
(k) (k)
• Gy (x) : {0, 1}n3 → {0, 1}m3 and Gy uses a key y of length n3 (n, k).
n o
(k)
• If (5.8) holds, then Gy is target collision-resistant.
STEP 5 (removing non-uniformity): To remove the non-uniform advice k, we “try all possibilities”
from 0 to n0 in steps of size ∆/2, similar to the approach used in [26] (see also [20, Section 3.6])
n o n o
(k) (k)
i. First, we construct κ = n0 · 2/∆ families of functions Gy , where we instantiate Gy for
∆
all k ∈ 2 , 2 · ∆2 , 3 · ∆2 , . . . , n0 . These κ families of functions satisfy the following properties:
(k) (k)
• Each of Gy is length-decreasing; in particular, Gy has input length n3 (n, k) and output
(n )
length n3 (n, k) − log n. Note that Gy 0 has the longest input length, i. e., n3 (n, i∆/2) ≤
n3 (n, n0 ) for all i because `(n, k) increases as a function of k. We may then assume that
all κ functions G1y , . . . , Gκy have the same input length n3 (n, n0 ) and the same output
length n3 (n, n0 ) − log n by padding “extra part” of the input to the output.
n o
(k)
• At least one of the Gy is target collision-resistant; this is because kREAL ∈ [0, n0 ], and
so (5.8) holds for some k which we picked.
T HEORY OF C OMPUTING, Volume 16 (8), 2020, pp. 1–55 31
I FTACH H AITNER , T HOMAS H OLENSTEIN , O MER R EINGOLD , S ALIL VADHAN AND H OETECK W EE
n o n o
(k) (k)
ii. Next, for each k, we construct a family of functions G̃ỹ from Gy with input length
κ · n3 (n, n0 ), key length O(n3 (n, n0 ) · log n) and output length n3 (n,n
n0 ) − o log n, by following
(k)
the construction given by Lemma 5.7. Again, at least one of the G̃ỹ for k as above is
target collision-resistant.
(k)
iii. Finally, we define a family of functions {Gỹ1 ,...,ỹκ } to be the concatenation of all G̃ỹ on the
(∆/2) (n )
same input. That is, Gỹ1 ,...,ỹκ (x) = G̃ỹ1 (x) ◦ · · · ◦ G̃ỹκ0 (x).
• Note that G has input length κ · n3 (n, n0 ) and output length κ · (n3 (n, n0 ) − log n), so G
is length-decreasing.
n o n o
(∆/2) (n )
• Moreover, since at least one of G̃ỹ1 (x) , . . . , G̃ỹκ0 is target collision-resistant,
{Gỹ1 ,...,ỹκ } must also be target collision-resistant. This is because a collision for Gỹ1 ,...,ỹκ
(∆/2) (n )
is a collision for each of G̃ỹ1 , . . . , G̃ỹκ0 .
The family {Gỹ1 ,...,ỹκ } is the universal one-way hash function we wanted to construct, and so this finishes
the proof of Theorem 5.1.
5.2 UOWHF via a direct construction
Theorem 5.8. Suppose there exists a polynomial-time computable function F : {0, 1}ne → {0, 1}m such
that F −1 has a noticeable gap ∆ between real Shannon entropy and accessible Shannon entropy. Then,
n8 s2 /∆7 ) and key length
there exists a family of universal one-way hash functions with output length O(e
O(en8 s2 /∆7 · log n) for any s = ω(log n).
As before, we can use Theorem 5.8 together with results from the previous sections to get a universal
one-way hash function.
Theorem 5.9. Suppose there exists a one-way function f : {0, 1}n → {0, 1}n . Then, there exists a family
e 22 ).
of universal one-way hash functions with key and output length O(n
Proof. We set s := log2 (n), and use Theorem 4.1 to get a F with ne = O(n), and ∆ := O( n12 ). Using F in
e 22 ).
Theorem 5.8 this gives key and output length O(n
In order to prove Theorem 5.8, we show how to transform a noticeable gap between real Shannon
entropy and accessible Shannon entropy to one between real Shannon entropy and accessible max-entropy,
and then follow the construction from the previous section. This step is fairly involved as we are unable
to show that parallel repetition directly transforms an upper bound on accessible Shannon entropy into
one for accessible max-entropy. We proceed by first establishing some additional properties achieved by
gap amplification and entropy reduction.
Lemma 5.10 (Gap amplification, continued). Let n be a security parameter and F : {0, 1}ne → {0, 1}m be
a function. For t ∈ poly(n), let F t be the t-fold direct product of F. Then, the following holds:
√
i. If F −1 has real Shannon entropy at most k, then (F t )−1 has real max-entropy at most t · k + ne · st
for any s = ω(log n) and t > s.
T HEORY OF C OMPUTING, Volume 16 (8), 2020, pp. 1–55 32
I NACCESSIBLE E NTROPY II: IE F UNCTIONS AND U NIVERSAL O NE -WAY H ASHING
ii. If F −1 has real min-entropy at least k, then (F t )−1 has real min-entropy at least t · k.
iii. If F −1 has real max-entropy at most k, then (F t )−1 has real max-entropy at most t · k.
iv. If F −1 has accessible Shannon entropy at most k, then (F t )−1 has accessible Shannon entropy at
most t · k.
v. If F −1 has accessible max-entropy at most k, then (F t )−1 has accessible max-entropy at most t · k.
vi. If F is q-collision-resistant on random inputs and F −1 has real max-entropy at most k, then (F t )−1
has accessible max-entropy at most (1 − q/8) · tk + t, provided that t = ω((1/q) · log n).
Proof. Again, X and X (t) = (X1 , . . . , Xt ) are uniformly distributed over {0, 1}ne and ({0, 1}ne)t , respectively.
i. Follows readily from Lemma 2.3.
ii. This follows from a union bound and that fact that for all x1 , . . . , xt :
t
HX (t) (x1 , . . . , xt | F t (x1 , . . . , xt )) = ∑ HX|F(X) (xi | F(xi ))
i=1
iii. Same as previous part.
iv. Given any PPT F t -collision-finder A0 , we construct the following PPT F-collision-finder A:
On input x, pick a random i in [t] along with random x1 , . . . , xi−1 , xi+1 , . . . , xt , compute
A0 (x1 , . . . , xt ) → (x10 , . . . , xt0 ), and output xi0 .
Define the random variables (X10 , . . . , Xt0 ) = A0 (X1 , . . . , Xt ). Then,
H(X10 , . . . , Xt0 | X1 , . . . , Xt )
≤ H(X10 | X1 ) + · · · + H(Xt0 | Xt ) subadditivity of conditional Shannon entropy
= t · H(XI0 | XI ) where I has the uniform distribution over [t]
= t · H(A(X) | X) by definition of A
≤ t ·k by the bound on accessible Shannon entropy of F −1
v. Analogous to Lemma 5.3 part ii, but simpler, since we do not have to use the Chernoff-Hoeffding
bound.
vi. Suppose on the contrary that there exists a PPT F t -collision-finder A0 that violates the guarantee on
accessible max-entropy. For x(t) ∈ ({0, 1}ne)t ,
let
n n o o
(t) (t) (t) (t)
B(x(t) ) := x0 ∈ ({0, 1}ne)t : F t (x(t) ) = F t (x0 ) ∧ i ∈ [t] : x0 i = xi ≥ qt/8 .
By the bound on real max-entropy, we have that
h i
(t)
Pr ∃i ∈ [t] : F −1 (F(Xi )) > 2k ≤ t · neg(n) = neg(n).
T HEORY OF C OMPUTING, Volume 16 (8), 2020, pp. 1–55 33
I FTACH H AITNER , T HOMAS H OLENSTEIN , O MER R EINGOLD , S ALIL VADHAN AND H OETECK W EE
Hence,
h
(t) t i
Pr B(X ) > 2(1−q/8)tk ≤ neg(n) (5.9)
qt/8
Since A0 achieves accessible max-entropy greater than (1 − q/8)tk + t, there must exists a non-
negligible function ε such that Pr[A0 (X (t) ; R0 ) ∈
/ B(X (t) )] ≥ ε − t · neg(n) ≥ ε/2, where R0 is uni-
formly distributed over the random coins of A . Namely, A0 finds collisions on at least a 1 − q/8
0
fraction of the coordinates with non-negligible probability.
Since F is q-collision resistant, this violates a standard Chernoff-type direct product theorem. We
provide a proof sketch, following a similar analysis done for standard collision resistance in [6].
Consider the following PPT F-collision-finder A:
On input x ∈ {0, 1}ne, pick a random i ∈ [t] along with random x1 , . . . , xi−1 , xi+1 , . . . , xt ,
compute A0 (x1 , . . . , xt ) → (x10 , . . . , xt0 ), and output xi0 .
To analyze the success probability of A0 , fix any subset S of {0, 1}ne of density q/2. If t = ω(log n/q),
then a Chernoff bound yields that
n o
(t)
Pr[A0 (X (t) ) ∈
/ B(X (t) ) ∧ i ∈ [t] : Xi ∈ S ≥ q/4] ≥ ε/4.
This means that
Pr [A0 (X (t) ) → (X10 , . . . , Xt0 ) ∧ Xi ∈ S ∧ Xi0 6= Xi ] ≥ ε/4 · q/8.
i←[t]
We may then deduce (following the same calculations in [6, Prop 2]) that
h i
Pr Pr[A(x; R) 6= x] ≥ ε/4 · q/8 · 2/q ≥ 1 − q/2.
x←X
where R is uniformly distributed over the random coins of A. By repeating A a sufficient number
of times, we may find collisions on random inputs of F with probability 1 − q, contradicting our
assumption that F is q-collision-resistant on random inputs.
Lemma 5.11 (Reducing entropy, continued). Let n be asecurity parameter and F : {0, 1}ne → {0, 1}m be a
function. Fix a family of 2-universal hash functions G = g : {0, 1}ne → {0, 1}` . Then, F 0 : {0, 1}ne ×G →
{0, 1}m × G × {0, 1}` as given by F 0 (x, g) = (F(x), g, g(x)) satisfies the following properties:
i. If F −1 has real max-entropy at most k, then (F 0 )−1 has real max-entropy at most max {k − ` + s, 0}
for any s = ω(log n).
ii. If F −1 has p-accessible max-entropy at most k, then (F 0 )−1 has p + 2−Ω(s) -accessible max-entropy
at most max {k − ` + s, 0} for any s.
Proof. In the following X and G are uniformly distributed over {0, 1}ne and G, respectively.
T HEORY OF C OMPUTING, Volume 16 (8), 2020, pp. 1–55 34
I NACCESSIBLE E NTROPY II: IE F UNCTIONS AND U NIVERSAL O NE -WAY H ASHING
i. Fix an x such that F −1 (F(x)) ≤ 2k . By 2-universal hashing,
E G−1 (G(x)) ∩ (F −1 (F(x)) \ {x}) ≤ (2k − 1) · 2−` ≤ 2k−` .
The bound on the real max-entropy of F −1 and Markov’s inequality yield that
Pr G−1 (G(X)) ∩ (F −1 (F(X)) \ {x}) ≥ 2(s−1) · 2k−` ≤ 2−(s−1) + neg(n).
The bound on the real max-entropy of (F 0 )−1 follows.
ii. Readily follows from the proof of Lemma 5.4 part ii.
5.2.1 Putting everything together
Proof of Theorem 5.8. Recall that we start out with a function F : {0, 1}n0 → {0, 1}m0 with a gap ∆
between real Shannon entropy and accessible Shannon entropy. Let kREAL denote the real Shannon
entropy of F −1 .
STEP 1 (gap amplification): Let F1 be the t-fold direct product of F for a sufficiently large t to be
determined later. That is, F1 (x1 , . . . , xt ) = (F(x1 ), . . . , F(xt )).
Lemma 5.3 yields that this repetition increases both the real and accessible entropies of F1 by a
factor of t. In addition, the repetition converts real Shannon entropy to real min-entropy and real
max-entropy (up to an additive o(t) term). More precisely:
• F1 : {0, 1}n1 → {0, 1}m1 where n1 (n) = t · ne and m1 (n) = t · m.
√
• F1−1
√ has real min-entropy at least t · kREAL − n0 st and real max-entropy at most t · kREAL +
n0 st.
• F1−1 has accessible Shannon entropy at most t · kREAL − t∆.
From the next step on, the construction again uses an additional parameter k. We will be especially
interested in the case
k ∈ [kREAL , kREAL + ∆2 /128n0 ]. (5.10)
In case this holds,
• F1−1 has accessible Shannon entropy at most tk − t∆. Lemma 3.8 yields that F1−1 has (1 −
∆/4k)-accessible max-entropy at most tk − t∆/2.
(k)
STEP 2 (entropy reduction): Apply entropy reduction to F1 with ` = tk − t∆/2 + s to obtain F2 . That
(k)
is, F2 (x, g) = (F1 (x), g, g(x)), where g : {0, 1}n1 → {0, 1}` is selected from a family of 2-universal
hash functions.
By Lemma 5.4 and Lemma 5.11, this reduces the accessible max-entropy to 0, which allows us to
(k)
deduce that F2 is weakly collision-resistant on random inputs. Assuming Equation (5.10) we have
T HEORY OF C OMPUTING, Volume 16 (8), 2020, pp. 1–55 35
I FTACH H AITNER , T HOMAS H OLENSTEIN , O MER R EINGOLD , S ALIL VADHAN AND H OETECK W EE
(k)
• F2 : {0, 1}n2 → {0, 1}m2 where n2 (n, k) = O(tn0 +`(n, k)) = O(tn0 ) and m2 (n, k) = O(tm0 +
`(n, k)) = O(tn0 ).
(k) √
• (F2 )−1 has real min-entropy at least t · (kREAL − k + ∆/2) − n0 st − 2s, which is at least
√
t · (∆/2 − ∆2 /128n0 ) − n0 st − 2s
√ √
and real max-entropy at most t · (kREAL − k + ∆/2) + n0 st ≤ t · ∆/2 + n0 st.
(k) (k)
• (F2 )−1 has (1 − ∆/4k + 2−Ω(s) )-accessible max-entropy at most 0. Thus, F2 is q-collision-
resistant on random inputs (by Lemma 3.7), for q = ∆/4k − 2−Ω(s) .
(k) (k)
STEP 3 (gap amplification): F3 is t 0 -fold direct product of F2 , where t 0 = s/q = O(ks/∆). That is,
(k) (k) (k)
F3 (x1 , . . . , xt 0 ) = (F2 (x1 ), . . . , F2 (xt 0 )).
(k)
By Lemma 5.10, this allows us to amplify the weak collision-resistance property of F2 to obtain a
(k)
gap between real min-entropy and accessible max-entropy in F3 , again assuming Equation (5.10).
(k)
• (F3 )−1 has real min-entropy at least
√
t 0 · t · (∆/2 − ∆2 /128n0 ) − n0 st − 2s .
(k) √
• (F3 )−1 has accessible max-entropy at most t 0 · (1 − q/8) · (t∆/2 + n0 st) + 1 , which is at
most: √
t 0 · t · (∆/2 − ∆q/16) + n0 st) + 1 .
(k)
Now, k ≤ n0 , so q = ∆/4k − 2−Ω(s) ≥ ∆/4n0 − 2−Ω(s) . This means (F3 )−1 has accessible
max-entropy at most:
√
t 0 · t · (∆/2 − ∆2 /64n0 + 2−Ω(s) ) + n0 st) + 1 .
√
Note that the gap is at least t 0 · t · ∆2 /128n0 − 2−Ω(s) − (2n0 st + 2s + 1) , which is at least 3s as
long as: √
t · ∆2 /128n0 ≥ 2−Ω(s) + 2n0 st + 2s + 1 + 3s/t 0
(k)
Since 3s/t 0 = 3q ≤ 3∆, we can set t = O(n0 /∆ + n0 s/∆2 + n40 s/∆4 ) = O(n40 s/∆4 ) so that (F3 )−1
has a gap of 3s between real min-entropy and accessible max-entropy, and moreover, we know
where this gap is (given k).
STEP 4: We follow steps 2, 3, 4, and 5 in the previous construction, with the following modifications in
the parameters:
• We apply entropy reduction first, with
√
` = t 0 · t · (∆/2 − ∆q/16) + n0 st) + 1 + s.
T HEORY OF C OMPUTING, Volume 16 (8), 2020, pp. 1–55 36
I NACCESSIBLE E NTROPY II: IE F UNCTIONS AND U NIVERSAL O NE -WAY H ASHING
• To remove the non-uniform advice k, we “try all possibilities” from 0 to n0 in steps of size
∆2 /128n0 .
We then obtain a non-uniform construction of UOWHFs with output and key length O(n0 · t · t 0 ) =
O(n60 s2 /∆5 ), since t = O(n40 s/∆4 ) and t 0 = O(n0 s/∆). We also obtain a uniform construction with
output length O(n0 /(∆2 /n0 ) · n0 · t · t 0 · log n) = O(n80 s2 /∆7 ) and key length O(n80 s2 /∆7 · log n).
This finishes the proof of Theorem 5.8.
6 Connection to average-case complexity
In this section we use the notion of inaccessible entropy to reprove a result by Impagliazzo and Levin [18],
given in the realm of average-case complexity. Our proof follows to a large extent the footstep of [18],
where the main novelty is formulating the proof in the language of inaccessible entropy, and rephrasing it
to make it resembles our proof of UOWHFs from one-way functions.
Section 6.1 introduces the basic notion and definitions used through the section, and in particular
what “success on the average” means. It also formally describes the result of Impagliazzo and Levin [18],
a result that we reprove in Section 6.2.
6.1 Preliminaries and the Impagliazzo and Levin result
We start by introducing some basic notions from average-case complexity.10
6.1.1 Algorithms that err
Let L be some language, and suppose A(y; r) is a randomized algorithm with input y ∈ {0, 1}s , randomness
r, and output domain {0, 1, ⊥}.11 It is useful to think that A(y, ·) is trying to decide L, where 0 and 1 are
guesses whether y ∈ L, and ⊥ signals that A refuses to guess.
Definition 6.1. A randomized algorithm A is α-correct on input y ∈ {0, 1}s with respect to a language
L ⊆ {0, 1}s , if Pr[A(y; R) = L(y)] ≥ α, where we identify languages with their characteristic functions,
and R is uniformly distributed over the possible random coins of A.
Recall that when defining a worst case complexity class (e. g., BPP), one requires Pr[A(y; R) =
L(y)] ≥ 2/3 for any y ∈ {0, 1}s (the choice of the constant 2/3 is somewhat arbitrary). In other words,
we require that an algorithm is 2/3−correct for every input.
In contrast, in average-case complexity an algorithm is allowed to be wrong on some inputs. Specif-
ically, the success probability of a decider A is measured not with respect to a single input, but with
respect to a distribution over the elements of {0, 1}s .
10 We limit the following discussion only to notions that we actually use, so this section should not be regarded as an
comprehensive introduction to the field of average-case complexity (see Bogdanov and Trevisan [5] for such an introduction).
While the definitions given here are equivalent to the ones given in [5], some of them are formulated somewhat differently (the
interested reader is welcome to check their equivalence).
11 We make the less common choice of using y as the input variable (and not x), since in our applications the element y is
sampled as the output of a one-way function.
T HEORY OF C OMPUTING, Volume 16 (8), 2020, pp. 1–55 37
I FTACH H AITNER , T HOMAS H OLENSTEIN , O MER R EINGOLD , S ALIL VADHAN AND H OETECK W EE
To make this formal, we first a problem as such a pair of language and distribution family.
Definition 6.2 (Problem). A problem is a pair (L, D) of language and distribution family, where L ⊆
{0, 1}s and D = {Di }i∈N is a family of distributions over {0, 1}s .
The problem class HeurBPP (see, e. g., [5, Definition 15]) contains those problems for which there
exists an efficient algorithm that is correct on all but a “small” fraction of the inputs.
Definition 6.3 (HeurBPP). A problem (L, D) is in HeurBPP, if there exists a four-input algorithm A
such that the following holds for every (n, δ ) ∈ N × (0, 1]: A(·, 1n , δ ; ·) runs in time p(n, 1/δ ) for some
p ∈ poly, and
Pr [A(Y, 1n , δ ; ·) is 32 -correct on Y ] ≥ 1 − δ .
Y ←Dn
In one second we will restrict ourselves to the case that the sampler runs in polynomial time. Then,
this means that A is allowed to run in time polynomial in the “sampling complexity” of the instance, and
inverse polynomial in the probability with which it is allowed to err.
6.1.2 Samplable distributions
We next study the families of distributions D to consider. While it is most natural to focus on efficiently
samplable distributions, the definition of HeurBPP does not pose such limits on the distributions con-
sidered; a pair (L, D) can be decided efficiently on average even if sampling the distribution D is a
computationally hard problem. For the reduction, however, we restrict ourselves to polynomial-time
samplable distributions. This limitation is crucial, since there exist (not efficiently samplable) distributions
D with the property that (L, D) ∈ HeurBPP if and only if L ∈ BPP (see [22] or [5, Section 2.5]).
Definition 6.4 (Polynomial-time samplable distributions, (NP, PSamp)). A distribution family D =
{Dn }n∈N is polynomial-time samplable, denoted D ∈ PSamp, if there exists a polynomial-time com-
putable function D and a polynomial p(n), such that D(1n ,Up(n) ) is distributed according to Dn for every
n ∈ N, where Um is the uniform distribution over m-bit strings.
The product set (NP, PSamp) denotes the set of all pairs (L, D) with L ∈ NP and D ∈ PSamp.
Note that the input Up(n) to D above is the only source of randomness used to sample the elements of
D. In the following we make use the distribution family U = {Un }n∈N (clearly, U ∈ PSamp).
6.1.3 Impagliazzo and Levin result
In the above terminology the result of Impagliazzo and Levin [18] can be stated as follows (cf., [5, Thm.
29]):
Theorem 6.5 ([18]). If (NP, PSamp) 6⊆ HeurBPP, then ∃L ∈ NP with (L, U) ∈
/ HeurBPP.
In other words, suppose there exists an average-case hard problem whose distribution is polynomial-
time samplable, then there exists an average-case hard problem whose distribution is uniform.
T HEORY OF C OMPUTING, Volume 16 (8), 2020, pp. 1–55 38
I NACCESSIBLE E NTROPY II: IE F UNCTIONS AND U NIVERSAL O NE -WAY H ASHING
(L, D) ∈
/ HeurBPP (L0 , U) ∈
/ HeurBPP
Trivial Theorem 6.8
Lemma 6.9
(R, D) ∈
/ SearchHeurBPP (V, U) ∈
/ SearchHeurBPP
Figure 1: Schematic proof of Theorem 6.5.
6.1.4 Search problems
The proof of Theorem 6.5 uses the notion of “NP-search problems”:
Definition 6.6 (Search problems). A search problem is a pair (R, D), where R ⊆ {0, 1}s × {0, 1}s is a
binary relation, and D = {Di }i∈N is a family of distributions over {0, 1}s . In case R is an NP-relation,
then (R, D) is an NP-search problem.
For a relation R, let RL be the corresponding language, i. e., RL = {y : ∃w : (y, w) ∈ R}.
The notion of heuristics is naturally generalized to NP-search problems. The only change is that
in case the algorithm claims y ∈ RL , it additionally has to provide a witness to prove that. A search
algorithm A always outputs a pair, and we let A1 be the first component of this pair, and A2 be the second
component.
Definition 6.7 (SearchHeurBPP). An NP-search problem (R, D) is in SearchHeurBPP, if there exists
an algorithm A outputting pairs in {0, 1, ⊥} × {0, 1}s such that the following holds: (1) A1 is a heuristic
for (RL , D) and (2) A1 (y, 1n , δ ; r) = 1 =⇒ (y, A2 (y, 1n , δ ; r)) ∈ R.
Algorithm A is called a (randomized) heuristic search algorithm for (R, D).
6.1.5 Search problems vs. decision problems
Suppose (L, D) ∈ (NP, PSamp) is a “difficult decision problem”. Then any NP-relation associated with
L gives a “difficult NP-search problem”, because finding a witness also solves the decision problem.
The converse direction is less obvious (recall that even in worst-case complexity, one invokes self-
reducibility). Nevertheless, [4] prove the following (see also [5, Thm. 4.5]):
Theorem 6.8 ([4]). Suppose there is an NP-relation R with (R, U) ∈
/ SearchHeurBPP, then ∃L ∈ NP
with (L, U) ∈
/ HeurBPP.
Using Theorem 6.8 the proof of Theorem 6.5 proceeds as follows (see Figure 1): suppose there is a pair
(L, D) ∈ (NP, PSamp) \ HeurBPP and let R be an NP-relation for L. Then (R, D) ∈ / SearchHeurBPP
(if (R, D) would have a search heuristic algorithm, the first component of this algorithm, that outputs its
left hand side output, would place (L, D) ∈ HeurBPP.) The following lemma states that in this case there
is a pair (V, U) ∈
/ SearchHeurBPP, and therefore Theorem 6.8 yields the conclusion.
Lemma 6.9 (Impagliazzo and Levin [18] main lemma, reproved here). Assume that there exists an NP-
search problem (R, D) with D ∈ PSamp such that (R, D) ∈
/ SearchHeurBPP, then there is an NP-relation
V such that (V, U) ∈
/ SearchHeurBPP.
T HEORY OF C OMPUTING, Volume 16 (8), 2020, pp. 1–55 39
I FTACH H AITNER , T HOMAS H OLENSTEIN , O MER R EINGOLD , S ALIL VADHAN AND H OETECK W EE
Intuitively Lemma 6.9 states the following: suppose some sampling algorithm D(1n , ·) generates hard
search problems, then there exist an NP-search problem that is hard over the uniform distribution.
Consider the following application of Lemma 6.9; suppose that one-way functions exist and let
f : {0, 1}n → {0, 1}n be a length-preserving one-way function. Let D(1n , r) be the algorithm that
applies f : {0, 1}n → {0, 1}n on the input randomness x = r and outputs f (x), and set D = {Dn }n∈N
to the corresponding distribution family. Furthermore, consider the NP-search problem given by the
relation R = {( f (x), x) : x ∈ {0, 1}s }. It is easy to verify that the NP-search problem (R, D) is not in
SearchHeurBPP.
Lemma 6.9 implies that hard problems exist for some uniform distribution, under the assumption that
one-way functions exist. However, we knew this before: if one-way functions exist, then UOWHFs also
exist. Let Fk be such a family as in Definition 1.1, and consider the relation V = {((z, x), x0 ) : Fz (x) =
Fz (x0 ), x 6= x0 }, which asks us to find a non-trivial collision in Fz with a given x. By the security property
of UOWHF, if we pick (z, x) uniformly at random, then this is a hard problem, and it is possible to show
that (V, U) ∈ / SearchHeurBPP. Thus, Rompel’s result gives the conclusion of Lemma 6.9 in case we
have the stronger assumption that one-way functions exist.
Given the above, it seems natural to ask whether the strategy used for constructing UOWHF from
one-way functions, can be used for proving the general case stated in Lemma 6.9. In the following
section we answer the above question in the affirmative. Specifically, we present an alternative proof
for Lemma 6.9 following a similar approach to that taken in the first part of this paper for constructing
UOWHF from one-way functions.
6.1.6 The Valiant-Vazirani lemma
We make use of the Valiant-Vazirani Lemma, originated in [31].
Lemma 6.10 ([31]). Let S ⊆ {0, 1}s be a set such that 2k ≤ |S| ≤ 2k+1 , and let
n o
G = g : {0, 1}s → {0, 1}k+2
be a family of pairwise independent hash-functions. Then for any x ∈ S, it holds that
Pr [g−1 (0k+2 ) = {x}] ≥ 1/2k+3 .
g←G
A more usual formulation of the Valiant-Vazirani lemma states that with probability at least 18 there is
exactly one element x ∈ S with g(x) = 0k+2 . This follows immediately from the above form.
Proof. The probability that g(x) = 0k+2 for a fixed x ∈ S is 1/2k+2 . Conditioned on this event, due to
the pairwise independence of G, the probability that any other element of S is mapped to 0k+2 is at most
|S| /2k+2 ≤ 1/2.
6.2 Proving Lemma 6.9 via inaccessible entropy
Recall the basic idea underlying the two constructions of UOWHF from one-way functions presented in
the first part of this paper. In a first step, we use the one-way function f to construct a function F that
T HEORY OF C OMPUTING, Volume 16 (8), 2020, pp. 1–55 40
I NACCESSIBLE E NTROPY II: IE F UNCTIONS AND U NIVERSAL O NE -WAY H ASHING
induces a gap between its real and accessible entropy (i. e., F has “inaccessible entropy”). Roughly, the
distribution induced by the output of any efficient “collision-finder” algorithm getting a random x and
returning a random x0 ∈ F −1 (F(x)), has a smaller entropy than that induced by random preimage of F(x).
Afterwards, we use F to build the UOWHF.
We want to redo this first step in the current setting. Now, however, it is not anymore important to
talk about collisions.12 Thus, we can instead define F such that F −1 (y) has some inaccessible entropy for
a uniform random y. This is in fact compatible with the construction given in Section 4.2: it is possible to
show that the image of F is close to uniform in case i ≈ H f (X) ( f (x)) (recall that i is the number of bits
hashed out from f (x) in the definition of F).
Let now (R, D) be an NP search problem with D ∈ PSamp which is not in SearchHeurBPP. We
would like to use a similar approach as above to define a relation with limited accessible max-entropy. One
might suggest that the following search problem has inaccessible entropy: given a four tuple (n, i, g, z),
where g is a pairwise independent hash-function, and z has i bits, find as solution an input x such that
g(D(1n , x))1,...,i = z. However, it turns out that one does not in fact need the randomness inherent in the
choice of z (note that a typical pairwise independent hash-function XORs the output with a random string
anyhow). Instead, it makes no difference to fix z = 0i , and so we adopt this to simplify the notation, so
that the suggested search problem becomes to find x with g(D(1n , x))1,...,i = 0i for a given triple (n, i, g).
Problems with the above intuition and postprocessing the witness. A moment of thought reveals
that there can be cases where this suggested search problem is easy. For example if the sampler D(1n , x)
simply outputs y = x itself, which is possible if finding w with (y, w) ∈ R is difficult for a uniform
random y. The solution is easy: ask the solving algorithm to output also a matching witness w with
(D(1n , x), w) ∈ R (ignore invalid outputs).
Thus, the suggested search problem becomes: “given (n, i, g), find (x, w) such that g(D(1n , x))1...i = 0i
and (D(1n , x), w) ∈ R”. The hope is then that this search problem has limited accessible entropy in the
coordinate corresponding to x (we do not want to talk about the entropy in w because it arise from the
number of witnesses which R has, and at this point we have no control over this number).
There is a last little problem to take care of: it is not obvious how to encode n into the search problem,
as (n, i, g) does not look like a uniform bitstring of a certain length, even if i and g look random. However,
it is possible to ensure that the length of (i, g) uniquely define n, and we assume that this is done in such a
way that n can be easily computed from the length of (i, g).
6.2.1 A Relation with bounded accessible average max-entropy
Using the above discussion, we now finally have enough intuition to define the relation Q. For D ∈ PSamp,
we let Canon(D) be an arbitrary polynomial-time sampler for D.
Construction 6.11. Let R be an NP relation, let D ∈ PSamp, let D = Canon(D) and let d ∈ N be such
that D’s running time on input (1n , ·) is bounded by nd . Let G be an explicit and constructible family of
pairwise independent hash functions, where the family Gm maps all strings of length at most m to strings
of length m.
12 One advantage of using an “inversion problem” instead of a “collision problem” is that it becomes possible to use two-wise
independent hash-functions (instead of three-wise independent).
T HEORY OF C OMPUTING, Volume 16 (8), 2020, pp. 1–55 41
I FTACH H AITNER , T HOMAS H OLENSTEIN , O MER R EINGOLD , S ALIL VADHAN AND H OETECK W EE
For n ∈ N, define
n d
o
Q(n) := (i, g), (x, w) : x ∈ {0, 1}n , i ∈ [nd ], g ∈ Gnd , g(D(1n , x))1...i = 0i , (D(1n , x), w) ∈ R ,
and let Q := n∈N Q
(n) .
S
Note that the elements of Gm in Construction 6.11 have domain m i
i=0 {0, 1} . This somewhat unusual
S
requirement is needed since the sampler might output strings of arbitrary lengths (up to nd ).
From now on, we will only consider the case where we have some fixed sampler D in mind. In
this case, whenever n is given, we will assume that (i, g) are elements satisfying the conditions in
Construction 6.11. Furthermore, we assume without loss of generality that (the encoding of) a uniform
random bitstring, of the right length, induces the uniform distribution on Gnd × [nd ].
6.2.2 Accessible average max-entropy
We next define what it means for an NP-search problem to have limited accessible max-entropy, with
respect to a part of its witness. This notion is modeled by introducing a function f that outputs the
“interesting part” of the witness.
Definition 6.12. Let Q be an NP relation, and f : {0, 1}s → {0, 1}s a function. For y ∈ {0, 1}s let
SQ, f (y) := { f (w) : (y, w) ∈ Q} .
The real average max-entropy of (Q, D) with respect to f , is the function
Q,D, f (m) = E [log( SQ, f (Y ) )]
HReal
Y ←Dm
letting log(0) := −1.13
In case the relation R and f are clear from the context, we sometimes write S(y) instead of SQ, f (y).
We next define a useful notion of limited accessible max-entropy in this setting. Here, one should think
of algorithm A as an algorithm which, on input y produces a witness w with (y, w) ∈ Q. It furthermore
“aims” to produce witnesses w for which f (w) has as much entropy as possible.
Definition 6.13. Let f : {0, 1}s → {0, 1}s , p, k : N × (0, 1] → R, Q an NP-relation, and D = {Dn }n∈N a
family of distributions. The pair (Q, D) has i.o. (infinitely often) ρ-accessible average max-entropy at
most k with respect to f , if for every four-input algorithm A(y, 1m , ε; r) running in time ` = `(m, 1/ε)
for some ` ∈ poly, there exists infinitely many m’s in N, a function ε(m) = ε ∈ (0, 1] and an ensemble of
set families n o
{Lm (y) ⊆ {0, 1}s }y∈Supp(Dm ) ,
m∈N
such that
Γ(Y, A(Y, 1m , ε; R)) ∈ Lm (Y ) ∪ {⊥} ≥ 1 − ρ(m, ε)
Pr (6.1)
Y ←Dm ,R←{0,1}`
13 The convention log(0) = −1 helps us to simplify notation in a few places. (We need that log(0) < log(1) since an algorithm
which produces no valid witness should produce less entropy than an algorithm which produces some valid witness).
T HEORY OF C OMPUTING, Volume 16 (8), 2020, pp. 1–55 42
I NACCESSIBLE E NTROPY II: IE F UNCTIONS AND U NIVERSAL O NE -WAY H ASHING
where Γ = ΓQ, f (y, w) equals f (w) in case (y, w) ∈ Q and equals ⊥ otherwise, and
E log(|Lm (Y )|) ≤ k(m, ε) (6.2)
Y ←Dm
The following lemma, proven in Section 6.2.3, states that the relation Q defined in Construction 6.11
has limited accessible max-entropy with respect to the function (x, w) → x.
Lemma 6.14. Let (R, D) be an NP relation with D ∈ PSamp and (R, D) ∈ / SearchHeurBPP. Define
Q from (R, D) as in Construction 6.11, and let f : {0, 1}s × {0, 1}s → {0, 1}s be given by f (x, w) = x.
Then, for any c ∈ N, (Q, U) has i.o. ( mε )c -accessible average max-entropy at most
√
k(m, ε) = HReal c
Q,U, f (m) − ε · m
c
with respect to f .
Lemma 6.14 is proven below, and in Section 6.2.4 we use Lemma 6.14 for proving Lemma 6.9.
The latter is done by additionally fixing the value of h(x)1... j , where h is an additional random hash
function, and j is a random integer (in a certain range). The ratio is that an algorithm producing (x, w)
with h(x)1... j = 0 j , can be used to access max-entropy roughly 2 j .
6.2.3 Proving Lemma 6.14
The proof of Lemma 6.14 follows similar lines to the proof of Theorem 4.5.
Proof (of Lemma 6.14). Let A be an algorithm that “aims to produce max-entropy for Q”. Without loss
of generality, we assume that A either outputs a valid witness (x, w) for a given input (i, g) or ⊥. We show
how to find infinitely many m ∈ N, ε = ε(m) ∈ (0, 1], and ensemble of set families
n o
Lm = {Lm (i, g)}(i,g)∈QL
m∈N
with the properties as required in the lemma (we write Lm (i, g) instead of Lm (y) because the elements
of QL are pairs (i, g)). Towards achieving the above, consider the following candidate algorithm B for a
search heuristics for (R, D).
Let β ∈ N be a constant to be determined by the analysis, let d ∈ N be such that nd is an upper bound
on the runtime of the sampler D (recall that we have fixed D above) on input (1n , ·), and let ` = `(m, ε)
be an upper bound on the running time of A on parameters m and ε. Let m(n) be the description length of
a pair in [nd ] × Gnd and let
β
ε(n, δ ) = δ (n)/nβ .
T HEORY OF C OMPUTING, Volume 16 (8), 2020, pp. 1–55 43
I FTACH H AITNER , T HOMAS H OLENSTEIN , O MER R EINGOLD , S ALIL VADHAN AND H OETECK W EE
Search Heuristics BA
Oracle: A / Entropy generator for (Q, U).
Input: y ∈ {0, 1} , 1n and δ ∈ (0, 1].
s
ε = ε(n, δ ); m = m(n); ` = `(m, ε)
repeat n · (n/δ )β times:
i ← [n d
]0
g ← g ∈ Gnd : g0 (y)1...i = 0i
r ← {0, 1}`
(x, w) = A((i, g), 1m , ε; r)
if (y, w) ∈ R and D(1n , x) = y
return (1, w)
return 0
It is clear that, for any β ∈ N, the running time of B is poly(n, 1/δ ). Recall that, by assumption,
(R, D) has no search heuristics. Algorithm B satisfies item (2) of Definition 6.7, and since (R, D) has no
search heuristics, it must fail to satisfy item (1) (i. e., B1 cannot be a randomized heuristic). Note now that
algorithm B1 is always correct if y ∈ / L (always outputs 0). Thus, B must fail to be correct on some y ∈ L.
In fact, there exist infinitely many n’s in N and a function δ = δ (n) ∈ (0, 1]. such that B1 (y, 1n , δ ; ·) is
not 23 -correct for more than a fraction δ of the inputs y ∈ RL produced by D(1n , ·).
The following discussion is with respect to any fixed pair (n, δ = δ (n)) from the above infinite set.
We present a family of sets
{Lm (i, g)}(i,g)∈G d ×[nd ]
n
for which Equations (6.1) and (6.2) holds with respect to algorithm A and f , for the parameters m = m(n),
ε = ε(n, δ ) = ε(m), and ρ and k as stated in the lemma. Since this holds for any such pair (n, δ ) and
since m(n) ∈ Ω(n) (and thus, there are infinitely many different m’s) the proof of the lemma would follow.
Consider the following set
ε β
−1 n
Y = y ∈ L: Pr m I
[A1 ((I, G), 1 , ε; R) ∈ D (1 , y) : G(y)1...I = 0 ] < , (6.3)
I←[nd ] n
`
G←G d ,R←{0,1}
n
letting n o
d
D−1 (1n , y) := x ∈ {0, 1}n : D(1n , x) = y
and let ` = `(m). Note that Y contains all the y’s in L for which B1 is not 23 -correct, and the above
discussion implies
Pr [Y ∈ Y] > δ (6.4)
Y ←Dn
Towards defining the sets {L_m(i, g)}, we partition the preimages of the elements in Y into buckets; for i ∈ {0, . . . , n^d − 1} let

    L(i) := {x ∈ {0, 1}^{n^d} : D(1^n, x) ∈ Y ∩ {y : H_{D_n}(y) ∈ [i, i + 1)}},    (6.5)

where H_{D_n}(y) = log(1/D_n(y)) is the sample entropy of y with respect to the distribution D_n. In words: L(i) contains those x for which y = D(1^n, x) ∈ L is an element for which B(y) is unlikely to produce a witness, and for which y has roughly 2^{n^d − i} preimages.
For (i, g) ∈ [n^d] × G_{n^d}, the set L_m(i, g) is defined as

    L_m(i, g) := S(i, g) \ L(i),    (6.6)

where S(i, g) is taken from Definition 6.12. In the remainder of the proof we show that, for the right choice of β, Equations (6.1) and (6.2) hold with respect to the family {L_m(i, g)} for the functions ρ(m, ε) = (ε/m)^c and k(m, ε) = H^Real_{Q,U,f}(m) − ε^{1/c} · m^c. The lemma easily follows from the next two claims.
Claim 6.15. We have

    Pr_{I←[n^d], G←G_{n^d}, R←{0,1}^ℓ} [A_1((I, G), 1^m, ε; R) ∈ L(I)] ≤ 2n^d · (ε/n)^β.
Claim 6.16. For i ∈ [n^d] and g ∈ G_{n^d}, let S̃(i, g) = S(i, g) in case this set is non-empty and S̃(i, g) = {⊥} otherwise. Then

    Pr_{I←[n^d], G←G_{n^d}, X←S̃(I,G)} [X ∈ L(I)] ≥ δ/(10n^d).
Before proving the above claims, we first use them to conclude the proof of the lemma.
Claim 6.15 implies that, for large enough β,

    Pr_{I←[n^d], G←G_{n^d}, R←{0,1}^ℓ} [Γ((I, G), A((I, G), 1^m, ε; R)) ∈ L_m(I, G) ∪ {⊥}] ≥ 1 − 2n^d · (ε/n)^β ≥ 1 − (ε/m)^c,    (6.7)

yielding that Equation (6.1) holds for {L_m(i, g)} and ρ.
Applying Markov's inequality to Claim 6.16 yields that

    Pr_{X←S̃(i,g)} [X ∈ L(i)] ≥ δ/(20n^d)    (6.8)

for at least a δ/(20n^d) fraction of the pairs (i, g) ∈ [n^d] × G_{n^d}, where for any such pair it holds that

    log(|L_m(i, g)|) ≤ log((1 − δ/(20n^d)) · |S(i, g)|) ≤ log(|S(i, g)|) − δ/(20n^d).    (6.9)
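The step from Claim 6.16 to Equations (6.8) and (6.9) is a standard averaging argument; writing p(i, g) := Pr_{X←S̃(i,g)}[X ∈ L(i)] ∈ [0, 1], it reads as follows (this verification is ours):
\[
\frac{\delta}{10n^{d}} \;\le\; \operatorname{E}\big[\,p(I,G)\,\big]
\;\le\; \Pr\Big[p(I,G) \ge \frac{\delta}{20n^{d}}\Big] + \frac{\delta}{20n^{d}},
\qquad\text{hence}\qquad
\Pr\Big[p(I,G) \ge \frac{\delta}{20n^{d}}\Big] \;\ge\; \frac{\delta}{20n^{d}} .
\]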
It follows that, for large enough β,

    H^Real_{Q,U,f}(m) − E[log(|L_m(I, G)|)] = E_{(I,G)←[n^d]×G_{n^d}} [log(|S(I, G)|) − log(|L_m(I, G)|)]
        ≥ δ²/(400n^{2d}) = (ε^{1/β} · n^β)²/(400n^{2d})
        ≥ ε^{1/c} · m^c.

Hence, Equation (6.2) holds for {L_m(i, g)} and k, and the proof of the lemma follows.
Proof of Claim 6.15. Compute

    n^d · (ε/n)^β ≥ Σ_{y∈Y} D_n(y) · n^d · (ε/n)^β    (6.10)
        ≥ Σ_{i=1}^{n^d} Σ_{y∈D(1^n,L(i))} D_n(y) · Pr_{G←G_{n^d}, R←{0,1}^ℓ} [A_1((i, G), 1^m, ε; R) ∈ D^{-1}(1^n, y) | G(y)_{1...i} = 0^i],

letting D(1^n, L(i)) := {D(1^n, x) : x ∈ L(i)}. In addition, for any (i, r) it holds that

    Σ_{y∈D(1^n,L(i))} D_n(y) · Pr_{G←G_{n^d}} [A_1((i, G), 1^m, ε; r) ∈ D^{-1}(1^n, y) | G(y)_{1...i} = 0^i]
        = Σ_{y∈D(1^n,L(i))} D_n(y) · 2^i · Pr_{G←G_{n^d}} [A_1((i, G), 1^m, ε; r) ∈ D^{-1}(1^n, y) ∧ G(y)_{1...i} = 0^i]
        = Σ_{y∈D(1^n,L(i))} D_n(y) · 2^i · Pr_{G←G_{n^d}} [A_1((i, G), 1^m, ε; r) ∈ D^{-1}(1^n, y)]
        ≥ (1/2) · Σ_{y∈D(1^n,L(i))} Pr_{G←G_{n^d}} [A_1((i, G), 1^m, ε; r) ∈ D^{-1}(1^n, y)]
        ≥ (1/2) · Pr_{G←G_{n^d}} [A_1((i, G), 1^m, ε; r) ∈ L(i)].

Collecting the equations yields the claim.
For the proof of Claim 6.16, we first need a pairwise-independence analogue of Claim 4.8. The proof is exactly the same, except a bit simpler, as we fix the output w instead of fixing another preimage. We provide it for completeness.
Claim 6.17. Let i ∈ [n^d], w ∈ {0, 1}^i and x* ∈ {0, 1}^{n^d} be such that H_{f(X)}(f(x*)) ≥ i. Then

    Pr_{g←G_{n^d}, x←(g∘f)^{-1}(w)} [x = x*] ≥ 2^{−n^d}/10,

letting (g∘f)^{-1}(w) equal the set {x : g(f(x))_{1...i} = w} in case this set is not empty, and {⊥} otherwise.
Proof. Let G be uniformly distributed over G_{n^d}, and let E be the event that G(f(x*))_{1...i} = w. Note that Pr[E] = 2^{−i} and that

    Pr_{x←(G∘f)^{-1}(w)} [x = x* | E] = 1/|(G∘f)^{-1}(w)|.    (6.11)

The pairwise independence of G_{n^d} yields that Pr[G(f(x))_{1...i} = w | E] = 2^{−i} for any x ∈ {0, 1}^{≤n^d} \ f^{-1}(f(x*)). Hence, E[|(G∘f)^{-1}(w) \ f^{-1}(f(x*))| | E] ≤ 2 · 2^{−i+n^d}, and by Markov's inequality

    Pr[|(G∘f)^{-1}(w) \ f^{-1}(f(x*))| ≤ 4 · 2^{−i+n^d} | E] ≥ 1/2.    (6.12)
Combining the above inequality with the assumption H_{f(X)}(f(x*)) ≥ i, we get

    Pr[|(G∘f)^{-1}(w)| ≤ 5 · 2^{−i+n^d} | E] ≥ 1/2,    (6.13)

and conclude that

    Pr_{x←(G∘f)^{-1}(w)} [x = x*] = Pr[E] · Pr_{x←(G∘f)^{-1}(w)} [x = x* | E]
        ≥ 2^{−i} · (1/2) · 1/(5 · 2^{−i+n^d})
        = 2^{−n^d}/10.
Proof of Claim 6.16. For i ∈ {0, . . . , n^d} and x ∈ L(i), Claim 6.17 yields that

    Pr_{G←G_{n^d}, X←S̃(i,G)} [X = x] ≥ 2^{−n^d}/10.    (6.14)

By Equation (6.4) it holds that Pr_{Y←D_n}[Y ∈ Y] > δ, and therefore

    Pr_{X←{0,1}^{n^d}} [X ∈ ∪_{i=1}^{n^d} L(i)] = Pr_{Y←D_n}[Y ∈ Y] > δ.

We conclude that

    Pr_{I←[n^d], G←G_{n^d}, X←S̃(I,G)} [X ∈ L(I)] ≥ E[|L(I)|] · 2^{−n^d}/10
        ≥ (δ · 2^{n^d}/n^d) · (2^{−n^d}/10) = δ/(10n^d).
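The penultimate inequality uses that the buckets L(i) are disjoint (each x lands in the bucket determined by the sample entropy of D(1^n, x)), so (this verification is ours):
\[
\operatorname{E}_{I\leftarrow[n^{d}]}\big[\,|L(I)|\,\big]
\;=\; \frac{1}{n^{d}}\sum_{i}|L(i)|
\;=\; \frac{\big|\bigcup_{i} L(i)\big|}{n^{d}}
\;>\; \frac{\delta\cdot 2^{n^{d}}}{n^{d}} .
\]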
6.2.4 A difficult problem for the uniform distribution

In this section we show how to transform a uniform search problem with a gap between its real and accessible entropy into a uniform search problem for which no heuristic search algorithm exists (i. e., the problem is not in SearchHeurBPP). Combining this transformation with Lemma 6.14 concludes the proof of Lemma 6.9.
The transformation is achieved by adding an additional restriction on the witness of the given search problem: specifically, requiring its “hash value” with respect to a randomly chosen pairwise-independent hash function to be the all-zero string.
We use the following construction:
Construction 6.18. Let Q be an NP-relation, let f : {0, 1}^s → {0, 1}^s be a function, let G = {G_m} be a family of pairwise-independent hash functions, where the functions of G_m map strings of length at most m to strings of length m (as in Construction 6.11), and let d ∈ N be such that (y, w) ∈ Q =⇒ |f(w)| ≤ |y|^d. For n ∈ N let

    V^(n) := {((y, j, g), w) : y ∈ {0, 1}^n, j ∈ [n^d + 2], g ∈ G_{n^d+2}, (y, w) ∈ Q, g(f(w))_{1...j} = 0^j}

and let V := ∪_{n∈N} V^(n).

As in Construction 6.11, we assume that the tuples (y, j, g) above can be encoded such that a uniformly random string of the right length decodes to a uniformly random tuple in {0, 1}^n × [n^d + 2] × G_{n^d+2}.
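The hash families used throughout this section only need pairwise independence and the prefix test g(·)_{1...j} = 0^j. A standard way to instantiate such a family is a random affine map over GF(2); the Python sketch below is a minimal, illustrative implementation (the class name and interface are ours, not the paper's, and we fix a single input length — handling inputs of varying length, as in Construction 6.11, would additionally require an injective padding that we omit).

import random

class AffineHash:
    """g(x) = M·x + b over GF(2); a uniformly random affine map is pairwise independent."""
    def __init__(self, in_bits, out_bits):
        self.in_bits, self.out_bits = in_bits, out_bits
        # each row of M is an in_bits-bit mask; b is an out_bits-bit offset
        self.rows = [random.getrandbits(in_bits) for _ in range(out_bits)]
        self.offset = random.getrandbits(out_bits)

    def evaluate(self, x):
        """Return g(x) as a list of output bits; x is an int with at most in_bits bits."""
        bits = []
        for k, row in enumerate(self.rows):
            parity = bin(row & x).count("1") & 1          # inner product <row, x> over GF(2)
            bits.append(parity ^ ((self.offset >> k) & 1))
        return bits

    def prefix_is_zero(self, x, j):
        """Check the condition g(x)_{1...j} = 0^j used in Construction 6.18."""
        return all(b == 0 for b in self.evaluate(x)[:j])

The random offset makes every single output uniform, and the random matrix makes any two outputs on distinct inputs jointly uniform, which is exactly the pairwise independence used in Lemma 6.10 and Claim 6.17.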
Lemma 6.19. Let Q, f, d and V be as in Construction 6.18. Suppose that (Q, U) has i.o. ε²/(50m^d)-accessible average max-entropy at most H^Real_{Q,U,f}(m) − 5εm^d with respect to f. Then (V, U) ∉ SearchHeurBPP.
Proof. We assume towards a contradiction that (V, U) ∈ SearchHeurBPP, and show that (Q, U) has too high an accessible average max-entropy.
Let A be a randomized search heuristic for (V, U). The following algorithm B contradicts the assumption that (Q, U) has i.o. ε²/(50m^d)-accessible average max-entropy at most H^Real_{Q,U,f}(m) − 5εm^d with respect to f.
Let ℓ = ℓ(n, δ) be an upper bound on the running time of A on parameters n and δ. Let n(m) be the description length of a triplet in {0, 1}^m × [m^d + 2] × G_{m^d+2}, and let δ(m, ε) = ε²/(100m^d).
Entropy generator B^A for (Q, U)

Oracle: A (a search heuristic for (V, U)).
Input: y ∈ {0, 1}^m, 1^m and ε ∈ (0, 1].

    δ = δ(m, ε); n = n(m); ℓ = ℓ(n, δ)
    j ← {2, . . . , m^d + 2}
    g ← G_{m^d+2}
    r ← {0, 1}^ℓ
    (b, w) = A(y, j, g, 1^n, δ; r)
    if b = 1 return w, else return ⊥
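Again purely for illustration, the single call that B^A makes to the heuristic A can be written as follows; the helpers delta_of, n_of, ell_of and sample_hash are placeholders for the parameter functions and the hash-family sampler fixed above, not names from the paper.

import random

def entropy_generator_B(y, m, epsilon, A, delta_of, n_of, ell_of, sample_hash, d):
    """Illustrative sketch of the entropy generator B^A for (Q, U)."""
    delta = delta_of(m, epsilon)
    n = n_of(m)
    length = ell_of(n, delta)
    j = random.randrange(2, m ** d + 3)        # j <- {2, ..., m^d + 2}
    g = sample_hash(m ** d + 2)                # g <- G_{m^d + 2}
    r = random.getrandbits(length)             # r <- {0,1}^ell
    b, w = A(y, j, g, n, delta, r)             # one call to the search heuristic
    return w if b == 1 else None               # None plays the role of the symbol ⊥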
It is clear that the running time of B is poly(m, 1/ε). We show that B achieves high max-entropy for all (except maybe finitely many) values of m and any ε. Specifically, we show that for all (except maybe finitely many) pairs (m, ε), there exists no family {L_m(y)}_{y∈{0,1}^m} as described in Definition 6.13.
Fix m and ε, and let the random variables Y, J, G and R be uniformly chosen from {0, 1}^m × [m^d + 2] × G_{m^d+2} × {0, 1}^ℓ. For y ∈ {0, 1}^m, let η(y) be the probability that A(y, J, G, 1^n, δ; ·) is not 2/3-correct. Since E[η(Y)] ≤ δ = ε²/(100m^d), it holds that

    Pr[η(Y) ≤ ε/(8m^d)] ≥ 1 − ε.    (6.15)
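Equation (6.15) is a direct application of Markov's inequality; spelled out (this verification is ours):
\[
\Pr\Big[\eta(Y) > \frac{\varepsilon}{8m^{d}}\Big]
\;\le\; \frac{\operatorname{E}[\eta(Y)]}{\varepsilon/(8m^{d})}
\;\le\; \frac{\varepsilon^{2}/(100m^{d})}{\varepsilon/(8m^{d})}
\;=\; \frac{8\varepsilon}{100}
\;<\; \varepsilon .
\]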
Fix y ∈ {0, 1}^m and let S(y) = S_{Q,f}(y) = {f(w) : (y, w) ∈ Q}. Intuitively, since f(w) is the “interesting part” of a Q-witness w for y, S(y) is the set of interesting parts of witnesses of y.
For x ∈ S(y) let E(y, x) ⊆ [m^d + 2] × G_{m^d+2} be the set of pairs (j, g) for which x is the only element in S(y) with g(x)_{1...j} = 0^j. Going back to our intuition again, if x is the “interesting part” of a Q-witness for y, then E(y, x) is the set of pairs (j, g) such that, on input (y, j, g), algorithm A must output a w with f(w) = x in order to be successful.
By Lemma 6.10, for every fixed pair (y, x) with x ∈ S(y):

    Pr[(J, G) ∈ E(y, x)] ≥ 1/(8|S(y)|m^d).    (6.16)

The following argument would become simpler if Equation (6.16) held with equality, or if at least Pr[(J, G) ∈ E(y, x)] were the same for every x ∈ S(y). This may of course not be the case, but since all pairs (j, g) have the same probability, we can simply discard elements from the sets E(y, x) until they all have the same size. We call the resulting sets E′(y, x). Hence, we can assume that E′(y, x) ⊆ E(y, x) and

    Pr[(J, G) ∈ E′(y, x_1)] = Pr[(J, G) ∈ E′(y, x_2)] ≥ 1/(8|S(y)|m^d)

for all x_1, x_2 ∈ S(y).
We next let E′(y) = ∪_{x∈S(y)} E′(y, x). Intuitively, E′(y) consists of those pairs (j, g) such that A on input (y, j, g) is forced to answer with some unique x. Let x_y(j, g) be the unique element of S(y) with g(x)_{1...j} = 0^j in case it exists, and ⊥ otherwise, and let X_y = x_y(J, G). Because of the above trickery with E′, conditioned on (J, G) ∈ E′(y), the random variable X_y is uniformly distributed over S(y). Furthermore, since Pr[(J, G) ∈ E′(y)] ≥ 1/(8m^d), it holds that

    Pr[A(y, J, G, 1^n, δ; ·) is not 2/3-correct | (J, G) ∈ E′(y)]    (6.17)
        ≤ 8m^d · Pr[A(y, J, G, 1^n, δ; ·) is not 2/3-correct ∧ (J, G) ∈ E′(y)]
        ≤ 8m^d · η(y).

By definition,

    Pr[f(A_2(y, j, g, 1^n, δ; R)) = x_y(j, g)] ≥ 2/3    (6.18)

for every (j, g) ∈ E′(y) such that A(y, j, g, 1^n, δ; ·) is 2/3-correct.
For each (j, g) ∈ E′(y), we now construct a set R′(j, g) of random strings as follows: first pick all random strings r for which f(A_2(y, j, g, 1^n, δ; r)) = x_y(j, g) is satisfied. Afterwards, if necessary, add other random strings or discard some of the picked random strings such that |R′(j, g)| = (2/3) · 2^ℓ.
Note that

    Pr[f(A_2(y, J, G, 1^n, δ; R)) = X_y | (J, G) ∈ E′(y) ∧ R ∈ R′(J, G)]    (6.19)
        ≥ Pr[A(y, J, G, 1^n, δ; ·) is 2/3-correct | (J, G) ∈ E′(y)]
        ≥ 1 − 8m^d · η(y).
Hence, Equation (6.15) yields that

    Pr[f(A_2(y, J, G, 1^n, δ; R)) = X_y | (J, G) ∈ E′(y) ∧ R ∈ R′(J, G)] ≥ 1 − ε    (6.20)

for a (1 − ε) fraction of the y's in {0, 1}^m.
It remains to show that no family of sets {L(y)}_{y∈{0,1}^m} can be used to show that B has ε²/(50m^d)-accessible max-entropy at most E[log(|S(Y)|)] − 5εm^d. Fix a family {L(y)} with E[log(|L(Y)|)] ≤ E[log(|S(Y)|)] − 5εm^d. The following claim concludes the proof of the lemma.

Claim 6.20. It holds that Pr[Γ(Y, B(Y, 1^m, ε; R)) ∉ L(Y) ∪ {⊥}] ≥ ε²/(50m^d).
Proof. Note that log(|S(y)|) − log(|L(y)|) > 2εm^d holds for at least a 2ε fraction of the y's in {0, 1}^m (otherwise, E[log(|S(Y)|) − log(|L(Y)|)] ≤ (1 − 2ε) · 2εm^d + 2ε · (m^d + 1) < 5εm^d). Since Equation (6.20) holds for a (1 − ε) fraction of the y's in {0, 1}^m, there exists a set Good ⊆ {0, 1}^m of density ε such that for every y ∈ Good, both

    log(|L(y)|) < log(|S(y)|) − 2εm^d, and    (6.21)

    Pr[f(A_2(y, J, G, 1^n, δ; R)) = X_y | (J, G) ∈ E′(y) ∧ R ∈ R′(J, G)] ≥ 1 − ε    (6.22)

hold. It follows that for every y ∈ Good

    |L(y)| < |S(y)| · (1 − 1.5ε).    (6.23)

Equation (6.23) trivially holds in case |L(y)| = 0, while in case |L(y)| > 0 it holds that

i. |L(y)| < |S(y)| · e^{−2ε·log|S(y)|} (since |S(y)| > |L(y)| ≥ 1), and

ii. |S(y)| · e^{−2ε·log|S(y)|} ≤ |S(y)| · (1 − 1.5ε) (since e^{−εκ} ≤ e^{−ε} < 1 − 0.75ε for κ ≥ 1 and ε < 1/2).
Equation (6.23) yields that

    Pr[X_y ∈ S(y) \ L(y) | (J, G) ∈ E′(y) ∧ R ∈ R′(J, G)] = Pr[X_y ∈ S(y) \ L(y) | (J, G) ∈ E′(y)] ≥ 1.5ε    (6.24)

for every y ∈ Good, and therefore Equation (6.22) yields that

    Pr[f(A_2(y, J, G, 1^n, δ; R)) ∈ S(y) \ L(y) | (J, G) ∈ E′(y) ∧ R ∈ R′(J, G)] ≥ ε/2    (6.25)

for every y ∈ Good.
Since Pr[Y ∈ Good] ≥ ε and since Pr[(J, G) ∈ E′(y) ∧ R ∈ R′(J, G)] ≥ (2/3) · (1 − ε) · 1/(8m^d) for every y ∈ Good, it follows that

    Pr[f(A_2(Y, J, G, 1^n, δ; R)) ∉ L(Y) ∪ {⊥}] ≥ ε · (2/3) · (1 − ε) · 1/(8m^d) · ε/2 ≥ ε²/(50m^d),

proving the claim and thus the lemma.
6.2.5 Putting it together

We now use Lemmas 6.14 and 6.19 to prove Lemma 6.9 (and as we have seen, this implies Theorem 6.5).

Proof of Lemma 6.9. Lemma 6.14 yields that (Q, U) has i.o. (ε/m)^{2d}-accessible average max-entropy at most v − ε^{1/2d} · m^{2d} with respect to f, for v = E_{I,G}[log(|S_{Q,f}(I, G)|)]. It follows that (Q, U) has i.o. ε²/(50m^d)-accessible average max-entropy at most v − 5εm^d with respect to f, and the proof follows from Lemma 6.19.
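For completeness, the two comparisons behind the final “It follows” can be checked directly, assuming m^d ≥ 50 (which holds for all but finitely many m) and ε ∈ (0, 1] (this verification is ours):
\[
\Big(\frac{\varepsilon}{m}\Big)^{2d} \;=\; \frac{\varepsilon^{2d}}{m^{2d}} \;\le\; \frac{\varepsilon^{2}}{50\,m^{d}}
\qquad\text{and}\qquad
\varepsilon^{1/2d}\cdot m^{2d} \;\ge\; \varepsilon\cdot m^{2d} \;\ge\; 5\,\varepsilon m^{d},
\]
which are exactly the accessibility and entropy bounds required by Lemma 6.19.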
Acknowledgments
We are thankful to Ran Raz and Chiu-Yuen Koo for useful conversations.
AUTHORS
Iftach Haitner
Tel Aviv University
Tel Aviv, Israel
iftachh tau ac il
www.cs.tau.ac.il/~iftachh
Thomas Holenstein
Google
Zurich, Switzerland
thomas.holenstein gmail com
Omer Reingold
Stanford University
Stanford, California
reingold stanford edu
engineering.stanford.edu/people/omer-reingold
Salil Vadhan
Harvard University
Cambridge, Massachusetts
salil_vadhan harvard edu
people.seas.harvard.edu/~salil
Hoeteck Wee
ENS
Paris, France
wee di ens fr
www.di.ens.fr/~wee
ABOUT THE AUTHORS
IFTACH HAITNER is an associate professor at the School of Computer Science at Tel Aviv
University and the director of the Check Point Institute for Information Security. He
received his Ph. D. under the supervision of Omer Reingold at the Weizmann Institute
in 2008; the title of his dissertation was “New Implications and Improved Efficiency
of Constructions Based on One-way Functions.” He enjoys doing playback theater and
swimming.
THOMAS HOLENSTEIN obtained his Ph. D. from ETH Zürich in 2006, advised by Ueli
Maurer. He then spent two years as a postdoc at Microsoft Research, and one year as a
postdoc at Princeton University. After this, he was a professor at ETH Zürich for 6 years,
and then joined Google, where he currently works on the cryptographic library Tink.
OMER REINGOLD is the Rajeev Motwani Professor of Computer Science at Stanford
University. Omer received his Ph. D. in 1999 from the Weizmann Institute under the
direction of Moni Naor. Omer’s past positions include Samsung Research America, the
Weizmann Institute of Science, Microsoft Research, the Institute for Advanced Study in
Princeton, NJ and AT&T Labs. His research is in the Foundations of Computer Science
and most notably in Computational Complexity and the Foundations of Cryptography
with emphasis on randomness, derandomization and explicit combinatorial constructions.
He has a keen interest in the societal impact of computation, as reflected in his work on
privacy and algorithmic fairness.
Omer is an ACM Fellow and among his distinctions are the 2005 Grace Murray Hopper
Award and the 2009 Gödel Prize.
SALIL VADHAN is the Vicky Joseph Professor of Computer Science and Applied Mathe-
matics at the Harvard John A. Paulson School of Engineering & Applied Sciences. He
received his Ph. D. under the supervision of Shafi Goldwasser at MIT in 1999; the title of
his dissertation was “A Study of Statistical Zero-Knowledge Proofs.” His other research
interests include the theory of pseudorandomness and the theory and practice of data
privacy. He enjoys spending leisure time with his wife and two daughters, as well as
learning to surf in the cold waters of New England.
HOETECK WEE joined ENS as a CNRS researcher in 2013, after teaching in the US for
several years. He obtained his Ph. D. from UC Berkeley in 2007 under the supervision
of Luca Trevisan. He is the recipient of an NSF CAREER Award, a Humboldt Research
Fellowship, a Google Faculty Research Award, an ERC Starting Grant, and the Best
Paper Award at Eurocrypt 2016; he is also a contributor to the TLS 1.3 standard.