Authors: Ivan Hu, Dieter van Melkebeek, Andrew Morgan
License: CC-BY-3.0
Theory of Computing, Volume 20 (7), 2024, pp. 1–62
www.theoryofcomputing.org
Lower Bound Techniques in the
Comparison-Query Model and
Inversion Minimization on Trees
Ivan Hu Dieter van Melkebeek Andrew Morgan
Received March 18, 2023; Revised July 1, 2024; Published December 7, 2024
Abstract. Given a rooted tree and a ranking of its leaves, what is the minimum
number of inversions of the leaves that can be attained by ordering the tree? This
variation of the well-known problem of counting inversions in arrays originated in
mathematical psychology. It has the evaluation of the Mann–Whitney statistic for
detecting differences between distributions as a special case.
We study the complexity of the problem in the comparison-query model, the
standard model for problems like sorting, selection, and heap construction. The
complexity depends heavily on the shape of the tree: for trees of unit depth, the
problem is trivial; for many other shapes, we establish lower bounds close to the
strongest known in the model, namely the lower bound of log₂(n!) for sorting n
items. For trees with n leaves we show, in increasing order of closeness to the sorting
lower bound:
A conference version of this paper appeared in the Proceedings of the 34th ACM-SIAM Symposium on Discrete
Algorithms (SODA'23) [14].
ACM Classification: Theory of computation → Computational complexity and cryptography
AMS Classification: 68Q17, 68Q25
Key words and phrases: query complexity, comparison-query model, lower bounds, permutahedron, sensitivity, connectivity, inversions, trees, Mann–Whitney, Gaussian binomial coefficients, Gaussian polynomials
Β© 2024 Ivan Hu, Dieter van Melkebeek, and Andrew Morgan
Licensed under a Creative Commons Attribution License (CC-BY) DOI: 10.4086/toc.2024.v020a007
(a) log₂((α(1 − α)n)!) − O(log n) queries are needed whenever the tree has a
subtree that contains a fraction α of the leaves. This implies a lower bound of
log₂((d/(d+1)² · n)!) − O(log n) for trees of degree d.
(b) log₂(n!) − O(log n) queries are needed in case the tree is binary.
(c) log₂(n!) − O(d log d) queries are needed for certain classes of trees of degree d,
including perfect trees with even d.
The lower bounds are obtained by developing two new techniques for any problem
Π in the comparison-query model and applying them to inversion minimization on
trees. Both techniques can be described in terms of the Cayley graph of the symmetric
group with adjacent-rank transpositions as the generating set, or equivalently,
in terms of the edge graph of the permutahedron, the polytope spanned by all
permutations of the vector (1, 2, ..., n). Consider the subgraph consisting of the
edges between vertices with the same value under Π. We show that the size of any
decision tree for Π must be at least
(i) the number of connected components of this subgraph, and
(ii) the factorial (or Γ function) of the average degree of the complementary
subgraph, divided by n.
Lower bounds on query complexity then follow by taking the base-2 logarithm.
Technique (i) represents a discrete analog of a classical technique in algebraic
complexity and allows us to establish (c) and a tight lower bound for counting cross
inversions, as well as unify several of the known lower bounds in the comparison-
query model. Technique (ii) represents an analog of sensitivity arguments in Boolean
complexity and allows us to establish (a) and (b).
Along the way to proving (b), we derive a tight upper bound on the maximum
probability of the distribution of cross inversions, which is the distribution of the
Mann–Whitney statistic in the case of the null hypothesis. Up to normalization, the
probabilities appear in the literature as the coefficients of polynomials formed by the
Gaussian binomial coefficients, also known as Gaussian polynomials.
1 Overview
The result of a hierarchical cluster analysis on a set X of items can be thought of as an unordered
rooted tree T with leaf set X. To visualize the tree, or to spell out the classification in text,
one needs to decide for every internal node of T in which order to visit its children. Figure 1a
represents an example of a classification of eight body parts from the psychology literature
[6]. It is obtained by repeatedly clustering nearest neighbors where the distance between two
items is given by the number of people in a survey who put the items into different classes [22].
The ordering of the resulting binary tree in Figure 1a is the output produced by a particular
implementation of the clustering algorithm.
[Figure: two orderings of the classification tree over the leaves ear, cheek, mouth, chest, waist, thigh, knee, toe]
(a) Initial tree ordering (b) Optimal tree ordering
Figure 1: Classification of body parts
Another ordering is given in Figure 1b; black marks the nodes whose children have been
swapped from the ordering in Figure 1a. Figure 1b has the advantage over Figure 1a that the
leaves now appear in an interesting global order, namely head-to-toe: ear, cheek, mouth, chest,
waist, thigh, knee, toe. Indeed, Figure 1b makes apparent that the anatomical order correlates
perfectly with the clustering. In general, given a tree T and a ranking π of its leaves, one might
ask "how correlated" is T with π? Degerman [6] suggests evaluating the orderings of T in terms
of the number of inversions of the left-to-right ranking σ of the leaves with respect to the given
ranking π, and using the minimum number over all orderings as a measure of (non)correlation.
For further reference, we collect the formal definitions of the underlying notions.
Definition 1.1 (tree ordering, ranking, inversion, Inv·(·)). An ordering of a rooted tree T consists
of an ordering, for every non-leaf v of T, of the children of v. A ranking π of a set X of n items is
a bijection from X to [n]. Given two rankings π and σ, an inversion of σ with respect to π is a
pair of items x₁, x₂ ∈ X such that σ(x₁) < σ(x₂) but π(x₁) > π(x₂). The number of inversions
is denoted by Inv_π(σ). An inversion in an array A of distinct values is an inversion of σ with
respect to π, where π denotes the ranking by array index and σ the ranking by value; in this
setting we write Inv(A) for Inv_π(σ).
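For concreteness, Inv(A) can be computed by the classical merge-sort recursion mentioned later in this section; the helper below is our own sketch, not code from the paper.

```python
def inversions(A):
    """Count inversions of an array of distinct values in O(n log n) by
    merge sort: returns (inversion count, sorted copy of A)."""
    if len(A) <= 1:
        return 0, list(A)
    mid = len(A) // 2
    linv, left = inversions(A[:mid])
    rinv, right = inversions(A[mid:])
    count, merged, i, j = linv + rinv, [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            # left[i:] all exceed right[j]: each such pair is an inversion.
            count += len(left) - i
            merged.append(right[j]); j += 1
    merged += left[i:] + right[j:]
    return count, merged

def Inv(A):
    return inversions(A)[0]
```

For example, Inv([3, 1, 2]) evaluates to 2, for the out-of-order pairs (3, 1) and (3, 2).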
The minimum number of inversions over all tree orderings can be used to compare the quality
of different trees T for a given ranking π, or of different rankings π for a given tree T. This mimics
the use of the number of inversions in applications like collaborative filtering in recommender
systems, rank aggregation for meta searching the web, and Kendall's test for dependencies
between two random variables. In particular, the Mann–Whitney test for differences between
random variables can be viewed as a special case of our optimization problem. The test is
widely used because of its nonparametric nature, meaning that no assumptions need to be made
about the distribution of the two variables; the distribution of the statistic in the case of the null
hypothesis (both variables have the same distribution) is always the same. The test achieves
this property by only considering the relative order of the samples. It takes a sequence A of m
samples from a random variable X, a sequence B of k samples from another random variable Y,
and computes the statistic U = min(XInv(A, B), XInv(B, A)), the minimum of the number
XInv(A, B) of cross inversions from A to B and vice versa, defined as follows.
Definition 1.2 (cross inversions, XInv·(·, ·)). Let π be a ranking of X, and A, B ⊆ X. A cross
inversion from A to B with respect to π is a pair (x₁, x₂) ∈ A × B that is out of order with respect
to π, i. e., such that π(x₁) > π(x₂). The number of cross inversions is denoted by XInv_π(A, B).
For two arrays A and B of distinct values, a cross inversion from A to B is a cross inversion from
the set of entries in A to the set of entries in B, where π denotes the ranking by value; in this
setting we write XInv(A, B) for XInv_π(A, B).
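The definition and the statistic U translate directly into code; the following is a sketch of ours (quadratic counting, chosen for clarity over speed).

```python
def XInv(A, B):
    """Cross inversions from A to B: pairs (a, b) in A x B with a > b
    (ranking by value; all values assumed distinct)."""
    return sum(1 for a in A for b in B if a > b)

def mann_whitney_U(A, B):
    # U = min of the cross-inversion counts in the two directions;
    # note that XInv(A, B) + XInv(B, A) == len(A) * len(B).
    return min(XInv(A, B), XInv(B, A))
```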
The statistic U coincides with the optimum value of our optimization problem with the tree
T in Figure 2 as input. The leftmost m leaves correspond to the samples A, the rightmost k leaves
to the samples B, and the ranking π to the value order of the combined m + k samples.
[Figure: a root with two depth-one subtrees, the left over m leaves and the right over k leaves]
Figure 2: Mann–Whitney instance
We mainly study the value version of our optimization problem, which we denote by MinInv.
Definition 1.3 (inversion minimization on trees, MinInv(·, ·), Π_·). Inversion minimization on
trees is the computational problem with the following specification:
Input: A rooted tree T with leaf set X of size n, and a ranking π of X.
Output: MinInv(T, π), the minimum of Inv_π(σ) over all possible orderings of T, where σ denotes
the left-to-right ranking of X induced by the ordering of T.
For any fixed tree T with leaf set X, we use the short-hand Π_T to denote the computational
problem that takes as input a ranking π of X and outputs MinInv(T, π).
Degerman [6] observes that the ordering at each internal node can be optimized independently
in a greedy fashion. In the setting of binary trees, for each node v, we can count the cross
inversions from the leaves in the left subtree of v to the leaves in the right subtree of v. Between
the two possible orderings of the children of v, we choose the one that yields the smaller
number of cross inversions. Based on this observation, Degerman presents a polynomial-time
algorithm for the case of binary trees T. A more refined implementation and analysis yields
a running time of O(d_avg(T) · n), where d_avg(T) denotes the average depth of a leaf in T. For
balanced binary trees the running time becomes O(n log n). All of this can be viewed as a variant
of the well-known O(n log n) divide-and-conquer algorithm for counting inversions in arrays of
length n.
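The greedy rule for binary trees can be sketched as follows. The encoding of a tree as nested pairs of leaf ranks is our own assumption, and this simple version merges leaf lists with sorted() rather than the refined O(d_avg(T) · n) implementation.

```python
def min_inversions(tree):
    """Greedy rule for binary trees: a tree is a leaf, given as its rank
    under pi (an int), or a pair (left, right) of subtrees.
    Returns (minimum inversions, sorted leaf ranks)."""
    if isinstance(tree, int):
        return 0, [tree]
    left, right = tree
    cl, L = min_inversions(left)
    cr, R = min_inversions(right)
    # Cross inversions from L to R, counted with two pointers over the
    # sorted rank lists: pairs (l, r) with l in L, r in R, and l > r.
    cross, i = 0, 0
    for r in R:
        while i < len(L) and L[i] < r:
            i += 1
        cross += len(L) - i
    # Greedy choice at this node: keep the children as given, or swap them.
    best = min(cross, len(L) * len(R) - cross)
    return cl + cr + best, sorted(L + R)

def MinInv(tree):
    return min_inversions(tree)[0]
```

For instance, MinInv((2, (3, 1))) returns 1: swapping the inner node yields the leaf sequence 2, 1, 3 with a single inversion.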
For trees of degree deg(T) > 2, the local greedy optimization at each internal node becomes
more complicated, as there are many ways to order the children of each internal node. Exhaustive
search results in a running time of O((deg(T)! + deg(T) · d_avg(T)) · n), which can be improved to
O((deg(T)² · 2^deg(T) + deg(T) · d_avg(T)) · n) using dynamic programming. The problem is closely
related to the classical problem of minimum feedback arc set, and becomes NP-hard without
any constraints on the degree. We refer to Section 10 for more details.
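The deg(T)!-time exhaustive search can be sketched likewise (again with our own nested-tuple encoding; the 2^deg dynamic program is omitted).

```python
from itertools import permutations

def min_inversions_general(tree):
    """Exhaustive search for trees of arbitrary degree: a leaf is its rank
    (an int); an internal node is a tuple of subtrees.  All deg! orderings
    of each node's children are tried -- the deg(T)! term quoted above."""
    if isinstance(tree, int):
        return 0, [tree]
    results = [min_inversions_general(child) for child in tree]
    total = sum(cost for cost, _ in results)
    sets = [ranks for _, ranks in results]
    t = len(sets)
    # Pairwise cross-inversion counts between the children's leaf sets.
    cross = [[sum(1 for a in sets[i] for b in sets[j] if a > b)
              for j in range(t)] for i in range(t)]
    # Cost of an ordering p = sum of cross counts over ordered pairs.
    best = min(sum(cross[p[i]][p[j]]
                   for i in range(t) for j in range(i + 1, t))
               for p in permutations(range(t)))
    return total + best, sorted(x for s in sets for x in s)
```

On depth-one trees the minimum is always 0, matching the observation below that such instances are trivial.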
Query complexity. Rather than running time in the Turing machine model, our focus lies
on query complexity in the comparison-query model. There we can only access the ranking
π : X → [n] via queries of the form: Is π(x₁) < π(x₂)? For any fixed tree T, we want to determine
the minimum number of queries needed to solve the problem.
The comparison-query model represents the standard model for analyzing problems like
sorting, selection, and heap construction. Sorting represents the hardest problem in the
comparison-query model, as it is tantamount to knowing the entire ranking π. Its query
complexity has a well-known information-theoretic lower bound of log₂(n!) = n log₂(n/e) +
½ log₂(n) + O(1). Standard algorithms such as mergesort and heapsort yield an upper bound of
log₂(n!) + O(n), which has recently been improved to log₂(n!) + o(n) [26]. We refer to Section 2
for an overview of results and techniques for lower bounds in the model.
Information theory only yields a very weak lower bound on the query complexity of inversion
minimization on trees: log₂(n(n − 1)/2) = 2 log₂(n) − O(1). The complexity of the problem critically
depends on the shape of the tree T and can be significantly lower than the one for sorting. For
starters, the problem becomes trivial for trees of depth one, as their leaves can be arranged freely
in any order. More precisely, the trees T for which the answer is identically zero, irrespective of
the ranking π, are exactly those in which all root-to-leaf paths have only the root in common.
Arguably, the simplest nontrivial instances of inversion minimization are for trees T of the
Mann–Whitney type in Figure 2 with m = 1 and k = n − 1. Depending on the rank r of the
isolated leaf, an optimal ordering of T is either the left or the right part in Figure 3, where the
label of each leaf is its rank under π.
[Figure: the isolated leaf of rank r placed before (left) or after (right) the remaining leaves in increasing rank order]
Figure 3: Rank instance
As the ordering on the left has r − 1 inversions and the one on the right n − r, the answer is
min(r − 1, n − r). Thus, this instance of inversion minimization on trees is essentially equivalent
to rank finding, which has query complexity exactly n − 1.
1.1 Main results
We prove that for many trees T, inversion minimization on T is nearly as hard as sorting. First,
we exhibit a common structure that guarantees high complexity, namely a subtree that contains
a fairly balanced fraction of the leaves. We make use of the following notation.
Definition 1.4 (leaf set, L(·), and subtree). For a tree T, the leaf set of T, denoted L(T), is the set
of leaves of T. For a node v of T, T_v denotes the subtree of T rooted at v.
The quantitative statement references the gamma function Γ, which is a proxy for any convex
real function that interpolates the factorial function on the positive integers. More precisely, we
have that Γ(n + 1) = n! for every integer n ≥ 1.
Theorem 1.5 (lower bound for general trees). Let T be a tree with n leaves, and v a node with
|L(T_v)| = ℓ. The query complexity of inversion minimization on T is at least log₂(Γ(ℓ(n − ℓ)/n + 1)). In
particular, the complexity is at least log₂(Γ(d/(d+1)² · n + 1)), where d denotes the degree of T.
For trees of constant degree, Theorem 1.5 yields a lower bound that is as strong as the one
for sorting up to a constant multiplicative factor. For the important case of binary trees (like
the classification trees from the motivating example), we obtain a lower bound that is only a
logarithmic additive term shy of the lower bound for sorting.
Theorem 1.6 (lower bound for binary trees). For binary trees T with n leaves, the query complexity
of inversion minimization on T is at least log₂(n!) − O(log n).
The logarithmic loss can be reduced to a constant for certain restricted classes of trees. The
full statement is somewhat technical. First, it assumes that the tree has no nodes of degree 1. This
is without loss of generality, as we can short-cut all degree-1 nodes in the tree without affecting
the minimum number of inversions. For example, trivial trees for inversion minimization have
depth 1 without loss of generality. Second, the strength of the lower bound depends on the
maximum size of a leaf child set, defined as follows.
Definition 1.7 (leaf child set, LC(·)). The leaf child set LC(v) of a vertex v in a tree T is the set
of all the children of v that are leaves in T.
Most importantly, the result requires certain fragile parity conditions to hold. That said,
there are interesting classes satisfying all requirements, and the bounds are very tight.
Theorem 1.8 (lower bound for restricted classes). Let T be a tree without nodes of degree 1 such that
the leaf child sets have size at most d, at most one of them has odd size, and if there exists an odd one, say
LC(v*), then all ancestors of v* have empty leaf child sets. The query complexity of inversion minimization
on T is at least log₂(n!) − O(d log d). In particular, the lower bound applies to:
• perfect trees of even degree d, and
• full binary (d = 2) trees with at most one leaf without a sibling leaf.
Recall that a tree of degree d is full if every node has degree 0 or d. It is perfect if it is full
and all leaves have the same depth.
For the Mann–Whitney statistic, Theorem 1.5 provides an Ω(n log n) lower bound for
balanced instances, i. e., when m and k are Θ(n). For unbalanced instances there is a more
efficient way to count cross inversions and thus evaluate the statistic: Sort the smaller of the two
sides, and then do a binary search for each item of the larger side to find its position within the
sorted smaller side, so as to determine the number of cross inversions that it contributes. For
m ≤ k the approach makes k log₂(m) + O(m log m) comparisons. We establish a lower bound that
shows the approach is optimal up to a constant multiplicative factor.
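The unbalanced strategy just described can be written out as follows (our sketch, assuming A is the smaller side).

```python
from bisect import bisect_left

def count_cross_inversions(A, B):
    """Cross inversions from A to B in the unbalanced regime, assuming
    |A| = m <= k = |B|: sort the smaller side (O(m log m) comparisons),
    then one binary search per item of the larger side (about k*log2(m))."""
    small = sorted(A)
    # A cross inversion (a, b) has a > b; for each b in B, the number of
    # elements of A exceeding b is found with a single binary search.
    return sum(len(small) - bisect_left(small, b) for b in B)
```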
Theorem 1.9 (lower bound for counting cross inversions). Counting cross inversions from a set A
of size m to a set B of size k ≥ m with respect to a ranking π of X = A ⊔ B requires Ω((m + k) log(m))
queries in the comparison-query model, as does inversion minimization on the tree of Figure 2.
1.2 Techniques
We obtain our results by developing two new query lower bound techniques for problems Π
in the comparison-query model, and then instantiating them to the problem Π_T of inversion
minimization on a fixed tree T. Although some of our techniques extend to relations, we restrict
attention to computational problems Π that are functions, just like the problem Π_T that we focus
on.
Definition 1.10 (computational problem and algorithm in the comparison-query model). A
computational problem in the comparison-query model is a total function on the rankings of a
set X. An algorithm in the comparison-query model can access an input ranking π : X → [n]
using comparison queries: For given x₁, x₂ ∈ X, test if π(x₁) < π(x₂).
Both of our techniques follow the common pattern of lower bounding the number of distinct
execution traces that any algorithm for Π needs to have.
Definition 1.11 (execution trace, complexity measures D(·) and Q(·)). Consider an algorithm A
for a problem Π in the comparison-query model. An execution trace of A is the sequence of
comparisons that A makes on some input π, together with the outcomes of those comparisons. The
complexity D(Π) is the minimum over all possible algorithms for Π of the number of distinct
traces the algorithm has over the set of all inputs π. The complexity Q(Π) is the minimum, over
all possible algorithms for Π, of the maximum number of comparisons that the algorithm makes
over the set of all inputs π.
The complexity measure Q is what we refer to as query complexity. Since the maximum
number of queries that an algorithm A makes is at least the base-2 logarithm of the number
of execution traces, we have that Q(Π) ≥ log₂(D(Π)). Note that, in order to avoid confusion
with the tree T specifying an instance of inversion minimization, we refrain from the common
terminology of decision trees in the context of the complexity measure D. In those terms, we
lower bound the number of leaves of any decision tree for Π, and use the fact that the depth of
this binary decision tree is at least the base-2 logarithm of the number of leaves.
Both techniques proceed by considering the effect on the output of perturbations to the input
ranking π that are hard for queries to observe. More specifically, we consider the following
perturbations:
Definition 1.12 (adjacent-rank transposition, affected items). An adjacent-rank transposition is
a permutation τ of [n] of the form τ = (r, r + 1), where r ∈ [n − 1] and n denotes the number
of items. Given τ and a ranking π : X → [n], the affected items are the two elements x ∈ X for
which τ(π(x)) ≠ π(x), i. e., the items with ranks r and r + 1 under π.
As with any permutation of the set of ranks, the effect of τ on a ranking π is the ranking τπ.
Adjacent-rank transpositions are the least noticeable perturbations one can apply to a ranking
in the following sense: If two rankings differ by an adjacent-rank transposition, then the only
query that distinguishes them is the query that compares the affected items.
Sensitivity. Our first technique turns this observation around to obtain a lower bound on
query complexity. We adopt the terminology of sensitivity from Boolean query complexity.
Definition 1.13 (sensitivity, average sensitivity, s(·)). Let Π be a computational problem in
the comparison-query model on a set X of items. For a fixed ranking π and adjacent-rank
transposition τ, we say that Π is sensitive to τ at π if Π(π) ≠ Π(τπ). The sensitivity of Π at π
is the number of adjacent-rank transpositions τ such that Π is sensitive to τ at π. The average
sensitivity of Π, denoted s(Π), is the average sensitivity of Π at π when π is drawn uniformly at
random from all rankings of X.
On input a ranking π, any algorithm for Π needs to make a number of queries that is at
least the sensitivity of Π at π. Indeed, consider an adjacent-rank transposition τ to which Π is
sensitive at π. If the algorithm does not make the query that compares the affected items, then it
must output the same answer on input τπ as on input π. Since the value of Π differs on the two
inputs, the algorithm makes a mistake on at least one of them. It follows that the
average number of queries that any algorithm for Π makes is at least the average sensitivity
s(Π). A fortiori, Q(Π) ≥ s(Π).
As sensitivity cannot exceed n − 1, the best lower bound on query complexity that we can
establish based on the above basic observation alone is n − 1. The following improvement
yields the lower bound D(Π) ≥ n!/n = (n − 1)!, and therefore Q(Π) ≥ log₂(n!) − log₂(n), for
problems Π of maximum average sensitivity s(Π) = n − 1. The argument hinges on an efficient
encoding of rankings that share the same execution trace. See Section 3 for more details.
Lemma 1.14 (Sensitivity Lemma). For any problem Π in the comparison-query model with n items,
D(Π) ≥ Γ(s(Π) + 2)/n.
The lower bound for general trees T in Theorem 1.5 and the strengthening for binary trees
in Theorem 1.6 follow from corresponding lower bounds on the average sensitivity s(Π_T).
Theorem 1.5 only requires a short analysis to establish the sensitivity lower bound needed for
the application of the Sensitivity Lemma; this illustrates the power of the lemma and of the
lower bound technique. Theorem 1.6 requires a more involved sensitivity analysis, but then
yields a very tight lower bound. Owing to the average-case nature of the underlying measure,
the technique also exhibits some degree of robustness. For the particular problem of inversion
minimization on trees, we show that small changes to the tree T do not affect the average
sensitivity s(Π_T) by much. See Section 4 and Section 5.
For sorting, counting inversions, and inversion parity, the average sensitivity reaches its
maximum value of n − 1, and Lemma 1.14 recovers the standard lower bounds up to a small
loss. In contrast, for selection, the average sensitivity equals 1 for ranks 1 and n, and 2 for other
ranks, so the bound from Lemma 1.14 is of no use. This reflects that, just like in the Boolean
setting, (average) sensitivity is sometimes too rough a measure and not always capable of
proving strong lower bounds. Our second technique looks at a more delicate structural aspect,
which enables it to sometimes yield stronger lower bounds.
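These sensitivity values can be checked by brute force on tiny instances; the following is an illustrative sketch of ours (not from the paper), representing a ranking as a tuple p with p[i] the rank of item i.

```python
from itertools import permutations
from math import factorial

def average_sensitivity(Pi, n):
    """Brute-force s(Pi) over all rankings of [n] (tiny n only).  An
    adjacent-rank transposition swaps the ranks r and r+1 wherever
    they occur in the ranking."""
    total = 0
    for p in permutations(range(1, n + 1)):
        for r in range(1, n):
            q = tuple(r + 1 if x == r else r if x == r + 1 else x for x in p)
            if Pi(p) != Pi(q):   # Pi is sensitive to (r, r+1) at p
                total += 1
    return total / factorial(n)

# Counting inversions: every adjacent-rank transposition changes the count.
inv = lambda p: sum(1 for i in range(len(p)) for j in range(i + 1, len(p))
                    if p[i] > p[j])
# Selection of rank 1: report the position of the minimum item.
argmin = lambda p: p.index(1)
```

Here average_sensitivity(inv, 4) evaluates to 3 = n − 1, while average_sensitivity(argmin, 4) evaluates to 1, matching the claims above.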
Permutahedron graph. Before introducing our second technique, we cast our first technique
in graph-theoretic terms. In fact, both our techniques can be expressed naturally in terms of
subgraphs of the graph with the rankings as vertices and adjacent-rank transpositions as edges. The
latter graph can be viewed as the Cayley graph of the symmetric group with adjacent-rank
transpositions as the generating set. It is also the edge graph of the permutahedron, the convex
polytope spanned by all permutations of the vector (1, 2, ..., n) in ℝⁿ. The permutahedron
resides inside the hyperplane where the sum of the coordinates equals n(n + 1)/2, has positive volume
inside that hyperplane, and can thus be represented naturally in dimension n − 1; see Figure 4
for a rendering of the instance with n = 4 [9].
We think of coloring the vertices of the permutahedron with their values under Π and
make use of the subgraph with the same vertex set but only containing the monochromatic
edges, i. e., the edges whose end points have the same value under Π. We also consider the
complementary subgraph containing all bichromatic edges.

[Figure: the permutahedron for n = 4, whose 24 vertices are the permutations of (1, 2, 3, 4)]
Figure 4: Permutahedron for n = 4 items

Definition 1.15 (permutahedron graph, G(·), Ḡ(·)). Let Π be a computational problem in
the comparison-query model on a set X of items. The permutahedron graph of Π, denoted
G(Π), has the rankings of X as vertices, and an edge between two rankings π₁ and π₂ if
Π(π₁) = Π(π₂) and there exists an adjacent-rank transposition τ such that π₂ = τπ₁. The
complementary permutahedron graph of Π, denoted Ḡ(Π), is defined similarly by replacing the
condition Π(π₁) = Π(π₂) by its complement, Π(π₁) ≠ Π(π₂).
Our first technique looks at degrees in the complementary permutahedron graph Ḡ(Π), and
more specifically at the average degree deg_avg(Ḡ(Π)) = E(deg_{Ḡ(Π)}(π)), where the expectation is
with respect to a uniform choice of the ranking π. Our second technique looks at the connected
components of the permutahedron graph G(Π).
Connectivity. Our second technique is reminiscent of a result in algebraic complexity theory,
where the number of execution traces of an algorithm for a problem Π in the algebraic
comparison-query model is lower bounded in terms of the number of connected components
that Π induces in its input space ℝⁿ [1]. In the comparison-query setting, we obtain the following
lower bound.
Lemma 1.16 (Connectivity Lemma). For any problem Π in the comparison-query model, D(Π) is at
least the number of connected components of G(Π).
The Connectivity Lemma allows for a simple and unified exposition of many of the known
lower bounds. For counting inversions and inversion parity the argument goes as follows. Every
adjacent-rank transposition changes the number of inversions by exactly one (up or down), and
therefore changes the output of Π, so all n! vertices of G(Π) are isolated. This means that any
algorithm for Π actually needs to sort and has to make at least log₂(n!) queries. See Section 7 for
a proof of the Connectivity Lemma and more applications to classical problems, including the
Ω(n) lower bound for median finding.
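A minimal sketch of ours (not the paper's) that builds G(Π) for tiny n with union-find and counts components; for counting inversions every vertex is isolated, matching the argument above.

```python
from itertools import permutations

def num_components(Pi, n):
    """Connected components of the permutahedron graph G(Pi) for tiny n:
    vertices are rankings; edges join rankings that differ by an
    adjacent-rank transposition and agree on the value of Pi."""
    verts = list(permutations(range(1, n + 1)))
    parent = {v: v for v in verts}
    def find(v):                      # union-find with path halving
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    for p in verts:
        for r in range(1, n):
            q = tuple(r + 1 if x == r else r if x == r + 1 else x for x in p)
            if Pi(p) == Pi(q):        # monochromatic edge: merge components
                parent[find(p)] = find(q)
    return len({find(v) for v in verts})

inv = lambda p: sum(1 for i in range(len(p)) for j in range(i + 1, len(p))
                    if p[i] > p[j])
```

Here num_components(inv, 4) returns 24 = 4!, so D ≥ n! by the Connectivity Lemma; a constant function instead yields a single component, since the permutahedron itself is connected.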
The Connectivity Lemma also enables us to establish strong lower bounds for inversion
minimization on special types of trees T, namely those of Theorem 1.8 and the Mann–Whitney
instances in Theorem 1.9, closely related to counting inversions. Both theorems involve an
analysis of the size of the connected component of a random ranking π in G(Π_T), and Theorem 1.8
uses the delicate parity conditions of its statement to keep G(Π_T) as sparse as possible. See
Section 8 for more details, including a more general property that guarantees the required
sparseness (the partition property from Definition 8.1) and the resulting lower bound for any
problem Π that satisfies the property (Lemma 8.4).
The Mann–Whitney setting illustrates well the relative power of our techniques. In the
Mann–Whitney instances of inversion minimization, the leaves are naturally split between
a subtree containing m of them and a subtree containing k of them. The argument behind
Theorem 1.5 yields a lower bound of mk/(m + k) on the sensitivity s(Π_T). The true sensitivity is just
O(1) below the one for counting cross inversions, which is 2mk/(m + k). The resulting lower bounds on
the query complexity in case m ≤ k are Θ(m log m), which roughly account for sorting the smaller
side but not for the k log₂(m) comparisons used in the subsequent binary searches for counting
cross inversions. Our approach based on the Connectivity Lemma yields a lower bound that
includes both terms. On the other hand, it is easier to estimate sensitivity and obtain the lower bound via
the Sensitivity Lemma than to argue the query lower bound via the Connectivity Lemma or
from scratch.
1.3 Other results
Other modes of computation. We stated our lower bounds for the standard, deterministic
mode of computation. Both of our techniques provide lower bounds for the number of distinct
execution traces that are needed to cover all input rankings, irrespective of whether these
execution traces derive from a single algorithm. Such execution traces can be viewed as
certificates or witnesses for the value of Π on a given input π, or as valid execution traces of a
nondeterministic algorithm for Π. We define the minimum number of traces needed to cover all
input rankings for a problem Π as the nondeterministic complexity of Π and denote it by N(Π),
along the lines of the Boolean setting [15]. All of our lower bounds on D(Π) actually hold for
N(Π). See Remark 3.4 and Remark 7.2 for further discussion.
Since randomized algorithms with zero error are also nondeterministic algorithms, all of our
lower bounds apply verbatim to the former mode of computation, as well. As for randomized
algorithms with bounded error, we argue in Section 6 that our lower bounds on the query complexity
of inversion minimization on trees that follow from the Sensitivity Lemma carry over modulo a
small loss in strength. We do so by showing in general that high average sensitivity implies
high query complexity against such algorithms (see Lemma 6.5).
The fact that our techniques yield lower bounds on N(Π) and not just D(Π) also explains
why our approaches sometimes fail. For example, for the problem Π of finding the minimum
of n items, a total of n certificates suffice and are needed, namely one for each possible item
being the minimum. This means that our techniques cannot give a lower bound on the query
complexity of Π that is better than log₂(n). In contrast, as reviewed in Section 2, D(Π) = 2^(n−1)
and the number of queries needed is n − 1.
Cross-inversion distribution. As a technical result in the sensitivity analysis for inversion
minimization on binary trees (Theorem 1.6), we need a strong upper bound on the probability
that the number of cross inversions XInv_π(A, B) takes on any particular value when the ranking π
of the set X = A ⊔ B is chosen uniformly at random. This is the distribution of the Mann–Whitney
statistic under the null hypothesis. Mann and Whitney [20] argued that it converges to a normal
distribution with mean μ = mn/2 and variance σ² = mn(m + n + 1)/12 as m = |A| and n = |B| grow
large. Since the normal distribution has a maximum density of 1/(√(2π)·σ), their result suggests
that the maximum of the underlying probability distribution is O(1/σ) = O(1/√(mn(m + n + 1))).
Takács [28] managed to formally establish such a bound for all pairs (m, n) with |m − n| = O(√(m + n)),
Stanley and Zanello [27] for all pairs (m, n) with min(m, n) bounded, and Melczer, Panova, and
Pemantle [21] for all pairs (m, n) with |m − n| ≤ α · (m + n) for some constant α < 1. However, these
results do not cover all regimes and leave open a single bound of the same form that applies to
all pairs (m, n), which is what we need for Theorem 1.6. We establish such a bound in Section 9.
The counts of the rankings π with a particular value for XInv_π(A, B) appear as the coefficients of
the Gaussian polynomials; our bound can be stated equivalently as a bound on those coefficients.
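The null distribution of XInv_π(A, B) is easy to tabulate exactly for small m and n, which makes the shape of the bound simple to check empirically. The following sketch (our own illustration, with arbitrary small parameters) brute-forces the distribution and compares its maximum probability against 1/σ:

```python
from itertools import combinations
from math import comb, sqrt

def xinv_distribution(m, n):
    """Distribution of cross inversions XInv_pi(A, B) over a uniformly
    random ranking pi of A and B together, with |A| = m and |B| = n.

    For this statistic a ranking is determined by which of the m + n
    ranks go to A; a cross inversion is a pair (a, b) in A x B that is
    out of order when the block A is placed before the block B."""
    total = comb(m + n, m)
    counts = {}
    for a_pos in combinations(range(m + n), m):
        a_set = set(a_pos)
        # each A-position p is crossed by every B-position before it
        k = sum(sum(1 for q in range(p) if q not in a_set) for p in a_pos)
        counts[k] = counts.get(k, 0) + 1
    return {k: c / total for k, c in counts.items()}

m, n = 5, 3
dist = xinv_distribution(m, n)
max_prob = max(dist.values())
# Mann-Whitney null standard deviation; the bound in question is O(1/sigma)
sigma = sqrt(m * n * (m + n + 1) / 12)
print(max_prob, 1 / sigma)
```

The counts produced for fixed (m, n) are exactly the coefficients of the Gaussian binomial coefficient with parameters m + n and m, so the symmetry of those coefficients shows up as symmetry of the distribution around mn/2.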
1.4 Organization
We have organized the material so as to provide a shortest route to a full proof of Theorem 1.5.
Here are the sections needed for the different main results:
• Theorem 1.5 (lower bound for general trees): 3, 4.
• Theorem 1.6 (lower bound for binary trees): 3, 5, 9.
• Theorem 1.8 (lower bound for restricted classes): 7, 8 up to 8.3 inclusive.
• Theorem 1.9 (lower bound for counting cross inversions): 7, 8 except 8.2 and 8.3.
In Section 2, we provide some background on known lower bounds in the comparison-query
model, several of which are unified by the Sensitivity Lemma and Connectivity Lemma. In
Section 6, we present our lower bounds against randomized algorithms with bounded error.
The tight bound on the maximum probability of the cross-inversion distribution is covered in
Section 9. For completeness, we end in Section 10 with proofs of the results we stated on the
Turing complexity of inversion minimization on trees.
2 The comparison-query model
In this section we provide an overview of known results and techniques for lower bounds in the
comparison-query model. This section can be skipped without a significant loss in continuity.
Tight bounds have been established for problems like sorting, selection, and heap construction.
• We already discussed the central problem of sorting in Section 1.
• In selection we are given a rank k, and must identify the item with rank k. The query
complexity is known to be Θ(n) [2, 7, 8]. There is also multiple selection, in which one is
given multiple ranks k₁, . . . , k_r, and must identify each of the corresponding items. The
query complexity of multiple selection is likewise known up to a Θ(n) gap between the
upper and lower bounds [16].
• In heap construction we must arrange the items as nodes in a complete binary tree such that
every node has a rank no larger than its children. The query complexity is known to be
Θ(n) [5, 11].
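To illustrate the linear bound for heap construction, here is a hedged sketch (not from the paper) of the classical bottom-up heapify procedure, instrumented to count comparisons; the count is at most twice the sum of subtree heights, which is O(n):

```python
import random

def build_min_heap(a):
    """Bottom-up heapify of list a; returns the number of comparisons."""
    n = len(a)
    comparisons = 0

    def sift_down(i):
        nonlocal comparisons
        while 2 * i + 1 < n:
            child = 2 * i + 1
            if child + 1 < n:
                comparisons += 1           # compare the two children
                if a[child + 1] < a[child]:
                    child += 1
            comparisons += 1               # compare parent with smaller child
            if a[child] < a[i]:
                a[i], a[child] = a[child], a[i]
                i = child
            else:
                break

    for i in range(n // 2 - 1, -1, -1):
        sift_down(i)
    return comparisons

items = list(range(31))
random.shuffle(items)
count = build_min_heap(items)
print(count)  # bounded by 2 * 31 for this input size
```

Each call to sift_down at a node of height h performs at most 2h comparisons, and the sum of heights over all nodes is less than n, which gives the linear upper bound checked above.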
All the problems above can be cast as instantiations of a general framework known as partial
order production [25]. Here, in addition to query access to the ranking π of the items, we are given
n slots and regular access to a partial order <slot on the slots. The objective is to put each item
into a slot, one item per slot, so that whenever two slots, s₁ and s₂, are related by s₁ <slot s₂, we
also have π(s₁) < π(s₂). Sorting coincides with the case where <slot is a total order. In selection
of rank k, there is a designated slot s*, and there are exactly k − 1 slots s with s <slot s* and
exactly n − k slots s with s* <slot s; there are no other relations in <slot (see the example at the
end of Section 7 for more details). Multiple selection is similar. For heap construction, <slot
matches the complete binary tree arrangement.
Partial order production for a given <slot naturally decomposes into the same problem for
each of the connected components of the undirected graph underlying <slot. In the case of a
single connected component, an elementary adversary argument shows that Q(Π) ≥ n − 1: any
combination of fewer than n − 1 queries to π leaves some pair of slots in <slot undetermined with
respect to π. Another lower bound is the information-theoretic limit. For each way of putting
items into slots, the number of input rankings π for which that way is a correct answer is bounded
by e(<slot), the number of ways to extend <slot to a total order. Therefore, there must be at least
n!/e(<slot) distinct execution traces. Since each execution trace is determined by the outcomes of
its queries, and each query has only two outcomes, we conclude that log₂(n!/e(<slot))
queries are necessary to solve partial order production. Complementing these lower bounds
there exists an upper bound of (1 + o(1)) · log₂(n!/e(<slot)) + c · (n − 1) queries for some universal constant
c [4]. For any instance Π with partial order <slot it follows that Q(Π) = Θ(log₂(n!/e(<slot)) + n − k(<slot)),
where k(<slot) denotes the number of connected components of the undirected graph underlying
<slot.
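For small posets the information-theoretic bound is easy to evaluate by brute force. This illustrative sketch (our own, not from the paper) counts linear extensions e(<slot) directly and prints the resulting query lower bound log₂(n!/e(<slot)):

```python
from itertools import permutations
from math import factorial, log2

def linear_extensions(n, relations):
    """Count total orders of slots 0..n-1 that extend the given strict
    partial order, specified as a set of pairs (s1, s2) with s1 < s2."""
    count = 0
    for perm in permutations(range(n)):
        pos = {s: i for i, s in enumerate(perm)}
        if all(pos[s1] < pos[s2] for (s1, s2) in relations):
            count += 1
    return count

n = 5
# selection of rank k = 3: slots 0, 1 below a designated slot 2, slots 3, 4 above
selection = {(0, 2), (1, 2), (2, 3), (2, 4)}
e = linear_extensions(n, selection)
print(e, log2(factorial(n) / e))
```

For the selection poset the count comes out to (k − 1)!·(n − k)! = 4, so the information-theoretic bound evaluates to log₂(120/4) ≈ 4.9 queries; a chain (total order) has a single extension and recovers the sorting bound log₂(n!).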
Not every problem of interest in the comparison model is an instance of partial order
production. Here are a few examples.
• In rank finding there is a designated item x*, and we have to compute its rank. The rank can
be computed by comparing x* with each of the n − 1 other items. A similar elementary
adversary argument as above shows that the query complexity is at least n − 1.
• In counting inversions the items are arranged in some known order σ and the objective is
to count the number of inversions of σ with respect to π. As we reviewed in Section 1,
counting inversions has exactly the same query complexity as sorting.
• The problem of inversion parity is the same as counting inversions except that one need only
count the number of inversions modulo 2. This problem, as well as counting inversions
modulo q for any integer q > 1, also has exactly the same complexity as sorting.
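As an aside (not part of the paper's argument), the standard way to count inversions with a sorting-like number of comparisons piggybacks on merge sort; a minimal sketch:

```python
def count_inversions(a):
    """Return (sorted copy of a, number of inversions), via merge sort."""
    if len(a) <= 1:
        return list(a), 0
    mid = len(a) // 2
    left, inv_l = count_inversions(a[:mid])
    right, inv_r = count_inversions(a[mid:])
    merged, inv = [], inv_l + inv_r
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            # right[j] is inverted with every remaining element of left
            merged.append(right[j]); j += 1
            inv += len(left) - i
    merged += left[i:] + right[j:]
    return merged, inv

print(count_inversions([3, 1, 4, 2])[1])  # pairs (3,1), (3,2), (4,2)
```

Each comparison made here is a legitimate comparison query, so the procedure simultaneously sorts and counts, matching the statement that counting inversions costs no more than sorting.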
For each of the three problems above, information theory does not provide a satisfactory
lower bound. For example, in the inversion parity problem there are only two possible outputs,
which yields a lower bound of log2 (2) = 1. It so happens that for each of the preceding three
examples, the query complexity is known quite precisely; however, the known arguments are
rather problem-specific.
Inversion minimization on trees is another example that does not fit the framework of partial
order production, and for which information theory only yields a weak lower bound: log₂(n(n − 1)/2) =
2 log₂(n) − Θ(1). In contrast to the above examples, a strong lower bound does not seem to
follow from a simple ad-hoc argument nor from a literal equivalence to sorting.
3 Sensitivity Lemma
In this section we develop Lemma 1.14. We actually prove a somewhat stronger version.
Lemma 3.1 (Strong Sensitivity Lemma). Consider an algorithm A in the comparison-based model
with n items, color each vertex of the permutahedron with its execution trace under A, and let H denote
the subgraph with the same vertex set but only containing the bichromatic edges. The number of distinct
execution traces of A is at least f(deg_avg(H) + 1)/n, where f : [1, ∞) → ℝ is any convex function with
f(x) = x! for x ∈ [n].
The Sensitivity Lemma follows from Lemma 3.1 because the coloring with execution traces of
an algorithm A for Π is a refinement of the coloring with Π, so every edge of the permutahedron
that is bichromatic under Π is also bichromatic under A, and

s(Π) = E[deg_{G(Π)}(π)] ≤ E[deg_H(π)] = deg_avg(H).

Provided f is nondecreasing, it follows that D(Π) ≥ f(deg_avg(H) + 1)/n ≥ f(s(Π) + 1)/n.
In the Sensitivity Lemma we set f(x) = Γ(x + 1). An optimal (but less elegant) choice for f is
the piecewise linear function that interpolates the prescribed values at the integral points in
[n], namely

f(x) = (x − ⌊x⌋) · (⌈x⌉!) + (1 − (x − ⌊x⌋)) · (⌊x⌋!).
For the proof of Lemma 3.1 we take intuition from a similar result in the Boolean setting [23,
Exercise 8.43], where the hypercube plays the role of the permutahedron in our setting.

Fact 3.2. Let A be a query algorithm on binary strings of length n. Color each vertex of the n-dimensional
hypercube by its execution trace under A, and let H denote the subgraph with the same vertex set but
only containing the bichromatic edges. Then the number of distinct execution traces is at least 2^(deg_avg(H)).
One way to argue Fact 3.2 is to think of assigning a weight w(x) to each x ∈ {0, 1}^n so as to
maximize the total weight over all inputs, subject to the constraint that the total weight on each
individual execution trace is at most 1. Then the number of distinct execution traces must be at
least the sum of all the weights. If the weight only depends on the degree, i. e., if we can write
w(x) = g(deg_H(x)) for some function g : [0, ∞) → ℝ, then we can lower bound the number T of
distinct execution traces as follows:

T ≥ Σ_x w(x) = Σ_x g(deg_H(x)) ≥ 2^n · g(E[deg_H(x)]) = 2^n · g(deg_avg(H)),   (3.1)

where the last inequality holds provided g is convex.
In the Boolean setting, the set R of inputs x ∈ {0, 1}^n with a particular execution trace forms
a subcube of dimension n − ℓ, where ℓ denotes the length of the execution trace, i. e., the number
of queries. Each x ∈ R has degree ℓ in H; this is because a change in a single queried position
results in a different execution trace, and a change in an unqueried position does not. Therefore,
a natural choice for the weight of x ∈ R is w(x) = g(ℓ) where g(x) = 1/2^(n−x). It satisfies the
constraint that the total weight on R is (at most) one, and g is convex. We conclude by (3.1) that
the number of distinct execution traces is at least 2^n · g(deg_avg(H)) = 2^(deg_avg(H)), as desired.
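Fact 3.2 can be sanity-checked numerically on a toy algorithm. The sketch below (our own illustration) runs short-circuit evaluation of OR on all inputs of length 4, colors the hypercube by execution trace, and verifies that the number of traces is at least 2 raised to the average bichromatic degree:

```python
from itertools import product

def trace(x):
    """Execution trace of short-circuit OR: query bits left to right,
    stopping at the first 1. The trace is the sequence of observed bits."""
    out = []
    for bit in x:
        out.append(bit)
        if bit == 1:
            break
    return tuple(out)

n = 4
inputs = list(product((0, 1), repeat=n))
colors = {x: trace(x) for x in inputs}
traces = set(colors.values())

def deg_H(x):
    """Number of bichromatic hypercube edges incident to x."""
    d = 0
    for i in range(n):
        y = list(x)
        y[i] ^= 1
        if colors[tuple(y)] != colors[x]:
            d += 1
    return d

deg_avg = sum(deg_H(x) for x in inputs) / len(inputs)
print(len(traces), 2 ** deg_avg)
```

Here every input's degree in H equals its trace length, exactly as in the subcube argument above, and the n + 1 traces of short-circuit OR comfortably exceed the bound 2^(deg_avg(H)).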
Proof of Lemma 3.1. Let T denote the number of distinct execution traces of A, and let R₁, . . . , R_T
denote the corresponding sets of rankings. Following a similar strategy, we want to find a convex
function g : [0, ∞) → ℝ such that the weight function w(π) = g(deg_H(π)) does not assign weight
more than 1 to any one of the sets R_i. The following claim, to be proven later, is the crux of this.

Claim 3.3. Let R denote the set of all rankings π that follow a particular execution trace of A, and let
d ∈ {0, . . . , n − 1}. The number of rankings π ∈ R with deg_H(π) = d is at most n!/(d + 1)!.
Based on Claim 3.3, a natural choice for g is any convex function that satisfies g(x) = (1/n) · (x + 1)!/n!
for x ∈ {0, . . . , n − 1}. The factor of 1/n comes from the fact that there are n terms to sum together
after the weights have been normalized. For every i ∈ [T] we then have

Σ_{π ∈ R_i} w(π) = Σ_{d=0}^{n−1} |{π ∈ R_i : deg_H(π) = d}| · g(d) ≤ Σ_{d=0}^{n−1} (n!/(d + 1)!) · (1/n) · ((d + 1)!/n!) = Σ_{d=0}^{n−1} 1/n = 1.

Similar to (3.1) we conclude

T ≥ Σ_{i=1}^{T} Σ_{π ∈ R_i} w(π) = Σ_π w(π) = Σ_π g(deg_H(π)) ≥ n! · g(E[deg_H(π)]) = n! · g(deg_avg(H)).   (3.2)

Setting g(x) = (1/n) · f(x + 1)/n! turns the requirements for g into those for f in the statement of the
lemma, and yields that T ≥ n! · g(deg_avg(H)) = f(deg_avg(H) + 1)/n.
We now turn to proving Claim 3.3. The comparisons and outcomes that constitute a
particular execution trace of A can be thought of as directed edges between the items in X. We
refer to the resulting digraph on the vertex set X as the comparison graph C. Since the outcomes
of the comparisons are consistent with some underlying ranking, the digraph C is acyclic. The
rankings in R are in one-to-one and onto correspondence with the linear orderings of the DAG
C. For a given ranking π ∈ R, the degree deg_H(π) equals the number of r ∈ {2, . . . , n} such that
swapping ranks r − 1 and r in π results in a ranking π′ = τπ that is not in R, where τ denotes the
adjacent-rank transposition (r − 1, r). The ranking π′ not being in R means that it is inconsistent
with the combined comparisons and outcomes of the underlying execution trace, which happens
exactly when there is a path in C from the item π⁻¹(r − 1) of rank r − 1 in π to the item π⁻¹(r)
with rank r in π. Thus, the degree deg_H(π) equals the number of r ∈ {2, . . . , n} such that there
is a path from π⁻¹(r − 1) to π⁻¹(r) in C. See Figure 5 for an illustration, where a squiggly edge
u ⇝ v denotes that there exists a path from u to v in C. We only draw squiggly edges from one
position to the next, so deg_H(π) equals the number of squiggly edges in Figure 5.
Figure 5: Ranking encoding. The items are shown in rank order 1, . . . , n; lucky items are marked black, and a squiggly edge between consecutive positions indicates a path in C.
Our strategy is to give a compressed encoding of the rankings in R such that there is more
compression as the number of squiggly edges increases. Our encoding is based on the well-
known algorithm to compute a linear order of a DAG. Algorithm 1 provides pseudocode for the
algorithm, which we refer to as BuildRanking.
In our formulation, BuildRanking is nondeterministic: there is a choice to make in step 3
for each r = 1, . . . , n. The possible executions of BuildRanking are in one-to-one and onto
correspondence with the linear orders of C, and thus with the rankings in R.
Algorithm 1 BuildRanking(C)
Input: DAG C on vertex set X
Output: ranking of X that is a linear order of C
1: P ← ∅
2: for r = 1 to n do
3:   x ← arbitrary element of S = {v ∈ X \ P | there is no u ∈ X \ P with u ⇝ v in C}
4:   π⁻¹(r) ← x
5:   P ← P ∪ {x}
Our encoding is a compressed description of how to make the decisions in BuildRanking
such that the output is π. Note that if π⁻¹(r − 1) ⇝ π⁻¹(r), then the item x with rank r cannot
enter the set S before iteration r. This is because before π⁻¹(r − 1) is removed from S at the end
of iteration r − 1, the edge π⁻¹(r − 1) ⇝ π⁻¹(r) prevents x from being in S. Thus, whenever
π⁻¹(r − 1) ⇝ π⁻¹(r), the item x = π⁻¹(r) is lucky in the sense that it gets picked in step 3 as soon
as it enters the set S. In fact, the lucky items with respect to a ranking π ∈ R are exactly those for
which π⁻¹(r − 1) ⇝ π⁻¹(r) for some r ∈ {2, . . . , n}, as well as the item π⁻¹(1) with rank 1. In
Figure 5 the lucky items are marked black. Their number equals deg_H(π) + 1.
In order to generate a ranking π using BuildRanking, it suffices to know:
(a) the lucky items (as a set, not their relative ordering), and
(b) the ordering of the non-lucky items (given which items they are).
This information suffices to make the correct choices in step 3 of Algorithm 1:
• If the set S contains a lucky item, there will be a unique lucky item in S; pick it as the
element x.
• Otherwise, pick for x the first item in the ordering of the non-lucky items that is not yet in
P. Such an element will exist, and all the items that come after it in the ordering are not
yet in P either.
If π has degree d = deg_H(π), then there are d + 1 lucky items, so there are at most C(n, d + 1)
choices for (a), and at most (n − d − 1)! choices for (b), resulting in a total of at most
C(n, d + 1) · (n − d − 1)! = n!/(d + 1)! choices. This proves Claim 3.3.
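The encoding argument can be made concrete. The following sketch (our own illustration, with a hypothetical comparison DAG) computes the lucky items of a ranking and then replays BuildRanking from parts (a) and (b) alone:

```python
from itertools import permutations

# hypothetical comparison DAG C on items {a, b, c, d}: a < b and a < c
items = ["a", "b", "c", "d"]
edges = {("a", "b"), ("a", "c")}

def reaches(u, v):
    """Whether there is a path u -> v in C (tiny DAG, so DFS suffices)."""
    stack = [u]
    while stack:
        w = stack.pop()
        if w == v:
            return True
        stack += [y for (x, y) in edges if x == w]
    return False

def encode(ranking):
    """(a) the set of lucky items, (b) the non-lucky items in rank order."""
    lucky = {ranking[0]}
    lucky |= {ranking[r] for r in range(1, len(ranking))
              if reaches(ranking[r - 1], ranking[r])}
    return lucky, [x for x in ranking if x not in lucky]

def decode(lucky, rest):
    """Replay BuildRanking using only the encoded information."""
    placed, out, queue = set(), [], list(rest)
    while len(out) < len(items):
        avail = {v for v in items if v not in placed
                 and not any(u not in placed and reaches(u, v)
                             for u in items if u != v)}
        pick = next(iter(avail & lucky), None)
        if pick is None:
            pick = queue.pop(0)
        out.append(pick)
        placed.add(pick)
    return out

# round-trip over every linear extension of C
for perm in permutations(items):
    pos = {x: i for i, x in enumerate(perm)}
    if all(pos[u] < pos[v] for (u, v) in edges):
        assert decode(*encode(list(perm))) == list(perm)
print("all linear extensions round-trip")
```

The round-trip check exercises exactly the two decoding rules above: a lucky item is taken the moment it becomes available, and otherwise the next recorded non-lucky item is taken.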
Remark 3.4. Suppose we allow an algorithm A to have multiple valid execution traces on a
given input π, and let R_i denote the set of rankings on which the i-th execution trace is valid.
The proof of Claim 3.3 carries over as it considers the sets R_i individually, and only depends on the
DAG that the comparisons in R_i induce. The rest of the proof of Lemma 3.1 carries through
modulo the first equality in (3.2), which no longer holds as the sets R_i may overlap. However,
the equality can be replaced by the inequality ≥, which does hold and is sufficient for the
argument. This means that we can replace D(Π) in the statement of the Sensitivity Lemma by
its nondeterministic variant N(Π).
4 Sensitivity approach for general trees
In this section we analyze the average sensitivity of the problem Π_T of inversion minimization on
a tree T with a general shape. In Section 4.1 we show that the existence of a subtree containing
a fair fraction of the leaves implies high sensitivity. The lower bound on query complexity
for Π_T in Theorem 1.5 then follows from the Sensitivity Lemma. In Section 4.3 we prove that
the average sensitivity measure is Lipschitz continuous. For the analysis, we make use of
the decomposition of the objective of inversion minimization on trees mentioned earlier. We
describe the decomposition in more detail in Section 4.2; it will be helpful in later parts of this
paper, as well.
4.1 Subtree-induced sensitivity
We first introduce a sensitivity bound for inversion minimization based on the size of a subtree.

Lemma 4.1 (subtree-induced sensitivity). Consider a tree T with n = |L(T)| leaves, and some node v in T with
ℓ = |L(T_v)| leaves. We have

s(Π_T) ≥ ℓ(n − ℓ)/n − 1.

Note that v is not necessarily a direct child of the root, as shown in Figure 6.

Figure 6: Subtree rooted at v
We now prove Lemma 4.1. Let π be a ranking of the leaves of T, and let σ_min be a tree ordering that
minimizes the number of inversions with respect to π.

Claim 4.2. π is sensitive to the transposition τ = (i, i + 1) if σ_min(π⁻¹(i)) > σ_min(π⁻¹(i + 1)).

Proof. If σ_min(π⁻¹(i)) > σ_min(π⁻¹(i + 1)), then Inv_{τπ}(σ_min) = Inv_π(σ_min) − 1. By definition,
Inv_π(σ_min) = MinInv(T, π), so this means that MinInv(T, τπ) < MinInv(T, π), or that π is
sensitive to τ.

In the case of general trees, a tree ordering σ that minimizes the number of inversions with
respect to π is difficult to find (see the discussion on NP-hardness in Section 10). Our strategy
is to find a lower bound on the number of i for which σ(π⁻¹(i)) > σ(π⁻¹(i + 1)) that applies
regardless of σ.

Claim 4.3. For any ordering σ, the number of i such that σ(π⁻¹(i)) > σ(π⁻¹(i + 1)) is at least one less
than the number of j such that π⁻¹(j) ∈ L(T_v) and π⁻¹(j + 1) ∉ L(T_v).

Proof. For all except at most one value of j (the maximum j for which π⁻¹(j) ∈ L(T_v)), there exists
a minimal j′ > j such that π⁻¹(j′) ∈ L(T_v). We claim that at least one value of i = j, . . . , j′ − 1
satisfies σ(π⁻¹(i)) > σ(π⁻¹(i + 1)). If not, then σ would rank π⁻¹(j), π⁻¹(j + 1), . . . , π⁻¹(j′) in
increasing order. Because σ is a tree ordering, the leaves of L(T_v) must be mapped into a
contiguous range by σ, as shown in Figure 7. However, we have π⁻¹(j), π⁻¹(j′) ∈ L(T_v) but
π⁻¹(j + 1) ∉ L(T_v), which violates this property since σ ranks a leaf outside L(T_v) between two
leaves inside L(T_v).
Because each value of i is found between a distinct consecutive pair (j, j′), the values of i
are distinct.
Figure 7: σ maps the leaves of L(T_v) to a contiguous range. (a) Leaves in σ-order, labeled with π-ranks; (b) corresponding plot of (σ, π) for each leaf.
Claim 4.4. Over a uniformly random π, the expected number of j such that π⁻¹(j) ∈ L(T_v) and
π⁻¹(j + 1) ∉ L(T_v) is ℓ(n − ℓ)/n.

Proof. For j = 1, . . . , n − 1, the probability that π⁻¹(j) ∈ L(T_v) is ℓ/n, and the probability that
π⁻¹(j + 1) ∉ L(T_v) given that π⁻¹(j) ∈ L(T_v) is (n − ℓ)/(n − 1). Using linearity of expectation on the indicator
random variables for π⁻¹(j) ∈ L(T_v) and π⁻¹(j + 1) ∉ L(T_v), the expected number of j satisfying
this property is

(n − 1) · ℓ(n − ℓ)/(n(n − 1)) = ℓ(n − ℓ)/n.

Combining Claim 4.2, Claim 4.3, and Claim 4.4, we can conclude Lemma 4.1.
Bounded degree. We apply our analysis to the case of trees of degree d. Observe that for fixed
n, Lemma 4.1 is strongest when ℓ = n/2. Not every tree T has a subtree with exactly n/2 leaves,
but Lemma 4.1 still gives a useful bound for subtrees that do not contain too few or too many
leaves. In the case of trees of bounded degree, there always exists a subtree T_v that contains a
fairly balanced fraction of the leaves. The following quantification is folklore, but we include a
proof for completeness.
Fact 4.5. If T is a tree of degree d with n leaves, there exists a node v in T such that ℓ = |L(T_v)| = α · n,
where 1/(d + 1) ≤ α ≤ d/(d + 1).

Proof. Let v₁ be the root of T and construct a sequence v₁, v₂, v₃, . . . such that v_i is a child of
v_{i−1} that maximizes ℓ_i = |L(T_{v_i})|, with ties broken arbitrarily. Notice that {ℓ_i} is a decreasing
sequence, and since T has degree d, ℓ_i ≤ d · ℓ_{i+1} for all i. We claim that some v_i in this sequence
satisfies the conditions of the claim. If not, then for some i, ℓ_i > d/(d + 1) · n and ℓ_{i+1} < 1/(d + 1) · n, which
contradicts the fact that ℓ_i ≤ d · ℓ_{i+1}.
By choosing a subtree satisfying Fact 4.5, we can apply Lemma 4.1 and conclude that
s(Π_T) ≥ d/(d + 1)² · n − 1. The Sensitivity Lemma then gives the "in particular" part of Theorem 1.5.
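The descent in the proof of Fact 4.5 is short enough to code directly. A minimal sketch (our own, with an ad-hoc nested-list tree encoding):

```python
def num_leaves(t):
    """Leaves of a tree given as nested lists; any non-list is a leaf."""
    return 1 if not isinstance(t, list) else sum(num_leaves(c) for c in t)

def balanced_subtree(t, d):
    """Walk toward the heaviest child until the current subtree holds at
    most a d/(d+1) fraction of the n leaves; by the argument of Fact 4.5
    it then still holds at least an n/(d+1) fraction."""
    n = num_leaves(t)
    while num_leaves(t) > d * n / (d + 1):
        t = max(t, key=num_leaves)
    return t

# degree-2 caterpillar on 8 leaves
tree = [1, [2, [3, [4, [5, [6, [7, 8]]]]]]]
sub = balanced_subtree(tree, 2)
print(num_leaves(sub))
```

On the caterpillar the walk stops at the subtree with 5 leaves, which indeed lies between n/3 and 2n/3 for n = 8.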
4.2 Decomposition of the objective function
For use in this section as well as later parts of the paper, we now explain how the objective of
inversion minimization on trees decomposes. We introduce the notion of root inversion along
the way, and observe the effect of adjacent-rank transpositions on the decomposition.
The objective MinInv(T, π) can be written as the sum of contributions from each of the
individual nodes. A node v contributes those inversions that reside in the subtree T_v and go
through the root v of T_v. We refer to them as the root inversions in T_v.

Definition 4.6 (root inversions, RInv(·, ·, ·), MinRInv(·, ·)). Given a tree T, a ranking π of the
leaves of T, and an ordering σ of T, a root inversion of σ with respect to π is an inversion (ℓ₁, ℓ₂)
of σ with respect to π for which the lowest common ancestor LCA(ℓ₁, ℓ₂) is the root of T. The
number of root inversions of σ with respect to π in T is denoted by RInv(T, π, σ). The minimum
number of root inversions in T with respect to π is denoted

MinRInv(T, π) = min_σ RInv(T, π, σ),   (4.1)

where σ ranges over all possible orderings of T.

The only aspect of the ordering σ of T_v that affects RInv(T_v, π, σ) is the relative order of the
children of v. For a node v with d children u₁, . . . , u_d, by abusing notation and using σ to also
denote the ordering of the children induced by the ordering of the tree, we have

RInv(T_v, π, σ) = Σ_{1 ≤ i < j ≤ d} XInv_π(L_{σ(i)}, L_{σ(j)}),   (4.2)

where L_i is a short-hand for the leaf set L(T_{u_i}). The contributions of the nodes can be optimized
independently:

MinInv(T, π) = Σ_v MinRInv(T_v, π),   (4.3)

where v ranges over all nodes of T with degree deg_T(v) > 1.
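For small trees the decomposition can be evaluated directly. This sketch (our own illustration) computes MinInv(T, π) as the sum of independently optimized per-node contributions, brute-forcing the child order at each node:

```python
from itertools import permutations

def leaves(t):
    return [t] if not isinstance(t, list) else [x for c in t for x in leaves(c)]

def xinv(pi, A, B):
    """Cross inversions XInv_pi(A, B): pairs out of order when the block
    of A is placed before the block of B."""
    return sum(1 for a in A for b in B if pi[a] > pi[b])

def min_rinv(pi, children):
    """MinRInv at a node: best ordering of the children's leaf blocks."""
    blocks = [leaves(c) for c in children]
    return min(sum(xinv(pi, p[i], p[j]) for i in range(len(p))
                   for j in range(i + 1, len(p)))
               for p in permutations(blocks))

def min_inv(t, pi):
    """MinInv(T, pi) via the per-node decomposition."""
    if not isinstance(t, list):
        return 0
    return min_rinv(pi, t) + sum(min_inv(c, pi) for c in t)

# leaves a..d; pi maps each leaf to its rank
tree = [["a", "d"], ["b", "c"]]
pi = {"a": 1, "d": 4, "b": 2, "c": 3}
print(min_inv(tree, pi))
```

The independence of the per-node contributions is what makes this recursion valid: each node's best child order can be chosen without looking at the choices made elsewhere in the tree.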
When we apply an adjacent-rank transposition τ to a ranking π, at most one of the terms in the
decomposition (4.3) can change, and the change is at most one unit. We capture this observation
for future reference as it will be helpful in several sensitivity analyses.
Proposition 4.7. Let π be a ranking of the leaf set X of a tree T, τ an adjacent-rank transposition, and ℓ₁
and ℓ₂ be the affected leaves. Then

MinRInv(T_v, π) = MinRInv(T_v, τπ)

for all nodes v in T except possibly v = LCA(ℓ₁, ℓ₂). Moreover, the difference is at most 1 in absolute
value.

Proof. Since the ranks of ℓ₁ and ℓ₂ under π are adjacent, for any leaf ℓ other than ℓ₁ and ℓ₂, the
relative order of ℓ under π is the same with respect to ℓ₁ as it is with respect to ℓ₂. This means
that the adjacent-rank transposition τ does not affect whether a pair of leaves constitutes an
inversion unless that pair equals {ℓ₁, ℓ₂}. As a result, the only term on the right-hand side of
(4.3) that can be affected by the transposition τ is the one corresponding to the node v = LCA(ℓ₁, ℓ₂), and it
can change by at most one unit.
4.3 Lipschitz continuity
Average-case notions typically do not change much under small changes to the input. This is
indeed the case for the average sensitivity when "small" is interpreted as affecting few of the
subtrees. The following lemma quantifies the property and can be viewed as a form of Lipschitz
continuity.

Lemma 4.8. Given a tree T, if a subtree T_{v*} with ℓ leaves is replaced with a tree T′_{v*} with the same
number of leaves, resulting in the tree T′, then

|s(Π_T) − s(Π_{T′})| ≤ ℓ(ℓ − 1)/n.

Figure 8: Effects of changing T_{v*}
Proof. We think of the leaf sets of T and T′ as being the same set X = L(T) = L(T′), and fix a ranking π of X.
Consider an ordering of T and the ranking σ of X that it induces. Outside of T′_{v*} we can order T′
in the same way as T. Irrespective of how we order T′ inside T′_{v*}, the induced ranking σ′ of X
agrees with σ on all leaves in X except possibly those in Y = L(T_{v*}) = L(T′_{v*}). Moreover, under
both σ and σ′, the set Y gets mapped to the same contiguous interval. It follows that for all pairs
(ℓ₁, ℓ₂) of distinct leaves of which at least one lies outside of Y, (ℓ₁, ℓ₂) constitutes an inversion of
σ with respect to π if and only if (ℓ₁, ℓ₂) constitutes an inversion of σ′ with respect to π. For any
node v outside of T_{v*}, root inversions in T_v cannot involve leaves that are both in Y = L(T_{v*}). See
Figure 8 for an illustration. Thus, for such nodes v, RInv(T_v, π, σ) = RInv(T′_v, π, σ′). By taking
the minimum over all orderings, we conclude:

Claim 4.9. MinRInv(T_v, π) = MinRInv(T′_v, π) holds for every node v outside of T_{v*} (or equivalently,
outside of T′_{v*}).
Consider a ranking π and an adjacent-rank transposition τ = (i, i + 1). We claim that,
unless (ℓ₁, ℓ₂) = (π⁻¹(i), π⁻¹(i + 1)) ∈ Y × Y, Π_T is sensitive to τ at π if and only if Π_{T′} is
sensitive to τ at π. This is because by Proposition 4.7 the only term in the decomposition (4.3)
of MinInv(T, π) that can be affected by τ is the contribution MinRInv(T_v, π) for v = LCA(ℓ₁, ℓ₂).
If at least one of ℓ₁ or ℓ₂ is not inside T_{v*}, then v is not inside T_{v*} either, so by Claim 4.9,
MinRInv(T_v, π) = MinRInv(T′_v, π). By the same token, MinRInv(T_v, τπ) = MinRInv(T′_v, τπ). It
follows that MinInv(T, π) ≠ MinInv(T, τπ) if and only if MinInv(T′, π) ≠ MinInv(T′, τπ).
We bound the expected number of values of i for which (π⁻¹(i), π⁻¹(i + 1)) ∈ Y × Y with
Y = L(T_{v*}) when π is chosen uniformly at random. For i ∈ [n − 1], the probability that π⁻¹(i) ∈ Y
is ℓ/n, and the probability that π⁻¹(i + 1) ∈ Y given that π⁻¹(i) ∈ Y is (ℓ − 1)/(n − 1). Using linearity of
expectation on the indicators, the expected number of said i is

(n − 1) · ℓ(ℓ − 1)/(n(n − 1)) = ℓ(ℓ − 1)/n.
Lemma 4.8 helps to extend query lower bounds based on average sensitivity to larger classes.
Suppose we have established a good lower bound on the sensitivity s(Π_T) for a class C of trees.
Consider a class C′ obtained by taking a tree T in class C and replacing some of the subtrees T_v
by other subtrees T′_v on the same number of leaves. For this new class C′ the same lower bound
on the sensitivity of inversion minimization applies modulo the Lipschitz loss. For example,
Theorem 1.6 holds by virtue of a lower bound of the form s(Π_T) ≥ (n − 1) − c·log(n) for every
binary tree T with n leaves, where c is a universal constant. If we allow some of the subtrees of
T to be replaced by, say, freely arrangeable ones on the same leaves, applying Lemma 4.8 for
each of the modified subtrees in sequence shows that the resulting new tree T′ has

s(Π_{T′}) ≥ s(Π_T) − αn(αn − 1)/n ≥ (n − 1) − c·log(n) − α²(n − 1) = (1 − α²)(n − 1) − c·log(n),

where α denotes the fraction of leaves that belong to one of the replaced subtrees.
In fact, the notion of average sensitivity is robust with respect to the following, more refined
type of surgery. From any tree T, let R be a connected subset of T that includes no leaves. Let
T′ be the subtree rooted at the LCA of R (T′ contains all of R), and let T′₁, . . . , T′_k be the disjoint
maximal subtrees of T′ that are strictly below R. Let R′ be any tree that has k leaves. Replace T′
by R′, and then replace the leaves of R′ by T′₁, . . . , T′_k.
The effect is that the region R has been "reshaped" to look like R′, but the rest of T is
unaffected. The cost of such a surgery is at most (n − 1) times the probability that a uniformly
random pair of distinct leaves has their LCA in R. The bound follows from thinking of sensitivity
as (n − 1) times the probability that a uniformly random edge in the full permutahedron is
bichromatic. Provided the LCA of the affected leaves is outside R, we get sensitivity before
the surgery if and only if we get it after the surgery. Surgeries can be iterated, and the costs
accumulate additively. In combination with our strong lower bound on the average sensitivity
of binary trees (Lemma 5.7), this allows for a robust sense in which "mostly-binary" trees have
high average sensitivity.
5 Refined sensitivity approach for binary trees
In this section we show how to refine the sensitivity approach for lower bounds on the query
complexity of the problem Π_T of inversion minimization on trees in the important special case
of binary trees T. In Section 5.1 we first develop a criterion for when a particular ranking π is
sensitive to a particular adjacent-rank transposition τ. We then analyze the root sensitivity of
binary trees in Section 5.2 and finally establish a strong lower bound on the average sensitivity
in Section 5.3. An application of the Sensitivity Lemma then yields Theorem 1.6.
5.1 Sensitivity criterion
Recall the decomposition of the objective function MinInv(T, π) into contributions attributed to
each node v of degree deg_T(v) > 1, as given by (4.3) in Section 4.2. In the case of binary trees,
the contribution of node v can be calculated simply as

MinRInv(T_v, π) = min(XInv_π(L₁, L₂), XInv_π(L₂, L₁)),   (5.1)

where u₁ and u₂ denote the two children of v, and L₁ = L(T_{u₁}) and L₂ = L(T_{u₂}) their leaf sets.
This simplicity makes a precise analysis of sensitivity feasible, as we will see next.
For a given ranking π of X and a given adjacent-rank transposition τ, we would like to figure
out the effect of τ on the objective MinInv(T, ·), in particular when MinInv(T, τπ) = MinInv(T, π).
Let ℓ_lo and ℓ_hi denote the two leaves that are affected by the transposition τ on the ranking
π, where the subscript "lo" indicates the lower of the two leaves with respect to π, and "hi"
the higher of the two. Let v be the lowest common ancestor LCA(ℓ_lo, ℓ_hi). We use the same
subscripts "lo" and "hi" for the two children of v: u_lo denotes the child whose subtree contains
ℓ_lo, and u_hi its sibling. Similarly, we denote by L_lo the leaf set of T_{u_lo}, and by L_hi the leaf set of
T_{u_hi}. See Figure 9 for the subsequent analysis.

Figure 9: Sensitivity analysis for binary trees. The children of v = LCA(ℓ_lo, ℓ_hi) are u_lo and u_hi, with leaf sets L_lo = {ℓ_lo} ⊔ A and L_hi = {ℓ_hi} ⊔ B; further, A = A_< ⊔ A_> with π(A_<) < π(ℓ_lo) < π(A_>), and B = B_< ⊔ B_> with π(B_<) < π(ℓ_hi) < π(B_>).

By Proposition 4.7, the situation before and after the application of τ is as follows, where we
abbreviate x = XInv_π(L_lo, L_hi) and y = XInv_π(L_hi, L_lo).
ranking | XInv(L_lo, L_hi) | XInv(L_hi, L_lo) | MinRInv(T_v, ·)
π       | x                | y                | min(x, y)
τπ      | x + 1            | y − 1            | min(y − 1, x + 1)

The objective function remains the same iff min(x, y) = min(y − 1, x + 1), which happens iff
x − y = −1, or equivalently iff

DInv_π(L_lo, L_hi) = DInv_π({ℓ_lo}, {ℓ_hi}),   (5.2)
where we introduce the following short-hand:

Definition 5.1 (cross inversion difference, DInv·(·, ·)). For a ranking σ of a set X, and two subsets A, B ⊆ X,

DInvσ(A, B) = XInvσ(A, B) − XInvσ(B, A).
We can split Llo as Llo = {ℓlo} ⊔ A = A< ⊔ {ℓlo} ⊔ A>, where A< contains all leaves in Llo that σ ranks before ℓlo, and A> contains all the leaves in Llo that σ ranks after ℓlo. We similarly split Lhi, as indicated in Figure 9. We have that

DInvσ(Llo, Lhi) = DInvσ({ℓlo}, {ℓhi}) + DInvσ({ℓlo}, B) + DInvσ(A, {ℓhi}) + DInvσ(A, B).

Since the ranks of ℓlo and ℓhi under σ are adjacent, we have that DInvσ({ℓlo}, B) = |B<| − |B>| and DInvσ(A, {ℓhi}) = |A>| − |A<|. Plugging everything into (5.2) we conclude:
Proposition 5.2. Let T be a binary tree, σ a ranking of the leaves of T, τ an adjacent-rank transposition, ℓlo and ℓhi the two leaves affected by τ under σ such that σ ranks ℓlo before ℓhi. Referring to the notation in Figure 9, we have that

MinInv(T, σ) = MinInv(T, στ) ⟺ DInvσ(Llo, Lhi) = DInvσ({ℓlo}, {ℓhi})   (5.3)
⟺ DInvσ(A, B) = |A<| − |A>| + |B>| − |B<|.   (5.4)
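Proposition 5.2 can be sanity-checked by brute force on a small example. The sketch below (tree encoding and helper names are our own) computes MinInv through the decomposition (4.3) with the per-node formula (5.1), and confirms criterion (5.3) for every ranking of a small binary tree and every adjacent-rank transposition:

```python
from itertools import permutations

def xinv(sigma, A, B):
    return sum(1 for a in A for b in B if sigma[a] > sigma[b])

def dinv(sigma, A, B):
    return xinv(sigma, A, B) - xinv(sigma, B, A)

# A small binary tree as nested tuples; leaves are integers.
TREE = ((0, 1), (2, (3, 4)))

def leaves(t):
    return [t] if isinstance(t, int) else [x for c in t for x in leaves(c)]

def min_inv(t, sigma):
    """MinInv via the decomposition (4.3) with the per-node formula (5.1)."""
    if isinstance(t, int):
        return 0
    L1, L2 = leaves(t[0]), leaves(t[1])
    return (min(xinv(sigma, L1, L2), xinv(sigma, L2, L1))
            + min_inv(t[0], sigma) + min_inv(t[1], sigma))

def lca_children(t, p, q):
    """Leaf sets of the two children of LCA(p, q): (side of p, side of q)."""
    L1, L2 = leaves(t[0]), leaves(t[1])
    if p in L1 and q in L2:
        return L1, L2
    if p in L2 and q in L1:
        return L2, L1
    return lca_children(t[0] if p in L1 else t[1], p, q)

n = len(leaves(TREE))
for perm in permutations(range(n)):
    sigma = dict(enumerate(perm))            # sigma[leaf] = rank of that leaf
    ranks = {r: l for l, r in sigma.items()}
    for i in range(n - 1):                   # adjacent-rank transposition (i, i+1)
        lo, hi = ranks[i], ranks[i + 1]      # affected leaves, lo ranked first
        swapped = dict(sigma)
        swapped[lo], swapped[hi] = i + 1, i
        Llo, Lhi = lca_children(TREE, lo, hi)
        criterion = dinv(sigma, Llo, Lhi) == dinv(sigma, [lo], [hi])
        assert (min_inv(TREE, sigma) == min_inv(TREE, swapped)) == criterion
print("criterion (5.3) verified for all rankings of", TREE)
```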
5.2 Root sensitivity
Given a ranking σ and an adjacent-rank transposition τ, we know by Proposition 4.7 that at most one of the terms in the decomposition (4.3) of MinInv(T, σ) is affected by the transposition, namely MinRInv(Tv, σ) where v is the lowest common ancestor of the affected leaves ℓlo and ℓhi. It follows that we can write the average sensitivity of ΠT = MinInv(T, ·) as the following convex combination:

s(ΠT) = (n − 1) · Pr[MinInv(T, σ) ≠ MinInv(T, στ)]
      = (n − 1) · Σv Pr[v = LCA(ℓlo, ℓhi)] · Pr[MinRInv(Tv, σ) ≠ MinRInv(Tv, στ) | v = LCA(ℓlo, ℓhi)],   (5.5)
where the probability is over a uniformly random choice of the ranking σ and the adjacent-rank transposition τ, and ℓlo and ℓhi denote the affected leaves. The conditional probability on the right-hand side of (5.5) only depends on the subtree Tv. The ranking σ of all leaves induces a ranking σ′ of the leaves of Tv that is uniform under the conditioning. Similarly, the adjacent-rank transposition τ for σ induces an adjacent-rank transposition τ′ for σ′; the distribution of τ′ under the conditioning is independent of σ′ and uniform among all adjacent-rank transpositions such that the affected leaves live in subtrees of different children of v. Thus, the probability on the right-hand side of (5.5) coincides with the following notion for the subtree Tv.
Definition 5.3 (root sensitivity). Let T be a tree. The root sensitivity of T is the probability that

MinRInv(T, σ) ≠ MinRInv(T, στ)

when σ is a uniformly random ranking of L(T), and τ a uniformly random adjacent transposition conditioned on the affected leaves lying in subtrees of different children of the root of T.
Note that the only nodes v that need to be considered in the sum on the right-hand side of (5.5) are those that can appear as the lowest common ancestor of two leaves, and such that Tv is not freely arrangeable. In the case of binary trees, this means that we only need to consider nodes v of degree 2 such that Tv contains more than 2 leaves. In this section we prove a strong lower bound on the root sensitivity of such trees Tv.
Consider the binary tree T with root v in Figure 9. The distribution underlying Definition 5.3 can be generated as follows: Pick a leaf on each side of the root v uniformly at random, and let σ be a ranking of the leaves of T that is uniformly random subject to the condition that the selected leaves receive adjacent ranks; τ then is the adjacent-rank transposition that swaps the two selected leaves. The root sensitivity of T is the complement of the probability that the right-hand side of (5.4) holds under this distribution. Let us analyze the left-hand side of (5.4) further. As A = A< ⊔ A> and B = B< ⊔ B>, we have that

DInvσ(A, B) = DInvσ(A<, B<) + DInvσ(A<, B>) + DInvσ(A>, B<) + DInvσ(A>, B>).

By the defining properties of the sets involved (see Figure 9), we know that DInvσ(A<, B>) = −a<b> and DInvσ(A>, B<) = a>b<, where a< = |A<|, a> = |A>|, b< = |B<|, and b> = |B>|. Thus, we can rewrite criterion (5.4) as:

DInvσ(A<, B<) + DInvσ(A>, B>) = a<b> − a>b< + a< − a> + b> − b<.   (5.6)

A critical observation that helps us to bound the probability of (5.6) is that, conditioned on all four values a· and b·, the right-hand side of (5.6) is fixed, but the left-hand side still contains a lot of randomness. In fact, under the stated conditioning, the ranking that σ induces on A< ⊔ B< is still distributed uniformly at random, the same holds for the ranking that σ induces on A> ⊔ B>, and both distributions are independent. This means that, under the same conditioning, the left-hand side of (5.6) has the same distribution as the sum XInv_{a<,b<} + XInv_{a>,b>} of two independent random variables of the following type:
Definition 5.4 (cross inversion distribution, XInv·,·). For nonnegative integers a and b, XInv_{a,b} denotes the random variable XInv(A, B) that counts the number of cross inversions from A to B, where A is an array of length a, B an array of length b, and the concatenation AB is a uniformly random permutation of [a + b].
In Section 9 we establish the following upper bound on the probability that the number of cross inversions takes on any specific value.

Lemma 5.5. There exists a constant C such that for all integers a, b ≥ 1 and 0 ≤ k ≤ ab,

Pr[XInv_{a,b} = k] ≤ C/√(ab(a + b)).   (5.7)
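The distribution of XInv_{a,b} is classical (its generating function is a Gaussian binomial coefficient, which Section 9 exploits), and a short dynamic program makes Lemma 5.5 easy to probe numerically. The recurrence below and the constant 2.0 in the check are our own choices for illustration:

```python
from functools import lru_cache
from math import comb, sqrt

@lru_cache(maxsize=None)
def ways(a, b, k):
    """Number of interleavings of a A-items and b B-items with exactly k
    cross inversions (pairs in which a B-item precedes an A-item)."""
    if k < 0:
        return 0
    if a == 0 or b == 0:
        return 1 if k == 0 else 0
    # Condition on the last slot: an A-item there is preceded by all b
    # B-items (adding b inversions); a B-item there adds none.
    return ways(a - 1, b, k - b) + ways(a, b - 1, k)

def prob(a, b, k):
    """Pr[XInv_{a,b} = k] under a uniformly random interleaving."""
    return ways(a, b, k) / comb(a + b, a)

# Probe the shape of Lemma 5.5 for small a, b: the largest point
# probability times sqrt(ab(a+b)) stays below a modest constant.
for a in range(1, 9):
    for b in range(1, 9):
        p_max = max(prob(a, b, k) for k in range(a * b + 1))
        assert p_max * sqrt(a * b * (a + b)) < 2.0
print("point probabilities obey the C/sqrt(ab(a+b)) shape for a, b <= 8")
```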
Using Lemma 5.5 we can establish an upper bound of the same form as the right-hand side of (5.7) for the probability that (5.6) holds: For some constant C′,

Pr[XInv_{a<,b<} + XInv_{a>,b>} = a<b> − a>b< + a< − a> + b> − b<] ≤ C′/√(ab(a + b)),   (5.8)

where a = a< + a> ≥ 1 and b = b< + b> ≥ 1. We consider several cases based on the relative sizes of a< vs a>, and b< vs b>.
(i) In case both a< ≥ εa and b< ≥ εb, the bound (5.8) follows from (5.7) as long as C′ ≥ C/ε^{3/2}.

(ii) By switching the order in (i), the same holds true in case both a> ≥ εa and b> ≥ εb.

(iii) In case a> ≤ εa and b< ≤ εb, the left-hand side of (5.6) is at most 2εab, whereas the right-hand side is at least

(1 − ε)²ab − ε²ab + (1 − ε)a − εa + (1 − ε)b − εb = (1 − 2ε)(ab + a + b).

As long as 2ε ≤ 1 − 2ε, or equivalently, ε ≤ 1/4, this case cannot occur.

(iv) By switching the roles of A and B in (iii), the same holds true in case b> ≤ εb and a< ≤ εa.

As long as ε ≤ 1/2, it holds that either a< ≥ εa or a> ≥ εa, and either b< ≥ εb or b> ≥ εb. Distributing the "and" over the "or", we obtain the four cases we considered, which are therefore exhaustive. We conclude that (5.8) holds for C′ = 4^{3/2}·C whenever a = |A| ≥ 1 and b = |B| ≥ 1.
In the case where a = 0 and b ≥ 1, the right-hand side of (5.4) vanishes, as do |A<| and |A>|, so (5.4) holds if and only if |B<| = |B>|, or equivalently, the leaf ℓhi is ranked exactly in the middle of the leaf set Lhi. As the ranking σ is chosen uniformly at random, this happens with probability 1/(b + 1), where b = |B| = |Lhi| − 1. The case where a ≥ 1 and b = 0 is symmetric. The remaining case, a = b = 0, is one we do not need to consider, as the tree T then only has two leaves. In all other cases we obtain a strong upper bound on the probability that (5.4) holds, and by complementation a strong lower bound on the root sensitivity. We capture the lower bound in the following single expression that holds for all cases under consideration.
Lemma 5.6. There exists a constant c such that for every binary tree T with at least 3 leaves and a root of degree 2, the root sensitivity of T is at least

1 − c/√(m1m2(m1 + m2)),   (5.9)

where m1 and m2 denote the number of leaves in the subtrees rooted by the two children of the root.
5.3 Average sensitivity
We are now ready to establish that, except for trivial cases, the average sensitivity of a binary tree is close to maximal. The trivial cases are those where the tree has at most two leaves, in which case the sensitivity is zero.

Lemma 5.7. The average sensitivity of ΠT for binary trees T with n ≥ 2 leaves is n − O(1).
Proof. We use the expression (5.5) for the average sensitivity of ΠT, where v ranges over all nodes of degree 2 such that Tv contains at least three leaves. Consider a node v of degree 2 such that Tv contains mv,1 leaves on one side and mv,2 leaves on the other side, where mv,1 + mv,2 ≥ 3. If we choose the ranking σ and the adjacent-rank transposition τ uniformly at random, each of the n(n − 1)/2 pairs of leaves is equally likely to be the affected pair. As there are mv,1 · mv,2 choices that result in v as their lowest common ancestor, we have that Pr[v = LCA(ℓlo, ℓhi)] = 2mv,1mv,2/(n(n − 1)). Combining this with the root sensitivity lower bound given by (5.9), we have that

s(ΠT) ≥ (n − 1) · Σv (2mv,1mv,2/(n(n − 1))) · (1 − c/√(mv,1mv,2(mv,1 + mv,2)))
      = (n − 1) − (2c/n) · Σv √(mv,1mv,2/(mv,1 + mv,2)).
The following claim then completes the proof.
Claim 5.8. There is a constant c′ such that for all binary trees T with n leaves

Σv √(mv,1mv,2/(mv,1 + mv,2)) ≤ c′n,   (5.10)

where the sum ranges over all nodes v of degree 2 such that Tv contains at least 3 leaves.
Proof of Claim 5.8. We use structural induction to prove a somewhat stronger claim, namely that

Σv √(mv,1mv,2/(mv,1 + mv,2)) ≤ c′n − d′√n   (5.11)

for some constants c′ and d′ to be determined. As the base case we consider binary trees T with at most two leaves. In this case, the left-hand side of (5.11) is zero and the right-hand side is non-negative provided c′ ≥ d′, so (5.11) holds.
For the inductive step, the case where the root of T has degree 1 immediately follows from the inductive hypothesis for the subtree Tu rooted by the child u of the root of T. The remaining case is where the root of T has degree 2. Let u1 and u2 be the two children of the root, n1 = |L(Tu1)|, and n2 = |L(Tu2)|. The sum on the left-hand side of (5.11) has three contributions: √(n1n2/(n1 + n2)) from the root, and the contributions from Tu1 and Tu2, to which we can individually apply the inductive hypothesis. This gives us an upper bound of

√(n1n2/(n1 + n2)) + (c′n1 − d′√n1) + (c′n2 − d′√n2) = √(n1n2/(n1 + n2)) + c′(n1 + n2) − d′(√n1 + √n2),

which we want to upper bound by

c′n − d′√n = c′(n1 + n2) − d′√(n1 + n2).

Writing n1 = αn for some α ∈ [0, 1] and rearranging terms, the upper bound holds if and only if

√(α(1 − α)) ≤ d′(√α + √(1 − α) − 1).

We claim that the upper bound holds for d′ = (√2 + 1)/2. Let

F(α) = d′(√α + √(1 − α) − 1) − √(α(1 − α)).

It suffices to show that F(α) ≥ 0. Since F is continuous on [0, 1], it attains a minimum on [0, 1]. On (0, 1), F is differentiable. It can be verified that F′ has a unique zero in (0, 1/2), which needs to be a maximum as F is increasing at α = 0. By the symmetry F(α) = F(1 − α), it follows that the minimum of F on [0, 1] is attained at the midpoint α = 1/2 or at one of the endpoints α = 0 or α = 1. At all three points F(α) = 0. We conclude that (5.11) holds for any constants d′ ≥ (√2 + 1)/2 and c′ ≥ d′.
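The final inequality can also be probed numerically. A quick grid check (tolerances are our own choices) of F with d′ = (√2 + 1)/2:

```python
from math import sqrt

d_prime = (sqrt(2) + 1) / 2

def F(alpha):
    """F(alpha) = d'(sqrt(a) + sqrt(1-a) - 1) - sqrt(a(1-a))."""
    return (d_prime * (sqrt(alpha) + sqrt(1 - alpha) - 1)
            - sqrt(alpha * (1 - alpha)))

# F vanishes at alpha = 0, 1/2, 1 and stays nonnegative in between.
grid = [i / 1000 for i in range(1001)]
assert all(F(a) > -1e-12 for a in grid)
assert abs(F(0.5)) < 1e-12
print("F >= 0 on a grid over [0, 1] with d' = (sqrt(2)+1)/2")
```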
6 Sensitivity approach for bounded error
In this section, we apply the sensitivity approach to obtain lower bounds on the query complexity of problems in the comparison-query model against randomized algorithms with bounded error. We derive a general result showing that query lower bounds against deterministic algorithms that are based on the Sensitivity Lemma also hold against bounded-error randomized algorithms with a small loss in strength. The approach works particularly well when we have linear lower bounds on the average sensitivity, in which case there is only a constant-factor loss in the strength of the query lower bound. Among others, this applies to the Ω(n log n) query lower bound for inversion minimization on trees of bounded degree.
Our approach is based on Yao's minimax principle [29], which lower bounds worst-case complexity against randomized algorithms with bounded error by average-case complexity against deterministic algorithms with bounded distributional error. We view a deterministic algorithm with small distributional error for a problem Π as an exact deterministic algorithm for a slightly modified problem Π′. The idea is to then apply the sensitivity approach to Π′, and capitalize on the closeness of the average sensitivities of Π and Π′ to obtain a lower bound in terms of the sensitivity of Π. By using the Sensitivity Lemma as a black box, the approach yields a lower bound on the query complexity of bounded-error algorithms that is worst-case with respect to the input and with respect to the randomness, i.e., the lower bound holds for some input and some computation path on that input. By delving into the proof of the Sensitivity Lemma, we are able to obtain a lower bound that is worst-case with respect to the input but average-case with respect to the randomness, i.e., the lower bound holds for the expected number of queries on some input.¹
We first define the notions of randomized complexity and distributional complexity.

Definition 6.1 (randomized query complexity, RQ·(·), and distributional query complexity, DistQ·(·, ·)). Let Π be a problem in the comparison-query model and ε ∈ [0, 1].

A randomized algorithm R for Π is said to have error ε if on every input σ, the algorithm outputs Π(σ) with probability at least 1 − ε. The query complexity of R is the maximum, over all inputs σ, of the expected number of queries that R makes on input σ. The ε-error randomized query complexity of Π, denoted RQε(Π), is the minimum query complexity over all ε-error randomized algorithms R for Π.

Let μ be a probability distribution on the inputs σ. A deterministic algorithm A for Π has error ε with respect to μ if the probability that A(σ) = Π(σ) is at least 1 − ε when the input σ is chosen according to μ. The query complexity of A with respect to μ is the expected number of queries that A makes on input σ when σ is chosen according to μ. The ε-error distributional query complexity of Π with respect to μ, denoted DistQε(Π, μ), is the minimum query complexity with respect to μ over all deterministic algorithms A for Π that have error ε with respect to μ.
The relationship between randomized complexity and distributional complexity is described by Yao's principle.

Lemma 6.2 (Yao's minimax principle [29]). Let Π be a problem in the comparison-query model, ε ∈ [0, 1/2], and μ a distribution on the inputs σ. Then

RQε(Π) ≥ (1/2) · DistQ2ε(Π, μ).   (6.1)
We now prove lower bounds on the distributional query complexity, and thus on the randomized query complexity, of comparison-query problems Π based on average sensitivity bounds. For these bounds, we always set μ to be the uniform distribution, the distribution underlying the notion of average sensitivity.

We start by studying average-case query complexity, i.e., zero-error distributional query complexity, and its relationship to the average sensitivity. We follow a strategy similar to the

¹ In fact, the approach yields a lower bound that is average-case with respect to the input (chosen uniformly at random) as well as the randomness. This follows because the proof of Yao's minimax principle allows us to replace the left-hand side of (6.1) by the average of the expected number of queries with respect to the distribution μ, which we pick to be uniform in our application of the principle.
one in the proof of the Sensitivity Lemma. Whereas a bound on the deterministic complexity Q follows purely from the number of execution traces D, here the execution traces are weighted by their depth and their probability of occurring.

Recall that f in the statement of the Strong Sensitivity Lemma denotes any convex function f : [1, ∞) → ℝ with f(x) = x! for x ∈ [n]; for deriving the Sensitivity Lemma from the Strong Sensitivity Lemma we also need f to be nondecreasing. One such function is f(x) = Γ(x + 1). To prove a lower bound on the zero-error distributional complexity, we need the function f to be not only convex, but log-convex, i.e., log2 f(x) needs to be convex. The function f(x) = Γ(x + 1) satisfies this constraint as well.
Proposition 6.3. Let Π be a problem in the comparison-query model with n items, μ the uniform distribution on the inputs σ, and f : [1, ∞) → ℝ a nondecreasing log-convex function with f(x) = x! for x ∈ [n]. Then

DistQ0(Π, μ) ≥ log2(f(s(Π) + 1)/n).
Proof. Let D be the number of distinct execution traces of a deterministic algorithm A for Π, and let R1, . . . , RD denote the corresponding sets of rankings. Interpreting A as a binary decision tree, let d(Ri) be the depth of the execution trace corresponding to Ri. By Kraft's inequality,

Σ_{i=1}^{D} 2^{−d(Ri)} ≤ 1.

Let g(x) = (1/n) · f(x + 1)/n! and define the weight function w(σ) = g(deg_H(σ)), where H refers to the notation of the Strong Sensitivity Lemma: H denotes the subgraph of the full permutahedron that only consists of the bichromatic edges when the vertices are colored with their execution trace under A. By Claim 3.3, the sum of the weights of all rankings σ in Ri is at most 1. Therefore,

Σ_σ 2^{−d(σ)} w(σ) ≤ 1.

Dividing both sides by n! and taking the logarithm of both sides, we get that

log2 E[2^{−d(σ)} w(σ)] ≤ log2(1/n!),   (6.2)

where the expectation is with respect to a uniform distribution over the inputs σ. By Jensen's inequality, since log is concave, we get

log2 E[2^{−d(σ)} w(σ)] ≥ E[log2(2^{−d(σ)} w(σ))] = E[−d(σ)] + E[log2 w(σ)],

which, in combination with (6.2), implies

E[log2 w(σ)] ≤ E[d(σ)] + log2(1/n!).

Note that since f is log-convex, so is g. By applying Jensen's inequality again,

E[log2 w(σ)] = E[log2 g(deg_H(σ))] ≥ log2 g(E[deg_H(σ)]),

implying

log2((1/n) · f(E[deg_H(σ)] + 1)/n!) ≤ E[d(σ)] + log2(1/n!),

or equivalently,

E[d(σ)] ≥ log2(f(E[deg_H(σ)] + 1)/n).

The result follows since A is an arbitrary deterministic algorithm for Π, E[d(σ)] equals the query complexity of A with respect to the uniform distribution μ, E[deg_H(σ)] ≥ s(Π) (every edge that is bichromatic under the Π-coloring is also bichromatic under the execution-trace coloring, which refines it), and f is nondecreasing.
Proposition 6.3 allows us to prove a lower bound on the ε-error distributional query complexity of Π with respect to the uniform distribution. In order to do so, we view a deterministic algorithm with distributional error ε for Π as an exact deterministic algorithm for a modified problem Π′, apply Proposition 6.3, and lower bound the sensitivity of Π′ in terms of the sensitivity of Π.

Proposition 6.4. Let Π be a problem in the comparison-query model with n items, μ the uniform distribution on the inputs σ, ε ∈ [0, 1], and f : [1, ∞) → ℝ a nondecreasing log-convex function with f(x) = x! for x ∈ [n]. Then

DistQε(Π, μ) ≥ log2(f(s(Π) + 1 − 2(n − 1)ε)/n).   (6.3)
Proof. Consider any algorithm A with error ε for Π, or in other words, Pr[A(σ) ≠ Π(σ)] ≤ ε. Let Π_A be the problem of determining the output of A. We prove that

s(Π_A) ≥ s(Π) − 2(n − 1)ε,

which implies the desired result by Proposition 6.3, since A is a deterministic algorithm for Π_A and f is nondecreasing.

Let G denote the full permutahedron graph for n items. We use the fact that s(Π) = (n − 1) · Pr_{e∈G}[e ∉ G(Π)], and similarly, s(Π_A) = (n − 1) · Pr_{e∈G}[e ∉ G(Π_A)], where all the underlying distributions are uniform. Suppose the endpoints of e are σ1 and σ2. Note that if e ∈ G is picked uniformly at random, then the marginal distributions of both σ1 and σ2 are also uniform. If A(σ1) = Π(σ1), A(σ2) = Π(σ2), and e ∈ G(Π_A), then e ∈ G(Π) as well. By a union bound, the probability that A(σ1) ≠ Π(σ1) or A(σ2) ≠ Π(σ2) is at most 2ε. Therefore,

Pr_{e∈G}[e ∉ G(Π_A)] ≥ Pr_{e∈G}[e ∉ G(Π)] − Pr_{e∈G}[A(σ1) ≠ Π(σ1) or A(σ2) ≠ Π(σ2)]
                     ≥ Pr_{e∈G}[e ∉ G(Π)] − 2ε.

Multiplying both sides by n − 1 gives s(Π_A) ≥ s(Π) − 2(n − 1)ε.
Since s(Π) ≤ n − 1, Proposition 6.4 only yields nontrivial lower bounds for small ε. In order to establish lower bounds for the standard ε = 1/3, we first reduce the error using standard techniques. Doing so such that the argument of f on the right-hand side of (6.3) remains Ω(s(Π)), and picking f(x) = Γ(x + 1), we conclude:

Lemma 6.5 (Bounded-Error Sensitivity Lemma). For any problem Π in the comparison-query model with n items,

RQ1/3(Π) = Ω(s log(s)/log(2n/s)),

where s = s(Π).
Proof. By taking the majority vote of multiple independent runs and a standard analysis, e.g., based on Chernoff bounds, we have that RQε(Π) = O(log(1/ε)) · RQ1/3(Π) for any ε ≤ 1/3. Combining this with Lemma 6.2 and Proposition 6.4, we have:

RQ1/3(Π) = Ω(RQε(Π)/log(1/ε)) = Ω(DistQ2ε(Π, μ)/log(1/ε)) = Ω(log(f(s + 1 − 4(n − 1)ε)/n)/log(1/ε)).

Setting ε such that 4nε = s/2 yields

RQ1/3(Π) = Ω(log(f(s/2 + 1)/n)/log(8n/s)).

Picking f(x) = Γ(x + 1) and using the fact that Γ(x + 1) ≥ √(2πx) · (x/e)^x, we obtain

RQ1/3(Π) = Ω(((s/2) log(s/(2e)) − log n)/log(8n/s)) = Ω(s log(s)/log(2n/s)),

where the simplification can be verified by considering the cases of large s (say s ≥ √n) and small s separately.
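The error-reduction step invoked at the start of the proof is plain majority voting. A small simulation, with a hypothetical 1/3-error oracle standing in for an arbitrary bounded-error algorithm, illustrates the exponential decay that the Chernoff-bound analysis guarantees:

```python
import random

def one_run(p_err, rng):
    """One run of a hypothetical bounded-error algorithm: returns the
    correct answer (encoded as 1) with probability 1 - p_err."""
    return 1 if rng.random() >= p_err else 0

def empirical_error(t, p_err, trials, rng):
    """Empirical error rate of the majority vote over t independent runs."""
    wrong = 0
    for _ in range(trials):
        votes = sum(one_run(p_err, rng) for _ in range(t))
        wrong += 1 if votes <= t / 2 else 0
    return wrong / trials

rng = random.Random(0)
for t in (1, 5, 15):
    print(t, empirical_error(t, 1/3, 20000, rng))
```

With t = O(log(1/ε)) repetitions the majority is wrong with probability at most ε, which is exactly the overhead factor appearing in the proof.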
We can apply Lemma 6.5 to the sensitivity lower bounds of Lemma 4.1 and produce randomized lower bounds for inversion minimization on bounded-degree trees. Using Fact 4.5 we obtain:

Theorem 6.6 (lower bound against bounded-error algorithms for inversion minimization on trees). Let T be a tree with deg(T) ≤ d. The query complexity of ΠT for bounded-error randomized algorithms is

Ω(n log(n/d)/(d log d)).
7 Connectivity Lemma
In this section we establish Lemma 1.16 and use it to present some of the known lower bounds
in a unified framework. We actually prove the following somewhat stronger result.
Lemma 7.1 (Strong Connectivity Lemma). Consider an algorithm A in the comparison-based model, color each vertex of the permutahedron with its execution trace under A, and let H denote the subgraph with the same vertex set but only containing the monochromatic edges. The number of distinct execution traces of A equals the number of connected components of H.
The Connectivity Lemma follows from Lemma 7.1 because the coloring with execution traces of an algorithm A for Π is a refinement of the coloring with Π. Note that the counterpart of Lemma 7.1 in the Boolean setting is trivial. This is because an execution trace in the Boolean setting is specified by values for a subset of the input bits, so the set of inputs that follow a particular execution trace forms a subcube of the hypercube, the Boolean counterpart of the permutahedron. Subcubes are trivially connected inside the hypercube. In the comparison-query model, the sets of inputs that follow a particular execution trace can be more complicated, and their connectedness is no longer trivial but still holds.
Proof of Lemma 7.1. Two rankings σ1 and σ2 that have distinct execution traces under A cannot be connected because any path between them needs to contain at least one bichromatic edge. For the remainder of the proof, we consider two rankings σ1 and σ2 that have the same execution trace under A, and construct a path from σ1 to σ2 in H.

If σ1 = σ2, we do not need to make any move and use an empty path.

Otherwise, there exists a rank i such that σ1 and σ2 agree on ranks less than i and disagree on rank i. We have the following situation, where the item y_i with rank i under σ2 has rank j > i under σ1:

rank    1 · · · i−1        i · · · j−1        j · · · n
σ1⁻¹    x_1 · · · x_{i−1}   x_i · · · x_{j−1}   x_j = y_i · · ·
σ2⁻¹    y_1 · · · y_{i−1}   y_i · · ·

where x_m = y_m for m ∈ [i − 1] and x_i ≠ y_i.

Considering ranking σ1, we have that σ1(x_{j−1}) = j − 1 < j = σ1(x_j). Considering ranking σ2, since x_{j−1} differs from y_m = x_m for every m ∈ [i − 1] and also differs from y_i, we have that σ2(x_{j−1}) > i = σ2(y_i) = σ2(x_j). Thus, the relative ranks of x_{j−1} and x_j under σ1 and σ2 differ. As σ1 and σ2 have the same execution trace, this means that the algorithm does not compare x_{j−1} and x_j on either input, and on σ1 in particular. Let σ1′ be the ranking obtained from the ranking σ1 by applying the adjacent-rank transposition τ = (j − 1, j). Since the algorithm does not compare the affected items, the execution traces for σ1′ and σ1 are the same, so the edge from σ1′ to σ1 is monochromatic and in H. We use this edge as the first on the path from σ1 to σ2 in H. What remains is to find a path from σ1′ to σ2 in H. The situation is the same as the one depicted above but with i increased by one in case j = i + 1, and with the same i and j decreased by one otherwise. The proof then follows by induction on the ordered pair (i, j − i).
Remark 7.2. Suppose we allow an algorithm A to have multiple valid execution traces on a given input σ, and let R denote the set of rankings on which a particular execution trace is valid. The construction in the proof of Lemma 7.1 yields a path in the permutahedron between any two rankings in R such that the path stays entirely within R. This means that we can replace D(Π) in the statement of the Connectivity Lemma by its nondeterministic variant N(Π).
The Connectivity Lemma captures all the prior lower bounds stated in Section 2 except the elementary adversary argument (which is also based on connectivity considerations, but in an undirected graph other than H, namely (X, E), where E denotes the queries the algorithm makes on a given input ranking σ). It captures the information-theoretic lower bound because input rankings with different outputs cannot belong to the same connected component of H. We already explained in Section 1 how the Connectivity Lemma shows that counting inversions and inversion parity amount to sorting, and require at least log(n!) queries. We now illustrate its use for a classical problem that is easier than sorting, namely, median finding.
Let Π denote the selection problem with rank k = ⌈n/2⌉. For any ranking, the adjacent-rank transpositions τ that change the item with rank k are the two that involve rank k: τ = (k − 1, k) and τ = (k, k + 1). Those transpositions are the ones that correspond to missing edges in the permutahedron graph G(Π). As a result, for any two rankings, there exists a path between them in G(Π) if and only if they have the same median as well as the same set of items with rank less than k (and also the same set of items with rank greater than k). As there are n possibilities for the median and, for each median, C(n − 1, k − 1) possibilities for the set of items that have rank less than k, G(Π) has n · C(n − 1, k − 1) connected components. It follows that any algorithm for Π has at least n · C(n − 1, k − 1) = Ω(√n · 2^n) distinct execution traces, and therefore needs to make at least n + (1/2) log(n) − O(1) queries.
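The component count above can be verified exhaustively for small n. The sketch below (a union-find over the full permutahedron; the encoding is our own) treats the selection problem with n = 5 and k = 3:

```python
from itertools import permutations
from math import comb

n, k = 5, 3                      # median rank k = ceil(n/2)

def median_output(perm):
    """perm[i] = item with rank i+1; the selection answer is the rank-k item."""
    return perm[k - 1]

# Build G(Pi): vertices are rankings; join sigma and sigma*tau for each
# adjacent-rank transposition tau that leaves the output unchanged.
verts = list(permutations(range(n)))
index = {p: i for i, p in enumerate(verts)}
parent = list(range(len(verts)))

def find(i):
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i

def union(i, j):
    parent[find(i)] = find(j)

for p in verts:
    for i in range(n - 1):
        q = list(p)
        q[i], q[i + 1] = q[i + 1], q[i]
        q = tuple(q)
        if median_output(p) == median_output(q):
            union(index[p], index[q])

components = len({find(i) for i in range(len(verts))})
print(components, n * comb(n - 1, k - 1))   # → 30 30
```

The log2 of the component count then gives the query lower bound via the Connectivity Lemma.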
As a side note, this example clarifies a subtlety in the equivalence between ordinary selection and the instantiation of partial order production that is considered equivalent to selection. Whereas selection of rank k ordinarily requires outputting only the item of rank k, the instantiation of partial order production additionally requires partitioning the remaining items according to whether their ranks are less than or greater than k. The above analysis implies that it is impossible for the algorithm to know the item of rank k without also knowing how to partition the remaining items into those of rank less than and greater than k. It follows that, in the comparison-based model, ordinary selection and this instantiation of partial order production are equivalent.
8 Connectivity approach
This section covers the connectivity approach for obtaining query lower bounds in the
comparison-query model. Our main focus is the problem Ξ π of inversion minimization
on a fixed tree π, for which we derive very strong query lower bounds in the case of the special
types of trees in Theorem 1.8. Some parts of the analysis carry through for a broader class of
problems Ξ , namely those that satisfy a certain partition property. We first develop the property
and apply the Connectivity Lemma to a general problem Ξ with the property. We then present
sufficient conditions for the problem Ξ π to have the property and perform a detailed analysis,
leading to Theorem 1.8. Finally, we apply the same ideas to the problem of counting cross
inversions, for which we obtain the query lower bound of Theorem 1.9, as well as to the closely
related problem of inversion minimization on the MannβWhitney trees of Figure 2.
8.1 Partition property
In order to obtain good lower bounds on D(Π) using the Connectivity Lemma, it is sufficient to find good upper bounds on the size of the connected components of a typical vertex in G(Π). For the problem ΠT, we can assume without loss of generality that T has no internal nodes of degree 1, i.e., no nodes with exactly one child. With that assumption, ΠT is insensitive to any adjacent-rank transposition τ at a ranking σ for which the affected leaves are siblings in T. Thus, the corresponding edges from the permutahedron are always present in G(ΠT). From the perspective of ensuring small connected components in G(ΠT), the ideal situation would be if there were no other edges in G(ΠT), that is to say, if ΠT were sensitive at σ to every adjacent-rank transposition τ except when the affected leaves are siblings. We will investigate conditions on T that guarantee this situation in the next two subsections. In this subsection, we analyze the size of the connected components of G(ΠT) when T is of the desired type, and use it to obtain a query lower bound via the Connectivity Lemma. Our analysis applies more generally to any problem Π with the following property.
Definition 8.1 (partition property). A computational problem Π in the comparison-query model on a set X of n items has the partition property if the set X can be partitioned into sets Xj such that for any ranking σ of X and adjacent-rank transposition τ = (i, i + 1) with i ∈ [n − 1], Π(σ) = Π(στ) if and only if σ⁻¹(i) and σ⁻¹(i + 1) belong to the same partition class Xj. If every partition class Xj has size at most k, we say that Π has the partition property with class size at most k.
In other words, a problem Ξ has the partition property if the underlying universe can be
partitioned in such a way that adjacent-rank transpositions that do not change the answer are
exactly those whose affected items fall within the same partition class. In the case of the problem
Ξ π , the partition classes ππ correspond to the leaf child sets LC(π£) from Definition 1.7, where π£
ranges over the leaf parents.
Let us investigate the size of the connected components of πΊ(Ξ ) when Ξ satisfies the partition
property. Consider a walk in πΊ(Ξ ). As the only steps we can take correspond to adjacent-rank
transpositions π that swap elements in the same partition class, the sets ππ remain invariant,
irrespective of the ranking π we start from. Depending on π, there may be more structure inside
each partition class ππ ; the set ππ may be broken up into smaller subsets that are each invariant.
For our analysis, we list the elements of each partition class in order of increasing rank under
π, and include an edge between elements that have successive ranks. We introduce the term
"successor graph" to capture this structure, viewed as a graph with the ranks as vertices.
Definition 8.2 (successor graph, π(Β·, Β·)). Let Ξ be a computational problem in the comparison-
query model on a set π of π items, and π a ranking of π. The successor graph of Ξ on π, denoted
π(Ξ , π), has vertex set [π] and contains all edges of the form (π, π + 1) with π β [π β 1] such that
Ξ (π) = Ξ (ππ), where π denotes the adjacent-rank transposition (π, π + 1).
We have the following connection.
Theory of Computing, Volume 20 (7), 2024, pp. 1–62
Proposition 8.3. Let Π be a computational problem in the comparison-query model on the set X, and let π be a ranking of X. If Π has the partition property, then the connected component of π in G(Π) has size ∏_j (c_j!), where the c_j's denote the sizes of the connected components of S(Π, π).
Proof. The connected components of the successor graph π(Ξ , π) correspond to subsets of
the classes ππ that each remain invariant under walks in πΊ(Ξ ). Within each of the subsets,
independently for each subset, every possible ordering can be realized by such walks. This is
because for any adjacent-rank transposition π, the successor graphs π(Ξ , π) and π(Ξ , ππ) are
the same, and every ordering can be realized by a sequence of swaps of adjacent elements. It
follows that the number of rankings that can be reached from π in πΊ(Ξ ) equals the product over
all connected components of π(Ξ , π) of the number of possible orderings of the elements in the
connected component.
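Proposition 8.3 can be checked by brute force on a small instance. The sketch below (illustrative Python, not from the paper; the problem Π is taken to be cross-inversion counting with partition classes A and B, an ad-hoc choice) builds the permutahedron graph G(Π) over all rankings of five items and compares the size of the component of a ranking π with the product of the factorials of the component sizes of the successor graph S(Π, π):

```python
from itertools import permutations
from math import factorial

# Universe: items 0..4; partition classes A = {0,1,2}, B = {3,4}.
A, B = {0, 1, 2}, {3, 4}
items = sorted(A | B)
n = len(items)

def value(order):
    """Pi(order): number of cross inversions, i.e. pairs where a
    B-item is ranked below an A-item (one fixed convention)."""
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if order[i] in B and order[j] in A)

# A ranking is represented by `order`, the items listed by increasing rank.
rankings = list(permutations(items))

def neighbors(order):
    """Adjacent-rank transpositions that leave the answer unchanged:
    these are exactly the edges of the permutahedron graph G(Pi)."""
    for j in range(n - 1):
        swapped = list(order)
        swapped[j], swapped[j + 1] = swapped[j + 1], swapped[j]
        swapped = tuple(swapped)
        if value(swapped) == value(order):
            yield swapped

def component_size(start):
    """BFS over G(Pi) from the ranking `start`."""
    seen, frontier = {start}, [start]
    while frontier:
        cur = frontier.pop()
        for nb in neighbors(cur):
            if nb not in seen:
                seen.add(nb)
                frontier.append(nb)
    return len(seen)

def successor_component_sizes(order):
    """Sizes c_j of the connected components of the successor graph
    S(Pi, order): maximal runs of successive ranks in one class."""
    sizes, run = [], 1
    for j in range(n - 1):
        if (order[j] in A) == (order[j + 1] in A):
            run += 1
        else:
            sizes.append(run)
            run = 1
    sizes.append(run)
    return sizes

# Proposition 8.3: |component of pi in G(Pi)| = prod_j (c_j!).
for order in rankings[:10]:            # spot-check a few rankings
    predicted = 1
    for c in successor_component_sizes(order):
        predicted *= factorial(c)
    assert component_size(order) == predicted
```

The check is exhaustive in spirit: restricting to the first ten rankings only keeps the running time small.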
Figure 10 depicts an example for a problem of type Ξ π and a partition consisting of 4 classes,
namely the leaf child sets LC1 , LC2 , LC3 and LC4 . The tree π and ranking π are represented in
Figure 10a. Figure 10b represents the part of the successor graph π(Ξ π , π) involving the leaf
child set LC3 and illustrates the subpartitioning into invariant subsets.
Figure 10: Connected component analysis. (a) Leaf child sets LC1, LC2, LC3, LC4 and the leaf ranks under π; (b) inside leaf child set LC3: the corresponding part of the successor graph S(Π_T, π) and its subpartitioning into invariant subsets.
If each of the partition classes S_i has size at most k, then each of the connected components of S(Π, π) has size c_j ≤ k, irrespective of π. The maximum value that ∏_j (c_j!) can take under the constraints ∑_j c_j = n and c_j ≤ k is no more than (k!)^{n/k}. By the Connectivity Lemma, we conclude that D(Π) ≥ n!/(k!)^{n/k}, and that the query complexity is at least log₂(n!) − O(n log(k)).
We can do better by observing that, for a random ranking π, the number of adjacent-rank
transpositions π that do not jump from one partition class to another is not much larger than the
average size of the partition classes.
Lemma 8.4 (lower bound for problems with the partition property). Let Π be a computational problem in the comparison-query model on a set of size n. If Π satisfies the partition property with class size at most k, then D(Π) ≥ n!/(2(k!)²).
Proof. For any rank j ∈ [n − 1], the probability that π⁻¹(j) and π⁻¹(j + 1) belong to the same partition class equals ∑_i |S_i|(|S_i| − 1) / (n(n − 1)), which is at most (k − 1)/(n − 1) provided each partition class S_i has size at most k. It follows that the expected number of adjacent-rank transpositions τ that do not change partition class is at most k − 1, so for a fraction at least half of the rankings π the number is at most 2(k − 1).
The number of adjacent-rank transpositions τ that do not change partition class for a given ranking π equals the number of edges in the successor graph S(Π, π). In terms of the sizes c_j, this number equals ∑_j (c_j − 1). We are considering rankings π for which this sum is at most 2(k − 1). The maximum of ∏_j (c_j!) under the constraints that ∑_j (c_j − 1) ≤ 2(k − 1) and that each individual c_j ≤ k is reached when two of the c_j's equal k and the rest equal 1. Thus, if each of the partition classes S_i is of size at most k, then for a fraction at least half of the rankings π, the size of the connected component of π in G(Π) is at most (k!)². It follows that the number of connected components of G(Π) is at least n!/(2(k!)²). The Connectivity Lemma then yields the claimed lower bound on D(Π).
Lemma 8.4 yields a lower bound of log₂(n!) − O(k log(k)) on the query complexity of Π whenever Π satisfies the partition property with class size at most k.
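As a sanity check on the counting in the proof of Lemma 8.4 (an illustrative experiment with an ad-hoc choice of classes, not part of the paper), one can enumerate all rankings of a small universe with classes of size at most k and confirm that at least half of them have at most 2(k − 1) same-class adjacencies, and hence sit in components of size at most (k!)²:

```python
from itertools import permutations

# Universe of n = 6 items in three classes of size k = 2 each.
classes = [{0, 1}, {2, 3}, {4, 5}]
cls = {x: i for i, c in enumerate(classes) for x in c}
items = sorted(cls)
k = max(len(c) for c in classes)

def same_class_adjacencies(order):
    """Edges of the successor graph: successive ranks in one class."""
    return sum(1 for a, b in zip(order, order[1:]) if cls[a] == cls[b])

total = good = 0
for order in permutations(items):
    total += 1
    if same_class_adjacencies(order) <= 2 * (k - 1):
        good += 1

# At least half of the rankings have few same-class adjacencies.
assert good >= total // 2
print(good, total)
```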
Next we turn to sufficient conditions on the tree π that guarantee the partition property for
Ξ π . For didactic reasons we first develop the conditions for binary trees, and then generalize
them to arbitrary trees.
8.2 Binary trees
In the case of binary trees π, the sensitivity analysis of Section 5.1 leads to a simple sufficient
condition for the partition property to hold for Ξ π . Recall that we are assuming without loss of
generality that π has no internal nodes of degree 1, which in the case of binary trees is equivalent
to saying that the tree is full: every internal node has exactly two children.
Consider criterion 5.3 in Proposition 5.2. The right-hand side is always β1. As for the
left-hand side, we know the following.
Fact 8.5. For all disjoint sets A, B ⊆ X and any ranking π of X, DInv_π(A, B) ≡ |A| · |B| (mod 2).

Proof. As every pair in A × B constitutes a cross-inversion either from A to B or from B to A, we have XInv_π(A, B) + XInv_π(B, A) = |A| · |B|. Thus,

DInv_π(A, B) = XInv_π(A, B) − XInv_π(B, A)
             = (XInv_π(A, B) + XInv_π(B, A)) − 2 XInv_π(B, A)
             = |A| · |B| − 2 XInv_π(B, A).   (8.1)

As XInv_π(B, A) is an integer, the claim follows.
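Fact 8.5 is easy to confirm exhaustively on small sets; the following check (illustrative only, with one fixed convention for the direction of cross inversions) verifies the parity claim for every interleaving of A and B:

```python
from itertools import permutations

def xinv(order, X, Y):
    """Cross inversions from X to Y: pairs (x, y) with x ranked above y."""
    pos = {v: i for i, v in enumerate(order)}
    return sum(1 for x in X for y in Y if pos[x] > pos[y])

A, B = {0, 1, 2}, {3, 4}
for order in permutations(sorted(A | B)):
    dinv = xinv(order, A, B) - xinv(order, B, A)
    # DInv(A, B) has the same parity as |A| * |B|.
    assert dinv % 2 == (len(A) * len(B)) % 2
```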
Fact 8.5 implies that whenever at least one of the leaf sets πΏlo or πΏhi is of even cardinality,
then (5.3) fails to hold, and Ξ π is sensitive to the underlying π at π. Thus, we can guarantee that
Ξ π satisfies the partition property provided that for any two siblings π’1 and π’2 in π that are
not both leaves, at least one of |L(ππ’1 )| or |L(ππ’2 )| is even. We refer to the latter condition as the
product condition. In trees without nodes of degree 1, the product condition can be expressed
alternatively in terms of the leaf child sets. We state and prove the result for arbitrary trees as it
will help us in the next subsection to generalize the analysis.
Proposition 8.6. Let π be a tree without nodes of degree 1. The following two conditions are equivalent:
(a) For any two siblings π’1 and π’2 that are not both leaves, at least one of |L(ππ’1 )| or |L(ππ’2 )| is even.
(b) At most one leaf child set is odd, and if there exists a node π£ β with an odd leaf child set LC(π£ β ), then
all ancestors of π£ β have an empty leaf child set.
In the case of binary trees, (b) can be simplified to: At most one leaf has a non-leaf sibling.
Proof. We establish the two directions of implication separately.
(a) ⇒ (b): We argue the contrapositive. Suppose that at least two of the leaf child sets are odd. Start
with the root of π as the node π£, and iterate the following: If π£ has a child π’ such that ππ’ contains
at least two nodes with an odd leaf child set, replace π£ by such a child π’. When the process
ends, one of the following situations applies:
β’ There are two distinct children π’1 and π’2 of π£ that are not leaves and each contain a single
node with an odd leaf child set. In this case both ππ’1 and ππ’2 contain an odd number of
leaves, violating (a).
β’ There exists a unique child π’1 of π£ that is not a leaf and contains a single node with an odd
leaf child set, and π£ itself has an odd number of leaf children. In this case, setting π’2 to
any one leaf child of π£ (which exists as their number is odd), leads to a violation of (a).
Next, suppose that there exists a unique node π£ β that has an odd leaf child set, and that an
ancestor π£ of π£ β has a leaf child π’1 . Setting π’2 to the child of π£ that contains π£ β in its subtree,
yields a violation of (a) as ππ’2 contains an odd number of leaves.
(b) ⇒ (a): If neither u₁ nor u₂ is a leaf, the first condition of (b) guarantees that at most one of T_{u₁} or T_{u₂} contains an odd number of leaves. If u₁ is a leaf and u₂ is not, then the second condition of (b) implies that T_{u₂} cannot contain a node with an odd leaf child set, and therefore has an even number of leaves.
In the case of binary trees, the first condition of (b) implies the second one, which can
therefore be dropped from the equivalence statement. Moreover, for binary trees the first
condition of (b) can be expressed as: At most one leaf has a non-leaf sibling.
By Proposition 8.6, in a binary tree the condition that at most one leaf has a non-leaf sibling is equivalent to the product condition, which implies the partition property of Π_T, so the lower bound of Lemma 8.4 applies. This establishes Theorem 1.8 in the case of binary trees.
As a side note, Fig. 11 shows the simplest example of a full binary tree π and a ranking π for
which there exists an adjacent-rank transposition π to which Ξ π is insensitive at π while the
affected leaves are not siblings. The tree has two leaves with non-leaf siblings, namely the leaves ranked 1 and 2. The adjacent-rank transposition (3, 4) acts on leaves that are not siblings, but leaves the minimum number of inversions at 4.
Figure 11: Tree insensitive to a non-sibling transposition (leaf ranks 1, 2, 3, 6, 4, 5).
8.3 General trees
For general trees π the sensitivity analysis of Ξ π becomes more complicated than for binary
trees, and we do not know of a simple sensitivity criterion like Proposition 5.2, but we can
nevertheless extend the result for binary trees to arbitrary trees with similar constraints. For a
given ranking π of L(π) and a given adjacent-rank transposition π, we would like to figure out
the effect of π on the objective MinInv(π, Β·), in particular when MinInv(π, ππ) = MinInv(π, π).
Recall the decomposition (4.3) of MinInv(π, Β·) from Section 4.2. By Proposition 4.7 the only term
on the right-hand side of (4.3) that can be affected by the transposition π is
MinRInv(T_v, π) = min_σ RInv(T_v, π, σ),
corresponding to the node π£ that is the least common ancestor LCA(β lo , βhi ) of the two leaves
β lo and β hi that are affected by π under π. In Section 5 we considered the two possible relative
orderings π1 and π2 of the children of π£, and derived a criterion for when the lowest cost does
not change under π. More precisely, when
min(RInv(T_v, π, σ₁), RInv(T_v, π, σ₂)) = min(RInv(T_v, πτ, σ₁), RInv(T_v, πτ, σ₂)).   (8.2)
There are two complications in generalizing this approach from binary to general trees.
• The expression (4.2) for RInv(T_v, π, σ) involves multiple terms instead of just one as in
(5.1). This complicates probabilistic analyses like the one we did in Section 5 because the
difference in cost of the two relative orderings of two children is also affected by parts of
the tree outside of their combined subtrees. The issue did not matter for the analysis in
Section 4. We will be able to manage it here, as well.
• There are now not just two but multiple possible orderings σ, and it is not clear which pairs (σ₁, σ₂) we need to impose (8.2) on in order to guarantee that MinRInv(T_v, π) = MinRInv(T_v, πτ) but no more.
In Section 4 we circumvented the second issue by only considering sensitivities that decrease
the objective function, and establishing a lower bound on their occurrence independent of
the ordering σ. Here we are also able to handle the second issue by aiming for a sufficient condition for sensitivity rather than an exact criterion. We do so by requiring that condition (8.2) hold for no pair of distinct orderings σ₁ and σ₂ (unless the two affected leaves are siblings). Similar
to the case of binary trees, we guarantee that (8.2) fails based on parity considerations given the
product condition.
For the analysis we again assume without loss of generality that π has no nodes of degree 1.
We use the same notation as in Section 5: Let ℓ_lo denote the affected leaf that is smaller with respect to π, and ℓ_hi the other affected leaf. Let L_i denote the leaf set L(T_{u_i}), where u₁, . . . , u_d are the children of v = LCA(ℓ_lo, ℓ_hi). We also write u_lo for the child of v that contains ℓ_lo in its subtree, and L_lo for the leaf set of the subtree rooted at u_lo, and define u_hi and L_hi similarly.
See Figure 12 for a sketch of the setting.
Figure 12: Sensitivity for general trees. The node v = LCA(ℓ_lo, ℓ_hi) has children u₁, . . . , u_d, with ℓ_lo in the subtree of u_lo and ℓ_hi in the subtree of u_hi.
We slightly abuse notation and let π denote both the ordering of the entire tree π as well as
the ranking of the children of π£. By the analysis of Section 5, we have that for two orderings π1
and σ₂ the situation (8.2) can only occur if σ₁(u_lo) < σ₁(u_hi), σ₂(u_lo) > σ₂(u_hi), and

RInv_π(T_v, π, σ₁) − RInv_π(T_v, π, σ₂) = −1 = DInv_π({ℓ_lo}, {ℓ_hi}).   (8.3)
For any two disjoint sets of leaves, (8.1) lets us write

XInv_π(A, B) = (1/2) DInv_π(A, B) + (1/2) |A| · |B|.   (8.4)
Applying (8.4) to all the terms involved in (4.2), we have

RInv_π(T_v, π, σ) = ∑_{1≤i<j≤d} XInv_π(L_i, L_j) · 1[σ(i) < σ(j)] + ∑_{1≤i<j≤d} XInv_π(L_j, L_i) · 1[σ(i) > σ(j)]
                  = (1/2) ∑_{1≤i<j≤d} DInv_π(L_i, L_j) · (−1)^{1[σ(i) > σ(j)]} + (1/2) ∑_{1≤i<j≤d} |L_i| · |L_j|,

so that

RInv_π(T_v, π, σ₁) − RInv_π(T_v, π, σ₂) = ∑_{1≤i<j≤d : σ₁(i)<σ₁(j), σ₂(i)>σ₂(j)} DInv_π(L_i, L_j) + ∑_{1≤i<j≤d : σ₁(i)>σ₁(j), σ₂(i)<σ₂(j)} DInv_π(L_j, L_i).
By combining the last equation with (8.3) and separating out the term for (i, j) = (lo, hi), we obtain the following necessary condition for (8.2) to hold:

DInv_π(L_lo, L_hi) − DInv_π({ℓ_lo}, {ℓ_hi}) = − ∑_{1≤i<j≤d : σ₁(i)<σ₁(j), σ₂(i)>σ₂(j), (i,j)≠(lo,hi)} DInv_π(L_i, L_j) + ∑_{1≤i<j≤d : σ₁(i)>σ₁(j), σ₂(i)<σ₂(j)} DInv_π(L_i, L_j).   (8.5)
In order for Ξ π to have the partition property, it suffices to ensure that (8.5) fails whenever
π’lo and π’hi are not both leaves. By (8.1) each of the terms DInvπ (πΏ π , πΏ π ) in (8.5) has the same
parity as |πΏ π | Β· |πΏ π |. Since DInvπ ({β lo }, {β hi }) is odd, it follows that (8.5) fails whenever at most
one of the leaf sets πΏ π involved is odd, which is condition (a) in Proposition 8.6. Switching to the
equivalent condition (b) from Proposition 8.6 allows us to conclude via Lemma 8.4:
Theorem 8.7. Let T be a tree without nodes of degree 1 such that the leaf child sets have size at most k, at most one of them is odd, and, if there exists an odd one, say LC(v*), then all ancestors of v* have empty leaf child sets. Then D(Π_T) ≥ n!/(2(k!)²).
Theorem 1.8 follows by taking the base-2 logarithm of the bound.
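The hypothesis of Theorem 8.7, i. e., condition (b) of Proposition 8.6, can be tested mechanically. The sketch below (hypothetical helper code with an ad-hoc child-list encoding of trees, not from the paper) checks the condition and evaluates the resulting bound n!/(2(k!)²):

```python
from math import factorial

# A tree is a dict: internal node -> list of children; leaves have no entry.
def leaves(tree, v):
    kids = tree.get(v, [])
    if not kids:
        return [v]
    return [leaf for c in kids for leaf in leaves(tree, c)]

def leaf_child_set(tree, v):
    """LC(v): children of v that are leaves."""
    return [c for c in tree.get(v, []) if not tree.get(c, [])]

def ancestors(tree, root, target):
    """Nodes on the root-to-target path, excluding target itself."""
    def path(v):
        if v == target:
            return [v]
        for c in tree.get(v, []):
            p = path(c)
            if p:
                return [v] + p
        return None
    return path(root)[:-1]

def satisfies_condition_b(tree, root):
    internal = [v for v in tree if tree.get(v)]
    odd = [v for v in internal if len(leaf_child_set(tree, v)) % 2 == 1]
    if len(odd) > 1:
        return False                     # more than one odd leaf child set
    if odd:
        v_star = odd[0]
        return all(not leaf_child_set(tree, a)
                   for a in ancestors(tree, root, v_star))
    return True

def theorem_8_7_bound(tree, root):
    """Lower bound n!/(2 (k!)^2) on D(Pi_T) when the condition holds."""
    assert satisfies_condition_b(tree, root)
    n = len(leaves(tree, root))
    k = max(len(leaf_child_set(tree, v)) for v in tree)
    return factorial(n) // (2 * factorial(k) ** 2)

# Example: root with two internal children, each having two leaf children.
T = {"r": ["u", "v"], "u": ["a", "b"], "v": ["c", "d"]}
print(theorem_8_7_bound(T, "r"))  # n = 4 leaves, k = 2
```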
8.4 Counting cross inversions and evaluating the Mann–Whitney statistic
We now apply the connectivity approach that we captured in Proposition 8.3 to the problem
Π_XInv of computing the number of cross inversions between two disjoint sets A and B with respect to a ranking π of X = A ⊔ B. Note that this problem is a refinement of evaluating the Mann–Whitney statistic, or equivalently, of inversion minimization on the tree T of Figure 2: Any algorithm that solves Π_XInv with q queries can be transformed into an algorithm for Π_T with q queries, namely by transforming the output y of the algorithm for Π_XInv into min(y, |A| · |B| − y).
Viewed in the contrapositive, a lower bound for Ξ XInv is easier to obtain than one for Ξ π on the
tree π of Figure 2. We first establish a lower bound for Ξ XInv and then see how it extends to Ξ π .
One can think of Π_XInv as inversion minimization on the Mann–Whitney tree without allowing the two children of the root to be swapped. As a result, the problem Π_XInv is sensitive to every adjacent-rank transposition between non-siblings, and therefore automatically satisfies the partition property (with A and B being the partition classes), so Proposition 8.3 applies. In contrast, the problem Π_T of minimizing inversions on the Mann–Whitney tree may not have the partition property. This is why analyzing Π_XInv is a bit simpler, and why we handle it first.
Let n = |A| and m = |B|. By the partition property, the average sensitivity of Π_XInv equals 2nm/(n + m), which via the Sensitivity Lemma yields a query lower bound of Ω(m log(n)) for m ≤ n. To obtain the stronger lower bound of Ω((n + m) log(m)) we need a more detailed analysis of the connectivity of the permutahedron graph G(Π_XInv).
For a given ranking π, let x₁, . . . , x_n be the elements of A listed in increasing order, and similarly y₁, . . . , y_m for the elements of B. We define a₁, . . . , a_{m+1} such that for each i, a_i is the number of elements of A between y_{i−1} and y_i. (Here, y₀ and y_{m+1} serve as sentinels with an infinitely low and infinitely high rank.) Similarly, we define b₁, . . . , b_{n+1} as the numbers of elements in B between successive elements of A. The numbers a_i and b_i are the sizes of the connected components of the successor graph S(Π_XInv, π) (possibly with some additional zeroes). By Proposition 8.3, the connected component of π in G(Π_XInv) has size

(a₁)! · · · (a_{m+1})! (b₁)! · · · (b_{n+1})!.   (8.6)
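The gap statistics a_i and b_i and the component size (8.6) can be computed directly from a ranking; the following sketch (illustrative code with an ad-hoc representation of rankings, not from the paper) mirrors the definitions:

```python
from math import factorial

def gaps(order, X, Y):
    """a_1, ..., a_{|Y|+1}: numbers of X-elements between successive
    Y-elements (with sentinels at both ends) in the ranked order."""
    out, run = [], 0
    for v in order:
        if v in Y:
            out.append(run)
            run = 0
        elif v in X:
            run += 1
    out.append(run)
    return out

def component_size_86(order, A, B):
    """Formula (8.6): product of factorials of the a_i's and b_i's."""
    size = 1
    for g in gaps(order, A, B) + gaps(order, B, A):
        size *= factorial(g)
    return size

A, B = {0, 1, 2, 3}, {4, 5}
order = (0, 4, 1, 2, 5, 3)        # a ranking: items by increasing rank
print(gaps(order, A, B))           # the a_i's
print(component_size_86(order, A, B))
```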
Depending on the values of π1 , . . . , ππ+1 , π1 , . . . , π π+1 , some connected components may be
much larger than others. We apply the Connectivity Lemma in a similar way as in Lemma 8.4
and only count the rankings π that are in small connected components, which are the rankings
for which a₁, . . . , a_{m+1}, b₁, . . . , b_{n+1} are bounded. Let a* and b* be the minimum integers for which

ℙ[a₁, . . . , a_{m+1} ≤ a*] ≥ 3/4   and   ℙ[b₁, . . . , b_{n+1} ≤ b*] ≥ 3/4.

By a union bound, the probability that both of these events hold is at least 1/2. In other words, there are at least (n + m)!/2 rankings π for which a₁, . . . , a_{m+1} ≤ a* and b₁, . . . , b_{n+1} ≤ b*.
Proposition 8.8.

D(Π_XInv) ≥ (n + m)! / (2 (a*)!^{n/a*} (b*)!^{m/b*}).   (8.7)
Proof. We consider the rankings for which a₁, . . . , a_{m+1} ≤ a* and b₁, . . . , b_{n+1} ≤ b*. We first argue the following upper bound on the size (8.6) of the connected component in G(Π_XInv) of any such ranking π:

(a₁)! · · · (a_{m+1})! (b₁)! · · · (b_{n+1})! ≤ (a*)!^{n/a*} (b*)!^{m/b*}.   (8.8)

For positive integers q, the quantity q!^{1/q} is nondecreasing in q; this can be seen by noticing that log(q!^{1/q}) is the average of log(1), . . . , log(q). (Terms with a_i = 0 contribute a factor of 1 and can be ignored.) As a result, for i ∈ [m + 1], (a_i)!^{1/a_i} ≤ (a*)!^{1/a*}, or equivalently, (a_i)! ≤ (a*)!^{a_i/a*}. Using the fact that a₁ + · · · + a_{m+1} = n,

(a₁)! · · · (a_{m+1})! ≤ (a*)!^{(a₁ + · · · + a_{m+1})/a*} = (a*)!^{n/a*}.

We can apply similar reasoning to get (b₁)! · · · (b_{n+1})! ≤ (b*)!^{m/b*}. From this, we conclude that the size of the connected components among the rankings under consideration is at most the right-hand side of (8.8).
Since there are at least (n + m)!/2 of the rankings under consideration, we derive the stated bound on D(Π_XInv) by applying the Connectivity Lemma.
Now, we find concrete bounds on a* and b*.

Proposition 8.9.

max(1, n/(m + 1)) ≤ a* ≤ ((n + m)/m) ln(4(m + 1)).

Symmetrically,

max(1, m/(n + 1)) ≤ b* ≤ ((n + m)/n) ln(4(n + 1)).
Proof. We first prove the upper bound on a*. Let s be a positive integer. We compute the probability, over a uniformly random ranking π, that a_i > s for a specific i. Notice that there is a one-to-one correspondence between the ranking π and the corresponding sequence of nonnegative integers a₁, . . . , a_{m+1} with a₁ + · · · + a_{m+1} = n, because the ranks of B can be uniquely recovered as a₁ + 1, a₁ + a₂ + 2, . . . , and the remaining ranks form A. By stars and bars, there are C(n + m, m) such sequences. Now, if a_i > s, then a_i − (s + 1) is an arbitrary nonnegative integer, and a₁ + · · · + (a_i − (s + 1)) + · · · + a_{m+1} = n − s − 1. By stars and bars, there are C(n + m − s − 1, m) such sequences. Therefore, the probability that a_i > s is C(n + m − s − 1, m)/C(n + m, m). Continuing,

ℙ[a_i > s] = ((n + m − s − 1) · · · (n − s)) / ((n + m) · · · (n + 1)) ≤ ((n + m − s − 1)/(n + m))^m ≤ exp(−m · (s + 1)/(n + m)),

where the last step uses the bound 1 + x ≤ exp(x). By a union bound and taking the complement,

ℙ[a₁, . . . , a_{m+1} ≤ s] ≥ 1 − (m + 1) exp(−m · (s + 1)/(n + m)).   (8.9)

If s is such that the right-hand side of (8.9) is at least 3/4, we know that a* ≤ s. Solving for s yields the stated bound on a*.
For the lower bounds, a* ≥ 1 because at least one of a₁, . . . , a_{m+1} is at least 1, and a* ≥ n/(m + 1) because a₁ + · · · + a_{m+1} = n and a* is greater than or equal to the average term in the sum.
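The stars-and-bars count in the proof can be verified exhaustively for small n and m (a quick illustrative check of the identity ℙ[a_i > s] = C(n + m − s − 1, m)/C(n + m, m), here for i = 1; not part of the proof):

```python
from itertools import combinations
from math import comb

n, m, s, i = 5, 3, 2, 1   # check the gap a_i (1-indexed) for one position

count = total = 0
# A ranking is determined by which ranks go to B (stars and bars).
for b_ranks in combinations(range(n + m), m):
    total += 1
    # a_i = number of A-ranks strictly between the (i-1)-th and i-th
    # B-rank, with sentinels at -1 and n+m.
    bounds = (-1,) + b_ranks + (n + m,)
    a_i = bounds[i] - bounds[i - 1] - 1
    count += (a_i > s)

# Exact rational identity, checked in integers.
assert count * comb(n + m, m) == comb(n + m - s - 1, m) * total
print(count, total)
```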
The first part of Theorem 1.9 now comes from taking the logarithm of D(Ξ XInv ) in Proposi-
tion 8.8 and using the bounds in Proposition 8.9.
Proof of the first part of Theorem 1.9. We mainly make use of the following approximation based on Stirling's formula:

ln(q!) = (q + 1/2) ln(q) − q + O(1).   (8.10)

In order to estimate ln D(Π_XInv), we need to estimate ln((n + m)!) − (n/a*) ln((a*)!) − (m/b*) ln((b*)!). By (8.10),

ln((n + m)!) = (n + m + 1/2) ln(n + m) − (n + m) + O(1)   (8.11)
(n/a*) ln((a*)!) = (n + n/(2a*)) ln(a*) − n + O(n/a*)   (8.12)
(m/b*) ln((b*)!) = (m + m/(2b*)) ln(b*) − m + O(m/b*)   (8.13)

We can use the lower bounds in Proposition 8.9, namely that a* ≥ 1 and b* ≥ m/(n + 1), to simplify the occurrences of a* and b* in the denominators of the error terms of (8.12) and (8.13) to O(n). Therefore,

ln D(Π_XInv) ≥ n ln((n + m)/a*) + m ln((n + m)/b*) − (n/(2a*)) ln(a*) − (m/(2b*)) ln(b*) + (1/2) ln(n + m) − O(n)
             = n ln((n + m)/a*) + m ln((n + m)/b*) − O(n),

where the middle terms are absorbed into O(n) because (ln x)/x = O(1) for x ≥ 1. Using the upper bounds in Proposition 8.9, absorbing low-order terms, and using the condition that m ≤ n, we get

log D(Π_XInv) ≥ Ω(n log(m) + m log(n)) = Ω((n + m) log(m)).
Evaluating the Mann–Whitney statistic. We now argue how the second part of Theorem 1.9 follows, i. e., that the lower bound of Ω((n + m) log(m)) holds for inversion minimization on the Mann–Whitney tree T of Figure 2 with m ≤ n. We do so by tweaking our lower bound argument for Π_XInv to the setting of Π_T.
How does the permutahedron graph for Ξ π relate to the one for Ξ XInv ? The problem Ξ π is
a coarsening of the problem Ξ XInv : Output values π¦ and ππ β π¦ for Ξ XInv are both mapped to
min(π¦, ππ β π¦) under Ξ π . This means that all edges present in πΊ(Ξ XInv ) are also present in πΊ(Ξ π ),
but there may be more, and some of the connected components in πΊ(Ξ XInv ) corresponding to
output value π¦, may be merged in πΊ(Ξ π ) with some of the connected components of πΊ(Ξ XInv )
corresponding to output value nm − y. However, by the reasoning behind Proposition 4.7, edges in G(Π_XInv) can only go between rankings whose values under Π_XInv differ by at most one. This means that the above merging of connected components can only happen if the difference between y and nm − y is 1, i. e., for the values ⌊nm/2⌋ and ⌈nm/2⌉, and only if nm is odd. In fact, this is exactly the situation that we analyzed in Figure 9, where v coincides with the root of T. If we ignore the rankings with value ⌊nm/2⌋ or ⌈nm/2⌉ under Π_XInv, our lower bound argument for Π_XInv carries over verbatim to Π_T, except that on the right-hand side of Proposition 8.8 the factor of 1/2 is replaced by 1/2 − ε, where ε represents the fraction of rankings with value ⌊nm/2⌋ or ⌈nm/2⌉ under Π_XInv. Lemma 5.5 tells us that ε ≤ 2C/√(nm(n + m)), where C denotes the constant from the lemma. Thus, we obtain a lower bound for D(Π_T) that is a negligible fraction smaller than the one for D(Π_XInv). Taking logarithms, we obtain the same lower bound for the query complexity up to an additive term. In particular, we obtain a query lower bound of Ω((n + m) log(m)) for Π_T in case m ≤ n. This is the second part of Theorem 1.9.
9 Cross-inversion distribution
In this section we prove the upper bound we need for the proof of Theorem 1.6 in Section 5.2,
namely Lemma 5.5. Recall that XInv_{n,m} denotes a random variable that counts the number of cross inversions XInv(A, B) from A to B, where A is an array of length n, B an array of length m, and the concatenation AB is a random permutation of [n + m]. Lemma 5.5 states that for all positive integers n and m, XInv_{n,m} takes on no value with probability more than C/√(nm(n + m)), where C is a universal constant.
9.1 Approach
We establish Lemma 5.5 by going through the characteristic function of XInvπ,π .
Definition 9.1 (characteristic function, φ_{·,·}). The characteristic function φ_X of a random variable X is the function φ_X : ℝ → ℂ : t ↦ 𝔼(e^{itX}). We denote by φ_{n,m} the characteristic function of XInv_{n,m}.

The characteristic function is well-defined for any random variable X and the distribution function of X can always be retrieved from it (see, e.g., [24, Chapter VI]). In the case of integer-valued random variables like XInv_{n,m}, the characteristic function is periodic with period 2π and the distribution function can be retrieved via the inverse Fourier transform of the characteristic function. In particular, the probabilities of XInv_{n,m} can be expressed as the following integrals.
Fact 9.2. For any positive integers n, m and any integer c,

ℙ[XInv_{n,m} = c] = (1/(2π)) ∫_{−π}^{π} φ_{n,m}(t) e^{−itc} dt.   (9.1)

Proof. Observe that the characteristic function φ_{n,m} of XInv_{n,m} is a polynomial in z = e^{it}, where the coefficient of degree d equals the probability of the outcome d.² Formula (9.1) then follows by linearity from

∫_{−π}^{π} z^d e^{−itc} dt = ∫_{−π}^{π} e^{i(d−c)t} dt = 2π if d = c, and 0 if d ≠ c.
The following lemma represents the essence of the proof of Lemma 5.5.

Lemma 9.3. There exists a constant C such that for all integers n, m with n ≥ m ≥ 2,

∫_{−π}^{π} |φ_{n,m}(t)| dt ≤ C/(n√m),   (9.2)

where φ_{n,m}(t) = 𝔼(e^{it XInv_{n,m}}).
Proof of Lemma 5.5. By symmetry, it suffices to consider the case where m ≤ n. In the case where m = 1, the distribution of XInv_{n,m} is uniform over {0, . . . , n}, so the maximum probability is 1/(n + 1) ≤ C/√(nm(n + m)) for any constant C ≥ 1. Otherwise, we have

ℙ[XInv_{n,m} = c] = (1/(2π)) ∫_{−π}^{π} φ_{n,m}(t) e^{−itc} dt ≤ (1/(2π)) ∫_{−π}^{π} |φ_{n,m}(t)| dt ≤ C/(2π n√m) ≤ C/√(nm(n + m))

by (9.1) and Lemma 9.3.
²After multiplication by the binomial coefficient C(n + m, m), the resulting polynomial is known as the Gaussian polynomial with parameter (n, m).
To establish Lemma 9.3, as |φ_{n,m}(t)| is an even function, it suffices to take the integral (9.2) over the domain [0, π] and multiply by two:

∫_{−π}^{π} |φ_{n,m}(t)| dt = 2 ∫_0^{π} |φ_{n,m}(t)| dt.   (9.3)
We divide the domain of integration on the right-hand side of (9.3) into two regions: one
close to zero, and the rest. The integrand is well-behaved in the center near zero, where it is accurately approximated by a Gaussian curve. It is harder to analyze the behavior of the function
away from zero. In this region, a pole reduction lemma (captured by Lemma 9.13) that hinges on a combinatorial matching result (Lemma 9.15) plays a crucial role in eliminating most of the messy behavior of the function while still providing an effective bound.
We first derive an expression for the characteristic function of XInvπ,π in Section 9.2, bound
the central part of the integral in Section 9.3, the peripheral part in Section 9.4, and conclude
with the pole reduction lemma in Section 9.5.
9.2 The characteristic function π π,π
Our derivation of the characteristic function π π,π of XInvπ,π is based on a connection between
cross inversions and inversions in arrays.
Fact 9.4. Let π΄ and π΅ be arrays, and let π΄π΅ be the concatenation of π΄ with π΅. Then
Inv(π΄π΅) = Inv(π΄) + Inv(π΅) + XInv(π΄, π΅).
Proof. Any inversion in π΄π΅ is either between two elements of π΄, two elements of π΅, or one
element of π΄ and one element of π΅. In each case, the inversion is counted in Inv(π΄), Inv(π΅), or
XInv(π΄, π΅), respectively.
We also make use of the following handy property of characteristic functions.
Fact 9.5. For independent random variables X and Y,

φ_{X+Y}(t) = φ_X(t) φ_Y(t).

Proof. φ_{X+Y}(t) = 𝔼(e^{it(X+Y)}) = 𝔼(e^{itX} e^{itY}) = 𝔼(e^{itX}) 𝔼(e^{itY}) = φ_X(t) φ_Y(t).
Let Inv_N be the random variable that counts the number Inv(A) of inversions in an array A that is a uniform permutation of [N], and let φ_N(t) be the characteristic function of Inv_N.

Claim 9.6.

φ_N(t) = ∏_{j=1}^{N} e^{it(j−1)/2} · sin(jt/2) / (j sin(t/2)).   (9.4)
Proof. Consider the process of placing the elements 1, . . . , π one by one, each time placing each
new element between two elements or on some end of the array, to form an array π΄.
For j = 1, . . . , N, consider the random variable that counts the number of new inversions formed when j is placed. When j is placed, the number of new inversions equals the number of elements to the left of j at the time of placement (only the elements 1, . . . , j − 1 have been placed at this point). This means the random variable has a uniform distribution over {0, . . . , j − 1}, which we denote by U_{j−1}. Furthermore, this situation applies regardless of the placement of the other elements, so U_{j−1} is independent of the previous random variables U_{i−1} with i < j.
Therefore, Inv_N can be written as the following sum of independent variables:

Inv_N = U₀ + · · · + U_{N−1}.

We can use Fact 9.5 to calculate the characteristic function:

φ_N(t) = ∏_{j=1}^{N} 𝔼[e^{it U_{j−1}}] = ∏_{j=1}^{N} (1/j) ∑_{d=0}^{j−1} e^{itd},

from which (9.4) follows as

∑_{d=0}^{j−1} e^{itd} = (e^{itj} − 1)/(e^{it} − 1) = e^{itj/2}(e^{itj/2} − e^{−itj/2}) / (e^{it/2}(e^{it/2} − e^{−it/2})) = e^{it(j−1)/2} · sin(jt/2)/sin(t/2).
Consider a random permutation of [n + m], let A be the array consisting of the first n elements, and B the array consisting of the remaining m. Then Inv(A), Inv(B), Inv(AB), and XInv(A, B) have the same distributions as Inv_n, Inv_m, Inv_{n+m}, and XInv_{n,m}, respectively. Moreover, the values of Inv(A), Inv(B), and XInv(A, B) are independent. Hence, by Fact 9.4 and Fact 9.5 we have

φ_{n+m}(t) = φ_n(t) φ_m(t) φ_{n,m}(t),

or

φ_{n,m}(t) = φ_{n+m}(t) / (φ_n(t) φ_m(t)).
By (9.4), the arithmetic sum formula applied to the exponents of e, and algebraic simplifications, we conclude:

Proposition 9.7. For integers N ≥ 0, let f_N(t) = ∏_{j=1}^{N} sin(jt)/j. Then

φ_{n,m}(t) = e^{itnm/2} · f_{n+m}(t/2) / (f_n(t/2) f_m(t/2))   and   |φ_{n,m}(t)| = |f_{n+m}(t/2) / (f_n(t/2) f_m(t/2))|.
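Proposition 9.7 can be sanity-checked numerically for small n and m (an illustrative check, not part of the paper; f_N and the phase factor e^{itnm/2} are as stated in the proposition) by comparing the product formula against the brute-force characteristic function 𝔼[e^{it XInv_{n,m}}] over all interleavings:

```python
from cmath import exp
from itertools import combinations
from math import comb, sin

def phi_brute(n, m, t):
    """E[exp(i t XInv_{n,m})] by enumerating which ranks go to B."""
    total = 0
    for b_ranks in combinations(range(n + m), m):
        a_ranks = [r for r in range(n + m) if r not in b_ranks]
        # cross inversions: pairs (a_rank, b_rank) with a_rank > b_rank
        x = sum(1 for a in a_ranks for b in b_ranks if a > b)
        total += exp(1j * t * x)
    return total / comb(n + m, m)

def f(N, t):
    """f_N(t) = prod_{j=1}^{N} sin(j t)/j."""
    out = 1.0
    for j in range(1, N + 1):
        out *= sin(j * t) / j
    return out

def phi_formula(n, m, t):
    """Proposition 9.7: e^{i t n m / 2} f_{n+m}(t/2)/(f_n(t/2) f_m(t/2))."""
    return exp(1j * t * n * m / 2) * f(n + m, t / 2) / (f(n, t / 2) * f(m, t / 2))

n, m, t = 3, 2, 0.7
assert abs(phi_brute(n, m, t) - phi_formula(n, m, t)) < 1e-9
```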
9.3 Center bound

For the first piece of the integral on the right-hand side of (9.3), we integrate $|\psi_{m,n}(t)|$ over the
interval $[0, 2\pi/(m+n)]$. For the sake of convenience, we substitute $t$ with $2t$ in order to avoid the
denominator of 2 in the sine terms of the integrand.
THEORY OF COMPUTING, Volume 20 (7), 2024, pp. 1–62
LOWER BOUND TECHNIQUES IN THE COMPARISON-QUERY MODEL
Lemma 9.8 (center bound). For integers $m \ge n \ge 2$,
$$\int_{0}^{2\pi/(m+n)} |\psi_{m,n}(t)|\,dt = 2\int_{0}^{\pi/(m+n)} |\psi_{m,n}(2t)|\,dt = O\!\left(\frac{1}{m\sqrt{n}}\right).$$

We can write
$$|\psi_{m,n}(2t)| = \prod_{j=1}^{n} \frac{j}{m+j}\cdot\frac{\sin((m+j)t)}{\sin(jt)},$$
as every term in the product on the right-hand side is nonnegative on this interval. We start
with the following estimates.
Claim 9.9. For positive integers $j \le k$ and $x \in [0, \pi/(j+k)]$,
$$\frac{j}{j+k}\cdot\frac{\sin((j+k)x)}{\sin(jx)} \le 1 - \frac{k^2}{2\pi^2}\,x^2.$$
To prove this claim, we first prove two trigonometric bounds. Refer to Figure 13 for a plot of
the functions and bounds.
Claim 9.10. For every positive integer $n$ and $x \in [0, \pi/n]$,
$$\sin(nx) \le nx - \frac{(nx)^3}{\pi^2}.$$
Proof. Let $y = nx$. It is enough to argue that $\sin(y) \le y - \frac{y^3}{\pi^2}$ in the range $y \in [0, \pi]$.

Let $f(y) = \sin(y) - y + \frac{y^3}{\pi^2}$. Notice that $f(0) = f(\pi) = 0$ and $f'(\pi) > 0$. We will argue that
there is a unique point $y^* \in (0, \pi)$ such that $f'(y^*) = 0$, which will ensure that $f(y) \le 0$ for all
$y \in [0, \pi]$.

We can calculate that
$$f'(y) = \cos(y) - 1 + \frac{3y^2}{\pi^2} = \frac{3y^2}{\pi^2} - 2\sin^2\frac{y}{2}.$$
So $f'(y) = 0$ if and only if $\sin(y/2) = \pm(\sqrt{6}/\pi)\cdot(y/2)$, which is satisfied by one unique point
$y^* \in (0, \pi)$. $\square$
Claim 9.11. For every positive integer $n$ and $x \in (0, \pi/(2n)]$,
$$\cot(nx) \le \frac{1}{nx}.$$
IVAN HU, DIETER VAN MELKEBEEK, AND ANDREW MORGAN
Proof. Let $y = nx$. We will prove that $\cot(y) \le \frac{1}{y}$ for all $y \in (0, \pi/2]$. It is enough to prove that
$y\cos(y) \le \sin(y)$ for all $y \in [0, \pi/2]$.

The latter inequality follows because both sides are zero when $y = 0$, and the derivative of
the left-hand side is bounded above by the derivative of the right-hand side when $y \in [0, \pi/2]$:
$$\cos(y) - y\sin(y) \le \cos(y). \qquad\square$$
Figure 13: Plots of trigonometric bounds for (a) Claim 9.10 and (b) Claim 9.11.
Trigonometric functions are dotted, upper bounds are dashed.
Now we finish the proof of Claim 9.9.
Proof. Notice that
$$\frac{\sin((j+k)x)}{\sin(jx)} = \frac{\sin(jx)\cos(kx) + \sin(kx)\cos(jx)}{\sin(jx)} = \cos(kx) + \cot(jx)\sin(kx).$$
Of course, $\cos(kx) \le 1$. Furthermore, from Claim 9.10 and Claim 9.11, we can see that in this
domain of $x$, $\sin(kx) \le kx - \frac{k^3x^3}{\pi^2}$ and $\cot(jx) \le \frac{1}{jx}$. Additionally, $\sin(kx) \ge 0$. Therefore, using
the fact that $j \le k$,
$$\frac{j}{j+k}\cdot\frac{\sin((j+k)x)}{\sin(jx)} \le \frac{j}{j+k}\left(1 + \frac{1}{jx}\left(kx - \frac{k^3x^3}{\pi^2}\right)\right) \le 1 - \frac{k^3}{(j+k)\pi^2}\,x^2 \le 1 - \frac{k^2}{2\pi^2}\,x^2. \qquad\square$$
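As a numerical sanity check of Claim 9.9 (our own, not part of the argument; names hypothetical), one can verify the inequality on a grid of points in the stated range:

```python
import math

def claim_9_9_holds(j, k, samples=1000):
    """Check (j/(j+k)) * sin((j+k)x)/sin(jx) <= 1 - k^2 x^2 / (2 pi^2)
    on a grid inside (0, pi/(j+k)]."""
    for s in range(1, samples + 1):
        x = s / samples * math.pi / (j + k)
        lhs = j / (j + k) * math.sin((j + k) * x) / math.sin(j * x)
        rhs = 1 - k * k * x * x / (2 * math.pi ** 2)
        if lhs > rhs + 1e-12:
            return False
    return True

assert all(claim_9_9_holds(j, k) for j in range(1, 6) for k in range(j, 9))
```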
From this, we can now prove Lemma 9.8.
T HEORY OF C OMPUTING, Volume 20 (7), 2024, pp. 1β62 48
L OWER B OUND T ECHNIQUES IN THE C OMPARISON -Q UERY M ODEL
Proof. Recall that on this interval
$$|\psi_{m,n}(2t)| = \prod_{j=1}^{n} \frac{j}{m+j}\cdot\frac{\sin((m+j)t)}{\sin(jt)}.$$
Claim 9.9 applies for all $t$ in the domain because $\pi/(m+n) \le \pi/(m+j)$ for all $j$.
Therefore,
$$\int_{0}^{\pi/(m+n)} |\psi_{m,n}(2t)|\,dt \le \int_{0}^{\pi/(m+n)} \left(1 - \frac{m^2t^2}{2\pi^2}\right)^{\!n} dt \le \int_{0}^{\pi/(m+n)} \exp\!\left(-\frac{nm^2t^2}{2\pi^2}\right) dt = O\!\left(\frac{1}{m\sqrt{n}}\right).$$
Here, we use the fact that $1 - x \le \exp(-x)$ for all $x$, and the Gaussian integral: the integral of
$\exp(-t^2)$ over $\mathbb{R}$ is constant, and by scaling the argument, the integral of $\exp(-at^2)$ over $\mathbb{R}$ is a
constant factor of $a^{-1/2}$ for any parameter $a$. $\square$
9.4 Peripheral bound

We now bound $|\psi_{m,n}(t)|$ in the region away from 0, namely, the interval $[2\pi/(m+n), \pi]$. As for
the other part of the integral on the right-hand side of (9.3), we substitute $t$ with $2t$ in order to
avoid the denominator of 2 in the sine terms of the integrand.
Lemma 9.12 (peripheral bound). For integers $m \ge n \ge 2$,
$$\int_{2\pi/(m+n)}^{\pi} |\psi_{m,n}(t)|\,dt = 2\int_{\pi/(m+n)}^{\pi/2} |\psi_{m,n}(2t)|\,dt = O\!\left(\frac{1}{m\sqrt{n}}\right).$$
In this region, the main problem is that the denominator of $|\psi_{m,n}(2t)|$ often goes to zero,
which could potentially blow up the integrand. However, the terms in the numerator always
cancel out these blowups. The following lemma will be our main tool for bounding $|\psi_{m,n}(2t)|$ in
this region; it will allow terms in the numerator to cancel out bad terms in the denominator.

Lemma 9.13 (pole reduction). For every $t \in \mathbb{R}$, there exists a bijection $\beta_t \colon \{1,\dots,n\} \to \{m+1,\dots,m+n\}$ (depending on $t$) such that for every $j = 1,\dots,n$,
$$\frac{1}{j}\,|\sin(jt)| \ge \frac{1}{\beta_t(j)}\,|\sin(\beta_t(j)\,t)|.$$
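On small instances, Lemma 9.13 can be verified by brute force over all candidate bijections. The following sketch (our own illustration; names hypothetical) does so for $m = 5$, $n = 3$:

```python
import math
from itertools import permutations

def pole_reduction_bijection_exists(m, n, t):
    """Brute-force search for a bijection beta_t: {1..n} -> {m+1..m+n}
    with |sin(j t)|/j >= |sin(beta_t(j) t)|/beta_t(j) for every j."""
    for perm in permutations(range(m + 1, m + n + 1)):
        if all(abs(math.sin(j * t)) / j >= abs(math.sin(h * t)) / h - 1e-12
               for j, h in zip(range(1, n + 1), perm)):
            return True
    return False

# spot-check Lemma 9.13 for m = 5, n = 3 on a grid of t values
assert all(pole_reduction_bijection_exists(5, 3, 0.01 + 0.013 * i)
           for i in range(200))
```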
A basic bound can be found by applying Lemma 9.13 on $|\psi_{m,n}(2t)|$ for $j = 2,\dots,n$, resulting
in an upper bound of $1/(m\sin(t))$. This bound is usable, but is weak on points closer to 0. To
remedy this, we can divide the domain of integration into multiple parts, where the points
closer to 0 can safely include more terms in the denominator.
Let $2 \le c \le n$ be an integer. We split the domain of integration into three intervals: $[\frac{\pi}{m+n}, \frac{\pi}{2c}]$,
$[\frac{\pi}{2c}, \frac{\pi}{2} - \frac{\pi}{2c}]$, and $[\frac{\pi}{2} - \frac{\pi}{2c}, \frac{\pi}{2}]$. By selecting a good value of $c$, we can get a reasonable upper bound
on this region. Here, we will make use of Lemma 9.13 and the following linear approximation
to sine:
T HEORY OF C OMPUTING, Volume 20 (7), 2024, pp. 1β62 49
I VAN H U , D IETER VAN M ELKEBEEK , AND A NDREW M ORGAN
Fact 9.14. For a positive integer $n$ and $t \in [0, \pi/(2n)]$,
$$\sin(nt) \ge \frac{2nt}{\pi}.$$

Proof. Notice that $\sin(nt) = \frac{2nt}{\pi}$ when $t = 0$ and $t = \frac{\pi}{2n}$. The fact then follows since sine is
concave on this interval. $\square$
Region I. The first region of integration is $[\pi/(m+n), \pi/(2c)]$.
$$\begin{aligned}
\int_{\pi/(m+n)}^{\pi/(2c)} |\psi_{m,n}(2t)|\,dt
&\le \int_{\pi/(m+n)}^{\pi/(2c)} \frac{n!}{\beta_t(1)\cdots\beta_t(n)}\cdot\frac{|\sin(\beta_t(1)t)\cdots\sin(\beta_t(n)t)|}{|\sin(t)\cdots\sin(nt)|}\,dt \\
&\le \frac{c!}{m^c}\int_{\pi/(m+n)}^{\pi/(2c)} \frac{1}{\sin(t)\cdots\sin(ct)}\,dt \\
&\le \frac{1}{m^c}\left(\frac{\pi}{2}\right)^{\!c}\int_{\pi/(m+n)}^{\pi/(2c)} \frac{1}{t^c}\,dt \\
&\le \frac{1}{m^c}\cdot\left(\frac{\pi}{2}\right)^{\!c}\cdot\frac{1}{c-1}\cdot\left(\frac{m+n}{\pi}\right)^{\!c-1} \\
&\le \frac{1}{m^c}\cdot\left(\frac{\pi}{2}\right)^{\!c}\cdot\frac{1}{c-1}\cdot\left(\frac{2m}{\pi}\right)^{\!c-1} \\
&= \frac{\pi}{2m(c-1)}.
\end{aligned}$$
The first two steps involve applying Lemma 9.13 on all $j \in \{c+1,\dots,n\}$, and then using
the facts that $|\sin(x)| \le 1$ and $\beta_t(j) \ge m$. The third step uses Fact 9.14 for $j \in \{1,\dots,c\}$, which
applies on the interval $[\pi/(m+n), \pi/(2c)]$ for these values of $j$. From there, we bound with the
left limit of integration.
Region II. We now bound $|\psi_{m,n}(2t)|$ on the interval $[\frac{\pi}{2c}, \frac{\pi}{2} - \frac{\pi}{2c}]$. We can use Lemma 9.13 to
eliminate all terms except $j \in \{1, 2\}$ this time.
$$\int_{\pi/(2c)}^{\pi/2 - \pi/(2c)} |\psi_{m,n}(2t)|\,dt \le \int_{\pi/(2c)}^{\pi/2 - \pi/(2c)} \frac{2}{\beta_t(1)\beta_t(2)}\cdot\frac{|\sin(\beta_t(1)t)\sin(\beta_t(2)t)|}{|\sin(t)\sin(2t)|}\,dt \le \frac{2}{m^2}\int_{\pi/(2c)}^{\pi/2 - \pi/(2c)} \frac{1}{\sin(t)\sin(2t)}\,dt.$$
Because $|\sin(t)|$ is increasing here and $|\sin(2t)|$ is symmetric about $\frac{\pi}{4}$, the value of the integral
on the interval $[\frac{\pi}{2c}, \frac{\pi}{4}]$ exceeds the value on the interval $[\frac{\pi}{4}, \frac{\pi}{2} - \frac{\pi}{2c}]$. Using Fact 9.14,
$$\frac{2}{m^2}\int_{\pi/(2c)}^{\pi/2 - \pi/(2c)} \frac{1}{\sin(t)\sin(2t)}\,dt \le \frac{4}{m^2}\int_{\pi/(2c)}^{\pi/4} \frac{1}{\sin(t)\sin(2t)}\,dt \le \frac{4}{m^2}\int_{\pi/(2c)}^{\pi/4} \frac{\pi^2}{8t^2}\,dt \le \frac{4}{m^2}\cdot\frac{\pi^2}{8}\cdot\frac{2c}{\pi} = \frac{c\pi}{m^2}.$$
T HEORY OF C OMPUTING, Volume 20 (7), 2024, pp. 1β62 50
L OWER B OUND T ECHNIQUES IN THE C OMPARISON -Q UERY M ODEL
We have now established that
$$\int_{\pi/(2c)}^{\pi/2 - \pi/(2c)} |\psi_{m,n}(2t)|\,dt \le \frac{c\pi}{m^2}.$$
Region III. Notice that $|\sin(t)|$ is increasing on the interval $[\frac{\pi}{2} - \frac{\pi}{2c}, \frac{\pi}{2}]$, so we can bound $|\sin(t)|$
from below by $|\sin(\frac{\pi}{2} - \frac{\pi}{2c})|$. Similar to before, the first step follows from Lemma 9.13, this time applied to
$j \in \{2,\dots,n\}$.
$$\int_{\pi/2 - \pi/(2c)}^{\pi/2} |\psi_{m,n}(2t)|\,dt \le \int_{\pi/2 - \pi/(2c)}^{\pi/2} \frac{1}{\beta_t(1)}\cdot\frac{|\sin(\beta_t(1)t)|}{|\sin(t)|}\,dt \le \int_{\pi/2 - \pi/(2c)}^{\pi/2} \frac{1}{m\sin(t)}\,dt \le \frac{\pi}{2c}\cdot\frac{1}{m\sin\!\left(\frac{\pi}{2} - \frac{\pi}{2c}\right)} \le \frac{\pi}{2c}\cdot\frac{1}{m\left(1 - \frac{1}{c}\right)} = \frac{\pi}{2m(c-1)}.$$
Overall bound. Summing the above bounds, we can deduce that
$$\int_{\pi/(m+n)}^{\pi/2} |\psi_{m,n}(2t)|\,dt \le \frac{\pi}{m(c-1)} + \frac{c\pi}{m^2}.$$
By choosing $c = \lceil\sqrt{n}\rceil$ (keeping in mind that $m \ge n \ge 2$), we can deduce Lemma 9.12.
9.5 Pole reduction

Finally, we prove the pole reduction lemma (Lemma 9.13). The essence is an interval matching
strategy captured by Lemma 9.15. Here is the intuition.

Recall that we want to upper bound factors of the form $\frac{|\sin(ht)|}{h}$ by rescaled versions $\frac{|\sin(jt)|}{j}$
of the same pattern, where $h \in \{m+1,\dots,m+n\}$ and $j \in \{1,\dots,n\}$ are matched. The matching
definitely needs to avoid situations like in Figure 14a, where $\frac{|\sin(jt)|}{j}$ vanishes at the point $t$ while
$\frac{|\sin(ht)|}{h}$ does not. Ideally, the period of the $j$-scaled version that contains the point $t$ encloses
the period of the $h$-scaled version that contains $t$, like in Figure 14b. As long as the pattern is
concave, this ensures that the $j$-scaled version is larger than the $h$-scaled version everywhere on
the encompassed period. (We will formally prove this in Claim 9.22.) Thus, if at every point $t$,
we can set up a matching such that the enclosing relationship holds for all matched pairs, we
are home free. Lemma 9.15 below does exactly this.
Let us first introduce some notation. For every real number $t$ and positive integer $j$, there is
a unique integer $a$ such that $t$ is contained in the half-open interval $[a/j, (a+1)/j)$. We call this
interval the $j$-interval of $t$. For positive integers $j, h$, we say that the $j$-interval of $t$ encloses
the $h$-interval of $t$ if the $h$-interval of $t$ is a subset of the $j$-interval of $t$. We use the shorthand
that $j$ encloses $h$ at $t$.

Lemma 9.15 (interval matching). Let $m, n$ be positive integers. For any real $t$, there exists a bijection $\beta_t$
between $\{1,\dots,n\}$ and $\{m+1,\dots,m+n\}$ such that for all $j = 1,\dots,n$, $j$ encloses $h = \beta_t(j)$ at $t$.
Figure 14: Enclosing intervals are needed for Lemma 9.13.
(a) Bad case: no interval enclosure. (b) Good case: interval enclosure.
Note that the bijection can be different depending on $t$. In fact, this is necessary, as otherwise
$h$ would need to be a multiple of $j$, which is in general not possible with a bijection between the sets
$\{1,\dots,n\}$ and $\{m+1,\dots,m+n\}$.
We can interpret Lemma 9.15 as a matching problem on a bipartite graph and invoke Hall's marriage
lemma.

Lemma 9.16 (Hall's marriage lemma). Let $G$ be a bipartite graph with partitions $L, R$ of equal size. For any $A \subseteq L$,
let $N(A)$ be the set of all vertices in $R$ with at least one neighbor in $A$. The graph $G$ admits a perfect
matching if and only if for all such $A$, $|N(A)| \ge |A|$.
For $A \subseteq \{1,\dots,n\}$, let $N_t(A)$ be the set of all $h \in \{m+1,\dots,m+n\}$ such that there exists
$j \in A$ where $j$ encloses $h$ at $t$. To produce the desired bijection $\beta_t$ in Lemma 9.15, it is sufficient
to prove that $|N_t(A)| \ge |A|$ for all $t$ and $A$.

There are some values of $h$ that are always contained in $N_t(A)$ regardless of the value of $t$.
Let $N(A)$ be the set of all $h \in \mathbb{N}$ such that for all $t$, there exists $j \in A$ where $j$ encloses $h$ (this $j$
can vary depending on $t$). As $N(A) \cap \{m+1,\dots,m+n\} \subseteq N_t(A)$, it is sufficient to prove that
$|N(A) \cap \{m+1,\dots,m+n\}| \ge |A|$ for all $A$ and apply Lemma 9.16 to prove Lemma 9.15.
Example 9.17. $5 \in N(\{2, 3\})$.

Proof. We only consider $t \in [0, 1)$ for clarity. When $t \in [0, 0.4)$, the 2-interval for $t$ is $[0, 0.5)$, while
the 5-interval for $t$ is either $[0, 0.2)$ or $[0.2, 0.4)$, which means 2 encloses 5. When $t \in [0.4, 0.6)$,
the 3-interval for $t$ is $[1/3, 2/3)$, which encloses the 5-interval $[0.4, 0.6)$. When $t \in [0.6, 1)$, the
2-interval is $[0.5, 1)$ while the 5-interval is either $[0.6, 0.8)$ or $[0.8, 1)$, so 2 encloses 5. These
enclosures are shown in Figure 15, where the marked 5-intervals are contained within the
respectively marked 2- or 3-intervals.

For all $t$, either 2 encloses 5, or 3 encloses 5. In other words, $5 \in N(\{2, 3\})$. $\square$
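The case analysis above can be replayed numerically. A small sketch (our own; the helper name is hypothetical):

```python
import math

def encloses(j, h, t):
    """Does the j-interval [a/j, (a+1)/j) containing t enclose the h-interval of t?"""
    aj = math.floor(t * j)
    ah = math.floor(t * h)
    return aj / j <= ah / h and (ah + 1) / h <= (aj + 1) / j

# Example 9.17: at every t in [0, 1), either 2 or 3 encloses 5.
assert all(encloses(2, 5, i / 1000) or encloses(3, 5, i / 1000)
           for i in range(1000))
```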
We first consider a different characterization of $N(A)$.

Claim 9.18. Let $h$ be a positive integer. Then $h \in N(A)$ if and only if every open interval $I \subseteq \mathbb{R}$ that
contains a fraction of denominator $j$ for every $j \in A$ must also contain a fraction of denominator $h$.
(These fractions do not have to be distinct or reduced.)
Figure 15: Illustration of Example 9.17 (the $j = 2$, $j = 3$, and $h = 5$ intervals).
The idea is that if no $j \in A$ encloses $h$ for some $t$, then an endpoint of each $j$-interval is a
fraction of denominator $j$ contained strictly within the $h$-interval of $t$. As a result, we have a
contiguous interval $I$ containing a fraction of denominator $j$ for every $j \in A$, and $I$ is contained strictly within
the $h$-interval of $t$. As such, $I$ cannot contain a fraction of denominator $h$. This is illustrated
in Figure 16. The formal proof also considers the edge cases involving the endpoints of the
intervals.

Figure 16: $I$ contains a fraction of denominator $j$ for all $j \in A$, but no fraction of denominator $h$.
To prove Claim 9.18, we first establish the following proposition.
Proposition 9.19. The following statements are equivalent for any half-open interval $[a, b)$ and positive
integer $j$:

(1) $[a, b)$ is contained within a $j$-interval.

(2) $(a, b)$ is contained within a $j$-interval.

(3) $(a, b)$ contains no fraction of denominator $j$.

Proof. (1) $\Longrightarrow$ (2). This follows as $(a, b)$ is a subset of $[a, b)$.

(2) $\Longrightarrow$ (3). Suppose $(a, b)$ is contained in the $j$-interval $[c/j, (c+1)/j)$. The interval $(a, b)$
cannot contain a fraction of denominator $j$, otherwise said fraction would be strictly between
$c/j$ and $(c+1)/j$.
(3) $\Longrightarrow$ (1). Let $c/j$ be the largest fraction of denominator $j$ less than or equal to $a$. It
must be true that $(c+1)/j \ge b$, since $(c+1)/j$ cannot be contained in $(a, b)$. Therefore, $[a, b)$ is
contained in the $j$-interval $[c/j, (c+1)/j)$. $\square$
From this, we can prove Claim 9.18.

Proof. By definition, the condition that $h \in N(A)$ is that any $h$-interval $J$ is contained in some
$j$-interval for some $j \in A$. By condition (2) in Proposition 9.19, this is equivalent to saying $\mathrm{Int}(J)$
is contained in some $j$-interval, where $\mathrm{Int}(J)$ is the interior of $J$. This is true if and only if any
open subinterval $I$ of $\mathrm{Int}(J)$ is contained in some $j$-interval. Equivalently, if $I$ is an open interval
that is not contained in any $j$-interval, then $I$ is not contained in any $h$-interval. Using condition
(3) in Proposition 9.19, this is finally equivalent to the condition that if $I$ contains a fraction of
denominator $j$ for every $j \in A$, then $I$ contains a fraction of denominator $h$. $\square$
The characterization in Claim 9.18 allows us to prove the following key claim about $N(A)$.

Claim 9.20. If $h_1, h_2 \in N(A)$, then $h_1 + h_2 \in N(A)$.

Proof. By Claim 9.18, any interval $I$ that contains a fraction of denominator $j$ for every $j \in A$
must also contain two fractions $x/h_1$ and $y/h_2$. For positive integers $a, b, c, d$, define the mediant
of $a/b$ and $c/d$ to be $(a+c)/(b+d)$. The mediant is always between the two fractions. In
other words, if $a/b \le c/d$,
$$\frac{a}{b} \le \frac{a+c}{b+d} \le \frac{c}{d}.$$
This fact can be proven with elementary algebra, as both inequalities are equivalent to $a/b \le c/d$.

From this, we see that the mediant $(x+y)/(h_1+h_2)$ is contained in $I$, as it is between
two elements of $I$. As this applies to every such $I$, we can use Claim 9.18 to conclude that
$h_1 + h_2 \in N(A)$. $\square$

Note that $x/h_1$ and $y/h_2$ do not have to be distinct: $h_1$ and $h_2$ might be equal, or $x/h_1 = y/h_2$.

As $j$ encloses $j$ for every $t$, we have that $A \subseteq N(A)$. Let $\mathrm{Sums}(A)$ be the set of positive
integers that can be written as the sum of not necessarily distinct elements of $A$. Claim 9.20
implies that $\mathrm{Sums}(A) \subseteq N(A)$.
Claim 9.21. For nonnegative integers $x$,
$$|\mathrm{Sums}(A) \cap \{x+1, \dots, x+n\}| \ge |A|.$$

Proof. Let $g = \max(A)$. Because $A \subseteq \mathrm{Sums}(A)$, $\mathrm{Sums}(A)$ contains every positive integer
congruent to an element of $A$ modulo $g$, since we can repeatedly add $g$ to any element of
$A$. As $n \ge g$, the set $\{x+1, \dots, x+n\}$ contains at least one element for every residue mod $g$.
Therefore, $\mathrm{Sums}(A)$ contains at least $|A|$ elements of $\{x+1, \dots, x+n\}$. $\square$

Putting it all together, we have that
$$|A| \le |\mathrm{Sums}(A) \cap \{m+1, \dots, m+n\}| \le |N(A) \cap \{m+1, \dots, m+n\}| \le |N_t(A)|,$$
which proves Lemma 9.15 by our previous reasoning. To finish, we need to show that this
implies Lemma 9.13.
Figure 17: Proof of Claim 9.22: $q/h \le p/j$.
Claim 9.22. Lemma 9.15 implies Lemma 9.13.

Proof. We prove the following more general claim. Let $f \colon \mathbb{R} \to \mathbb{R}$ be a function that has period
1, and additionally, $f$ is concave on $[0, 1]$ and $f(0) = f(1) = 0$. For $t \in [0, 1]$, let $\beta_t$ be a bijection
that satisfies the conditions of Lemma 9.15. Then for any $j \in \{1,\dots,n\}$, let $h = \beta_t(j)$. We seek
to prove that
$$\frac{f(jt)}{j} \ge \frac{f(ht)}{h}$$
for all $j = 1,\dots,n$. Notice that $f(t) = |\sin(t)|$ satisfies all the conditions of the claim, albeit after
scaling the period from $\pi$ to 1.

To prove the general claim, let $p = jt - \lfloor jt \rfloor$ and $q = ht - \lfloor ht \rfloor$, noting that $p, q \in [0, 1]$. By
the enclosing property, it is true that $p/j \ge q/h$, since they represent the distances from $t$ to the
left endpoints of the $j$- and $h$-intervals, respectively, and $j$ encloses $h$ at $t$. This can be seen in
Figure 17.

Suppose $p \le q$. We have
$$\frac{1}{j}\,f(jt) = \frac{1}{j}\,f(p) \ge \frac{1}{j}\cdot\frac{p}{q}\,f(q) \ge \frac{1}{h}\,f(q) = \frac{1}{h}\,f(ht).$$
The first step follows from the periodicity of $f$, the second from the concavity of $f$ on $[0, 1]$
and the fact that $f(0) = 0$ (the point $(p, f(p))$ is above the line segment connecting $(q, f(q))$ and
$(0, 0)$), the third from the aforementioned inequality $p/j \ge q/h$, and the last from the periodicity
of $f$ again.

If $q < p$, we can instead consider the function $\tilde{f}(t) = f(1-t)$. Here $\tilde{p} = 1 - p$ and $\tilde{q} = 1 - q$,
and we can use our previous reasoning. This proves the general claim. $\square$
10 Turing complexity

This section serves as a supplementary discussion of the time (and space) complexity of
algorithms for computing the minimum number of inversions in trees. From the analysis in Section 4.2,
computing $\mathrm{MinInv}(T, \pi)$ involves computing the sum of $\mathrm{MinRInv}(T_v, \pi)$ independently
for each $v \in T$:
$$\mathrm{MinInv}(T, \pi) = \sum_{v} \mathrm{MinRInv}(T_v, \pi).$$
In order to compute $\mathrm{MinRInv}(T_v, \pi)$ for each $v$, we need to check the orderings $\sigma$ of the
children $u_1, \dots, u_k$ of $v$ and determine the minimum number of cross inversions between the
corresponding leaf sets $L_1, \dots, L_k$ in the order given by $\sigma$:
$$\mathrm{RInv}(T_v, \pi, \sigma) = \sum_{1 \le i < j \le k} \mathrm{XInv}_\pi(L_{\sigma(i)}, L_{\sigma(j)}).$$
A natural approach for computing $\mathrm{MinInv}(T, \pi)$ consists of two phases:

1. In the first phase, we compute $\mathrm{XInv}_\pi$ between the leaf sets of any pair of siblings in $T$. This
can be done in a bottom-up fashion similar to mergesort. More precisely, for a node
$v$ with sorted leaf sets $L_1, \dots, L_k$, the cross inversions between every pair of leaf sets
can be calculated using a merge operation in $O(k \cdot (|L_1| + \cdots + |L_k|))$ time, which also sorts the
concatenation of the leaf sets for use in future steps. Each leaf of depth $d$ appears in $d$
such operations, which gives the total runtime of $O(\deg(T) \cdot d_{\mathrm{avg}}(T) \cdot n)$.

2. In the second phase, we check, for each node $v$ independently, which ordering $\sigma$ of the children
of $v$ minimizes $\mathrm{RInv}(T_v, \pi, \sigma)$. We then output the sum of the values $\mathrm{MinRInv}(T_v, \pi)$.

Exhaustively testing all orderings $\sigma$ of the $k$ children of $v$ to compute $\mathrm{MinRInv}(T_v, \pi)$ takes $O(k \cdot k!)$
time, as $O(k)$ time is required to calculate $\mathrm{RInv}(T_v, \pi, \sigma)$, for each $\sigma$, from the precomputed
values of $\mathrm{XInv}$. This results in a total of $O(n \cdot \deg(T)!)$ time for the second phase, and an overall
running time of $O((\deg(T)! + \deg(T) \cdot d_{\mathrm{avg}}(T)) \cdot n)$.
For any constant bound on the degree of the tree $T$, the basic algorithm runs in time
polynomial in $n$. The dependency of the running time on the degree can be improved. One way
to do so is by reducing the problem of the second phase to the closely related and well-studied
problem of computing a minimum feedback arc set of a weighted directed graph.
Definition 10.1 (Minimum feedback arc set). Given a directed graph $G = (V, E)$ with an
ordering $\sigma$ on the vertices $v_1, \dots, v_k$, a feedback arc is an edge from $v_i$ to $v_j$ such that
$\sigma(v_i) > \sigma(v_j)$. The minimum feedback arc set problem is finding the minimum number of
feedback arcs induced by any ordering $\sigma$. In weighted minimum feedback arc set, each edge
from $v_i$ to $v_j$ has a weight $w_{ij}$, and the objective is to minimize
$$\sum_{(v_i, v_j) \in E} w_{ij} \cdot \mathbb{1}[\sigma(v_i) > \sigma(v_j)].$$
We can encode the problem of computing $\mathrm{MinRInv}(T_v, \pi)$ as an instance of weighted
minimum feedback arc set, where each edge of the graph $G$ has a positive weight. If the leaf sets
of the children of $v$ are $L_1, \dots, L_k$, we construct a graph $G$ with $k$ vertices $v_1, \dots, v_k$. For each
pair of vertices $v_i$ and $v_j$, if $\mathrm{XInv}_\pi(L_i, L_j) < \mathrm{XInv}_\pi(L_j, L_i)$, we add an edge from $v_i$ to $v_j$ of weight
$\mathrm{XInv}_\pi(L_j, L_i) - \mathrm{XInv}_\pi(L_i, L_j)$. We can extract the value of $\mathrm{MinRInv}(T_v, \pi)$ from the weight of the
minimum feedback arc set.
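The encoding can be illustrated with a small sketch (our own; brute force stands in for an actual feedback-arc-set solver, and all names are hypothetical). Placing the unavoidable per-pair minimum up front, a feedback arc pays exactly the positive surplus of the wrong order:

```python
from itertools import permutations

def min_rinv(xinv):
    """Brute-force MinRInv from a matrix xinv[i][j] = XInv(L_i, L_j)."""
    k = len(xinv)
    return min(sum(xinv[s[a]][s[b]] for a in range(k) for b in range(a + 1, k))
               for s in permutations(range(k)))

def min_rinv_via_fas(xinv):
    """Encode as weighted minimum feedback arc set, solved by brute force."""
    k = len(xinv)
    base = sum(min(xinv[i][j], xinv[j][i])
               for i in range(k) for j in range(i + 1, k))
    # edge i -> j of weight xinv[j][i] - xinv[i][j] when that is positive;
    # it becomes a feedback arc (and pays the surplus) when j precedes i
    def fas_cost(order):
        pos = {v: p for p, v in enumerate(order)}
        return sum(max(0, xinv[j][i] - xinv[i][j])
                   for i in range(k) for j in range(k)
                   if i != j and pos[i] > pos[j])
    return base + min(fas_cost(order) for order in permutations(range(k)))

xinv = [[0, 5, 1], [2, 0, 4], [7, 3, 0]]
assert min_rinv(xinv) == min_rinv_via_fas(xinv) == 7
```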
As a consequence, we can use existing efficient algorithms for weighted minimum feedback arc
set to construct algorithms for inversion minimization on trees that are more efficient
than the basic algorithm we described. [3] gives two exact algorithms for weighted minimum
feedback arc set. One algorithm [3, Algorithm 1] is based on the Held–Karp algorithm for the
traveling salesman problem [13]; it uses dynamic programming to achieve a time complexity
of $\Theta(k^2 2^k)$ and a space complexity of $\Theta(2^k)$ for a graph of $k$ vertices. Another algorithm
[3, Algorithm 2] uses a divide-and-conquer approach that achieves a time complexity of
$O(\mathrm{poly}(k) \cdot 4^k)$, but has the advantage of only needing polynomial space.
Algorithm 2 $\mathrm{MinRInv}(T_v, \pi)$, dynamic programming

Input: tree $T_v$ with child leaf sets $L_1, \dots, L_k$; ranking $\pi$
  Initialize $\mathrm{Cost}[S]$ for all subsets $S \subseteq \{1, \dots, k\}$.
  $\mathrm{Cost}[\emptyset] \leftarrow 0$
  for $s$ from 1 to $k$ do
    for all sets $S$ of size $s$ do
      $\mathrm{Cost}[S] \leftarrow \min_{i \in S}\left(\mathrm{Cost}[S \setminus \{i\}] + \sum_{j \in S,\, j \neq i} \mathrm{XInv}_\pi(L_j, L_i)\right)$
  return $\mathrm{Cost}[\{1, \dots, k\}]$.
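A direct Python transcription of Algorithm 2 using bitmasks (our own sketch, checked against brute force; names hypothetical):

```python
from itertools import permutations

def min_rinv_dp(xinv):
    """Algorithm 2: Cost[S] is the minimum number of cross inversions among
    the children in S when they occupy the first |S| positions."""
    k = len(xinv)
    cost = [0] * (1 << k)
    for S in range(1, 1 << k):
        best = None
        for i in range(k):
            if S >> i & 1:
                prev = S & ~(1 << i)
                # place i last among S: pay XInv(L_j, L_i) for each j already placed
                c = cost[prev] + sum(xinv[j][i] for j in range(k)
                                     if prev >> j & 1)
                best = c if best is None else min(best, c)
        cost[S] = best
    return cost[(1 << k) - 1]

xinv = [[0, 5, 1], [2, 0, 4], [7, 3, 0]]
brute = min(sum(xinv[s[a]][s[b]] for a in range(3) for b in range(a + 1, 3))
            for s in permutations(range(3)))
assert min_rinv_dp(xinv) == brute == 7
```

The table has $2^k$ entries and each transition sums over at most $k$ placed children, matching the $\Theta(k^2 2^k)$ time and $\Theta(2^k)$ space cited above.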
An adaptation of the dynamic programming algorithm for calculating $\mathrm{MinRInv}$ is given in Algorithm 2.
Using this subroutine for computing $\mathrm{MinRInv}$, we can improve the time complexity of our basic algorithm to
$$O\!\left(\left(\deg(T)^2\, 2^{\deg(T)} + \deg(T) \cdot d_{\mathrm{avg}}(T)\right) \cdot n\right).$$

The improved running time is still not efficient for trees with unrestricted degree. In fact, the problem is
NP-hard.

Proposition 10.2. Computing $\mathrm{MinInv}(T, \pi)$ is NP-hard.

Proof. We show a reduction from the known NP-hard problem "Minimum feedback arc set" [17].

For a graph $G$ with $k$ vertices $v_1, \dots, v_k$ and $m$ directed edges $e_1, \dots, e_m$, we construct a depth-2 tree $T$
and a ranking $\pi$ of its leaves. We will assume that $G$ has no isolated vertices; this goes without loss of generality,
as isolated vertices can be dropped from an instance of minimum feedback arc set without affecting the answer.
We also assume that between any two vertices at most one of the two directed edges is present; this is also
without loss of generality, since dropping the edges in case both are present reduces the answer by one.
Finally, for ease of notation, we allow the ranking $\pi$ to be an injective mapping into the integers; this can
be changed easily by replacing each integer by its rank in the range.

Figure 18: Encoding of an edge $e_\ell$ from $v_i$ to $v_j$: ranks $-2\ell$ and $2\ell-1$ go under $v_i$, and ranks $-2\ell+1$ and $2\ell$ under $v_j$ (each vertex $v$ carries $2\deg_G(v)$ leaves in total).
In the first layer, the root of $T$ has $k$ children corresponding to $v_1, \dots, v_k$. The second layer
has leaves with ranks encoding the edges of $G$. For each edge $e_\ell$ going from $v_i$ to $v_j$, we add two
leaves under $v_i$ with ranks $-2\ell$ and $2\ell-1$, and two leaves under $v_j$ with ranks $-2\ell+1$ and $2\ell$,
as shown in Figure 18. All ranks are distinct.

Consider the number of inversions in $T$ induced by an ordering $\sigma$ of $v_1, \dots, v_k$. For each
edge $e_\ell$, the number of inversions between the leaves of rank $-2\ell$ and $2\ell-1$ and the leaves
of rank $-2\ell+1$ and $2\ell$ is 1 if $\sigma(v_i) < \sigma(v_j)$ and 3 if $\sigma(v_i) > \sigma(v_j)$. These four leaves also form
$8(m-1)$ inversions with all other leaves, keeping in mind that these inversions are counted
twice when summed up over all edges.

Therefore, the minimum number of inversions in $T$ is given by
$$\mathrm{MinInv}(T, \pi) = \min_\sigma \left( 4m(m-1) + m + 2 \sum_{e_\ell \in E} \mathbb{1}[\sigma(v_i) > \sigma(v_j)] \right).$$
The size of the minimum feedback arc set is precisely $\min_\sigma \sum_{e_\ell \in E} \mathbb{1}[\sigma(v_i) > \sigma(v_j)]$, which
can be extracted from $\mathrm{MinInv}(T, \pi)$ with straightforward calculations. This completes the
reduction. $\square$
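The 1-versus-3 count at the heart of the gadget can be checked directly (our own sketch; names hypothetical):

```python
def gadget_inversions(l, i_first):
    """Inversions among the four gadget leaves for edge e_l from v_i to v_j:
    v_i holds ranks {-2l, 2l-1}, v_j holds {-2l+1, 2l}."""
    vi, vj = [-2 * l, 2 * l - 1], [-2 * l + 1, 2 * l]
    first, second = (vi, vj) if i_first else (vj, vi)
    return sum(1 for a in first for b in second if a > b)

# one inversion when v_i precedes v_j, three otherwise
assert all(gadget_inversions(l, True) == 1 and gadget_inversions(l, False) == 3
           for l in range(1, 10))
```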
Approximation algorithms. If we relax our requirements to an approximate answer, we can
approximate $\mathrm{MinRInv}$ in polynomial time using existing approximation algorithms for weighted
minimum feedback arc set. The best known such algorithm achieves an approximation ratio
of $O(\log k \log\log k)$ on a graph with $k$ vertices [10]. Adapting this algorithm for minimizing
inversions in trees produces an approximation factor of $O(\log(\deg(T)) \log\log(\deg(T)))$ for
$\mathrm{MinInv}(T, \pi)$. Under the unique games conjecture, there does not exist a constant-factor
approximation algorithm for minimum feedback arc set on arbitrary digraphs [12].

In the special case of tournament graphs, which have exactly one edge of weight 1 between
every pair of vertices, there are efficient constant-factor approximation algorithms for minimum
feedback arc set. Some of these also apply to weighted tournaments, where for every pair of
vertices $v_i, v_j$, the nonnegative edge weights $w_{ij}, w_{ji}$ satisfy $w_{ij} + w_{ji} = 1$. This case corresponds
to the scenario of computing $\mathrm{MinRInv}(T_v, \pi)$ where all leaf sets $L_i$ of siblings have the same
size. [18] gives an algorithm with runtime $O(2^{O(\sqrt{\mathrm{OPT}})})$, given that the optimal answer is $\mathrm{OPT}$.
[19] also gives an approximation algorithm in the case where $w_{ij} + w_{ji} \in [b, 1]$ for some $b > 0$:
for any $\varepsilon > 0$, the algorithm produces a $(1+\varepsilon)$-approximation of $\mathrm{OPT}$ in time $k \cdot 2^{\tilde{O}(1/(\varepsilon b)^{12})}$. For
the problem of computing $\mathrm{MinRInv}(T_v, \pi)$, the parameter $b$ represents the ratio between the
smallest and largest possible values of $|L_i| \cdot |L_j|$.
Wilcoxon test. As a final remark we point out an alternate way of computing $\mathrm{MinInv}(T, \pi)$ in the
special case of the Mann–Whitney trees of Figure 2. The number of cross inversions $\mathrm{XInv}_\pi(A, B)$
can be written in terms of the rank sum $W_B = \sum_{y \in B} \pi(y)$ as follows, where $m = |A|$ and $n = |B|$:
$$\mathrm{XInv}_\pi(A, B) = mn + \frac{n(n+1)}{2} - W_B. \tag{10.1}$$
The quantity $W_B$ is known as the Wilcoxon rank-sum statistic for differences between random
variables. Because of the relationship (10.1), the Wilcoxon test is equivalent in power to the
Mann–Whitney test. However, the evaluation based on the efficient computation of cross
inversions (especially in the case of unbalanced set sizes $m$ and $n$) is superior to the evaluation
based on the rank sum $W_B$, as the latter presumes sorting the combined set $C = A \sqcup B$.
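Identity (10.1) is easy to validate on random data (our own sketch; names hypothetical):

```python
import random

def xinv(A, B, rank):
    """Cross inversions: pairs (a, b) with a in A, b in B, rank[a] > rank[b]."""
    return sum(1 for a in A for b in B if rank[a] > rank[b])

random.seed(0)
items = list(range(20))
values = random.sample(range(100), 20)        # injective "ranking"
rank = dict(zip(items, values))
A, B = items[:12], items[12:]                 # m = 12, n = 8
m, n = len(A), len(B)
# W_B: rank sum of B, using positions 1..m+n in the combined sorted order
by_rank = sorted(items, key=lambda x: rank[x])
pos = {x: p + 1 for p, x in enumerate(by_rank)}
W_B = sum(pos[y] for y in B)
assert xinv(A, B, rank) == m * n + n * (n + 1) // 2 - W_B
```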
Acknowledgements
We would like to thank Greta Panova, Robin Pemantle, and Richard Stanley for pointers regarding
Gaussian polynomials, Stasys Jukna for answering questions about the complexity measure
π, and the anonymous reviewers and copy editors for helpful suggestions. We appreciate the
partial support for this research by the U.S. National Science Foundation under Grants No.
2137424 and 2312540.
References
[1] Michael Ben-Or: Lower bounds for algebraic computation trees. In Proc. 15th STOC, pp.
80β86. ACM Press, 1983. [doi:10.1145/800061.808735] 10
[2] Manuel Blum, Robert W. Floyd, Vaughan Pratt, Ronald L. Rivest, and Robert E. Tarjan:
Time bounds for selection. J. Comput. System Sci., 7(4):448β461, 1973. [doi:10.1016/S0022-
0000(73)80033-9] 12
[3] Hans L. Bodlaender, Fedor V. Fomin, Arie M. C. A. Koster, Dieter Kratsch, and Dimitrios M.
Thilikos: A note on exact algorithms for vertex ordering problems on graphs. Theory
Computing Sys., 50(3):420β432, 2012. [doi:10.1007/s00224-011-9312-0] 57
[4] Jean Cardinal, Samuel Fiorini, GwenaΓ«l Joret, RaphaΓ«l M. Jungers, and J. Ian Munro: An
efficient algorithm for partial order production. SIAM J. Comput., 39(7):2927β2940, 2010.
[doi:10.1137/090759860] 13
[5] Svante Carlsson and Jingsen Chen: The complexity of heaps. In Proc. 3rd Ann. ACMβSIAM
Symp. on Discrete Algorithms (SODAβ92), pp. 393β402. SIAM, 1992. ACM DL. 12
[6] Richard Degerman: Ordered binary trees constructed through an application of Kendallβs
tau. Psychometrika, 47(4):523β527, 1982. [doi:10.1007/BF02293713] 2, 3, 4
[7] Dorit Dor and Uri Zwick: Selecting the median. SIAM J. Comput., 28(5):1722β1758, 1999.
[doi:10.1137/S0097539795288611] 12
[8] Dorit Dor and Uri Zwick: Median selection requires (2 + π)π comparisons. SIAM J. Discr.
Math., 14(3):312β325, 2001. [doi:10.1137/S0895480199353895] 12
[9] David Eppstein: A permutohedron, 2007. LINK, see also Wikipedia article, accessed
11-07-2024. 9
T HEORY OF C OMPUTING, Volume 20 (7), 2024, pp. 1β62 59
I VAN H U , D IETER VAN M ELKEBEEK , AND A NDREW M ORGAN
[10] Guy Even, Joseph Naor, Baruch Schieber, and Madhu Sudan: Approximating mini-
mum feedback sets and multicuts in directed graphs. Algorithmica, 20(2):151β174, 1998.
[doi:10.1007/pl00009191] 58
[11] Gaston H. Gonnet and J. Ian Munro: Heaps on heaps. SIAM J. Comput., 15(4):964β971,
1986. [doi:10.1137/0215068] 12
[12] Venkatesan Guruswami, Johan HΓ₯stad, Rajsekar Manokaran, Prasad Raghavendra,
and Moses Charikar: Beating the random ordering is hard: Every ordering CSP is
approximation resistant. SIAM J. Comput., 40(3):878β914, 2011. [doi:10.1137/090756144] 58
[13] Michael Held and Richard M. Karp: A dynamic programming approach to sequencing
problems. J. SIAM, 10(1):196β210, 1962. [doi:10.1137/0110015] 57
[14] Ivan Hu, Dieter van Melkebeek, and Andrew Morgan: Query complexity of inversion
minimization on trees. In Proc. 34th Ann. ACMβSIAM Symp. on Discrete Algorithms (SODAβ23),
pp. 2836β2866. SIAM, 2023. [doi:10.1137/1.9781611977554.ch107] 1
[15] Stasys Jukna, Alexander A. Razborov, Petr SavickΓ½, and Ingo Wegener: On P versus
NP β© co-NP for decision trees and read-once branching programs. Comput. Complexity,
8(4):357β370, 1999. [doi:10.1007/s000370050005] 11
[16] Kanela Kaligosi, Kurt Mehlhorn, J. Ian Munro, and Peter Sanders: Towards optimal
multiple selection. In Proc. 32nd Internat. Colloq. on Automata, Languages, and Programming
(ICALPβ05), pp. 103β114. Springer, 2005. [doi:10.1007/11523468_9] 12
[17] Richard M. Karp: Reducibility among combinatorial problems. In Raymond E. Miller
and James W. Thatcher, editors, Complexity of Computer Computations, pp. 85β103. Plenum
Press/Springer, 1972. Available at ResearchGate. [doi:10.1007/978-1-4684-2001-2_9] 57
[18] Marek Karpinski and Warren Schudy: Faster algorithms for feedback arc set tournament,
Kemeny rank aggregation and betweenness tournament. In Proc. Internat. Symp. on
Algorithms and Computation (ISAACβ10), pp. 3β14. Springer, 2010. [doi:10.1007/978-3-642-
17517-6_3] 58
[19] Claire Kenyon-Mathieu and Warren Schudy: How to rank with few errors. In Proc. 39th
STOC, pp. 95β103. ACM Press, 2007. [doi:10.1145/1250790.1250806] 58
[20] Henry B. Mann and Donald R. Whitney: On a test of whether one of two random variables
is stochastically larger than the other. Ann. Math. Stat., 18(1):50β60, 1947. JSTOR. 11
[21] Stephen Melczer, Greta Panova, and Robin Pemantle: Counting partitions inside a
rectangle. SIAM J. Discr. Math., 34(4):2388β2410, 2020. [doi:10.1137/20M1315828] 11
[22] George A. Miller: Psychological method to investigate verbal concepts. J. Mathematical
Psychology, 6(2):169β191, 1969. [doi:10.1016/0022-2496(69)90001-7] 2
T HEORY OF C OMPUTING, Volume 20 (7), 2024, pp. 1β62 60
L OWER B OUND T ECHNIQUES IN THE C OMPARISON -Q UERY M ODEL
[23] Ryan OβDonnell: Analysis of Boolean Functions. Cambridge Univ. Press, 2014.
[doi:10.1017/CBO9781139814782] 14
[24] Alfréd Rényi: Probability Theory. North-Holland, 1970; Dover, 2007. 44
[25] Arnold Schönhage: The production of partial orders. Astérisque, 38–39:229–246, 1976.
NumDam. 12
[26] Igor S. Sergeev: On the upper bound of the complexity of sorting. Comput. Math. and Math.
Physics, 61(2):329–346, 2021. [doi:10.1134/S0965542521020111] 5
[27] Richard P. Stanley and Fabrizio Zanello: Some asymptotic results on q-binomial
coefficients. Annals of Combinatorics, 20(3):623–634, 2016. [doi:10.1007/s00026-016-0319-8] 11
[28] Lajos Takács: Some asymptotic formulas for lattice paths. J. Stat. Planning and Inference,
14(1):123–142, 1986. [doi:10.1016/0378-3758(86)90016-9] 11
[29] Andrew Chi-Chin Yao: Probabilistic computations: Toward a unified measure of complexity.
In Proc. 18th FOCS, pp. 222–227. IEEE Comp. Soc., 1977. [doi:10.1109/SFCS.1977.24] 27, 28
AUTHORS
Ivan Hu
Ph. D. student
Department of Computer Science
University of Wisconsin–Madison
Madison, Wisconsin, USA
ilhu@wisc.edu
https://pages.cs.wisc.edu/~ihu/
Dieter van Melkebeek
Professor
Department of Computer Science
University of Wisconsin–Madison
Madison, Wisconsin, USA
dieter@cs.wisc.edu
https://pages.cs.wisc.edu/~dieter/
Andrew Morgan
Software engineer
Google
amorgan@cs.wisc.edu
https://pages.cs.wisc.edu/~amorgan/
ABOUT THE AUTHORS
Ivan Hu is a third-year Ph. D. student at the University of Wisconsin–Madison under
the supervision of Dieter van Melkebeek. He is currently studying complexity
theory.
Dieter van Melkebeek received his Ph. D. from the University of Chicago, under
the supervision of Lance Fortnow. His thesis was awarded the ACM Doctoral
Dissertation Award. After postdocs at DIMACS and the Institute for Advanced
Study, he joined the faculty at the University of Wisconsin–Madison, where
he currently is a full professor. His research interests include the power of
randomness, lower bounds for NP-complete problems, and connections between
derandomization and lower bounds.
Andrew Morgan received his Ph. D. in 2022 from the University of Wisconsin–Madison
under the supervision of Dieter van Melkebeek. After graduating,
he became a software engineer at Google.