                         THEORY OF COMPUTING, Volume 10 (10), 2014, pp. 237–256
                                        www.theoryofcomputing.org




             Lower Bounds for the Average and
            Smoothed Number of Pareto-Optima
   Tobias Brunsch                    Navin Goyal                   Luis Rademacher            Heiko Röglin
                Received February 4, 2013; Revised May 13, 2014; Published September 29, 2014




       Abstract: Smoothed analysis of multiobjective 0–1 linear optimization has drawn con-
       siderable attention recently. The goal is to give bounds for the number of Pareto-optimal
       solutions (i. e., solutions with the property that no other solution is at least as good in all
       the coordinates and better in at least one) for multiobjective optimization problems. In
       this article we prove several lower bounds for the expected number of Pareto optima. Our
       basic result is a lower bound of Ω_d(n^{d−1}) for optimization problems with d objectives and n
       variables under fairly general conditions on the distributions of the linear objectives. Our
       proof relates the problem of finding lower bounds for the number of Pareto optima to results
       in discrete geometry and geometric probability about arrangements of hyperplanes. We
       use our basic result to derive the following results: (1) To our knowledge, the first lower
       bound for natural multiobjective optimization problems. We illustrate this on the maximum
       spanning tree problem with randomly chosen edge weights. Our technique is sufficiently
       flexible to yield such lower bounds also for other standard objective functions studied in this
       setting (such as multiobjective shortest path, TSP, matching). (2) A smoothed lower bound
       of min{Ω_d(n^{d−1.5} φ^d), 2^{Θ_d(n)}} for φ-smooth instances of the 0–1 knapsack problem with d
       profits.

ACM Classification: F.2.2, G.1.6, G.3
AMS Classification: 68Q25
Key words and phrases: multiobjective optimization, probabilistic analysis, smoothed analysis
    Preliminary versions of parts of this paper appeared in the Proceedings of the 8th Annual Conference on Theory and
Applications of Models of Computation (TAMC’11) [6] and the Proceedings of the 32nd Annual Conference on Foundations of
Software Technology and Theoretical Computer Science (FSTTCS’12) [13].


© 2014 Tobias Brunsch, Navin Goyal, Luis Rademacher, and Heiko Röglin
Licensed under a Creative Commons Attribution License (CC-BY)                         DOI: 10.4086/toc.2014.v010a010

1    Introduction
Multiobjective optimization involves scenarios where there is more than one objective function to
optimize: when planning a train trip we may want to choose connections that minimize fare, total time,
number of train changes, etc. The objectives may be in conflict with each other and there may not be a
single best solution to the problem. Such multiobjective optimization problems arise in diverse fields
ranging from economics to computer science, and have been well-studied. A number of approaches exist
in the literature to deal with the trade-offs among the objectives in such situations: goal programming,
multiobjective approximation algorithms, Pareto-optimality; see, e. g., [14, 15, 19] for references. It is
the latter approach using Pareto-optimality that concerns us in this article. A Pareto-optimal solution
is a solution with the property that no other solution is at least as good in all the objectives and better
in at least one. Clearly, the set of Pareto-optimal solutions (Pareto set in short) contains all desirable
solutions as any other solution is strictly worse than a solution in the Pareto set. In the worst case, the
Pareto set can be exponentially large as a function of the input size (see, e. g., [11, 16]). However, in many
cases of interest, the Pareto set is typically not too large. If the Pareto set is small and can be generated
efficiently, then often some possibly human-assisted post-processing is applied to make a choice among
the Pareto-optimal solutions after the Pareto set has been generated. Pareto sets are also used in heuristics
for optimization problems (e. g., [17]). To explain why Pareto sets are frequently small in practice,
multiobjective optimization has recently been studied from the view-point of smoothed analysis [22]. We
introduce some notation before describing this work.

Notation. For a positive integer n, we denote the set {1, 2, . . . , n} by [n]. We will use the partial order
⪯ in R^d defined by x ⪯ y iff for all i ∈ [d] we have x_i ≤ y_i. For a, b ∈ R^d we say that b dominates a if
b_i ≥ a_i for all i ∈ [d], and for at least one i ∈ [d], we have strict inequality. We denote the relation “b
dominates a” by b ≻ a (equivalently, a ≺ b).
      We say that a probability distribution over Rd is symmetric if for any measurable set A we have that
the probability of A is the same as the probability of −A. Recall that a distribution is absolutely continuous
if it has a density function. We say that a random variable is symmetric (or absolutely continuous) if its
distribution is symmetric (or absolutely continuous, respectively).
      The multiobjective optimization problems we study have binary variables and linear objective
functions. In a general setting, the feasible solution set is an arbitrary set S ⊆ {0, 1}n . The problem has d
linear objective functions v(i) : S → R, given by
                                    v^(i)(x) = ∑_{j∈[n]} v_j^(i) x_j ,     for i ∈ [d],

and (v_1^(i), . . . , v_n^(i)) ∈ R^n (so v^(i) is also interpreted as an n-dimensional vector in the natural way). For
convenience, we will assume, unless otherwise specified, that we want to maximize all the objectives,
and we will refer to the objectives as profits. This entails no loss of generality. Thus the optimization
problem considered is the following.

                                 maximize v^(1)(x), . . . , maximize v^(d)(x),                          (1.1)
                                 subject to x ∈ S.



    Let V be the d × n matrix with rows v^(1), . . . , v^(d). A solution x ∈ S is said to be Pareto-optimal (or
maximal under ≺) if V x ⊀ V y for all y ∈ S. For a set X of points in R^d, let p(X) denote the number of
Pareto optima in X.
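To make the definitions concrete, the following brute-force sketch (ours, not part of the paper; the helper names `dominates` and `pareto_count` are our own) counts the Pareto optima of the random set X = {V r : r ∈ {0, 1}^n} for a small random matrix V:

```python
from itertools import product
import random

def dominates(b, a):
    """b dominates a: b >= a coordinatewise, strictly in at least one coordinate."""
    return all(bi >= ai for bi, ai in zip(b, a)) and any(bi > ai for bi, ai in zip(b, a))

def pareto_count(points):
    """Number of Pareto optima (points dominated by no other point) in a finite set."""
    pts = list(points)
    return sum(1 for x in pts if not any(dominates(y, x) for y in pts if y is not x))

# X = {V r : r in {0,1}^n} for a random d x n matrix V with i.i.d. uniform entries
random.seed(0)
d, n = 2, 5
V = [[random.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(d)]
X = [tuple(sum(V[i][j] * r[j] for j in range(n)) for i in range(d))
     for r in product([0, 1], repeat=n)]
print(pareto_count(X))
```

For instance, `pareto_count([(1, 2), (2, 1), (0, 0), (1, 1)])` returns 2, since only (1, 2) and (2, 1) are undominated.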

Multiobjective knapsack. For the special case of the multiobjective 0–1 knapsack problem we have
d + 1 linear objective functions. One of the objective functions, w : {0, 1}^n → R, called weight, is given
by
                                            w(x) = ∑_{j∈[n]} w_j x_j .

Weight is to be minimized. The other d objective functions v^(i) : {0, 1}^n → R are profits as before and are
given by v^(i)(x) = ∑_{j∈[n]} v_j^(i) x_j. Profits are to be maximized. We require that all the entries in w and v^(i)
come from [0, 1].
    For the knapsack problem, we need to modify the definition of domination appropriately because
while we want to maximize the profit, we want to minimize the weight. It will be clear from the context
which notion is being used.

Smoothed analysis. For our multiobjective optimization problem (1.1), in the worst case the size of
the Pareto set can be exponential even for d = 2 (the bicriteria case). Smoothed analysis is a framework
for the analysis of algorithms introduced by Spielman and Teng [22] to explain the fast running time
of the Simplex algorithm in practice, despite having exponential running time in the worst case. Since
its introduction, smoothed analysis has been applied to a variety of algorithms. Beier and Vöcking [3]
studied the 0–1 knapsack problem under smoothed analysis. In our context of multiobjective optimization,
smoothed analysis would mean that the instance (specified by V ) is chosen adversarially, but then each
entry is independently perturbed according to, say, Gaussian noise with small standard deviation. In
fact, Beier and Vöcking [3] introduced a more general notion of smoothed analysis. In one version
of their model, each entry of the matrix V is an independent random variable taking values in [−1, 1]
with the restriction that each has probability density function bounded from above by φ , for a parameter
φ ≥ 1. We refer to distributions supported on [−1, 1] with probability density bounded above by φ as
φ -smooth distributions. This model is more general than Spielman and Teng’s because, by choosing the
densities appropriately, the adversary can not only determine the mean values of the entries, as in the
original model, but also the type of noise. For greater generality, one of the rows of V could be chosen
fully adversarially (deterministically). As φ is increased, the smoothed model becomes more like the
worst-case model. With the exception of Theorem 1.4 below, we do not require adversarial choice of a
row in V .

Previous work. Beier and Vöcking [2] showed that in the above model for the 0–1 knapsack problem
with adversarial weights the expected number of Pareto optima is O(n4 φ ). The result generalizes to
other bicriteria optimization problems. Beier et al. [1] make this generalization explicit and improve the


upper bound to O(n^2 φ). Röglin and Teng [19] studied multiobjective optimization problems in the above
framework. They showed that the expected size of the Pareto set with one adversarial and d φ-smooth
objectives is of the form O_d((nφ)^{2^{d−2}(d+1)!}). Note that with the notation O_d, and, analogously with Ω_d
and Θ_d, we suppress factors that depend only on d. Moitra and O'Donnell [15] improved this upper bound
to 2n^{2d} · (4φd)^{d(d+1)/2}. (Again, these results allow one of the objectives to be chosen adversarially.) Very
recently, this has been improved to O_d(n^{2d} φ^d) for the mildly restricted class of quasiconcave φ-smooth
instances [7]. The question of a lower bound for the expected number of Pareto optima was raised in [19]
and [15].
    An early average-case lower bound of Ω(n2 ) was proved in [2] for the knapsack problem with a single
profit vector. This result, however, required an adversarial choice of exponentially increasing weights.
    Ehrgott [11] presents examples showing that many problems (including the 0–1 knapsack problem
and the maximum spanning tree problem) can have an exponential number of Pareto optima in the worst
case even for only two objective functions.

Our results. In this article we prove lower bounds for the expected number of Pareto optima. Our basic
result, Theorem 1.1, deals with the case when the entries of the matrix V are independent, symmetrically
distributed, absolutely continuous random variables. Note that we do not require that the distributions
be identical: each entry can have a different distribution. This generality will in fact be useful in our
lower bound for the maximum spanning tree problem, Theorem 1.2. Note that all entries of V are random,
unlike in the results discussed above where one of the objectives is chosen adversarially. This makes our
lower bound stronger.

Theorem 1.1 (Basic theorem). Suppose all entries of a d × n random matrix V are independent, symmetri-
cally distributed, absolutely continuous random variables. Let X denote the random set {V r : r ∈ {0, 1}^n}.
Then

                               E_V p(X) ≥ (1/2^{d−1}) ∑_{k=0}^{d−1} \binom{n−1}{k} .                       (1.2)

    This implies the simpler bound

                               E_V p(X) ≥ ((n−1)/(2(d−1)))^{d−1}

using the well-known bound

                               \binom{n−1}{d−1} ≥ ((n−1)/(d−1))^{d−1} .
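The two bounds are easy to evaluate numerically. The following check (ours, not from the paper; the function names are our own) verifies that the binomial bound implies the simpler closed form for a range of n and d:

```python
from math import comb

def basic_lower_bound(n, d):
    """Right-hand side of (1.2): (1/2^(d-1)) * sum_{k=0}^{d-1} C(n-1, k)."""
    return sum(comb(n - 1, k) for k in range(d)) / 2 ** (d - 1)

def simple_lower_bound(n, d):
    """The weaker closed form ((n-1) / (2(d-1)))^(d-1), valid for d >= 2."""
    return ((n - 1) / (2 * (d - 1))) ** (d - 1)

# C(n-1, d-1) >= ((n-1)/(d-1))^(d-1) gives basic_lower_bound >= simple_lower_bound:
for n in range(3, 40):
    for d in range(2, min(n, 8)):
        assert comb(n - 1, d - 1) >= ((n - 1) / (d - 1)) ** (d - 1)
        assert basic_lower_bound(n, d) >= simple_lower_bound(n, d)

print(basic_lower_bound(10, 2), simple_lower_bound(10, 2))  # 5.0 4.5
```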
    We give two proofs of this result. The two proofs have a similar essence, but different form. Both
proofs relate the problem at hand to some well-known results in geometry. This connection with geometry
is new and may be useful for future research. The first proof lower bounds the expected number of
Pareto optima of a point set by the expected number of vertices of its convex hull (up to a constant that
depends on d but not on n) and then invokes known lower bounds for the expected number of vertices
of projections of hypercubes, as the random set in Theorem 1.1 is a linear image of the vertices of the


hypercube. The second proof gives a characterization of maximality in terms of 0–1 vectors and then
relaxes integrality to get a relaxed dual characterization by means of convex separation, which reduces
counting Pareto optima to giving a lower bound on the probability that the convex hull of n random points
contains the origin. This probability is known exactly by a theorem of Wendel (Theorem 2.2).
     Interestingly, our lower bound is basically the same as the expected number of Pareto optima
when 2^n uniformly random points are chosen from [−1, 1]^d, which is shown to be Θ_d(n^{d−1}) in several
papers [4, 9, 8]. This raises the possibility of a closer connection between the two models; such a
connection could be useful as the model of uniformly random points is understood better.
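For intuition about the uniform-points model, note that for d = 2 the expected number of maxima among N i.i.d. points with continuous coordinates equals the harmonic number H_N, so N = 2^n points give Θ(n) = Θ(n^{d−1}) maxima. The following simulation (ours, not from the paper; `count_maxima_2d` is our own helper, exploiting that a point is maximal iff its y-coordinate is a record when points are scanned in decreasing x-order) checks this:

```python
import random

def count_maxima_2d(pts):
    """Pareto maxima of 2-d points with distinct coordinates: sort by x
    descending; a point is maximal iff its y-value is a running maximum."""
    pts = sorted(pts, reverse=True)
    best_y = float("-inf")
    count = 0
    for _, y in pts:
        if y > best_y:
            best_y, count = y, count + 1
    return count

random.seed(1)
N, trials = 1000, 300
avg = sum(count_maxima_2d([(random.random(), random.random()) for _ in range(N)])
          for _ in range(trials)) / trials
harmonic = sum(1.0 / k for k in range(1, N + 1))  # H_N, the exact expectation
print(avg, harmonic)
```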
     The basic theorem above corresponds to the case when the set of feasible solutions S is {0, 1}n . But
in many interesting cases S is a strict subset of {0, 1}n : for example, in the multiobjective spanning
tree problem, n is the number of edges in an underlying network, and S is the set of incidence vectors
of spanning trees in the network; similarly, for the multiobjective shortest path problem, S is the set
of incidence vectors of s–t paths. We can use our basic theorem to prove lower bounds for the size of
the Pareto set for such S. Our technique is pliable enough to give interesting lower bounds for many
standard objective functions used in multiobjective optimization (in fact, any standard objective that we
tried): multiobjective shortest path, TSP, matching, arborescence, etc. We will illustrate the idea with
the multiobjective spanning tree problem on the complete graph. In this problem, we have the complete
undirected graph K_n on n vertices as the underlying graph. Each edge e has a set of profits v^(i)(e) ∈ [−1, 1]
for i ∈ [d]. The set S of feasible solutions is given by the incidence vectors of spanning trees of K_n. Notice
that the feasible solutions here live in {0, 1}^{\binom{n}{2}} and not in {0, 1}^n.
Theorem 1.2. In the d-objective maximum spanning tree problem on K_n there exists a choice of 2-smooth
distributions such that the expected number of Pareto-optimal spanning trees is at least

                                   ((n−3)/(2(d−1)))^{d−1} .
    The proof of this theorem utilizes the full power of Theorem 1.1, namely the ability to choose different
symmetric distributions.
    In our basic theorem above, Theorem 1.1, we required the distributions to be symmetric, and therefore
that theorem does not apply to the 0–1 knapsack problem where all profits and weights are non-negative.
With a slight loss in the lower bound we also get a lower bound for this case. In the d-dimensional
0–1 knapsack problem we have d objectives v^(i) for i ∈ [d], called profits, and an additional objective w,
called weight. The components of the v^(i) and of w are all chosen from [0, 1]. We want to maximize the
profits and minimize the weight, and so the definitions of domination and Pareto-optimality are accordingly
modified.
Theorem 1.3. For the multiobjective 0–1 knapsack problem where all the weight components are 1 and
profit components are chosen uniformly at random from [0, 1], the expected number of Pareto optima is
Ω_d(n^{d−1.5}).
     Theorem 1.1 or Theorem 1.3 (depending on whether one wants a bound for non-negative or unre-
stricted weights and profits) can be used to give the following lower bound on the expected number of
Pareto optima when the profits are φ -smooth (actually, uniform in carefully chosen intervals of length at
least 1/φ ):


Theorem 1.4. For any fixed d ≥ 2 and for n ≥ 8d and φ ≥ 2d there exist

    1. weights w1 , . . . , wn ≥ 0,

    2. intervals [a_{ij}, b_{ij}] ⊆ [0, 1], i ∈ [d], j ∈ [n], of length at least 1/φ and with a_{ij} ≥ 0, and

    3. a set S ⊆ {0, 1}n
such that if profits v_j^(i) are chosen independently and uniformly at random in [a_{ij}, b_{ij}], then the expected
number of Pareto-optimal solutions of the d-dimensional knapsack problem with solution set S is at least

                                      min{Ω_d(n^{d−1.5} φ^d), 2^{Θ_d(n)}} .


      This lower bound should be contrasted with the aforementioned upper bounds 2 · (4φd)^{d(d+1)/2} ·
n^{2d} [15] and O_d(n^{2d} φ^d) [7]. This latter bound is for the mildly restricted but natural class of φ-smooth
instances with quasiconcave densities, where a density f is said to be quasiconcave if there exists a value
x ∈ R such that it is non-decreasing below x and non-increasing above x. The densities in our lower bound
instances are quasiconcave. For general multiobjective optimization (basically without the restriction of
entries being non-negative) the exponent of n becomes exactly d − 1 in Theorem 1.4.
    Theorem 1.4 requires S to be chosen adversarially. To our knowledge, no non-trivial lower bounds
were known before our work for natural choices of S. This is addressed by our Theorem 1.1 (S = {0, 1}n )
and Theorem 1.2 (S is the set of spanning trees of the complete graph) above; these theorems are for a
small constant value of φ , and therefore do not clarify what the dependence on φ should be.


2      The basic theorem
In this section we prove Theorem 1.1. We will include two proofs that, while in essence the same,
emphasize the geometric and algebraic views, respectively. Also the second proof is more self-contained.
It is perhaps worth mentioning that we first discovered the second proof, and in the course of writing the
present article we found the ideas and known results that can be combined to get a more geometric proof.


2.1     First proof
We will use the following result that relates the number of Pareto optima of a point set with the number
of vertices of its convex hull:

Theorem 2.1 ([4], [18, Theorem 4.7]). Let P be a finite subset of R^d and for σ ∈ {−1, +1}^d let

                          P_σ = {(σ_1 x_1 , . . . , σ_d x_d) : (x_1 , . . . , x_d) ∈ P} .

For every vertex (x_1 , . . . , x_d) of the convex hull of P there exists σ ∈ {−1, +1}^d such that (σ_1 x_1 , . . . , σ_d x_d)
is maximal under ≺ with respect to the set P_σ.


First proof of Theorem 1.1. The convex hull of X is a random polytope. By Theorem 2.1, every vertex is
maximal under our partial order ≺ for at least one of the 2^d reflections involving coordinate hyperplanes.
This means that

       |vertices of conv X| ≤ ∑_{σ∈{−1,1}^d} p(X with coordinates of points flipped by signs in σ) ,             (2.1)

where conv X denotes the convex hull of X. Our symmetry assumption followed by (2.1) implies

             E_V(p(X)) = (1/2^d) · ∑_{σ∈{−1,1}^d} E_V p(X with coordinates of points flipped by signs in σ)

                       ≥ (1/2^d) · E_V |vertices of conv X| .
This idea is from [4]. It is used there in the opposite direction, that is, to get upper bounds for the expected
number of vertices from upper bounds for the expected number of maximal points.
    In order to count the vertices of the convex hull of X, we observe that the convex hull of X has a
special structure: it is the image of an n-dimensional hypercube via a linear map. Such an image is called
a zonotope. It is clear that a set in Rd is a zonotope iff it is the Minkowski sum of a finite number of
segments. It is known [10, Theorem 1.8] that for a matrix V with columns in general position (that is, any
d columns are linearly independent, which happens almost surely (with probability 1) in our case), the
number of vertices of conv X is equal to the maximum number of vertices of a d-dimensional zonotope
formed as the Minkowski sum of n line segments, and this number is known exactly [12, 31.1.1]. That is,
almost surely,

                               |vertices of conv X| = 2 ∑_{k=0}^{d−1} \binom{n−1}{k} .

The claimed bound follows.
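The vertex count can be checked numerically for d = 2, where the formula gives 2 ∑_{k=0}^{1} \binom{n−1}{k} = 2n. The sketch below (ours, not from the paper; `convex_hull` is our own monotone-chain helper) computes the hull of X = {V r : r ∈ {0, 1}^n} for a random 2 × n matrix V, whose columns are in general position with probability 1:

```python
from itertools import product
import random

def convex_hull(points):
    """Andrew's monotone chain: hull vertices of a 2-d point set, CCW order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

random.seed(2)
d, n = 2, 6
V = [[random.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(d)]
X = [tuple(sum(V[i][j] * r[j] for j in range(n)) for i in range(d))
     for r in product([0, 1], repeat=n)]
# conv X is a zonotope (a sum of n segments); for d = 2 it has 2n = 12 vertices.
print(len(convex_hull(X)))
```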

2.2    Second proof
Some more definitions before getting into the proof: set R_+ = {x ∈ R : x ≥ 0} and R_− = {x ∈ R : x ≤ 0}.
For σ ∈ {−1, 1}^d, the orthant associated with σ is {(σ_1 x_1 , . . . , σ_d x_d) : (x_1 , . . . , x_d) ∈ R_+^d}. In particular,
if σ is the all 1's vector then we call its associated orthant the positive orthant, and if σ is the all −1's
vector then we call its orthant the negative orthant. For a finite set of points P = {p_1 , . . . , p_k} ⊆ R^d, the
conic hull is denoted cone(P) = {∑_{i=1}^k α_i p_i : α_i ≥ 0} (note that the conic hull is always convex).
    We will use the following result by Wendel.

Theorem 2.2 ([23], [21, Theorem 8.2.1]). If X_1 , . . . , X_n are independent random points in R^d whose
distributions are symmetric (but not necessarily identical) and such that with probability 1 all subsets of
size d are linearly independent, then

                          Pr[0 ∉ conv{X_1 , . . . , X_n}] = (1/2^{n−1}) ∑_{k=0}^{d−1} \binom{n−1}{k} .


                      T HEORY OF C OMPUTING, Volume 10 (10), 2014, pp. 237–256                                       243
                 T OBIAS B RUNSCH , NAVIN G OYAL , L UIS R ADEMACHER , AND H EIKO R ÖGLIN

    The linear independence condition holds in particular under the simpler assumption that no hyperplane
through the origin is assigned positive probability by any of the n distributions. For example, it holds
when the points are drawn i. i. d. at random from the unit sphere.
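Wendel's formula is easy to test empirically in the plane, where 0 lies outside the convex hull iff all points fit in a half-plane through the origin, i.e., iff some circular gap between consecutive point angles is at least π. The simulation below (ours, not from the paper; `origin_outside_hull_2d` is our own helper) uses points drawn i.i.d. from the unit circle, for which the formula gives n/2^{n−1}:

```python
import math
import random

def origin_outside_hull_2d(pts):
    """0 lies outside conv(pts) iff all points fit in a closed halfplane
    through 0, i.e., some circular gap between point angles is >= pi."""
    angles = sorted(math.atan2(y, x) for x, y in pts)
    gaps = [b - a for a, b in zip(angles, angles[1:])]
    gaps.append(2 * math.pi - (angles[-1] - angles[0]))
    return max(gaps) >= math.pi

random.seed(3)
n, trials = 5, 20000
hits = 0
for _ in range(trials):
    # a symmetric distribution: uniform on the unit circle
    pts = [(math.cos(t), math.sin(t))
           for t in (random.uniform(0, 2 * math.pi) for _ in range(n))]
    hits += origin_outside_hull_2d(pts)
est = hits / trials
exact = n / 2 ** (n - 1)  # Wendel, d = 2: (1/2^{n-1}) (C(n-1,0) + C(n-1,1))
print(est, exact)
```

For n = 5 the exact probability is 5/16 = 0.3125, and the estimate lands close to it.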

Second proof of Theorem 1.1. By linearity of expectation we have E p(X) = ∑_r Pr[V r maximal]. Notice
that Pr[V r maximal] does not depend on r because all entries of V are symmetrically distributed, so we
can write E p(X) = 2^n Pr[V 1 maximal], where 1 denotes the all 1's vector.
     For the rest of the proof we will focus on finding a lower bound for this last probability. To understand
this probability we first rewrite the event [V 1 maximal] in terms of a different event via easy intermediate
steps:

  [V 1 maximal] = [V r ⊁ V 1, ∀r ∈ {0, 1}^n ] = [0 ⊁ V (1 − r), ∀r ∈ {0, 1}^n ] = [0 ⊁ V r, ∀r ∈ {0, 1}^n ] .

Now we have
                                Pr[0 ⊁ V r, ∀r ∈ {0, 1}^n ] ≥ Pr[0 ⊁ V r, ∀r ∈ [0, 1]^n ] .

The event [0 ⊁ V r, ∀r ∈ [0, 1]^n ] is the same as the event [cone(v_1 , . . . , v_n) ∩ R_−^d = {0}]. That is to say, the
cone generated by the non-negative linear combinations of v_1 , . . . , v_n does not have a point distinct from
the origin that lies in the negative orthant.
    Then, by the separability property of convex sets, there exists a hyperplane H = {x ∈ R^d : ⟨u, x⟩ =
0} separating cone(v_1 , . . . , v_n) and R_−^d. More precisely, there exists a vector u ∈ R_+^d \ {0} such that
⟨u, x⟩ ≥ 0 for all x ∈ cone(v_1 , . . . , v_n), and this implies

             Pr[cone(v_1 , . . . , v_n) ∩ R_−^d = {0}] = Pr[∃u ∈ R_+^d \ {0} : ⟨u, x⟩ ≥ 0 ∀x ∈ cone(v_1 , . . . , v_n)] .

Now

Pr[cone(v_1 , . . . , v_n) contained in some halfspace]
             ≤ ∑_{σ∈{−1,1}^d} Pr[cone(v_1 , . . . , v_n) contained in some halfspace with inner normal in orthant σ]

             = 2^d Pr[∃u ∈ R_+^d \ {0} : ⟨u, x⟩ ≥ 0 ∀x ∈ cone(v_1 , . . . , v_n)] .

Clearly, we have

       [cone(v_1 , . . . , v_n) contained in some halfspace] = [v_1 , . . . , v_n contained in some halfspace] .

Theorem 2.2 and the fact that the distribution of v_i is symmetric and assigns measure zero to every
hyperplane through 0 imply

      Pr[v_1 , . . . , v_n contained in some halfspace] = Pr[0 ∉ conv{v_1 , . . . , v_n}] = (1/2^{n−1}) ∑_{k=0}^{d−1} \binom{n−1}{k} .

We conclude:
                                     E p(X) ≥ (1/2^{d−1}) ∑_{k=0}^{d−1} \binom{n−1}{k} .



    The following easy corollaries of Theorem 1.1 will be useful later.

Corollary 2.3. Theorem 1.1 holds also for the feasible set S = {−1, 1}^n.

Proof. For x ∈ {0, 1}^n define x′ = (2x_1 − 1, 2x_2 − 1, . . . , 2x_n − 1) = 2x − 1, where 1 is the all 1's vector.
Thus x′ ∈ {−1, 1}^n. Now V x′ = 2V x − V 1 and so V x′ ≺ V y′ if and only if V x ≺ V y. Hence x is Pareto-
optimal for the feasible set {0, 1}^n if and only if x′ is Pareto-optimal for the feasible set {−1, 1}^n. This
immediately implies the corollary.

Corollary 2.4. Under the assumptions of Theorem 1.1 and when the set of feasible solutions is S ⊆ {0, 1}^n
we have

                               E p(X) ≥ (|S|/2^{n+d−1}) ∑_{k=0}^{d−1} \binom{n−1}{k} .                                  (2.2)

Proof. For any given r ∈ S, the probability that it is Pareto-optimal in the current instance with solution
set restricted to S is at least the probability that it is Pareto-optimal in the instance with solution set
{0, 1}n . By the symmetry of the distributions this probability is independent of r and by Theorem 1.1 it is
at least

                                    (1/2^{n+d−1}) ∑_{k=0}^{d−1} \binom{n−1}{k} .

Linearity of expectation completes the proof.


3    Lower bound for multiobjective maximum spanning trees
In this section we show that our basic result can be used to derive similar lower bounds for S other than
those encountered earlier in this article. We illustrate this for the case of the multiobjective maximum
spanning tree problem on the complete graph; for this problem, S is the set of incidence vectors of all
spanning trees on n vertices. The idea of the proof is simple: “embed” the given instance of the basic
problem into an instance of the problem at hand. The proof requires the full power of Theorem 1.1,
namely the distributions of the entries of V need not be identical. It is worth noting that the direct use of
Corollary 2.4 does not provide any useful bound for the case of spanning trees because in this case the
right hand side of (2.2) becomes less than 1. The proof below can easily be modified so that all profits are
chosen from intervals of non-negative numbers.
    We now prove Theorem 1.2.

Proof of Theorem 1.2. The idea of the proof is to embed an instance of the case S = {0, 1}^{n−2} of the
basic theorem into the tree instance. We now describe our tree instance. We identify a subgraph G of
K_n (the complete graph on n vertices): the vertex set of G is the same as the vertex set of K_n, which
we denote by {s, t, u_1 , u_2 , . . . , u_{n−2}} for convenience. The edge set of G consists of the edge (s,t), and
edges (s, u_j), (t, u_j) for j ∈ [n − 2]. Thus, G consists of 2n − 3 edges. Now we choose the distribution of the profits
for each edge of Kn . For edges outside G, the distribution for all profits is identical and it is simply the


uniform distribution on [−1, −1/2]. For edge (s,t), the distribution is uniform over [1/2, 1]. For all other
edges (i. e., the edges (s, u j ) and (t, u j )), it is uniform over [−1/2, 1/2]. Note that these densities are not
symmetric. This is not a problem because we will see below that we do not need to apply Theorem 1.1
directly to these densities, but only to symmetric densities that are derived from them. Let T denote the
set of spanning trees that include edge (s,t), and, for every other vertex u_j, exactly one of (s, u_j) and
(t, u_j). Clearly |T| = 2^{n−2}. The result of the above choices of distributions is that all the Pareto-optimal
spanning trees come from T:
Claim 3.1. For any choices of profits from the intervals as specified above, if a tree T is Pareto-optimal
then T ∈ T.

Proof of the claim. Fix any choice of profits as above. Suppose that a tree T′ is Pareto-optimal but T′ ∉ T.
Then either (1) T′ has an edge outside E(G), or (2) all its edges are from E(G) but it does not use edge
(s, t). In case (1), remove the edges from T′ that are not in E(G) (these have profits at most −1/2), and
then complete the remaining disconnected graph to a spanning tree using edges from E(G) (these have
profit at least −1/2). Clearly, the resulting tree is heavier than T′ in each of the d profits. In case (2), add
edge (s, t) to T′ (it has profit at least 1/2), and remove some edge other than (s, t) from the resulting cycle
(this edge has profit at most 1/2). Again, the resulting tree is heavier than T′ in each of the d profits.
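The exchange argument above can be checked by brute force on a small instance. The following Python sketch (our illustration, not the authors' code) samples profits for K_4 (so n = 4 and vertices s, t, u_1, u_2) from the stated intervals with d = 2 objectives, enumerates all spanning trees, and verifies that every Pareto-optimal tree lies in T:

```python
# Brute-force sanity check of Claim 3.1 on K_4: every Pareto-optimal
# spanning tree uses edge (s,t) and, for each u_j, exactly one of
# (s,u_j) and (t,u_j).  Profit ranges follow the proof; d = 2 objectives.
import itertools
import random

random.seed(0)
V = ["s", "t", "u1", "u2"]
edges = list(itertools.combinations(V, 2))  # all 6 edges of K_4

def profit_range(e):
    if set(e) == {"s", "t"}:
        return (0.5, 1.0)            # edge (s,t)
    if "s" in e or "t" in e:
        return (-0.5, 0.5)           # edges (s,u_j), (t,u_j)
    return (-1.0, -0.5)              # edges outside G, here (u1,u2)

def is_spanning_tree(tree):
    # 3 acyclic edges on 4 vertices form a spanning tree (union-find check)
    comp = {v: v for v in V}
    def find(v):
        while comp[v] != v:
            v = comp[v]
        return v
    for a, b in tree:
        ra, rb = find(a), find(b)
        if ra == rb:
            return False
        comp[ra] = rb
    return True

def dominates(a, b):
    return all(x >= y for x, y in zip(a, b)) and a != b

for trial in range(20):
    profits = {e: tuple(random.uniform(*profit_range(e)) for _ in range(2))
               for e in edges}
    trees = [T for T in itertools.combinations(edges, 3) if is_spanning_tree(T)]
    value = {T: tuple(sum(profits[e][i] for e in T) for i in range(2))
             for T in trees}
    pareto = [T for T in trees
              if not any(dominates(value[U], value[T]) for U in trees)]
    for T in pareto:
        assert ("s", "t") in T               # must contain edge (s,t)
        for u in ("u1", "u2"):
            assert sum(u in e for e in T) == 1   # one of (s,u), (t,u)
print("Claim 3.1 holds on all sampled K_4 instances")
```

With continuous distributions, ties occur with probability zero, so the strict improvements in the exchange argument hold almost surely; the fixed seed above avoids boundary cases.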

    In the rest of the proof, i will range over [d]. The i-th profit of a spanning tree T ∈ T, which we will
denote by v^{(i)}(T), can be written as follows:

$$ v^{(i)}(T) = v^{(i)}(s,t) + \sum_{j=1}^{n-2} \Bigl( v^{(i)}(s,u_j)\, x_j + v^{(i)}(t,u_j)\,(1 - x_j) \Bigr), $$

where x_j = x_j(T) = 1 if edge (s, u_j) is in the tree and x_j = 0 otherwise, i. e., if edge (t, u_j) is in the tree.
We have

$$ v^{(i)}(s,u_j)\, x_j + v^{(i)}(t,u_j)\,(1 - x_j) = \frac{v^{(i)}(s,u_j) + v^{(i)}(t,u_j)}{2} + \bigl( v^{(i)}(s,u_j) - v^{(i)}(t,u_j) \bigr) \Bigl( x_j - \frac{1}{2} \Bigr). $$

Now, to compute the lower bound on the expected size of the Pareto set we reveal the v's in two
steps: first we reveal v^{(i)}(s,u_j) + v^{(i)}(t,u_j) for all u_j. Then the conditional distribution of each
v^{(i)}(s,u_j) − v^{(i)}(t,u_j) is symmetric (but can be different for different i). Thus the i-th profit of T ∈ T is

$$ v^{(i)}(T) = \sum_{j \in [n-2]} \bigl( v^{(i)}(s,u_j) - v^{(i)}(t,u_j) \bigr) \Bigl( x_j - \frac{1}{2} \Bigr) + A^{(i)}, $$

where

$$ A^{(i)} = v^{(i)}(s,t) + \sum_{j \in [n-2]} \frac{v^{(i)}(s,u_j) + v^{(i)}(t,u_j)}{2}. $$

Since A^{(i)} is common to all trees, only the first sum in the profit matters in determining Pareto-optimality.
Now we are in the situation dealt with by Corollary 2.3: for each fixing of v^{(i)}(s,u_j) + v^{(i)}(t,u_j), we

get an instance of Corollary 2.3, and thus a lower bound of

$$ \left( \frac{n-3}{2(d-1)} \right)^{d-1}. $$

Since this holds for each fixing of v^{(i)}(s,u_j) + v^{(i)}(t,u_j), we get that the same lower bound holds for
the expectation without conditioning.


4    0–1 Knapsack
We prove Theorem 1.3.

Proof of Theorem 1.3. To show our lower bound we will use the obvious one-to-one map between our
basic problem with d objectives and the profits of the knapsack problem: let v^{(1)}, ..., v^{(d)} be an instance
of our basic problem with all the v_j^{(i)} being chosen uniformly at random from [−1/2, 1/2]. Now the
profits p are obtained from the v's in the natural way: p_j^{(i)} = v_j^{(i)} + 1/2. In general, the sets of Pareto
optima for these two problems (the basic problem instance and its corresponding knapsack instance) are
not the same. We will focus instead on the better behaved set S ⊆ {0,1}^n of solutions having exactly
⌊n/2⌋ ones. From Corollary 2.4 we get that, in the basic problem restricted to S, the expected number of
Pareto optima is at least Ω_d(n^{d−1.5}), using the well-known estimate $\binom{n}{\lfloor n/2 \rfloor} = \Theta(2^n/\sqrt{n})$.
    Now we claim that if x ∈ S is Pareto-optimal in the restricted basic problem, then it is also Pareto-
optimal in the corresponding (unrestricted) knapsack problem. Let y ∈ {0,1}^n be different from x. There
are two cases: if y has more than ⌊n/2⌋ ones, then it cannot dominate x, as y has a strictly higher weight
(recall that all the weights are 1). If y has at most ⌊n/2⌋ ones, then enlarge this solution arbitrarily to a
solution y′ ≥ y with exactly ⌊n/2⌋ ones. The Pareto-optimality of x implies that y′ is worse than x in some
profit, and so is y, as the profits are non-negative.
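The two-case argument can be verified by exhaustive search on a small instance. The following Python sketch (an illustration, not the paper's code) draws a random basic instance with n = 6 and d = 2, restricts to solutions with exactly ⌊n/2⌋ ones, and checks that each restricted Pareto optimum survives in the full knapsack instance:

```python
# Sanity check of the reduction in the proof of Theorem 1.3: a solution
# with exactly floor(n/2) ones that is Pareto-optimal for the basic
# problem (profits v in [-1/2,1/2], no weights) stays Pareto-optimal for
# the knapsack instance with profits p = v + 1/2 and all weights 1.
import itertools
import random

random.seed(1)
n, d = 6, 2
v = [[random.uniform(-0.5, 0.5) for _ in range(n)] for _ in range(d)]
p = [[vij + 0.5 for vij in vi] for vi in v]

def val(obj, x):
    return tuple(sum(o[j] * x[j] for j in range(n)) for o in obj)

S = [x for x in itertools.product((0, 1), repeat=n) if sum(x) == n // 2]
full = list(itertools.product((0, 1), repeat=n))

# Pareto optima of the basic problem restricted to S (maximize all v's).
def dom_profits(a, b):
    return all(ai >= bi for ai, bi in zip(a, b)) and a != b

pareto_S = [x for x in S
            if not any(dom_profits(val(v, y), val(v, x)) for y in S)]

# Knapsack dominance: weight no higher, profits no lower, strictly
# better in at least one coordinate.
def dom_knap(y, x):
    wy, wx = sum(y), sum(x)
    py, px = val(p, y), val(p, x)
    return (wy <= wx and all(a >= b for a, b in zip(py, px))
            and (wy, py) != (wx, px))

for x in pareto_S:
    assert not any(dom_knap(y, x) for y in full if y != x)
print(len(pareto_S), "restricted Pareto optima, all survive in the knapsack instance")
```

As in the proof, a solution with more ones is strictly heavier and cannot dominate, while lighter solutions lose in some profit almost surely.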


5    Lower bound for general solution sets
For proving Theorem 1.4 we consider n_0 objects with weights 1/(2n_0) and profits p_j^{(i)} chosen uniformly
at random from [0, 1/φ] for φ ≥ 2d. Further, let S_0 ⊆ {0,1}^{n_0} be the set of solutions having exactly
⌊n_0/2⌋ ones. Since scaling weights and profits does not change the Pareto set, this instance is essentially
the same instance as the one constructed in the proof of Theorem 1.3 and, hence, the expected number
of Pareto optima is at least Ω_d(n_0^{d−1.5}). Note that all profits are chosen according to density functions
bounded by φ and that the total weight W of these n_0 objects is W = n_0/(2n_0) < 1.
    Now we use additional objects to clone this initial Pareto set several times to obtain a Pareto set
whose size also depends on φ. The new objects are divided into n_1 groups, each consisting of d objects,
one for each profit function. The objects of group ℓ all have the same weight w_ℓ = (d+1)^{ℓ−1} and a low
profit in all but their dedicated profit function. Specifically, the i-th profit q_{ℓj}^{(i)} of object j in group ℓ lies in

the interval Q_{ℓj}^{(i)} of length ⌈m_ℓ⌉/φ, where

$$ Q_{\ell j}^{(i)} = \begin{cases} \bigl[\, m_\ell - \lceil m_\ell \rceil/\phi,\; m_\ell \,\bigr] & \text{if } i = j, \\ \bigl[\, 0,\; \lceil m_\ell \rceil/\phi \,\bigr] & \text{if } i \neq j. \end{cases} $$

The value m_ℓ is defined by the recurrence

$$ m_\ell - \frac{m_\ell + 1}{\phi} = \frac{n_0}{\phi} + \sum_{k=1}^{\ell-1} \Bigl( m_k + d \cdot \frac{m_k + 1}{\phi} \Bigr) + (d-1) \cdot \frac{m_\ell + 1}{\phi} \tag{5.1} $$

and can be expressed explicitly (see Lemma A.1) by

$$ m_\ell = \Bigl( \frac{2\phi}{\phi - d} \Bigr)^{\ell-1} \cdot \Bigl( \frac{n_0 + d}{\phi - d} + \frac{d}{\phi + d} \Bigr) - \frac{d}{\phi + d}, \tag{5.2} $$

i. e., m_ℓ increases exponentially with ℓ. As can be seen from the recurrence (5.1),

$$ m_\ell \;\ge\; d \cdot \frac{m_\ell + 1}{\phi} \;\ge\; \frac{\lceil m_\ell \rceil}{\phi}, $$

i. e., the intervals Q_{ℓj}^{(i)} are contained in the interval [0, m_ℓ]. For the moment the profits q_{ℓj}^{(i)} ∈ Q_{ℓj}^{(i)} are fixed
values that might be larger than 1.
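The agreement between (5.1) and (5.2), and the containment inequality above, can be spot-checked numerically. In the following Python sketch the parameters n_0 = 10, d = 3, φ = 20 are our own arbitrary choice:

```python
# Numeric check that the explicit formula (5.2) satisfies the recurrence
# (5.1) defining m_ell, and the inequality m_ell >= d*(m_ell+1)/phi.
n0, d, phi = 10, 3, 20.0

def m(ell):
    # explicit formula (5.2)
    return ((2 * phi / (phi - d)) ** (ell - 1)
            * ((n0 + d) / (phi - d) + d / (phi + d)) - d / (phi + d))

for ell in range(1, 8):
    lhs = m(ell) - (m(ell) + 1) / phi
    rhs = (n0 / phi
           + sum(m(k) + d * (m(k) + 1) / phi for k in range(1, ell))
           + (d - 1) * (m(ell) + 1) / phi)
    assert abs(lhs - rhs) < 1e-9          # recurrence (5.1)
    assert m(ell) >= d * (m(ell) + 1) / phi   # containment inequality
print("(5.1) and (5.2) agree for ell = 1..7")
```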
    For ℓ ≥ 1 let S_ℓ := S_{ℓ−1} × {0,1}^d = S_0 × {0,1}^{ℓ·d} be the set of solutions for the instance with the n_0
initial objects and the ℓ·d objects of the groups 1, ..., ℓ, i. e., a solution is valid iff it uses exactly ⌊n_0/2⌋
of the initial objects. Assume that the Pareto set p(S_{ℓ−1}) has already been determined. If we add object j
of group ℓ to each of these solutions, then we obtain a copy of p(S_{ℓ−1}) which is shifted roughly in the
direction of the j-th unit vector in the weight–profits space. The next lemma shows that each shifted copy
that we can obtain by additionally using a subset of the set of objects from group ℓ is part of the Pareto
set p(S_ℓ).

Lemma 5.1. For ℓ ≥ 1, p(S_ℓ) = p(S_{ℓ−1}) × {0,1}^d. In particular, |p(S_{n_1})| = 2^{n_1 d} · |p(S_0)|.

Proof. Consider a Pareto-optimal solution (x, y) ∈ S_{ℓ−1} × {0,1}^d. If x ∉ p(S_{ℓ−1}), then there is a partial
solution x′ ∈ p(S_{ℓ−1}) that dominates x. By construction of S_ℓ, (x′, y) is a valid solution and it dominates
(x, y). This contradicts the assumption that (x, y) is Pareto-optimal. Hence, p(S_ℓ) ⊆ p(S_{ℓ−1}) ×
{0,1}^d.
    In the remainder we show p(S_ℓ) ⊇ p(S_{ℓ−1}) × {0,1}^d. For a vector y ∈ {0,1}^d let p_y(S_ℓ) = p(S_{ℓ−1}) ×
{y}. As p_y(S_ℓ) can be interpreted as a shifted copy of the Pareto set p(S_{ℓ−1}) in the weight–profits space,
vectors in this set do not dominate each other. It remains to show that solutions from different copies
do not dominate each other. For this, consider solutions (x_1, y_1) ∈ p_{y_1}(S_ℓ) and (x_2, y_2) ∈ p_{y_2}(S_ℓ), where
y_1 ≠ y_2 ∈ {0,1}^d. If (x_1, y_1) uses more objects from group ℓ than (x_2, y_2), then the difference in weight


between (x_1, y_1) and (x_2, y_2) is at least

$$ \begin{aligned} w_\ell - W - \sum_{k=1}^{\ell-1} d \cdot w_k &= (d+1)^{\ell-1} - W - \sum_{k=1}^{\ell-1} d \cdot (d+1)^{k-1} \\ &= (d+1)^{\ell-1} - W - \sum_{k=1}^{\ell-1} (d+1)^{k} + \sum_{k=1}^{\ell-1} (d+1)^{k-1} \\ &= 1 - W > 0 \,. \end{aligned} $$

Hence, solution (x_1, y_1) is heavier than solution (x_2, y_2), and thus (x_1, y_1) does not dominate (x_2, y_2). If (x_1, y_1) uses at
most as many objects as (x_2, y_2) from group ℓ, then there is an object j of group ℓ used by (x_2, y_2), but not
by (x_1, y_1). Let us analyze the j-th profit of (x_1, y_1) and (x_2, y_2). As (x_2, y_2) uses object j, its j-th profit is
at least m_ℓ − ⌈m_ℓ⌉/φ > m_ℓ − (m_ℓ+1)/φ. Since (x_1, y_1) can use at most all objects from group ℓ except
object j, the j-th profit of (x_1, y_1) is bounded from above by

$$ n_0 \cdot \frac{1}{\phi} + \sum_{k=1}^{\ell-1} \Bigl( m_k + (d-1) \cdot \frac{\lceil m_k \rceil}{\phi} \Bigr) + (d-1) \cdot \frac{\lceil m_\ell \rceil}{\phi}, $$

which is at most m_ℓ − (m_ℓ+1)/φ by the recursive definition of m_ℓ (see equation (5.1)). Consequently,
(x_2, y_2) has a higher j-th profit than (x_1, y_1) and, thus, (x_1, y_1) does not dominate (x_2, y_2). This yields p(S_ℓ) ⊇ p(S_{ℓ−1}) ×
{0,1}^d.
    The second part of the claim follows from |p(S_{n_1})| = |p(S_0) × {0,1}^{n_1 d}| = 2^{n_1 d} · |p(S_0)|.
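The cloning step can be verified exhaustively for one group on a tiny instance. In the following Python sketch the parameters n_0 = 4, d = 2, φ = 8 are our own choice (giving m_1 = 1 and group weight w_1 = (d+1)^0 = 1); it checks the conclusion of Lemma 5.1 for ℓ = 1:

```python
# Exhaustive check of Lemma 5.1 (one group, ell = 1) on a tiny instance.
import itertools
import math
import random

random.seed(2)
n0, d, phi = 4, 2, 8.0
m1 = (n0 + d) / (phi - d)            # = 1.0, formula (5.2) with ell = 1
cm1 = math.ceil(m1)
w_init, w_grp = 1 / (2 * n0), 1.0

# Initial object j has d profits uniform in [0, 1/phi]; group-1 object j
# has its dedicated profit in [m1 - cm1/phi, m1], the others in [0, cm1/phi].
p_init = [[random.uniform(0, 1 / phi) for _ in range(n0)] for _ in range(d)]
p_grp = [[random.uniform(m1 - cm1 / phi, m1) if i == j
          else random.uniform(0, cm1 / phi) for j in range(d)]
         for i in range(d)]

def point(x, y):
    """(weight, profit_1, ..., profit_d) of the solution (x, y)."""
    w = w_init * sum(x) + w_grp * sum(y)
    prof = [sum(p_init[i][j] * x[j] for j in range(n0))
            + sum(p_grp[i][j] * y[j] for j in range(d)) for i in range(d)]
    return (w, *prof)

def dominates(a, b):
    # lower weight and higher profits are better, strictly so somewhere
    return (a[0] <= b[0] and all(x >= y for x, y in zip(a[1:], b[1:]))
            and a != b)

def pareto(sols):
    pts = {s: point(*s) for s in sols}
    return {s for s in sols if not any(dominates(pts[u], pts[s]) for u in sols)}

S0 = [x for x in itertools.product((0, 1), repeat=n0) if sum(x) == n0 // 2]
P0 = pareto([(x, (0,) * d) for x in S0])
P1 = pareto([(x, y) for x in S0 for y in itertools.product((0, 1), repeat=d)])
assert P1 == {(x, y) for (x, _) in P0
              for y in itertools.product((0, 1), repeat=d)}
assert len(P1) == 2 ** d * len(P0)   # Lemma 5.1 with ell = 1
print("Lemma 5.1 verified:", len(P0), "->", len(P1))
```

The check mirrors the proof: copies with more group objects are heavier by at least 1 − W, and a copy missing object j can reach a j-th profit of at most n_0/φ + (d−1)⌈m_1⌉/φ, which is strictly below the dedicated profit m_1 − ⌈m_1⌉/φ.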
    As mentioned earlier, the profits q_{ℓj}^{(i)} might be larger than 1. We resolve this problem by splitting
each object of group ℓ into k_ℓ = ⌈m_ℓ⌉ objects with weights w_ℓ/k_ℓ and profits q_{ℓj}^{(i)}/k_ℓ lying in the interval Q_{ℓj}^{(i)}/k_ℓ. By
the choice of the intervals Q_{ℓj}^{(i)}, these new intervals have length 1/φ. Furthermore, each interval Q_{ℓj}^{(i)}
is a subset of [0, m_ℓ] ⊆ [0, k_ℓ]. Hence, the scaled intervals are subsets of [0, 1]. In addition we use a set
S ⊆ S_0 × ∏_{ℓ=1}^{n_1} {0,1}^{k_ℓ·d} consisting of all solutions that use either all k_ℓ parts of an object of group ℓ or
none of them. This simulates using the whole object j or not using it at all. In this way we do not change the
size of the Pareto set, but we use more objects. Recall that we do not consider the standard knapsack
problem, where each object can be used independently of the other objects; in our variant of the knapsack
problem we can specify the set S of feasible combinations of objects. Since Lemma 5.1 holds for all
profits q_{ℓj}^{(i)} ∈ Q_{ℓj}^{(i)}, we obtain:
Corollary 5.2. The size of the Pareto set of the 0–1 knapsack instance constructed above is

$$ \Omega_d\bigl( 2^{n_1 d}\, n_0^{d-1.5} \bigr). $$

    The bound in Corollary 5.2 is still expressed in the numbers n_0 and n_1 and not in the total number N
of objects used. This number can be bounded as follows.

Lemma 5.3. The number N of objects is bounded by

$$ N \le n_0 + d \cdot n_1 + d \cdot \frac{n_0 + 2d}{\phi + d} \cdot \Bigl( \frac{2\phi}{\phi - d} \Bigr)^{n_1}. $$



Proof. Recall that our instance consists of n_0 initial objects and n_1 groups ℓ = 1, ..., n_1 containing d
objects each, each of which we split into k_ℓ smaller objects. Hence, the claim holds true since

$$ N = n_0 + \sum_{\ell=1}^{n_1} d \cdot k_\ell \;\le\; n_0 + \sum_{\ell=1}^{n_1} d \cdot (m_\ell + 1) \;=\; n_0 + d \cdot n_1 + d \cdot \sum_{\ell=1}^{n_1} m_\ell $$

and

$$ \begin{aligned} \sum_{\ell=1}^{n_1} m_\ell &\le \Bigl( \frac{n_0+d}{\phi-d} + \frac{d}{\phi+d} \Bigr) \cdot \sum_{\ell=1}^{n_1} \Bigl( \frac{2\phi}{\phi-d} \Bigr)^{\ell-1} \le \Bigl( \frac{n_0+d}{\phi-d} + \frac{d}{\phi-d} \Bigr) \cdot \frac{ \bigl( \frac{2\phi}{\phi-d} \bigr)^{n_1} }{ \frac{2\phi}{\phi-d} - 1 } \\ &= \frac{n_0 + 2d}{\phi + d} \cdot \Bigl( \frac{2\phi}{\phi-d} \Bigr)^{n_1}, \end{aligned} $$

where the first inequality is due to the explicit formula (5.2) for m_ℓ.
      Now we can prove Theorem 1.4.
Proof of Theorem 1.4. Using the instance described above, our goal is to choose the parameters n_0 and
n_1 in such a way that the expected number of Pareto-optimal solutions becomes large, subject to the
condition that the number of objects used is at most n. We only consider sufficiently large values of n
and φ; in particular, n ≥ 8d and φ ≥ 2d. This is not a restriction, as d is constant. For the moment let us
assume that φ/d ≤ 2^{n/(4d)−1}. This is the interesting case, leading to the first term in the minimum in
Theorem 1.4. Let n_0 = ⌊n̂_0⌋ and n_1 = ⌊n̂_1⌋, where

$$ \hat{n}_1 = \log_2\Bigl( \frac{\phi}{d} \Bigr) \in \Bigl[ 1, \frac{n-4d}{4d} \Bigr], \qquad \hat{n}_0 = \frac{ n - 2d \cdot (\phi/d)^h - d \cdot \hat{n}_1 }{ 1 + (\phi/d)^h }, \quad \text{and} $$

$$ h = h(\phi, d) = \log_2\Bigl( \frac{2\phi}{\phi-d} \Bigr) - 1 = \log_2\Bigl( \frac{\phi}{\phi-d} \Bigr). $$

In Lemma A.2 we show that (φ/d)^h takes only values in [1, 2] provided φ > d. This implies

$$ \hat{n}_0 \ge \frac{ n - 4d - (n-4d)/4 }{3} = \frac{n-4d}{4} \ge d $$
since n ≥ 8d and, thus, n_0, n_1 ≥ 1. By Lemma 5.3 the number N of objects of this instance is bounded by

$$ \begin{aligned} N &\le n_0 + d \cdot n_1 + d \cdot \frac{n_0+2d}{\phi+d} \cdot \Bigl( \frac{2\phi}{\phi-d} \Bigr)^{n_1} \le \hat{n}_0 + d \cdot \hat{n}_1 + d \cdot \frac{\hat{n}_0+2d}{\phi} \cdot \Bigl( \frac{2\phi}{\phi-d} \Bigr)^{\hat{n}_1} \\ &= \hat{n}_0 + d \cdot \hat{n}_1 + (\hat{n}_0+2d) \cdot \frac{d}{\phi} \cdot \Bigl( \frac{\phi}{d} \Bigr)^{\log_2 \frac{2\phi}{\phi-d}} = \hat{n}_0 + d \cdot \hat{n}_1 + (\hat{n}_0+2d) \cdot \Bigl( \frac{\phi}{d} \Bigr)^{h} \\ &= \hat{n}_0 \cdot \biggl( 1 + \Bigl( \frac{\phi}{d} \Bigr)^{h} \biggr) + 2d \cdot \Bigl( \frac{\phi}{d} \Bigr)^{h} + d \cdot \hat{n}_1 = n \end{aligned} $$



by definition of n̂_0; that is, the number N of objects we actually use is at most n, as required. Due to
Corollary 5.2 the expected number of Pareto-optimal solutions of our instance is

$$ \Omega_d\bigl( 2^{n_1 d} \cdot n_0^{d-1.5} \bigr) = \Omega_d\bigl( 2^{\hat{n}_1 d} \cdot \hat{n}_0^{d-1.5} \bigr) = \Omega_d\Bigl( \Bigl( \frac{\phi}{d} \Bigr)^{d} \cdot \Bigl( \frac{n}{4} - d \Bigr)^{d-1.5} \Bigr) = \Omega_d\bigl( n^{d-1.5} \phi^{d} \bigr). $$

In the case φ/d > 2^{n/(4d)−1} we construct the same instance as above, but for a maximum density φ′ =
d · 2^{n/(4d)−1} ∈ [2d, φ). Consequently, n̂_1 = n/(4d) − 1. As above, the expected size of the Pareto set is
Ω_d(n^{d−1.5} · 2^{n/4−d}) = 2^{Θ_d(n)}.
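The exponent algebra in this parameter choice is easy to get wrong. The following Python sketch (with arbitrarily chosen values of n, d and φ satisfying the stated constraints) checks that the bound of Lemma 5.3 evaluated at (n̂_0, n̂_1) collapses to exactly n, and that (φ/d)^h ∈ [1, 2]:

```python
# Numeric check of the parameter identities in the proof of Theorem 1.4.
import math

for (n, d, phi) in [(200, 2, 16.0), (400, 3, 50.0), (1000, 5, 40.0)]:
    h = math.log2(phi / (phi - d))
    r = (phi / d) ** h
    assert 1.0 <= r <= 2.0                      # Lemma A.2
    n1 = math.log2(phi / d)                     # n_hat_1
    n0 = (n - 2 * d * r - d * n1) / (1 + r)     # n_hat_0, as defined above
    # the Lemma 5.3 bound, evaluated at (n_hat_0, n_hat_1), equals n exactly
    bound = n0 + d * n1 + d * (n0 + 2 * d) / phi * (2 * phi / (phi - d)) ** n1
    assert abs(bound - n) < 1e-6
    assert n0 >= (n - 4 * d) / 4                # the lower bound on n_hat_0
print("parameter identities of Theorem 1.4 check out")
```

The key simplification is (2φ/(φ−d))^{n̂_1} = (φ/d)^{h+1}, which turns the geometric term into (n̂_0 + 2d)·(φ/d)^h.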

    Note that for d = 1 Beier and Vöcking proved a lower bound of Ω(n²) for the expected number of
Pareto-optimal solutions in the smoothed model, using profits chosen uniformly at random from [0, 1] and
exponentially increasing weights [2]. We can apply the same construction as for general values of d, using
n_0 objects of this instance as initial objects, and obtain:

Corollary 5.4. There is a φ-smooth instance for the knapsack problem with one profit function where the
expected number of Pareto-optimal solutions is min{Ω(n²φ), 2^{Θ(n)}}.

    The bound of Corollary 5.4 is asymptotically tight due to the upper bound of O(n²φ) proved in [1].


6    Discussion and conclusion
We proved lower bounds for the average and smoothed number of Pareto optima by introducing geometric
arguments to this setting. Our lower bound for d random objectives is of the form Ω_d(n^{d−1}). The
best upper bound we know, even for φ = 1, is that of Moitra and O'Donnell [15], which is of the form
O_d(n^{2d−2}), ignoring the dependence on φ. Thus there is a gap between the upper and lower bounds.
As mentioned before, the number of Pareto optima for the case when 2^n points are chosen uniformly
at random from [−1, 1]^d is Θ_d(n^{d−1}). For the case with one adversarial and d quasiconcave φ-smooth
objectives we proved a lower bound of Ω_d(n^{d−1.5} φ^d). This bound matches the best known upper bound
of O_d(n^{2d} φ^d), due to Brunsch and Röglin [7], with regard to the dependence on φ, but there is still a gap
when considering the dependence on n.
    Do lower bounds similar to ours hold for any sufficiently large feasible set S? Our techniques can
show this for natural objectives, but they require arguments tailored to the specific objective. It would be
desirable to have a general lower-bound technique that works for all sufficiently large S. Also, in smoothed
lower bounds, our technique requires a very special choice of S to get a good dependence on φ. So, a more
general question is whether we can prove lower bounds with a strong dependence on φ for all sufficiently
large S.
    We now briefly discuss some difficulties in proving lower bounds for general S. One approach to this
end is to show a lower bound on the expected size of the Pareto set that depends only on |S|, n and d.
Our general technique was to first reduce the problem to giving a lower bound on the expected number
of vertices in the projection of the convex hull of the points in S to a random subspace of dimension
d. A special distribution which is instructive to consider here, and also interesting in its own right, is
given by the case when we project to a d-dimensional space chosen uniformly at random. The expected

                    T HEORY OF C OMPUTING, Volume 10 (10), 2014, pp. 237–256                             251
               T OBIAS B RUNSCH , NAVIN G OYAL , L UIS R ADEMACHER , AND H EIKO R ÖGLIN

number of vertices in the projection has been studied for the special cases of the simplex, the cube, and
the cross-polytope (see Schneider [20]). But understanding this number for arbitrary 0/1-polytopes seems
difficult. When the subspace to be projected to is of dimension n − 1, we can write the expected number
of vertices in the projection as C · ∑v∈V a(v), where a(v) is the solid angle of the cone polar to the tangent
cone at vertex v, and C is a constant depending on n. (Suitable generalizations of this formula are easy
to obtain for projection to dimensions smaller than n − 1, but the case of dimension n − 1 is sufficient
for our purpose here.) This captures the intuitive fact that if the polytope is very pointy at vertex v, then
v is more likely to be a vertex in the convex hull. It is natural to ask: given k, what is the S ⊆ {0, 1}n
with |S| = k that minimizes this expectation? Intuitively, the sum of angles a(v) could be minimized
when the vertices are close together, as in a Hamming ball. Note the high-level similarity of the problem
at hand to the vertex-isoperimetric inequality for the Boolean cube (see, e. g., Theorem 5, Chapter 16
of [5]). Unfortunately, our numerical experiments show that Hamming balls are not the minimizers of the
expected number of vertices of a random projection. In particular, there exists a subset of the vertices of
the 5-dimensional Boolean cube, of size 16, with smaller expected number of vertices in the projection to
a uniformly random two-dimensional plane compared to the Hamming ball with 16 vertices.
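The quantity in these experiments can be estimated by Monte Carlo. The following Python sketch (our illustration, not the authors' experiment) projects the Hamming ball of size 16 in {0,1}^5 onto uniformly random two-dimensional planes and averages the number of convex-hull vertices of the projection; it only illustrates the estimator and does not reproduce the comparison with the other subset:

```python
# Monte Carlo estimate of the expected number of hull vertices of a random
# 2-dimensional projection of the Hamming ball of size 16 in {0,1}^5.
import itertools
import math
import random

random.seed(3)

def hamming_ball():
    # all points of weight <= 2: 1 + 5 + 10 = 16 points
    return [p for p in itertools.product((0, 1), repeat=5) if sum(p) <= 2]

def random_plane():
    # two orthonormalized Gaussian directions span a uniform random 2-plane
    u = [random.gauss(0, 1) for _ in range(5)]
    v = [random.gauss(0, 1) for _ in range(5)]
    nu = math.sqrt(sum(x * x for x in u))
    u = [x / nu for x in u]
    dot = sum(a * b for a, b in zip(u, v))
    v = [b - dot * a for a, b in zip(u, v)]
    nv = math.sqrt(sum(x * x for x in v))
    return u, [x / nv for x in v]

def hull_vertices(pts):
    # Andrew's monotone chain; returns the number of hull vertices
    pts = sorted(set(pts))
    if len(pts) <= 2:
        return len(pts)
    def build(points):
        chain = []
        for p in points:
            while len(chain) >= 2:
                (ox, oy), (ax, ay) = chain[-2], chain[-1]
                if (ax - ox) * (p[1] - oy) - (ay - oy) * (p[0] - ox) <= 0:
                    chain.pop()
                else:
                    break
            chain.append(p)
        return chain
    lower, upper = build(pts), build(pts[::-1])
    return len(lower) + len(upper) - 2

S = hamming_ball()
assert len(S) == 16
trials, total = 200, 0
for _ in range(trials):
    u, v = random_plane()
    proj = [(sum(a * b for a, b in zip(u, p)),
             sum(a * b for a, b in zip(v, p))) for p in S]
    total += hull_vertices(proj)
print("estimated expected hull vertices:", total / trials)
```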


7    Acknowledgements
We thank Yusu Wang for helpful discussions. We are also very grateful to the Isaac Newton Institute for
Mathematical Sciences, Cambridge, UK, where part of this work was carried out during the programme
on Discrete Analysis. We also thank the anonymous referees of this paper for detailed comments that
helped improve the presentation.


A     Technical proofs
Lemma A.1. The explicit formula for m_ℓ given by the recurrence

$$ m_\ell - \frac{m_\ell + 1}{\phi} = \frac{n_0}{\phi} + \sum_{k=1}^{\ell-1} \Bigl( m_k + d \cdot \frac{m_k + 1}{\phi} \Bigr) + (d-1) \cdot \frac{m_\ell + 1}{\phi} $$

is

$$ m_\ell = \Bigl( \frac{2\phi}{\phi - d} \Bigr)^{\ell-1} \cdot \Bigl( \frac{n_0 + d}{\phi - d} + \frac{d}{\phi + d} \Bigr) - \frac{d}{\phi + d}. $$
Proof. The recurrence formula is equivalent to the formula

$$ m_\ell - d \cdot \frac{m_\ell + 1}{\phi} = \frac{n_0}{\phi} + \sum_{k=1}^{\ell-1} \Bigl( m_k + d \cdot \frac{m_k + 1}{\phi} \Bigr), $$

which yields m_1 = (n_0 + d)/(φ − d) and

$$ \frac{\phi-d}{\phi} \cdot m_\ell - \frac{d}{\phi} = \frac{\phi-d}{\phi} \cdot m_{\ell-1} - \frac{d}{\phi} + m_{\ell-1} + d \cdot \frac{m_{\ell-1} + 1}{\phi} = 2\, m_{\ell-1} $$


for ℓ ≥ 2. With α = 2φ/(φ − d) and β = d/(φ − d), we obtain

$$ \begin{aligned} m_\ell &= \alpha \cdot m_{\ell-1} + \beta = \alpha^{\ell-1} \cdot m_1 + \Bigl( \sum_{k=0}^{\ell-2} \alpha^k \Bigr) \cdot \beta = \alpha^{\ell-1} \cdot m_1 + \frac{\alpha^{\ell-1} - 1}{\alpha - 1} \cdot \beta \\ &= \alpha^{\ell-1} \cdot \Bigl( m_1 + \frac{\beta}{\alpha-1} \Bigr) - \frac{\beta}{\alpha-1} = \Bigl( \frac{2\phi}{\phi-d} \Bigr)^{\ell-1} \cdot \Bigl( \frac{n_0+d}{\phi-d} + \frac{d}{\phi+d} \Bigr) - \frac{d}{\phi+d} \,. \end{aligned} $$
Lemma A.2. Let h = h(φ, d) = log_2(φ/(φ − d)). Then 1 ≤ (φ/d)^h ≤ 2 for any φ > d.

Proof. With x = φ/d − 1 > 0 we can restate the inequalities as

$$ 1 \le (1+x)^{\log_2(1 + 1/x)} \le 2, $$

which is equivalent to 0 ≤ f̂(x) ≤ 1 for f̂(x) = log_2(1+x) · log_2(1+1/x). Obviously, the first inequality
is true. In the remainder of this proof we show that f̂(x) is maximal for x = 1 (and, consequently,
f̂(x) ≤ f̂(1) = 1 for all x > 0), which is equivalent to showing that f(x) = ln(1+x) · ln(1+1/x) is
maximal for x = 1. Since f(x) = f(1/x), we focus on values x ≥ 1.
    The derivative of f is

$$ f'(x) = \frac{1}{1+x} \cdot \ln(1+1/x) + \ln(1+x) \cdot \frac{1}{1+1/x} \cdot \Bigl( -\frac{1}{x^2} \Bigr) = \frac{g(x)}{x \cdot (1+x)} $$

for g(x) = x · ln(1+1/x) − ln(1+x). We show that f'(x) < 0 for x > 1 by showing that g is monotonically
decreasing for x > 1. This is sufficient due to g(1) = 0. The derivative of g is

$$ g'(x) = \ln(1+1/x) + x \cdot \frac{1}{1+1/x} \cdot \Bigl( -\frac{1}{x^2} \Bigr) - \frac{1}{1+x} = \ln(1+1/x) - \frac{2}{1+x} \le \frac{1}{x} - \frac{2}{1+x} = \frac{1-x}{x(1+x)} < 0 $$

for x > 1. The first inequality stems from the inequality ln(1+x) ≤ x for all x > 0.
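The inequality of Lemma A.2 can be spot-checked numerically; a small Python sketch over a logarithmic grid:

```python
# Numeric check of Lemma A.2: f_hat(x) = log2(1+x) * log2(1+1/x)
# satisfies 0 < f_hat(x) <= 1, with the maximum f_hat(1) = 1, and is
# symmetric under x -> 1/x.
import math

def f_hat(x):
    return math.log2(1 + x) * math.log2(1 + 1 / x)

assert abs(f_hat(1) - 1) < 1e-12
for k in range(-40, 41):
    x = 2.0 ** (k / 4)      # grid spanning many orders of magnitude
    assert 0 < f_hat(x) <= 1 + 1e-12
    assert abs(f_hat(x) - f_hat(1 / x)) < 1e-9   # symmetry used in the proof
print("Lemma A.2 holds on the sampled grid")
```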


References
 [1] René Beier, Heiko Röglin, and Berthold Vöcking: The smoothed number of Pareto
     optimal solutions in bicriteria integer optimization. In Proc. 12th Ann. Conf. on Integer Programming
     and Combinatorial Optimization (IPCO'07), pp. 53–67. Springer, 2007. [doi:10.1007/978-3-540-
     72792-7_5] 239, 251
 [2] René Beier and Berthold Vöcking: Random knapsack in expected polynomial
     time. J. Comput. Syst. Sci., 69(3):306–329, 2004. Preliminary version in STOC'03.
     [doi:10.1016/j.jcss.2004.04.004] 239, 240, 251
 [3] René Beier and Berthold Vöcking: Typical properties of winners and losers in dis-
     crete optimization. SIAM J. Comput., 35(4):855–881, 2006. Preliminary version in STOC'04.
     [doi:10.1137/S0097539705447268] 239


 [4] J ON L OUIS B ENTLEY, H SIANG -T SUNG K UNG , M ARIO S CHKOLNICK , AND C LARK D. T HOMP -
     SON: On the average number of maxima in a set of vectors and applications. J. ACM, 25(4):536–543,
     1978. [doi:10.1145/322092.322095] 241, 242, 243
 [5] B ÉLA B OLLOBÁS: Combinatorics: Set systems, hypergraphs, families of vectors and combinatorial
     probability. Cambridge University Press, Cambridge, 1986. 252
 [6] T OBIAS B RUNSCH AND H EIKO R ÖGLIN: Lower bounds for the smoothed number of Pareto
     optimal solutions. In Theory and Applications of Models of Computation - 8th Annual Conference
     (TAMC’11), LNCS 6648, pp. 416–427. Springer, 2011. [doi:10.1007/978-3-642-20877-5_41,
     arXiv:abs/1012.1163] 237
 [7] T OBIAS B RUNSCH AND H EIKO R ÖGLIN: Improved smoothed analysis of multiobjective opti-
     mization. In Proc. 44th STOC, pp. 407–426. ACM Press, 2012. [doi:10.1145/2213977.2214016,
     arXiv:abs/1111.1546] 240, 242, 251
 [8] C HRISTIAN B UCHTA: On the average number of maxima in a set of vectors. Inf. Process. Lett.,
     33(2):63–65, 1989. [doi:10.1016/0020-0190(89)90156-7] 241
 [9] L UC D EVROYE: A note on finding convex hulls via maximal vectors. Inf. Process. Lett., 11(1):53–
     56, 1980. [doi:10.1016/0020-0190(80)90036-8] 241
[10] DAVID L. D ONOHO AND JARED TANNER: Counting the faces of randomly-projected hy-
     percubes and orthants, with applications. Discrete Comput. Geom., 43(3):522–541, 2010.
     [doi:10.1007/s00454-009-9221-z, arXiv:0807.3590] 243
[11] M ATTHIAS E HRGOTT: Multicriteria Optimization (2nd ed.). Springer, 2005. [doi:10.1007/3-540-
     27659-9] 238, 240
[12] JACOB E. G OODMAN AND J OSEPH O’ROURKE: Handbook of Discrete and Computational
     Geometry. Discrete Mathematics and its Applications. Chapman & Hall/CRC, 2004. 243
[13] NAVIN G OYAL AND L UIS R ADEMACHER: Lower bounds for the average and smoothed number of
     Pareto optima. In IARCS Annual Conf. on Foundations of Software Technology and Theoretical
     Computer Science (FSTTCS’12), pp. 58–69. Springer, 2012. [doi:10.4230/LIPIcs.FSTTCS.2012.58,
     arXiv:abs/1107.3876] 237
[14] Fabrizio Grandoni, Ramamoorthi Ravi, and Mohit Singh: Iterative rounding for multiobjective optimization problems. In Proc. 17th Ann. European Symp. on Algorithms (ESA'09), pp. 95–106, 2009. [doi:10.1007/978-3-642-04128-0_9] 238
[15] Ankur Moitra and Ryan O’Donnell: Pareto optimal solutions for smoothed analysts. SIAM J. Comput., 41(5):1266–1284, 2012. Preliminary version in STOC'11. [doi:10.1137/110851833] 238, 240, 242, 251
[16] Matthias Müller-Hannemann and Karsten Weihe: Pareto shortest paths is often feasible in practice. In Algorithm Engineering, 5th Internat. Workshop (WAE'01), pp. 185–197. Springer, 2001. [doi:10.1007/3-540-44688-5_15] 238

[17] George L. Nemhauser and Zev Ullmann: Discrete dynamic programming and capital allocation. Management Science, 15(9):494–505, 1969. [doi:10.1287/mnsc.15.9.494] 238

[18] Franco P. Preparata and Michael Ian Shamos: Computational Geometry: An Introduction. Texts and Monographs in Computer Science. Springer, New York, 1985. [doi:10.1007/978-1-4612-1098-6] 242

[19] Heiko Röglin and Shang-Hua Teng: Smoothed analysis of multiobjective optimization. In Proc. 50th FOCS, pp. 681–690. IEEE Comp. Soc. Press, 2009. [doi:10.1109/FOCS.2009.21] 238, 240

[20] Rolf Schneider: Recent results on random polytopes: A survey. Bollettino dell’Unione Matematica Italiana, Ser. (9), 1:17–40, 2008. 252

[21] Rolf Schneider and Wolfgang Weil: Stochastic and integral geometry. Probability and its Applications (New York). Springer, Berlin, 2008. [doi:10.1007/978-3-540-78859-1] 243

[22] Daniel A. Spielman and Shang-Hua Teng: Smoothed analysis of algorithms: Why the simplex algorithm usually takes polynomial time. J. ACM, 51(3):385–463, 2004. Preliminary version in STOC'01. [doi:10.1145/990308.990310] 238, 239

[23] J. G. Wendel: A problem in geometric probability. Mathematica Scandinavica, 11:109–112, 1962. 243


AUTHORS

     Tobias Brunsch
     Ph. D. student
     University of Bonn
     brunsch cs uni-bonn de


     Navin Goyal
     researcher
     Microsoft Research India
     navingo microsoft com
     http://research.microsoft.com/en-us/people/navingo/


     Luis Rademacher
     assistant professor
     Ohio State University
     lrademac cse ohio-state edu
     http://www.cse.ohio-state.edu/~lrademac/




    Heiko Röglin
    professor
    University of Bonn
    roeglin cs uni-bonn de
    http://www.roeglin.org


ABOUT THE AUTHORS

    TOBIAS BRUNSCH studied Computer Science and Mathematics at the Chemnitz University
       of Technology. In 2010, he joined the group of Heiko Röglin as a Ph. D. student. His
       research focuses on probabilistic analysis of algorithms.


    NAVIN GOYAL received his Ph. D. from Rutgers University in 2005. After two postdoctoral
      stints, at McGill University and the Georgia Institute of Technology, he returned to his
      native India in 2009, where he now works at Microsoft Research. His research interests
      are in theoretical computer science.


    LUIS RADEMACHER graduated with a Ph. D. in mathematics from the Massachusetts In-
       stitute of Technology in 2007 under the guidance of Santosh Vempala. He spent two
       years as a Postdoctoral Fellow in the College of Computing at the Georgia Institute
       of Technology. He joined the Computer Science and Engineering Department at The
       Ohio State University in 2009. His interests lie in computational learning theory and
       random structures and algorithms.


    HEIKO RÖGLIN graduated from RWTH Aachen University in 2008 under the supervision
       of Berthold Vöcking. He spent a year as a postdoc with Shang-Hua Teng at Boston
       University and a year as an assistant professor at Maastricht University before joining the
       University of Bonn in 2010. His research focuses on probabilistic analysis of algorithms.




                 T HEORY OF C OMPUTING, Volume 10 (10), 2014, pp. 237–256